google cloud platform – “gcloud ai endpoints deploy-model” fails with “Model server exited unexpectedly” (BERT model deployment on Vertex AI)
I’m encountering persistent issues deploying a custom container to Vertex AI using gcloud ai endpoints deploy-model. I’m trying to deploy a BERT model packaged in a Docker image, but I’m consistently facing errors despite providing the correct Artifact Registry image path.
Here’s a breakdown of my setup:
- I have a Docker image containing a BERT model and a Flask application for inference. The image is approximately 18GB in size.
- The image is successfully built and pushed to Google Cloud Artifact Registry.
- I’m using the image’s fully qualified digest in the gcloud command.
- I have an existing Vertex AI endpoint.
- The service account has the necessary permissions.
When I run the following command:
gcloud ai endpoints deploy-model ENDPOINT_ID \
--region=us-central1 \
--model="us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e..." \
--deployed-model-id=bert-model-production-v1 \
--machine-type=n1-standard-8 \
--display-name=DISPLAY_NAME \
--service-account=SERVICE_ACC_NAME \
--traffic-split="0=100"
I get the following error:
ERROR: (gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/gemini-demo-429713/locations/us-central1/models/us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e...' exists.
I have also tried uploading a model with the following command:
gcloud ai models upload \
--region=us-central1 \
--display-name=bert-model-name-predict-luxure \
--container-image-uri=us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai:latest \
--format="value(name)"
and then tried to deploy the model with:
gcloud ai endpoints deploy-model ENDPOINT_ID \
--region=us-central1 \
--model=MODEL_ID \
--machine-type=n1-standard-4 \
--display-name=DISPLAY_NAME \
--service-account=SERVICE_ACC_NAME\
--verbosity=debug
The deployment process goes on for more than 30 mins and then eventually fails with no logs and a generic error:
RROR: (gcloud.ai.endpoints.deploy-model) Model server exited unexpectedly. Model server logs can be found at
The logs are empty.
What could be causing this issue?
Read more here: Source link