google cloud platform – “gcloud ai endpoints deploy-model” fails with “Model server exited unexpectedly” (BERT model deployment on Vertex AI)

I’m encountering persistent issues deploying a custom container to Vertex AI using gcloud ai endpoints deploy-model. I’m trying to deploy a BERT model packaged in a Docker image, but I’m consistently facing errors despite providing the correct Artifact Registry image path.

Here’s a breakdown of my setup:

  • I have a Docker image containing a BERT model and a Flask application for inference. The image is approximately 18GB in size.
  • The image is successfully built and pushed to Google Cloud Artifact Registry.
  • I’m using the image’s fully qualified digest in the gcloud command.
  • I have an existing Vertex AI endpoint.
  • The service account has the necessary permissions.

When I run the following command:

gcloud ai endpoints deploy-model ENDPOINT_ID \
    --region=us-central1 \
    --model="us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e..." \
    --deployed-model-id=bert-model-production-v1 \
    --machine-type=n1-standard-8 \
    --display-name=DISPLAY_NAME \
    --service-account=SERVICE_ACC_NAME \
    --traffic-split="0=100"

I get the following error:

ERROR: (gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/gemini-demo-429713/locations/us-central1/models/us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e...' exists.

I have also tried uploading a model with the following command:

gcloud ai models upload \
    --region=us-central1 \
    --display-name=bert-model-name-predict-luxure \
    --container-image-uri=us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai:latest \
    --format="value(name)"

and then tried to deploy the model with:

gcloud ai endpoints deploy-model ENDPOINT_ID \
    --region=us-central1 \
    --model=MODEL_ID \
    --machine-type=n1-standard-4 \
    --display-name=DISPLAY_NAME \
    --service-account=SERVICE_ACC_NAME\
    --verbosity=debug

The deployment process goes on for more than 30 mins and then eventually fails with no logs and a generic error:

RROR: (gcloud.ai.endpoints.deploy-model) Model server exited unexpectedly. Model server logs can be found at 

The logs are empty.

What could be causing this issue?

Read more here: Source link