google cloud platform – “gcloud ai endpoints deploy-model” fails with “Model server exited unexpectedly” (BERT model deployment on Vertex AI)

I’m encountering persistent issues deploying a custom container to Vertex AI using gcloud ai endpoints deploy-model. I’m trying to deploy a BERT model packaged in a Docker image, but I’m consistently facing errors despite providing the correct Artifact Registry image path.

Here’s a breakdown of my setup:

I have a Docker image containing a BERT model and a Flask application for inference. The image is approximately 18GB in size.
The image is successfully built and pushed to Google Cloud Artifact Registry.
I’m using the image’s fully qualified digest in the gcloud command.
I have an existing Vertex AI endpoint.
The service account has the necessary permissions.

When I run the following command:

gcloud ai endpoints deploy-model ENDPOINT_ID \
    --region=us-central1 \
    --model="us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e..." \
    --deployed-model-id=bert-model-production-v1 \
    --machine-type=n1-standard-8 \
    --display-name=DISPLAY_NAME \
    --service-account=SERVICE_ACC_NAME \
    --traffic-split="0=100"

I get the following error:

ERROR: (gcloud.ai.endpoints.deploy-model) There is an error while getting the model information. Please make sure the model 'projects/gemini-demo-429713/locations/us-central1/models/us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai@sha256:bcec3e...' exists.

I have also tried uploading a model with the following command:

gcloud ai models upload \
    --region=us-central1 \
    --display-name=bert-model-name-predict-luxure \
    --container-image-uri=us-central1-docker.pkg.dev/gemini-demo-429713/bert-repo/bert-vertex-ai:latest \
    --format="value(name)"

and then tried to deploy the model with:

gcloud ai endpoints deploy-model ENDPOINT_ID \
    --region=us-central1 \
    --model=MODEL_ID \
    --machine-type=n1-standard-4 \
    --display-name=DISPLAY_NAME \
    --service-account=SERVICE_ACC_NAME\
    --verbosity=debug

The deployment process goes on for more than 30 mins and then eventually fails with no logs and a generic error:

RROR: (gcloud.ai.endpoints.deploy-model) Model server exited unexpectedly. Model server logs can be found at

The logs are empty.

What could be causing this issue?