Google Cloud Platform – Compute Engine – CPU utilization and load average spike at midnight during the day switchover

CPU utilization and load average on our virtual instance (Compute Engine) spike at midnight when the day switches. We use an 8/6 core VM instance running Ubuntu 20.04 LTS, and we don't have much traffic at midnight. For the last 8 months, CPU utilization and load average have regularly shot up to 100% and 150+ respectively, which takes the website down for 1 or 2 minutes until another VM starts up to handle the spike. The spike in CPU utilization and load average is over within 5 minutes. During the CPU load spike, a spike in disk throughput and disk IOPS is also visible, but that looks manageable for the VM.
It is pretty annoying to regularly receive Website Down/High Traffic alerts (via SMS and email) at midnight. I have checked and made sure that:

  • It is not log rotation on Ubuntu at midnight, as the VM console confirms that log rotation completes before the spike occurs.
  • MySQL is also responsive, and the number of queries per second the VM sends to the MySQL cloud server remains normal.
  • We do not access any Google service through an API at midnight, e.g. the Translation API, which might take long to finish.

I suspect it might be some housekeeping done by the Cloud team at the day switchover that is causing the high load average. The console graph for the VM shows a spike in new connections under Google Services, where the average goes from 70/s to 200/s. I have attached a VM observability snapshot.
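
To narrow down what is actually running during the spike, a small script along the following lines could be scheduled a minute or so before midnight. This is only a rough sketch, not something already running on the VM: the function names (`parse_loadavg`, `parse_stat`, `snapshot`, `busiest`) are made up for illustration, and it assumes the standard Linux `/proc/loadavg` and `/proc/<pid>/stat` layouts. It also records each process state, because on Linux the load average counts tasks in uninterruptible sleep (state `D`, usually waiting on disk) as well as runnable ones, which may matter given the disk IOPS spike.

```python
import os


def parse_loadavg(text):
    """Return the 1-, 5- and 15-minute load averages from /proc/loadavg content."""
    one, five, fifteen = (float(x) for x in text.split()[:3])
    return one, five, fifteen


def parse_stat(line):
    """Return (name, state, cpu_ticks) from the content of /proc/<pid>/stat."""
    # The command name sits in parentheses and may itself contain spaces
    # or parentheses, so split on the last closing parenthesis.
    name = line[line.index("(") + 1 : line.rindex(")")]
    rest = line[line.rindex(")") + 2 :].split()
    # rest[0] is the state (field 3); utime and stime are fields 14 and 15.
    return name, rest[0], int(rest[11]) + int(rest[12])


def snapshot(proc_root="/proc"):
    """Map each PID to (name, state, cpu_ticks); empty if proc_root is missing."""
    procs = {}
    if not os.path.isdir(proc_root):
        return procs
    for entry in os.listdir(proc_root):
        if not entry.isdigit():
            continue
        try:
            with open(os.path.join(proc_root, entry, "stat")) as f:
                procs[int(entry)] = parse_stat(f.read())
        except (OSError, ValueError, IndexError):
            # The process exited between listing and reading, or the
            # line was malformed; skip it.
            continue
    return procs


def busiest(before, after, limit=5):
    """Return up to `limit` tuples (tick_delta, pid, name, state), largest first."""
    deltas = []
    for pid, (name, state, ticks) in after.items():
        # A process that did not exist in `before` is charged all its ticks.
        previous = before.get(pid, (name, state, 0))[2]
        deltas.append((ticks - previous, pid, name, state))
    deltas.sort(reverse=True)
    return deltas[:limit]
```

The idea would be to call `snapshot()` once shortly before midnight and again during the spike, pass both results to `busiest()`, and log the output together with `parse_loadavg(open("/proc/loadavg").read())`. If the top entries are mostly in state `D`, that would point towards disk wait rather than pure CPU work.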

Looking for some help to resolve the issue.

VM Observability Snapshot
