Previous incidents
Volumes is down
Resolved Oct 31 at 10:52am EDT
Volumes recovered.
Image builds is down
Resolved Sep 30 at 05:17pm EDT
Image builds recovered.
CPU container performance degradation from over-scheduling on GCE n1-standard...
Resolved Sep 24 at 01:51pm EDT
Between Monday 23rd 15:54 UTC and 15:36 UTC, a change made GCP workers poll more quickly for containers, which caused them to pick up more CPU-only containers.
Between 13:48 UTC and 16:18 UTC on September 24th, CPU containers were allowed to be scheduled on GCP NVIDIA T4 workers, which are n1-class GCE VMs.
Together, these two changes caused over-scheduling of CPU work onto GCP T4 workers and significantly degraded the performance of the hosted containers (whether CP...
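The over-scheduling described above can be illustrated with a minimal sketch. It is not the platform's actual scheduler: the `Worker` class, the `try_place` method, the 8-core capacity, and the 1-core request size are all hypothetical, chosen only to show how a worker that accepts every container it polls for ends up with more CPU reserved than it has, while a capacity check at placement time caps the load.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical worker model; names and figures are illustrative only."""
    name: str
    cpu_cores: float       # assumed physical CPU capacity of the VM
    reserved: float = 0.0  # total CPU requested by containers placed so far

    def try_place(self, cpu_request: float, enforce_capacity: bool) -> bool:
        # With the check enabled, refuse work that would exceed capacity.
        if enforce_capacity and self.reserved + cpu_request > self.cpu_cores:
            return False
        self.reserved += cpu_request
        return True


def simulate(polls: int, enforce_capacity: bool) -> float:
    """Return the ratio of reserved CPU to capacity after `polls` attempts."""
    worker = Worker("t4-worker", cpu_cores=8.0)
    for _ in range(polls):
        # Each poll offers one CPU-only container requesting a single core.
        worker.try_place(1.0, enforce_capacity)
    return worker.reserved / worker.cpu_cores


if __name__ == "__main__":
    print("no capacity check:", simulate(20, enforce_capacity=False))
    print("with capacity check:", simulate(20, enforce_capacity=True))
```

In this sketch, faster polling corresponds to a larger `polls` value over the same period: without the check the ratio grows past 1.0, which is the over-subscribed state in which every hosted container competes for the same cores.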
Container filesystem 'file not found' bug
Resolved Sep 13 at 01:59pm EDT
At 12:57 UTC a container filesystem change rolled out across workers, which caused spurious 'file not found' (ENOENT) errors against the container's root filesystem (rootfs).
We identified the bad commit and rolled back to prevent new workers from picking up the bug.
We then worked to roll over all running production workers that were affected by the bad change and to scale up new workers.
All affected workers have been quarantined, so new containers will start on new, healthy workers.
We a...
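For readers unfamiliar with the error code, the following sketch shows how ENOENT surfaces to a program running inside a container. It uses only the Python standard library; the `is_missing` helper name is hypothetical, and the check only reports what the filesystem returns, so it cannot by itself tell a spurious ENOENT from a genuinely absent file.

```python
import errno
import os


def is_missing(path: str) -> bool:
    """Return True if stat() on `path` fails with ENOENT ('file not found')."""
    try:
        os.stat(path)
    except OSError as exc:
        # FileNotFoundError is the OSError subclass raised for ENOENT.
        return exc.errno == errno.ENOENT
    return False


if __name__ == "__main__":
    print(is_missing("/definitely/not/a/real/path/xyz"))
    print(is_missing(os.getcwd()))
```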
CPU functions on GCP is down
Resolved Sep 06 at 11:05am EDT
CPU functions on GCP recovered.
CPU functions on GCP is down
Resolved Sep 05 at 10:09am EDT
CPU functions on GCP recovered.
CPU functions on GCP is down
Resolved Sep 04 at 12:18pm EDT
CPU functions on GCP recovered.