Previous incidents

October 2024
Oct 31, 2024
1 incident

Volumes is down

Downtime

Resolved Oct 31 at 10:52am EDT

Volumes recovered.

September 2024
Sep 30, 2024
1 incident

Image builds is down

Downtime

Resolved Sep 30 at 05:17pm EDT

Image builds recovered.

Sep 24, 2024
1 incident

CPU container performance degradation from over-scheduling on GCE n1-standard...

Resolved Sep 24 at 01:51pm EDT

Between Monday, September 23rd 15:54 UTC and 15:36 UTC, a change was made to make GCP workers poll more quickly for containers, which caused them to pick up more CPU-only containers.

Between 13:48 UTC and 16:18 UTC on September 24th, CPU containers were allowed to be scheduled on GCP NVIDIA T4 workers, which are the n1 class of GCE VMs.

Together, these changes caused over-scheduling of CPU work onto GCP T4 workers and significantly degraded the performance of the hosted containers (whether CP...

Sep 13, 2024
1 incident

Container filesystem 'file not found' bug

Resolved Sep 13 at 01:59pm EDT

At 12:57 UTC, a container filesystem change rolled out across workers, causing spurious 'file not found' (ENOENT) errors against the container's rootfs.

We identified the bad commit and rolled back to prevent new workers from picking up the bug.

We then worked to roll over all running production workers that were affected by the bad change and to scale up new workers.

All affected workers have been quarantined, so new containers will start on new, healthy workers.

We a...

Sep 06, 2024
1 incident

CPU functions on GCP is down

Downtime

Resolved Sep 06 at 11:05am EDT

CPU functions on GCP recovered.

Sep 05, 2024
1 incident

CPU functions on GCP is down

Downtime

Resolved Sep 05 at 10:09am EDT

CPU functions on GCP recovered.

Sep 04, 2024
1 incident

CPU functions on GCP is down

Downtime

Resolved Sep 04 at 12:18pm EDT

CPU functions on GCP recovered.

August 2024
No incidents reported