Previous incidents
Image builds are down
Resolved Sep 30 at 05:17pm EDT
Image builds recovered.
CPU container performance degradation from over-scheduling on GCE n1-standard...
Resolved Sep 24 at 01:51pm EDT
Between 15:54 UTC and 15:36 UTC on Monday, September 23rd, a change was made to make GCP workers poll more quickly for containers, which caused them to pick up more CPU-only containers.
Between 13:48 UTC and 16:18 UTC on September 24th, CPU containers were allowed to be scheduled on GCP NVIDIA T4 workers, which are the n1 class of GCE VMs.
Both of these changes combined to cause over-scheduling of CPU work onto GCP T4 workers and significantly degrade the performance of the hosted containers (whether CP...
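To illustrate what over-scheduling means in the incident above, here is a minimal sketch in Python. It is not Modal's scheduler: the `Worker` class, its fields, and the `enforce_capacity` flag are hypothetical names introduced only for this example. It shows how a worker that accepts CPU work without a capacity check ends up with more cores reserved than it physically has.

```python
from dataclasses import dataclass


@dataclass
class Worker:
    """Hypothetical worker with a fixed number of CPU cores (illustration only)."""

    cores: float
    reserved: float = 0.0

    def try_schedule(self, requested_cores: float, enforce_capacity: bool = True) -> bool:
        """Reserve cores for a container; refuse if capacity is enforced and exceeded."""
        if enforce_capacity and self.reserved + requested_cores > self.cores:
            return False
        self.reserved += requested_cores
        return True

    @property
    def oversubscription(self) -> float:
        """Ratio of reserved cores to physical cores; above 1.0 means over-scheduled."""
        return self.reserved / self.cores


def fill(worker: Worker, requests: list[float], enforce_capacity: bool) -> int:
    """Offer each request to the worker and return how many were accepted."""
    return sum(worker.try_schedule(r, enforce_capacity) for r in requests)
```

With capacity enforced, an 8-core worker offered six 2-core containers accepts four of them and stays at a ratio of 1.0. With the check disabled it accepts all six and reaches 1.5, so every container on that worker competes for CPU time.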
Container filesystem 'file not found' bug
Resolved Sep 13 at 01:59pm EDT
At 12:57 UTC, a container filesystem change rolled out across workers, which caused spurious 'file not found' (ENOENT) errors against the container's root filesystem (rootfs).
We identified the bad commit and rolled back to prevent new workers from picking up the bug.
We then worked to roll over all running production workers which were affected by the bad change and scale up new workers.
All affected workers have been quarantined and thus new containers will start on new, healthy workers.
We a...
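For readers unfamiliar with how the 'file not found' (ENOENT) errors in this incident surface in application code, here is a minimal Python sketch. The helper name `read_or_none` is hypothetical and not part of any Modal API. It relies only on standard-library behavior: a failed `open()` on a missing path raises `FileNotFoundError`, a subclass of `OSError` whose `errno` is `errno.ENOENT`.

```python
import errno


def read_or_none(path):
    """Return the file's bytes, or None if the OS reports ENOENT for the path."""
    try:
        with open(path, "rb") as f:
            return f.read()
    except OSError as exc:
        # ENOENT is the errno behind 'file not found'; re-raise anything else.
        if exc.errno == errno.ENOENT:
            return None
        raise
```

Code inside a container cannot tell from the error alone whether an ENOENT is spurious or the file is really missing, which is why a bug of this kind is disruptive.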
CPU functions on GCP are down
Resolved Sep 06 at 11:05am EDT
CPU functions on GCP recovered.
CPU functions on GCP are down
Resolved Sep 05 at 10:09am EDT
CPU functions on GCP recovered.
CPU functions on GCP are down
Resolved Sep 04 at 12:18pm EDT
CPU functions on GCP recovered.
Ephemeral apps and image builds are degraded.
Resolved Jul 25 at 02:50am EDT
Problems with trigger processing have caused issues in ephemeral Apps and modal.Image builds. We're investigating.
Modal function executions are degraded
Resolved Jul 16 at 02:43pm EDT
The system has recovered, and function executions should be proceeding as normal. The team is on standby for any further issues.
CPU functions on GCP, Web endpoints, and 1 other service are down
Resolved Jul 12 at 07:35pm EDT
Web endpoints recovered.