Google Cloud MAJOR

us-east1: Multiple Google Cloud Services Impacted

June 4, 2023 · 02:02 PM UTC – 03:45 PM UTC · Duration: 1h 43min

Affected Services

Google Compute Engine, Google Kubernetes Engine, Google Cloud Dataflow, Google Cloud SQL

Timeline

03:26 PM
Mini Incident Report
We apologize for the inconvenience this service disruption may have caused. We would like to provide some information about this incident below. Please note that this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support.
(All Times US/Pacific)
Incident Start: 4 June 2023 06:02
Incident End: 4 June 2023 07:45
Duration: 1 hour, 43 minutes
Affected Services and Features: Google Compute Engine, Google Kubernetes Engine, Google Cloud Dataflow, Google Cloud SQL
Regions/Zones: us-east1-b, us-east1-c, and us-east1-d
Description: Google Compute Engine experienced errors and elevated latency when creating, deleting, or updating instances. This also impacted Google Kubernetes Engine, Google Cloud Dataflow, and Google Cloud SQL. The issue started after a partial failure of cooling systems in the region. During the mitigation process, one of the internal infrastructure components used to manage compute resources was accidentally disabled. Google's engineering team will review the internal procedures to ensure mitigation processes are applied as intended.
Customer Impact:
Google Kubernetes Engine: Affected customers were unable to create, upgrade, or delete clusters, or create new node pools.
Google Compute Engine: Affected customers were unable to create or delete instances. However, instances that were already running were unaffected.
Google Cloud Dataflow: Affected customers experienced issues launching jobs (both batch and streaming) because instances for worker pools could not be launched. Existing jobs may have experienced slowness because worker pools could not scale up.
Google Cloud SQL: Affected customers were unable to create, recreate, or update instances.
04:18 PM
The issue with Google Cloud Dataflow, Google Cloud SQL, Google Compute Engine, Google Kubernetes Engine has been resolved for all affected users as of Sunday, 2023-06-04 08:14 US/Pacific. We thank you for your patience while we worked on resolving the issue.
03:52 PM
Summary: us-east1: Multiple Google Cloud Services Impacted
Description: Google Cloud experienced a cooling issue in one of its data centers. During mitigation, steps were taken that caused elevated latency and failures of instance creation operations in some zones in us-east1. Teams are investigating whether additional services are impacted by the event. Our engineering team continues to investigate the issue. We will provide an update by Sunday, 2023-06-04 09:00 US/Pacific with current details.
Diagnosis: GCE - Customers may see elevated latency and failures when creating instances in us-east1-b, us-east1-c, or us-east1-d. Dataflow - Customers may experience issues launching batch jobs, which will need to be relaunched. Streaming jobs may experience latency.
Workaround: GCE - Customers can create VMs in zone us-east1-a or another region. Dataflow - None at this time.
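The GCE workaround above (create VMs in us-east1-a or another region) can be sketched as a simple zone-fallback loop. This is an illustrative sketch only, not Google's tooling: `pick_zones`, `create_with_fallback`, and the `create_fn` callable are hypothetical names, and `create_fn` stands in for whatever instance-creation call a customer already uses. It is assumed to raise `RuntimeError` on failure.

```python
from typing import Callable, Iterable, List, Optional

# Zones named as affected in the incident updates.
AFFECTED_ZONES = frozenset({"us-east1-b", "us-east1-c", "us-east1-d"})


def pick_zones(candidates: Iterable[str]) -> List[str]:
    """Drop the affected zones from the candidate list, preserving order."""
    return [zone for zone in candidates if zone not in AFFECTED_ZONES]


def create_with_fallback(
    create_fn: Callable[[str], str], candidates: Iterable[str]
) -> Optional[str]:
    """Try each unaffected zone in order.

    create_fn is a hypothetical stand-in for the caller's own
    instance-creation routine; it is assumed to raise RuntimeError
    when creation fails. Returns its result for the first zone that
    succeeds, or None if every zone fails.
    """
    for zone in pick_zones(candidates):
        try:
            return create_fn(zone)
        except RuntimeError:
            continue  # move on to the next candidate zone
    return None
```

Listing us-east1-a first and a zone in another region second mirrors the order of the published workaround; the caller decides which other regions are acceptable.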
03:27 PM
Summary: us-east1: GCE instance creation failures
Description: Google Compute Engine experienced a cooling issue in one of its data centers. During mitigation, steps were taken that caused elevated latency and failures of instance creation operations in some zones in us-east1. Teams are investigating whether VM performance is impacted by the event. Our engineering team continues to investigate the issue. We will provide an update by Sunday, 2023-06-04 08:30 US/Pacific with current details. We apologize to all who are affected by the disruption.
Diagnosis: Customers may see elevated latency and failures when creating instances in us-east1-b, us-east1-c, or us-east1-d.
Workaround: Customers can create VMs in zone us-east1-a or another region.
02:40 PM
Summary: We are investigating a problem with GCE VM creation in the us-east1 region
Description: We are experiencing an issue with Google Compute Engine. Our engineering team continues to investigate the issue. We will provide an update by Sunday, 2023-06-04 07:15 US/Pacific with current details. We apologize to all who are affected by the disruption.
Diagnosis: None at this time.
Workaround: None at this time.