Google Cloud MAJOR
Inter-regional VM to VM packet loss towards regions in Europe
November 8, 2023 · 03:59 PM UTC – 04:07 PM UTC · Duration: 8min
Affected Services
Google BigQuery, Google Cloud Storage, Google Cloud Networking, Cloud Load Balancing
Timeline
08:10 AM
Mini Incident Report
We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Cloud Support using https://cloud.google.com/support or to Google Workspace Support using help article https://support.google.com/a/answer/1047213.
(All Times US/Pacific)
Incident Start: 8 November, 2023 07:59
Incident End: 8 November, 2023 08:07
Duration: 8 minutes
Affected Services and Features:
Multiple Google Cloud Products Including Google Cloud Storage (GCS)
Regions/Zones: After further analysis, it was confirmed that impact was limited to europe-west4 and europe-north1.
Description:
Multiple Google Cloud products experienced elevated error rates and packet loss for inter-regional traffic to and from europe-west4 and europe-north1 for a duration of 8 minutes. The exception was Google Cloud Storage (GCS), which was impacted for 72 minutes, with partial recovery after approximately 31 minutes.
From preliminary analysis, the root cause of the issue was a significant transatlantic fiber cut that was detected by internal monitoring and triggered re-routing of both routed and tunneled traffic to alternate links. After the event, some services were affected by a link capacity shortage until additional link capacity was made available.
Customer Impact:
Customers may have experienced elevated error rates and packet loss for inter-regional traffic to the affected European regions.
For Google Cloud Storage (GCS):
Customers may have experienced 500 errors for GCS resources in the affected regions.
GCS impact was partially mitigated at approximately 08:30 US/Pacific with full recovery by 09:10 US/Pacific, as capacity recovered.
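The timestamps in this report can be cross-checked with a short calculation. The following is a minimal sketch, not part of the original report: it assumes US/Pacific was on standard time (UTC-8) on 8 November 2023, and the helper names `pacific` and `minutes_between` are illustrative. Because the GCS recovery times are stated only approximately ("approximately 08:30", "by 09:10"), the computed GCS figures are approximate as well; the 71 minutes computed here is within a minute of the 72 minutes stated in the description.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific was on standard time (UTC-8) on 8 November 2023.
PACIFIC = timezone(timedelta(hours=-8), "PST")


def pacific(hour, minute):
    """Build a timezone-aware timestamp on 8 November 2023, US/Pacific."""
    return datetime(2023, 11, 8, hour, minute, tzinfo=PACIFIC)


def minutes_between(start, end):
    """Whole minutes elapsed between two timezone-aware timestamps."""
    return int((end - start).total_seconds() // 60)


# Times taken from the report (all US/Pacific).
incident_start = pacific(7, 59)
incident_end = pacific(8, 7)
gcs_partial = pacific(8, 30)  # "approximately 08:30"
gcs_full = pacific(9, 10)     # "by 09:10"

# Convert the incident window to UTC, as shown in the page header.
start_utc = incident_start.astimezone(timezone.utc)
end_utc = incident_end.astimezone(timezone.utc)

print(start_utc.strftime("%I:%M %p UTC"))             # 03:59 PM UTC
print(end_utc.strftime("%I:%M %p UTC"))               # 04:07 PM UTC
print(minutes_between(incident_start, incident_end))  # 8
print(minutes_between(incident_start, gcs_partial))   # 31
print(minutes_between(incident_start, gcs_full))      # 71
```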
05:30 PM
The issue with Cloud Load Balancing, Google BigQuery, Google Cloud Networking, and Google Cloud Storage has been resolved for all affected users.
The issue was caused by a fiber cut, which was detected by our internal alerting. Our automatic repair mechanisms then rerouted the traffic to alternate links.
The issue started at 07:59 US/Pacific and the impact for Cloud Load Balancing, Google BigQuery, and Google Cloud Networking ended at 08:07 US/Pacific.
The impact for Google Cloud Storage ended at around 08:30 US/Pacific.
We thank you for your patience while we worked on resolving the issue.
04:51 PM
Summary: Inter-regional VM to VM packet loss towards regions in Europe
Description: Our engineering team has completed mitigation work, and our internal monitoring shows full recovery of cloud services.
Our engineering team is closely monitoring to ensure full recovery.
We will provide more information by Wednesday, 2023-11-08 09:30 US/Pacific.
Diagnosis: Affected customers would experience VM to VM packet loss towards regions in Europe.
Workaround: None at this time
04:41 PM
Summary: Inter-continental user facing packet loss between Europe and North America
Description: We are experiencing an issue with Google Cloud Networking, Cloud Load Balancing, and Google Cloud Storage.
Our engineering team continues to investigate the issue.
We will provide an update by Wednesday, 2023-11-08 09:45 US/Pacific with current details.
We apologize to all who are affected by the disruption.
Diagnosis: None at this time
Workaround: None at this time
04:37 PM
Summary: Inter-continental user facing packet loss between Europe and North America
Description: We are experiencing an issue with Google Cloud Networking, Cloud Load Balancing, and Google Cloud Storage.
Our engineering team continues to investigate the issue.
We will provide an update by Wednesday, 2023-11-08 09:55 US/Pacific with current details.
We apologize to all who are affected by the disruption.
Diagnosis: None at this time
Workaround: None at this time