Google Cloud CRITICAL
Cloud Networking: Up to 40% packet loss between affected zones
March 18, 2022 · 10:20 PM UTC – 10:28 PM UTC · Duration: 8min
Affected Services
Google Cloud Networking
Timeline
04:49 AM
Summary
On Friday, 18 March 2022 at 15:20 US/Pacific, Google Cloud Networking experienced intermittent packet loss for traffic between multiple cloud regions for a duration of 8 minutes. The issue was identified and mitigated automatically by 15:28 US/Pacific.
We understand this issue has affected our valued customers and users, and we apologize to those who were affected.
Root Cause
Google’s production backbone is a global network that enables connectivity for all user-facing traffic via Points of Presence (POPs) or internet exchanges.
A rare hardware failure of a component on the fiber paths from one of the transpacific gateway campuses in Google’s production backbone led to a decrease in available network bandwidth between the gateway and multiple edge locations, causing packet loss.
Remediation and Prevention
Google’s automated repair mechanisms detected the decrease in available network bandwidth on Friday, 18 March 2022 at 15:20 US/Pacific and automatically routed the traffic through alternate links. The traffic rerouting completed on Friday, 18 March 2022 at 15:28 US/Pacific, mitigating the issue.
While our automated mechanisms worked as intended and recovered the traffic without manual intervention, we recognize the impact this rare event had on our customers. We have been working on optimizing our global network to minimize the time spent automatically reconfiguring around failures like this (known as "convergence time"). We have made progress, and efforts to reduce convergence time further are ongoing. We continue to ensure that the current technology is optimally configured to minimize the frequency and severity of these issues.
Google is committed to quickly and continually improving our technology and operations to prevent service disruptions. We appreciate your patience and apologize again for the impact to your organization. We thank you for your business.
Detailed Description of Impact
Customers may have observed packet loss for traffic routed via transpacific links on Google's backbone. This could include traffic from or to any of the following cloud regions:
asia-east1
asia-northeast1
asia-northeast2
asia-northeast3
asia-southeast1
asia-southeast2
australia-southeast1
australia-southeast2
us-west1
us-central1
us-east1
us-east4
northamerica-northeast1
europe-west1
europe-west2
europe-west3
europe-west4
europe-west6
europe-west8
europe-west9
europe-central2
europe-north1
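To check whether a logged event could overlap this incident, the window and region list above can be encoded directly. The following is a minimal Python sketch, not a Google tool: the function name `possibly_affected` is hypothetical, and it assumes that US/Pacific was on daylight time (UTC-7) on 18 March 2022 and that log timestamps are timezone-aware. It cannot tell whether a flow was actually routed via transpacific links, so a True result only means the event is worth a closer look.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific observed daylight time (UTC-7) on 18 March 2022.
PACIFIC_DAYLIGHT = timezone(timedelta(hours=-7))

# Incident window as stated in this report (15:20 to 15:28 US/Pacific).
INCIDENT_START = datetime(2022, 3, 18, 15, 20, tzinfo=PACIFIC_DAYLIGHT)
INCIDENT_END = datetime(2022, 3, 18, 15, 28, tzinfo=PACIFIC_DAYLIGHT)

# Regions listed in the detailed description of impact.
AFFECTED_REGIONS = frozenset({
    "asia-east1", "asia-northeast1", "asia-northeast2", "asia-northeast3",
    "asia-southeast1", "asia-southeast2",
    "australia-southeast1", "australia-southeast2",
    "us-west1", "us-central1", "us-east1", "us-east4",
    "northamerica-northeast1",
    "europe-west1", "europe-west2", "europe-west3", "europe-west4",
    "europe-west6", "europe-west8", "europe-west9",
    "europe-central2", "europe-north1",
})


def possibly_affected(timestamp, source_region, destination_region):
    """Return True if the event falls inside the incident window and
    either endpoint is in a listed region (hypothetical helper)."""
    if timestamp.tzinfo is None:
        raise ValueError("timestamp must be timezone-aware")
    in_window = INCIDENT_START <= timestamp <= INCIDENT_END
    touches_region = bool({source_region, destination_region} & AFFECTED_REGIONS)
    return in_window and touches_region
```

For example, an event logged at 22:25 UTC on 18 March 2022 between asia-east1 and us-west1 falls inside the window, while one logged at 22:40 UTC does not.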
06:36 PM
We apologize for the inconvenience this service disruption/outage may have caused. We would like to provide some information about this incident below. Please note, this information is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you have experienced impact outside of what is listed below, please reach out to Google Support by opening a case using https://cloud.google.com/support.
(All Times US/Pacific)
Incident Start: 18 March 2022 15:20
Incident End: 18 March 2022 15:28
Duration: 8 minutes
Affected Services and Features:
Google Cloud Networking
Regions/Zones:
asia-east1, asia-northeast1, asia-northeast2, asia-northeast3, asia-southeast1, asia-southeast2,
australia-southeast1, australia-southeast2,
us-west1, us-central1, us-east1, us-east4, northamerica-northeast1,
europe-west1, europe-west2, europe-west3, europe-west4, europe-west6, europe-west8, europe-west9, europe-central2, europe-north1
Description:
Google Cloud Networking experienced intermittent packet loss for transit traffic in multiple cloud regions for 8 minutes. From preliminary analysis, the root cause is a hardware issue on a component of Google Cloud’s networking equipment. The issue was identified and mitigated automatically.
Customer Impact:
Customers may have observed packet loss for transit traffic in the above-mentioned cloud regions.
01:01 AM
We experienced an issue with Cloud Networking beginning on Friday, 2022-03-18 at 15:20 US/Pacific.
Self-diagnosis:
Customers may have experienced up to 40% packet loss between VMs in the following affected regions:
asia-east1, asia-northeast1, asia-northeast2, asia-northeast3, asia-southeast1, asia-southeast2, australia-southeast1, australia-southeast2, us-west1, us-central1, us-east1, us-east4, northamerica-northeast1, europe-west1, europe-west2, europe-west3, europe-west4, europe-west6, europe-west8, europe-west9, europe-southwest1, europe-central2, europe-north1
The issue has been resolved for all affected projects as of Friday, 2022-03-18 15:36 US/Pacific.
We thank you for your patience while we worked on resolving the issue.
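For self-diagnosis, the loss rate between two VMs can be estimated from probe counts and compared against the figure of up to 40% reported above. This is a minimal sketch that assumes you already have sent and received counts from a probing tool of your choice; `packet_loss_percent` is a hypothetical helper, not part of any Google Cloud API.

```python
def packet_loss_percent(sent, received):
    """Return the percentage of probes that were lost.

    `sent` and `received` are packet counts for one VM pair over the
    measurement interval (hypothetical inputs from your own tooling).
    """
    if sent <= 0:
        raise ValueError("sent must be positive")
    if not 0 <= received <= sent:
        raise ValueError("received must be between 0 and sent")
    return 100.0 * (sent - received) / sent
```

With 100 probes sent and 60 received, the helper returns 40.0, which matches the upper bound stated in this update.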
12:53 AM
Summary: Cloud Networking: Up to 40% packet loss between affected zones
Description: We are investigating a potential issue with Cloud Networking.
We will provide more information by Friday, 2022-03-18 17:25 US/Pacific.
Diagnosis: None at this time.
Workaround: None at this time.