Google Cloud MAJOR
Global: Cloud Monitoring Metrics may be unavailable or underreported for Cloud Pub/Sub
April 23, 2022 · 02:10 AM UTC – 05:21 PM UTC · Duration: 15h 11min
Affected Services
Google Cloud Pub/Sub
Timeline
09:05 PM
We apologize for the inconvenience this service disruption may have caused. The information below is based on our best knowledge at the time of posting and is subject to change as our investigation continues. If you experienced impact beyond what is listed below, please contact Google Support by opening a case at https://cloud.google.com/support.
(All Times US/Pacific)
Incident Start: 22 April 2022 19:10 PT
Incident End: 23 April 2022 10:21 PT
Duration: 15 hours, 11 minutes
Affected Services and Features:
Google Cloud Pub/Sub - Google Cloud Monitoring
Regions/Zones: Global Locale
Description:
Google Cloud Pub/Sub customers experienced issues with metrics in Google Cloud Monitoring for a duration of 15 hours, 11 minutes. The issue was caused by a configuration change to the backend for Cloud Monitoring that affected Cloud Pub/Sub metric recording. The issue was mitigated by reverting this change.
Customer Impact:
Cloud Pub/Sub metrics in Cloud Monitoring for times during the incident may be missing or underreported.
The metric values lost in this timeframe will not be recoverable.
Alerts based on these metrics might have fired erroneously, or failed to fire when they should have, during the incident.
Any auto-scaling of Google Kubernetes Engine (GKE) based on these metrics may not have functioned as expected during the time of the incident.
Cloud Pub/Sub administrative, publish, and subscribe operations were not affected by the incident.
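When reviewing dashboards or alert history that overlap this incident, it can help to treat Cloud Pub/Sub metric points inside the incident window as unreliable rather than as real drops in traffic. The following is a minimal sketch under stated assumptions: the window uses the start and end times given above, US/Pacific is modeled as a fixed UTC-7 offset (Pacific Daylight Time, in effect in April 2022), and `is_unreliable` and the sample points are hypothetical names for illustration, not part of any Google API.

```python
from datetime import datetime, timedelta, timezone

# Assumption: US/Pacific was on daylight time (UTC-7) on these dates.
PDT = timezone(timedelta(hours=-7), "PDT")

# Incident window as stated in the report (US/Pacific).
INCIDENT_START = datetime(2022, 4, 22, 19, 10, tzinfo=PDT)
INCIDENT_END = datetime(2022, 4, 23, 10, 21, tzinfo=PDT)


def is_unreliable(point_time: datetime) -> bool:
    """Return True if a timezone-aware timestamp falls inside the incident window."""
    return INCIDENT_START <= point_time <= INCIDENT_END


# Hypothetical (timestamp, value) samples exported from a metrics dashboard.
samples = [
    (datetime(2022, 4, 22, 23, 0, tzinfo=timezone.utc), 120),  # before the incident
    (datetime(2022, 4, 23, 9, 0, tzinfo=timezone.utc), 0),     # during the incident
    (datetime(2022, 4, 23, 18, 0, tzinfo=timezone.utc), 115),  # after the incident
]

# Keep only points recorded outside the window.
trusted = [(t, v) for t, v in samples if not is_unreliable(t)]

for t, v in trusted:
    print(t.isoformat(), v)
```

Because the comparison uses timezone-aware values, timestamps exported in UTC can be checked against the Pacific-time window without manual conversion.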
06:40 PM
The issue with Google Cloud Pub/Sub monitoring has been resolved for all affected projects as of Saturday, 2022-04-23 10:21 US/Pacific.
We will publish an analysis of this incident once we have completed our internal investigation.
We thank you for your patience while we worked on resolving the issue.
06:27 PM
Summary: Global: Cloud Monitoring Metrics may be unavailable or underreported for Cloud Pub/Sub
Description: We believe the issue with Google Cloud Pub/Sub monitoring was partially resolved as of 10:20 US/Pacific and are continuing to monitor the recovery of the service.
We do not have an ETA for full resolution at this point.
We will provide more information by Saturday, 2022-04-23 11:30 US/Pacific.
Diagnosis: Customers impacted by this issue may see Cloud Monitoring metrics for Cloud Pub/Sub that show missing or underreported values. Alerts based on these metrics may fire erroneously.
Workaround: Non-Cloud-Pub/Sub metrics and logs from publisher and subscriber clients can be used as a proxy to confirm that publishing and subscribing are still behaving as expected. For example, metrics available for clients running on GCE include:
instance/cpu/utilization
instance/network/received_bytes_count
instance/network/sent_bytes_count
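As an illustration of the workaround, the proxy metrics listed above could be turned into Cloud Monitoring filter strings and paired with a simple activity check. This is a sketch only: it assumes the GCE metrics carry the `compute.googleapis.com/` prefix and the `gce_instance` resource type, and `monitoring_filter` and `traffic_is_flowing` are hypothetical helpers, not part of any Google client library. Fetching the time series is omitted.

```python
# Proxy metrics named in the workaround.
PROXY_METRICS = [
    "instance/cpu/utilization",
    "instance/network/received_bytes_count",
    "instance/network/sent_bytes_count",
]

# Assumption: GCE instance metrics are published under this prefix.
GCE_METRIC_PREFIX = "compute.googleapis.com/"


def monitoring_filter(metric: str) -> str:
    """Build a Cloud Monitoring filter string for one GCE proxy metric."""
    return (
        f'metric.type = "{GCE_METRIC_PREFIX}{metric}" '
        'AND resource.type = "gce_instance"'
    )


def traffic_is_flowing(byte_counts: list[int], min_bytes: int = 1) -> bool:
    """Hypothetical heuristic: every sampled interval moved at least min_bytes."""
    return len(byte_counts) > 0 and all(c >= min_bytes for c in byte_counts)


for metric in PROXY_METRICS:
    print(monitoring_filter(metric))
```

A publisher VM whose `sent_bytes_count` stays above zero in every interval is a rough signal that publishing continues. It does not prove that messages are being delivered, so client logs remain the stronger check.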
05:53 PM
Summary: Global: Cloud Monitoring Metrics may be unavailable or underreported for Cloud Pub/Sub
Description: We are experiencing an issue with Google Cloud Pub/Sub that began on Friday, 2022-04-22 19:10 US/Pacific.
There is no known impact on Cloud Pub/Sub administrative, publish, or subscribe operations at this time.
Engineering is continuing to investigate the issue.
We will provide an update by Saturday, 2022-04-23 10:30 US/Pacific with current details.
We apologize to all who are affected by the disruption.
Diagnosis: Customers impacted by this issue may see Cloud Monitoring metrics for Cloud Pub/Sub that show missing or underreported values. Alerts based on these metrics may fire erroneously.
Workaround: Non-Cloud-Pub/Sub metrics and logs from publisher and subscriber clients can be used as a proxy to confirm that publishing and subscribing are still behaving as expected. For example, metrics available for clients running on GCE include:
instance/cpu/utilization
instance/network/received_bytes_count
instance/network/sent_bytes_count
05:27 PM
Summary: Global: Cloud Monitoring Metrics may be unavailable or underreported for Cloud Pub/Sub
Description: We are experiencing an issue with Google Cloud Pub/Sub that began on Friday, 2022-04-22 19:10 US/Pacific.
There is no known impact on Cloud Pub/Sub administrative, publish, or subscribe operations at this time.
Engineering is continuing to investigate the issue.
We will provide an update by Saturday, 2022-04-23 10:00 US/Pacific with current details.
We apologize to all who are affected by the disruption.
Diagnosis: Customers impacted by this issue may see Cloud Monitoring metrics for Cloud Pub/Sub that show missing or underreported values. Alerts based on these metrics may fire erroneously.
Workaround: Non-Cloud-Pub/Sub metrics and logs from publisher and subscriber clients can be used as a proxy to confirm that publishing and subscribing are still behaving as expected. For example, metrics available for clients running on GCE include:
instance/cpu/utilization
instance/network/received_bytes_count
instance/network/sent_bytes_count