Google Blames API Mishap for Global Cloud Outage That Disrupted Billions of Requests
<p data-start="222" data-end="433">A routine quota update turned into a global meltdown Thursday as Google Cloud&#8217;s API management system buckled, causing a ripple effect that froze core Google apps and third-party services for over three hours.</p>
<p data-start="435" data-end="643">From around 10:49 a.m. ET to 3:49 p.m. ET, the outage disrupted millions of users and impacted services used by everyone from remote workers and tech startups to social media addicts and enterprise engineers.</p>
<h2 data-start="645" data-end="705">What Went Wrong? Google&#8217;s API Platform Choked on Bad Data</h2>
<p data-start="707" data-end="928">Google says the culprit was “invalid automated quota data” that slipped through internal checks and reached global systems. That’s a fancy way of saying their API gatekeeper had a data hiccup—and no one caught it in time.</p>
<p data-start="930" data-end="1258">Only after Google’s engineering team bypassed the faulty quota validation logic did things start returning to normal. That workaround got most regions back online within two hours. But the us-central1 region (which covers major data centers in Iowa) took much longer, with some services stuck in cleanup mode even after the fix.</p>
<p data-start="1260" data-end="1316">Here&#8217;s what Google shared in their preliminary analysis:</p>
<ul data-start="1318" data-end="1512">
<li data-start="1318" data-end="1367">
<p data-start="1320" data-end="1367">The root cause: a flawed automated quota update</p>
</li>
<li data-start="1368" data-end="1436">
<p data-start="1370" data-end="1436">The symptom: 503 errors when external apps tried using Google APIs</p>
</li>
<li data-start="1437" data-end="1512">
<p data-start="1439" data-end="1512">The domino effect: widespread failure across Google services and partners</p>
</li>
</ul>
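<p>For developers on the receiving end of those 503s, the usual client-side mitigation is to retry with exponential backoff rather than hammer the API. A rough TypeScript sketch, with placeholder retry limits rather than Google-recommended values:</p>
<pre><code class="language-typescript">// Illustrative only: retry a request that keeps answering 503, backing off
// exponentially (plus a little jitter) between attempts.
async function fetchWithBackoff(url: string, maxRetries = 4): Promise&lt;Response&gt; {
  for (let attempt = 0; attempt &lt;= maxRetries; attempt++) {
    const res = await fetch(url);
    if (res.status !== 503) {
      return res; // success, or an error that retrying will not fix
    }
    // Wait 1s, 2s, 4s... (capped at 16s) before the next attempt.
    const delayMs = Math.min(1000 * 2 ** attempt, 16000) + Math.random() * 250;
    await new Promise((resolve) => setTimeout(resolve, delayMs));
  }
  throw new Error(`Still getting 503s after ${maxRetries + 1} attempts: ${url}`);
}
</code></pre>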
<p data-start="1514" data-end="1691">For such a tech giant, that kind of single point of failure seems&#8230; avoidable. But then again, this isn’t the first time an API change has thrown a wrench into a global system.</p>
<p data-start="1514" data-end="1691"><a href="https://www.theibulletin.com/wp-content/uploads/2025/06/google-cloud-data-center-outage-incident.jpg"><img class="aligncenter size-full wp-image-57674" src="https://www.theibulletin.com/wp-content/uploads/2025/06/google-cloud-data-center-outage-incident.jpg" alt="google-cloud-data-center-outage-incident" width="1046" height="816" /></a></p>
<h2 data-start="1693" data-end="1747">Who Got Hit the Hardest? It Wasn&#8217;t Just Gmail Users</h2>
<p data-start="1749" data-end="1917">The list of affected services reads like a who&#8217;s who of Google apps. Think Gmail, Calendar, Drive, Meet, Docs—basically, most tools the average office leans on daily.</p>
<p data-start="1919" data-end="2139">But the real damage was seen in third-party platforms that sit on top of Google Cloud infrastructure. Apps like Discord, Spotify, and Snapchat all reported issues ranging from slow performance to complete unavailability.</p>
<p data-start="2141" data-end="2207">For developers and backend engineers, the headache was even worse:</p>
<ul data-start="2209" data-end="2447">
<li data-start="2209" data-end="2285">
<p data-start="2211" data-end="2285">NPM, a popular JavaScript package manager, struggled to serve code modules</p>
</li>
<li data-start="2286" data-end="2369">
<p data-start="2288" data-end="2369">Firebase Studio, critical for app development and deployment, became inaccessible</p>
</li>
<li data-start="2370" data-end="2447">
<p data-start="2372" data-end="2447">Some Cloudflare services relying on Workers KV sputtered or failed outright</p>
</li>
</ul>
<p data-start="2449" data-end="2544">You could almost hear the collective groan of developers from different time zones all at once.</p>
<h2 data-start="2546" data-end="2595">The Chain Reaction: Cloudflare Takes a Hit Too</h2>
<p data-start="2597" data-end="2795">Cloudflare, a key backbone provider for countless websites and apps, also got caught in the blast radius. It wasn’t their infrastructure that failed directly—it was their dependency on Google Cloud.</p>
<p data-start="2797" data-end="3019">The failure hit Workers KV, Cloudflare&#8217;s key-value store used for everything from authentication tokens to CDN asset configuration. The result? Spikes in error rates, configuration failures, and disrupted service delivery.</p>
<p data-start="3021" data-end="3151">In a candid post-mortem, Cloudflare clarified the failure wasn&#8217;t security-related and no data was lost. But they did confirm that:</p>
<p data-start="3155" data-end="3289">“The underlying storage infrastructure used by our Workers KV service, backed by a third-party cloud provider, experienced an outage.”</p>
<p data-start="3291" data-end="3330">Translation? It was Google Cloud again.</p>
<h2 data-start="3332" data-end="3388">Google Admits Testing and Error Handling Were Lacking</h2>
<p data-start="3390" data-end="3600">In an unusual show of transparency, Google admitted that the system didn&#8217;t have sufficient safeguards in place to catch the problem early. The flawed data update should’ve been flagged in testing—but it wasn’t.</p>
<p data-start="3602" data-end="3758">That kind of lapse is raising eyebrows across the industry, particularly because of Google Cloud&#8217;s positioning as a platform for mission-critical workloads.</p>
<p data-start="3811" data-end="3868">“We lacked effective testing and error-handling systems.”</p>
<p data-start="3870" data-end="4018">For a company pushing AI, quantum computing, and enterprise-scale infrastructure, that’s a bit like a pilot admitting they forgot to check the fuel.</p>
<h2 data-start="4020" data-end="4075">By the Numbers: How Long Did It Last and What Broke?</h2>
<p data-start="4077" data-end="4185">Below is a rough breakdown of service impact and duration based on publicly available data and user reports:</p>
<div class="_tableContainer_16hzy_1">
<div class="_tableWrapper_16hzy_14 group flex w-fit flex-col-reverse" tabindex="-1">
<table class="w-fit min-w-(--thread-content-width)" data-start="4187" data-end="4865">
<thead>
<tr>
<th>Service Affected</th>
<th>Issue Start (ET)</th>
<th>Partial Recovery (ET)</th>
<th>Full Recovery (ET)</th>
<th>Approx. Duration</th>
</tr>
</thead>
<tbody>
<tr>
<td>Gmail</td>
<td>10:49 AM</td>
<td>~12:30 PM</td>
<td>3:49 PM</td>
<td>~5 hours</td>
</tr>
<tr>
<td>Google Drive</td>
<td>11:00 AM</td>
<td>~12:45 PM</td>
<td>3:49 PM</td>
<td>~4.5 hours</td>
</tr>
<tr>
<td>Firebase</td>
<td>10:55 AM</td>
<td>~12:40 PM</td>
<td>3:30 PM</td>
<td>~4.5 hours</td>
</tr>
<tr>
<td>Cloudflare Workers KV</td>
<td>11:15 AM</td>
<td>~1:15 PM</td>
<td>4:00 PM</td>
<td>~5 hours</td>
</tr>
<tr>
<td>Spotify / Snapchat</td>
<td>11:30 AM</td>
<td>~2:00 PM</td>
<td>4:00 PM</td>
<td>~4.5 hours</td>
</tr>
</tbody>
</table>
<div class="sticky end-(--thread-content-margin) h-0 self-end select-none">
<div class="absolute end-0 flex items-end"></div>
</div>
</div>
</div>
<p data-start="4867" data-end="4977">It’s worth noting that not all users were hit equally. Some services were spotty; others were totally offline.</p>
<h2 data-start="4979" data-end="5024">Fallout and Fixes: Will This Happen Again?</h2>
<p data-start="5026" data-end="5240">Google is still preparing a full incident report. But they’ve already pledged changes. That includes more robust data validation, better internal testing, and presumably more human oversight on system-wide updates.</p>
<p data-start="5242" data-end="5463">Cloudflare, for its part, has already taken action. They&#8217;re moving the core of their KV store to their own R2 object storage solution. It’s a long-term play to reduce dependency on third-party providers like Google Cloud.</p>
<p data-start="5465" data-end="5611">As one engineer half-jokingly put it on X (formerly Twitter), “If one quota update can break the internet, maybe we’ve over-optimized just a bit.”</p>
