Amazon announced that its cloud computing division, Amazon Web Services (AWS), had fully resumed normal service on Monday afternoon after a widespread outage disrupted thousands of websites and apps across the globe, from Snapchat to Reddit.
Although service had been restored, Amazon stated that “some AWS services had a backlog of messages that they will finish processing over the next few hours.”
The outage interrupted online operations for businesses and individuals worldwide, leaving many unable to perform basic digital tasks such as making payments or rescheduling flights.
Users continued to report difficulties accessing apps like Zoom and Venmo well into the afternoon.
Experts described the event as the most significant internet failure since the previous year’s CrowdStrike glitch, which disabled technology systems in hospitals, airports, and banks.
It marked the third time in five years that AWS’s northern Virginia data hub, known as US-EAST-1, triggered a widespread breakdown. Amazon has not yet explained why that particular location remains vulnerable to recurring problems.
The disruption was linked to issues with the Domain Name System (DNS), which prevented applications from locating the correct AWS DynamoDB API addresses used for storing user data.
AWS later confirmed that the root cause lay in “an underlying subsystem that monitors the health of its network load balancers,” affecting the “EC2 internal network.”
By around 3 p.m. PT (2200 GMT), Amazon reported that “all AWS services returned to normal operations.”
Cornell University professor Ken Birman said that developers need to implement stronger resilience measures to prevent similar failures.
“When people cut costs and cut corners to try to get an application up, and then forget that they skipped that last step and didn’t really protect against an outage, those companies are the ones who really ought to be scrutinized later,” he told Reuters.
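One form the resilience Birman describes could take is a client that falls back to another region when the primary endpoint's hostname stops resolving, which is the failure mode reported in this outage. The sketch below is illustrative only: the two hostnames follow the public DynamoDB regional endpoint pattern, but the choice of fallback region, the ordering, and the helper names `resolves` and `pick_endpoint` are assumptions, not anything Amazon or Birman has recommended.

```python
import socket
from typing import Callable, Optional, Sequence

# Assumed priority order: primary region first, then a fallback region.
# Which regions an application should use is a deployment decision.
ENDPOINTS = (
    "dynamodb.us-east-1.amazonaws.com",
    "dynamodb.us-west-2.amazonaws.com",
)


def resolves(host: str) -> bool:
    """Return True if the hostname currently resolves through DNS."""
    try:
        socket.getaddrinfo(host, 443)
        return True
    except socket.gaierror:
        # Raised when the name cannot be resolved.
        return False


def pick_endpoint(
    endpoints: Sequence[str] = ENDPOINTS,
    check: Callable[[str], bool] = resolves,
) -> Optional[str]:
    """Return the first endpoint that passes the check, or None if none do."""
    for host in endpoints:
        if check(host):
            return host
    return None
```

A fallback like this only helps if the data is also available in the second region, for example through cross-region replication, and a hostname that resolves is not proof that the service behind it is healthy.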
A Reminder of the Fragile Nature of Global Cloud Systems
AWS, the world’s largest cloud provider, supports governments, businesses, and individuals by hosting data storage and digital infrastructure.
Outages at its US-EAST-1 data center—also hit in 2020 and 2021—can ripple across the internet, disrupting financial services, airlines, gaming, and communication apps.
“This outage once again highlights the dependency we have on relatively fragile infrastructures,” said Jake Moore of cybersecurity firm ESET.
In the UK, services from Lloyds Bank, Bank of Scotland, Vodafone, BT, and HMRC were all affected. Ookla recorded more than four million incident reports tied to the issue.
“For major businesses, hours of cloud downtime translate to millions in lost productivity and revenue,” said Ryan Griffin of McGill and Partners.
Apps like Reddit, Roblox, Snapchat, and Duolingo were among those impacted, alongside Coinbase, Robinhood, and Amazon’s own Alexa and Prime Video. Despite the disruption, Amazon shares closed up 1.6% at $216.48.