When a Typo Knocked Down Amazon’s S3 – Backbone of Much of the Internet

Rafia Shaikh • Mar 3, 2017 at 10:37am EST

Amazon Web Services suffered a major outage earlier this week, affecting a number of websites and online services. The tech giant is now blaming the massive internet outage caused by its S3 web service on a typo...

How an Amazon typo broke the internet

The failure of S3 (Simple Storage Service), which is known for its track record of availability, took everyone by surprise. It knocked out a huge number of services and sites, including Quora, Apple's iCloud services, Trello, and others. Not much was known at the time of the outage, but Amazon has now published a blog post detailing exactly what caused it.

Amazon said that at the time of the outage its S3 team was trying to diagnose why the billing service for S3 was running slowly. During this process, an engineer ran a command with an incorrectly entered input, which removed a larger set of servers than intended.

"Unfortunately, one of the inputs to the command was entered incorrectly and a larger set of servers was removed than intended. The servers that were inadvertently removed supported two other S3 subsystems. One of these subsystems, the index subsystem, manages the metadata and location information of all S3 objects in the region. This subsystem is necessary to serve all GET, LIST, PUT, and DELETE requests."
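To make the failure mode concrete, here is a minimal Python sketch of how one incorrectly entered input can widen the scope of a removal command, and how a capacity floor could reject it. It is an illustration only, not a description of Amazon's tooling: the `Subsystem` class, the `min_servers` floor, the prefix-based selection, and the server names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Subsystem:
    """A hypothetical pool of servers backing one subsystem."""
    name: str
    servers: set[str]
    min_servers: int  # assumed capacity floor, not taken from Amazon's post


class CapacityError(Exception):
    """Raised when a removal would leave a subsystem below its floor."""


def select_by_prefix(subsystem: Subsystem, prefix: str) -> set[str]:
    """Pick every server whose name starts with the given prefix."""
    return {s for s in subsystem.servers if s.startswith(prefix)}


def remove_servers(subsystem: Subsystem, targets: set[str]) -> set[str]:
    """Remove targets, refusing if too few servers would remain."""
    matched = subsystem.servers & targets
    remaining = len(subsystem.servers) - len(matched)
    if remaining < subsystem.min_servers:
        raise CapacityError(
            f"{subsystem.name}: removing {len(matched)} servers would leave "
            f"{remaining}, below the floor of {subsystem.min_servers}"
        )
    subsystem.servers -= matched
    return matched


if __name__ == "__main__":
    index = Subsystem(
        name="index",
        servers={f"idx-{i:02d}" for i in range(1, 11)},
        min_servers=8,
    )
    # Intended input: a prefix that matches a single server.
    print(remove_servers(index, select_by_prefix(index, "idx-01")))
    # Incorrect input: a shorter prefix that matches every remaining server.
    try:
        remove_servers(index, select_by_prefix(index, "idx-"))
    except CapacityError as err:
        print(f"refused: {err}")
```

In this sketch the second call is refused and the pool is left untouched, so the input error costs nothing. Without the floor check, the same call would empty the pool.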

The error required a restart that "took longer than expected."

Like those of other cloud providers, Amazon's S3 subsystems are designed with redundancy in mind, so that the removal or failure of servers has no customer impact. Engineers should therefore be able to take servers out of service without affecting the system. However, the company did not anticipate how long some services would take to restart after years of growth in S3.

"We have not completely restarted the index subsystem or the placement subsystem in our larger regions for many years. S3 has experienced massive growth over the last several years and the process of restarting these services and running the necessary safety checks to validate the integrity of the metadata took longer than expected."

While the issue only affected Amazon’s Northern Virginia region, it was enough to cause significant problems for a large number of websites and services.

Amazon apologized for the issue, saying the company is proud of its track record of availability with Amazon S3. "We know how critical this service is to our customers, their applications and end users, and their businesses. We will do everything we can to learn from this event and use it to improve our availability even further."

About the author: Rafia joined Wccftech in 2012 as a tech reporter. She is currently working on stories focusing on people and technologies that are turning Microsoft into a “company to watch” again. She is also responsible for collaborating with tech makers and e-commerce platforms to bring annoying but tempting deals to our readers.
