Now that our platforms are up and running as usual after yesterday’s outage, I thought it would be worth sharing a little more detail on what happened and why - and most importantly, how we’re learning from it.

This outage was triggered by the system that manages our global backbone network capacity. The backbone is the network Facebook has built to connect all our computing facilities together, which consists of tens of thousands of miles of fiber-optic cables crossing the globe and linking all our data centers.

Those data centers come in different forms. Some are massive buildings that house millions of machines that store data and run the heavy computational loads that keep our platforms running, and others are smaller facilities that connect our backbone network to the broader internet and the people using our platforms.

When you open one of our apps and load up your feed or messages, the app’s request for data travels from your device to the nearest facility, which then communicates directly over our backbone network to a larger data center. That’s where the information needed by your app gets retrieved and processed, and sent back over the network to your phone.

The data traffic between all these computing facilities is managed by routers, which figure out where to send all the incoming and outgoing data. And in the extensive day-to-day work of maintaining this infrastructure, our engineers often need to take part of the backbone offline for maintenance - perhaps repairing a fiber line, adding more capacity, or updating the software on the router itself.

This was the source of yesterday’s outage.