Most people reading this probably already know that Mozilla utilizes a network of volunteers hosting downloads of Firefox (and other Mozilla products). All of these volunteer sites are listed in a database with a numeric weight that shows, relative to the other sites in the database, how much traffic they can handle. When you click the download link for Firefox on the mozilla.com site, you get sent to download.mozilla.org, which picks one of the sites out of that list at random (the chance of a given site getting picked is its weight divided by the sum of the weights of all of the available sites) and redirects you to that site to download the file.
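To make the selection rule concrete, here is a minimal sketch of a weighted random pick. It is not the actual download.mozilla.org code; the hostnames, the weights, and the `pick_mirror` function are all made up for illustration, and it assumes the weights are non-negative integers.

```python
import random

# Hypothetical mirror list: (hostname, weight). These names and weights are
# invented for the example and do not describe real mirrors.
MIRRORS = [
    ("mirror-a.example.org", 100),
    ("mirror-b.example.org", 300),
    ("mirror-c.example.org", 600),
]


def pick_mirror(mirrors, rng=random):
    """Pick one hostname with probability weight / sum(all weights)."""
    total = sum(weight for _, weight in mirrors)
    if total <= 0:
        raise ValueError("no mirror capacity available")
    # Choose an integer in [0, total) and walk the list until the running
    # sum of weights passes it. A mirror with weight w "owns" w of the slots.
    point = rng.randrange(total)
    running = 0
    for host, weight in mirrors:
        running += weight
        if point < running:
            return host
    raise AssertionError("unreachable when weights are non-negative")


if __name__ == "__main__":
    print(pick_mirror(MIRRORS))
```

With the weights above, mirror-c should receive roughly 60% of the redirects (600 out of 1000), mirror-b about 30%, and mirror-a about 10%.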
Think about the sheer number of people using Firefox these days (these 9-month-old stats say we have 60 million active daily users, and I’m sure that number has grown since then). Add to that Firefox’s built-in application update functionality, which notifies users that a new version is available and installs it for them. Together, that means when we release a security update, we’re going to have at least 60 million downloads of it (mostly via the automatic update service) within the first 24 hours after release. The amount of bandwidth required to host that many downloads in one day is staggering. We don’t have that much bandwidth available in Mozilla’s datacenters, which is why we rely on our network of mirror sites for the downloads. Each one of these sites may only be able to handle a small number of downloads, but when you add them all together, there’s a lot more capacity than our datacenters have.
Recently, as the number of Firefox users has continued to grow, even our network of download mirror sites has started to feel the pinch from the sheer volume of downloads during our security releases. During both the Firefox 3.0.6 and 3.0.7 releases, we ended up having to enact a throttling mechanism on the update service to slow the rate of download requests to a point where we weren’t completely burying all of our volunteer download sites, many of which also host downloads for other open source projects besides Firefox. When a copy of Firefox checked to see if there was an update available, a percentage of them were told there wasn’t one, even though there really was. If you manually picked “Check for Updates” from the Help menu, you always got it, though. It was only the automatic checks that were throttled. This mechanism is always our last resort. When we have a security update, we want it in the end user’s hands as quickly as possible. Delaying it for a day for a percentage of users is completely counter to that goal, so we try everything we can to avoid having to use it.
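The throttling behavior described above can be sketched in a few lines. This is only an illustration of the idea, not the update service’s real logic: the `should_offer_update` function, its parameters, and the use of a random draw per check are all assumptions.

```python
import random


def should_offer_update(is_manual_check, throttle_fraction, rng=random):
    """Decide whether an update check is told that an update exists.

    throttle_fraction is the (hypothetical) share of automatic checks that
    get told "no update available": 0.0 means no throttling, 1.0 means
    every automatic check is turned away.
    """
    # Manual "Check for Updates" requests are never throttled.
    if is_manual_check:
        return True
    # rng.random() is in [0.0, 1.0), so a fraction of 0.0 always offers the
    # update and a fraction of 1.0 never does.
    return rng.random() >= throttle_fraction


if __name__ == "__main__":
    rng = random.Random(42)
    offered = sum(should_offer_update(False, 0.25, rng) for _ in range(1000))
    print(f"{offered} of 1000 automatic checks were offered the update")
```

A random draw per check means the same browser may be throttled on one check and offered the update on the next, which spreads the load out over time rather than shutting anyone out permanently.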
On Wednesday this last week, when the fire drill started that became the Firefox 3.0.8 release, I sent out an email to all of our download mirror admins, warning them that Firefox 3.0.8 was imminent. I also pointed out how we had ended up needing to throttle updates during the 3.0.6 and 3.0.7 releases, and said that I still didn’t think we had enough capacity on the download mirror network to handle the release. Paraphrased: “If you know anyone who’s not mirroring Mozilla yet, and would like to, get them in touch with me.”
The community came through with flying colors. In the 48 hours following that email, we increased the capacity of the download mirror network by more than half. We left the peak traffic period of the first full day of Firefox 3.0.8 downloads about 6 to 8 hours ago, and I’m quite happy to report that we never had to throttle the updates at all for the Firefox 3.0.8 release. Every Firefox browser that checked in to see if there was an update available got its update notification. Not only that, but I had mirror admins telling me on IRC during the peak traffic hours, “hey, my site can still handle more traffic, go ahead and bump my weight up some.” Quite a welcome change from all the reports of dead servers during the last two releases.
Now, to be fair, we did have one other thing going for us. This release happened in the afternoon on a Friday. This effectively splits most of the download traffic between the home users on Saturday and the business users this coming Monday. Given the way we performed today, though, I’m pretty confident we could have handled the release happening on another weekday anyway.
But the weekend lull brings up one other point, which Mike Beltzner raised on IRC yesterday. As I mentioned above, when we have a security update, the goal is to get it into the hands of the end user as fast as possible. Mozilla’s QA signed off on the release in the early morning hours on Friday. The bits were pushed out to the download mirror sites shortly afterwards. Enough of those mirror sites had picked up the files by late morning to handle the normal release traffic, but we had to wait until mid-afternoon to release. That wait is a long-standing tradition rooted in capacity planning: we schedule the release at the time of day that will cause the fewest simultaneous downloads, to avoid overloading the download mirror network. Quite simply: the Pacific Ocean covers a lot of timezones. Enough of them that there’s a general lull in Internet traffic when it’s daytime over the Pacific Ocean. If we schedule the release to happen as the west coast of North America is going offline for the day, then we start picking up traffic one timezone at a time as Asia and then eastern Europe start coming online for the new day.
In the interest of getting the update out to the end users as quickly as possible, wouldn’t it be great if we had enough capacity on our download mirror network that we didn’t have to wait for the lull in Internet traffic caused by the Pacific Ocean to do a release? If we released in the early morning hours in the US, we’d have a large number of timezones online for the day at release time and would get a much larger percentage of the users in the first few hours after release.
So, how about it, community? You guys pulled off some awesomeness this last week. Let’s see if we can do a little bit more, and be able to handle a release without regard for the time of day it happens! If you know anyone who might be willing to host downloads for us, have them check out our Mirroring Instructions Page. All the info about how to set up and get included in the download pool is listed there.
Here are some numbers: For a normal Firefox security update, I’ve been saying that we need an availability rating of 35000 or higher to handle the release traffic. That number is the sum of the weights of the available mirrors that currently have the release files. There’s a loose perception that the number is tied to the Mbit/s of download bandwidth available, but that’s not really accurate, and it depends on a lot of factors. During both the Firefox 3.0.6 and 3.0.7 releases, that number was hovering around 26000 most of the time. For most of the Firefox 3.0.8 release so far, it’s been somewhere between 45000 and 55000. I’m betting we could probably handle the traffic that would be generated by a morning release if we got that above 65000 consistently.
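For anyone who wants to see how that rating is put together, here is a rough sketch. The `Mirror` record, its field names, and the sample mirrors are hypothetical; only the two thresholds (35000 for a normal release, and my guess of 65000 for a morning release) come from the numbers above.

```python
from dataclasses import dataclass

# Thresholds quoted in this post. The second one is a guess, not a measurement.
NORMAL_RELEASE_TARGET = 35000
MORNING_RELEASE_TARGET = 65000


@dataclass
class Mirror:
    # Hypothetical record layout; the real database schema is not shown here.
    host: str
    weight: int
    is_up: bool
    has_release_files: bool


def availability_rating(mirrors):
    """Sum the weights of mirrors that are up AND already have the files."""
    return sum(m.weight for m in mirrors if m.is_up and m.has_release_files)


def release_readiness(rating):
    """Classify a rating against the two thresholds above."""
    if rating >= MORNING_RELEASE_TARGET:
        return "enough for a morning release (by my guess)"
    if rating >= NORMAL_RELEASE_TARGET:
        return "enough for a normal release"
    return "below target"


if __name__ == "__main__":
    sample = [
        Mirror("mirror-a.example.org", 20000, True, True),
        Mirror("mirror-b.example.org", 15000, True, True),
        Mirror("mirror-c.example.org", 10000, False, True),  # down: not counted
        Mirror("mirror-d.example.org", 30000, True, False),  # still syncing: not counted
    ]
    rating = availability_rating(sample)
    print(rating, release_readiness(rating))
```

Note that a mirror only counts once it has finished picking up the release files, which is why the rating climbs over the hours after the bits are pushed out.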