Outage list updated

June 30, 2005
Filed under: Work 

As Alex points out, the clusters are finally coming. I’ve updated the outage schedule to reflect the first round of services being moved over. Everything in this round should be transparent: it’ll just be DNS changes, so you probably won’t even notice it happening.

ftp-staging will be down for an hour on Friday

June 24, 2005
Filed under: Work 

We will be moving the ftp-staging server physically from one colo to the other (since it’s the only machine we have with enough disk space) at or shortly after 10:00am on Friday. If all goes well, it’ll be up and running again at the new location within 15 minutes, but we’re allowing a window of about an hour in case of any unforeseen circumstances. This could potentially disrupt automated nightly build uploads from tinderboxes.

You can continue to watch http://nagios.mozilla.org/outages/ for up-to-the-minute information about the moves.

Multiple outages planned June 20-24

June 12, 2005
Filed under: Work 

The Mozilla Foundation has had much of its server infrastructure hosted by Meer.net since we left AOL. Meer.net has set up a new colocation facility a few miles away from the one we were originally using. The new facility is much nicer, with better access control and server racks that are much easier to work with. We’ve been in a “transition” phase for a few months, with some servers hosted in each facility, moving a few things here and there to the new facility as time and opportunity allow. We’re now down to the final servers to be moved. These have been put off because they’re all the end-user-facing servers running the various websites and developer services, and taking them down to move them means impacting end-user services.

But even those have to be moved sometime, and we’ll be attempting to move them during the week of June 20 to 24. During that week, there will be sporadic outages of various services, anywhere from a few minutes to a few hours each, as various services or servers get moved.

The following public-facing services will be affected:

Bonsai
Bugzilla
Despot
Hendrix
LXR
Tinderbox
CVS
CVS-mirror
CVS-www
IRC
mozilla.org email and mailing lists, mozillafoundation.org email
www.mozilla.org (2 of the 3 servers)
www.bugzilla.org
developer.mozilla.org (“Devmo”) (not developer-test)
wiki.mozilla.org
reporter.mozilla.org
planet.mozilla.org
ftp-staging
primary DNS service for almost every domain we own
Talkback services

We have lots of new servers. The plan is to rack a bunch of them in the new facility and start moving services over by staging them on the new machines, then switching DNS once the new machines are ready to take over. This can be done with minimal downtime for almost all of the above services. The exception is Talkback, which will probably need to be shut down and moved on its original servers because, as far as I know, it’s a pain in the butt to set up, and nobody wants to do it all over again :)
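If you want to see for yourself whether a given service has made the DNS switch yet, something along these lines should do it. This is only a rough sketch, not anything we actually run: the function names, the hostname, and the addresses are all made up for illustration (the addresses come from the ranges reserved for documentation), and the real resolver is swapped for a fake one in the demo so it works without touching the network.

```python
import socket
from typing import Callable, Iterable, Set


def resolve_ipv4(hostname: str) -> Set[str]:
    """Return the IPv4 addresses the local resolver currently gives for hostname."""
    infos = socket.getaddrinfo(hostname, None, socket.AF_INET, socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr); sockaddr[0] is the IP.
    return {info[4][0] for info in infos}


def cutover_status(
    hostname: str,
    old_addrs: Iterable[str],
    new_addrs: Iterable[str],
    resolver: Callable[[str], Set[str]] = resolve_ipv4,
) -> str:
    """Classify where a hostname currently points (hypothetical helper).

    "switched"     -> every address returned belongs to the new machines
    "mixed"        -> at least one new address, plus something else
    "not switched" -> only old (or old plus unrecognized) addresses
    "unknown"      -> nothing returned, or nothing we recognize
    """
    seen = resolver(hostname)
    old, new = set(old_addrs), set(new_addrs)
    if seen and seen <= new:
        return "switched"
    if seen & new:
        return "mixed"
    if seen & old:
        return "not switched"
    return "unknown"


if __name__ == "__main__":
    # Assumed example values; substitute the real old and new addresses.
    OLD = ["192.0.2.10"]
    NEW = ["198.51.100.10"]

    def fake_resolver(hostname: str) -> Set[str]:
        # Stand-in for resolve_ipv4 so the demo runs offline.
        return {"198.51.100.10"}

    print(cutover_status("service.example", OLD, NEW, resolver=fake_resolver))
```

Leave out the resolver argument to query your real resolver instead. Keep in mind that your resolver may hand back cached answers until the record’s TTL expires, so “not switched” right after a change doesn’t necessarily mean anything went wrong.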

Individual outages will be announced on nagios (I’ll put a link in the middle of the main page) somewhere between a couple hours and a day in advance of each outage.