March 2008 Archives
Guys, the outage from 3/27/2008 23:45 to 3/28/2008 17:00 was due to a motherboard failure. More info is at http://status.gigo.com/ for those that care.
-jason
I've moved to using OpenFire for the jabber server. If you want to use an @gigo.com jabber address, contact me.
Some comparisons:
- OpenFire only supports a single domain. ejabberd supported all domains I hosted.
- OpenFire is java, and sucks memory like crazy.
- OpenFire was insanely easy to set up - unlike ejabberd and erlang.
Alternate title: But damn, FreeBSD is pissing me off good this time around.
One thing about my job that really spoils me is the sheer size of it. Where I'm at, we have more machines down for maintenance than most folks have in service company-wide. We don't bundle up too many services on any one box - fewer things to go wrong when a box fails. And, boxes do fail - a fairly predictable number fail every day, like clockwork. I tell people I plan for failure and they look at me funny. But what I mean is, I know things will fail - we can build to accommodate it.
So, in advance of upgrade day (Saturday), I stopped at the colo this evening for a quick 5 minute hard drive swap. The intent: Swap the boot drive, so as to have an offline, separate bootable disk that we can fall back to if the upgrade sucks.
Unfortunately, the moment I removed one of the two drives that acts as the mirror for gigo.com, I/O froze on the system, entirely. Poof.
After I reset, the system did not want to boot. I broke the mirror, and the first drive of course was the one I removed. Getting it to boot without the mirror was impossible - booting /dev/ad4 instead of /dev/mirror/gm0 wasn't happening, since the boot drive was told to forget about being gm0.
After a lot of hassle I got the system to the point where it would start fsck'ing. The beauty of this is that the other mirrored file system was checking itself, end to end, while trying to fsck. After 3 hours, I aborted it and let the rest of the system come up.
Backups won't run tonight - that's the file system I'm fsck'ing now. Everything else is back to normal.
Mail from gigo.com (including mailing lists I host) is being blocked by Yahoo.com. Despite being an employee there, I've got no recourse to quickly get this resolved. I've filled out their form, we'll see how long it takes.
I hate telemarketers. And, despite the do not call list, or perhaps in spite of it, they've gotten much meaner and nastier as of late - bogus or missing caller id, no identification when the predictive dialers call you, and if you do get a human, the moment you utter DNC lists, actually _before you finish_, you get a click. One was so rude as to tell me *I* had the wrong number, before the click.
I finally ran across the device at http://interceptorid.com/. It is known by a few different names but is really the same device. Plug it into the phone line between your phone and the wall. It intercepts all phone calls, finds the caller id after that first ring, then either sends the call to your phone, or to an answering machine. For my own setup, calls with valid caller id, coming from "local" area codes (basically, northern California) all ring my phone directly; *everything* else (especially toll free #'s) rings the answering machine instead. Best part is, those screened calls don't ring me at all.
Good stuff, but hard to find at this time - looks like they are preparing to change the design some. This was definitely a version 1.0 product - a bit clunky. But, it definitely works as advertised.
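For the curious, the screening rule I described for my own setup boils down to something like the Python sketch below. This is only my guess at the logic, not the device's actual firmware: the `route_call` function, the `LOCAL_AREA_CODES` set, and the specific area codes in it are all made up for illustration.

```python
# Hypothetical sketch of the screening rule described in this post.
# Nothing here comes from the device itself; names and area codes are assumptions.

# Illustrative set of "local" (northern California) area codes.
LOCAL_AREA_CODES = {"408", "415", "510", "650", "707", "916", "925"}


def route_call(caller_id):
    """Decide where an incoming call goes: "phone" or "answering_machine".

    caller_id is the number as a string, or None when caller id is
    missing or blocked.
    """
    if not caller_id:
        # No caller id at all: screen it.
        return "answering_machine"

    # Keep only the digits so "408-555-0100" and "(408) 555-0100" match.
    digits = "".join(ch for ch in caller_id if ch.isdigit())

    # Drop a leading US country code, e.g. "1-800-...".
    if len(digits) == 11 and digits.startswith("1"):
        digits = digits[1:]

    # Anything that is not a 10 digit number is treated as bogus caller id.
    if len(digits) != 10:
        return "answering_machine"

    # Valid caller id from a local area code rings the phone directly.
    if digits[:3] in LOCAL_AREA_CODES:
        return "phone"

    # Everything else (toll free numbers included) goes to the machine.
    return "answering_machine"
```

With this rule, a call from "408-555-0100" would ring the phone, while "1-800-555-0199" or a call with no caller id would land on the answering machine.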
In the event gigo.com is down and not coming back up I've created http://status.gigo.com. You might want to bookmark it.
If you have an RSS reader, consider bookmarking the RSS feed.
It will in particular come in handy during the maintenance planned for 3/22/08 and 3/23/08. :-)
I am looking at upgrading from FreeBSD 6 to FreeBSD 7. Unfortunately this means downtime. Additionally, as I'll be moving to a 64 bit OS, I can't just build the "next" gigo.com at home without buying a 64 bit capable spare machine that's only gonna be needed for a few days.
What this means is, I need to actually bring gigo.com down in a big way to do this upgrade. I expect it to take a weekend.
What I'm proposing is 3/22 to 3/23 being declared as "maintenance". I'll obviously try to limit how long mail and web are down, but this upgrade is unfortunately going to take time. If this time does not work for you, please let me know. I expect 1 day of major impact, 1 day of minor impact.
The priority order on what I'd get back up and running would be:
- firewalls, dns, ssh (then work from a hotel)
- mailing lists (delayed, until brought back up)
- greylisting, spamd, regular mail, imap (delayed, until brought back up)
- mysql, web, webmail (flat out offline until brought back up - sorry)
- irc, jabber
- bitlbee, rsync
- nagios
I apologize that this is so soon after last August's update - unfortunately, FreeBSD 7 was only just now released. Minor upgrades are not nearly as big of a deal (usually just a minor install and a reboot). But a major upgrade, those are a bit more painful (especially changing from 32 bit to 64 bit at the same time).
Before anyone asks: Yes, in theory, I could move *everything* to another site, somewhere else, maybe even volunteered space, but the overhead in doing so is too much, for the amount of stuff here. Given my limited free time, that's not an option. But, thanks in advance for thinking about it.
Gigo crashed. Back up now...
