Thursday, December 02, 2004

The Server has Died ...

... through no fault of mine (sort of). the day before and the day of thanksgiving i was working on keeping the server running. we all knew it was on its last legs and had been building a replacement server ... actually, we had been building up a full network core replacement. due to some of the problems we've had, we are outright replacing the entire structure. everything will be redone from scratch. all of this was on schedule to happen during the christmas break. then the current server crashed. the one that houses our mail and our network logon information. yeah. ouch.

we brought it back up and found that the filesystem was severely corrupted. it appears that as it crashed, it basically scribbled on parts of the hard drive. imagine trying to sort out and read 10 pages of contract-type fine print after a three-year-old has drawn some lovely pictures on it ...

anyway, we broke the mirror (the hard drive array was set up to have two identical copies; mirror is the normal term for this) and placed one half in another machine for a filesystem check while we brought the server up with the other half of the mirror. the filesystem check took more than three days. during this time, we built up a new server and began transferring services to it. as we were transferring, the old server continued to display corrupt filesystem messages and to periodically crash. since i am the "security specialist" *and* "data recovery specialist", guess who had the bulk of the complex recovery of the security database ... yeah ... that was all day wednesday and a good chunk of thursday morning. after we finished and decommissioned the old server, we finally got the results of the filesystem scan (three days after starting it), which said that due to the massive number and severity of the errors corrected, we should really rerun the scan to catch any errors that could have been missed.

we didn't bother.

after moving all services to the new server and removing the old server from the security configuration, we started one last search through the old one to see if we could find anything that was missed before. during this search, the server crashed again and this time wouldn't come back online.

so the old server is now officially dead.

and the new server is scheduled for wiping in less than a month when we bring the new network core online.

was all of this a waste of time?

--- The Dancer in the Shadows


