Atis outage - the whole story

« on: January 16, 2011, 10:02:45 PM »
Here is the full story, for anyone who is interested in ugly technical details.....

The server went down the evening of 1-3, but interestingly so did my home computer (on the same power backup). The power backup was fried, but the house computer came back up when plugged into the wall. The ATIS server did not. I had replaced the ATIS server power supply about a year ago, so I figured the cheap Chinese replacement couldn't handle whatever the power backup did. So I replaced the power supply again, but the server would only boot intermittently. After a few of these power cycles, it wouldn't boot at all. In the afternoon of 1-4 I took it to the local techs, and a quick check confirmed the motherboard was dead.

To be honest, I don't know if the power backup fried the board, or a bad power supply fried both the power backup and the board. In the end, it didn't matter. It was dead.


Thank goodness the data was OK, as the disk was good.

I bought a new server on 1-5, installed the old drive alongside the new one, and began the process of rebuilding. It only took me a few days to install a new operating system and get my security, BIND and mail services running. But I ran into serious problems with the mailing list software (Mailman). So I punted on Mailman for the moment and worked on everything else. Within another day I had web services up. BTW - a tip for all the other systems folks reading along: the latest versions of Apache and PHP do NOT work together on Fedora 14 machines. Just a word to the wise (-: Within another day I had the database software upgraded, and that meant I could upgrade the bulletin board software and get that running. So for all intents and purposes the system was up in less than a week, except for Mailman.

Luckily I had a vacation day on 1-4, but the rest of the time I was working and commuting a total of 12 hours a day. So by that point I was running on about 4 hours of sleep a night.

The mailing list software was another matter. The language Mailman uses is called Python. Unfortunately, the current version of Mailman was never updated to work with the most current versions of Python. The previous versions of Mailman obviously didn't support the new Pythons either, so downgrading Python looked like the only answer. Unfortunately, a lot of system utilities, scripts, and other software use Python, and many of them make use of the newer features in the latest versions of the language. So downgrading ended up not being a possibility. I was stuck - no way to downgrade either piece of software, and no upgrades available that solved my problems.

So my only option was to run parallel copies of Python. I rounded up an old copy of Python that was compatible with Mailman, then built and installed it in a completely separate directory structure, outside of the normal system paths. Then I modified all the headers, makefiles, import statements, etc. in Mailman to use that parallel copy. At this point Mailman was working, but it required extensive additional tweaks inside a bunch of the other utilities it includes. In addition, there was some reconfiguration of both the old version of Python and Mailman. This ended up taking me a few days.
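
For anyone trying something similar, here is a rough sketch of one piece of that kind of job: pointing a script's first line (the "#!" header) at the parallel interpreter instead of the system one. This is only an illustration, not the exact changes made on the ATIS server. The install path /opt/python-old/bin/python and the function name repoint_shebang are made up for the example, and it is written for a current Python rather than the old one.

```python
# Hypothetical location of the parallel interpreter (an assumption, not from the post).
ALT_PYTHON = "/opt/python-old/bin/python"


def repoint_shebang(source, interpreter=ALT_PYTHON):
    """Return script text with a Python '#!' first line pointed at `interpreter`.

    Text whose first line is not a Python shebang is returned unchanged.
    """
    first, sep, rest = source.partition("\n")
    if not first.startswith("#!"):
        return source
    parts = first[2:].split()
    if not parts:
        return source
    # "#!/usr/bin/env python" style: skip the env wrapper and look at its argument.
    if parts[0].endswith("/env") and len(parts) > 1:
        parts = parts[1:]
    # Only touch lines whose program name mentions python (python, python2.4, ...).
    if "python" not in parts[0].rsplit("/", 1)[-1]:
        return source
    # Keep any interpreter arguments (e.g. -O) that followed the old path.
    return " ".join(["#!" + interpreter] + parts[1:]) + sep + rest


if __name__ == "__main__":
    print(repoint_shebang("#!/usr/bin/python\nimport sys\n"), end="")
```

In practice you would run something like this over every script in the Mailman tree, and only after backing the tree up. The makefiles and any hard-coded interpreter paths in configuration need the same treatment by hand.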

In the end - the system is a BUNCH faster, very, very up to date, and should serve us for years to come.


I think I am going to get more sleep now.....


Spencer
-------------------------------------------------
More tractors than time.....

RG8800

« Reply #1 on: January 17, 2011, 01:40:45 PM »
Thanks for the updates Spencer. And thanks for these forums too.
Ralph in Sask.