[conspire] Server supermicro 1U 5013C-MT Raid1

Daniel Gimpelevich daniel at gimpelevich.san-francisco.ca.us
Mon Sep 25 16:26:13 PDT 2006

On Mon, 25 Sep 2006 15:17:24 -0700, Rick Moen wrote:

> Daniel Gimpelevich <daniel at gimpelevich.san-francisco.ca.us> wrote:
>> Assuming there is no such diagnostic software, which would be a
>> godsend to you if it exists, this would be going from the extremely
>> pro-forma to the extremely drastic. There is a middle ground between
>> resetting the CMOS to factory defaults and reflashing the BIOS with
>> the very same image it currently contains, usually in the form of a
>> jumper on the motherboard.
> Believe it or not, _if_ the problem is at root a CMOS one, reflashing the
> BIOS often _will_ fix it.  Yes, you would have to re-do all your CMOS 
> settings; that is A Good Thing for most folk.

Not only am I well aware of that fact, but I often point it out myself. I
thought I was rather careful not to negate that assertion. I was saying
that it is a rather drastic way of accomplishing only that, and that
resetting to factory defaults is a possibly ineffective one. I then
proposed a way with neither disadvantage.

>> That would be the first thing I would try in the event I suspected a
>> scrambled CMOS, which is not the first thing that comes to mind given the
>> symptoms described. It seems much more plausible for the RAM to be at
>> fault, perhaps either badly seated in its socket(s) or not fully
>> compatible with the motherboard, especially if it's heterogeneous. 
> That's not the way bad RAM generally manifests in my experience:  You 
> more typically just get kernel panics.  However, I suppose it's
> possible.  Unfortunately, Stephan didn't give us a lot to go on, just
> that it involves "single bit error location" or "multiple bit error
> location" messages "before bios is really starting", and that Stephan
> "thinks it has to do with the Bios".

Kernel panics apparently only stem from really, really bad RAM, because
none of the bad RAM I've ever seen has generated a single one. Sometimes
segfaults occur, but typically the symptom is nothing more than
inexplicably corrupt data.

> (I've said this so often that I'm basically worn out, and tired of
> fruitlessly trying to teach it to people:  Give us the _raw symptoms_ of
> your problems in chronological order, dammit:  When, by contrast, you
> give us your interpretations, you are severely shooting in the foot our
> ability to help you.)

You can say THAT again. ;->

> If the user ends up having reason to suspect RAM problems, then there
> are any number of very fine "live CD" Linux distributions that furnish
> memtest86, such as Ultimate Boot CD:  http://ubcd.sourceforge.net/

This assumes that "RAM problems" is synonymous with "bad RAM," which it is
not. I was not suggesting that the RAM was bad, although there wasn't
enough info to rule that out, either.
