[conspire] question about EIDE ATA hard drive to buy
Rick Moen
rick at linuxmafia.com
Thu Dec 21 13:17:07 PST 2006
Quoting Don Marti (dmarti at zgp.org):
> Nice of the manufacturers to put a thermometer on
> each drive, isn't it?
>
> smartd logs this, along with other drive self-test info.
> http://www.linuxjournal.com/article/6983
>
> Rick, what drive temperatures do you consider to be
> "needs work" and "needs immediate attention" points?
If I had to pick a number, I'd say that at around 40 degrees C I start
getting really concerned, and I'm much happier seeing 25-30, if possible
-- though that's off the top of my head.
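For anyone who'd rather script that rule of thumb than go by feel, here is a minimal Python sketch. It assumes the drive reports its temperature as SMART attribute 194 (Temperature_Celsius) in the usual `smartctl -A` table, with the raw value in the tenth column; some drives use a different attribute or none at all. The function names, the sample line, and the middle "watch" label are made up for illustration; only the 40 and 25-30 figures come from the numbers above.

```python
# Off-the-cuff thresholds from the message above, not vendor specifications.
CONCERN_C = 40    # around here, start getting really concerned
HAPPY_MAX_C = 30  # upper end of the preferred 25-30 degrees C range


def classify_temp(celsius):
    """Map a drive temperature in degrees C to a rough label."""
    if celsius >= CONCERN_C:
        return "concerned"
    if celsius <= HAPPY_MAX_C:
        return "happy"
    return "watch"  # hypothetical label for the in-between zone


def temp_from_smartctl(text):
    """Return the raw value of attribute 194 from `smartctl -A` output.

    Assumes the standard ten-column attribute table; returns None if the
    attribute is not present.
    """
    for line in text.splitlines():
        fields = line.split()
        if len(fields) >= 10 and fields[0] == "194":
            return int(fields[9])
    return None


# Illustrative sample line, not captured from a real drive.
SAMPLE = (
    "194 Temperature_Celsius     0x0022   117   097   000"
    "    Old_age   Always       -       33"
)

if __name__ == "__main__":
    temp = temp_from_smartctl(SAMPLE)
    print(temp, classify_temp(temp))
```

In practice you would feed `temp_from_smartctl` the output of `smartctl -A` for the device in question rather than a canned string, and pick thresholds that suit your own drives and enclosure.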
I realise this really _does_ make me a fossil, but: Since I started
doing this stuff long before we had Smartmontools to track internal
drive reporting, I got accustomed to doing this (well, part of it --
more below) by feel and by intuition -- sometimes, learning from my own
screw-ups, sometimes from others'.
One of those screw-ups was frying a 2.1GB IBM SCSI drive during a heat
wave in SOMA. My main workstation was a vertical mid-tower with two
SCSI drives mounted in an aluminum U-bracket, and I'd negligently
jammed the two drives against one another as close as possible to the
bottom of the U (in the two lowest bays), with the result that the
lower drive's electronics had only about a 5 mm air gap above the
bottom of the U-bracket -- and poor air circulation.
I was rebuilding the machine (stripped to the waist on account of the
very warm day -- over 100 degrees F, upstairs of the CoffeeNet), and had
the lid off (which with that design, as with many, actually _impairs_
air flow through the box), and left it that way as I was reloading the
OS from CD-ROM. I started getting gibberish at the end of install:
Directory readings had large amounts of random garbage. I quickly
realised that heat buildup had fried some of the electronics on the
lower drive's printed circuit board. When I felt the drive board and
metal surfaces, they were no longer even comfortable to touch (i.e.,
were not just warm but uncomfortably hot).
Having no shame, I called up IBM and described the symptom but not the
root cause, and got a free replacement under warranty.
I then took some metal-working tools and snipped a 120 mm more-or-less
round hole in the bottom of the U-bracket, and mounted a big muffin fan
blowing upwards. I also never, ever repeated the error of packing hard
drives that closely with that little clearance or air flow. And of
course I avoided minitowers and other too-tiny cases, as those simply
cannot handle significant heat sources (such as two 7200 RPM SCSI
drives) at all.
Last, I make a point of making sure any newly deployed drives are only
warm to the point of being still quite comfortable to touch, after
reaching heat equilibrium.
Getting back to the point about the Deathstar drives, intuition (and
having talked to a few of the involved parties at the time) suggests
that the bulk of the failures _may_ have been desktop users unaware of
the possibility of heat damage, who just blithely mounted a hot-running
7200 RPM drive into cases with poor heat-handling capacity, and who
never thought to leave the case open for the first hour and then feel
the drive case to check for unacceptable heat buildup. Of course, it
helps to have done this with normally-running drives to know how hot is
normal, but let's just say that if you're even tempted in any way to
pull your fingers back, it's way too hot, and you need to fix the
situation, somehow.
The evidence (somewhat) suggests that the Thailand facility made some
big batches of drives that were more vulnerable to damage from heat
buildup than most, thus affecting the 7200 RPM models in particular, but
it _also_ suggests that cautious people who paid attention to heat
problems were able to use those drives with no more than normal failure
rates.
--
Cheers,                 First they came for the verbs, and I said nothing, for
Rick Moen               verbing weirds language. Then, they arrival for the nouns
rick at linuxmafia.com  and I speech nothing, for I no verbs. - Peter Ellis