[conspire] Guy with the OpenSUSE server now has problems

Rick Moen rick at linuxmafia.com
Thu Jun 28 18:54:54 PDT 2012


Definite hardware failure.

----- Forwarded message from ??? ### <fwcarr at gmail.com> -----

Date: Thu, 28 Jun 2012 18:35:53 -0700
From: ??? ### <fwcarr at gmail.com>
To: Rick Moen <rick at linuxmafia.com>
Subject: Re: Future Installfests

Rick,

First, vital files (and almost all non-vital files) were backed up to a
USB drive before I made the trip to your place. The complete failure
occurred after re-backing up vital files to my desktop. I figured that if I
had to reload everything, doing so via the network vs a USB 2.0 drive was
preferable. Anything that may not have made it to the desktop should be on
the USB drive, and in the event of a worst-case scenario, just about
everything that is important is also either on optical disks or on the
systems of two friends.

No problem on the delay, and besides, I have progressed farther along since
that point. It now seems that two of the PATA drives are on their last
legs, and I've concluded that, as such, it's probably not worth the time
and effort to continue trying to salvage 'em.

I do understand your sentiment of "It's just that one learns over a period
of years that people tend to state their surmises, and that attempting to
do diagnosis on the basis of surmises (interpretations) rather than raw
diagnostic data tends to lead down dead-ends". Aptly said, and it brought
quite the smile to my face.

What is, what should be, what is not, what should not be, what could be a
possibility, what is the probability of 'what could be', what can be
eliminated, and so on, are all important considerations in failure
analysis.

When faced with such mysteries, I tend to gravitate towards an abbreviated
fault tree with either an inductive or deductive direction, depending on
the circumstances of the failure. A fault tree can lead you down a dead
end, but if approached correctly, even such dead ends should provide
direction to discovering the causal factors of the failure. Elimination
often speaks louder and clearer than any determination *based on simple
implication*. Mind you, it's not that I have never found myself embracing
an unsubstantiated supposition. What we are speaking of, in the vernacular
of the lowly technician, is the difference between a true troubleshooter
and a mere parts changer.

I was not clear on exactly where I was at the time I sent the e-mail. It's
the ole "what was clear as crystal to me was stated in a manner that was
clear as mud", something I do have a bad habit of doing. I had, at that
point, already tried a fresh install, with the RAID partitioning failing.
What I ended up with was 12.1, with all the appearances of being correctly
installed, but instead of two RAID 5 spans, I had two 128 GB primary
partitions and a whole lot of unallocated space. After two subsequent
attempts at installing, with the RAID 5 formatting failing, I figured that
this was a strong indication that something was amiss at a deeper level
than just a bad upgrade.

Yes, after the failure to boot, diagnostics did strongly indicate that root
files were corrupted, probably "/boot". I was able to bring up bash after
the failed boot and breezed through the error messages, a lot of such and
such tain't nowhere to be found or such and such tain't willing to load.
Add to that, that prior to the boot failure, many, many files were
systematically disappearing; the directories would still be there, but no
files. All of these were files that I had previously just copied.

SeaTools' indication that the drives were good, although not a true dead
end, did lead me down the proverbial scenic route. However, it wasn't a
path I followed blindly. With strong indications that something was
seriously wrong, such as the possibility of corrupted partition tables, I
decided to follow that route first.

After experiencing the same results with a live disk, I ran fdisk on the
drives and found "WARNING: GPT (GUID Partition Table) detected", but none
of the drives had the 0xEE GPT protective disk identifier. I don't know
how, when or where any GPT tables could have been introduced, as I avoid
GPT (yes, yes, wave of the future and all that, but with older hardware,
iffy OS support, and almost non-existent default boot-loader compatibility,
it's just been something that I haven't wanted to deal with). But I also
don't discount that it could have, indeed, been human error on my part.
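
A mismatch like the one described above (a GPT warning with no 0xEE entry
in the MBR) can be examined by looking at the raw sectors. The following is
only a minimal sketch in Python, assuming 512-byte sectors and working on a
synthetic in-memory image rather than a real drive. The function names
gpt_remnants and make_test_image are hypothetical, and nothing here is
meant to describe what fdisk itself checks.

```python
SECTOR = 512             # assumption: 512-byte logical sectors
GPT_MAGIC = b"EFI PART"  # signature at the start of a GPT header


def gpt_remnants(image: bytes) -> dict:
    """Report which MBR/GPT structures appear in a raw disk image."""
    mbr = image[:SECTOR]
    # Four 16-byte MBR partition entries start at offset 446;
    # the partition type is byte 4 of each entry.
    types = [mbr[446 + 16 * i + 4] for i in range(4)]
    return {
        "mbr_signature": mbr[510:512] == b"\x55\xaa",
        "protective_0xee": 0xEE in types,
        "primary_gpt_header": image[SECTOR:SECTOR + 8] == GPT_MAGIC,
        "backup_gpt_header": image[-SECTOR:][:8] == GPT_MAGIC,
    }


def make_test_image(sectors: int = 64) -> bytes:
    """Synthetic image: msdos label in sector 0, stale GPT headers left over."""
    img = bytearray(sectors * SECTOR)
    img[446 + 4] = 0x83                   # ordinary Linux partition, no 0xEE
    img[510:512] = b"\x55\xaa"            # MBR boot signature
    img[SECTOR:SECTOR + 8] = GPT_MAGIC    # stale primary header at LBA 1
    img[-SECTOR:-SECTOR + 8] = GPT_MAGIC  # stale backup header at last LBA
    return bytes(img)


if __name__ == "__main__":
    for name, present in gpt_remnants(make_test_image()).items():
        print(f"{name}: {present}")
```

On the synthetic image, the report shows a valid MBR signature and both GPT
header signatures, but no 0xEE entry, which is the same combination
described in the paragraph above.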

I used parted (mklabel msdos) to clear all drives (knowing that this may
not completely clean all GPT remnants), and after still having problems
with the PATA drives, I started down the road of sector 0 zeroing and full
zeroing, which brought me to the drive problems.
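
Why zeroing sector 0 alone may leave GPT remnants behind can be sketched
as follows. This is an illustration only, assuming 512-byte sectors and the
common layout of a GPT header plus 32 sectors of partition entries at each
end of the disk. It works on an in-memory bytearray, not on a device, and
the helper names (has_gpt_header, zero_sectors, wipe_gpt, demo) are
hypothetical.

```python
SECTOR = 512             # assumption: 512-byte logical sectors
GPT_MAGIC = b"EFI PART"  # signature at the start of a GPT header
GPT_AREA = 33            # assumption: header + 32 sectors of entries


def has_gpt_header(image: bytearray) -> bool:
    """True if a GPT header signature sits at LBA 1 or at the last LBA."""
    primary = image[SECTOR:SECTOR + 8] == GPT_MAGIC
    backup = image[-SECTOR:][:8] == GPT_MAGIC
    return primary or backup


def zero_sectors(image: bytearray, first: int, count: int) -> None:
    """Overwrite `count` sectors starting at LBA `first` with zeros."""
    start = first * SECTOR
    image[start:start + count * SECTOR] = bytes(count * SECTOR)


def wipe_gpt(image: bytearray) -> None:
    """Zero the MBR plus the assumed primary and backup GPT areas."""
    total = len(image) // SECTOR
    zero_sectors(image, 0, 1 + GPT_AREA)             # LBA 0 through 33
    zero_sectors(image, total - GPT_AREA, GPT_AREA)  # last 33 sectors


def demo() -> tuple:
    """Compare sector 0 zeroing with wiping both GPT areas."""
    img = bytearray(128 * SECTOR)
    img[SECTOR:SECTOR + 8] = GPT_MAGIC    # primary header at LBA 1
    img[-SECTOR:-SECTOR + 8] = GPT_MAGIC  # backup header at last LBA
    zero_sectors(img, 0, 1)               # sector 0 zeroing only
    after_sector0 = has_gpt_header(img)
    wipe_gpt(img)
    after_wipe = has_gpt_header(img)
    return after_sector0, after_wipe


if __name__ == "__main__":
    print(demo())
```

In this sketch the header signatures survive sector 0 zeroing, because they
live at LBA 1 and at the end of the disk, and only disappear once both
areas are overwritten. A full zeroing of the drive covers both areas as
well.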

Shortly after sector 0 zeroing, two of the drives started showing FAIL
codes on the SeaTools short test. The 650 GB started showing bad sectors.
After re-allocating, and the drive passing after repairs, new bad sectors
would appear. At this point, I did a full zeroing of the drive, and
everything seemed in order, but subsequent tests would only show more bad
sectors. Additionally, the 500 GB, the drive for which DOS (SeaTools
booting) kept showing an Init Disk Error Reading Partition Table Sector 0
prior to the sector zeroing, started FAILING during zero writes, and the
DOS init partition reading error re-appeared. Strange thing is that both
drives readily formatted and showed as healthy in Windows.

As far as proper documentation, yes, yes, that has bitten me in the rear
before. But, truthfully, I knew how my drives were partitioned. What I did
notice, but didn't catch the meaning of when we upgraded, was that root was
showing as a 20 GB partition when it should have been a 140 GB partition.
What I found out during the fresh install was that even though I would
choose the option of letting me partition the drive and would place root on
the 650 GB, it would try to place a 20 GB root on one of the 1.5 TB drives,
and I wonder if that may have happened during the upgrade.

The 650 GB still has a few months left on its warranty, and I could RMA it.
I also have an identical 500 GB drive on which I damaged a controller-board
induction coil a few years back, so I probably could just swap controller
boards (or pull the coil from the bad drive). But as I've said, it's a
question of whether it would be worth the time and effort, as I now believe
there is a good probability that continued usage of these drives will only
lead to more heartache in the future. Besides, while I discount it, I
cannot completely dismiss that the MB IDE controller(s) may be a
contributing factor (I won't go into the reasons for that, though). So I'm
biting the bullet and starting from scratch. New MB, CPU, memory and
drives. Just waiting for all the pieces to arrive.

I had read about GParted in the past few weeks and had figured on tackling
its learning curve before SeaTools started showing the two drives as being
bad. So, thanks for the information on the GParted Live CD; it's now
already downloaded, and I'll give the drives one last look/see with it
before disassembling the old system. I do want to ensure that the 1.5 TB
SATAs are, indeed, in proper order before using them again. And yes, I want
to give TestDisk a run-through. I had made arrangements to have a friend
bring over his bootable copy of Gibson Research's SpinRite, but that fell
through. Sounds to me like TestDisk may well prove to be a good alternative
to SpinRite.

Talk at ya later.

----- End forwarded message -----



