[conspire] Re: RHL 9 Install problems

Rick Moen rick at linuxmafia.com
Thu Jul 10 15:53:53 PDT 2003

Quoting Greg Dougherty (rhl at molecularsoftware.com):

> I HAD an ext2 partition on it.  I did another install, however, and this is my
> current setup:
> hda 1: ~10 GB, NTFS
> hda 2: 100 MB, ext3 /boot
> hda 3: 2000 MB swap
> hda 5: the rest ext3 /


> It's a WD 40 "GB" HD, which means it holds ~40 billion bytes of data (NOT 40
> GB).

Yes, I know that scam.  Annoying, isn't it?  

Greg means they define "megabyte" and "gigabyte" as 1000^2 and 1000^3
bytes, rather than 1024^2 and 1024^3 -- in order to shortchange you on space.
Some people such as my friend Karsten Self feel this is both not worth
fighting and also technically correct per the metric system's
definitions of those scale prefixes, and that the neologisms "MiB" and
"GiB" should be used if you insist on multiples of 1024.
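
To put a number on the difference, here is a minimal sketch of the
arithmetic for Greg's 40-billion-byte drive. The function name to_units
is made up for illustration, and it assumes any POSIX awk is available.

```shell
# Hypothetical helper: show a byte count in decimal GB and binary GiB.
to_units() {
    awk -v b="$1" 'BEGIN {
        printf "%.2f GB (10^9), %.2f GiB (2^30)\n", b / 1e9, b / (1024 * 1024 * 1024)
    }'
}

# A drive sold as "40 GB", i.e. 40 billion bytes:
to_units 40000000000
```

So the drive is about 7% smaller than it would be if the label meant
multiples of 1024.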

> I ran fsck -c -c on hda2, no problem.  I ran it on hda5, it said
> "vdone" after spending 20+ hours checking for bad blocks, then hung my
> machine.  Control-c and -D did nothing, neither did anything else, so
> I shut it down.

Ouch.  That's a real pain -- especially since it leaves you objectively
in doubt as to whether it's a logical (formatting) problem or a physical 
problem on the hard drive.

Logical drive problems come in two varieties:  There are the ones that
go away when you blow away and remake filesystems using standard tools
whose integrity you have confidence in, and there are those that don't.
As a practical matter, what you do in the latter case is have a whack at
the drive using the manufacturer's pseudo-low-level formatting
utility[1].  If the problem goes away _then_, it must have been of
logical origin.  If not, by process of elimination, it's physical and
you return the drive under warranty (or hurl it in the compost heap,
absent an unexpired warranty).

The easy way to blow away a filesystem (partition) with standard
high-level tools is /sbin/fdisk (or cfdisk; your choice).  The
obvious way to make a new one is mkfs.ext2, mkfs.ext3, etc.  

Blowing away high-level information (partition definitions) using
fdisk/cfdisk has the advantage over low-level reformatting of enabling
you to blank out less than the entire drive.  Low-level formatters blitz
absolutely everything -- which is both advantage and disadvantage at the
same time.

(That is, there's nothing like the reassurance of knowing that you've
redone the layout of a drive only after clobbering every last magnetised
bit on it using low-level tools _first_ -- but you'd damned well better
have everything backed up.  Everything.  Because the drive will get
completely blanked.)

> I have found that I can usually do a Custom Minimal RHL 9 install.  I
> try to do more than that, or try to do anything once I have install,
> and my computer goes tits up.

So, the problem is that I don't want to rush to conclusions, and we
don't have indicative enough symptoms or error messages to point the
finger definitively at high-level formatting (filesystem definition),
low-level formatting, a recurring CD-ROM _drive_ glitch, a defect on
your CD-ROM _media_ surface, or some other hardware problem.  From what
we've heard, it could still be any of those, I think.

It's certainly looking more like _something_ to do with the hard drive,
though, after your attempt to use fsck.ext2 from the LNX-BBC.

[making fresh, from-scratch filesystems on your hard drive]

> How do I do that?  

Boot the LNX-BBC.  Type "mount".  umount anything concerned with
/dev/hda5.  Then, type "mkfs.ext3 /dev/hda5".  Wait a few minutes;
you'll get your prompt back when it's done.

Do something similar for anything else you want to make afresh.  (You
might need to invoke "mke2fs -j" instead of mkfs.ext3.)
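
The steps above can be sketched as a small script.  This is only an
illustration, not something shipped on the LNX-BBC: the helpers run and
remake_ext3 and the RUN variable are invented here, and by default the
script only prints what it would do, since mkfs destroys whatever is on
the partition.

```shell
# Hypothetical wrapper: print the command unless RUN=yes is set.
run() {
    if [ "${RUN:-no}" = "yes" ]; then
        "$@"
    else
        echo "would run: $*"
    fi
}

# Hypothetical helper: unmount DEV if the mount table lists it, then
# make a fresh ext3 filesystem on it.  The second argument (mount
# table path) defaults to /proc/mounts.
remake_ext3() {
    dev="$1"
    mounts="${2:-/proc/mounts}"
    if grep -q "^$dev " "$mounts" 2>/dev/null; then
        run umount "$dev"
    fi
    run mkfs.ext3 "$dev"    # alternative: mke2fs -j "$dev"
}

remake_ext3 /dev/hda5
```

Once you're satisfied with what it prints, "RUN=yes remake_ext3 /dev/hda5"
(as root) would actually do it.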

The above will suffice if you wish to just remake partitions that
already exist.  If you wish to change from ext3 to any other partition
type (no reason why you should) or to change the size and quantity of
partitions, then you need to _first_ edit the partition table using
fdisk or cfdisk, and _then_ use appropriate mkfs.* tools (mkfs.ext3,
mkfs.ext2, mkfs.xfs, mkreiserfs...) on the partitions thus defined.

> how do I get RHL to install onto the already created partitions?

In the installer, right around where you enter Disk Druid, you'll see
the list of partitions, including those you've just re-made.  In the
filesystem options, confirm the filesystem type (ext3), make sure the
box to make (format) the partition is _not_ checked, and supply the
filesystem's mountpoint (e.g., "/").

I'll address the question about "reasonable ways to partition a large HD
with Linux" in a separate message, to be posted after this one.

>> [1] In your shoes, I'd try it without badblocks checking, I guess:  20
>> hours _is_ a heck of a long time.
> I'm not doing anything else with the computer, and I'm happy to leave
> it running all night long, if it will do me any good.

That's a game of likelihood-of-satisfaction percentages that looks
reasonable if you think you'll do it once and gain some benefit, but
more like a waste of time if you either might have to do it multiple
times or gain no benefit from it.

Based on your one 20-hour run and the (reasonable) assumption that the
LNX-BBC's copy of fsck.ext2 [2] is OK, unfortunately we didn't gain any 
functionality and didn't learn much.  

So:  What would I do?  I'd _not_ play around with rearranging filesystem
sizes, for now.  (It's bad to needlessly introduce variables into a
diagnostic situation.)  I'd boot the LNX-BBC, make sure /dev/hda5 isn't
mounted (umount it if it is), then do "fsck.ext3 /dev/hda5".  

If it misbehaved in any way, I'd back up anything I cared about, then
blitz the whole damned drive with the low-level formatter, then restart
from scratch.  If problems persisted, I guess I'd think hard about where
a hardware problem might have crept into my system.
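
One rough way to judge "misbehaved", short of an outright hang, is the
exit status fsck hands back, which is a sum of the bit values documented
in the fsck man page.  The sketch below decodes it; the function name
describe_fsck_status is made up for illustration, and the actual fsck
call is left as a comment so that nothing touches the disk.

```shell
# Hypothetical helper: decode an fsck exit status (a bitmask).
describe_fsck_status() {
    status="$1"
    if [ "$status" -eq 0 ]; then
        echo "no errors"
        return 0
    fi
    msg=""
    for pair in "1:errors corrected" "2:reboot advised" \
                "4:errors left uncorrected" "8:operational error" \
                "16:usage error" "32:cancelled by user" \
                "128:shared-library error"; do
        bit="${pair%%:*}"     # number before the colon
        text="${pair#*:}"     # description after the colon
        if [ $(( status & bit )) -ne 0 ]; then
            msg="${msg:+$msg; }$text"
        fi
    done
    echo "$msg"
}

# Intended use, with /dev/hda5 unmounted:
#   fsck.ext3 /dev/hda5; describe_fsck_status $?
describe_fsck_status 4
```

Anything with the 4 or 8 bit set is what I'd count as misbehaving for
the purposes of the paragraph above.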

[1] That's if it's ATA ("IDE").  If it's SCSI, you don't need a
manufacturer's app, as you can easily rewrite the drive's low-level
information using your SCSI host adapter's firmware utility for that.

[2] Honestly, we should have had you run fsck.ext3.  To explain, ext3 is
just the old ext2 filesystem with a journal file, used to log pending
write operations so they can then be written to disk in a single atomic
operation (with a number of incidental benefits).  So, I _believe_
running fsck.ext2 on an ext3 filesystem is perfectly fine:  It should
just check the underlying filesystem and just not bother to check the
journal.  To my knowledge, fsck.ext3 _is_ fsck.ext2 with that one extra
check added in.

May those that love us love us; and those that don't love us, may
God turn their hearts; and if he doesn't turn their hearts, may
he turn their ankles so we'll know them by their limping.
