[sf-lug] For SysAdmins upgrading of the hashing algorithm

Fri Jun 8 19:21:21 PDT 2012

Quoting Bobbie Sellers (bliss-sf4ever at dslextreme.com):

>     This URL leads to an article about the need for upgrading the
> hashing algorithm and the means to accomplish that.
>     If not using the two mentioned systems read the comments
> and you might find the needed advice.
> 
> <http://www.cyberciti.biz/faq/rhel-centos-fedora-linux-upgrading-password-hashing/>
> 
>     Hoping this will be helpful.
>     Also it perhaps explains some of the extra requests for authentication
> I have seen lately.

It's an interesting article, and thanks for the pointer.  I can provide
a bit of background, and also the reason why this probably has nothing
to do with extra requests for authentication.

Short version:  There's a small benefit to reconfiguring Linux
systems' user login authentication to store better hashes of user
passwords.  But not as much as you might think.

Some folks might be interested in this matter because of recent theft
and subsequent cracking of 6.4 million hashes of LinkedIn passwords
that someone posted to a Web forum in Russia.  (Those were reportedly
unsalted SHA-1 hashes rather than MD5, but the worry is the same.)  The
story goes, they were easy to crack because LinkedIn relied on a fast,
old hashing method -- and therefore it must be equally dangerous to
keep using MD5 for user-login password hashes on Linux.

Right?  Well, no, not entirely.  

Definitely MD5 has been known to be weak since 2004, and people have
been phasing it out.  But that doesn't materially help someone trying to
steal passwords from a Linux system.

First, understand what a hash is.  It's a 'one-way function' or
'fingerprint' value calculated from an input string -- usually shorter,
so you can use the hash to stand in for the original input value (much
like a checksum).  An informal sort of hash might be referring to people
by their initials, e.g. RM for me and BS for you.  That's not a very
good hash because of the high likelihood of interacting with other
people sharing our initials, so saying you came to talk to RM doesn't
make clear whether you're arriving for lunch with me or with Rachel
McAdams.  (This is called a 'hash collision'.)
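
To make the initials analogy concrete, here's a rough Python sketch
comparing that toy 'hash' with a real one.  The name 'Robin Marsh' is
made up purely for illustration, and the helper names are mine, not
anything standard.

```python
import hashlib

def initials(name: str) -> str:
    """Toy 'hash': the first letter of each word, upper-cased."""
    return "".join(word[0].upper() for word in name.split())

def md5_hex(text: str) -> str:
    """Real hash: MD5 digest of the UTF-8 bytes, as 32 hex digits."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()

# 'Robin Marsh' is a hypothetical name chosen to share initials.
names = ["Robin Marsh", "Rachel McAdams"]

# Both names collapse to the same initials: a collision.
print([initials(n) for n in names])

# The MD5 digests of the two names differ.
print([md5_hex(n) for n in names])
```

The toy function throws away nearly all of its input, which is why it
collides so readily; a real hash function mixes every input byte into
the result.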

When you log in, the system takes the password you type in and
calculates the typed-in string's hash, then compares the calculated
hash against the stored value for your account.  If the hashes match,
you're allowed in.  Stored where, you ask?
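
The login check just described can be sketched in a few lines of
Python.  This is a deliberately simplified illustration, not how the
real login code works:  the STORED table, the 'alice' account, and the
use of a bare SHA-512 digest (no salt, no repeated rounds) are all
assumptions made for the example.

```python
import hashlib
import hmac

# Hypothetical stand-in for the system's stored hashes:
# account name -> hex digest of that account's password.
STORED = {"alice": hashlib.sha512(b"correct horse").hexdigest()}

def check_login(user: str, typed_password: str) -> bool:
    """Hash the typed password and compare it with the stored hash."""
    stored = STORED.get(user)
    if stored is None:
        return False  # no such account
    candidate = hashlib.sha512(typed_password.encode("utf-8")).hexdigest()
    # compare_digest avoids leaking timing information in the comparison.
    return hmac.compare_digest(candidate, stored)

print(check_login("alice", "correct horse"))
print(check_login("alice", "wrong guess"))
```

Note that the system never needs the plaintext password on file, only
its hash.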

Early in the history of Unix, hashes of passwords rather than the
passwords were stored in /etc/passwd.  At first, DES (Data Encryption
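Standard) hashes were put there.  DES is a block cipher with a 56-bit
key, and the password hashes built on it are not very good:  the
traditional scheme uses only the first eight characters of a password,
so all longer passwords sharing those eight characters 'collide' on the
same hash, and it's cheap enough to compute that guessing goes quickly.
So, that was changed to MD5, which was good enough for a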
long while, even though better, more random and thus more one-way
(unique) hashes came along:  SHA-1, SHA-256, SHA-512, Blowfish, and many
more.[1] 

However, the problem was that people shared Unix machines with each
other, and there was always some joker who'd grab a copy of the shared
/etc/passwd file -- which must be world-readable or software will
break -- and use
a lot of computing power somewhere to run dictionaries full of words and
word combinations through hashing calculations to see if any match the 
hashes in /etc/passwd.  If you know what string, if hashed, produces a
calculated hash, and match that calculation against a hash stored in
/etc/passwd, you have just guessed what the password is.  Note that this 
isn't decryption -- the hashing function cannot be run in reverse -- but
is functionally equivalent to doing that.
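
For the curious, the guessing attack described above amounts to
something like this Python sketch.  It assumes plain unsalted MD5
digests, and the account names, passwords, and tiny word list are
invented for the example; a real attacker would use far bigger
dictionaries.

```python
import hashlib

def md5_hex(word: str) -> str:
    """Unsalted MD5 digest of a candidate password, as hex."""
    return hashlib.md5(word.encode("utf-8")).hexdigest()

def dictionary_attack(stolen: dict, wordlist: list) -> dict:
    """Return {account: guessed password} for every stolen hash that
    matches the hash of some word in the word list."""
    # Hash the whole dictionary once, then look each stolen hash up.
    lookup = {md5_hex(word): word for word in wordlist}
    return {user: lookup[h] for user, h in stolen.items() if h in lookup}

# Hypothetical stolen hashes: one weak password, one not in the list.
stolen = {
    "alice": md5_hex("sunshine"),
    "bob": md5_hex("x9!qLw#2"),
}
wordlist = ["password", "sunshine", "letmein"]

print(dictionary_attack(stolen, wordlist))
```

Nothing here runs the hash backwards.  The attacker only runs it
forwards on a great many guesses, and the same precomputed lookup table
works against every account at once.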

Two changes closed this security hole.  The most important:  Password
hashes were moved to new system file /etc/shadow, which is deliberately
kept readable only by root.  The user account remains defined in
/etc/passwd, but the password hash itself is no longer there.

The other change:  salting.  A 'salt' is a crypto nickname for a random
text string of predetermined length prepended to a plaintext password
just before it's hashed, and stored alongside the resulting hash so
that the login check can repeat the calculation.  The advantage is that
someone who steals the hashed passwords somehow can no longer just
bulk-hash dictionaries' worth of words and do a simple match.  Your
password, even if you use the same one on many different systems, will
end up with different hashes because of the salting.
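
Here's a minimal Python sketch of the salting idea.  It assumes the
simplest possible construction (salt bytes prepended to the password,
one pass of SHA-512); the real schemes used for /etc/shadow mix the
salt in more elaborately, and the function names and the 8-byte salt
length are my own choices for illustration.

```python
import hashlib
import os

def salted_hash(password: str, salt: bytes = None) -> tuple:
    """Return (salt, hex digest) for salt + password.

    A fresh random salt is generated unless one is supplied."""
    if salt is None:
        salt = os.urandom(8)  # 8 random bytes; the length is an assumption
    digest = hashlib.sha512(salt + password.encode("utf-8")).hexdigest()
    return salt, digest

def verify(password: str, salt: bytes, digest: str) -> bool:
    """Repeat the calculation with the stored salt and compare."""
    return salted_hash(password, salt)[1] == digest

# The same password hashed twice gets two different salts, hence two
# different digests, yet each still verifies against its own salt.
salt_a, digest_a = salted_hash("hunter2")
salt_b, digest_b = salted_hash("hunter2")
print(digest_a == digest_b)
print(verify("hunter2", salt_a, digest_a))
```

Because every entry has its own salt, a single precomputed table of
dictionary hashes no longer matches anything; the attacker has to redo
the whole dictionary for each account.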

Have a look at your system's /etc/shadow file (using root authority),
and you'll see a prefix in front of each stored password hash.  It'll
probably be '$1$' for most.  That means MD5.  '$6$' means SHA-512, 
which I've set as default on my own systems for a few years as part of
the gradual move away from MD5.  (You need to tweak
/etc/pam.d/common-password and /etc/login.defs to change the default
password hashing algorithm.  SHA-512 is the installation default in
recent distros.)
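
If you'd rather not eyeball the file, a short Python sketch can report
which scheme each entry uses.  The sample lines below are fabricated
(the salts and hashes are placeholders, not real crypt output), the
function name is hypothetical, and the table covers only a few common
prefixes.

```python
# Common crypt-style identifiers; others exist and are reported as unknown.
SCHEMES = {
    "1": "MD5",
    "2a": "Blowfish (bcrypt)",
    "5": "SHA-256",
    "6": "SHA-512",
}

def scheme_of(shadow_line: str) -> str:
    """Name the hashing scheme of one /etc/shadow-style line."""
    hash_field = shadow_line.split(":")[1]  # second colon-separated field
    if not hash_field.startswith("$"):
        return "no $id$ prefix (locked account or traditional DES hash)"
    ident = hash_field.split("$")[1]  # text between the first two '$'
    return SCHEMES.get(ident, "unknown id " + ident)

# Fabricated sample entries, for illustration only.
samples = [
    "alice:$6$abcdefgh$Zm9vYmFy:15500:0:99999:7:::",
    "bob:$1$abcdefgh$YmFyYmF6:15000:0:99999:7:::",
    "daemon:*:15000:0:99999:7:::",
]
for line in samples:
    print(line.split(":")[0], "->", scheme_of(line))
```

Reading the real /etc/shadow would of course require root, which is
rather the point of the file.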

Even those of us who started moving to SHA-512 some years ago tend to
still have /etc/shadow entries starting with '$1$' hanging around --
because those predate the reconfig.  And it's really not urgent.

Why?  Because anyone who steals your /etc/shadow file by definition has
to have root-user authority already, which means you have a lot bigger
problems than cracked passwords.  Also, because the stored hashes are
salted, it's pretty difficult (expensive, time-consuming) even for
someone able to steal the /etc/shadow file to match plaintext to the
entries.

Rumour has it that the LinkedIn passwords were not stored salted, and
obviously they weren't kept in very secure storage.  If they'd been
stored using a computationally more expensive hashing method like
SHA-512 or Blowfish _and_ using salt, they would have been a great deal
more difficult to match to plaintext, even ignoring the obviously poor
security on hash storage.
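
To give a feel for what 'computationally more expensive' buys you,
here's a Python sketch that times a single hash against a deliberately
repeated one.  Note the substitution:  it uses PBKDF2 with SHA-512 from
Python's standard library as a stand-in, which is not the same
algorithm as the '$6$' scheme or Blowfish-based bcrypt, though the
principle (salt plus many rounds) is similar.  The round count and
helper names are assumptions for the example.

```python
import hashlib
import os
import time

def stretched_hash(password: str, salt: bytes, rounds: int = 100_000) -> bytes:
    """Salted, repeated hashing via PBKDF2-HMAC-SHA512 (a stand-in for
    the expensive schemes named above, not the same algorithm)."""
    return hashlib.pbkdf2_hmac("sha512", password.encode("utf-8"), salt, rounds)

def seconds_for(rounds: int) -> float:
    """Wall-clock seconds to compute one hash with the given rounds."""
    salt = os.urandom(16)
    start = time.perf_counter()
    stretched_hash("hunter2", salt, rounds)
    return time.perf_counter() - start

# Each guess costs the attacker as much as it costs the login check,
# so raising the round count slows bulk guessing proportionally.
print("1 round:        %.6f s" % seconds_for(1))
print("100,000 rounds: %.6f s" % seconds_for(100_000))
```

A fraction of a second per login is unnoticeable to a user, but it adds
up painfully for someone trying millions of dictionary words against
each salted entry.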

Hope that helps.

Here's a pretty good article for further information:  
http://phpsec.org/articles/2005/password-hashing.html

[1] https://en.wikipedia.org/wiki/List_of_hash_functions#Cryptographic_hash_functions