Neohapsis is currently accepting applications for employment. For more information, please visit our website www.neohapsis.com or email hr@neohapsis.com
Bugtraq archives for 2nd quarter (Apr-Jun) 1995: Re: passwd hashing algorithm

Re: passwd hashing algorithm

Sun, 16 Apr 95 11:35:19 EDT

	 Agreed. Personally, I am wondering when Unix will get
	 overhauled so that these recurring holes (sendmail, crypt<>,
	 etc) will be brought to a higher level of perfection.
	 Regarding crypt() I would think a one-way mechanism is the
	 answer, versus having keys that are left around the system.

This thread is getting old, and there's an awful lot of misinformation
being passed back and forth (as well, of course, as a few absolutely
correct answers).  Let me make one last stab at explaining UNIX passwords,
crack, triple DES, etc.  Before I start, though, let me refer folks to
this paper:

	   author = {Robert H. Morris and Ken Thompson},
	   journal = {Communications of the ACM},
	   month = {November},
	   number = {11},
	   pages = {594},
	   title = {Unix Password Security},
	   volume = {22},
	   year = {1979},
	   annote = {Gives the rationale for the design of the current Unix password
		hashing algorithm.}

It explains, in detail, exactly how and why the current password scheme
was adopted.

First -- the crypt() subroutine is a one-way function, for precisely the
reason given above.  The authors of the paper -- that's Bob Morris the
Elder, who later became chief scientist of the NSA's National Computer
Security Center, and Ken Thompson, one of the inventors of UNIX -- were
too smart to want to have a decryption key lying around the system.
Thus, rather than encrypting the user's password with some key, the
crypt() subroutine uses the password as the key and encrypts a system-specific
constant.  Finding a password, then, is equivalent to breaking the
encryption algorithm given a single ciphertext/plaintext pair.

The encryption algorithm used is 25 iterations of a modified version
of DES.  They used 25 iterations for two reasons:  they wanted to guard
against discovery of a shortcut cryptanalysis of DES, and they wanted
to make the encryption process inherently slow.  The older scheme that
crypt() replaced required 1.25 milliseconds on a PDP 11/70, a 16-bit,
1 MIPS machine.  Improvements in both hardware and software technology
brought the current scheme to the same level several years ago; see

		author = {David C. Feldmeier and Philip R. Karn},
		title = {Unix Password Security---Ten Years Later},
		booktitle = {Advances in Cryptology: Proceedings of {CRYPTO} '89},
		year    = {1990},
		publisher = {Springer-Verlag},
		pages = {44--63}

The modification to DES involves modifying the E-box according to a
12-bit random number called the salt.  The salt is picked when the password
is changed, and is stored in the clear along with each hashed password.
It has several purposes.  First, it means that commercial DES chips can't
easily be used to build a password cracker (but see

	   author = {Philip Leong and Chris Tham},
	   address = {Dallas, TX},
	   booktitle = {Proc. Winter USENIX Conference},
	   title = {Unix Password Encryption Considered Insecure},
	   year = {1991},
	   annote = {How to build a hardware password-cracker.}

for a hardware approach).  Second, it means that the fact that two passwords
are the same isn't readily apparent.  Third, it makes it 2^12 times more
expensive to build a large dictionary of precomputed passwords.

Crack works by guessing at passwords and running them through a highly
optimized version of the password hashing algorithm.  (I should note that
this implementation is not just straight DES; it has several optimizations
tuned to password-guessing.  See

	%A Matt Bishop
	%T An Application of a Fast Data Encryption Standard Implementation
	%P 221-254
	%B Computing Systems
	%D Summer 1988
	%V 1
	%N 3
	%W Dartmouth College

for a discussion of some of these optimizations.)  It absolutely categorically
does not try to ``decrypt'' the passwords.

The current password scheme suffers from two major problems.  For one
thing, it's too fast.  But that's an inherent problem; machines keep
getting better, and there's such a wide range of machine speeds that
what's just slow enough for one system is unacceptably slow on another,
and too fast on a third.  Besides, password-guessing is inherently easy
to run in parallel on multiple machines, and if your enemy has access
to a few LANsful of machines, you're not going to buy yourself much by
slowing down your own login process.  It wouldn't hurt to have the
time constant be user-specific, though, to let system administrators
change their own delay parameters as their machines improve.

The other problem is that the maximum possible password -- 8 characters --
is too short.  The key for the encryptions described above is formed by
taking the 8 bytes and using them, as is, as a DES key.  The current
scheme could be much improved simply by changing this algorithm to accept
a larger string and crunching it down to the same 56 bits.  The actual
*information content* of today's passwords is very low; no more than
about 2.3 bits/character for many of them.  No wonder they're guessable;
that's only 2^19 possibilities!  (See p. 13 of my firewalls book for
a brief explanation of the reasoning behind that statement.)

It should be clear by now why using triple DES won't help very much,
save for one point.  Triple DES was intended to deal with DES's 56
bit key size, which can be searched exhaustively by specialized hardware.
But that isn't the threat we're dealing with today; passwords don't
come close to exhausting that space.  Nor is the timing difference that
important; tripling the encryption time is just a matter of 2-3 years
technology improvement.  There's only one facet of triple DES that's
at all useful here:  it provides an easy way to accept longer passwords.
But as I've noted, there are other ways to do that.  (Double DES is
most likely quite sufficient if you want to pursue that route, though;
few people are going to use passwords longer than 16 characters, and
the attacks on double DES described in the cryptographic literature
require O(2^55) storage, if I recall correctly -- I may be off by a
factor or so of 2.)

How should a password algorithm be designed today?  I'd use iterated,
salted, locally-parameterized SHA or MD5 (the differences in cryptographic
strength between them are not significant for these purposes, a statement
I'll be glad to explain privately if anyone asks).  For the local
parameterization, I'd prepend and append a system-specific string to
the user's password.  I'm not certain how I'd do the salting; the obvious
method is to change the state of the ABCD registers at startup time,
but given recent discussions in the IPSEC working group of the IETF,
I'd expect commercial MD5 chips to have a primitive to do that.  I'd
use an iteration count stored with the hashed password, as explained
above; a nice feature of using MD5 or SHA is that the administrator can
unilaterally strengthen all local passwords.  With today's DES-based
scheme -- which is probably just as strong cryptographically -- the
password itself must be known when the iteration count is increased. 

I've gone on long enough (well, too long, I suspect), but I hope this
note can settle this issue once and for all.  I've tried to provide
enough citations that anyone with further questions can look things
up themselves; if you need more pointers, drop me a note.

		--Steve Bellovin