Tuesday, February 26, 2013

Ruby's String#crypt and Strings containing a 0-byte

There are several minor discrepancies between Ruby on Linux and Ruby on Windows. One that I became aware of a few months ago is in the implementation of String#crypt. Ruby relies on the underlying platform (Linux, Windows, and so on) to provide the implementation of this method.

As you may be aware, passwords on a Unix system are "hashed" before storage and stored in the file /etc/shadow (which is readable only by root). The password is hashed using the Unix/GNU "crypt" function; Ruby's String#crypt is essentially a thin wrapper around that function.

See the function's documentation, especially the "Glibc Notes" section near the bottom:
http://manpages.ubuntu.com/manpages/precise/man3/crypt.3.html

For a programmer the crypt function is most useful for hashing passwords. From the documentation the function appears capable of hashing data of any length so long as the data does not contain the "null-terminator" (or zero byte). Indeed, you may see some unexpected behavior in Ruby if the String contains a 0-byte:

>> "abc".crypt("$5$salt$")
=> "$5$salt$6XvgFG1LMWL/SedlWdxAafOEHFFpkPXLqeNHsdc7P16"
>> "abc\0xyz".crypt("$5$salt$")
=> "$5$salt$6XvgFG1LMWL/SedlWdxAafOEHFFpkPXLqeNHsdc7P16"

In this example, two different strings produce the same hash. It's clear that the underlying "crypt" function only considers bytes in the string up until it encounters (what it considers to be) the termination of that string.
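
To make the truncation concrete, here is a minimal sketch of what a C function effectively "sees" when handed a Ruby String. The helper names c_string_view and truncated_by_crypt? are hypothetical (they are not part of Ruby or glibc); the first simply returns the bytes that come before the first 0-byte.

```ruby
# Hypothetical helper (not part of Ruby or glibc): returns the bytes a C
# function would treat as the string, i.e. everything before the first 0-byte.
def c_string_view(str)
  bytes = str.b              # work on raw bytes, ignoring the encoding
  nul = bytes.index("\0")    # position of the first 0-byte, or nil if none
  nul ? bytes[0, nul] : bytes
end

# Hypothetical helper: true when a C-string API would ignore part of the input.
def truncated_by_crypt?(str)
  c_string_view(str).bytesize < str.bytesize
end

c_string_view("abc\0xyz")        # => "abc"
truncated_by_crypt?("abc")       # => false
truncated_by_crypt?("abc\0xyz")  # => true
```

A check like truncated_by_crypt? could be used to reject such input before hashing, rather than silently hashing only a prefix of it.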

I taught a class on Ruby in the fall of 2012 in which I assigned homework requiring students to "crack" passwords. Of course, I limited the scope so that a brute-force approach would work without taxing the server's CPU for very long: just a few thousand candidate passwords, requiring less than 30 seconds to iterate over. To accommodate students using Windows or a Mac at home (and because I thought it'd be cool to write an implementation in Ruby), I implemented a portion of the GNU crypt function. Specifically, I implemented the SHA-256 and SHA-512 variations using Ruby's stdlib implementations of SHA: Digest::SHA256 and Digest::SHA512.
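
For readers who want a feel for what such an implementation involves, here is a minimal sketch of the SHA-256 variant. It is not the code linked at the end of this post; it is an independent reading of the published SHA-crypt algorithm. The names sha256_crypt_sketch, repeat_to and b64_from_24bit are my own, it takes the bare salt ("salt" rather than "$5$salt$"), and it always uses the default of 5000 rounds.

```ruby
require 'digest'

# Alphabet and output byte order as given in the SHA-crypt specification.
B64_ALPHABET = "./0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz"
BYTE_ORDER = [[0, 10, 20], [21, 1, 11], [12, 22, 2], [3, 13, 23], [24, 4, 14],
              [15, 25, 5], [6, 16, 26], [27, 7, 17], [18, 28, 8], [9, 19, 29]]
DEFAULT_ROUNDS = 5000

# Repeat `block` until it is exactly `len` bytes long.
def repeat_to(block, len)
  (block * (len / block.bytesize + 1))[0, len]
end

# Encode a 24-bit value as `count` characters, lowest 6 bits first.
def b64_from_24bit(value, count)
  Array.new(count) { |i| B64_ALPHABET[(value >> (6 * i)) & 0x3f] }.join
end

# Hypothetical name; a sketch of the "$5$" scheme, not a vetted implementation.
def sha256_crypt_sketch(password, salt)
  pw   = password.b       # raw bytes, so a 0-byte is just ordinary data
  salt = salt.b[0, 16]    # at most 16 salt bytes are used

  # Digest B: password, salt, password.
  b = Digest::SHA256.digest(pw + salt + pw)

  # Digest A: password, salt, then B mixed in according to the password length.
  input = pw + salt + repeat_to(b, pw.bytesize)
  n = pw.bytesize
  while n > 0
    input << (n.odd? ? b : pw)
    n >>= 1
  end
  a = Digest::SHA256.digest(input)

  # P and S sequences derived from the password and the salt.
  p_seq = repeat_to(Digest::SHA256.digest(pw * pw.bytesize), pw.bytesize)
  s_seq = repeat_to(Digest::SHA256.digest(salt * (16 + a.getbyte(0))), salt.bytesize)

  # The deliberately slow part: re-hash the previous result many times.
  c = a
  DEFAULT_ROUNDS.times do |i|
    input = i.odd? ? p_seq.dup : c.dup
    input << s_seq unless (i % 3).zero?
    input << p_seq unless (i % 7).zero?
    input << (i.odd? ? c : p_seq)
    c = Digest::SHA256.digest(input)
  end

  # Encode the 32 digest bytes in the scheme's permuted order (43 characters).
  encoded = BYTE_ORDER.map { |i, j, k|
    b64_from_24bit((c.getbyte(i) << 16) | (c.getbyte(j) << 8) | c.getbyte(k), 4)
  }.join
  encoded << b64_from_24bit((c.getbyte(31) << 8) | c.getbyte(30), 3)

  "$5$#{salt}$#{encoded}"
end
```

The specification's first test vector is a handy sanity check: sha256_crypt_sketch("Hello world!", "saltstring") should return "$5$saltstring$5B8vYYiY.CVt1RlTTf8KbXBH3hsxY/GNooZaBBGWEc5". Because the sketch hashes raw bytes instead of a C string, "abc" and "abc\0xyz" hash to different values.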

The implementation is intended to return the same result as String#crypt when running on Linux, except when the string contains a 0-byte. For example (using my Ruby implementation):

>> "abc".crypt("$5$salt$")
=> "$5$salt$6XvgFG1LMWL/SedlWdxAafOEHFFpkPXLqeNHsdc7P16"
>> "abc\0xyz".crypt("$5$salt$")
=> "$5$salt$Qn8eGBpACkRjw1DYrtYAzXj7c4qfZ4VHm1E9McTQp/9"

There's also a little-known (little-documented?) "rounds" option in GNU's crypt that can be included ahead of the salt, in the form $5$rounds=N$salt, to make computing an individual hash more expensive, and thus harder to crack via brute force.

>> "abc".crypt("$5$rounds=9999$salt")
=> "$5$rounds=9999$salt$YfNEaceKMQb0DL395eE8hQrplhEtsHtiDBBfJwsn4B5"


By default the rounds value is 5000. A hash created by the passwd command will only use a custom rounds value if an appropriate option is specified in /etc/pam.d/passwd. My implementation does not accept rounds, but it would likely be an easy change to make.
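
As a rough idea of what that change might involve, the first step would be pulling the rounds value out of the salt argument. The sketch below is only an assumption about how one could do it: parse_sha_crypt_setting is a hypothetical helper, and the minimum (1000) and maximum (999,999,999) are the limits given in the SHA-crypt specification, with out-of-range values clamped rather than rejected.

```ruby
# Default and limits for the rounds value, per the SHA-crypt specification.
ROUNDS_DEFAULT = 5000
ROUNDS_MIN     = 1000
ROUNDS_MAX     = 999_999_999

# Hypothetical helper: splits a setting such as "$5$rounds=9999$salt" into parts.
def parse_sha_crypt_setting(setting)
  # "$5$" or "$6$", an optional "rounds=N$", then up to 16 salt characters.
  m = /\A\$(5|6)\$(?:rounds=(\d+)\$)?([^$]{0,16})/.match(setting)
  raise ArgumentError, "not a SHA-crypt setting: #{setting.inspect}" unless m

  # Out-of-range values are clamped to the allowed range, not rejected.
  rounds = m[2] ? m[2].to_i.clamp(ROUNDS_MIN, ROUNDS_MAX) : ROUNDS_DEFAULT
  { id: m[1], rounds: rounds, custom_rounds: !m[2].nil?, salt: m[3] }
end

parse_sha_crypt_setting("$5$salt$")
# => {:id=>"5", :rounds=>5000, :custom_rounds=>false, :salt=>"salt"}
parse_sha_crypt_setting("$5$rounds=9999$salt")
# => {:id=>"5", :rounds=>9999, :custom_rounds=>true, :salt=>"salt"}
```

The custom_rounds flag is kept because a custom rounds value is echoed back in the resulting hash string, while the default is not.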

My code is available at: http://pastebin.com/gzekbGP3