One of the trickier aspects of any data migration is migrating users and ensuring that the authentication details which worked on the legacy site continue to work on the new Drupal site. Neither Drupal nor, in all likelihood, the legacy system stores the password in plain text. A one-way hash, often salted, is used instead. This level of security is essential for ensuring that user passwords cannot be easily captured. However, it does present a challenge when the method of hashing is unknown.
This article will take a look at how Drupal handles authentication and how it can be extended to handle new methods, such as those used by a legacy system. We will then go on to take a look at the hashing algorithm used on a .Net site and how we were able to implement some Drupal code to ensure that the hashes could be understood by Drupal. This took a fair bit of detective work and the main point of the article is to document how we did it.
Authentication in Drupal
Drupal contains the logic for user authentication in /includes/password.inc. An important function is user_check_password() where the first three characters of the $stored_hash are used to define the $type of password. A Drupal 7 password is denoted as '$S$' as can be seen from the code below.
/includes/password.inc
function user_check_password($password, $account) {
  if (substr($account->pass, 0, 2) == 'U$') {
    // This may be an updated password from user_update_7000(). Such hashes
    // have 'U' added as the first character and need an extra md5().
    $stored_hash = substr($account->pass, 1);
    $password = md5($password);
  }
  else {
    $stored_hash = $account->pass;
  }

  $type = substr($stored_hash, 0, 3);
  switch ($type) {
    case '$S$':
      // A normal Drupal 7 password using sha512.
      $hash = _password_crypt('sha512', $password, $stored_hash);
      break;
    case '$H$':
      // phpBB3 uses "$H$" for the same thing as "$P$".
    case '$P$':
      // A phpass password generated using md5. This is an
      // imported password or from an earlier Drupal version.
      $hash = _password_crypt('md5', $password, $stored_hash);
      break;
    default:
      return FALSE;
  }
  return ($hash && $stored_hash == $hash);
}
The secret to defining your own hashing algorithm is to replace this function with your own version. Drupal 7 loads whichever file the password_inc variable points to, so a small module, here called "myauthmodule", can point that variable at its own password.inc and include whatever extra logic you need.
/sites/all/modules/myauthmodule/myauthmodule.install
/**
 * @file
 * Supply alternate authentication mechanism.
 */

/**
 * Implements hook_enable().
 */
function myauthmodule_enable() {
  variable_set('password_inc', drupal_get_path('module', 'myauthmodule') . '/password.inc');
}

/**
 * Implements hook_disable().
 */
function myauthmodule_disable() {
  variable_set('password_inc', 'includes/password.inc');
}
As you can see, we use our own custom password.inc when the module is enabled, and revert to the original one when the module is disabled.
This is what our new user_check_password() looks like.
/sites/all/modules/myauthmodule/password.inc
/**
 * @file
 * Alternate authentication mechanism implementation.
 */

// Load _myauthmodule_crypt(). Assumption: crypt.inc sits next to this file
// and may not have been loaded yet; require_once makes a repeat harmless.
require_once dirname(__FILE__) . '/crypt.inc';

function user_check_password($password, $account) {
  if (substr($account->pass, 0, 2) == 'U$') {
    // This may be an updated password from user_update_7000(). Such hashes
    // have 'U' added as the first character and need an extra md5().
    $stored_hash = substr($account->pass, 1);
    $password = md5($password);
  }
  else {
    $stored_hash = $account->pass;
  }

  $type = substr($stored_hash, 0, 3);
  switch ($type) {
    case '$S$':
      // A normal Drupal 7 password using sha512.
      $hash = _password_crypt('sha512', $password, $stored_hash);
      break;
    case '$H$':
      // phpBB3 uses "$H$" for the same thing as "$P$".
    case '$P$':
      // A phpass password generated using md5. This is an
      // imported password or from an earlier Drupal version.
      $hash = _password_crypt('md5', $password, $stored_hash);
      break;
    case '$X$':
      // The legacy .Net method.
      $hash = _myauthmodule_crypt($password, $stored_hash);
      break;
    default:
      return FALSE;
  }
  return ($hash && $stored_hash == $hash);
}
In this case, if the hash starts with '$X$' our custom algorithm will kick in and check the $password as entered against the $stored_hash. It's up to you to define the correct algorithm so that the $password is transformed into something which can be compared against the $stored_hash. If there is a match, the user will be authenticated.
.Net hashing algorithm
We will now go on to examine the hashing algorithm used in the .Net web application. The most difficult piece of the puzzle was working out exactly which algorithm was being used for the hash. After a lot of poking around we discovered that the .Net application was using PBKDF2 (RFC 2898) with HMAC-SHA1 and 1000 iterations. We also had to pick apart the string we were given, by base64 decoding it and then pulling the salt off the front.
Playing with the .Net hashing algorithm, working out what was what and then how to implement it in PHP was my job. A Stack Overflow article held the clues for how to solve it.
/sites/all/modules/myauthmodule/crypt.inc
/**
 * @file
 * Hash algorithm based on .Net hashing algorithm.
 */

/**
 * Legacy hashing constants.
 */
define('LEGACY_HASH_SUBKEY_LENGTH', 32);
define('LEGACY_HASH_ALGORITHM', 'sha1');
define('LEGACY_HASH_ITERATIONS_NUMBER', 1000);
define('LEGACY_HASH_PREFIX', '$X$');

/**
 * Returns a hash for the password as per the .Net legacy mechanism.
 */
function _myauthmodule_crypt($password, $stored_hash) {
  // Remove the prefix.
  $legacy_hash = substr($stored_hash, strlen(LEGACY_HASH_PREFIX));

  // Calculate the hash.
  $legacy_hash_decoded = base64_decode($legacy_hash);
  $legacy_salt = substr($legacy_hash_decoded, 1, 16);
  $subkey = hash_pbkdf2(LEGACY_HASH_ALGORITHM, $password, $legacy_salt, LEGACY_HASH_ITERATIONS_NUMBER, LEGACY_HASH_SUBKEY_LENGTH, TRUE);

  // Hash = null char + salt + subkey.
  $hash = chr(0x00) . $legacy_salt . $subkey;
  return LEGACY_HASH_PREFIX . base64_encode($hash);
}
As we were unable to rely on the hash_pbkdf2() function existing in PHP (supported from PHP 5 >= 5.5.0), we had to code our own as a fallback. The PHP code for the fallback was taken from a comment posted on the PHP hash_pbkdf2 manual page.
/sites/all/modules/myauthmodule/hash_pbkdf2_fallback.inc (this file is not needed if you are using PHP 5 >= 5.5.0)
/**
 * PBKDF2 key derivation function as defined by RSA's PKCS #5: https://www.ietf.org/rfc/rfc2898.txt
 * $algorithm - The hash algorithm to use. Recommended: SHA256
 * $password - The password.
 * $salt - A salt that is unique to the password.
 * $count - Iteration count. Higher is better, but slower. Recommended: At least 1000.
 * $key_length - The length of the derived key in bytes.
 * $raw_output - If true, the key is returned in raw binary format. Hex encoded otherwise.
 * Returns: A $key_length-byte key derived from the password and salt.
 */
if (!function_exists("hash_pbkdf2")) {
  class pbkdf2 {
    public $algorithm;
    public $password;
    public $salt;
    public $count;
    public $key_length;
    public $raw_output;

    private $hash_length;
    private $output = "";

    public function __construct($data = null) {
      if ($data != null) {
        $this->init($data);
      }
    }

    public function init($data) {
      $this->algorithm = $data["algorithm"];
      $this->password = $data["password"];
      $this->salt = $data["salt"];
      $this->count = $data["count"];
      $this->key_length = $data["key_length"];
      $this->raw_output = $data["raw_output"];
    }

    public function hash() {
      $this->algorithm = strtolower($this->algorithm);
      if (!in_array($this->algorithm, hash_algos(), true)) {
        throw new Exception('PBKDF2 ERROR: Invalid hash algorithm.');
      }
      if ($this->count <= 0 || $this->key_length <= 0) {
        throw new Exception('PBKDF2 ERROR: Invalid parameters.');
      }

      $this->hash_length = strlen(hash($this->algorithm, "", true));
      $block_count = ceil($this->key_length / $this->hash_length);
      for ($i = 1; $i <= $block_count; $i++) {
        // $i encoded as 4 bytes, big endian.
        $last = $this->salt . pack("N", $i);
        // first iteration
        $last = $xorsum = hash_hmac($this->algorithm, $last, $this->password, true);
        // perform the other $this->count - 1 iterations
        for ($j = 1; $j < $this->count; $j++) {
          $xorsum ^= ($last = hash_hmac($this->algorithm, $last, $this->password, true));
        }
        $this->output .= $xorsum;
      }

      if ($this->raw_output) {
        return substr($this->output, 0, $this->key_length);
      }
      else {
        return bin2hex(substr($this->output, 0, $this->key_length));
      }
    }
  }

  function hash_pbkdf2($algorithm, $password, $salt, $count, $key_length, $raw_output = false) {
    $data = array(
      'algorithm' => $algorithm,
      'password' => $password,
      'salt' => $salt,
      'count' => $count,
      'key_length' => $key_length,
      'raw_output' => $raw_output,
    );
    try {
      $pbkdf2 = new pbkdf2($data);
      return $pbkdf2->hash();
    }
    catch (Exception $e) {
      throw $e;
    }
  }
}
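Whichever implementation ends up being used, native or fallback, it is worth confirming that it produces standard PBKDF2 output before trusting it with real logins. The sketch below checks hash_pbkdf2() against two of the published RFC 6070 test vectors for PBKDF2 with HMAC-SHA1. The helper name pbkdf2_matches_rfc6070() is made up for this illustration and is not part of the module.

```php
<?php
/**
 * Checks hash_pbkdf2() against two RFC 6070 test vectors (HMAC-SHA1).
 *
 * pbkdf2_matches_rfc6070() is a hypothetical helper name used only here.
 */
function pbkdf2_matches_rfc6070() {
  // Iteration count => expected 20-byte derived key in hex, for the
  // password "password" and the salt "salt".
  $vectors = array(
    1 => '0c60c80f961f0e71f3a9b524af6012062fe037a6',
    2 => 'ea6c014dc72d6f8ccd1ed92ace1d41f0d8de8957',
  );
  foreach ($vectors as $iterations => $expected) {
    // Raw output is requested so that the length argument means bytes.
    $derived = hash_pbkdf2('sha1', 'password', 'salt', $iterations, 20, TRUE);
    if (bin2hex($derived) !== $expected) {
      return FALSE;
    }
  }
  return TRUE;
}

print pbkdf2_matches_rfc6070() ? "PBKDF2 OK\n" : "PBKDF2 MISMATCH\n";
```

Running the same check with the fallback file loaded on an older PHP version should give the same result if the fallback is a faithful implementation.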
Getting it all to work
Now that our custom code is in place we can give it a spin with some real live data. The first step of the process is to import the data into your users table. The string you write into the pass column must include the following concatenated items:
- the hash type, in this case '$X$',
- the hash
In our case the hash string was base64 encoded and was a concatenation of three parts:
- a null byte (8 bits)
- 16 bytes of salt (128 bits)
- 32 bytes of subkey (256 bits)
The hash string, as extracted from the legacy database, looked similar to the following:
AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==
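As a quick check of that layout, the sample string above can be base64 decoded and split into its three parts. This is a minimal sketch rather than code from the module: legacy_hash_parts() is a hypothetical helper name, and the fixed length of 49 bytes is simply 1 + 16 + 32 from the list above.

```php
<?php
/**
 * Splits a base64-encoded legacy hash into marker byte, salt and subkey.
 *
 * legacy_hash_parts() is a hypothetical helper used only for illustration.
 */
function legacy_hash_parts($legacy_hash) {
  // Strict mode rejects characters outside the base64 alphabet.
  $raw = base64_decode($legacy_hash, TRUE);
  // Expect 1 marker byte + 16 salt bytes + 32 subkey bytes.
  if ($raw === FALSE || strlen($raw) !== 49) {
    return FALSE;
  }
  return array(
    'marker' => ord($raw[0]),
    'salt' => substr($raw, 1, 16),
    'subkey' => substr($raw, 17, 32),
  );
}

$parts = legacy_hash_parts('AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==');
// The marker is the leading null byte.
print $parts['marker'] . "\n";
// Prints the salt in hex: 59d1eb7624ca6fe40d72c510ce6d94be
print bin2hex($parts['salt']) . "\n";
```

Running something like this over a handful of legacy rows before the import is a cheap way to spot hashes that do not follow the expected layout.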
After it was written to the password column in the users table it looked like:
$X$AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==
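During the import, producing that value is just a matter of prepending the prefix to each legacy hash. A minimal sketch of that step might look like the following; legacy_pass_value() is a hypothetical helper name, and the guard against double prefixing is an assumption about how an import might be re-run, not something taken from our migration code.

```php
<?php
/**
 * Builds the value for the pass column from a legacy base64 hash.
 *
 * legacy_pass_value() is a hypothetical helper; '$X$' is the prefix used
 * throughout this article.
 */
function legacy_pass_value($legacy_hash, $prefix = '$X$') {
  // Strip stray whitespace that may come along with an export.
  $legacy_hash = trim($legacy_hash);
  // Leave the value alone if it already carries the prefix, so that
  // re-running an import does not produce '$X$$X$...'.
  if (strpos($legacy_hash, $prefix) === 0) {
    return $legacy_hash;
  }
  return $prefix . $legacy_hash;
}

print legacy_pass_value('AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==') . "\n";
```

Note the single quotes around '$X$': in a double-quoted PHP string the dollar signs could be treated as the start of a variable name.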
With that knowledge, which took some investigation to acquire in the first place, we were able to extract the legacy .Net salt and use it to calculate our new subkey. The subkey is generated by the hash_pbkdf2() function mentioned earlier, but to get the right subkey we need to provide the correct settings and inputs (listed in the order hash_pbkdf2() expects them):
- the hash algorithm to be used (in our case sha1)
- the password provided by the user
- the legacy .Net salt
- the number of iterations for the key derivation process (in our case 1000)
- the length of the derived key in bytes (in our case 32)
- the raw output set to TRUE to get our new subkey in raw binary format
To show clearly how the legacy .Net hash is calculated, here's the code again:
/**
 * Legacy hashing constants.
 */
define('LEGACY_HASH_SUBKEY_LENGTH', 32);
define('LEGACY_HASH_ALGORITHM', 'sha1');
define('LEGACY_HASH_ITERATIONS_NUMBER', 1000);
define('LEGACY_HASH_PREFIX', '$X$');

/**
 * Returns a hash for the password as per the .Net legacy mechanism.
 */
function _myauthmodule_crypt($password, $stored_hash) {
  // Remove the prefix.
  $legacy_hash = substr($stored_hash, strlen(LEGACY_HASH_PREFIX));

  // Calculate the hash.
  $legacy_hash_decoded = base64_decode($legacy_hash);
  $legacy_salt = substr($legacy_hash_decoded, 1, 16);
  $subkey = hash_pbkdf2(LEGACY_HASH_ALGORITHM, $password, $legacy_salt, LEGACY_HASH_ITERATIONS_NUMBER, LEGACY_HASH_SUBKEY_LENGTH, TRUE);

  // Hash = null char + salt + subkey.
  $hash = chr(0x00) . $legacy_salt . $subkey;
  return LEGACY_HASH_PREFIX . base64_encode($hash);
}
The _myauthmodule_crypt() function returns a newly calculated hash based on the password provided and the salt we extracted from the stored hash. This is combined with the prefix and the whole result is returned back to the calling function where it is compared with the stored hash.
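The whole round trip can be tried outside Drupal with a small standalone script. The sketch below is not the module code: sketch_legacy_hash() and sketch_legacy_check() are hypothetical names that mirror the logic described above, and the script assumes a PHP version that provides hash_pbkdf2() and hex2bin() natively.

```php
<?php
/**
 * Builds a prefixed legacy-style hash from a password and a binary salt.
 *
 * Hypothetical standalone equivalent of the hash construction above.
 */
function sketch_legacy_hash($password, $salt) {
  // PBKDF2 with HMAC-SHA1, 1000 iterations, 32 raw bytes of output.
  $subkey = hash_pbkdf2('sha1', $password, $salt, 1000, 32, TRUE);
  // Hash = null char + salt + subkey, base64 encoded, with the prefix.
  return '$X$' . base64_encode(chr(0x00) . $salt . $subkey);
}

/**
 * Checks a password attempt against a stored, prefixed legacy hash.
 */
function sketch_legacy_check($password, $stored_hash) {
  // Drop the three-character prefix and recover the salt from the hash.
  $raw = base64_decode(substr($stored_hash, 3));
  $salt = substr($raw, 1, 16);
  // Recompute with the same salt and compare the complete strings.
  return sketch_legacy_hash($password, $salt) === $stored_hash;
}

// Demo with the example salt from this article.
$salt = hex2bin('59d1eb7624ca6fe40d72c510ce6d94be');
$stored = sketch_legacy_hash('UserPassword', $salt);
var_dump(sketch_legacy_check('UserPassword', $stored));
var_dump(sketch_legacy_check('WrongPassword', $stored));
```

The first check should report true and the second false, because the salt recovered from the stored string is the same in both cases and only the password differs.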
Hash construction in .Net
To make the hash construction clearer, we can look at it from the .Net perspective. Start with the salt used by the .Net site, represented in hexadecimal format; let it be 59d1eb7624ca6fe40d72c510ce6d94be (32 characters). The user has provided a password; let it be UserPassword. After the necessary operations we end up with the hash AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==. What we need to understand is which operations turn the salt and the user's password into that hash. The PHP code below, with its hash_example() function, walks through it step by step:
/sites/all/modules/myauthmodule/doc/hash_example.inc
require_once 'hex2bin_fallback.inc';

/**
 * Legacy hashing constants.
 */
define('LEGACY_HASH_SALT', '59d1eb7624ca6fe40d72c510ce6d94be'); // 32 chars in hexadecimal form
define('LEGACY_HASH_SUBKEY_LENGTH', 32);
define('LEGACY_HASH_ALGORITHM', 'sha1');
define('LEGACY_HASH_ITERATIONS_NUMBER', 1000);

function hash_example() {
  // Convert the salt into binary form.
  $salt_binary = hex2bin(LEGACY_HASH_SALT); // 16 chars in binary form

  // Get the password from user.
  $password = 'UserPassword';

  $hash = legacy_hash($password, $salt_binary);
  // $hash = AFnR63Ykym/kDXLFEM5tlL450Y+drbfdwRGhsOCOMlcR273QYod3QZdKwhiKHKHjXw==
  return $hash;
}

/**
 * Legacy hash construction based on .Net procedure.
 */
function legacy_hash($password, $salt) {
  $subkey = hash_pbkdf2(LEGACY_HASH_ALGORITHM, $password, $salt, LEGACY_HASH_ITERATIONS_NUMBER, LEGACY_HASH_SUBKEY_LENGTH, TRUE);

  // Hash = null char + salt + subkey.
  $hash = chr(0x00) . $salt . $subkey;
  return base64_encode($hash);
}
As we were unable to rely on the hex2bin() function existing in PHP (supported from PHP 5 >= 5.4.0), we had to code our own as a fallback. The PHP code for the fallback was taken from a comment posted on the PHP hex2bin manual page.
/sites/all/modules/myauthmodule/doc/hex2bin_fallback.inc
/**
 * hex2bin() decodes a hexadecimally encoded binary string.
 * http://php.net/manual/en/function.hex2bin.php
 * (PHP >= 5.4.0)
 */
if (!function_exists('hex2bin')) {
  function hex2bin($str) {
    $sbin = "";
    $len = strlen($str);
    for ($i = 0; $i < $len; $i += 2) {
      $sbin .= pack("H*", substr($str, $i, 2));
    }
    return $sbin;
  }
}
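A quick way to convince yourself that a hex-to-binary conversion behaves as expected is to convert the example salt and back again. The sketch below uses pack('H*', ...) on the whole string in one call, which is an alternative to the byte-by-byte loop above rather than the code we used; sketch_hex_to_binary() is a hypothetical name.

```php
<?php
/**
 * Converts a hex string to its binary form in a single pack() call.
 *
 * sketch_hex_to_binary() is a hypothetical helper used only for illustration.
 */
function sketch_hex_to_binary($hex) {
  // 'H*' reads the whole string as hex digits, high nibble first.
  return pack('H*', $hex);
}

$salt_hex = '59d1eb7624ca6fe40d72c510ce6d94be';
$salt_binary = sketch_hex_to_binary($salt_hex);

// 32 hex characters become 16 bytes.
print strlen($salt_binary) . "\n";
// Converting back should give the original hex string.
print bin2hex($salt_binary) . "\n";
```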
Conclusion
This kind of approach can be used for any migration from a legacy system where user accounts need to be brought across. Generally it is fairly straightforward, as the hashing algorithm is either simple or well documented. In this case we had to do a fair deal of sleuthing to work out how to do it. We hope that this article will be of help to other developers who are neck deep in salts and hashes.