Hashing in .NET

Hashing is a way to produce a fixed-size result from an arbitrary input. You can pass input of any size to a hash function and always get back a result of the same fixed size. In this post, I will talk about hashing as a whole and the approaches that are available to you in .NET to hash your data. Let's look at some of the properties that a hash algorithm should provide.

Stability of a Hash

By stability of a hash we mean that the algorithm must generate the same result whenever it is given the same input. If you pass the same input to the same hash algorithm a million times, you will get the same result a million times. We rely on this property to check whether some data matches the original: hash it again and compare the two results. Stability is the fundamental property of every hash algorithm; a hash that is not stable is not useful.
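To make stability concrete, here is a minimal sketch. It assumes .NET 5 or later (for SHA256.HashData and Convert.ToHexString), and the class and method names are made up for illustration. It hashes the same string twice and checks that both results match.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class StabilityDemo
{
    // Hypothetical helper: SHA-256 of the UTF-8 bytes of the input, as uppercase hex.
    public static string HashOf(string input)
    {
        byte[] hash = SHA256.HashData(Encoding.UTF8.GetBytes(input));
        return Convert.ToHexString(hash);
    }

    // True when two separate runs over the same input give the same hash.
    public static bool IsStable(string input)
    {
        return HashOf(input) == HashOf(input);
    }
}
```

Calling StabilityDemo.IsStable("hello") should return true. One caveat worth knowing: string.GetHashCode() on .NET Core and later is randomized per process, so it is stable only within a single run and should not be stored or sent elsewhere.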

Uniformity of Hash

Uniformity means that the hash algorithm spreads its inputs evenly across the result address space. In particular, every valid value in that space should be produced by at least one input; there should be no value that lies within the result address space but can never be the hash of any input. Perfect uniformity is generally not possible for any hash algorithm. Collisions are also unavoidable: in practice the address space of the hash is smaller than the address space of the inputs, because hashing makes the data smaller, so some distinct inputs must share a result.

Efficiency of Hash

We consider a hash algorithm efficient when it does not take long to compute. The cost of hashing should always be balanced against the needs of the application. If an algorithm is complex and slow (say 1 millisecond per hash), it is a poor fit for work such as hash table lookups, where a hash is computed on every operation; for that kind of use the algorithm should be fast on any input (say around 1 microsecond). Password hashing is the notable exception: there a deliberately slow algorithm is an advantage, because it slows down guessing attacks.
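If you want a rough feel for the cost on your own machine, a sketch like the following can help. The class and method names are hypothetical, SHA-256 is just one algorithm picked for the measurement, and the numbers will vary with hardware and input size, so treat the result as an estimate rather than a benchmark.

```csharp
using System;
using System.Diagnostics;
using System.Security.Cryptography;

public static class HashTiming
{
    // Hypothetical helper: average microseconds per SHA-256 hash of the given buffer.
    public static double AverageMicroseconds(byte[] data, int iterations)
    {
        using (SHA256 sha = SHA256.Create())
        {
            sha.ComputeHash(data); // warm-up call, not included in the timing

            Stopwatch watch = Stopwatch.StartNew();
            for (int i = 0; i < iterations; i++)
            {
                sha.ComputeHash(data);
            }
            watch.Stop();

            // Convert total milliseconds to microseconds, then average.
            return watch.Elapsed.TotalMilliseconds * 1000.0 / iterations;
        }
    }
}
```

For example, HashTiming.AverageMicroseconds(new byte[64], 100000) gives the average cost of hashing a 64-byte input.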

Security of Hash

By security of a hash algorithm we mean that it cannot be reversed: given a hash value, finding an input that produces it should not be feasible. Security is one of the important criteria for a good hash algorithm. If you are hashing something sensitive, such as a password, make sure you use a secure hash algorithm.
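For the password case, one commonly used option in .NET is PBKDF2, which applies a hash many times together with a random salt. The sketch below assumes .NET 6 or later (for Rfc2898DeriveBytes.Pbkdf2 and RandomNumberGenerator.GetBytes). The class name, the "salt:hash" storage format and the iteration count are illustrative assumptions, not recommendations taken from this post.

```csharp
using System;
using System.Security.Cryptography;

public static class PasswordHasher
{
    private const int SaltSize = 16;       // bytes; illustrative value
    private const int KeySize = 32;        // bytes; illustrative value
    private const int Iterations = 100000; // illustrative; tune for your hardware

    // Returns "salt:hash", both parts Base64 encoded.
    public static string Hash(string password)
    {
        byte[] salt = RandomNumberGenerator.GetBytes(SaltSize);
        byte[] key = Rfc2898DeriveBytes.Pbkdf2(
            password, salt, Iterations, HashAlgorithmName.SHA256, KeySize);
        return Convert.ToBase64String(salt) + ":" + Convert.ToBase64String(key);
    }

    // Recomputes the hash with the stored salt and compares in constant time.
    public static bool Verify(string password, string stored)
    {
        string[] parts = stored.Split(':');
        if (parts.Length != 2) return false;

        byte[] salt = Convert.FromBase64String(parts[0]);
        byte[] expected = Convert.FromBase64String(parts[1]);
        byte[] actual = Rfc2898DeriveBytes.Pbkdf2(
            password, salt, Iterations, HashAlgorithmName.SHA256, expected.Length);
        return CryptographicOperations.FixedTimeEquals(actual, expected);
    }
}
```

Because the salt is random, hashing the same password twice gives two different stored strings, yet Verify accepts the right password against either of them.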

Let's look at the simplest implementation of a hash algorithm:

We call it the naive implementation: it sums the ASCII values of all the characters of a string.

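public int AdditiveHashAlgorithm(string str)
{
    int result = 0;
    // Iterating a string yields chars; each one is converted to its numeric code
    foreach (int ascii in str)
    {
        // unchecked: let the sum wrap around on overflow instead of throwing
        unchecked
        {
            result += ascii;
        }
    }
    return result;
}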

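The code above is one of the simplest hash algorithms: it adds up the character codes of a string and returns the sum as an integer. I have used an unchecked block so that an integer overflow wraps around instead of throwing an exception. This hash is stable and efficient, but it is not secure, and its uniformity is poor: short strings produce only small sums, and any two strings with the same characters in a different order get the same hash.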
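One weakness of the additive approach is easy to demonstrate: it ignores character order, so anagrams collide. The sketch below wraps the same algorithm in a hypothetical static class so it can be called directly.

```csharp
using System;

public static class AdditiveHashDemo
{
    // Same algorithm as above: sum the character codes of the string.
    public static int AdditiveHash(string str)
    {
        int result = 0;
        foreach (char c in str)
        {
            unchecked { result += c; }
        }
        return result;
    }

    // True when two different strings end up with the same additive hash.
    public static bool Collides(string a, string b)
    {
        return a != b && AdditiveHash(a) == AdditiveHash(b);
    }
}
```

AdditiveHashDemo.AdditiveHash("listen") and AdditiveHashDemo.AdditiveHash("silent") both return 655, so Collides("listen", "silent") is true. Strings that are not anagrams can collide as well: "ad" and "bc" both sum to 197.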

Available Hash Algorithms in .NET

Among the algorithms available in the .NET BCL, the most commonly used are MD5, SHA-1, SHA-256 (part of the SHA-2 family) and SHA-512. We will look at how you can work with them.

MD5 Hash

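using System.Security.Cryptography;
using System.Text;

public string CalculateMD5(string input)
{
    // MD5 is disposable, so create it inside a using block
    using (MD5 md5 = MD5.Create())
    {
        // UTF-8 preserves non-ASCII characters; Encoding.ASCII would replace them with '?'
        byte[] inputBytes = Encoding.UTF8.GetBytes(input);
        byte[] hash = md5.ComputeHash(inputBytes);
        // Convert the byte array to an uppercase hexadecimal string
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.Length; i++)
        {
            sb.Append(hash[i].ToString("X2"));
        }
        return sb.ToString();
    }
}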

This code computes the MD5 hash of the input and returns it as a hexadecimal string. Generally we give a hash to external parties so that, if they have the same input, they can generate the hash in the same way and compare the two values. The MD5 algorithm is stable, uniform and efficient, but it is not secure: collisions can be produced deliberately, so do not use it where security matters.
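The compare step can be sketched as follows. The class and method names are hypothetical; the idea is simply that the receiver recomputes the hash over the data it received and checks it against the value the sender published.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class ChecksumDemo
{
    // MD5 of the data as an uppercase hexadecimal string.
    public static string Md5Hex(byte[] data)
    {
        using (MD5 md5 = MD5.Create())
        {
            byte[] hash = md5.ComputeHash(data);
            // BitConverter produces "90-01-50-..."; drop the dashes.
            return BitConverter.ToString(hash).Replace("-", "");
        }
    }

    // True when the data hashes to the expected value (hex case is ignored).
    public static bool Matches(byte[] data, string expectedHex)
    {
        return string.Equals(Md5Hex(data), expectedHex, StringComparison.OrdinalIgnoreCase);
    }
}
```

For example, ChecksumDemo.Matches(Encoding.UTF8.GetBytes("abc"), "900150983cd24fb0d6963f7d28e17f72") returns true, while any change to the data makes it return false. This only detects accidental corruption; it does not protect against someone who tampers with the data on purpose.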

SHA 1 (Secure Hash Algorithm)

using System.Security.Cryptography;
using System.Text;

public static string CalculateSHA1(string text, Encoding enc)
{
    byte[] buffer = enc.GetBytes(text);
    using (SHA1CryptoServiceProvider crypto = new SHA1CryptoServiceProvider())
    {
        byte[] hash = crypto.ComputeHash(buffer);
        // Convert the byte array to an uppercase hexadecimal string
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < hash.Length; i++)
        {
            sb.Append(hash[i].ToString("X2"));
        }
        return sb.ToString();
    }
}

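SHA-1 belongs to the Secure Hash Algorithm family and produces a 160-bit hash. You can compute SHA-2 hashes in the same way by replacing SHA1CryptoServiceProvider with SHA256CryptoServiceProvider for a 256-bit hash, SHA384CryptoServiceProvider for a 384-bit hash, or SHA512CryptoServiceProvider for a 512-bit hash. On recent versions of .NET these provider classes are marked obsolete, and the factory methods on the base types, such as SHA256.Create(), are the suggested replacement.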
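Since all of these classes derive from HashAlgorithm, the hex conversion can be written once and reused. The following is a minimal sketch using the Create() factory methods; the class and helper names are hypothetical, and UTF-8 is an assumed encoding for the convenience methods.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;

public static class HashFamilyDemo
{
    // Hypothetical helper: hash the text with whichever algorithm is passed in
    // and return the result as an uppercase hexadecimal string.
    public static string ComputeHex(HashAlgorithm algorithm, string text, Encoding enc)
    {
        byte[] hash = algorithm.ComputeHash(enc.GetBytes(text));
        StringBuilder sb = new StringBuilder(hash.Length * 2);
        foreach (byte b in hash)
        {
            sb.Append(b.ToString("X2"));
        }
        return sb.ToString();
    }

    public static string Sha256Hex(string text)
    {
        using (SHA256 sha = SHA256.Create())
        {
            return ComputeHex(sha, text, Encoding.UTF8);
        }
    }

    public static string Sha512Hex(string text)
    {
        using (SHA512 sha = SHA512.Create())
        {
            return ComputeHex(sha, text, Encoding.UTF8);
        }
    }
}
```

Sha256Hex always returns 64 hexadecimal characters (256 bits) and Sha512Hex returns 128 (512 bits), whatever the length of the input.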

Remember
SHA-1 is no longer considered secure: practical collision attacks against it have been demonstrated. For anything security related, use SHA-2 or later.

I hope this post helps.

Thanks for reading.

Abhishek Sur

Abhishek Sur has been a Microsoft MVP since 2011. He is an architect on the .NET platform. He has profound theoretical insight and years of hands-on experience with different .NET products and languages. He leads the Microsoft User Group in Kolkata, named KolkataGeeks, and regularly organizes events and seminars in various places to spread .NET awareness. He is associated with the Microsoft Insider list on WPF and C#, and is in constant touch with product group teams. He blogs at http://www.abhisheksur.com. His book: Visual Studio 2012 and .NET 4.5 Expert Development Cookbook. Follow Abhishek on Twitter: @abhi2434