It is human nature to want to protect sensitive information. In elementary school, you would pass notes by folding the paper many times before stapling it so no one could read it. As you got older, you realized mail goes in an envelope to protect the privacy of the enclosed letters.
And now as we move into a digital age, the way we protect ourselves has changed, but privacy is just as important (if not more important). In 2018, we have more passwords than days in the week. We are now bombarded by buzzwords like encryption, hash functions, and multi-factor authentication. The ways we protect ourselves and our data are rapidly changing, and it is important to stay up to date on emerging technologies.
What is Encryption? What is Hashing?
In previous articles, I have mentioned hashing, but what does it really mean to hash something? Hashing is a very popular concept when it comes to blockchain technology. This article will examine questions like: What is hashing? What is encryption? And how are encryption and hashing different? Comparing encryption vs hashing will help us understand their specific use cases.
2 Types of Encryption
Encryption is the process of taking a message and scrambling its contents so that only specific people are able to view the message. There are 2 main types of encryption: symmetric and asymmetric. It will help to first examine symmetric encryption, as this will allow us to understand some of the reasons asymmetric encryption was created.
To break down symmetric encryption simply, we will use the example of 2 coworkers who are looking to exchange a sensitive document. We will call these coworkers Alex and Becky. In this situation, Alex has a sensitive document that he needs to share with Becky. Alex uses an encryption program to lock and protect his document with a password or passphrase that he chooses. Alex then sends the encrypted file to Becky; however, she cannot open it until Alex provides her with the passphrase used to lock the document. The main problem with symmetric encryption in this scenario becomes: how does Alex safely share the passphrase (or key) with Becky? Sending it via email is quite risky, and could lead to the password ending up in the hands of unwanted users. Additionally, if someone got hold of the password, they would be able to decrypt any future messages between Alex and Becky that are locked with it.
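To make the shared-key idea concrete, here is a minimal sketch in Python. It uses a toy one-time-pad style XOR, which is an assumption chosen for brevity and is not what a real encryption program does (real tools use vetted ciphers such as AES). The function name `xor_bytes` and the sample document are made up for illustration. The point is only that the same secret both locks and unlocks the data.

```python
import secrets

def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each data byte with the matching key byte."""
    if len(key) < len(data):
        raise ValueError("key must be at least as long as the data")
    return bytes(d ^ k for d, k in zip(data, key))

document = b"Q3 salary review"

# The shared secret: Alex generates it and must somehow get it to Becky.
shared_key = secrets.token_bytes(len(document))

ciphertext = xor_bytes(document, shared_key)   # Alex locks the document
recovered = xor_bytes(ciphertext, shared_key)  # Becky unlocks it with the SAME key

print(recovered == document)
```

Anyone who intercepts `shared_key` can run the same second step, which is exactly the key-sharing problem described above.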
This issue is the exact problem that asymmetric encryption aims to solve. When you think of asymmetric encryption, think of a mailbox in front of someone’s house. The mailbox is exposed and accessible to anyone who knows its location. It is safe to say that the location of the mailbox is completely public: anyone is free to go there and drop in a letter. However, only the owner of the mailbox has the key to open the box and access the messages.
With this example in mind, let’s go back to the technical descriptions. When using asymmetric encryption, both parties involved (Alex and Becky) have to generate a key pair on their computers. A very popular and common way to do this is the RSA algorithm. RSA generates a public and a private key that are mathematically linked to one another. The public key is used to encrypt data, and only the corresponding private key can be used to decrypt it. One important fact to remember is that, even though the keys are linked, it is not computationally feasible to derive the private key from the public key. This means that if someone gets your public key, they have no practical way to work out the private key it is linked with.
Going back to the mailbox example, the mailbox’s publicly known location plays the role of the public key. The owner of the mailbox is the only one with the private key, which is needed to open the box. Let’s now look at how the exchange takes place between Alex and Becky using asymmetric encryption.
Alex and Becky start by exchanging public keys: Alex gives his public key to Becky, and Becky gives her public key to Alex. Once the public keys are exchanged, Alex is ready to send the document to Becky.
He encrypts the sensitive file using Becky’s public key and then sends the encrypted file to Becky. When Becky receives the encrypted document, she uses her private key to unlock and view the sensitive information. Because Alex used Becky’s public key to encrypt the message, her private key is the only key that can decrypt it. Even Alex himself would not be able to decrypt the message, because he does not have access to Becky’s private key.
In short, the strength of asymmetric encryption ultimately relies on users keeping their private keys securely stored. If an attacker gains access to a user’s private key, they would be able to decrypt all messages meant for that particular user.
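The exchange above can be sketched with "textbook" RSA on deliberately tiny numbers. This is a teaching toy under stated assumptions: the primes 61 and 53 and the exponent 17 are illustrative choices, the function names are hypothetical, and real RSA uses primes hundreds of digits long together with a padding scheme. The "document" is represented as a single small number.

```python
# Toy "textbook" RSA with tiny primes. Illustration only, not secure.

def make_key_pair(p: int, q: int, e: int = 17):
    """Return ((e, n), (d, n)): a public key and its matching private key."""
    n = p * q
    phi = (p - 1) * (q - 1)
    d = pow(e, -1, phi)  # modular inverse of e modulo phi (Python 3.8+)
    return (e, n), (d, n)

def encrypt(message: int, public_key) -> int:
    e, n = public_key
    return pow(message, e, n)

def decrypt(ciphertext: int, private_key) -> int:
    d, n = private_key
    return pow(ciphertext, d, n)

# Becky generates her key pair and publishes only the public half.
becky_public, becky_private = make_key_pair(61, 53)

message = 65                                    # must be smaller than n
ciphertext = encrypt(message, becky_public)     # Alex uses Becky's PUBLIC key
plaintext = decrypt(ciphertext, becky_private)  # only Becky's PRIVATE key undoes it

print(plaintext == message)
```

Alex never needs Becky's private key at any point, which is why nothing secret has to travel between them.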
Asymmetry in Action
Asymmetric encryption is used in a variety of situations where security is very important. For example, every time you visit a secure website (via HTTPS), the site presents an SSL certificate (SSL stands for Secure Sockets Layer; its modern successor is called TLS). The certificate carries the server’s public key, and asymmetric cryptography is used to help establish the encrypted link between the web server and the browser. Asymmetric encryption is also used to send and receive secure emails via PGP. Pretty Good Privacy (PGP) is a program that provides encryption and cryptographic authentication for data communication. The last example is Bitcoin, more specifically your digital currency wallet. Your wallet has a public and a private key. This allows you to receive funds from others at an address derived from your public key, while only the owner can spend them, by signing the transaction with the private key.
What is Hashing?
Hashing is the process of converting an input of any length into a fixed-size string of text using a mathematical function. This means that any text, no matter how long it is, can be converted into a fixed-length string of numbers and letters using an algorithm.
The message or file you want to hash is called the Input. The algorithm used in the process of converting the file is called a Hash Function. When you run an input through a hash function, you get the Hash Value.
Each hash value (or output) should be effectively unique. Because inputs can be any length while outputs have a fixed size, 2 different inputs can in theory produce the same hash value (a collision), but with a good cryptographic hash function it needs to be computationally infeasible to find such a pair. In addition, the same message should always produce the same hash value. This quality makes a hash similar in nature to the human fingerprint: for practical purposes it is unique and can be used to identify someone or something.
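A quick way to see the input, hash function, and hash value in action is Python's standard `hashlib` module. This is a small sketch using SHA-256 as the hash function; the helper name `sha256_hex` is just an illustrative wrapper.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Run the input through SHA-256 and return the hash value as hex text."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short_hash = sha256_hex("hi")             # a 2-character input
long_hash = sha256_hex("hi" * 10_000)     # a 20,000-character input

print(len(short_hash), len(long_hash))    # 64 64 (256 bits as hex, whatever the input size)
print(sha256_hex("hi") == short_hash)     # True: same input, same hash value
print(short_hash == long_hash)            # False: different inputs, different hash values
```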
Hashing in Blockchain
In blockchain, hashes are used to represent the current state of the chain and to ensure its immutability. Every transaction contains specific bits of information. This information includes the amount being sent, the sending and receiving addresses, a timestamp, and much more. All of this information is run through a hash function to create a hash called the transaction ID. The transaction ID is a hash value that can be used to identify a transaction and verify that it has taken place on the network.
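As a rough sketch of the idea, the snippet below hashes a made-up transaction record. The field names, the JSON serialization, and the single SHA-256 pass are all assumptions for illustration; real blockchains define their own binary formats (Bitcoin, for instance, applies SHA-256 twice to a serialized transaction).

```python
import hashlib
import json

def transaction_id(tx: dict) -> str:
    """Hash all transaction fields into a single identifier."""
    # Sort the keys so the same fields always serialize to the same bytes.
    serialized = json.dumps(tx, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(serialized).hexdigest()

# Hypothetical transaction with the kinds of fields mentioned above.
tx = {
    "sender": "alex-address",
    "receiver": "becky-address",
    "amount": 5,
    "timestamp": 1530000000,
}

tx_id = transaction_id(tx)
tampered_id = transaction_id(dict(tx, amount=500))  # same transaction, altered amount

print(tx_id)
print(tx_id == tampered_id)  # False: any change to the data changes the ID
```

Because any edit to the fields produces a different ID, the hash doubles as a tamper check, which is what ties it to immutability.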
Now that we have a basic understanding of both encryption and hashing, let’s compare the 2 technologies. Encryption and hashing are very similar in concept. You have data that you want to protect, and you use a specific protocol to scramble the data. This is done in a way so that even if the file is lost or gets into the wrong hands, no one can access the information.
Protection is where the similarity between encryption and hashing ends. There is one prime difference between a cryptographic hash and an encrypted file. When you encrypt data, you do so because at some point you want to be able to retrieve the data. This means that you only want the data to be encrypted for a certain period of time (e.g. during an email transfer); on the other end, you want the recipient to be able to decrypt the file and access the information it contains. Conversely, when you process data using a cryptographic hash, you will not be able to take the hash and decrypt it to get the original content. You can think of encryption as a 2-way function, where you can encrypt and then decrypt the data. Hashing is a 1-way function: knowing the hash does not give you the original data.
Passwords and Security
Now that we know how encryption and hashing compare with one another, let’s dive deeper into how each is used for the storage and security of passwords.
There are 3 main ways that passwords can be stored. A password can be stored as plain text, it can be encrypted, or it can be hashed and the hash value stored. The most basic and dangerous of these methods is plain text. With plain text storage, if a hacker gained access to the company database, they would see all of the usernames and passwords in it. You would like to think that this type of storage doesn’t happen. I wish that were the case, but we have seen numerous security breaches over the years in which passwords turned out to be stored in plain text.
An alternative to plain text storage is encryption. You take the user’s password and, before you store it, you encrypt it with an encryption key. This prevents hackers from directly reading users’ real passwords; however, it is still a little risky. Underneath the encryption layer of security is still a plain text password. If an attacker hacks the database, he will only see the encrypted passwords. However, if he can also gain access to the encryption key, then he will gain access to all the password information. The problem with encryption as it relates to password protection is that it works 2 ways. This capability makes it ideal for securely sharing files, but riskier for securely storing passwords.
This brings us to the third technique for password storage: using a cryptographic hash function. As we know, a hash function takes an input and converts it to a fixed-length string of characters that is, for practical purposes, unique to that input. To illustrate this better, below you will see 2 slightly different inputs run through the same hash function.
There are many different hash functions available to use; below is real output from SHA-256.
A Chain of Blocks → 615F4A914BF66218009873A5E9606FDA4681BB3776BAD06B70492C4D0266DC81
A Chain Of Blocks → ED8707B2F8606D5B5363F2FF2ADE1CB0AF66366C051ACAB5DDD2BBC9D52556D5
As you can see from the above example, changing just the capitalization of the ‘O’ in the word ‘of’ gives us an entirely different output.
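You can try this comparison yourself with a few lines of Python. The sketch below hashes both phrases with SHA-256 and counts how many of the 256 output bits differ. Note that a digest depends on the exact bytes hashed (encoding, trailing spaces or newlines), so the values printed here are not guaranteed to match the ones listed above character for character. The helper names are illustrative.

```python
import hashlib

def sha256_hex(text: str) -> str:
    """Return the SHA-256 hash of the text as upper-case hex."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest().upper()

def differing_bits(hex_a: str, hex_b: str) -> int:
    """Count the bit positions where two hex-encoded hashes disagree."""
    return bin(int(hex_a, 16) ^ int(hex_b, 16)).count("1")

lower = sha256_hex("A Chain of Blocks")
upper = sha256_hex("A Chain Of Blocks")

print(lower)
print(upper)
print(differing_bits(lower, upper), "of 256 bits differ")
```

With a well-behaved hash function you would expect roughly half of the bits to flip for even a one-letter change.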
How is Hashing Different?
Hash functions are very different from encryption because they only work 1 way. You can calculate the hash of a password, but you cannot take the hash and turn it back into the original password. Using hashes allows a company to verify that you are logging in with the correct password, without having to store the real password in plain or encrypted form. When a user enters their password to log in, the submission is hashed and compared to the hash of the password that is saved in the database.
The one-way capability of hashing seemingly makes it a good choice for storing passwords, but every method has drawbacks. Because an input will always create the same output, hackers can create their own database of hashes for commonly used passwords. This is called a Rainbow Table, a precomputed table for reversing cryptographic hash functions.
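The login check described above might look something like the following sketch. The in-memory `stored_hashes` dictionary stands in for a database, and the usernames, password, and function names are hypothetical. It uses plain SHA-256 only to mirror the description; the drawbacks of doing just this are covered in the next paragraphs.

```python
import hashlib
import hmac

def hash_password(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# Stand-in for the user table: it holds hashes, never the passwords themselves.
stored_hashes = {"alex": hash_password("correct horse battery staple")}

def check_login(username: str, submitted_password: str) -> bool:
    stored = stored_hashes.get(username)
    if stored is None:
        return False
    # Hash the submission and compare it to the saved hash.
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(stored, hash_password(submitted_password))

print(check_login("alex", "correct horse battery staple"))  # True
print(check_login("alex", "wrong guess"))                   # False
```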
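The core of the attack can be sketched as a simple precomputed lookup. This is a simplification: a true rainbow table stores chains of hashes to save space, whereas the dictionary below stores every hash directly. The password list is a made-up sample.

```python
import hashlib

def sha256_hex(password: str) -> str:
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

# The attacker hashes common passwords ahead of time, once.
common_passwords = ["123456", "password", "qwerty", "letmein"]
lookup = {sha256_hex(p): p for p in common_passwords}

# Later, a hash stolen from a breached database is just a dictionary lookup.
leaked_hash = sha256_hex("qwerty")
print(lookup.get(leaked_hash))                # qwerty
print(lookup.get(sha256_hex("x7!Lq#92vB")))   # None: not in the precomputed list
```

The expensive hashing work happens once, and the same table can then be reused against every unsalted database that uses the same hash function.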
Add Some Salt…
Salting a password is a method to prevent a hacker from being able to use a Rainbow Table. When a user sets their password for the 1st time, a string of random characters (the salt) is added to the end of the password. The password along with the added salt is then hashed and added to the database. With salting, you then need to answer the question of where the salt is stored. Most of the time it is stored in plain text in the database, alongside the hash. This may sound counter-intuitive, but the salt does not have to be secret to do its job. Because every user gets a different random salt, a table precomputed for unsalted passwords no longer matches, and an attacker has to redo the hashing work separately for each salt.
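Here is a minimal sketch of salting as described above, with the salt appended to the password before hashing. The function names and the 16-byte salt length are assumptions, and the single SHA-256 pass is kept only to match the description; production systems typically use a deliberately slow password-hashing function such as PBKDF2, bcrypt, or scrypt instead.

```python
import hashlib
import hmac
import secrets

def hash_with_salt(password: str, salt: bytes) -> str:
    # The salt is appended to the end of the password before hashing.
    return hashlib.sha256(password.encode("utf-8") + salt).hexdigest()

def register(password: str):
    """Return (salt, hash); both are stored in the database."""
    salt = secrets.token_bytes(16)  # fresh random salt for every user
    return salt, hash_with_salt(password, salt)

def verify(password: str, salt: bytes, stored_hash: str) -> bool:
    return hmac.compare_digest(stored_hash, hash_with_salt(password, salt))

# Two users who pick the same password still end up with different hashes.
salt_a, hash_a = register("qwerty")
salt_b, hash_b = register("qwerty")

print(hash_a == hash_b)                    # False (the salts differ)
print(verify("qwerty", salt_a, hash_a))    # True
```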
Can’t have Salt without Pepper…
When you have salt, you always seem to have pepper as well. In hashing, a pepper is similar to a salt in that it is an extra value added to the password before hashing. However, a pepper is much shorter; for this example we will use a single upper- or lower-case letter of the alphabet. The pepper is not stored in the database, so when a user logs in, the system tries all 52 possibilities (a-z and A-Z) until one matches the hashed password. This technique slows an attacker down, as they would have to try up to 52 different candidates for every password they guess.
In conclusion, when it comes to passwords and protection in general, it is smart to take a layered approach. In this article, we have looked at and compared encryption and hashing, but nothing says that you can’t use the 2 together. This is what a lot of the big companies are moving toward, with security in high demand. For example: hash a password, then encrypt the hash, then hash again with an added salt, and store the encryption keys on a separate server. The moral of the story is that you greatly reduce the odds of a successful hack if you make the attack more costly to complete than the information is worth.
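The 52-letter pepper scheme described above can be sketched as follows. The function names are hypothetical, the pepper is appended to the password (one possible placement), and salting is left out to keep the example short; in practice the two would be combined.

```python
import hashlib
import secrets
import string

PEPPERS = string.ascii_letters  # a-z and A-Z: 52 possible peppers

def hash_with_pepper(password: str, pepper: str) -> str:
    return hashlib.sha256((password + pepper).encode("utf-8")).hexdigest()

def register(password: str) -> str:
    """Pick a random pepper, hash with it, and store ONLY the hash."""
    pepper = secrets.choice(PEPPERS)  # used once, then discarded
    return hash_with_pepper(password, pepper)

def verify(password: str, stored_hash: str) -> bool:
    # The pepper was never saved, so try every candidate until one matches.
    return any(hash_with_pepper(password, p) == stored_hash for p in PEPPERS)

stored = register("qwerty")
print(verify("qwerty", stored))   # True
print(verify("qwertz", stored))   # False
```

A legitimate login costs at most 52 hashes, while an attacker pays that factor on every single password guess.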