
# What is Cryptography?

This chapter gives you an introduction to the mysterious world of ciphers, codes and other secret forms of communication.

EPFT UIJT MPPL MJLF OPOTFOTF UP ZPV? Does this look like nonsense to you? Did you notice that we asked you the same question twice? Confused?

Look at the set of random letters in that first “sentence”. Replace each letter with the letter which comes before it in the alphabet, e.g. replace B with A, P with O, F with E and so on, and reread the sentence. It’s easy, but in case you’re slow (which we doubt considering you’re a Digit reader), let us help. That would translate to: “DOES THIS LOOK LIKE NONSENSE TO YOU?”. Now read the third sentence and the confusion should go away. You see, cryptography is simple! All it takes is a trick up your sleeve and the patience to decode or encrypt a message.

What is Cryptography?

Cryptography is the art and science of converting ordinary information into gibberish and converting it back to its original, meaningful form. It can be done using either simple procedures or very complicated mathematical algorithms. Don’t be intimidated by its complexities though. As always, we’re here to simplify it for you. And don’t forget, you did manage to understand the first code we threw at you, didn’t you?

It’s amazing how rotating the alphabet by even just one position is enough to confuse the human mind! Oh, and we call it ‘rotating’, not ‘shifting’, because the alphabet wraps around at the end: ‘Z’, which has no letter after it, is replaced by ‘A’.
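To see the rotation trick in action, here is a minimal Python sketch. The function name `rotate` and its `key` parameter are our own choices for illustration, not part of any standard library.

```python
import string

ALPHABET = string.ascii_uppercase  # "ABCDEFGHIJKLMNOPQRSTUVWXYZ"

def rotate(text: str, key: int) -> str:
    """Rotate each letter A-Z by `key` positions, wrapping around at the ends."""
    out = []
    for ch in text:
        if ch in ALPHABET:
            out.append(ALPHABET[(ALPHABET.index(ch) + key) % 26])
        else:
            out.append(ch)  # spaces and punctuation are left untouched
    return "".join(out)

ciphertext = "EPFT UIJT MPPL MJLF OPOTFOTF UP ZPV?"
plaintext = rotate(ciphertext, -1)  # one step back undoes one step forward
print(plaintext)       # DOES THIS LOOK LIKE NONSENSE TO YOU?
print(rotate("Z", 1))  # A (the wrap-around that makes it a rotation)
```

The `% 26` is what turns a plain shift into a rotation: without it, ‘Z’ would fall off the end of the alphabet.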

Symmetric Cryptography

Cryptography involves tricks like these to scramble messages and hide them from unintended audiences. The above example could be easily understood after basic trial and error even if the code breaker didn’t have the instructions required to unscramble it. To make it unbreakable, we’ll need to employ a sophisticated algorithm. But let’s get to that later, once you’re familiar with some simple algorithms. You’ll also learn to create your own small algorithm to encrypt and decrypt data, and it’s going to be fun.

One of the most prominent secret language systems in India is the famous tapori language well known by Mumbaites. With keywords such as peti, khokha and supari, the language has found wide acceptance in Bollywood movies, especially those revolving around organized crime. Though these words are famous now, they were once known only by street thugs. Notice that the vocabulary consists of actual words from the Hindi language whose meanings differ when used by the seasoned tapori. This is a good example of cryptography gone wrong – not due to the nature of its use but due to its failure to remain a secret language.

Who uses Cryptography?

You. Yes, you do use cryptography. How? Answer the following questions:

2. Do you use Gmail?

3. Have you ever opened a PDF or ZIP file that asked you for a password?

4. Did you ever use an HTTPS site?

5. Do you chat with your friends on the internet?

6. Do you fill forms on the internet?

If the answer to all the questions is ‘no’, then maybe you don’t use cryptography. Our guess is that there’s only a very slim chance you answered ‘no’ to all of them. Cryptography is everywhere and we use it daily. It doesn’t only involve using arcane commands on a black terminal. It’s present in the day-to-day tools we use – zipping software, PDF viewers, word processors, browsers and what not. People who don’t use computers also use cryptography.

Day-to-day tricks we use

You’ve probably heard of celebrities changing contact names in their phone books to avoid having to admit to infidelity. You may have done it too, for worthier reasons such as keeping a relationship under wraps. If you changed your boyfriend’s contact name in your address book to ‘Vampire’ just because he is as loving to you as Edward was to Bella, then you have used cryptography. You may need to get a life, but that apart, changing your sweetheart’s name to one belonging to the opposite sex is nothing but cryptography.

Twitter over HTTPS. Most web browsers allow you to see the cryptographic information of a secure page by clicking the favicon on the address bar

What is not Cryptography?

Let’s say you skipped work for a day to party with some friends. You have only your office computer on which you can save the pictures taken that day. Since you can’t keep them in the main photo collection, you put them inside the C:\Windows\System32\Config folder. That’s clever since no one will ever look for pictures in that folder. Did you just do some crypto? Well, the answer is no.

Cryptography is the science of scrambling information. Merely hiding information cannot be called cryptography. Hence, uploading files to your Dropbox or Google Drive account whose password is unknown to anyone but you would still not qualify as cryptography.

It does, however, if you zipped the set of pictures into a zip file with a password. This is because the software you used for zipping with a password (e.g. WinZip or 7-Zip) uses an encryption algorithm such as AES to encrypt the data – in other words, it modifies the data.

You might ask “Even zipping pictures into a zip file without a password would modify the data. Why doesn’t that count as cryptography?” Remember that cryptography doesn’t just involve ‘altering/modifying the data’. It is a method to ‘alter the data in such a way that unintended people are unable to interpret it’. Given the easy availability of compression software which can easily detect a ZIP file (even if you change the extension), just compressing the picture set won’t protect it.
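A quick way to convince yourself is the sketch below, which uses Python’s standard `zlib` module as a stand-in for a zipping tool (our assumption; a real ZIP archive adds headers around the same kind of compressed data). Notice that no password or key appears anywhere, so anyone holding the compressed bytes can get the original back.

```python
import zlib

# Stand-in for the picture data; any bytes would do.
data = b"party pictures " * 100

compressed = zlib.compress(data)        # the data is modified...
restored = zlib.decompress(compressed)  # ...but anyone can undo it, no secret needed

print(len(data), len(compressed))  # the compressed form is much shorter
print(restored == data)            # True
```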

Another question might arise at this point. At the beginning of this chapter, we used a technique of rotating the letters of the alphabet. Is that a part of cryptography, given that it is even easier to crack than a compressed file is to decompress? Well, it depends. Altering information to render it incomprehensible is one part of cryptography. If trial and error is enough to understand the message, and you didn’t need to be told that rotating the letters of the alphabet was the key, then it’s hardly cryptography at all. But to someone who can’t figure out what those random letters mean, the message remains unbreakable, and confidentiality is achieved. So once again, it depends on who’s attempting to break the code. Given the intelligence of a species that aims to colonize Mars, cracking it is hardly an unimaginable feat. The ‘alphabet rotation’ technique is therefore vulnerable to anyone who intercepts the message, and qualifies only as a weak cryptographic method.
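The trial and error method is easy to automate. The sketch below simply tries every possible rotation and prints the results; the helper names `rotate` and `brute_force` are hypothetical, chosen for this illustration.

```python
def rotate(text: str, key: int) -> str:
    """Rotate letters A-Z by `key` positions; other characters pass through."""
    return "".join(
        chr((ord(ch) - ord("A") + key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )

def brute_force(ciphertext: str) -> list[tuple[int, str]]:
    """Undo every rotation from 1 to 25 and return all the candidates."""
    return [(key, rotate(ciphertext, -key)) for key in range(1, 26)]

candidates = brute_force("EPFT UIJT MPPL MJLF OPOTFOTF UP ZPV?")
for key, guess in candidates:
    print(key, guess)  # a human (or a dictionary check) spots the readable line
```

With only 25 candidates to read through, the code breaker needs neither the instructions nor much patience.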

Breaking down the jargon

If you’ve been using computers for some time and have read this far, you might wonder who really uses these techniques in real life. You’re right, not many people do. If you wanted to secure a name in your phonebook, you’d probably use an app on your smartphone or an online web service to do the job. It is this very laziness that contributes to cryptography’s effectiveness. The fewer the people who know about it (or are bothered to learn about it), the more successful you are at communicating messages without interception.

As you proceed towards learning about cryptography, you’ll come across many terms such as keys, ciphers, encryption, codes and algorithms. Here we’ll talk about them briefly so that you won’t be intimidated by them. You’ll also come across this jargon time and again while reading this little book, so it’s better to know it beforehand. Let’s start with the simplest one.

Plaintext

All scrambled messages were once pieces of text. These pieces which are yet to be scrambled are called ‘plaintext’ in cryptography vocabulary. Back when the science of scrambling messages started taking shape, it was usually text that needed to be sent securely. It’s due to this reason that the original message which needs to be encrypted is called plaintext.

In the present context though, the word ‘plaintext’ doesn’t do justice to all cases of encryption. We strive to keep all data communication secure and that surely doesn’t include just text. Images, videos, programs and documents that don’t fit into the ‘simple text’ mold are also transferred securely. Regardless of all this, you’ll find numerous instances of data that needs to be encrypted being called plaintext.

Ciphertext

After you have scrambled the plaintext, you get an output which should pretty much look like mystical nonsense. This nonsense is called ‘ciphertext’. It’s also noteworthy that the algorithms used to encrypt data are often called ‘ciphers’. Thus, the output they produce is called ciphertext.

Encoding & Decoding/ Encryption & Decryption

There’s no shortage of examples where ‘encryption’ and ‘decryption’ are mistaken for ‘encoding’ and ‘decoding’. The reason for this is the similarities they share. Both transform data into another format and both are reversible (unlike ‘hashing’). Though there are differences (flip over to Chapter 3 for in-depth information on them), all four are a part of cryptographic literature. For the record, the alphabet rotation technique in the beginning is an example of encoding/decoding.

To avoid confusion, for now let’s assume that the terms are interchangeable.
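For a feel of what a reversible transformation looks like on a computer, here is a small sketch using Base64 from Python’s standard library. Base64 is our own choice of example and is not discussed in this chapter; the point is only that the data changes form and can be turned back exactly.

```python
import base64

message = b"DOES THIS LOOK LIKE NONSENSE TO YOU?"

encoded = base64.b64encode(message)  # transform into another format
decoded = base64.b64decode(encoded)  # ...and transform it straight back

print(encoded)             # looks like gibberish at first glance
print(decoded == message)  # True: the transformation is fully reversible
```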

(Cryptographic) Algorithm

In computer science, an algorithm is a series of instructions which, when executed, solves a problem. The word retains its meaning in cryptography. The algorithm one uses to encrypt or decrypt messages is called a ‘cryptographic algorithm’. Note that cryptographic algorithms always come in pairs. The first algorithm of the pair is responsible for encryption, i.e. converting plaintext to ciphertext, and the second is responsible for decryption, i.e. converting the ciphertext back to plaintext. In many cases, though not all (typically with symmetric algorithms), the decryption algorithm is basically the reverse of the encryption algorithm. For this reason, most of the literature on cryptography refers to both algorithms as a single algorithm. They’re usually implemented as a single program too, hence the style of reference.

Encryption Key

Encryption or ciphering requires two things – an algorithm and at least one key. The ciphertext depends on both the algorithm and the key: change either one and the result will differ. Taking the alphabet rotation example again, the algorithm is ‘Forward Rotation’ and the ‘Key’ is ‘1’. If we had rotated the alphabet by 2 places (e.g. replacing A with C, B with D and so on), the key would have been ‘2’.

Note that if you change the algorithm (like from forward rotation to backward rotation) or the key (e.g. rotate 3 places or 7 places) then the ciphertext will change. A good algorithm and a strong key are the necessary properties to make an encryption system strong.
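The effect of changing the key or the algorithm can be sketched in a few lines of Python. The function `rotate` is a hypothetical helper written for this illustration; a negative key stands in for ‘backward rotation’.

```python
def rotate(text: str, key: int) -> str:
    """Rotate letters A-Z by `key` positions (negative means backward)."""
    return "".join(
        chr((ord(ch) - ord("A") + key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )

plaintext = "HELLO"
print(rotate(plaintext, 1))   # IFMMP  forward rotation, key 1
print(rotate(plaintext, 2))   # JGNNQ  same algorithm, different key
print(rotate(plaintext, -1))  # GDKKN  backward rotation, key 1
```

Same plaintext, three different ciphertexts: both the algorithm and the key decide what comes out.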

Symmetric Ciphers

‘Symmetric ciphers’ or ‘symmetric algorithms’ are algorithms which use the same key to encrypt as well as decrypt data. Traditionally, this is the type of algorithm that methods of cryptography have relied on. The example that we gave is a symmetric cipher. If you encrypt the plaintext by replacing each letter with the letter one place ahead in the alphabet, you can’t get back to the plaintext by replacing the letters in the ciphertext with letters two places back. You have to move the letters back by the same number of positions.

Symmetric ciphers are named as such because both sides of the process (encryption and decryption) utilize the same key. Symmetric cryptography is also known as ‘secret key’ or ‘private key’ cryptography.
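As a rough sketch of that symmetry, the snippet below encrypts with one key and then tries to decrypt with both the same key and a different one. The names `encrypt`, `decrypt` and `shared_key` are ours, and the rotation cipher is used only because it is the chapter’s running example.

```python
def _rotate(text: str, key: int) -> str:
    return "".join(
        chr((ord(ch) - ord("A") + key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )

def encrypt(plaintext: str, key: int) -> str:
    return _rotate(plaintext, key)    # move letters forward by `key`

def decrypt(ciphertext: str, key: int) -> str:
    return _rotate(ciphertext, -key)  # move letters back by the same `key`

shared_key = 7  # both sender and receiver must know this value
secret = encrypt("MEET AT NOON", shared_key)

print(decrypt(secret, shared_key))      # the original message comes back
print(decrypt(secret, shared_key + 1))  # wrong key: still gibberish
```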

Asymmetric Ciphers

Asymmetric ciphers are better known as ‘public key cryptography’ algorithms. In a public key cryptography method, the key which is used to encrypt plaintext to ciphertext can’t be used to decrypt the ciphertext back into plaintext. The two keys involved are called the public key and the private key.

Public Key: Public keys are used to encrypt plaintext. They’re named as public keys because they can be made public.

Private Key: Private keys are used to decrypt ciphertext back to plaintext. They’re named as such because they’re meant to be private.

The two keys are mathematically related, but in such a way that knowing the public key doesn’t let you work out the private key in any practical amount of time. This is what allows the public key to be public. Anyone can have it and send you a message encrypted with it, but no one other than you can decrypt a message that somebody else has encrypted for you.

Let’s consider a hypothetical situation. Agent Archer wants to send a secret message to his boss, Jim, so he asks Jim for his public key. Jim creates a public-private key pair and sends the public key to Archer. A spy named Cyril uncovers the public key! Agent Archer then encrypts the message and sends it to his boss. The spy also intercepts the encrypted message. Meanwhile, Jim has received the encrypted message and uses his private key to decrypt it. Cyril is still trying to figure out how to decrypt the message using the public key!
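The story can be played out with a toy version of RSA, the asymmetric algorithm mentioned later in this chapter. This is only a sketch under loud assumptions: the primes 61 and 53 and the exponent 17 are tiny textbook values picked by us, the message is a single small number, and real systems use enormously larger keys plus padding.

```python
# --- Jim creates a key pair (toy-sized numbers, for illustration only) ---
p, q = 61, 53             # two secret primes
n = p * q                 # 3233, part of both keys
phi = (p - 1) * (q - 1)   # 3120
e = 17                    # public exponent; (n, e) is the public key
d = pow(e, -1, phi)       # private exponent; needs Python 3.8+

# --- Archer encrypts with Jim's PUBLIC key ---
message = 65              # the secret, encoded as a number smaller than n
ciphertext = pow(message, e, n)

# --- Jim decrypts with his PRIVATE key ---
recovered = pow(ciphertext, d, n)

print(ciphertext)  # 2790: this is all Cyril gets to see, along with (n, e)
print(recovered)   # 65
```

Cyril holds `n`, `e` and the ciphertext, but not `d`. With numbers this small he could work it out in no time; with properly sized keys he could not.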

This property of asymmetric algorithms makes them indispensable in the modern era. They allow you to freely send the encryption key as well as the message! Since the key for decryption is separate from the one used for encryption, the communication stays safe!

However, asymmetric algorithms are usually much more costly on the CPU than symmetric algorithms. This brings up an interesting usage of a public key (asymmetric) algorithm, whereby it is only used to transfer the key of a symmetric algorithm. This allows the key to be transferred securely while minimizing the amount of computational power required to do the actual encryption or decryption.
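Here is a rough sketch of that combination. Everything in it is a toy assumption of ours: the RSA numbers are the tiny textbook values 3233, 17 and 2753, and the ‘symmetric cipher’ is a simple repeating XOR, which is far too weak for real use but keeps the idea visible.

```python
import secrets

# Jim's toy RSA key pair: (n, e) is public, d is private.
n, e, d = 3233, 17, 2753

def xor_cipher(data: bytes, key: bytes) -> bytes:
    """Toy symmetric cipher: XOR with a repeating key. The same call decrypts."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))

# --- Archer's side ---
session_key = secrets.randbelow(n - 2) + 2   # random symmetric key, 2..n-1
wrapped_key = pow(session_key, e, n)         # slow asymmetric step, tiny input
ciphertext = xor_cipher(b"MEET AT NOON", session_key.to_bytes(2, "big"))

# --- Jim's side ---
unwrapped_key = pow(wrapped_key, d, n)       # recover the symmetric key
recovered = xor_cipher(ciphertext, unwrapped_key.to_bytes(2, "big"))

print(recovered)  # b'MEET AT NOON'
```

The expensive public key operation touches only the short session key, while the message itself, however long, goes through the cheap symmetric cipher.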

Key strength

Key strength usually refers to the length of the key being used for the algorithm. However, length isn’t always the measure of its cryptographic strength. The randomness of the characters used in the key is also an important factor in determining the strength of the cipher.
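A little arithmetic shows why randomness matters as much as length. The sketch below compares two keys of the same length, 8 characters: one made only of lowercase letters and one made of 8 truly random bytes. The comparison is our own illustration, not a measurement of any particular cipher.

```python
import math
import secrets

# Same length (8 symbols), very different number of possible keys.
lowercase_keys = 26 ** 8   # key typed using only the letters a-z
random_keys = 256 ** 8     # key made of 8 random bytes

print(round(math.log2(lowercase_keys), 1))  # about 37.6 bits of strength
print(math.log2(random_keys))               # 64.0 bits of strength

# A key drawn from a cryptographically secure random source:
key = secrets.token_bytes(8)
print(key.hex())
```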

Hashing

While you can encrypt passwords for any system (website user passwords, operating system account passwords or passwords to access a program), anything that is encrypted can also be decrypted. Since passwords are still the major way to authenticate users for various purposes, they do pose a threat to security.

Make that a serious threat to security in the case of Open Source systems such as Linux. Since the source code is open, anyone, hackers included, can study exactly how and where the software stores its passwords. If the password file gets into the wrong hands, security would be a thing of the past. Obviously, storing passwords in plaintext (original) format would be downright stupid!

Compression software such as 7-Zip helps in encrypting compressed archives

For these reasons passwords are stored neither in encrypted form nor in plain format. So how are they stored? Well, they’re stored in ‘hashed’ form. Hashing is a technique to map text of variable length to a string of fixed length in such a way that the only way to find out the original text from the hash is to run a brute force attack. Since hashing functions are irreversible, they don’t strictly qualify as encryption. However, they’re associated with security often enough, and it’s easy to find them mentioned alongside passwords in the context of cryptography.
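The fixed-length property is easy to see with Python’s standard `hashlib` module. SHA-256 is our choice of hash function for this sketch, and `hash_password` is a hypothetical helper; real password storage typically adds a per-user salt and a deliberately slow function, which this sketch leaves out.

```python
import hashlib

def hash_password(password: str) -> str:
    """Return the SHA-256 hash of a password as a hexadecimal string."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()

short_hash = hash_password("a")
long_hash = hash_password("a much, much longer password " * 50)

print(len(short_hash), len(long_hash))         # 64 64: always the same length
print(hash_password("a") == short_hash)        # True: same input, same hash
print(hash_password("b") == short_hash)        # False: different input
```

To check a login, the system hashes whatever the user typed and compares the result with the stored hash; the original password never needs to be recovered.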

Cryptographic algorithms and security – married by keys

Though we have a chapter dedicated to cryptanalysis, we’ll give you a quick rundown on what it takes to break a cipher and what factors govern the security of a cryptographic algorithm.

Algorithm strength

A very strong key and a weak algorithm will result in a weak cipher. For convenience’s sake, let’s take the example we showed in the beginning again. We’re rotating the alphabet by n places. Here n is the key. Now, it won’t matter if we put n = 200 or n = 1000; there’s a limit to its effectiveness. With 26 letters, there are only 25 useful rotations. Rotating by 26 places brings every letter back to where it started, as if the rotation was never done, so any larger n is equivalent to the remainder of n divided by 26. In simple terms, the algorithm itself limits how difficult it can make things for an unintended recipient trying to crack the cipher (converting ciphertext back to plaintext).
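The limit is easy to verify. In the sketch below (with `rotate` again being our own illustrative helper), a ‘huge’ key such as 200 or 1000 turns out to be nothing more than a small key in disguise.

```python
def rotate(text: str, key: int) -> str:
    """Rotate letters A-Z by `key` positions, wrapping around."""
    return "".join(
        chr((ord(ch) - ord("A") + key) % 26 + ord("A")) if "A" <= ch <= "Z" else ch
        for ch in text
    )

message = "HELLO"

print(rotate(message, 26) == message)               # True: 26 steps is no rotation
print(200 % 26, 1000 % 26)                          # 18 12
print(rotate(message, 200) == rotate(message, 18))  # True
print(rotate(message, 1000) == rotate(message, 12)) # True
```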

Not only this, the algorithm is very simple too. Rotating alphabets is something a child can do easily to pass messages to his friends in a classroom. It’s no big deal. Consider a scenario where the teacher catches the paper on which he was passing the message. The message is encrypted but how difficult would it be for the teacher to interpret? All she would have to do is put herself in the child’s place thinking “what can a child do to encrypt a message?” Add some trial and error into the mix and the message being passed would be easily interpreted by the unintended recipient (the teacher in this case). Hence, for maintaining security, using a good algorithm is important.

Computational power required

Okay, so you own a mobile phone that’s more powerful than a computer NASA sent to the moon years ago. But is that power enough to break a DES cipher? What about a Triple DES? Or RSA perhaps? Nope, right? Your mobile phone wouldn’t be able to do it. If you have the ciphertext and the algorithm that was used, then you need to know the key in order to break the cipher. In most cases, the only way to figure out the key would be to run a brute-force attack, which runs the algorithm with every possible key. Once the data decrypts into an acceptable pattern, the attack can stop and record (and display) the key that was used for encryption. Now, brute-forcing a cryptographic algorithm is difficult for a simple reason: a good algorithm has an enormous number of possible keys, and trying each one costs computation. To break it would take plenty of computational resources and time. That brings us to another factor: time.

Asymmetric algorithms are also used for digital signatures

Time

We mentioned that your mobile phone can’t crack DES or RSA algorithms, but it in fact can crack any algorithm. The only problem – it will perhaps take millions of years to break it! That would be pretty impractical, wouldn’t it?
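A back-of-the-envelope calculation shows where such numbers come from. In the sketch below, the guessing speed of one billion keys per second is purely an assumption of ours, and the key sizes of 56 and 128 bits are picked as examples of a short and a long key.

```python
SECONDS_PER_YEAR = 365 * 24 * 60 * 60

def years_to_try_all_keys(key_bits: int, keys_per_second: float) -> float:
    """Worst-case time to try every key of the given length, in years."""
    return 2 ** key_bits / keys_per_second / SECONDS_PER_YEAR

ASSUMED_SPEED = 1e9  # hypothetical attacker: one billion keys per second

years_56 = years_to_try_all_keys(56, ASSUMED_SPEED)
years_128 = years_to_try_all_keys(128, ASSUMED_SPEED)

print(f"56-bit key:  about {years_56:.1f} years")
print(f"128-bit key: about {years_128:.2e} years")
```

Every extra bit in the key doubles the waiting time, which is why a modest increase in key length pushes a brute-force attack from ‘a few years’ to ‘longer than anyone will ever wait’.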

HashCalc is a good tool with a small footprint to generate hashes for popular algorithms

Apart from requiring a strong algorithm and huge amounts of computational power, it takes time to break a cipher. However, we have an interesting case: DES. DES was developed in the 1970s and was so popular that it became a standard; the algorithm’s full name happens to be ‘Data Encryption Standard’. When it was invented, it was said to be very strong. Computers in those days would have taken eons to break it. Today, weaknesses have been discovered in the algorithm, and its 56-bit key is short enough for modern hardware to brute-force, so it is considered vulnerable. Time is an interesting factor in cryptography. In most cases, however, the data protected by an algorithm that was strong in its own time has lost its value by the time the cipher becomes vulnerable, whether due to further research in the area or due to an increase in computing capability.

Encryption keys

RSA is widely known for being a tough nut to crack! This is because RSA is an algorithm that depends on the difficulty of finding the factors of the product of two prime numbers. The product of two prime numbers is an important part of the key in the RSA algorithm (RSA is an asymmetric algorithm). The truth, though, is that RSA is very much crackable. If you want to find out the factors of a number below a million which is a product of two prime numbers, then modern computers would probably be able to do that in a few seconds! And yet, RSA is used widely. Why?

The reason is that RSA’s security increases as you use a larger number for the key, i.e. if you choose a number which is a product of two very large prime numbers then RSA becomes very difficult to crack. The use of this algorithm exhibits the importance of strong keys for gaining more security!
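To see how quickly a small key falls, the sketch below factors a number just under a million by plain trial division. The number 988027 and the function name `factor_semiprime` are our own picks for illustration.

```python
import math

def factor_semiprime(n: int) -> tuple[int, int]:
    """Find the two factors of n by trying every divisor up to its square root."""
    for candidate in range(2, math.isqrt(n) + 1):
        if n % candidate == 0:
            return candidate, n // candidate
    raise ValueError(f"{n} has no factors other than 1 and itself")

factors = factor_semiprime(988027)  # a product of two primes, below a million
print(factors)  # (991, 997)
```

Fewer than a thousand divisions are needed here. The same loop applied to a product of two primes with hundreds of digits each would need more steps than any computer could perform in a practical amount of time, and that gap is what RSA relies on.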