Tuesday 7 November 2017

The Vigenere Cipher

This article is meant for writers looking for inspiration, cipher enthusiasts, and anyone interested in this topic. Don't bother using this for nefarious purposes - a real cryptanalyst will probably pick it apart in minutes.

This is a continuation of my previous article on simple substitution ciphers. The Vigenere cipher is the next logical step from the simple substitution cipher. Before we start, let's get the terminology cleared up:

Plaintext: The text that is ciphered, in plain English (or whatever language you prefer).
Ciphertext: The ciphered text.
Key: The secret needed to encipher and decipher the text - either a number, a word, or a random selection of letters. More on this later.

So, to get started, do you remember the Caesar cipher from the last article? It involves shifting the entire alphabet by a fixed amount to produce the cipher alphabet. The whole Caesar cipher table, with shifts from 0 to 25, is given below. It will be useful for understanding the Vigenere cipher.
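If you would rather generate the Caesar table yourself, here is a minimal Python sketch. The function names (caesar_alphabet, caesar_table) are my own inventions for illustration, and it assumes the plain 26-letter English alphabet with A=0.

```python
import string

ALPHABET = string.ascii_uppercase  # 'A'..'Z', so A=0, B=1, ..., Z=25


def caesar_alphabet(shift):
    """Return the cipher alphabet for one shift (plain A maps to its first letter)."""
    shift %= 26
    return ALPHABET[shift:] + ALPHABET[:shift]


def caesar_table():
    """Return all 26 cipher alphabets, one row per shift from 0 to 25."""
    return [caesar_alphabet(shift) for shift in range(26)]


if __name__ == "__main__":
    print("    " + ALPHABET)  # header row: the plain alphabet
    for shift, row in enumerate(caesar_table()):
        print(f"{shift:2d}  {row}")
```

Each row is one Caesar cipher: find the plaintext letter in the header row, then read the ciphertext letter directly beneath it in the row for your shift.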


The Vigenere Cipher works by combining several Caesar Ciphers into one. This is best illustrated with an example. Consider the following plaintext:

TEXT:            It is not in the stars to hold our destiny but in ourselves.
PLAINTEXT:
ITZISZNOTZINZTHEZSTARSZTOZHOLDZOURZDESTINYZBUTZINZOURSELVES

Note that all spaces have been replaced with 'Z' as discussed in the previous article. The entire text has been turned into uppercase, as the cipher we will be using here is not case sensitive. Now, let's try applying a simple substitution to it, say, Caesar 5:

CIPHERTEXT_CAESAR5: NYENXESTYENSEYMJEXYFWXEYTEMTQIETZWEIJXYNSDEGZYENSETZWXJQAJX
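The preparation step and the Caesar shift can be sketched in a few lines of Python. The names prepare and caesar are hypothetical, and dropping punctuation is my assumption, based on the full stop disappearing from the plaintext above.

```python
def prepare(text):
    """Uppercase the text, turn spaces into 'Z', and drop everything that is not A-Z.

    Dropping punctuation is an assumption made for this sketch.
    """
    out = []
    for ch in text.upper():
        if ch == " ":
            out.append("Z")
        elif "A" <= ch <= "Z":
            out.append(ch)
    return "".join(out)


def caesar(plaintext, shift):
    """Shift every letter of an A-Z string forward by `shift`, wrapping past Z."""
    return "".join(
        chr((ord(ch) - ord("A") + shift) % 26 + ord("A")) for ch in plaintext
    )


if __name__ == "__main__":
    quote = "It is not in the stars to hold our destiny but in ourselves."
    ciphertext = caesar(prepare(quote), 5)
    print(ciphertext)
    print(ciphertext.count("E"))  # how often the disguised space shows up
```

Run on the quote above, this reproduces the Caesar 5 ciphertext, and the count shows 12 'E's out of 59 characters.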

Several weaknesses are immediately obvious - for example, there are way too many 'E's, which gives away that they are spaces, and it all goes downhill from there. If you want to make this stronger, you could potentially use several Caesar ciphers in combination. Let's pick a random keyword - say, APPLE. (Don't judge me. If English books can do it, so can I).

To use this as a key, we have to convert it to its numerical equivalent - 0-15-15-11-4 (A=0, P=15, L=11, E=4). If you want to, you can remove duplicates before this step. I chose not to in this example.
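As a rough sketch, the conversion looks like this in Python. The function name key_to_shifts and its remove_duplicates flag are made up for illustration, and the keyword is assumed to contain only the letters A-Z.

```python
def key_to_shifts(keyword, remove_duplicates=False):
    """Turn a keyword into a list of Caesar shifts, with A=0, B=1, ..., Z=25.

    Assumes the keyword contains only letters A-Z (either case).
    """
    letters = keyword.upper()
    if remove_duplicates:
        # dict.fromkeys keeps the first occurrence of each letter, in order
        letters = "".join(dict.fromkeys(letters))
    return [ord(ch) - ord("A") for ch in letters]


if __name__ == "__main__":
    print(key_to_shifts("APPLE"))                          # [0, 15, 15, 11, 4]
    print(key_to_shifts("APPLE", remove_duplicates=True))  # [0, 15, 11, 4]
```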

So, what does this key mean? It means that we will be using five alternating Caesar ciphers throughout the text - the first character will be Caesar 0, the second Caesar 15, and so on, until we hit the sixth character which will be Caesar 0 again. Ciphering the same plaintext gives:

CIPHERTEXT_KEY_APPLE:
IIOTWZCDEDICOELEOHEERHOESZWDWHZDJCDDTHEMNNOMYTOXYDOJGDILKTD

The formula, by the way, is [plaintext+key, mod 26], with the alphabet at A=0, B=1, etc. Some sample calculations:

I → 8 → 8+0, mod 26 = 8 → I
T → 19 → 19+15, mod 26 = 34, mod 26 = 8 → I
Z → 25 → 25+15, mod 26 = 40, mod 26 = 14 → O
I → 8 → 8+11, mod 26 = 19 → T
S → 18 → 18+4, mod 26 = 22 → W
Z → 25 → 25+0, mod 26 = 25 → Z
and so on.

As you can see, there are fewer obvious clues in here. The letter frequency count gives 9 'D's, 7 'E's, and 7 'O's, and 'H', 'I', and 'T' are present four times each. This doesn't tell us so much. [frequency analysis from https://www.mtholyoke.edu/courses/quenell/s2003/ma139/js/count.html]
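Here is a minimal Python sketch of the [plaintext+key, mod 26] formula. The name vigenere_encrypt is my own, and the sketch assumes both the plaintext and the keyword contain only the letters A-Z, with spaces already turned into 'Z'.

```python
from itertools import cycle


def vigenere_encrypt(plaintext, keyword):
    """Encipher an A-Z plaintext with (plaintext + key) mod 26, where A=0.

    The keyword is repeated for as long as the plaintext requires.
    """
    if not keyword:
        raise ValueError("keyword must not be empty")
    shifts = [ord(k) - ord("A") for k in keyword.upper()]
    out = []
    for ch, shift in zip(plaintext, cycle(shifts)):
        out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(out)


if __name__ == "__main__":
    # The first six letters of the plaintext, as in the sample calculations
    print(vigenere_encrypt("ITZISZ", "APPLE"))  # IIOTWZ
```

Deciphering is the same loop with the shift subtracted instead of added.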

The most reliable way to attack this cipher, if you know the key length, is to isolate the different Caesar ciphers and to attack them separately. For this cipher, provided you knew that the key length was 5, you could isolate the 1st, 6th, 11th characters and so on, then the 2nd, 7th, 12th and so on - you get the idea. Then you can attack these as individual simple substitution ciphers.
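The isolating step can be sketched in Python as below. Both function names are hypothetical. guess_shift is only a naive guess: it assumes the most common letter in a piece stands for 'Z' (the disguised space), which is an assumption of this sketch and can easily be wrong on short pieces.

```python
from collections import Counter


def split_columns(ciphertext, key_length):
    """Group the characters that were enciphered with the same Caesar shift.

    Piece 0 holds the 1st, 6th, 11th... characters for key_length 5, and so on.
    """
    return [ciphertext[i::key_length] for i in range(key_length)]


def guess_shift(column, assumed_plain="Z"):
    """Guess the shift by assuming the most common letter stands for assumed_plain.

    A naive heuristic for illustration; it needs a non-empty column.
    """
    most_common_letter, _ = Counter(column).most_common(1)[0]
    return (ord(most_common_letter) - ord(assumed_plain)) % 26


if __name__ == "__main__":
    print(split_columns("ABCDEFGHIJKL", 5))  # ['AFK', 'BGL', 'CH', 'DI', 'EJ']
    caesar5 = "NYENXESTYENSEYMJEXYFWXEYTEMTQIETZWEIJXYNSDEGZYENSETZWXJQAJX"
    print(guess_shift(caesar5))  # 5
```

On the full Caesar 5 ciphertext the guess works because 'E' clearly dominates. On a piece of a dozen letters there is often a tie or no clear winner, so treat the result as a starting point only.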


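As you can see, the resulting ciphertexts are much shorter than the original ciphertext (only 11 or 12 characters each in this example). This makes it much more difficult to crack them, since frequency analysis has less text to work with. If your keyword/key phrase is the same length as the plaintext, each of these pieces shrinks to a single character, and it really will be nearly impossible to crack this way.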

The risk, of course, is the possibility of the key being intercepted. There are different ways to get around this - use a book cipher, encode your key (often a short word) in a simple substitution cipher that is not a Caesar cipher, or have the person on the other end know the keyword for the day/week/whatever.

Variants

You can also use some variations on the Vigenere cipher. One is to take the key to its logical conclusion as mentioned above, and use a key of the same length as the message you want to send. There is always the problem of that being intercepted.

Another possibility is to use a numerical key - say, the Fibonacci sequence (1, 1, 2, 3, 5, 8, 13, 21, 34, ...). True, it increases beyond the number of letters in the alphabet rather quickly, and is strictly increasing (which can show itself in the ciphertext), but this can be overcome by grouping them by digits (11, 23, 5, 8, 13, 21, 3, 4, ...). You can also consider using irrational numbers, or an irrational number or a sequence multiplied by a constant - the number of ways you can play about with this is infinite.
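The grouping rule is not spelled out above, so the Python sketch below assumes one that reproduces the example: run all the digits together, then take two digits at a time when they form a number no larger than 25, and a single digit otherwise. The function names are hypothetical, and whatever rule you pick, both ends must use the same one.

```python
from itertools import islice


def fibonacci():
    """Yield the Fibonacci sequence 1, 1, 2, 3, 5, 8, ..."""
    a, b = 1, 1
    while True:
        yield a
        a, b = b, a + b


def digit_grouped_shifts(numbers, max_shift=25):
    """Regroup the digits of `numbers` into shifts between 0 and max_shift.

    Assumed rule: take two digits if they form a value <= max_shift,
    otherwise take one. A pair such as '05' is consumed whole as 5.
    """
    digits = "".join(str(n) for n in numbers)
    shifts = []
    i = 0
    while i < len(digits):
        pair = digits[i:i + 2]
        if len(pair) == 2 and int(pair) <= max_shift:
            shifts.append(int(pair))
            i += 2
        else:
            shifts.append(int(digits[i]))
            i += 1
    return shifts


if __name__ == "__main__":
    first_nine = islice(fibonacci(), 9)  # 1, 1, 2, 3, 5, 8, 13, 21, 34
    print(digit_grouped_shifts(first_nine))  # [11, 23, 5, 8, 13, 21, 3, 4]
```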

Enjoy!

Falcon-15-X-C
