Number Theory

Sunday, November 7, 2010

Vigenere Cipher

The Vigenere Cipher

Author: R. Morelli

One of the main problems with simple substitution ciphers is that they are so vulnerable to frequency analysis. Given a sufficiently large ciphertext, it can easily be broken by mapping the frequency of its letters to the know frequencies of, say, English text. Therefore, to make ciphers more secure, cryptographers have long been interested in developing enciphering techniques that are immune to frequency analysis. One of the most common approaches is to suppress the normal frequency data by using more than one alphabet to encrypt the message. A polyalphabetic substitution cipher involves the use of two or more cipher alphabets. Instead of there being a one-to-one relationship between each letter and its substitute, there is a one-to-many relationship between each letter and its substitutes.

The Vigenere Tableau

The Vigenere Cipher, proposed by Blaise de Vigenere from the court of Henry III of France in the sixteenth century, is a polyalphabetic substitution based on the following tableau:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Note that each row of the table corresponds to a Caesar Cipher. The first row is a shift of 0; the second is a shift of 1; and the last is a shift of 25.
The Vigenere cipher uses this table together with a keyword to encipher a message. For example, suppose we wish to encipher the plaintext message:

TO BE OR NOT TO BE THAT IS THE QUESTION
using the keyword RELATIONS. We begin by writing the keyword, repeated as many times as necessary, above the plaintext message. To derive the ciphertext using the tableau, for each letter in the plaintext, one finds the intersection of the row given by the corresponding keyword letter and the column given by the plaintext letter itself to pick out the ciphertext letter.

Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
Decipherment of an encrypted message is equally straightforward. One writes the keyword repeatedly above the message:
Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
This time one uses the keyword letter to pick a column of the table and then traces down the column to the row containing the ciphertext letter. The index of that row is the plaintext letter.
The strength of the Vigenere cipher against frequency analysis can be seen by examining the above ciphertext. Note that there are 7 'T's in the plaintext message and that they have been encrypted by 'H,' 'L,' 'K,' 'M,' 'G,' 'X,' and 'L' respectively. This successfully masks the frequency characteristics of the English 'T.' One way of looking at this is to notice that each letter of our keyword RELATIONS picks out 1 of the 26 possible substitution alphabets given in the Vigenere tableau. Thus, any message encrypted by a Vigenere cipher is a collection of as many simple substitution ciphers as there are letters in the keyword.

Although the Vigenere cipher has all the features of a useful field cipher -- i.e., easily transportable key and tableau, requires no special apparatus, easy to apply, etc. -- it did not catch on its day. A variation of it, known as the Gronsfeld cipher , did catch on in Germany and was widely used in Central Europe. The Gronsfeld variant used the digits of a keynumber instead of a the letters of keyword, but remained unchanged in all other respects. So in fact the Gronsfeld is a weaker technique than Vigenere since it only uses 10 substitute alphabets (one per digit 0..9) instead of the 26 used by Vigenere.

Cryptanalyzing the Vigenere Cipher: The Kasiski/Kerckhoff Method

Vigenere-like substitution ciphers were regarded by many as practically unbreakable for 300 years. In 1863, a Prussian major named Kasiski proposed a method for breaking a Vigenere cipher that consisted of finding the length of the keyword and then dividing the message into that many simple substitution cryptograms. Frequency analysis could then be used to solve the resulting simple substitutions.

Kasiski's technique for finding the length of the keyword was based on measuring the distance between repeated bigrams in the ciphertext. Note that in the above cryptogram the plaintext bigram 'TO' occurs twice in the message at position 0 and 9 and in both cases it lines up perfectly with the first two letters of the keyword. Because of this it produces the same ciphertext bigram, 'KS.' The same can be said of plaintext 'BE' which occurs twice starting at positions 2 and 11, and also is encrypted with the same ciphertext bigram, 'ME.' In fact, any message encrypted with a Vigenere cipher will produce many such repeated bigrams. Although not every repeated bigram will be the result of the encryption of the same plaintext bigram, many will, and this provides the basis for breaking the cipher. By measuring and factoring the distances between recurring bigrams -- in this case the distance is 9 -- Kasiski was able to guess the length of the keyword. For this example,

Location: 01234 56789 01234 56789 01234 56789
Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
the Kasiski method would create something like the following list:

Repeated Bigram Location Distance Factors
KS 9 9 3, 9
SM 10 9 3, 9
ME 11 9 3, 9
...
Factoring the distances between repeated bigrams is a way of identifying possible keyword lengths. Those factors that occur most frequently will be the best candidates for the length of the keyword. Note that in this example since 3 is also a factor of 9 (and any of its multiples) both 3 and 9 would be reasonable candidates for keyword length. Although in this example we don't have a clear favorite, we've narrowed down the possibilities to a very small list. Note also that if a longer ciphertext were encrypted with the same keyword ('RELATIONS'), we would expect to find repeated bigrams at multiples of 9 -- i.e., 18, 27, 81, etc. These would also have 3 as a factor. Kasiski's important contribution is to note this phenomenon of repeated bigrams and propose a method -- factoring of distances -- to analyze it.

Once the length of the keyword is known, the ciphertext can be broken up into that many simple substitution cryptograms. That is, for a keyword of length 9, every 9-th letter in the ciphertext was encrypted with the same keyword letter. Given the structure of the Vigenere tableau, this is equivalent to using 9 distinct simple substitution ciphers, each of which was derived from 1 of the 26 possible Caesar shifts given in the tableau. The pure Kasiski method proceeds by analyzing these simple substitution cryptograms using frequency analysis and the other standard techniques.

A variant of this method, proposed by the French cryptographer Kerckhoff, is based on discovering the keyword itself and then using it to decipher the cryptogram. In Kerckhoff's method, after the message has been separated into several columns, corresponding to the simple substitution cryptograms, one tallies the frequencies in each column and then uses frequency and logical analysis to construct the key. For example, suppose the most frequent letter in the first column is 'K'. We would hypothesize that 'K' corresponds to the English 'E'. If we consult the Vigenere tableau at this point, we can see that if English 'E' were enciphered into 'K' then row G of the table must have been the alphabet used for the first letter of the keyword. This implies that the first letter of the keyword is 'G'.

The problem with this "manual" approach is that for short messages there are often several good candidates for English 'E' in each column. This requires the testing of multiple hypotheses, which can get quite tedious and involved. Therefore we need a more sensitive test to discover the alphabet used by each letter of the keyword.

Recalling that each row of the Vigenere tableau is one of the 26 Caesar shifts, we can use the chi-square test to determine which of the 26 possible shifts was used for each letter of the keyword. This modern day version of the Kerckhoff method turns out to be very effective. And this is the algorithm that is used in CryptoToolJ's Vigenere Analyzer.

http://www.cs.trincoll.edu/~crypto/historical/vigenere.html

Maryyyy

"Transposition ciphers

In a transposition cipher, the letters themselves are kept unchanged, but their order within the message is scrambled according to some well-defined scheme. Many transposition ciphers are done according to a geometric design. A simple (and once again easy to crack) encryption would be to write every word backwards. For example "Hello my name is Alice." would now be "olleH ym eman si ecilA." A scytale is a machine that aids in the transposition of methods.

In a columnar cipher, the original message is arranged in a rectangle, from left to right and top to bottom. Next, a key is chosen and used to assign a number to each column in the rectangle to determine the order of rearrangement. The number corresponding to the letters in the key is determined by their place in the alphabet, i.e. A is 1, B is 2, C is 3, etc. For example, if the key word is CAT and the message is THE SKY IS BLUE, this is how you would arrange your message:

Next, you take the letters in numerical order and that is how you would transpose the message. You take the column under A first, then the column under C, then the column under T, as a result your message "The sky is blue" has become: HKSUTSILEYBE

In the Chinese cipher's method of transposing, the letters of the message are written from right to left, down and up columns to scramble the letters. Then, starting in the first row, the letters are taken in order to get the new ciphertext. For example, if the message needed to be enciphered was THE DOG RAN FAR, the Chinese cipher would look like this:

                           R R G T
                          A A O H 
                          F N D E

The cipher text then reads: RRGT AAOH FNDE

Many transposition ciphers are similar to these two examples, usually involving rearranging the letters into rows or columns and then taking them in a systematic way to transpose the letters. Other examples include the Vertical Parallel and the Double Transposition Cipher.

More complex algorithms can be formed by mixing substitution and transposition in a product cipher; modern block ciphers such as DES iterate through several stages of substitution and transposition."

http://en.wikipedia.org/wiki/Classical_cipher

"The Playfair cipher was the first practical digraph substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but was named after Lord Playfair who promoted the use of the cipher. The technique encrypts pairs of letters (digraphs), instead of single letters as in the simple substitution cipher. The Playfair is significantly harder to break since the frequency analysis used for simple substitution ciphers does not work with it. Frequency analysis can still be undertaken, but on the ~600 possible digraphs rather than the 26 possible monographs. Frequency analysis thus requires much more ciphertext in order to work.

It was used for tactical purposes by British forces in the Second Boer War and in World War I and for the same purpose by the Australians during World War II. This was because Playfair is reasonably fast to use and requires no special equipment. A typical scenario for Playfair use would be to protect important but non-critical secrets during actual combat. By the time the enemy cryptanalysts could break the message the information was useless to them [1].

From Kahn's 'The CodeBreakers':

Perhaps the most famous cipher of 1943 involved the future president of the U.S., J. F. Kennedy, Jr. On 2 August 1943, Australian Coastwatcher Lieutenant Arthur Reginald Evans of the Royal Australian Naval Volunteer Reserve saw a pinpoint of flame on the dark waters of Blackett Strait from his jungle ridge on Kolombangara Island, one of the Solomons. He did not know that the Japanese destroyer Amagiri had rammed and sliced in half an American patrol boat PT-109, under the command of Lieutenant John F. Kennedy, United States Naval Reserve. Evans received the following message at 0930 on the morning of the 2 of August 1943:" (Practicalcryptography.com)

KXJEY UREBE ZWEHE WRYTU HEYFS
KREHE GOYFI WTTTU OLKSY CAJPO
BOTEI ZONTX BYBWT GONEY CUZWR
GDSON SXBOU YWRHE BAAHY USEDQ

The translation:

PT BOAT ONE OWE NINE LOST IN ACTION IN BLACKETT
STRAIT TWO MILES SW MERESU COVE X CREW OF TWELVE
X REQUEST ANY INFORMATION.

I can't really relate this to what we're doing right now, but I'm sure we'll learn this when we start cryptography

post for 10/31

What are the different uses of prime factorization? Include some we have done in class and some that you find. Also is it related to cryptography?

Prime factorization can be used for many many things. Some include finding the greatest common factor (GCF), lowest common multiple (LCM), and also divisors of a number.

Using prime factorization and knowing what the uses of it are can help you do all of these things effectively and quickly.

I can also imagine how it would be related to cryptography.

11/7

"One of the simplest examples of a substitution cipher is the Caesar cipher, which is said to have been used by Julius Caesar to communicate with his army. Caesar is considered to be one of the first persons to have ever employed encryption for the sake of securing messages. Caesar decided that shifting each letter in the message would be his standard algorithm, and so he informed all of his generals of his decision, and was then able to send them secured messages. Using the Caesar Shift (3 to the right), the message,
"RETURN TO ROME"
would be encrypted as,
"UHWXUA WR URPH"
In this example, 'R' is shifted to 'U', 'E' is shifted to 'H', and so on. Now, even if the enemy did intercept the message, it would be useless, since only Caesar's generals could read it.
Thus, the Caesar cipher is a shift cipher since the ciphertext alphabet is derived from the plaintext alphabet by shifting each letter a certain number of spaces. For example, if we use a shift of 19, then we get the following pair of ciphertext and plaintext alphabets:
Plaintext: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext: T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
To encipher a message, we perform a simple substitution by looking up each of the message's letters in the top row and writing down the corresponding letter from the bottom row. For example, the message
THE FAULT, DEAR BRUTUS, LIES NOT IN OUR STARS BUT IN OURSELVES.
would be enciphered as
MAX YTNEM, WXTK UKNMNL, EBXL GHM BG HNK LMTKL UNM BG HNKLXEOXL.
Essentially, each letter of the alphabet has been shifted nineteen places ahead in the alphabet, wrapping around the end if necessary. Notice that punctuation and blanks are not enciphered but are copied over as themselves."

The Caesar Cipher is basically just shifting the whole alphabet and doing codes from there on. A could be C. Then C would be E. You get the point.

Source: http://www.cs.trincoll.edu/~crypto/historical/caesar.html

week 4 prompt

They have a man by the name of Petros Petrosyan who is a famous cipher and who has solved the great pyrmid cipher that remained unsolved for so many years. This is a cipher code on how to pyramids are so closely connected with the way of problem solving today and how they connect with the modern day math of ancient egypt. It has a great amount of geometry within the codes and how to slove them. I think that the reason for this is because it was something that was built and geometry is the foundation of all the building supplies needed in life. I dont think that it has related to anything that we have learned this year but I think it does have something to do with what we learned in Geometry.
www.hermann-uwe.de/blog/famous-unsolved-codes-and-ciphers

Alaina's blog, 7 Nov. 2010

"Steganography is the art of hiding the existence of a message rather than its meaning. It is currently a very HOT topic as one subdivision of steganography is digital watermarking, a technique used to copyright and to protect digital images, music and software."
Older methods of steganography include writing in lemon juice or tracing words through a piece of paper so that they have to be colored or shaded over to be seen.
Stealthy, right?

SOURCE:
http://home.cogeco.ca/~cipher/