Number Theory: November 2010

Sunday, November 7, 2010

Vigenere Cipher

The Vigenere Cipher

Author: R. Morelli

One of the main problems with simple substitution ciphers is that they are so vulnerable to frequency analysis. Given a sufficiently large ciphertext, it can easily be broken by mapping the frequency of its letters to the know frequencies of, say, English text. Therefore, to make ciphers more secure, cryptographers have long been interested in developing enciphering techniques that are immune to frequency analysis. One of the most common approaches is to suppress the normal frequency data by using more than one alphabet to encrypt the message. A polyalphabetic substitution cipher involves the use of two or more cipher alphabets. Instead of there being a one-to-one relationship between each letter and its substitute, there is a one-to-many relationship between each letter and its substitutes.

The Vigenere Tableau

The Vigenere Cipher, proposed by Blaise de Vigenere from the court of Henry III of France in the sixteenth century, is a polyalphabetic substitution based on the following tableau:

A B C D E F G H I J K L M N O P Q R S T U V W X Y Z

A A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
B B C D E F G H I J K L M N O P Q R S T U V W X Y Z A
C C D E F G H I J K L M N O P Q R S T U V W X Y Z A B
D D E F G H I J K L M N O P Q R S T U V W X Y Z A B C
E E F G H I J K L M N O P Q R S T U V W X Y Z A B C D
F F G H I J K L M N O P Q R S T U V W X Y Z A B C D E
G G H I J K L M N O P Q R S T U V W X Y Z A B C D E F
H H I J K L M N O P Q R S T U V W X Y Z A B C D E F G
I I J K L M N O P Q R S T U V W X Y Z A B C D E F G H
J J K L M N O P Q R S T U V W X Y Z A B C D E F G H I
K K L M N O P Q R S T U V W X Y Z A B C D E F G H I J
L L M N O P Q R S T U V W X Y Z A B C D E F G H I J K
M M N O P Q R S T U V W X Y Z A B C D E F G H I J K L
N N O P Q R S T U V W X Y Z A B C D E F G H I J K L M
O O P Q R S T U V W X Y Z A B C D E F G H I J K L M N
P P Q R S T U V W X Y Z A B C D E F G H I J K L M N O
Q Q R S T U V W X Y Z A B C D E F G H I J K L M N O P
R R S T U V W X Y Z A B C D E F G H I J K L M N O P Q
S S T U V W X Y Z A B C D E F G H I J K L M N O P Q R
T T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
U U V W X Y Z A B C D E F G H I J K L M N O P Q R S T
V V W X Y Z A B C D E F G H I J K L M N O P Q R S T U
W W X Y Z A B C D E F G H I J K L M N O P Q R S T U V
X X Y Z A B C D E F G H I J K L M N O P Q R S T U V W
Y Y Z A B C D E F G H I J K L M N O P Q R S T U V W X
Z Z A B C D E F G H I J K L M N O P Q R S T U V W X Y
Note that each row of the table corresponds to a Caesar Cipher. The first row is a shift of 0; the second is a shift of 1; and the last is a shift of 25.
The Vigenere cipher uses this table together with a keyword to encipher a message. For example, suppose we wish to encipher the plaintext message:

TO BE OR NOT TO BE THAT IS THE QUESTION
using the keyword RELATIONS. We begin by writing the keyword, repeated as many times as necessary, above the plaintext message. To derive the ciphertext using the tableau, for each letter in the plaintext, one finds the intersection of the row given by the corresponding keyword letter and the column given by the plaintext letter itself to pick out the ciphertext letter.

Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
Decipherment of an encrypted message is equally straightforward. One writes the keyword repeatedly above the message:
Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
This time one uses the keyword letter to pick a column of the table and then traces down the column to the row containing the ciphertext letter. The index of that row is the plaintext letter.
The strength of the Vigenere cipher against frequency analysis can be seen by examining the above ciphertext. Note that there are 7 'T's in the plaintext message and that they have been encrypted by 'H,' 'L,' 'K,' 'M,' 'G,' 'X,' and 'L' respectively. This successfully masks the frequency characteristics of the English 'T.' One way of looking at this is to notice that each letter of our keyword RELATIONS picks out 1 of the 26 possible substitution alphabets given in the Vigenere tableau. Thus, any message encrypted by a Vigenere cipher is a collection of as many simple substitution ciphers as there are letters in the keyword.

Although the Vigenere cipher has all the features of a useful field cipher -- i.e., easily transportable key and tableau, requires no special apparatus, easy to apply, etc. -- it did not catch on its day. A variation of it, known as the Gronsfeld cipher , did catch on in Germany and was widely used in Central Europe. The Gronsfeld variant used the digits of a keynumber instead of a the letters of keyword, but remained unchanged in all other respects. So in fact the Gronsfeld is a weaker technique than Vigenere since it only uses 10 substitute alphabets (one per digit 0..9) instead of the 26 used by Vigenere.

Cryptanalyzing the Vigenere Cipher: The Kasiski/Kerckhoff Method

Vigenere-like substitution ciphers were regarded by many as practically unbreakable for 300 years. In 1863, a Prussian major named Kasiski proposed a method for breaking a Vigenere cipher that consisted of finding the length of the keyword and then dividing the message into that many simple substitution cryptograms. Frequency analysis could then be used to solve the resulting simple substitutions.

Kasiski's technique for finding the length of the keyword was based on measuring the distance between repeated bigrams in the ciphertext. Note that in the above cryptogram the plaintext bigram 'TO' occurs twice in the message at position 0 and 9 and in both cases it lines up perfectly with the first two letters of the keyword. Because of this it produces the same ciphertext bigram, 'KS.' The same can be said of plaintext 'BE' which occurs twice starting at positions 2 and 11, and also is encrypted with the same ciphertext bigram, 'ME.' In fact, any message encrypted with a Vigenere cipher will produce many such repeated bigrams. Although not every repeated bigram will be the result of the encryption of the same plaintext bigram, many will, and this provides the basis for breaking the cipher. By measuring and factoring the distances between recurring bigrams -- in this case the distance is 9 -- Kasiski was able to guess the length of the keyword. For this example,

Location: 01234 56789 01234 56789 01234 56789
Keyword: RELAT IONSR ELATI ONSRE LATIO NSREL
Plaintext: TOBEO RNOTT OBETH ATIST HEQUE STION
Ciphertext: KSMEH ZBBLK SMEMP OGAJX SEJCS FLZSY
the Kasiski method would create something like the following list:

Repeated Bigram Location Distance Factors
KS 9 9 3, 9
SM 10 9 3, 9
ME 11 9 3, 9
...
Factoring the distances between repeated bigrams is a way of identifying possible keyword lengths. Those factors that occur most frequently will be the best candidates for the length of the keyword. Note that in this example since 3 is also a factor of 9 (and any of its multiples) both 3 and 9 would be reasonable candidates for keyword length. Although in this example we don't have a clear favorite, we've narrowed down the possibilities to a very small list. Note also that if a longer ciphertext were encrypted with the same keyword ('RELATIONS'), we would expect to find repeated bigrams at multiples of 9 -- i.e., 18, 27, 81, etc. These would also have 3 as a factor. Kasiski's important contribution is to note this phenomenon of repeated bigrams and propose a method -- factoring of distances -- to analyze it.

Once the length of the keyword is known, the ciphertext can be broken up into that many simple substitution cryptograms. That is, for a keyword of length 9, every 9-th letter in the ciphertext was encrypted with the same keyword letter. Given the structure of the Vigenere tableau, this is equivalent to using 9 distinct simple substitution ciphers, each of which was derived from 1 of the 26 possible Caesar shifts given in the tableau. The pure Kasiski method proceeds by analyzing these simple substitution cryptograms using frequency analysis and the other standard techniques.

A variant of this method, proposed by the French cryptographer Kerckhoff, is based on discovering the keyword itself and then using it to decipher the cryptogram. In Kerckhoff's method, after the message has been separated into several columns, corresponding to the simple substitution cryptograms, one tallies the frequencies in each column and then uses frequency and logical analysis to construct the key. For example, suppose the most frequent letter in the first column is 'K'. We would hypothesize that 'K' corresponds to the English 'E'. If we consult the Vigenere tableau at this point, we can see that if English 'E' were enciphered into 'K' then row G of the table must have been the alphabet used for the first letter of the keyword. This implies that the first letter of the keyword is 'G'.

The problem with this "manual" approach is that for short messages there are often several good candidates for English 'E' in each column. This requires the testing of multiple hypotheses, which can get quite tedious and involved. Therefore we need a more sensitive test to discover the alphabet used by each letter of the keyword.

Recalling that each row of the Vigenere tableau is one of the 26 Caesar shifts, we can use the chi-square test to determine which of the 26 possible shifts was used for each letter of the keyword. This modern day version of the Kerckhoff method turns out to be very effective. And this is the algorithm that is used in CryptoToolJ's Vigenere Analyzer.

http://www.cs.trincoll.edu/~crypto/historical/vigenere.html

Maryyyy

"Transposition ciphers

In a transposition cipher, the letters themselves are kept unchanged, but their order within the message is scrambled according to some well-defined scheme. Many transposition ciphers are done according to a geometric design. A simple (and once again easy to crack) encryption would be to write every word backwards. For example "Hello my name is Alice." would now be "olleH ym eman si ecilA." A scytale is a machine that aids in the transposition of methods.

In a columnar cipher, the original message is arranged in a rectangle, from left to right and top to bottom. Next, a key is chosen and used to assign a number to each column in the rectangle to determine the order of rearrangement. The number corresponding to the letters in the key is determined by their place in the alphabet, i.e. A is 1, B is 2, C is 3, etc. For example, if the key word is CAT and the message is THE SKY IS BLUE, this is how you would arrange your message:

Next, you take the letters in numerical order and that is how you would transpose the message. You take the column under A first, then the column under C, then the column under T, as a result your message "The sky is blue" has become: HKSUTSILEYBE

In the Chinese cipher's method of transposing, the letters of the message are written from right to left, down and up columns to scramble the letters. Then, starting in the first row, the letters are taken in order to get the new ciphertext. For example, if the message needed to be enciphered was THE DOG RAN FAR, the Chinese cipher would look like this:

                           R R G T
                          A A O H 
                          F N D E

The cipher text then reads: RRGT AAOH FNDE

Many transposition ciphers are similar to these two examples, usually involving rearranging the letters into rows or columns and then taking them in a systematic way to transpose the letters. Other examples include the Vertical Parallel and the Double Transposition Cipher.

More complex algorithms can be formed by mixing substitution and transposition in a product cipher; modern block ciphers such as DES iterate through several stages of substitution and transposition."

http://en.wikipedia.org/wiki/Classical_cipher

"The Playfair cipher was the first practical digraph substitution cipher. The scheme was invented in 1854 by Charles Wheatstone, but was named after Lord Playfair who promoted the use of the cipher. The technique encrypts pairs of letters (digraphs), instead of single letters as in the simple substitution cipher. The Playfair is significantly harder to break since the frequency analysis used for simple substitution ciphers does not work with it. Frequency analysis can still be undertaken, but on the ~600 possible digraphs rather than the 26 possible monographs. Frequency analysis thus requires much more ciphertext in order to work.

It was used for tactical purposes by British forces in the Second Boer War and in World War I and for the same purpose by the Australians during World War II. This was because Playfair is reasonably fast to use and requires no special equipment. A typical scenario for Playfair use would be to protect important but non-critical secrets during actual combat. By the time the enemy cryptanalysts could break the message the information was useless to them [1].

From Kahn's 'The CodeBreakers':

Perhaps the most famous cipher of 1943 involved the future president of the U.S., J. F. Kennedy, Jr. On 2 August 1943, Australian Coastwatcher Lieutenant Arthur Reginald Evans of the Royal Australian Naval Volunteer Reserve saw a pinpoint of flame on the dark waters of Blackett Strait from his jungle ridge on Kolombangara Island, one of the Solomons. He did not know that the Japanese destroyer Amagiri had rammed and sliced in half an American patrol boat PT-109, under the command of Lieutenant John F. Kennedy, United States Naval Reserve. Evans received the following message at 0930 on the morning of the 2 of August 1943:" (Practicalcryptography.com)

KXJEY UREBE ZWEHE WRYTU HEYFS
KREHE GOYFI WTTTU OLKSY CAJPO
BOTEI ZONTX BYBWT GONEY CUZWR
GDSON SXBOU YWRHE BAAHY USEDQ

The translation:

PT BOAT ONE OWE NINE LOST IN ACTION IN BLACKETT
STRAIT TWO MILES SW MERESU COVE X CREW OF TWELVE
X REQUEST ANY INFORMATION.

I can't really relate this to what we're doing right now, but I'm sure we'll learn this when we start cryptography

post for 10/31

What are the different uses of prime factorization? Include some we have done in class and some that you find. Also is it related to cryptography?

Prime factorization can be used for many many things. Some include finding the greatest common factor (GCF), lowest common multiple (LCM), and also divisors of a number.

Using prime factorization and knowing what the uses of it are can help you do all of these things effectively and quickly.

I can also imagine how it would be related to cryptography.

11/7

"One of the simplest examples of a substitution cipher is the Caesar cipher, which is said to have been used by Julius Caesar to communicate with his army. Caesar is considered to be one of the first persons to have ever employed encryption for the sake of securing messages. Caesar decided that shifting each letter in the message would be his standard algorithm, and so he informed all of his generals of his decision, and was then able to send them secured messages. Using the Caesar Shift (3 to the right), the message,
"RETURN TO ROME"
would be encrypted as,
"UHWXUA WR URPH"
In this example, 'R' is shifted to 'U', 'E' is shifted to 'H', and so on. Now, even if the enemy did intercept the message, it would be useless, since only Caesar's generals could read it.
Thus, the Caesar cipher is a shift cipher since the ciphertext alphabet is derived from the plaintext alphabet by shifting each letter a certain number of spaces. For example, if we use a shift of 19, then we get the following pair of ciphertext and plaintext alphabets:
Plaintext: A B C D E F G H I J K L M N O P Q R S T U V W X Y Z
Ciphertext: T U V W X Y Z A B C D E F G H I J K L M N O P Q R S
To encipher a message, we perform a simple substitution by looking up each of the message's letters in the top row and writing down the corresponding letter from the bottom row. For example, the message
THE FAULT, DEAR BRUTUS, LIES NOT IN OUR STARS BUT IN OURSELVES.
would be enciphered as
MAX YTNEM, WXTK UKNMNL, EBXL GHM BG HNK LMTKL UNM BG HNKLXEOXL.
Essentially, each letter of the alphabet has been shifted nineteen places ahead in the alphabet, wrapping around the end if necessary. Notice that punctuation and blanks are not enciphered but are copied over as themselves."

The Caesar Cipher is basically just shifting the whole alphabet and doing codes from there on. A could be C. Then C would be E. You get the point.

Source: http://www.cs.trincoll.edu/~crypto/historical/caesar.html

week 4 prompt

They have a man by the name of Petros Petrosyan who is a famous cipher and who has solved the great pyrmid cipher that remained unsolved for so many years. This is a cipher code on how to pyramids are so closely connected with the way of problem solving today and how they connect with the modern day math of ancient egypt. It has a great amount of geometry within the codes and how to slove them. I think that the reason for this is because it was something that was built and geometry is the foundation of all the building supplies needed in life. I dont think that it has related to anything that we have learned this year but I think it does have something to do with what we learned in Geometry.
www.hermann-uwe.de/blog/famous-unsolved-codes-and-ciphers

Alaina's blog, 7 Nov. 2010

"Steganography is the art of hiding the existence of a message rather than its meaning. It is currently a very HOT topic as one subdivision of steganography is digital watermarking, a technique used to copyright and to protect digital images, music and software."
Older methods of steganography include writing in lemon juice or tracing words through a piece of paper so that they have to be colored or shaded over to be seen.
Stealthy, right?

SOURCE:
http://home.cogeco.ca/~cipher/

Stephen's Post 11/7/2010

"In cryptography, a substitution cipher is a method of encryption by which units of plaintext are replaced with ciphertext according to a regular system; the "units" may be single letters (the most common), pairs of letters, triplets of letters, mixtures of the above, and so forth. The receiver deciphers the text by performing an inverse substitution.

Substitution ciphers can be compared with transposition ciphers. In a transposition cipher, the units of the plaintext are rearranged in a different and usually quite complex order, but the units themselves are left unchanged. By contrast, in a substitution cipher, the units of the plaintext are retained in the same sequence in the ciphertext, but the units themselves are altered."

It seems that this is one of the most famous methods of ciphering, or deciphering, to say the least.

"There are a number of different types of substitution cipher. If the cipher operates on single letters, it is termed a simple substitution cipher; a cipher that operates on larger groups of letters is termed polygraphic. A monoalphabetic cipher uses fixed substitution over the entire message, whereas a polyalphabetic cipher uses a number of substitutions at different times in the message, where a unit from the plaintext is mapped to one of several possibilities in the ciphertext and vice-versa."

Source: http://en.wikipedia.org/wiki/Substitution_cipher

Chase 11/7

The cipher disk was invented in 1467 by Leon Battista Alberti, a famous Italian philosopher and architect.

Alberti used two different alphabets located on concentric rings - this means one ring is inside of or on top of another. By lining up two different letters, one from each ring, he could make a simple substitution alphabet in which he could create a cipher.

For example, if he aligned the A on the outer ring with the G on the inner ring, this would make the following substitution alphabet used to encrypt a message:

OUTER RING: ABCDEFGHIJKLMNOPQRSTUVWXYZ
INNER RING: GHIJKLMNOPQRSTUVWXYZABCDEF

From there, he could encrypt his message and send it to someone who knew the secret to revealing the message.

http://www.nsa.gov/kids/ciphers/ciphe00005.shtml

kaitlynnn's blogg

In 1885, a pamplet was published supposedly telling where to find a bunch of gold and silver. The instructions were encrypted. The pamplet told the story of Thomas Jefferson Beale, who made a fortune out west, and buried it in Bedford County, Virginia. Three separate messages were encrypted in the pamplet, and consist of pages of numbers, separated by commas. He gave the solution to the second message.

The key to decrypting the second message was the Declaration of Independence. Every number in the message refers to the first letter of a word in it. The third message is relatively uninteresting to treasure hunters, as it tells who the treasure belongs to. Apparently, no one has ever decrypted the first or third message.

There are a few good reasons to think that the pamplet is a hoax. For example, the first message would seem to use the same message of encryption as the second, but with a few much larger numbers, implying that the key document is much longer than the Declaration of Independence. But no matter how long a text is, you probably would not have to search 2000 words deep to find any letter you wanted. Besides, there is internal evidence that the key is actually the Declaration of Independence, and that the original message is mostly meaningless letters with no words, and with a few large numbers

Mal's Post

"Probably the most well known cipher of all time is the German Enigma of World War II. German military and diplomatic information was encoded using an typewriter style machine called “Enigma.” This device had three wheels which would type a different letter than the one you choose on the keyboard. To further complicate the encryption, after each character, the wheels would change so that a different letter would be produced for repeated characters. This made the classic style of decryption and code breaking impossible to complete on an Enigma created document."

So. my understanding of Enigma is this. to decrypt a code, the receiver has to know which wheel to use and where to start on the wheel. Apparently this created a barrier during the war for the enemies. England deciphered the code, but kept it secret until like 30 years ago.

I don't know exactly how this is related to what we've been doing in number theory. Probably something to do with the probability that it would encrypt a letter for another letter given several wheels.

Feroz's Blog

The Advance Encryption Standard, also known as Rijndael.

"Rijndael is the block cipher algorithm recently chosen by the National Institute of Science and Technology (NIST) as the Advanced Encryption Standard (AES). It supercedes the Data Encryption Standard (DES). NIST selected Rijndael as the standard symmetric key encryption algorithm to be used to encrypt sensitive (unclassified) American federal information. The choice was based on a careful and comprehensive analysis of the security and efficiency characteristics of Rijndael's algorithm."

From what I understand it's used by the government, and it's made by two Belgian dudes.

"The Rijndael cipher was developed by two Belgian cryptographers, Joan Daemen and Vincent Rijmen, and submitted by them to the AES selection process. Rijndael (pronounced "Rhine dall") is a wordplay with the names of the two inventors."

Source:

http://en.wikipedia.org/wiki/Advanced_Encryption_Standard

11/7

Voynich Manuscript - At least 400 years old, this is a 232-page illuminated manuscript entirely written in a secret script. It is filled with drawings of unidentified plants, herbal recipes of some sort, astrological diagrams, and many small human figures in strange plumbing-like contraptions. The script is unlike anything else in existence, but is written in a confident style, seemingly by someone who was very comfortable with it. In 2004 there were some compelling arguments which described a technique that would seemingly prove that the manuscript was a fake, but to date, none of the described techniques have been able to replicate a single section of the Manuscript, so speculations continue.

source:
http://elonka.com/UnsolvedCodes.html

11/7/10

Okay, I'm going to talk about The Famous Beale Cipher. It states that in the year 1885 some sort of pamphlet was publish telling where to find gold and silver that's worth millions of dollars. Here's the problem, though. All the instructions are encrypted (in code). Anyway, it is called the Famous Beale Cipher on behalf of Thomas Jefferson Beale who somehow made a large fortune and buried it Bedford County, Virginia. In the pamphlet, there were three separate messages..telling exactly where the gold and silver was, how much of it was there, who it originally belonged to, etc..Actually the writer of the pamplet solved the second message. You see, the messages are written entirely in numbers with commas between them. He said the way that it was arranged you had to use the Declaration of Independence. As in, whatever number was there, that number represented the first letter of a word in the Declaration of Independence. For example, if you see the number 7, that means that you find the 7th word of the Declaration of Independence and take the first letter of that word. Of course after you find all the letters, you have to somehow figure out what the words are by trying different things..But, to this day, the puzzle still has not been solved.

Source:
Loy, Jim. 2001. The Famous Beale Cipher. 7 Nov. 2010. http://www.jimloy.com/puzz/beale.htm.

Saturday, November 6, 2010

Week 4 Blog Prompt

Research a famous cipher method online. Everyone needs to have different sources and types. Briefly explain the method and how it works to the best of your understanding. Does it use anything similar to what we have done so far this year?

Friday, November 5, 2010

Number Theory Blogs

The reason I didn't do a blog for the weekend of 10/30 - 10/31 is because I went on the blog on Saturday, and they had no prompt. So, I figured that there wasn't a blog just like all the other weeks...

Thursday, November 4, 2010

Alaina's blog, 31 october 2010

Prime factorization can be used to find many things. It's about breaking things down, going back to the basics.It's used for things like finding lcm, gcd, exponential form, number of divisors, product of positive divisors and to simplify fractions, etc.Cryptography is used to break down & decipher codes. prime factorization is used to break down numbers. Sounds pretty similar. Well I’m sure, in some strange way, it is used in cryptography.