Classical Cryptography

Xu Jia, XueMingQiang, Zhang Weiqi, Zhu Qian
School of Computing, National University of Singapore
{xujia, xuemingq, zhangwei, zhuqian}@comp.nus.edu.sg
Abstract
This paper surveys classical cryptography, focusing on its ciphers. Starting from the mathematical
and information theory background, we analyze the two major categories of ciphers, substitution and
transposition, covering their encoding and decoding algorithms, cryptanalysis and some applications.
Finally, we discuss some of the machines used in the early age of cryptography.
Key words: Cryptography, cipher, encryption, decryption, key, plaintext, ciphertext
Introduction
Classical cryptography, whose history goes back at least 4000 years (footnote 1), was mainly used
in diplomacy and war over the centuries. Compared with modern cryptography, which is
mainly used in computer security, most classical ciphers are vulnerable to today's
powerful computers.
Our motivation for writing this paper is that classical cryptography was useful in history
and is still useful in some recent applications. Classical ciphers illustrate the basic ideas of
confusion (footnote 2) and diffusion (footnote 3), which are properties of secure ciphers today.
Moreover, they give clues on how the theory of cryptography developed over time.
Footnote 1: From http://williamstallings.com/Extras/Security-Notes/lectures/classical.html
Footnote 2: Confusion: a property of an encryption algorithm that makes the relationship between the
original message and the ciphertext unrecognizable.
Footnote 3: Diffusion: the principle that a change in one part of the plaintext affects many parts of
the ciphertext.
In the following sections, we survey each class of classical ciphers, starting from the
mathematical and information theory background. In our analysis, English (26 letters) is used as
the template for most of the ciphers. Similar analysis can be conducted for other languages with
different alphabet sizes, for example German (30 letters in the alphabet) or Swedish (29 letters in
the alphabet).
Background Information
Mathematical Backgrounds
XOR operation
The XOR (exclusive or) operation is a binary bitwise operator that takes two operands, each either 0
or 1 (False or True), and outputs 0 or 1. The output is 1 exactly when the two operands differ.
The XOR operation satisfies the following four rules:
Table 1: Truth table for XOR
A B A XOR B
0 0 0
0 1 1
1 0 1
1 1 0
Modular Arithmetic
Mod is a binary operator that takes two integers as its operands. The result of a mod n is
the remainder of the integer division of the left argument a by the right argument n.
Examples of the mod operation:
5 mod 3 = 2
9 mod 5 = 4
Congruence is a mathematical concept closely related to mod. Integers a and b are called
congruent modulo a non-zero integer n, denoted a ≡ b (mod n), iff a mod n = b mod n.
An equivalent definition is that integers a and b are congruent modulo a non-zero integer n
iff the difference of a and b is divisible by n.
For example, 63 and 83 are congruent to each other modulo 10, since 83 - 63 = 20 is divisible by 10.
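The two definitions above can be checked with a few lines of Python. This is only an illustrative sketch; the helper name `congruent` is our own and does not come from any library.

```python
def congruent(a: int, b: int, n: int) -> bool:
    """Return True if a and b are congruent modulo the non-zero integer n."""
    # Second definition from the text: n divides the difference a - b.
    return (a - b) % n == 0

print(5 % 3)                   # remainder of 5 divided by 3
print(9 % 5)                   # remainder of 9 divided by 5
print(congruent(63, 83, 10))   # the example from the text
```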
Information theory background
Index of coincidence
The index of coincidence was introduced by William F. Friedman in his article “The Index of
Coincidence and its Applications in Cryptography”, Riverbank Publications Number 22.
The index of coincidence of a ciphertext measures the probability that two letters selected at
random from the text are identical. For a polyalphabetic cipher it decreases as the key length
grows. The formula is given by:
$$ IC = \sum_{\lambda = a}^{z} \frac{Freq(\lambda) \cdot (Freq(\lambda) - 1)}{N \cdot (N - 1)} $$
IC stands for index of coincidence, Freq(λ) is the number of occurrences of symbol λ in the text,
and N is the length of the text.
The value of IC ranges from 0.0384, for a polyalphabetic substitution with a perfectly flat
distribution, to 0.068, for a monoalphabetic substitution of common English text.
Table 2: Number of Enciphering Alphabets Versus Index of Coincidence
No. of alphabets   1      2      3      4      5      10     Large
IC                 0.068  0.052  0.047  0.044  0.044  0.041  0.038
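As a minimal sketch of the IC formula, the function below counts only the letters A to Z and ignores everything else. That filtering choice and the function name are our assumptions.

```python
from collections import Counter

def index_of_coincidence(text: str) -> float:
    """Probability that two letters drawn at random from text are identical."""
    letters = [c for c in text.upper() if "A" <= c <= "Z"]
    n = len(letters)
    if n < 2:
        raise ValueError("need at least two letters")
    counts = Counter(letters)
    # Sum of Freq * (Freq - 1) over all symbols, divided by N * (N - 1).
    return sum(f * (f - 1) for f in counts.values()) / (n * (n - 1))
```

On a long ciphertext the result can be compared against the values in Table 2 to guess how many alphabets were used.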
Unicity Distance
Unicity distance measures the minimal length of ciphertext for which only one plaintext
decryption is plausible. Usually, the larger the unicity distance, the better the cryptosystem. For
example, the unicity distance of a simple substitution cipher on English text is about 27, which
means that a ciphertext of roughly 27 letters is, in principle, enough to determine a unique meaning.
U ≈ log K / (R log P)
(K is the size of the key space, R is the redundancy of the language, P is the size of the alphabet used)
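The figure quoted for simple substitution can be reproduced approximately from the formula. In the sketch below the redundancy R = 0.68 is our assumption (a commonly quoted rough estimate for English, not a value given in this paper), and K = 26! is the key space of a simple substitution cipher.

```python
import math

def unicity_distance(key_space_size: int, redundancy: float, alphabet_size: int) -> float:
    """U = log K / (R log P), with logarithms taken in base 2."""
    return math.log2(key_space_size) / (redundancy * math.log2(alphabet_size))

# Assumed redundancy of English: about 0.68 (an estimate, not from the paper).
u = unicity_distance(math.factorial(26), 0.68, 26)
print(round(u, 1))
```

Under this assumption the result falls between 27 and 28 letters, in line with the value quoted above.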
Frequency Analysis
In most languages, certain letters, words or symbols appear at characteristic frequencies if the
text is long enough. Frequency analysis is based on this idea. For example, in English text, ‘e’ is
the most frequently used letter. The differences between high-frequency and low-frequency
letters can be used to analyze ciphertext. The appendix gives statistics for the most commonly
used letters, digrams and trigrams.
Substitution Ciphers
In substitution ciphers one letter is replaced by another letter. There are many categories of
substitution ciphers. In this section, we are going to discuss monoalphabetic substitution ciphers,
homophonic ciphers, polygraphic ciphers, polyalphabetic substitution ciphers and the one time
pad.
Monoalphabetic Substitution ciphers
The monoalphabetic substitution cipher, also called the simple substitution cipher, is one in
which each character in the plaintext is replaced by a corresponding one from a cipher alphabet.
The cipher alphabet can be reversed, shifted or scrambled. Although the number of possible
keys is very large (e.g., 26!-1 for English), this cipher is not very strong and is easily
broken by frequency analysis. However, an advantage of this cipher is that it can be
performed by direct lookup, so the time to encrypt a message of n characters is proportional to n.
We look at the Caesar cipher in detail and briefly introduce some other ciphers such as the
Affine cipher and the Atbash cipher.
Caesar Cipher
The Caesar cipher is one of the simplest encryption methods: it shifts the alphabet by a fixed
number of positions.
For example, in English, with a shift of 23:
Plain: ABCDEFGHIJKLMNOPQRSTUVWXYZ
Cipher: XYZABCDEFGHIJKLMNOPQRSTUVW
The encryption can also be expressed with modular arithmetic. Letters are first mapped to
numbers: A=0, B=1, C=2, D=3 … Z=25. Encrypting a letter x with a shift of n is
E(x) = (x + n) mod 26, and decryption is D(x) = (x - n) mod 26. Thus, in English, there are
25 different Caesar ciphers, while a language with an alphabet of m letters has m-1 different
Caesar ciphers.
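The modular form of the Caesar cipher translates directly into code. This is a minimal sketch that assumes uppercase English letters and leaves every other character unchanged; the function name is ours.

```python
def caesar(text: str, shift: int) -> str:
    """Shift each uppercase letter by shift positions: E(x) = (x + n) mod 26."""
    out = []
    for ch in text:
        if "A" <= ch <= "Z":
            out.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
        else:
            out.append(ch)  # spaces, digits and punctuation are kept as they are
    return "".join(out)

ciphertext = caesar("HELLO", 23)       # encrypt with a shift of 23
plaintext = caesar(ciphertext, -23)    # decrypt by shifting back
```

Decryption reuses the same function with the negated shift, which mirrors D(x) = (x - n) mod 26.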
General history of Caesar Cipher
The Caesar cipher is said to have been invented by Julius Caesar to communicate with his army. He is
often regarded as one of the first people to have used encryption to secure messages. Although this
cipher is easy to break today, it was unlikely to be broken at that time.
Cryptanalysis of Caesar Cipher
It is quite easy to break the Caesar cipher. Taking English as an example, since the key space has
only 25 keys, we can break it by hand in at most 25 tries (exhaustive key search): apply each shift
in turn and check whether the decoded text is readable English. With frequency analysis of English
letters (footnote 4), breaking the cipher becomes even easier and faster.
By roughly matching the letter frequency curve of the ciphertext (with the letters sorted by
frequency) against the normal frequency curve of English, we can recover the shift and obtain
readable English text. This method works especially well for long messages.
Another way to break the cipher is to recognize short, commonly used words. For
example, in English, “the”, “and” and “of” appear regularly (footnote 5). When the ciphertext
keeps its spaces, two-letter and three-letter words stand out and are repeated. By trying the
commonly used short words, as well as common letter pairs and triples (digrams and trigrams), it is
often possible to decode the cipher easily.
Besides short words, doubled letters also give hints for breaking the cipher. In
English, “tt”, “ss” and “ee” are commonly doubled.
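The exhaustive key search combined with the short-word idea can be sketched as follows. Instead of a human judging readability, this illustrative version keeps every shift whose decryption contains a crib such as "THE". The function names and the sample message are our own.

```python
def caesar_shift(text: str, shift: int) -> str:
    """Shift uppercase letters by shift positions, leaving other characters alone."""
    return "".join(
        chr((ord(ch) - 65 + shift) % 26 + 65) if "A" <= ch <= "Z" else ch
        for ch in text
    )

def candidate_shifts(ciphertext: str, crib: str = "THE") -> list:
    """Try all 25 shifts and keep those whose decryption contains the crib."""
    hits = []
    for shift in range(1, 26):
        if crib in caesar_shift(ciphertext, -shift):
            hits.append(shift)
    return hits

secret = caesar_shift("MEET ME AT THE GATE", 3)
print(candidate_shifts(secret))
```

For short messages several shifts may survive, or none if the crib does not occur, so in practice the surviving candidates would still be read by eye.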
Application
ROT13
ROT13 is a self-reversing Caesar cipher popularly used on Usenet and other online forums as a
means of hiding joke punchlines, movie and story spoilers, and offensive expressions from a
casual glance (footnote 6).
The name “ROT13” stands for “Rotate by 13 places”. Since there are 26=2*13 letters in English,
the ROT13 function is its own inverse:
Footnote 4: Discussed in the background section.
Footnote 5: From http://www.all-science-fair-projects.com/science_fair_projects_encyclopedia/Caesar_cipher
Footnote 6: From http://www.fact-index.com/r/ro/rot13.html
ROT13(ROT13(x)) = ROT26(x) = x for any text x
To apply ROT13 to a piece of text, simply shift every English letter by 13 places, leaving numbers,
symbols and other characters unchanged.
ROT13 is not intended to be secure. Instead of protecting the message, ROT13 protects
readers from material they may not wish to view in the forums. Thus the only viewers of the message
are those who consciously choose to decipher it using the rotate-by-13 scheme.
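Python's standard library happens to ship a ROT13 text transform in the `codecs` module, which makes the self-inverse property easy to try out. The wrapper name `rot13` is ours.

```python
import codecs

def rot13(text: str) -> str:
    """Rotate every English letter by 13 places; other characters are untouched."""
    return codecs.encode(text, "rot_13")

hidden = rot13("Hello, World! 123")
print(hidden)
print(rot13(hidden))  # applying ROT13 twice restores the original text
```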
Affine Cipher
The Affine cipher is a special case of the substitution cipher.
The encryption function for the cipher is e(x) = (ax + b) mod m, where a and m are relatively prime
and m is the size of the alphabet.
The decryption function is d(x) = a^(-1)(x - b) mod m, where a^(-1) is the multiplicative inverse
of a modulo m.
The cipher is weak: if a cryptanalyst learns the plaintext letters behind two of the ciphertext
characters, the key (a, b) can usually be obtained by solving the resulting system of two equations.
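A minimal sketch of the two affine functions for uppercase English text (m = 26) follows. The key a = 5, b = 8 is an arbitrary illustrative choice, and the three-argument `pow` with exponent -1 needs Python 3.8 or later.

```python
import math

def affine_encrypt(text: str, a: int, b: int, m: int = 26) -> str:
    """e(x) = (a*x + b) mod m for each uppercase letter of text."""
    if math.gcd(a, m) != 1:
        raise ValueError("a and m must be relatively prime")
    return "".join(chr((a * (ord(c) - 65) + b) % m + 65) for c in text)

def affine_decrypt(text: str, a: int, b: int, m: int = 26) -> str:
    """d(x) = a^(-1) * (x - b) mod m, using the modular inverse of a."""
    a_inv = pow(a, -1, m)  # multiplicative inverse of a modulo m
    return "".join(chr((a_inv * (ord(c) - 65 - b)) % m + 65) for c in text)

ciphertext = affine_encrypt("AFFINECIPHER", 5, 8)
print(ciphertext, affine_decrypt(ciphertext, 5, 8))
```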
Atbash Cipher
Table 3: Atbash Cipher
Plain:  A  B  G  D  H  V  Z  Ch T  Y  K  L  M  N  S  O  P  Tz Q  R  Sh Th
Cipher: Th Sh R  Q  Tz P  O  S  N  M  L  K  Y  T  Ch Z  V  H  D  G  B  A
The Atbash cipher is a simple substitution cipher for the Hebrew alphabet. It substitutes the first
letter with the last one, the second letter with the second-to-last one, and so on (as shown in the
table).
Homophonic Substitution Ciphers

Homophonic ciphers were invented to make frequency analysis attacks on substitution ciphers more
difficult. The idea is to disguise the letter frequencies by homophony. Usually in this
cipher, high-frequency letters are given more ciphertext symbols while lower-frequency ones
are given fewer. Thus, it differs from a monoalphabetic cipher in that one letter can be
mapped to more than one ciphertext symbol.
Book Cipher
The key of a book cipher is the identity of a book. The ciphertext of a plaintext word is the
location of that word in the book. One problem is that a word of the plaintext may not
appear in the book. An alternative is to encode each plaintext letter by the location
of that letter in the book. However, encoding a long message this way takes a long time.
Straddling checkerboard
The Straddling checkerboard is a device to convert letters into digits.
An example of the checkerboard (footnote 7):
Table 4: Straddling checkerboard
    0 1 2 3 4 5 6 7 8 9
    E T   A O N   R I S
2   B C D F G H J K L M
6   P Q / U V W X Y Z .
From the table, A=3, B=20, C=21, …, Z=68, so the plaintext A T T A C K A T D A W N becomes
3 1 1 3 21 27 3 1 22 3 65 5. Then add a secret key number (0452), repeated as often as needed, by
non-carrying addition (each pair of digits is added modulo 10):
  3113212731223655
+ 0452045204520452
= 3565257935743007
Then use the same checkerboard to turn the digits back into letters:
3 5 65 25 7 9 3 5 7 4 3 0 0 7
A N W  H  R S A N R O A E E R
Decryption simply reverses the process.
Footnote 7: From http://en.wikipedia.org/wiki/Straddling_checkerboard
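The checkerboard procedure can be sketched in a few functions. The board layout below is our reading of Table 4 (blank cells at columns 2 and 6 of the top row, and a "/" symbol assumed at position 62). All names are illustrative.

```python
# Assumed board layout: the top row has blanks at columns 2 and 6, which are
# the prefixes of the two lower rows.
BOARD = {"": "ET AON RIS", "2": "BCDFGHJKLM", "6": "PQ/UVWXYZ."}
ENCODE = {ch: prefix + str(col)
          for prefix, row in BOARD.items()
          for col, ch in enumerate(row) if ch != " "}
DECODE = {code: ch for ch, code in ENCODE.items()}

def to_digits(text: str) -> str:
    """Replace every symbol by its one- or two-digit checkerboard code."""
    return "".join(ENCODE[ch] for ch in text)

def add_key(digits: str, key: str, sign: int = 1) -> str:
    """Non-carrying addition (sign=1) or subtraction (sign=-1) of a repeated key."""
    return "".join(str((int(d) + sign * int(key[i % len(key)])) % 10)
                   for i, d in enumerate(digits))

def to_letters(digits: str) -> str:
    """Read digits back through the board; 2 and 6 start a two-digit code."""
    out, i = [], 0
    while i < len(digits):
        width = 2 if digits[i] in "26" else 1
        out.append(DECODE[digits[i:i + width]])
        i += width
    return "".join(out)

cipher = to_letters(add_key(to_digits("ATTACKATDAWN"), "0452"))
plain = to_letters(add_key(to_digits(cipher), "0452", sign=-1))
```

One limitation of this sketch: if the keyed digit string ends in a lone 2 or 6, `to_letters` fails, so a real scheme needs a padding rule for that case.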
Polygraphic Substitution Ciphers

Instead of substituting individual plaintext letters, polygraphic substitution ciphers substitute
larger groups of letters. This makes it more difficult for a cryptanalyst to break the cipher with
frequency analysis. However, for a specific language, there are still some frequency patterns for
larger letter groups.
Playfair cipher
The Playfair cipher is the earliest practical polygraphic substitution cipher. The cipher uses a 5
by 5 table of letters built from a key. To use the cipher, one needs to remember the key and the
four rules below (footnote 8), which are applied to the plaintext one pair of letters at a time:
1. If the letters of a pair are both the same (or only one letter is left), add an "X" after the
first letter. Encrypt the new pair and continue.
2. If the letters appear in the same row of the table, replace each with the letter to its
immediate right (wrapping around to the left side of the row if a letter in the
original pair is at the right end of the row).
3. If the letters appear in the same column of the table, replace each with the letter
immediately below it (wrapping around to the top of the column if a letter
in the original pair is at the bottom of the column).
4. If the letters are not in the same row or column, replace each with the letter in its own
row at the other corner of the rectangle defined by the original pair.
Applying the inverse of these four rules decrypts the message.
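The four Playfair rules can be sketched as below. The text does not say how the 5 by 5 table is built from the key, so this sketch assumes the common convention: write the key letters first without repeats, then the rest of the alphabet, with I and J sharing one cell. The key "playfair example" is only an illustration.

```python
def build_table(key: str) -> str:
    """Return the 5x5 table as a 25-letter string, row by row (assumed layout)."""
    seen = []
    for ch in key.upper() + "ABCDEFGHIKLMNOPQRSTUVWXYZ":
        ch = "I" if ch == "J" else ch          # I and J share a cell
        if ch.isalpha() and ch not in seen:
            seen.append(ch)
    return "".join(seen)

def make_pairs(text: str) -> list:
    """Split text into letter pairs, inserting X as described in rule 1."""
    letters = ["I" if c == "J" else c for c in text.upper() if c.isalpha()]
    pairs, i = [], 0
    while i < len(letters):
        if i + 1 < len(letters) and letters[i + 1] != letters[i]:
            pairs.append((letters[i], letters[i + 1]))
            i += 2
        else:                                   # doubled letter or last letter
            pairs.append((letters[i], "X"))
            i += 1
    return pairs

def playfair_encrypt(text: str, key: str) -> str:
    table = build_table(key)
    out = []
    for a, b in make_pairs(text):
        ra, ca = divmod(table.index(a), 5)
        rb, cb = divmod(table.index(b), 5)
        if ra == rb:                            # rule 2: same row, move right
            out += [table[ra * 5 + (ca + 1) % 5], table[rb * 5 + (cb + 1) % 5]]
        elif ca == cb:                          # rule 3: same column, move down
            out += [table[((ra + 1) % 5) * 5 + ca], table[((rb + 1) % 5) * 5 + cb]]
        else:                                   # rule 4: other corners of rectangle
            out += [table[ra * 5 + cb], table[rb * 5 + ca]]
    return "".join(out)

cipher = playfair_encrypt("hide the gold in the tree stump", "playfair example")
```

A plaintext that contains a doubled "X" is not handled specially here; real procedures pick a different filler letter in that case.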
Footnote 8: From http://www.fact-index.com/p/pl/playfair_cipher.html
Hill Cipher (footnote 9)
The Hill cipher is a polygraphic substitution cipher that can combine much larger groups of letters
simultaneously, using linear algebra. Each letter is treated as a digit in base 26: A = 0, B = 1, and
so on. A block of n letters is then considered as a vector of n dimensions and multiplied by an n x n
matrix, modulo 26. The components of the matrix are the key; they should be chosen at random,
provided that the matrix is invertible modulo 26 (that is, its determinant is relatively prime to 26).
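To make the matrix arithmetic concrete, here is a sketch restricted to n = 2 and to uppercase text of even length. The key matrix [[3, 3], [2, 5]] is an arbitrary invertible example, not a key from this paper, and the function names are ours.

```python
import math

def hill_encrypt_2x2(text: str, key) -> str:
    """Multiply each pair of letters (as a column vector) by the key, modulo 26."""
    (a, b), (c, d) = key
    if math.gcd(a * d - b * c, 26) != 1:
        raise ValueError("key matrix is not invertible modulo 26")
    nums = [ord(ch) - 65 for ch in text]
    out = []
    for i in range(0, len(nums), 2):
        x, y = nums[i], nums[i + 1]
        out.append((a * x + b * y) % 26)
        out.append((c * x + d * y) % 26)
    return "".join(chr(v + 65) for v in out)

def hill_decrypt_2x2(text: str, key) -> str:
    """Decrypt by encrypting with the inverse matrix modulo 26."""
    (a, b), (c, d) = key
    det_inv = pow(a * d - b * c, -1, 26)   # needs Python 3.8+
    inverse = [[(det_inv * d) % 26, (-det_inv * b) % 26],
               [(-det_inv * c) % 26, (det_inv * a) % 26]]
    return hill_encrypt_2x2(text, inverse)

KEY = [[3, 3], [2, 5]]                      # illustrative key, determinant 9
cipher = hill_encrypt_2x2("HELP", KEY)
```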
Polyalphabetic substitution ciphers
To make substitution ciphers more secure, more than one cipher alphabet can be used to
encode a single letter of the plaintext. Such ciphers are called polyalphabetic substitution
ciphers. This one-to-many correspondence makes frequency analysis attacks much more
difficult.
Leon Battista Alberti invented the first published polyalphabetic cipher around 1467.[1] At
first, a good polyalphabetic substitution cipher was extremely hard to break. This changed in the
mid-1800s, when Friedrich Kasiski published the first procedure for attacking polyalphabetic
ciphers, especially the Vigenere cipher (footnote 10).
Vigenere cipher
This cipher is named after a Frenchman, Blaise de Vigenere. The encoding and decoding
procedures use a tabula recta called the Vigenere tableau, together with a key. A Vigenere tableau
is a collection of 26 permutations of the 26 English letters. Usually, these permutations are
written as a square matrix indexed by a pair of English letters, with all 26 letters in each row and
each column.
Footnote 9: From http://en.wikipedia.org/wiki/Polygraphic#Polygraphic
Footnote 10: After the book “Die Geheimschriften und die Dechiffrierkunst” (“Secret Writing and the
Art of Deciphering” in English), the polyalphabetic cipher was no longer considered secure.
Suppose the key is K=<k(0),k(1),k(2),k(3),…,k(d-1)>, where each k(i) is a symbol from the alphabet
used, typically an English letter, and d is the length of the key. For example, if the key is “BAD”,
then d=3, k(0)=”B”, k(1)=”A”, k(2)=”D”. Because the key is usually shorter than the plaintext, it is
repeated as many times as needed.
Suppose the plaintext is P=<p(0),p(1),p(2),…, p(n-1)>, where n is the length of the plaintext P and
each p(i) is a symbol of the plaintext.
Suppose the ciphertext encrypted from plaintext P with key K using the Vigenere cipher is
C=<c(0),c(1),c(2),c(3),…,c(n-1)>, where each c(i) is a letter and n is the length of the ciphertext.
Note that the ciphertext C and the plaintext P have the same length. This is a characteristic of
the Vigenere cipher.
Denote the Vigenere tableau by Vigenere_table. Then encryption with the Vigenere cipher can be
described as:
c(i)=Vigenere_table[k(i mod d)][p(i)], 0<= i < n.
Because the Vigenere tableau is symmetric, i.e. Vigenere_table[i][j]=Vigenere_table[j][i] for all
pairs of i and j, the above formula can be written equivalently as:
c(i)=Vigenere_table[p(i)][k(i mod d)].
If we code the 26 letters A to Z as the integers 0 to 25 respectively, the encryption rule can be
written mathematically as:
c(i)=(p(i) + k(i mod d)) mod 26.
Example. For the message COMPUTER SECURITY and keyword LUCKY we proceed with the
encryption as follows:
Table 5: Keyword aligned with the message
L U C K Y L U C K Y L U C K Y L
C O M P U T E R S E C U R I T Y
For each letter of the message, we use the corresponding letter of the keyword to select a row of
the tableau and go across that row to the column headed by the letter of the message. Using the
Vigenere tableau (Table 6), it follows that the first two letters "CO" of the message are encoded as
"NI".
Table 6: Vigenere Cipher Scheme
Continuing in this way we find the encoded message that appears in Table 7.
Table 7: Encryption of Vigenere Cipher
Key:        L U C K Y L U C K Y L U C K Y L
Plaintext:  C O M P U T E R S E C U R I T Y
Ciphertext: N I O Z S E Y T C C N O T S R J
We can use the following formula to decrypt the message:
p(i)=(c(i) - k(i mod d)) mod 26.
Alternatively, we can reverse the tableau lookup used for encryption.
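The arithmetic form of the Vigenere rules can be sketched as a single function, shown here on the COMPUTER SECURITY example with keyword LUCKY. The sketch assumes uppercase letters only (no spaces), and the function name is ours.

```python
def vigenere(text: str, key: str, decrypt: bool = False) -> str:
    """c(i) = (p(i) + k(i mod d)) mod 26, or the inverse when decrypt is True."""
    sign = -1 if decrypt else 1
    out = []
    for i, ch in enumerate(text):
        p = ord(ch) - 65
        k = ord(key[i % len(key)]) - 65    # the key repeats every d letters
        out.append(chr((p + sign * k) % 26 + 65))
    return "".join(out)

cipher = vigenere("COMPUTERSECURITY", "LUCKY")
plain = vigenere(cipher, "LUCKY", decrypt=True)
```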
Beaufort cipher
The Beaufort cipher is another polyalphabetic cipher which is very similar to the Vigenere cipher.
The only difference is that the Beaufort cipher uses reversed alphabets.
For encryption, we use c(i)=(k(i mod d) - p(i)) mod 26.
For decryption, we use p(i)=(k(i mod d) - c(i)) mod 26, so the same operation both encrypts and
decrypts.
Running key cipher
The running key cipher is a type of polyalphabetic substitution cipher in which a text, typically
from a book, provides a very long key stream. Generally, the book has to be agreed on ahead of
time, while the passage used as the key is chosen randomly for each message. The recipient
therefore cannot know the key unless its position is indicated somewhere in the message, so we need
a predefined secure way to tell the recipient where in the book to find the running key for the
message. Like the Vigenere cipher, the running key cipher employs the Vigenere tableau, but the
key is not repeated: the cipher uses a key stream that is as long as the message.
Perhaps surprisingly, the running key cipher is not as secure as we might imagine, because of the
low entropy per character of both the plaintext and the key. The most obvious and easiest way to
improve its security is to use a predefined mixed-alphabet table instead of the tabula recta
(Vigenere tableau).
Autokey cipher
An autokey cipher incorporates the message into the key. It is also called a self-synchronizing
stream cipher. There are two kinds of autokey cipher: key-autokey ciphers, in which the next element
of the key is determined by the previous elements of the key stream, and text-autokey ciphers, in
which the next element of the key is determined by the previous message text.
Girolamo Cardano, an Italian mathematician, invented the first autokey cipher.
Vigenère also invented a kind of autokey cipher. His innovation was to append the message to the
keyword to form the real key, so it is a text-autokey cipher. This text-autokey cipher remained
unbroken for over 200 years, until Charles Babbage discovered a means of breaking the
cipher.
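A text-autokey cipher of the kind attributed to Vigenère can be sketched as follows: the key stream is the keyword followed by the plaintext itself. The keyword "QUEENLY" and the message are illustrative choices, and the code assumes uppercase letters only.

```python
def autokey_encrypt(plaintext: str, keyword: str) -> str:
    """Key stream = keyword followed by the plaintext, cut to message length."""
    stream = (keyword + plaintext)[:len(plaintext)]
    return "".join(chr((ord(p) - 65 + ord(k) - 65) % 26 + 65)
                   for p, k in zip(plaintext, stream))

def autokey_decrypt(ciphertext: str, keyword: str) -> str:
    """Each recovered plaintext letter extends the key stream for later letters."""
    stream = list(keyword)
    out = []
    for i, c in enumerate(ciphertext):
        p = chr((ord(c) - ord(stream[i])) % 26 + 65)
        out.append(p)
        stream.append(p)
    return "".join(out)

cipher = autokey_encrypt("ATTACKATDAWN", "QUEENLY")
plain = autokey_decrypt(cipher, "QUEENLY")
```

Decryption has to proceed letter by letter, because the later part of the key stream is only known once the earlier plaintext has been recovered.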
Cryptanalysis of Polyalphabetic Substitutions
The method for breaking polyalphabetic ciphers is to determine the number of alphabets used,
break the ciphertext into pieces that were each enciphered with the same alphabet, and solve each
piece as a monoalphabetic substitution. Two tools help to decrypt messages written with
a large number of alphabets: the Kasiski method, which determines when a pattern of
encryption permutations has repeated, and the index of coincidence, which predicts the number of
alphabets used for substitution.
The Kasiski Method for Repeated Patterns
The Kasiski method, named after its developer Friedrich Kasiski, a Prussian military officer, is
a way of finding the number of alphabets that were used for encryption.
The method relies on the regularity of English. Not only letters but also letter groupings and words
are repeated (e.g. –th, -ing, -ed, -ion, -tion, -ation, im-, in-, un-, re-, –eek-, -oot-, -our-, and
words like of, and, to, with, are, is, that, etc.).
The Kasiski method follows this rule: if a message is encrypted with n alphabets (e.g., the key
length is n for the Vigenere cipher), and if a particular word or letter group appears k times in
the plaintext, then it should be encrypted approximately k/n times (at least the ceiling of k/n,
footnote 11) with the same alphabets. This follows from the Pigeonhole Principle (footnote 12). The
distance between two such repetitions of a pattern in the ciphertext is a multiple of the key
length, that is, of the number of alphabets used.
The algorithm for the Kasiski method is as follows:
1. Identify repeated patterns of three or more letters.
2. For each pattern, calculate the distance between the starting positions of
successive instances of the pattern.
3. Determine the greatest common divisor of all distances obtained in step 2.
4. If a polyalphabetic substitution was used, the key length should be a factor of the
GCD (greatest common divisor) obtained in step 3.
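Steps 1 to 3 of the algorithm can be sketched as below. For simplicity this version looks only at patterns of one fixed length (three by default) and measures distances between successive occurrences. The function names and the toy input are our own.

```python
from functools import reduce
from math import gcd

def kasiski_distances(ciphertext: str, length: int = 3) -> list:
    """Distances between successive occurrences of each repeated pattern."""
    last_seen = {}
    distances = []
    for i in range(len(ciphertext) - length + 1):
        gram = ciphertext[i:i + length]
        if gram in last_seen:
            distances.append(i - last_seen[gram])
        last_seen[gram] = i            # remember the most recent position
    return distances

def kasiski_gcd(ciphertext: str, length: int = 3) -> int:
    """Greatest common divisor of all distances (0 if nothing repeats)."""
    distances = kasiski_distances(ciphertext, length)
    return reduce(gcd, distances) if distances else 0

# Toy input: "ABC" occurs at positions 0, 6 and 15.
print(kasiski_distances("ABCDEFABCGHIJKLABC"), kasiski_gcd("ABCDEFABCGHIJKLABC"))
```

On real ciphertext a single accidental repetition can pull the GCD down to 1, so in practice the individual distances and their common factors are inspected as well.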
Short repeated patterns, such as two-letter patterns, are often accidental, so it is more trouble to
consider them than to ignore them. Any pattern over 3 characters is almost certainly not accidental.
(The likelihood of two four-letter patterns not being from the same plaintext segment is 1/26^4)
[security in computing]. The distance between two repetitions of a pattern should be evenly
divisible by the key length. If the distance is calculated between two non-successive instances,
the number of candidates for the key length becomes larger.
Footnote 11: Ceiling is a mathematical function that takes a real number as its argument and
outputs the least integer that is larger than or equal to the argument.
Footnote 12: If you have fewer pigeonholes than pigeons and you put every pigeon in a pigeonhole,
then at least one pigeonhole must contain more than one pigeon.
For the index of coincidence method, we calculate the IC of the ciphertext and look up the
corresponding key length in the table (footnote 13).
One Time Pad
The one time pad uses a random key to encrypt the message. It is called a one time pad
because the key is used only once for each segment of the message and never used again. A simple
XOR operation is used to encrypt the message.
Example:
Message: COMPUTER
Key: SECURITY
COMPUTER:   01000011 01001111 01001101 01010000 01010101 01010100 01000101 01010010
SECURITY:   01010011 01000101 01000011 01010101 01010010 01001001 01010100 01011001
______________________________________________________________
CIPHERTEXT: 00010000 00001010 00001110 00000101 00000111 00011101 00010001 00001011
Each encryption is independent of any other encryption, so no pattern can be detected. The
unicity distance (footnote 14) of the one time pad is infinite because the key is at least as long
as the text. Thus, it is the only cipher that has been proven to be perfectly secure in theory.
However, the length of the key is an obvious drawback of the one time pad (the key must be
at least as long as the plaintext to be encrypted). Moreover, the users must agree on a key in
advance, which raises the problem of transmitting the key securely.
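The XOR step of the one time pad can be sketched in a few lines using the COMPUTER / SECURITY example. The key "SECURITY" is kept only to match the worked example. A real pad would be random bytes, for instance from Python's `secrets.token_bytes`, and the function name here is ours.

```python
def xor_bytes(data: bytes, key: bytes) -> bytes:
    """XOR each message byte with the key byte at the same position."""
    if len(key) < len(data):
        raise ValueError("the pad must be at least as long as the message")
    return bytes(d ^ k for d, k in zip(data, key))

ciphertext = xor_bytes(b"COMPUTER", b"SECURITY")
print(" ".join(f"{byte:08b}" for byte in ciphertext))

# XOR is its own inverse, so the same call with the same key decrypts.
recovered = xor_bytes(ciphertext, b"SECURITY")
```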
Cryptanalysis
The one time pad is said to turn a message transmission problem into a key transmission problem. For
the one time pad to be effectively secure, the key must be truly random. As long as the key is
random, is used only once and can be kept safe, the one time pad is perfectly secure.
Footnote 13: Refer to the background section.
Footnote 14: Refer to the background section for unicity distance.
Transposition Ciphers
Transposition is a classical cryptographic technique that differs from substitution.
Transposition reorders the elements of the plaintext according to a rule agreed on by the sender
and receiver, making the text unrecognizable to adversaries.
The major property of a transposition cipher is that each element occurs as often in the ciphertext
as in the plaintext, because elements are simply reordered, not substituted. Thus, it preserves
the single-letter frequency distribution. However, the frequencies of digrams and trigrams
generally do not match those of the original language. From this we can detect that
a ciphertext was encrypted with a transposition cipher. Simple transposition is not safe because a
modern computer can quickly try the possible arrangements.
Simple Example of Transposition cipher:
plaintext: ILOVECOMPUTERSECURITY
ciphertext: YTIRUCESRETUPMOCEVOLI
Reading carefully, you can see that the plaintext is simply reversed. That is, if we reverse the
ciphertext we get the original text back.
Examples of Transposition ciphers
Most transposition ciphers apply some bijective function to the positions of the plaintext letters.
The procedure used to encode the message can be applied in reverse to decode it.
Rail Fence Cipher
Write the plaintext into a matrix row by row and read the cipher output column by column. The key
of this cipher is the number of letters in a row.
Example:
Message: WEAREDOINGCOMPUTERSECURITYASSIGNMENT
Key length: 6
Matrix:
W E A R E D
O I N G C O
M P U T E R
S E C U R I
T Y A S S I
G N M E N T
Cipher: WOMSTGEIPEYNANUCAMRGTUSEECERSNDORIIT
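The write-by-rows, read-by-columns procedure described in this section can be sketched as follows. The decryption helper assumes the message fills the matrix completely, as in the example, and the function names are ours.

```python
def rows_to_columns(message: str, width: int) -> str:
    """Write message in rows of width letters, then read it column by column."""
    rows = [message[i:i + width] for i in range(0, len(message), width)]
    return "".join(row[c] for c in range(width) for row in rows if c < len(row))

def columns_to_rows(cipher: str, width: int) -> str:
    """Invert rows_to_columns for a completely filled matrix."""
    if len(cipher) % width:
        raise ValueError("this sketch only handles completely filled matrices")
    # Transposing again, with the number of rows as the width, undoes the cipher.
    return rows_to_columns(cipher, len(cipher) // width)

MESSAGE = "WEAREDOINGCOMPUTERSECURITYASSIGNMENT"
cipher = rows_to_columns(MESSAGE, 6)
```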
Columnar transposition
To complicate the route of the Rail Fence cipher, we can permute the columns to enhance
security. The columns are read in the alphabetical order of the letters of the key.
Message: WEAREDOINGCOMPUTERSECURITYASSIGNMENT
Key: BIRDAY
Read the columns in the order A->B->D->I->R->Y
Matrix:
W E A R E D
O I N G C O
M P U T E R
S E C U R I
T Y A S S I
G N M E N T
Cipher: ECERSNWOMSTGRGTUSEEIPEYNANUCAMDORIIT
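A sketch of columnar transposition with the key BIRDAY follows. As in the example, it assumes the message fills the matrix completely. The function names are ours, and ties between repeated key letters (none occur here) would be broken by position.

```python
def columnar_encrypt(message: str, key: str) -> str:
    """Read the columns in the alphabetical order of the key letters."""
    width = len(key)
    columns = [message[c::width] for c in range(width)]       # column c of the matrix
    order = sorted(range(width), key=lambda c: key[c])        # e.g. A, B, D, I, R, Y
    return "".join(columns[c] for c in order)

def columnar_decrypt(cipher: str, key: str) -> str:
    """Put each column back in place, then read the matrix row by row."""
    width = len(key)
    height = len(cipher) // width          # assumes a completely filled matrix
    order = sorted(range(width), key=lambda c: key[c])
    columns = [""] * width
    for n, c in enumerate(order):
        columns[c] = cipher[n * height:(n + 1) * height]
    return "".join(columns[c][r] for r in range(height) for c in range(width))

MESSAGE = "WEAREDOINGCOMPUTERSECURITYASSIGNMENT"
cipher = columnar_encrypt(MESSAGE, "BIRDAY")
```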
Double transposition
Double transposition applies columnar transposition twice to the text to enhance security.
With a single transposition, the adversary can try all possible key lengths to recover the
plaintext; double transposition complicates this attack. A single transposition needs
one key, while double transposition needs two keys.
Other transposition ciphers include the ADFGVX cipher and the Grille.
Machine and Rotors
Encryption and decryption can be carried out in practice by a rotor machine. A rotor machine is a
device that implements the encoding and decoding algorithms of classical cryptography. It is
constructed by wiring 26 switches to 26 light bulbs.
To turn such a machine into an encipherer, we proceed in the following steps. Firstly, wire the
machine so that turning on any one of the switches lights up a corresponding light bulb.
Secondly, replace the switches with the keys of a typewriter keyboard, and label the light
bulbs with letters as well. For example, when you press key “A”, the light bulb “A”
lights up. This is not yet encryption; we need to make it a mono-alphabetic encipherer.
Thirdly, to turn it into an encryption system, simply change the wiring so that each key pressed on
the keyboard lights up a different light bulb. For example, when an
“A” is typed, light bulb “X” lights up. Thus, when we type a message, the sequence of lit
light bulbs gives the encrypted message. This is a single-alphabet (mono-alphabetic)
substitution system, which is insecure and easy to break by frequency analysis.
Since this kind of simple substitution is not safe, how can we make the rotor machine more secure?
The solution is to introduce a poly-alphabetic substitution cipher system by using a rotor in the
machine and rotating it. As the rotor rotates, a different substitution is used every time
a key is pressed, even for the same letter. For example, the first time you press “A”, light bulb
“X” lights up; the second time you press “A”, light bulb “S” lights up; the third time you press “A”,
some other letter lights up, and so on. There is a website (footnote 15) that simulates the Enigma
machine (an example of a rotor machine), where you can press the letters on the keyboard and see
how the light bulbs light up.
The algorithm involved here is “use the next alphabet with every key press” (footnote 16). The rotor
generates the key stream by rotating, and the key is hidden in the wiring of the disk. The starting
position of the rotor is very important, since it determines the long key stream that is used to
encrypt all the following key presses. The key stream is generated by rotating onward from that
starting position (which could be labelled with either a number or a letter).
The number of rotors is also an important factor in the degree of security. If a
machine with a single rotor is considered not secure enough, the security level can be increased by
simply adding more rotors. The reason is that one rotor gives a poly-alphabetic substitution system
with 26 keys, while 2 rotors give 26*26 = 676 keys.
With more than one rotor, the next rotor advances one position after the previous rotor has turned
“all the way” around. For example, after the first rotor has turned from position “A” through “Z”,
the second rotor advances from ”A” to “B”. With 3 rotors, the third rotor advances one position
after the second rotor has turned “all the way”, and so on.
This is how encryption is done using a rotor machine. To turn the rotor machine into a
decipherer, we could use a symmetrical approach.
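The stepping behaviour described above can be imitated with a toy two-rotor model. This is only a sketch of the idea, not a model of any historical machine: the two wirings are arbitrary permutations chosen for illustration, there is no reflector or plugboard, and the rotors advance after each key press like an odometer.

```python
import string

ALPHABET = string.ascii_uppercase
# Hypothetical wirings: any two permutations of the alphabet would do.
WIRINGS = ["EKMFLGDQVZNTOWYHXUSPAIBRCJ", "AJDKSIRUXBLHWTMCQGZNPYFVOE"]

def step(positions):
    """Advance the rotors like an odometer and return the new positions."""
    positions = list(positions)
    for i in range(len(positions)):
        positions[i] = (positions[i] + 1) % 26
        if positions[i] != 0:      # no full turn yet, so the next rotor stays put
            break
    return positions

def run(text, start, decrypt=False):
    """Pass each letter through all rotors, then step; decrypt goes backwards."""
    positions, out = list(start), []
    for ch in text:
        x = ALPHABET.index(ch)
        rotors = list(zip(WIRINGS, positions))
        for wiring, pos in (reversed(rotors) if decrypt else rotors):
            if decrypt:
                x = (wiring.index(ALPHABET[(x + pos) % 26]) - pos) % 26
            else:
                x = (ALPHABET.index(wiring[(x + pos) % 26]) - pos) % 26
        out.append(ALPHABET[x])
        positions = step(positions)
    return "".join(out)

cipher = run("AAAA", [0, 0])                   # the same letter, pressed four times
plain = run(cipher, [0, 0], decrypt=True)      # same start position, reverse wiring
```

With two rotors the positions only return to their starting values after 26*26 = 676 key presses, which is the period mentioned in the text.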
Enigma Machine
Footnote 15: The website is: http://www.ugrad.cs.jhu.edu/~russell/classes/enigma/
Footnote 16: From http://www.fact-index.com/r/ro/rotor_machine.html
The Enigma machine is a typical example of a rotor machine. (Examples of rotor machines are: Enigma
machine, Fialka, Hebern rotor machine, HX-63, KL-7, Lacida, NEMA, SIGABA, Typex.)
The Enigma machine is a rotor machine with 3 rotors and, as its distinguishing feature, a reflector.
Its mechanism implements a complex algorithm, so the task of encoding and decoding
can be carried out mechanically.
Enigma machines were in use from the early 1920s and through World War II, most famously by Nazi
Germany (footnote 17).
Footnote 17: From http://webhome.idirect.com/~jproc/crypto/enigma.html
[Figure: Enigma machine (footnote 18)]
Other machines
There are other machines used for encryption and decryption. The algorithms and
principles behind them vary from one to another.
Jefferson Cylinder (1790)

This is a wooden cylinder about 15 cm long and 4 cm across.
A picture is worth more than a thousand words:
[Figure: Jefferson cylinder (footnote 19)]
The cylinder is cut into slices, each 5 mm wide, and on the rim of each slice the 26 letters are
written in random order and at equal size.
An important feature of the Jefferson Cylinder is that the person who receives the secret message
must have a cylinder arranged exactly like that of the person who encrypts and sends the
message. In other words, there must be 2 identical cylinders to carry out the encryption and
decryption process.
Footnote 18: From http://en.wikipedia.org/wiki/Enigma_machine
Footnote 19: From http://williamstallings.com/Extras/Security-Notes/lectures/classical.html
The encryption process is carried out like this: first, turn the wheels of the cylinder so that
the letters of the secret message line up along one side of the cylinder. Then copy any other,
randomly chosen, line of letters (in which the order of the letters appears quite random). The
letters of that line are the ciphertext to be sent to the receiver.
On the receiver's side, after receiving the ciphertext, he turns the wheels of his
cylinder so that the letters of the ciphertext line up along one side. Since the cylinder used to
decrypt the message is identical to the one used to encrypt it, when he turns the cylinder around he
will find a line of letters that is meaningful, which is the plaintext.
Wheatstone Disk ( 1870)

Consider two concentric wheels, each with the 26 letters around its edge. By rotating the two
wheels, every letter on the inner wheel is lined up with a letter on the outer wheel.
This is similar to the Caesar cipher.
The Wheatstone disk itself generates a polyalphabetic cipher. Its construction is similar to a
clock: there are two hands on the disk, one large and one small, which look like the hour and
minute hands of a clock and are connected by gears. When the large hand points to a plaintext
letter, the small hand points to the corresponding ciphertext letter. That is how encryption is
done using the Wheatstone disk. Note that when the gears are rearranged the encryption changes,
that is, the small hand no longer points to the same position when the large hand points to the
same letter.
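The paragraph above does not give the lettering or the gear ratio, so the following Python sketch fills these in with explicit assumptions: the outer ring is taken to have 27 positions (the 26 letters plus a blank, written "+"), the inner ring carries a hypothetical mixed alphabet INNER, both hands start at position 0, the large hand is always moved clockwise, and the gears make the small hand travel the same number of positions as the large hand. Under these assumptions the two hands drift apart by one position per full turn, which is what makes the cipher polyalphabetic; with rings of equal size the hands stay in step and the result is a single fixed substitution.

```python
import string

OUTER = "+" + string.ascii_uppercase   # assumed outer ring: blank "+" plus 26 letters
INNER = "QWERTYUIOPASDFGHJKLZXCVBNM"   # hypothetical mixed inner alphabet

def encrypt(plaintext, outer=OUTER, inner=INNER):
    big, small, prev, out = 0, 0, None, []
    for ch in plaintext:
        if ch == prev:
            # The large hand has to move for every letter, so doubled
            # letters must be broken up before encryption.
            raise ValueError("doubled letter in plaintext")
        target = outer.index(ch)
        steps = (target - big) % len(outer)    # clockwise travel of the large hand
        big = target
        small = (small + steps) % len(inner)   # geared small hand travels equally far
        out.append(inner[small])
        prev = ch
    return "".join(out)

def decrypt(ciphertext, outer=OUTER, inner=INNER):
    """Inverse of encrypt; assumes the outer ring has one more position than the inner ring."""
    big, small, out = 0, 0, []
    for ch in ciphertext:
        target = inner.index(ch)
        steps = (target - small) % len(inner) or len(inner)
        small = target
        big = (big + steps) % len(outer)
        out.append(outer[big])
    return "".join(out)

print(encrypt("ABABAB"))                               # same letter, different ciphertext
print(encrypt("BABAB", outer=string.ascii_uppercase))  # equal rings: fixed substitution
print(decrypt(encrypt("SECRET")))
```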
Conclusion
In this paper, we have discussed various kinds of classical ciphers and some of their applications.
We have seen that most of these ciphers work by changing characters or streams of characters, and
that most of them are symmetric: once you know how to encode a message, you also know how to
decode it.
From the analysis, we have seen that most classical encryption methods are vulnerable and can
easily be broken with today's technology. That is why they are seldom used in computer security
nowadays. However, these encryption methods give us clues on how cryptography can be done. The
basic theories and concepts of classical cryptography were important to the development of modern
cryptography and will remain important to the development of cryptography in the future.
References
[1] Definition of substitution cipher
http://www.wordiq.com/definition/Substitution_cipher#Polyalphabetic
http://rinkworks.com/words/letterfreq.shtml
[2] English words frequency table
http://www.edict.com.hk/TextAnalyser/default.htm
[3] Classic Cryptography and Digraphic Substitution
http://www.thinkquest.org/library/site_sum.html?tname=27158&url=27158/concept1_13.html
[4] Codes and Ciphers Wheatstone Disk
http://www.otr.com/ciphers.shtml
[5] Computer Security Website
https://www.maths.uwa.edu.au/~praeger/teaching/3CC/WWW/chapter7.html#tth_chAp7
[6] Enigma
http://webhome.idirect.com/~jproc/crypto/enigma.html
[7] Historical Cryptography
http://starbase.trincoll.edu/~crypto/
[8] Frequency Analysis
http://www.fact-index.com/f/fr/frequency_analysis.html
[9] Rotor Machine
http://www.fact-index.com/r/ro/rotor_machine.html
[10] Rotor Machines
http://raphael.math.uic.edu/~jeremy/crypt/rotor.html
[11] Secret Language
http://www.exploratorium.edu/ronh/secret/secret.html
[12] U-boot Enigma Simulation
http://www.u-boot-greywolf.de/uenigmasimulation.htm
[13] Unicity Distance
http://www.u-boot-greywolf.de/uenigmasimulation.htm
[14] Encryption, Wikipedia
http://en.wikipedia.org/wiki/Cipher
[15] Book reference
"Security in Computing" by Charles P. Pfleeger
Appendix
Table I: English words frequency table

Words listed by frequency: the first 2000 most frequent words from the Brown Corpus
(1,015,945 words). These lists reflect general non-academic English as it is used in
newspapers, magazines and books.
Rank  Word  Instances  % Frequency     Rank  Word  Instances  % Frequency
1.    The       69970       6.8872     16.   on         6742       0.6636
2.    of        36410       3.5839     17.   be         6376       0.6276
3.    and       28854       2.8401     18.   at         5377       0.5293
4.    to        26154       2.5744     19.   by         5307       0.5224
5.    a         23363       2.2996     20.   I          5180       0.5099
6.    in        21345       2.1010     21.   this       5146       0.5065
7.    that      10594       1.0428     22.   had        5131       0.5050
8.    is        10102       0.9943     23.   not        4610       0.4538
9.    was        9815       0.9661     24.   are        4394       0.4325
10.   He         9542       0.9392     25.   but        4381       0.4312
11.   for        9489       0.9340     26.   from       4370       0.4301
12.   it         8760       0.8623     27.   or         4207       0.4141
13.   with       7290       0.7176     28.   have       3942       0.3880
14.   as         7251       0.7137     29.   an         3748       0.3689
15.   his        6996       0.6886     30.   they       3619       0.3562

(From http://www.edict.com.hk/TextAnalyser/default.htm)
Table II : English letters frequency table
Analysis of 45406 Common Words
This table analyzes a pool of words that includes plurals and words with common suffixes.
For each letter, the first pair of columns gives the number of occurrences and its share of all
letters; the second pair gives the number of words containing the letter and its share of all words.
Letter  Occurrences  % of letters  Words  % of words
e             42689        11.74%  30254      66.63%
i             31450         8.65%  23875      52.58%
s             29639         8.15%  22697      49.99%
a             28965         7.97%  23408      51.55%
r             27045         7.44%  22642      49.87%
n             26975         7.42%  21644      47.67%
t             24599         6.76%  20040      44.14%
o             21588         5.94%  17776      39.15%
l             19471         5.35%  16289      35.87%
c             15002         4.13%  13142      28.94%
d             13849         3.81%  12334      27.16%
u             11715         3.22%  10894      23.99%
g             10339         2.84%   9426      20.76%
p             10063         2.77%   8952      19.72%
m              9803         2.70%   8871      19.54%
h              7808         2.15%   7372      16.24%
b              7368         2.03%   6880      15.15%
y              6005         1.65%   5881      12.95%
f              4926         1.35%   4385       9.66%
v              3971         1.09%   3884       8.55%
k              3209         0.88%   3091       6.81%
w              3073         0.85%   2997       6.60%
z              1631         0.45%   1555       3.42%
x              1053         0.29%   1046       2.30%
j               727         0.20%    727       1.60%
q               682         0.19%    681       1.50%
(From http://www.edict.com.hk/TextAnalyser/default.htm)

