Basic Cryptanalysis

Posted September 15, 2008 at 6:42 pm in Encryption, Programming

My example is very basic. It is intended as an interesting first step into the complicated, and often impossible, task of deciphering encrypted messages and codes.

The following C++ program accepts character input from the keyboard or via file redirection. It counts every occurrence of each letter and reports the number of times each one was used.

Why would anyone want to do this? Depending on how the original message was encoded, the counts may help to determine which characters in the ciphertext represent which characters in the plaintext. Certain letters and combinations of letters are used much more frequently than others in the English language. The twenty most used words in English are: “the of to in and a for was is that on at he with by be it an as his”. The most used letters in English, in descending order, are: “e t a o i n s r h l d c u m f p g w y b v k x j q z”. The most common first letters of a word, in descending order, are “t o a w b c d s f m r h i y e g l n o u j k”, the most common second letters are “h o e i a u n r t” and the most common third letters are “e s a r n i”.

By using this program to compute letter frequencies and comparing them to known lists such as those above, we can gain some insight into the message and possibly crack it if it was encoded poorly (for example, with a simple substitution cipher).

The program has a few limitations. It treats characters as case-insensitive, although it can easily be modified to be case-sensitive, and it only counts English alphabetic characters (a-z and A-Z). Support for other characters (#, !, etc.) could be added just as easily.

The code for the full program:

#include <iostream>
using namespace std;

// sort() orders both arrays together by descending count.
// print() reports every letter that appeared at least once.
void sort(long int charCount[], char alphabet[], int size);
void print(long int charCount[], char alphabet[], int size);

const int MAX_SIZE = 26;

int main() {

	char alphabet[MAX_SIZE];
	long int charCount[MAX_SIZE];
	char letter;

	// Initialize each alphabet element to a letter and each count to 0.
	for (int i = 0; i < MAX_SIZE; i++)
	{
		alphabet[i] = (char)('a' + i);
		charCount[i] = 0;
	}

	// Prompt user for input cipher and count characters.
	cout << endl;
	cout << "Please enter the ciphertext on the line below.\n";
	cout << "Press Return then CTRL+D when complete." << endl;
	cout << ": ";
	while (cin >> letter)
	{
		// Map the letter to an index from 0 to 25, ignoring case.
		// This assumes a character set such as ASCII, where a-z and
		// A-Z are each contiguous. Anything else is skipped.
		if (letter >= 'a' && letter <= 'z')
			charCount[letter - 'a']++;
		else if (letter >= 'A' && letter <= 'Z')
			charCount[letter - 'A']++;
	}

	cout << endl;

	// Sort counts into descending order.
	sort(charCount, alphabet, MAX_SIZE);

	// Output.
	print(charCount, alphabet, MAX_SIZE);

	return 0;
}

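void sort(long int charCount[], char alphabet[], int size)
{
	// Bubble sort into descending order of count, moving each letter
	// along with its count so the two arrays stay paired.
	for (int c = 1; c < size; c++)
	{
		for (int i = 0; i < size - c; i++)
		{
			if (charCount[i] < charCount[i+1])
			{
				long int tempCount = charCount[i];
				charCount[i] = charCount[i+1];
				charCount[i+1] = tempCount;

				char tempLetter = alphabet[i];
				alphabet[i] = alphabet[i+1];
				alphabet[i+1] = tempLetter;
			}
		}
	}
}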

void print(long int charCount[], char alphabet[], int size)
{
	for (int i = 0; i < size; i++)
	{
		// Only report letters that actually appeared in the input.
		if (charCount[i] >= 1)
		{
			cout << alphabet[i] << " " << charCount[i];
			cout << endl;
		}
	}

	cout << endl;
}
