Intro to Encryption

 
The Basics

The task of simple encryption is to take a string of text and convert it into text that is indecipherable to someone who doesn't know or cannot figure out all of the information that went into the conversion.

"Plain-text" is the message to be sent

"Crypt-text" is the encrypted form of that text

The Algorithm is the basic computation that converts encrypts the plain-text into the crypt-text.  This is frequently well known in the industry.

The "key" (like a password) is the secret information typically known only to the sender and recipient.


Shift Algorithm

Let's try the algorithm that Julius Caesar reputedly used to send secret messages to his commanders in the field, the Caesar Cipher:

Shift each letter of the plain-text 3 characters to right in the alphabet.  So, "a" becomes "d" and "b" becomes "e", etc. 
"a" -> "d"
"b" -> "e"
...
"w" -> "z"
"x" -> "a"
"y" -> "b"

We can do this mathematically by:
1. Finding the position of the character in the alphabet, with "a" at position 0 and "z" at position 25
2. Adding the shift amount to the position in #1
3. Finding the remainder of  #2 (above) divided by 26 (resulting in a number between 0 and 25)
4. Using #3 (above) as the position of the character to use in the crypt-text.

Example 1:
To shift "f" by 3 positions:
1. In the alphabet, "f" is in position 5 (starting at 0 for "a")
2. Add the shift 3 to the position 5 to get 8. 
3. Divide 8 by 26, and get the remainder -> 8
3. Get the character in position 8, which is our crypt-text character: "i"

Example 2:
To shift "y" by 3 positions:
1. In the alphabet "y" is in position 24
2. Add 3 to get 27
3. Divide 27 by 26 and get the remainder 1
4. The character is position 1 is "b"

Note: This technique works correctly for any size shift including very large numbers and also negative numbers.  So it is just as easy to shift to the left by 3 (subtract 3) as it is to the right by 3 (add  3)

.

Shift both lower- and upper-case characters by N positions (positive or negative)

def ShiftByN(plain, N):
   low =  'abcdefghijklmnopqrstuvwxyz'
   high = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'
   crypt = ''
   for c in plain:
      position = low.find(c)
      if position >= 0:  # if c is a lower-case letter
         result = low[(position + N) % 26]
      else:
         position = high.find(c)
         if position >= 0: # if c is upper-case
            result = high[(position + N) % 26]
         else# it's neither so don't modify it
            result = c
      crypt += result
   return crypt


Example:

>>> print(ShiftByN("For Whom the Bell Tolls", 13))

 'Sbe Jubz gur Oryy Gbyyf'

>> print(ShiftByN('Sbe Jubz gur Oryy Gbyyf', -13))

 'For Whom the Bell Tolls' 


Substitution Algorithm

The substitution algorithm provides 2 alphabets: the input alphabet and the output alphabet.

Let's see this in action with the Substitution equivalent of the Shift By 3 algorithm:

The algorithm finds the numerical position of the plain-text letter in the input alphabet, and then uses the letter in the same numerical position in the output alphabet.


input_alpha =  'abcdefghijklmnopqrstuvwxyz'
output_alpha = 'defghijklmnopqrstuvwxyzabc'

def
Substitution(in_alpha, out_alpha, plain):
   crypt = ''
   for c in plain:
      position = in_alpha.find(c)
      if position >= 0:
         crypt += out_alpha[position]
      else:
         crypt += c
   return crypt


Example:
>>> print(Substitution(input_alpha,output_alpha,'abxy'))

  deab
 
More complicated versions of Shift

ShiftByN is easily broken because it depends only on a single number, and because the same patterns appear throughout the crypt-text.  For instance, the single-letter words "I" and "a" will appear have the same converted letters everywhere that the "I" and "a" appeared in the plain-text.

More complicated shift versions could, for instance, shift the first letter of the plain-text by 5, and shift the second letter of the plain-text by 6, and the third by 7, etc.  This would eliminate the patterns created by the ShiftByN because it is now unlikely that "I" appearing in one place in the plain-text would encrpyt to the same letter as "I" appearing somewhere else.

In fact, one can create a arbitrarily complicated function, f(x), such that the plain-text letter in position Q would be shifted by f(Q).  One would simply have to ensure that f(x) had a unique inverse function,  h(y) such that h(f(x)) -> x, so that decryption could occur.
 
More complicated versions of Substitution

If the output alphabet is just shifted by a fixed number of positions relative to the input alphabet, then this encryption is just as weak as ShiftByN.

But there's no restriction on the output alphabet, except that all of its letters (or symbols) must be unique.  For instance, look at these 4 different possibilities of output alphabets, each of which corresponds to the the lower-case normal letter input alphabet:

input_alpha =   'abcdefghijklmnopqrstuvwxyz'

output_alpha1 = 'defghijklmnopqrstuvwxyzabc'

output_alpha2 = 'ABCDEFGHIJKLMNOPQRSTUVWXYZ'

output_alpha3 = 'badcfehgjilknmporqtsvuxwzy'

output_alpha4 = 'pkemwoxfdutrsvljizagbyhcqn'

output_alpha1 is just ShiftBy3

output_alpha2 converts lower-case to upper-case (one easy way to doing this).

output_alpha3 just switches adjacent letters ("ab" -> "ba") by position

But output_alpha4 is interesting. You can't see any pattern in it, and that's because it was specifically created by an algorithm that generates a random permutation of the lower-case alphabet.  And therefore this encryption is much harder to break, and requires more interesting techniques.  For instance, one cannot try all permutations of the lower-case alphabet to see if one converted all of the crypt-text words of a message into plain-text English because there are so many permutations to try.  Even if you had a computer that could try one million different permutations per second, it would take 929 times the current age of the universe to get through all of them.  In other words, you'd need better clues.