HSCTF 2015: Keith the Comedian Writeup

Easy classical cryptograghy using known plaintext and small block sizes to break a simple Hill cipher. Required knowledge of linear algebra and modular arithmatic.

The problem was given as:

During the 2050s comedy got really really weird. Keith decided to enter into the surrealist comedy scene with his own joke. As was the norm for the time Keith made a ‘Knock, knock!’ joke but encoded it. We have no idea how he encrypted it but we think the joke has something to do with some sort of Hill. Here’s the encrypted joke: 1 232 92 231 59 226 115 31 51 68 53 51 147 218 52 234 17 234 54 144 241 13 49 149 155 247 168 88 22 177 20 238 205 3 86 155 182 240 18 211 121 35 215 3 123 110 108 157 109 229 21 9 166 254 119 238 173 44 142 121 3 86 155 95 42 227 107 102 67 35 193 187 118 112 207 87 254 87 113 131 54 71 243 7 180 100 84 108 100 25 199 5 103 15 170 114 159 203 136 132 155 43 83 66 237 217 2 159 126 60 27 101 181 69 142 45 199 242 180 99 111 158 219 6 174 230 91 127 4 61 151 158 180 100 84 135 222 232 223 81 56 85 63 57 246 92 119 237 196 252 158 143 105 150 131 173 183 226 102 141 240 199 63 82 233

The Cipher

Readinging through there are a few hints that it is a Hill cipher, the biggest one being that it says “some sort of Hill” with a captial ‘H’. This is strange because historically the Hill cipher uses for its alphabet conversion, but this problem has numbers in the hundreds, meaning that it is probably using ascii values to translate the letters (Note: Lester Hill, when inventing the cipher, just mixed up the letters; he did not encode them using this method). To account for this we will have to check a couple of modulus values because we can’t be certain whether it’s mod or . The problem also gave us the beginning of the plaintext message: “Knock, knock!” This allows us to perform a known plaintext attack on the cipher.

Hill ciphers work by using an invertible encryption matrix, , and multiplying it by a plaintext matrix . Decryption is done by taking the encrypted message and multiplying it by , the inverse of the encryption matrix. Both the encryption and decryption occur over a modulus of the max alphabet value.

Since we have known plaintext, as long as we can figure out the modulus and the block size we can calculate the encryption key by constructing an invertible matrix, which can be multiplied by the ciphertext to give the encryption key. The decryption key is the inverse of the encryption key which can then be used to decrypt the rest of the message.

Implementing the Attack

The block size is needed in order to encrypt or decrypt with the Hill cipher, and we know it has to be a factor of the number of characters in our ciphertext. So our possible block size values become the factors of 165 (disregarding 1):

We also know that because the size of is a perfect square, in order for us to be able to perform this attack we must have at least letters of known plaintext to work with.

This means that in order for us to succeed with this attack a block size of must have been used.

Next up is constructing an invertible matrix out of our known plaintext to use to solve for and ultimately . If the plaintext in order doesn’t yield such a matrix, we can pick and pull different parts of the plaintext to construct such a matrix.

Since we are working with a matrix, our message of “Knock, knock!” will be shortened to one block length: “Knock, kn”

This becomes:

Testing if it’s invertible by checking whether its determinant is a coprime to our modulus:

So we won’t need to rearrange our plaintext. Good. Now we can solve this equation using a mod of 256:

Solving for E and D:

Now we can test this solution by multiplying C and D. In order to easily test different modulus values, I wrote a script in Python to do calculate D. It uses NumPy to handle most of the matrices and a few functions from a StackOverflow post that allow me to calculate the inverse mod a matrix.

import numpy
from fractions import gcd


## https://stackoverflow.com/questions/4287721/easiest-way-to-perform-modular-matrix-inversion-with-python 
# Finds the inverse of matrix A mod p
def modMatInv(A,p):       
  n=len(A)
  A=numpy.matrix(A)
  adj=numpy.zeros(shape=(n,n))
  for i in range(0,n):
    for j in range(0,n):
      adj[i][j]=((-1)**(i+j)*int(round(numpy.linalg.det(minor(A,j,i)))))%p
  return (modInv(int(round(numpy.linalg.det(A))),p)*adj)%p
# Finds the inverse of a mod p, if it exists
def modInv(a,p):          
  for i in range(1,p):
    if (i*a)%p==1:
      return i
  raise ValueError(str(a)+" has no inverse mod "+str(p))
# Return matrix A with the ith row and jth column deleted
def minor(A,i,j):    
  A=numpy.array(A)
  minor=numpy.zeros(shape=(len(A)-1,len(A)-1))
  p=0
  for s in range(0,len(minor)):
    if p==i:
      p=p+1
    q=0
    for t in range(0,len(minor)):
      if q==j:
        q=q+1
      minor[s][t]=A[p][q]
      q=q+1
    p=p+1
  return minor


#Function to convert matrix to a message
def getMsg(m):
    string = ''
    for i in range (0, m.size):
        string += chr(int(m.T.item(i)))
    return string

#Declare and define the cipher message
Cmessage = numpy.matrix([[1,231,115,68,147,234,54,13,155,88,20,3,182,211,215,110,109,9,119,44,3,95,107,35,118,87,113,71,180,108,199,15,159,132,83,217,126,101,142,242,111,6,91,61,180,135,223,85,246,237,158,150,183,141,63],
                         [232,59,31,53,218,17,144,49,247,22,238,86,240,121,3,108,229,166,238,142,86,42,102,193,112,254,131,243,100,100,5,170,203,155,66,2,60,181,45,180,158,174,127,151,100,222,81,63,92,196,143,131,226,240,82],
                         [ 92,226,51,51,52,234,241,149,168,177,205,155,18,35,123,157,21,254,173,121,155,227,67,187,207,87,54,7,84,25,103,114,136,43,237,159,27,69,199,99,219,230,4,158,84,232,56,57,119,252,105,173,102,199,233]], dtype=numpy.object)


#Cipher and plaintext, at even 3x3 block. Knock, kn
cipher = numpy.matrix([[1,231,115], [232,59,31], [92,226,51]] ,dtype=numpy.object)
plain = numpy.matrix([[75,99,32],[110,107,107],[111,44,110]], dtype=numpy.object)

#The mod to test
mod = 256

#Calculate Determinent, check if it's coprime to the mod (Invertible)
det = numpy.linalg.det(plain)
print "Determinent: " + str(det) 
print "Coprime: " + str(gcd(int(det), mod) == 1)

#Calculate the Decryption Key
D = modMatInv(numpy.dot(cipher,modMatInv(plain, mod)) % mod, mod)

#Print the Result
print "\nDecryption Key: \n------------\n" + str(D)
print "\nDecryption: \n------------\n" + str(getMsg((D * Cmessage) % mod))

Running this with a mod of 256 and our message is revealed as:

Knock, knock! Who's there? An Ant Hill. I don't have an Aunt Hill! Hahahah (Humor of the 2050s was weird). Well anyway, the flag is: that_aunt_hill_joke_was_horrible

So our final flag is:

that_aunt_hill_joke_was_horrible