Cryptograms - source code


This is a resource of the wordgame-programmers@egroups.com mailing list.




Cryptograms, as published in newspapers and puzzle books, use a one-to-one substitution cypher, with the restriction that no letter may stand for itself. Although many web pages on the net claim to "help you solve" these, most are merely a substitute for pen and paper and offer no active guidance.
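To make the definition concrete, here is a minimal Python sketch of how such a puzzle could be generated. It is only an illustration, not code from this archive; the names make_key, encrypt and decrypt are made up for the example. The key is a random permutation of the alphabet, reshuffled until no letter maps to itself.

```python
import random
import string

ALPHABET = string.ascii_uppercase

def make_key(rng=random):
    """Return a plain->cipher mapping that is 1:1 and has no letter
    standing for itself (hypothetical helper, for illustration only)."""
    letters = list(ALPHABET)
    while True:
        shuffled = letters[:]
        rng.shuffle(shuffled)
        # Reject any shuffle in which some letter keeps its own place.
        if all(p != c for p, c in zip(letters, shuffled)):
            return dict(zip(letters, shuffled))

def encrypt(text, key):
    # Letters are substituted; spaces and punctuation pass through.
    return "".join(key.get(ch, ch) for ch in text.upper())

def decrypt(text, key):
    inverse = {c: p for p, c in key.items()}
    return "".join(inverse.get(ch, ch) for ch in text.upper())
```

Roughly one random shuffle in three passes the "no letter stands for itself" test, so the retry loop normally finishes after a few attempts.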

If you've never tackled a crypto quote puzzle, read this introduction page or this short tutorial first.

When you're ready to try one, here is our own 'Crypto-quote of the day'. Sometimes there may be a short delay as the server fetches a new quotation from another net site.

We concern ourselves here with code that solves these cyphers for you. I had read about programs on the net that solve them using genetic algorithms, and recently I did indeed find one, though it turns out the traditional methods work pretty well too. I include here my own attempt, which uses letter patterns and brute-force searching. When it works, it is remarkably fast; when it doesn't, it's awful. This was experimental code and can't be considered working, though with a little hand-holding it has solved many puzzles. Karl Dunn's code below, based on roughly the same principles, does work, and works very well.
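For readers who have not met the letter-pattern idea, the following Python sketch shows the general principle. It is not my code or Karl Dunn's, just a small illustration under stated assumptions: the names pattern and solve are invented here, the dictionary is a plain list of upper-case words, and every cypher word is assumed to be in that dictionary. Each cypher word can only decode to dictionary words with the same pattern of repeated letters, and a depth-first search tries those candidates while keeping the letter mapping consistent.

```python
def pattern(word):
    """Canonical repeat pattern of a word, e.g. 'GOOD' -> (0, 1, 1, 2)."""
    seen = {}
    return tuple(seen.setdefault(ch, len(seen)) for ch in word)

def solve(cipher_words, dictionary):
    """Yield every cipher->plain mapping consistent with the dictionary."""
    by_pattern = {}
    for w in dictionary:
        by_pattern.setdefault(pattern(w), []).append(w)
    # Try the most constrained words (fewest candidates) first.
    order = sorted(set(cipher_words),
                   key=lambda w: len(by_pattern.get(pattern(w), [])))

    def extend(mapping, used, cword, pword):
        new_map, new_used = dict(mapping), set(used)
        for c, p in zip(cword, pword):
            if c == p:
                return None          # no letter may stand for itself
            if c in new_map:
                if new_map[c] != p:
                    return None      # clashes with an earlier choice
            elif p in new_used:
                return None          # plain letter already taken (1:1)
            else:
                new_map[c] = p
                new_used.add(p)
        return new_map, new_used

    def search(i, mapping, used):
        if i == len(order):
            yield mapping
            return
        cword = order[i]
        for pword in by_pattern.get(pattern(cword), []):
            step = extend(mapping, used, cword, pword)
            if step is not None:
                yield from search(i + 1, *step)

    return search(0, {}, set())
```

Because solve is a generator, it can return several valid decodes for the same text, which matters when more than one mapping fits the dictionary.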

Interestingly, for some cryptoquotes Karl's code produces a valid decode that is not the intended one. That is obvious to a human, but not to a program. For example:

maps to

It is clearly meant to be "GOOD THING" and "PRESS CHARGES".
A much tougher example of text that can be decoded in two completely different ways is a Dual Cryptogram.

By the way, the best the genetic algorithm could do with the above, after expending a lot more CPU than the classic tree search algorithm, was

A nice improvement to such code would be to generate multiple solutions rather than just one, and use some sort of parser, or a word frequency table, to pick the most likely one.

Other interesting developments are possible. When Karl's code produces a partial solution, it could be fed to the G.A. to improve it. (Version 1.2 of the G.A. code specifically has a feature to accept already-decoded words.) Or, when neither can fully decode the text, do an exhaustive search that methodically exchanges pairs of letters, keeping any exchange that improves the solution, until no more improvements can be made. This is not quite what the G.A. does, because the G.A. swaps random pairs, whereas this tries all 26x26.

It is a bad algorithm at the start of the problem, since the cost grows exponentially with depth if you apply it recursively, but it may not be too expensive on a mostly-solved example that just needs 3 or 4 letters to fall into place, i.e. going only two (26x26 * 26x26) or three (26x26 * 26x26 * 26x26) levels deep. Whatever can be done in a second or two on a modern computer.
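The pair-exchange idea can be sketched in a few lines of Python. This is only a rough illustration with assumptions of my own: the names decode, score and improve_by_swaps are hypothetical, the key maps every cypher letter to a plain letter, and the score is simply the number of decoded words found in a word set (a word frequency table could be substituted). It is the single-level version, repeated until no exchange helps, and it visits each unordered pair once (325 pairs) rather than all 26x26 ordered combinations.

```python
from itertools import combinations
import string

def decode(ciphertext, key):
    return "".join(key.get(ch, ch) for ch in ciphertext)

def score(plaintext, words):
    # Crude fitness: how many decoded words are in the word set.
    return sum(1 for w in plaintext.split() if w in words)

def improve_by_swaps(ciphertext, key, words):
    """Exchange the plain letters of every pair of cipher letters,
    keeping an exchange only if it raises the score."""
    key = dict(key)
    best = score(decode(ciphertext, key), words)
    improved = True
    while improved:
        improved = False
        for a, b in combinations(string.ascii_uppercase, 2):
            if key[a] == b or key[b] == a:
                continue  # exchange would make a letter stand for itself
            key[a], key[b] = key[b], key[a]
            s = score(decode(ciphertext, key), words)
            if s > best:
                best, improved = s, True
            else:
                key[a], key[b] = key[b], key[a]  # undo the exchange
    return key, best
```

Going two or three levels deep, as suggested above, would mean trying a second or third exchange before scoring rather than undoing each one immediately.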

The example solutions here are an excellent demonstration that a problem can be tackled in more than one way: a third solution below, from Robert Muth, relies on trigram frequencies to do the decryption.
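As a rough idea of what scoring by trigram frequencies can look like, here is a short Python sketch. It is not Robert Muth's code; the function names and the tiny floor value for unseen trigrams are assumptions made for the illustration. Trigram counts are gathered from some sample English text, and a candidate decode is scored by summing the log-probabilities of its trigrams, so that text resembling English scores higher than gibberish.

```python
import math
from collections import Counter

def trigram_counts(corpus):
    """Count overlapping three-character sequences (letters and spaces)."""
    text = "".join(ch for ch in corpus.upper() if ch.isalpha() or ch == " ")
    return Counter(text[i:i + 3] for i in range(len(text) - 2))

def trigram_score(candidate, counts):
    """Sum of log-probabilities of the candidate's trigrams.
    Unseen trigrams get a small floor probability (an arbitrary choice)."""
    total = sum(counts.values())
    floor = math.log(0.01 / total)
    text = candidate.upper()
    return sum(
        math.log(counts[t] / total) if counts[t] else floor
        for t in (text[i:i + 3] for i in range(len(text) - 2))
    )
```

A solver built on this would search for the key whose decode maximises trigram_score; a real table would need to be built from far more text than a sentence or two.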

Finally, for the standard 'crypto quote' puzzles (and I emphasise the 'quote' part of the name), how about a specialist dictionary of quotable people's names that can be used for the attribution part, separate from the main text?



Vaguely related to this field are cryptarithms and phone number words.
