It was 1914, and the Dutch had a problem.
A new pan-European war had just broken out, a war that would become known as the Great War for its global scale and the incalculable impact on the lives of its participants. The Netherlands, with a reasonable amount of foresight, decided to opt out, and declared themselves neutral.
Unfortunately, neutrality in war is not a magic charm, and the Netherlands quickly found themselves spending a lot of effort defending their claim. Leaving aside the various accidental attacks — much of the Northern European countryside looks the same to a bomber flying 3 kilometres up in the air — the Netherlands also had to defend against more deliberate intrusions into their sovereign waters.
These problems weren't just limited to the homeland either — far away, in the Dutch East Indies (now known as Indonesia), the Dutch claim of neutrality was also being tested. A Dutch squadron patrolled the archipelago, but to avoid international incidents that could threaten the Netherlands' unaligned status, ships had to radio in to colonial headquarters before each engagement.
To preserve secrecy and speed up communications on what was essentially an open channel, ships had been issued with codebooks — books filled with pages and pages of pre-arranged phrases and words, each assigned a different (ideally secret) code. Unfortunately, this was less than ideal: in peacetime, the secret codes were often far less closely guarded, forcing the Dutch Navy to publish a "secret appendix" of even more secret codes. Moreover, these books were rarely designed to deal with the complexities faced by the squadron in colonies, leaving them floundering and struggling to communicate.
Into this situation stepped Theo van Hengel and R.P.C. Spengler, two Dutch naval officers with a passion for designing torpedoes. In early 1915, the two men started work on an entirely new innovation in cryptography: the rotary cypher.
Here's the problem with cyphers in a nutshell: human writing is very predictable. In a block of English text, for example, the letter "e" is most common, followed by "t" and "a". Letters like "a" and "o" are more common at the start of words, but "s" and "d" often end up at the end. The challenge of encipherment is to create a cypher where, however predictable the input is, the output appears to be completely random.
Consider the demo below. It's a very simple cypher where we take all the letters of a text and rotate them through the alphabet by a fixed amount (the key). They key will be a letter from A to Z — if the key is A, we'll move each character along by one letter, if it's B, two, and so on.
Below each text input is a histogram showing the letter frequencies in the input and output. You can change the input text and the key, and watch how this changes both the output text, and the two histograms.
As you can see, the output is just as predictable as the input. If the input follows the normal rules of English ("e" as the most common letter, "s" at the ends of words, etc.), then the output will follow the same rules, just shifted along in the alphabet a bit.
One solution to this is to use more keys (or rather, have a key made of multiple letters). For example, with three-letter key, we can split the plaintext into chunks of three letters. Each letter in that chunk gets encrypted with a different letter of the key.
In the demo below, you can see this at work. With keys of different lengths, you can see how the distribution changes. With very short keys (one or two letters), the patterns are still mostly recognisable, but with a five- or six-letter key, the pattern is harder to recognise.
That said, harder to recognise does not mean impossible! For an example of this, consider the key ENI. The I means that every third letter in the plaintext has been rotated nine places — "a"s are now "j"s, "b"s are now "k"s, etc.
We know that "e" is the most common letter, and "e" + I is "n". If you look at the cyphertext histogram, you'll see that "n" is probably taller than both of its neighbours. Although using a longer key has helped scramble the letters, there are still some recognisable patterns. If you play around with this demo a bit more, you might spot some other patterns that show up regularly.
- What happens when you have repeated words, or even letters, in the plaintext?
- Given the cyphertext
Spengler and Van Hengel had two key innovations over the simpler cyphers from before. The first was simple, and not entirely new. Rather than use a single, repeating key, they would instead generate the key as the message was being encoded. If you could design that process in such a way, that each new letter of the key looked quite random to the outside observer, but could in fact be determined consistently by both the sender and the receiver, then the message would be almost completely uncrackable.
However, generating such a complicated key by hand would be a slow process, and, given the tense political situation in the Dutch East Indies, speed was very much of the essence. This was the second — and significantly more important — innovation: to provide a machine that would do the generation for you.
The core idea of the machine they came up with is simple: consider a rotor split up into 24 segments. Give each segment an input letter, A to Z, and wire everything up so that when I type "a", it goes to the A segment of the rotor.
Inside the rotor, you then wire the input of each segment to the output of a different segment. For example, the A segment might carry the signal to the S segment. Now when I type "a", it goes into the A segment, and out of the S segment, giving me the result "s".
Here's the trick: now, every time I type a letter, first rotate the rotor by one place. Now, when I type "a", it'll instead go to the "b" segment. And if I type "a" again, the signal will go to the "c" segment, and so on. But because each of the input segments has a different output, the outputs will also be jumbled up.
Every time we type a letter, the rotor moves, which means each letter is encrypted with a different scrambling system. Because there are 26 segments on the rotor, we essentially have the same effect of having a key that's 26 letters long — except now we don't need to go through that 26-letter key manually, because we've got a machine that does that for us.
However, while 26 letters is a start, it's still not impossible to crack if we want to write longer messages. Given the goal of this system was to allow ships in the Dutch East Indies to make complex political decisions with support from headquarters, allowing messages of an arbitrary length was an important goal.
The solution to this, Spengler and Van Hengel found, was to string together a block of four rotors. The output of the first rotor becomes the input of the second rotor, and so on, until the message has been scrambled through all four rotors. In addition, the first rotor would change every keystroke, but the second rotor would change only every 26th keystroke — after the first rotor had completed a full rotation.
With four rotors working together, the total number of different ways that a letter could be scrambled is 264 — or to put it another way, a message that was nothing but the letter A wouldn't repeat itself in 456,976 characters. By comparison, the first Harry Potter book has approximately 360,000 characters.
Citations
- Karl de Leeuw (2003) THE DUTCH INVENTION OF THE ROTOR MACHINE, 1915–1923, Cryptologia, 27:1, 73-94, DOI: 10.1080/0161-110391891775
- https://www.jstor.org/stable/j.ctt46mvb2
- The Codebreakers — David Kahn