The first riddle points to Julia Child’s Mastering the Art of French Cooking, but it wasn’t clear how that helped with the cypher. Numbers ranging from 1 to 26 suggest a simple substitution cypher, but a few of the schemes I tried didn’t seem to work. So, I decided to try brute-forcing it with code.
The general idea behind breaking cyphers is to match symbols to letters based on their frequency. For example, since ‘e’ is the most common letter, the most common symbol should map to it. This only works if there’s enough example cypher text (e.g. a single word wouldn’t work as it may not even have ‘e’ in it). Surprisingly, the amount needed is theoretically small. With 26! possible cyphers, identifying a single one corresponds to about 88 bits of information. The unique patterns of English letters contain roughly 1.5 bits per character, out of a possible 4.7 bits if all letters were equally common. So the theoretical unicity distance, the amount of text needed to pin down a unique cypher, is 88/(4.7-1.5) ≈ 28 characters. This is the theoretical minimum, and in practice no piece of text that short will exactly follow English’s letter frequencies.
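The unicity-distance arithmetic above can be checked in a few lines of Python. This is only a sketch using the figures quoted in the text; the 1.5 bits per character for English is an assumed estimate, and the variable names are made up for illustration.

```python
import math

# Key space: every permutation of the 26-letter alphabet.
key_bits = math.log2(math.factorial(26))

# Assumed figures from the text: about 1.5 bits/char of real English,
# versus log2(26) bits/char if all letters were equally common.
max_bits_per_char = math.log2(26)
english_bits_per_char = 1.5
redundancy = max_bits_per_char - english_bits_per_char

# Unicity distance: key entropy divided by redundancy per character.
unicity_distance = key_bits / redundancy

print(f"key entropy: {key_bits:.1f} bits")
print(f"redundancy: {redundancy:.2f} bits per character")
print(f"unicity distance: {unicity_distance:.1f} characters")
```

The result lands at roughly 88 bits and a little under 28 characters, matching the estimate above.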
The earliest attempt I made was to assign symbols to letters based on their document frequency, but let the measured frequency fluctuate according to Poisson statistics. I could produce a thousand guesses and scan through them in a few minutes, but the approach was ineffective. The model was too weak, and the results were only a step above pure noise. It likely would have worked for longer texts, around 1000 characters or so.
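One way to read the frequency-matching attempt described above is sketched below. It is an interpretation, not the original script: the letter ordering in `ENGLISH_ORDER` is a commonly quoted approximation, and the helper names `poisson` and `noisy_guess` are hypothetical.

```python
import math
import random
from collections import Counter

# Approximate English letters from most to least common (an assumption).
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"

def poisson(lam, rng):
    """Draw a Poisson sample with Knuth's method (fine for small lam)."""
    if lam <= 0:
        return 0
    threshold = math.exp(-lam)
    k, p = 0, 1.0
    while True:
        p *= rng.random()
        if p <= threshold:
            return k
        k += 1

def noisy_guess(symbols, rng):
    """Rank symbols 1..26 by Poisson-perturbed counts, then map the
    most common symbol to 'e', the next to 't', and so on."""
    counts = Counter(symbols)
    # The random fraction (< 1) only breaks ties between equal counts.
    noisy = {s: poisson(counts.get(s, 0), rng) + rng.random()
             for s in range(1, 27)}
    ranked = sorted(noisy, key=noisy.get, reverse=True)
    key = {s: ENGLISH_ORDER[rank] for rank, s in enumerate(ranked)}
    return "".join(key[s] for s in symbols)
```

Calling `noisy_guess(symbols, random.Random())` repeatedly gives a different candidate decoding each time, which is what makes it possible to generate many guesses and scan them by eye.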
The key insight I learned from researching cyphers was to consider letter sequences. Although there are exponentially more bigrams than single letters (monograms), there’s correspondingly more information in each pair, like ‘u’ following ‘q’ almost without fail. (One notable exception is my favorite Wordle guess, ‘waqfs’.) It was relatively simple to compute n-gram frequencies for n up to 5 from a Kaggle news reference database. Since the cypher text lacks spaces, punctuation was ignored in the reference text too. With those probabilities computed, a given substitution can be tried on the cyphered text and its likelihood can be computed. And so the problem boils down to maximizing a likelihood.
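The n-gram scoring described above might look something like the following. This is a minimal sketch under stated assumptions: any string stands in for the reference corpus, everything except the letters a to z is stripped, and unseen n-grams get an arbitrary floor value. The function names are invented for illustration.

```python
import math
import re
from collections import Counter

def ngram_log_probs(reference_text, n):
    """Count n-grams in the reference text (letters only) and
    return their log-probabilities plus a floor for unseen n-grams."""
    letters = re.sub(r"[^a-z]", "", reference_text.lower())
    counts = Counter(letters[i:i + n] for i in range(len(letters) - n + 1))
    total = sum(counts.values())
    log_probs = {gram: math.log(c / total) for gram, c in counts.items()}
    # Assumption: treat an unseen n-gram as if it had appeared 0.01 times.
    floor = math.log(0.01 / total)
    return log_probs, floor

def log_likelihood(text, log_probs, floor, n):
    """Sum the log-probabilities of every n-gram in the candidate text."""
    return sum(log_probs.get(text[i:i + n], floor)
               for i in range(len(text) - n + 1))
```

Working with log-probabilities keeps the score a simple sum and avoids underflow from multiplying many small numbers. A candidate decoding that reads like English should score higher than one that looks like noise.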
To find the best substitution, I used a simple hill-climbing algorithm that started with a random guess and kept any swap of two characters that increased the likelihood score. As amusing as it would have been to apply some machine learning optimizer, the likelihood function isn’t continuous and can’t be differentiated simply. Longer N-grams carry more information, but they come with a tradeoff in computation time. I settled on N=4 as a compromise.
After all this development, and trying it on a few test cyphers, I ran my code on the clue. After 20 minutes it was clear it still didn’t work.
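The hill-climbing search described above can be sketched as follows. This is not the original script: the stopping rule (`max_stale` rejected swaps in a row) and the function names are assumptions, and `score` can be any function that rates a candidate plaintext, such as an n-gram log-likelihood.

```python
import random
import string

def decode(symbols, key):
    """symbols: list of ints 1..26; key[i] is the letter for symbol i + 1."""
    return "".join(key[s - 1] for s in symbols)

def hill_climb(symbols, score, max_stale=2000, rng=None):
    """Start from a random key and keep any two-letter swap that
    raises the score; stop after max_stale rejected swaps in a row."""
    rng = rng or random.Random()
    key = list(string.ascii_lowercase)
    rng.shuffle(key)
    best = score(decode(symbols, key))
    stale = 0
    while stale < max_stale:
        i, j = rng.sample(range(26), 2)
        key[i], key[j] = key[j], key[i]
        candidate = score(decode(symbols, key))
        if candidate > best:
            best = candidate
            stale = 0
        else:
            key[i], key[j] = key[j], key[i]  # undo the swap
            stale += 1
    return key, best
```

Because a climb like this can stall in a local maximum, it is common to restart it from several random keys and keep the best result. Holding some letters fixed, as in the manual clean-up later on, would amount to excluding those positions from the swaps.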
Apparently other people were also struggling with the first clue. A week after it was released, Edwins released an updated version with longer text and “fully visible” numbers. With all the extra numbers I thought my code might actually have a shot.
Sure enough, after about a minute, my terminal spit out:
TOCTARTYOURQUECTWITHTACEANDPLAIR PINDTHECQUARCWNTHAPOODSACTAPPAIR ANAMERENOWNEDINPROVENDELIGHT CTOUPPERCHICTORGNOWINCIGHT
Imperfect, but unquestionably close. I modified the script to hold constant some letters, and with a mix of running and updating obtained:
TO START YOUR QUEST WITH TASE AND FLAIR FIND THE SQUARSWNTH A FOOD PASTAFFAIR ANA BE RENOWNED IN FROM ENDELIGHT STOUFFER’S HISTORG NOW IN SIGHT
The misspellings and elisions are an extra challenge and may have scrambled the shorter version too much. Shannon’s elegant theoretical minimum, ruined by poor spelling. The riddle is a reference to Stouffer’s restaurant, an old Cleveland staple that transitioned to frozen food. It wasn’t obvious what to do with this information, so I crawled through the restaurant’s XML sitemap and found links to Cypher 2 and Cypher 3. The pages were password-protected, of course.
“Stouffer’s” unlocked the Cypher 2 page and another clue. This riddle referenced Anthony Bourdain’s classic article “Don’t Eat Before Reading This”. But, fresh off my first success, I transcribed the digits into my code and got
IN A GRAND STORE WHERE DREAMS WERE REAL FIND THE PLACE WHERE BREAKFAST HAD GREAT APPEAL SILVER SPWNS AND DISHES FINE A GRILL WITH HISTORY SEEK TO JESIGN
A little bit of historical digging informed me of the Higbee’s Silver Grille, the restaurant in Cleveland’s long-departed high street department store. And in under an hour, “The Silver Grille” unlocked Clue #3.
This is where my luck ran out for a bit. My algorithm didn’t put out anything remotely readable, even after trying different N-gram sizes and running for hours. So, I changed tack, and reverse-engineered how the clue related to the first two cyphers. The numbers correspond to the order in which letters are first seen in the referenced text, e.g. if the first word is “The”, then T=1, H=2, E=3, etc. I had tried this manually, before resorting to programming, but stopped early when I got “TOSTRT”. In hindsight, I should have figured that was a promising start. The same scheme solves Cypher 2, so I tried it on Cypher 3 and a Cleveland.com article by Joe Crea about Edwins. There’s some ambiguity over what to count as the beginning of the article, but this was the best I could come up with:
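The first-seen numbering scheme described above is short to express in code. The sketch below uses a made-up sample phrase rather than the actual reference texts, and the function names are hypothetical.

```python
def first_seen_key(reference_text):
    """Number each letter by the order in which it first appears
    in the reference text: first new letter = 1, second = 2, ..."""
    number_of = {}
    for ch in reference_text.lower():
        if "a" <= ch <= "z" and ch not in number_of:
            number_of[ch] = len(number_of) + 1
    # Invert to map number -> letter for decoding.
    return {num: ch for ch, num in number_of.items()}

def decode(numbers, key, unknown="?"):
    """Translate cypher numbers to letters; numbers the reference
    text never reached are shown as the `unknown` marker."""
    return "".join(key.get(n, unknown) for n in numbers)

# Made-up sample text: t=1, h=2, e=3, a=4, r=5, o=6, f=7, c=8, k=9, ...
key = first_seen_key("The art of cooking")
print(decode([8, 6, 6, 9], key))  # prints "cook"
```

The scheme is sensitive to where the reference text is taken to begin, since shifting the starting point by even a word or two changes which letter gets which number.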
historinhildsarr arceaogeosrtnoe aunristraenoarwano poruteaorseoto
With some squinting, it starts with “histori” and mentions “aun ristraen”. Frankly, that information wasn’t surprising given the answers to the last two clues were historic restaurants. I spent a week stuck here, trying different algorithms and substitutions. The other difficulty was that I didn’t see a Cypher 4 web page. After reexamining the XML sitemap, I noticed two URLs named “/cypher-2a” and “/cypher-3a”. And “/cypher-2a” linked to the page for Cypher 3! In turn, “/cypher-3a” linked to “/cypher-final”, with a brand new password box.
Now at least I had a place to put the answer, even if I didn’t have an answer. And so I resorted to a less-dignified version of brute-force. I put together a list of closed historic restaurants in Cleveland from historical websites and by asking ChatGPT (possibly the first time it’s truly been useful), and then I started guessing by hand. To my surprise, it only took a few minutes to pick the right one: Sokolowski’s, the Polish restaurant that lasted a century.
The final challenge, and no more cyphers. I queried my family for suggestions about mills, and did some digging by myself, and settled on the Shaker Gristmill as the top candidate. Armed with a shovel, I set out with my sister in the 85 F heat to the local park where the gristmill once stood.
Near the plaque, there was a hollow tree stump with a small shrub growing in it. We carefully removed the plant and felt like trespassers while we dug out a foot of earth. Finally, the shovel hit something hard, and we carefully excavated a large plastic box. We had found the wine!
This adventure was one of the highlights of my summer, and I’m grateful to Edwins. I hope they were pleased with it too. They announced on their Instagram that the hunt had come to an end and admitted we’d done it at least twice as fast as they expected. The wine, for now, is safe in a cellar. I’ll open one bottle once I land a job. The other one, who knows? Maybe I’ll return the favor and bury it someday.