Four puzzles, a hundred testsolves, and... many, many hours later

A bunch of other Pals have written about the Hunt, which you might be interested in reading:

Also, it's inevitable that I've forgotten someone in this, so here's a link to the credits.


Last year I wrote a little about writing puzzles for the MIT Mystery Hunt. In that post, I was a little circumspect, but now that it's over, I can probably be a bit more detailed about what went on.

(I also included a brief primer on the Mystery Hunt, as well as puzzle hunts a bit more generally in that post – which I think is mostly accurate – so feel free to head over there and come back if you need to. I'll be here.)

The search for solutions
Creating problems can be fun

There'll be a spoilery section in this post, towards the end, but for now it's all spoiler-free.

⚠️
You'll see another notice like this one when the spoilery section is about to begin, as well as a really obvious header before that

Puzzle hunts and why anyone would subject themselves to them

I think one of the questions I've been asked when I explain puzzle hunts to people is why. Because from the outside, solving puzzles sounds fun. Solving multiple puzzles, even that sounds fun.

Solving hundreds of them? Why?!

Part of it is probably being the kind of person who generally enjoys puzzles, but this doesn't quite explain why people want to do the same puzzles - it's not quite problem solving, because not all problems in the real world have a solution that you can work out, or reasonably apply.

A huge part of it is, as people say, just working on a thing with people, trying to solve a puzzle. You get a nice little rush when you spot the thing that leads to the puzzle falling into place. Of course, this is sometimes preceded by a fair amount of consternation, frustration, and general anger.

There's something ineffable about it; doing things, with people, finding patterns in chaos, and acts that feel like you're arranging the world just a little more.

Puzzle hunts and me

I got into puzzle hunts through a friend - I guess they figured that I'd be into these things, which I am. I don't know that I'm a specialist in any areas - I don't knit, or crochet, nor am I great at cryptic crosswords.

But I can code decently in a pinch, and because I'm the kind of person that knows a little bit about everything, I think I'm quite good at nudging things in the right direction.

I started with the group doing Australian hunts (SUMS, MUMS, and Mezzacotta that one time), and then we tried to do the MIT Mystery Hunt as a group of 12 or so, which is probably impossible (not sure - likely never been completed by such a small team?). We glommed onto a US-based team, and then a few years later, that team didn't win, but the winning team, Palindrome, offered to let us help them run/write the hunt.

To run or not to run?

I guess I had a bit of a tough choice. I'd been solving with this team for three years by then, and I think I'd become attached. I was always leaning towards helping out though, because I've probably had weird puzzle ideas floating around in my head since I started doing puzzle hunts. I also really do enjoy putting things together - I'd been part of helping university revues come together back in the day, too.

I considered a fairly comprehensive list of pros and cons, and I won't rehash them here; essentially, what it came down to for me was that I'm not likely to have many chances at helping contribute to the MIT hunt again, while I'll probably have plenty of chances to solve hunts in the future, and that I'd like to give back to the community! Also, I really, really wanted to write at least one puzzle that made it into the Hunt (spoiler: I did).

Infrastructure

We had a Discord server set up for the writing team, and early on, it was a flurry of ideas popping up left, right and centre. This was all prior to the theme being selected, and the round structure being decided.

For document storage and collaboration, we used Google Drive, which is probably the norm for many hunt teams. It's great, but early on we had a couple of hitches: the folders and their contents were visible (and editable!) to anyone with the link, and people didn't always put their documents in the folder.

The folders being editable to anyone with the link was an issue – even though I trust everyone on the team implicitly, it's not about people intentionally leaking puzzles (which no one would do, and likely no team would knowingly exploit). It's about people accidentally dropping the wrong link in the wrong place, and then someone being spoiled on a puzzle without realising it until later. The chances are slim, of course, but why take it at all? You don't have to. This was fixed at some point, which comforted me.

As for people not putting documents in the folders, this was mainly frustrating when testsolving – the horde of Anonymous Animals made it hard to tell who was doing what. Again, this eventually – mostly – got fixed as well.

Given the chance to do it again, I think I'd probably have tighter Discord integration – allowing people to sign up for testsolves from Discord itself, for instance; creating events and threads for testsolves in Discord, etc.

A big mistake I made – and fortunately one that we were able to recover from – was overestimating myself and thinking I had the skills to stand up the Hunt site itself. As it turns out, I've still got things to learn, and fortunately, we were able to get everything done, albeit with less time to spare than we would've had if I'd admitted this to myself (and others) sooner. I'm glad we managed to get things running in time, and happy with how it went on the day, but sorry that I added undue stress to our team by waiting longer than I should've to admit that I needed help.

ℹ️
Hours after I shared this post, it became inaccessible, yet more proof that it was good that I wasn't primarily responsible for the Hunt site 😅

As for Hunt infrastructure itself, I spent most of the Hunt making quality of life improvements to the HQ backend, while my more capable tech-Pals were fixing actual user-facing problems.

However, there is this funny story about how we implemented per-user rate-limiting on certain interactive puzzles using IP address, but of course our application sits behind a reverse proxy, so we were reading the wrong IP address (and bucketing every user in the one quota). We eventually remediated this, but not before a few people had run into it – it was compounded by the fact that not everyone had access to logging, so we couldn't immediately identify the issue that was occurring. In any case, apologies for that, if you encountered it.
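To make that failure mode concrete, here's a minimal sketch (not our actual Hunt code) of choosing which address to rate-limit on when the app sits behind a reverse proxy. It assumes the proxy appends to the X-Forwarded-For header; the TRUSTED_PROXIES ranges and the client_ip helper are hypothetical names made up for illustration.

```python
import ipaddress
from typing import Optional

# Hypothetical: the networks our own reverse proxies live in.
TRUSTED_PROXIES = [
    ipaddress.ip_network("10.0.0.0/8"),
    ipaddress.ip_network("127.0.0.0/8"),
]


def _parse(addr: str):
    """Return an ip_address object, or None if addr isn't a valid IP."""
    try:
        return ipaddress.ip_address(addr.strip())
    except ValueError:
        return None


def _is_trusted(ip) -> bool:
    return ip is not None and any(ip in net for net in TRUSTED_PROXIES)


def client_ip(remote_addr: str, forwarded_for: Optional[str]) -> str:
    """Pick the address to use as the rate-limit bucket key."""
    # Only believe the header if the direct peer is one of our proxies;
    # otherwise anyone could spoof it.
    if not forwarded_for or not _is_trusted(_parse(remote_addr)):
        return remote_addr
    # Walk from the hop closest to us back towards the client,
    # skipping our own proxies.
    for hop in reversed(forwarded_for.split(",")):
        ip = _parse(hop)
        if ip is None:
            break  # malformed entry: stop trusting the header
        if not _is_trusted(ip):
            return str(ip)
    return remote_addr
```

Keying the quota on remote_addr alone is what lumps everyone together, since every request appears to come from the proxy. Reading the rightmost untrusted hop (rather than the leftmost entry) is a deliberately cautious choice here, because the left-hand side of the header is whatever the client chose to send.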

(Again, given another chance, I'd probably be stricter with backoff/retry logic in our interactive puzzles, so that if nothing else users wouldn't see error codes or unresponsive puzzles if we incorrectly configure rate-limiting. More visible logging of 429 errors would also have alerted us to this issue sooner. Also, better alerting on non-2xx/4xx statuses, so we can quickly identify puzzles that raise unexpected errors.)
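As a rough idea of what "stricter backoff/retry logic" could look like, here's a small sketch of capped exponential backoff with jitter. It's an illustration under assumptions, not what our puzzles did: send is a hypothetical callable that returns (status, retry_after_seconds_or_None, body), and the retry counts and delays are arbitrary.

```python
import random
import time


def backoff_delays(retries, base=0.5, cap=8.0):
    """Yield capped exponential delays: base, 2*base, 4*base, ... up to cap."""
    for attempt in range(retries):
        yield min(cap, base * (2 ** attempt))


def call_with_retry(send, retries=4, base=0.5, cap=8.0,
                    sleep=time.sleep, rng=random.random):
    """Call send() again while it reports 429 or a 5xx, up to `retries` extra times.

    `send` is hypothetical: it returns (status, retry_after_or_None, body).
    `sleep` and `rng` are injectable so the logic can be tested without waiting.
    """
    status, retry_after, body = send()
    for delay in backoff_delays(retries, base, cap):
        if status != 429 and status < 500:
            break  # success, or an error that retrying won't fix
        # Honour the server's Retry-After if it sent one; otherwise wait a
        # random fraction of the exponential delay ("full jitter").
        wait = retry_after if retry_after is not None else delay * rng()
        sleep(wait)
        status, retry_after, body = send()
    return status, body
```

The jitter matters when many solvers hit the same limit at once: without it, everyone retries in lockstep and trips the limiter again. The caller still gets the final status back, so the puzzle can show a friendly "try again shortly" rather than a raw error code.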

Testsolving

Basically, when you write a puzzle, you want people to test it to make sure it works, and is reasonably fair. Fortunately, we had a large number of people ready and willing to test puzzles.

I tested a lot of puzzles over the year. As I previously said, testsolving is fun. I'd say that over the last year, I've helped solve more puzzles – and at the very least, puzzle hunt puzzles – than I had in my life up until then, by a wide margin.

I've explained our process to people a couple times, and it seems to differ from the norm, at least insofar as we had authors and editors present for testsolves. There were a small number of puzzles where testing was asynchronous, for reasons peculiar to them, but by and large, we would sign up to a testsolve, then the author and/or editor would sit in the same voice channel as us, muted, while we solved.

As an author, it was immensely useful to be present in these testsolves. I find it difficult to imagine how testers would provide the richness and detail of information that we got just from being there listening to them work through the possibilities, solve the puzzles, and even get stuck. Asynchronous testing might work, but this worked at least as well – or better.

On the topic of getting stuck, though, being there for testsolve sessions meant that we were able to nudge solvers along, and eventually – in most cases – get them across the line, even if it meant that we needed to revise our puzzle so that the next group would be able to solve it without hints. This ability to nudge people along, though, I think contributed to testsolving generally being enjoyable. We wouldn't get stuck spinning our wheels on nothing.

And, on the topic of asynchronous testing – it took us a while to settle into a good rhythm for testing. In the early days, we were putting testsolves all over the place, but eventually we found that they worked best between around 3pm and 11pm Eastern Time. For us Australians, this meant around 7am to 3pm, which wasn't too bad. I was pretty enthusiastic about asynchronous testing, but that never gained traction. I also set up a standing testsolve session for three of us Aussies, which got some usage, but not a lot.

And hey, our puzzles seemed to turn out alright, so the way we did things was successful!

As an actual testsolver, I don't think I punch particularly hard; I've never met a spreadsheet I couldn't bend to my will, but in terms of other puzzl-y skills, I'm probably only middling. Fortunately, we did almost all of, if not actually all of, our testsolving in groups, so I was able to both lean on and support the rest of the group, in most cases. Every now and then I did do some heavy lifting (probably? Maybe not.) but by and large, my contributions were shuttling data around in spreadsheets, indexing into things, and making observations about things at random to hopefully trigger an aha.

2021 was a great year to be testsolving puzzles. I was working from home, and my work afforded me the flexibility to be able to make many testsolve sessions that I otherwise wouldn't have been able to if I'd been in an office – not least because I'd be talking nonsense on a voice chat, to passersby.

Enhancing Puzzup

Early in the year, we forked Puzzlord, which is... well, it's a bit of a mishmash of tools, but I suppose it's best described as puzzle writing workflow management. (The code for Puzzup is now available.)

Wanting to do something helpful, I got stuck into seeing how we could implement some Discord smarts into it, since the less you need people to move between different tools, the happier they are (probably, right?).

Eventually, we settled on giving each puzzle its own channel, and each person access to puzzle channels that they were spoiled on. We also ended up sorting puzzle channels by status, and in hindsight, perhaps this wasn't the right way to go - perhaps by round, then ordered by status?

In any case, could we have just used a Discord library to do this? Sure, but that would've been making life way too easy on myself.

I hacked together a Discord integration that did what we needed it to do, as well as a couple of slash commands (/up commands to create, link and get info on puzzles in Puzzup for the channel), and then that worked... for a while.

The first problem we encountered, and which Dan LePage largely took the lead on fixing, was that Discord's API really assumes that you have an async worker, and so it would occasionally error out when we were trying to update channel titles, or membership, or other things that interacted with it.

The other problem, which was less a tech issue and more a people issue, is that we kept running into the Discord channel limit (it's 500, at the moment, as it was then). Throughout the year - probably three or four times - we had to go through puzzle and puzzle idea channels to ask people whether they wanted to progress it, or whether we could delete the channel (asking them to save anything important from them first).

I suspect this was my single largest contribution to Puzzup. Another thing that I did was to add support requests to Puzzup, based on Dan's spec. Essentially, puzzles can have art, tech or accessibility requests attached to them, and people with those appropriate roles will get emailed when authors create or update notes on them, and vice versa. It's pretty neat, and though we didn't do anything too sophisticated with them, I hope it made life easier for some people. There could still be room to add more to it - comments, for example.

Oh, another thing I did was add a UI for adding hints to puzzles, which was… possibly present in Puzzlord code, but unimplemented? The details escape me, but in any case, we didn't have a way to add hints at some point, and then we did. Other than that, I also implemented other credits in Puzzup, which is why on our pages involving art or significant tech, you'll find those Pals listed beneath the authors of the puzzle. I also made it possible to add freeform credits, of which you might see a few.

Statistics

I'm a bit of a data viz nerd. I'm not great at it, but I really enjoy it. I put some stats together on the Hunt after the fact – spoilers are within. I also got to dust off my SQL skills, which was enjoyable.

Statistics | MIT Mystery Hunt 2022
The MIT Mystery Hunt Statistics page

In retrospect

This would usually go at the end of a post like this, but I also want a section to just talk about puzzles without needing to spoiler tag things, so it's here instead.

I'm so glad I decided to join the writing team for this year, and so glad to have met so many new friends in Palindrome. I felt welcomed from the very start, along with all my wacky ideas and sometimes indelicate criticism.

There was so much focus on such beautiful art for the Hunt, as well as inclusion and accessibility. If you haven't seen the site yet, you should! It'll be at https://bookspace.world for a while, and then it'll probably get moved to the MIT hunt archives. (Update: it's here.)

As someone said recently, in hindsight it was a crazy risk to join a group of people, most of whom I'd never met or done anything with at all, and volunteer for a year-long project.

There were challenging times, such as when I definitely overreached in responsibility and had to ask for help, and when I thought one of my puzzles might not make it through.

But the good times – writing puzzles, testing them, and Hunt weekend itself, when I got to patch/upgrade the HQ backend and watch so many Hayden + Rotch/Barker interactions with teams – made up for them a thousandfold.

A screenshot of the Whoston page - a framed, stylised drawing of me from the shoulders up, and the bottom right corner of another framed picture.
I also make a small cameo on the Whoston page!

I'm incredibly thankful for the past year I've had – to all the team who joined Palindrome with me, I can honestly say that coming over alone to write Hunt (as I was willing to do at first) would not have been the same, or as enjoyable.

Editors and everyone else who humored my wacky puzzle ideas and even helped some of them come to life, thank you!

And to all the Pals I've made along the way, thanks for everything. The year was truly amazing, and I'm glad I came over to write with you all.

Writing puzzles

Spoilers soon™!

I mentioned earlier that I wanted to write one puzzle that made it into the Hunt, and I managed to write four, including three where I had sole author credits!

There's something enormously satisfying about taking a puzzle from idea, to seeing people test it, to seeing it released into the wild. This makes me so incredibly happy, and reading wonderful feedback about the puzzles warmed my heart so much. To everyone who tested, and those who solved my puzzles, thank you, so much.

Oh, spoilers are about to start, I think. I did show off the spoiler tags earlier in this post, but spoilers from here on may be unmarked. I also have author's notes on all these puzzles, so refer to those for extra context.

Puzzles that will be spoiled are, in order:

  • Act 3: Sci-Ficisco, puzzle: Cheers
  • Act 3: Noirleans, puzzle: Please Prove You Are Human
  • Act 2: The Ministry, puzzle: The Talking Tree
  • Act 3: New You City, puzzle: Tech Support
⚠️
Unmarked spoilers are present below

Cheers (Sci-Ficisco)

Edited by Kah Kien Ong

A pair of glasses touching, as in a toast

Cheers was the first puzzle I wrote, and one I co-wrote with Renee Ngan.

Early on, I'd had the idea to use the quirks of Aussie language somehow, because I have a little bit of a chip on my shoulder about how US-centric puzzle hunts can be (even if justifiably so).

I think a lot of the original ideas I'd had were around things like how people in different areas call things different things, and that remained in the puzzle to the end, though it evolved a bit over time. At one point, we were going to make solvers determine which denomination of currency each person had (because they only had the one), as well as references to state mammals, birds and fish.

Ultimately we couldn't make that version of the puzzle work, and I'm glad we eventually had the idea to just use Australia's Big Things instead.

So, doing what I usually do (see above re:spreadsheets), I just grabbed all the tables from this Wikipedia article, then unioned, filtered and sorted to get us the items we needed. Easy, right?

The rest of it was fairly normal for constructing a logic puzzle, I understand - adding constraints until you have a thing that looks like a puzzle, which Renee did most of the heavy lifting on.

We needed to exclude the territories, though, because they just... didn't have interesting beer sizes going on.

Our first testsolve for this puzzle was brilliant! People loved the We Love Our Lamb clue, though there was a rather irritating red herring they followed for a bit, which was because of this Lifehacker article (which you should by no means pay attention to).

The second group was perhaps less fortunate – or more. They did have an Aussie in it, but that might've set them back. You know how sometimes, you know things well enough that you don't question that knowledge? Well, this might've been the case in this group: although they still finished in a good amount of time, there was a little bit of a hitch when they identified the four sizes as middie, schooner, pint, and jug.

This went through a handful of testsolves, but all in all, was pretty smooth! Oh, and the text was presented upside down, so we were secretly hoping for a few incorrect answer submissions of RABRETSYO, but sadly that didn't happen.

Please Prove You Are Human (Noirleans)

Edited by Ben Smith

A picture of a robot, opening up a door on its front and pointing at its heart-shaped heart within
Isn't the robot adorable?

This was my first solo Hunt puzzle, but it was so much fun to make! It started off as just the image puzzle that you might've encountered, but with fewer transformations.

That… was solved much too quickly. Much too quickly. I think we had a six-minute solve at one point, which is nuts for an act 3 puzzle.

The audio layer was added after that testsolve, which turned out to be lots of fun. Initially, I used some noisy, computer-generated readings of the phonetic alphabet, but this turned out to be too much of a distraction – solvers were trying to solve the noise voices, instead of just using them. The actual voices are brought to you by Ben Smith, who was my editor on this puzzle, and whose guidance was invaluable.

We collected Telemetry (i.e. I watched a few Hunt teams try to solve this puzzle) and will iterate on PPYAH in a future version.

The Talking Tree (The Ministry)

Edited by Ben Smith

The Talking Tree was something that I'd wanted to do, since one of my only academic qualifications is having, in theory, studied some linguistics. Throw in some garden path sentences and you've got a puzzle, right?

It originally started… very differently. It was going to be a big tree of sounds, with sub-trees included, which would give a unique solution and orientation, and which you'd need to fill out in order to obtain the answer. I struggled with this for a while before changing tack, but this tree-like thing survived in the final extraction for this puzzle.

In any case, it turns out that not having parsed any syntax trees for… oh, say, seven years means you're not great at doing it. I ended up using a combination of the Stanford Parser, as well as a Syntax Tree Generator, to draw the trees for the sentences.

To get the Parser to work correctly, I added complementisers (or punctuation) to the sentences, for example:

  • Everywhere Alex walks, the dog chases them.
  • The horse which was raced past the finishing line fell.
  • Right while I was surfing, the internet dropped.
  • No person who was pushed quickly through the door fell.

The extraction tree (graph? net?) was drawn in Graphviz, and despite its spartan look, it does the job well, I think – it had a decent solve rate.

Tech Support (New You City)

Edited by Sandy Weisz

Tech Support was the last of my puzzles, and possibly the second-to-last puzzle we finalised for the Hunt.

The idea? Make people do things that are a little unusual. In this case, mess around with email headers. Or at least, that was the initial idea. It started off largely the way it is in the final version, but took a long time to get there.

That email idea was only ever going to get me USA, because standards are cool. Then I thought to myself, I could have multiple parts that yield references to standards, which would need to be retrieved and parsed.

(I have a hair-trigger on Nutrimatic, Qat, and other tools, and I like trying to make my puzzles as resistant to them as possible. Call it hypocrisy or a character flaw, but if you look at this, or The Talking Tree, or Please Prove You Are Human, I'd like to think that they're… mostly resistant to shortcutting? I'd love to know if there were ways to shortcut them that weren't backsolving.)

See, that first test, it took around… an hour and fifteen minutes, but the total "parallelised" time was around 45 minutes, and at the time, we thought it was too short for the round. This puzzle went through a whole process of making the components serial (so that you would need to solve the email part to unlock the chat part, and the chat part to unlock the HTTP part), but solvers treated that as though the minipuzzles should have been connected to each other in some way – which they were, but not so that their outputs were linked in that way.

I also tried to make the email part harder through a number of NPL flat-style mini-mini-puzzles, though this proved under-constrained and very difficult to fact-check. The trivia-style questions you find in the final puzzle are very similar to the initial version.

The only other notable differences were that the pieces of the HTTP part of the puzzle used to produce three primes, which you needed to multiply to yield 3971, and that the use of the emoji in the chat game was slightly different (more like Mastermind than Wordle).
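As a quick sanity check on that number, here's a sketch using plain trial division (just arithmetic, not anything taken from the puzzle itself). 3971 factors only one way, as 11 × 19 × 19, so the three primes are pinned down up to order.

```python
from typing import List


def prime_factors(n: int) -> List[int]:
    """Return the prime factors of n (with repeats) in ascending order."""
    factors = []
    d = 2
    while d * d <= n:
        # Divide out each factor completely before moving on.
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)  # whatever remains is itself prime
    return factors


print(prime_factors(3971))  # [11, 19, 19]
```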


Anyway, that's all! I don't publish many, if any, puzzles in normal times, so instead I'm just going to take notes on the world around me and build up a few good puzzles for… whenever I next get a chance to write for a hunt.

James Sugrono


I think about things, and sometimes I write about things that won't fit in a tweet. Views expressed here are mine, and not those of my employer or anyone else, unless explicitly attributed.
Sydney, Australia