This is a continuation of my earlier post about my effort to create a program to help construct U.S.-style crossword puzzles. It’s possible to buy software to do this already, but why buy ready-made when you can spend a couple of months rolling your own?
My puzzle construction program is now basically functioning the way I wanted, but I am still adding improvements and new features as needed.
Right now, I can enter a set of themed words and use the program to help fill in the rest of the puzzle. In theory it should be straightforward to fully automate, but the result would likely be a correct but absolutely boring fill. In practice so far it seems best to fill puzzle sections with various candidates to find one that really seems to work.
The word lists that I am using rank each word with an integer value from 1 to 100. The higher the number, the better the word. More or less. The rankings are subjective values assigned by whoever added the word to the list.
Currently I am using two word lists that each have about 500,000 entries.
Crossword Nexus is a collaboratively built dictionary. Volunteers enter new words with a ranking that feels right to that individual. But even when the rankings are good initially, they can grow stale over time.
Peter Broda's word list is... well I'll just quote him – and thank him and his collaborators.
“This has been a work in progress for roughly a decade now, and is comprised of 616,617 entries typed in by me, scraped from various lists (important movies, Billboard hit songs and artists, world capitals, etc.), and harvested from thousands of puzzles from dozens of sources both indie and mainstream. Special thanks to Joe Krozel and Mark Diehl, both of them wordlist curators and constructors par excellence. They have both made significant contributions to this list by helping me clean up and rescore a huge number of junk entries.”
Working with a puzzle theme
I started this project by cribbing a theme from an NYT puzzle: August 7, 2023, built by Chloe Revery and Alissa Revness.
My first goal was to see if my program could produce an alternate fill with the same theme entries. The puzzle, stripped of the fill with the monster-related theme intact, is below. A hot key combination marks these entries as theme words. They are rendered in a bold, italicized font and can’t be changed by the user or the program while their internal Theme Word flag is set.
Program features
This is what the constructor program looks like now with a partial fill in place.
It is easy to lay out a pattern of black squares by right-clicking on the grid. I should qualify that word ‘easy’ though. Coming up with a workable configuration of black squares is an interesting problem in itself. See: How to Make a Crossword Puzzle - part 2 in the New York Times series.
The symmetrical puzzle square will also become black automatically. When the black squares are completely laid out, a menu click will auto number the answers.
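The symmetry and numbering steps can be sketched in a few lines of Python. This is only an illustration, not the program's code: it assumes the usual 180-degree rotational symmetry of U.S. puzzles and the standard numbering rule (a white square gets a number if it starts an across or down answer), and the names `toggle_black` and `number_answers` are made up.

```python
def symmetric_partner(row, col, size):
    """Square that mirrors (row, col) under 180-degree rotation (assumed symmetry)."""
    return size - 1 - row, size - 1 - col


def toggle_black(black, row, col, size):
    """Flip a square and its symmetric partner between black and white."""
    pair = {(row, col), symmetric_partner(row, col, size)}
    if (row, col) in black:
        black.difference_update(pair)
    else:
        black.update(pair)


def number_answers(black, size):
    """Return {(row, col): number} for every square that starts an answer."""
    def is_white(r, c):
        return 0 <= r < size and 0 <= c < size and (r, c) not in black

    numbers = {}
    next_number = 1
    for r in range(size):
        for c in range(size):
            if not is_white(r, c):
                continue
            # An answer starts where the previous square is blocked
            # and the following square is open.
            starts_across = not is_white(r, c - 1) and is_white(r, c + 1)
            starts_down = not is_white(r - 1, c) and is_white(r + 1, c)
            if starts_across or starts_down:
                numbers[(r, c)] = next_number
                next_number += 1
    return numbers


black_squares = set()
toggle_black(black_squares, 0, 4, 5)   # also blackens (4, 0)
print(sorted(black_squares))
print(number_answers(black_squares, 5))
```

Storing the black squares as a set of coordinates keeps the toggle symmetric by construction: both squares of a pair always change together.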
When an answer that hasn't been completely filled is clicked, the list box to the right of the grid will populate with dictionary words that would fill the answer. The radio buttons below the list box - Rating, Ease of Fit and Alphabetic - determine the sort order in the list box.
Alphabetic is self-explanatory. When Rating is selected, the answer candidates are sorted in descending order by their 1-100 rating. Ease of Fit orders the candidates by how likely their individual letters are to match a crossing answer. Program code gives a value to each letter in the word based on how often it appeared in a large sample of English language text. See the C# method below. These values are roughly the inverse of the values on Scrabble tiles: common letters score high and rare letters score low. The values for each letter are summed to give the word its Ease of Fit rating, and words in the list box are ordered by that sum, high to low. This could be optimized by considering the position of each letter in a word and where it appears in crossing answers. Right now I’m not sure that would be worth the extra effort.
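For readers who want to experiment with the letter-sum idea, here is a small Python sketch of it. It is not the C# method from the program. The weights are commonly cited approximate English letter frequencies (in hundredths of a percent), used here as an assumption since the program's own table is not shown, and the names `LETTER_WEIGHT`, `ease_score` and `sort_candidates` are invented for the example.

```python
# Approximate English letter frequencies in hundredths of a percent.
# Assumed values for illustration; not the program's own table.
LETTER_WEIGHT = {
    "E": 1270, "T": 906, "A": 817, "O": 751, "I": 697, "N": 675, "S": 633,
    "H": 609, "R": 599, "D": 425, "L": 403, "C": 278, "U": 276, "M": 241,
    "W": 236, "F": 223, "G": 202, "Y": 197, "P": 193, "B": 149, "V": 98,
    "K": 77, "J": 15, "X": 15, "Q": 10, "Z": 7,
}


def ease_score(word):
    """Sum the per-letter weights; anything that is not A-Z counts as zero."""
    return sum(LETTER_WEIGHT.get(ch, 0) for ch in word.upper())


def sort_candidates(rated_words, mode):
    """Order (word, rating) pairs by one of the three list box sort modes."""
    if mode == "rating":
        return sorted(rated_words, key=lambda wr: -wr[1])
    if mode == "ease":
        return sorted(rated_words, key=lambda wr: -ease_score(wr[0]))
    return sorted(rated_words, key=lambda wr: wr[0])  # alphabetic


sample = [("JAZZ", 80), ("EASE", 55), ("TELMA", 45)]  # made-up ratings
for mode in ("rating", "ease", "alphabetic"):
    print(mode, [word for word, _ in sort_candidates(sample, mode)])
```

Because every candidate for a given answer has the same length, summing the weights does not unfairly favor longer words within one list.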
Using Auto Fill
Let’s look at how the program does a fill and consider the northwest corner of the puzzle above:
Selecting Auto Fill from the menu bar brings up the Auto Fill form. In this case we want the program to fill Down Answers 1-5, making sure that 1, 14, and 17 Across are also viable words in the dictionary. The Candidate Limit entry determines how far the program will go in search of the next fill before backtracking. We enter these values and click Fill.
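The general shape of a fill like this is a backtracking search. The Python sketch below is a guess at that shape, not the program's actual algorithm. It reads Candidate Limit as "the most candidates tried for one answer before giving up on it", which is an assumption, and all names (`auto_fill`, `fill_slots`, `check_slots`) and the tiny 2x2 word list are made up. It also does not stop the same word from being used twice, which a real constructor would want.

```python
def matches(word, slot, grid):
    """True if word fits the slot length and agrees with letters already placed."""
    return len(word) == len(slot) and all(
        grid.get(cell, letter) == letter for cell, letter in zip(slot, word))


def candidates(slot, grid, rated_words, min_rating):
    """Fitting words at or above the rating cutoff, best rating first."""
    found = [w for w, r in rated_words.items()
             if r >= min_rating and matches(w, slot, grid)]
    return sorted(found, key=lambda w: (-rated_words[w], w))


def auto_fill(fill_slots, check_slots, grid, rated_words,
              candidate_limit=50, min_rating=40):
    """Fill each slot in fill_slots; return the finished grid or None."""
    if not fill_slots:
        return dict(grid)
    slot, rest = fill_slots[0], fill_slots[1:]
    options = candidates(slot, grid, rated_words, min_rating)[:candidate_limit]
    for word in options:
        added = [cell for cell in slot if cell not in grid]
        for cell, letter in zip(slot, word):
            grid[cell] = letter
        # Every crossing answer must still have at least one dictionary match.
        if all(candidates(s, grid, rated_words, min_rating) for s in check_slots):
            result = auto_fill(rest, check_slots, grid, rated_words,
                               candidate_limit, min_rating)
            if result is not None:
                return result
        for cell in added:          # backtrack: remove only what this word added
            del grid[cell]
    return None


# Toy 2x2 corner: fill the two Down answers, keep both Across answers viable.
downs = [[(0, 0), (1, 0)], [(0, 1), (1, 1)]]
acrosses = [[(0, 0), (0, 1)], [(1, 0), (1, 1)]]
toy_words = {"AT": 70, "TO": 65, "AS": 60, "SO": 55}
print(auto_fill(downs, acrosses, {}, toy_words))
```

A small limit makes the search give up on a corner quickly, while a large one explores more combinations at the cost of run time.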
About two minutes after hitting fill the program announces success with this fill:
Now if you are like me, you are asking, “What the hell is this ‘ITME’ that was placed in 4 Down?” (Telma Hopkins has her own Wikipedia page so TELMA seems OK.)
The Crossword Nexus dictionary gives this a rating of 90 so someone thinks it’s a great entry. Wiktionary gives ITME an entry, but if it still feels sketchy, the constructor has to decide if they are comfortable with putting it in the final puzzle.
If the fill answer just seems too obscure, the program allows the word rating to be reduced to a lower value by selecting it in the list box and hitting F1.
Setting it to a Rating value less than 40 will prevent it from being used in the solve. The rating cutoff value of 40 is somewhat arbitrary and is currently hard coded into the program. I plan to make this configurable from the program menu.
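The cutoff logic itself is simple enough to sketch in Python. This is an illustration only: `RATING_CUTOFF`, `demote` and `usable` are hypothetical names, and the TELMA rating is invented (the 90 for ITME is the Crossword Nexus value mentioned above).

```python
RATING_CUTOFF = 40  # hard-coded for now; a settings entry would replace this


def demote(rated_words, word, new_rating):
    """Overwrite a word's rating, as the F1 action does for a selected entry."""
    if not 1 <= new_rating <= 100:
        raise ValueError("rating must be between 1 and 100")
    rated_words[word] = new_rating


def usable(rated_words, cutoff=RATING_CUTOFF):
    """Words the fill is allowed to use: rating at or above the cutoff."""
    return {w: r for w, r in rated_words.items() if r >= cutoff}


ratings = {"ITME": 90, "TELMA": 50}   # TELMA's rating is made up
demote(ratings, "ITME", 30)
print(sorted(usable(ratings)))
```

Passing the cutoff as a parameter with a default is one easy way to move from a hard-coded value to a configurable one later.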
Human fill
After theme words have been added, the puzzle will still have a lot of white squares that need words or phrases. Part of making a good puzzle is coming up with interesting and fresh fill. This is where a completely automated process would probably fall on its face.
Here is where the list box on the main screen comes into play. When an answer is clicked that has some partial fill from crossing answers, program code uses regular expressions to filter the list down to the answers that will work. This is where the constructor can use their own judgment to give the puzzle the look and feel they want.
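The regular expression filtering can be sketched like this in Python. It is an illustration of the technique, not the program's code: the function names are made up, and an empty square is represented as `None`, which is an assumption about how the grid might be stored.

```python
import re


def slot_pattern(cells):
    """Build a regex from an answer's squares; None marks an empty square."""
    body = "".join("[A-Z]" if ch is None else re.escape(ch) for ch in cells)
    return re.compile(body)


def filter_candidates(words, cells):
    """Keep only the words that match the partial fill exactly, end to end."""
    pattern = slot_pattern(cells)
    return [w for w in words if pattern.fullmatch(w)]


word_list = ["TELMA", "TULSA", "TABLE", "TILDA", "TESLAS"]
# Crossing answers have supplied T, L and A in squares 1, 3 and 5.
print(filter_candidates(word_list, ["T", None, "L", None, "A"]))
```

Using `fullmatch` rather than `search` matters here: it rejects longer words that merely contain the pattern.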
I came up with several alternate fills for the NYT puzzle. This is one of them:
Puzzle file format
The state of a puzzle in progress can be saved from the file menu. Puzzle state is currently saved as an XML file. Some crossword publishers require that submissions be in a format that started as the proprietary file format of the Across Lite software, with file extension ‘.PUZ’. I’ve been researching the format and I see that some people have tried, with varying degrees of success, to reverse engineer it.
But I recently found a link from 2021 that said that the New York Times no longer accepts puzzles saved to the PUZ format. See: NYT ditches PUZ.
So I haven’t yet resolved what my final Save and Open file formats will be. Currently the NYT accepts puzzles with clues and answers as PDF files. I sent my first submission to them last week.
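For the interim XML save, a round trip could look something like the Python sketch below. The element and attribute names (`puzzle`, `row`, `theme`) are entirely hypothetical, since the program's real schema isn't shown here, and the grid encoding ('#' for a black square, '.' for an empty one) is likewise just an assumption for the example.

```python
import xml.etree.ElementTree as ET


def save_puzzle(rows, theme_cells):
    """Serialize grid rows and theme-word squares to an XML string."""
    root = ET.Element("puzzle", rows=str(len(rows)), cols=str(len(rows[0])))
    for text in rows:
        ET.SubElement(root, "row").text = text
    for r, c in sorted(theme_cells):
        ET.SubElement(root, "theme", row=str(r), col=str(c))
    return ET.tostring(root, encoding="unicode")


def load_puzzle(xml_text):
    """Rebuild the rows and the set of theme squares from the XML string."""
    root = ET.fromstring(xml_text)
    rows = [el.text for el in root.findall("row")]
    theme = {(int(el.get("row")), int(el.get("col")))
             for el in root.findall("theme")}
    return rows, theme


grid_rows = ["OGRE#", ".....", "#YETI"]
theme_squares = {(0, 0), (0, 1), (0, 2), (0, 3)}
xml_text = save_puzzle(grid_rows, theme_squares)
print(xml_text)
print(load_puzzle(xml_text) == (grid_rows, theme_squares))
```

Keeping the theme flags separate from the letters means a loader can restore the locked state of theme words without guessing from the fill.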
To do
Add a settings item to the main menu to allow changing the number of answer candidates in the main screen list box
Create an “add new word to dictionary” function. I had this before I started to use word lists with ratings. It was done by selecting the new word on the grid and hitting a hot key combination. I need to add a dialog box to enter the rating and specify the target dictionary.
Add a runtime update of autofill progress. I’m currently watching this in the output window of the development environment and logging it to a text file.
This is the first puzzle I submitted to the New York Times. This one wasn’t accepted.