Getting Started - GUI Control and Utilities
I enjoy solving crossword puzzles and would like to be able to create some of my own. To help automate the task I’ve developed a couple of C# programs. One bit of code I developed is a Windows Forms control that allows me to set up the crossword grid with my choice of black squares and then auto-number the across and down answers. The control also implements navigation within the puzzle and answer entry.
My plan is to set up a theme for a puzzle and have my program access a data set of candidate answer words to at least partially fill in the rest of the grid. Here is an example of a monster-themed puzzle from the NYT in my WinForm control:
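The auto-numbering step follows the usual crossword convention: scan the grid row by row and give the next number to any white square that starts an across or a down answer. Here is a minimal sketch of that rule, not the control's actual code. The `GridNumbering` class, the `NumberGrid` method, and the `bool[,]` grid representation are all assumptions made for illustration.

```csharp
public static class GridNumbering
{
    // isBlack[row, col] is true for a black square (assumed representation).
    // Returns a grid of clue numbers; 0 means the square carries no number.
    public static int[,] NumberGrid(bool[,] isBlack)
    {
        int rows = isBlack.GetLength(0);
        int cols = isBlack.GetLength(1);
        var numbers = new int[rows, cols];
        int next = 1;

        for (int r = 0; r < rows; r++)
        {
            for (int c = 0; c < cols; c++)
            {
                if (isBlack[r, c]) continue;

                // A neighbor is "open" if it is inside the grid and white.
                bool openLeft  = c > 0 && !isBlack[r, c - 1];
                bool openRight = c + 1 < cols && !isBlack[r, c + 1];
                bool openAbove = r > 0 && !isBlack[r - 1, c];
                bool openBelow = r + 1 < rows && !isBlack[r + 1, c];

                // An answer starts where nothing precedes it and
                // at least one more white square follows it.
                bool startsAcross = !openLeft && openRight;
                bool startsDown   = !openAbove && openBelow;

                if (startsAcross || startsDown)
                    numbers[r, c] = next++;
            }
        }
        return numbers;
    }
}
```

A square that starts both an across and a down answer gets a single number, which is why the across and down clue lists share numbers in a printed puzzle.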
My data set is a word list in an XML file with about 90,000 common words and short phrases containing 3-15 letters that I have scrounged from different sources on the internet.
I’ve written a utility program that allows me to examine and maintain my word list. You select the number of letters in the word and enter any letters that are constrained by crossing answers and it will show a list of all the words that match those constraints.
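The lookup described above comes down to filtering by length and then checking each constrained position. Here is a minimal sketch of that idea. It assumes the words are already loaded into memory and that a `?` in the pattern marks an unconstrained square. The `WordMatcher` class and `FindMatches` method are hypothetical names, not the utility's real code.

```csharp
using System.Collections.Generic;
using System.Linq;

public static class WordMatcher
{
    // pattern has one character per square: a letter where a crossing
    // answer fixes the square, '?' where the square is still free.
    public static List<string> FindMatches(IEnumerable<string> words, string pattern)
    {
        string p = pattern.ToUpperInvariant();
        return words
            .Where(w => w.Length == p.Length)
            .Where(w => Matches(w.ToUpperInvariant(), p))
            .ToList();
    }

    private static bool Matches(string word, string pattern)
    {
        for (int i = 0; i < pattern.Length; i++)
        {
            if (pattern[i] != '?' && pattern[i] != word[i])
                return false;
        }
        return true;
    }
}
```

For example, the pattern `M??C?` asks for five-letter words with M in the first square and C in the fourth, so MERCI matches and MAGIC does not.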
The Add Word button opens a dialog that lets me add a new word to the word list. The Add Batch button opens a text file with any number of new words. The code checks whether each word is already in the list and, if it is not, inserts it into the proper place in the XML file. The Add Batch routine generates a report with a list of the new words added and a list of words that were already part of the data set.
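The check-then-insert logic can be sketched with a binary search over a sorted list. This is only an illustration under some assumptions: the word list is held in memory as an ordinally sorted `List<string>`, entries are stored in upper case, and reading and writing the XML file is left out because its schema is not shown here. `BatchReport`, `WordListMaintenance`, and `AddBatch` are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class BatchReport
{
    public List<string> Added { get; } = new List<string>();
    public List<string> AlreadyPresent { get; } = new List<string>();
}

public static class WordListMaintenance
{
    // sortedWords must already be sorted with StringComparer.Ordinal.
    public static BatchReport AddBatch(List<string> sortedWords, IEnumerable<string> candidates)
    {
        var report = new BatchReport();
        foreach (string raw in candidates)
        {
            string word = raw.Trim().ToUpperInvariant();
            if (word.Length == 0) continue; // skip blank lines in the batch file

            int index = sortedWords.BinarySearch(word, StringComparer.Ordinal);
            if (index >= 0)
            {
                report.AlreadyPresent.Add(word);
            }
            else
            {
                // When the word is missing, BinarySearch returns the bitwise
                // complement of the position where it belongs.
                sortedWords.Insert(~index, word);
                report.Added.Add(word);
            }
        }
        return report;
    }
}
```

Because each new word is inserted at its sorted position, the list stays ordered after every addition, and a word repeated within the same batch shows up under `AlreadyPresent` the second time.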
About the Word List
Let’s go back and look at the NYT puzzle I started with. Here it is at completion.
Here is the Northwest corner of the puzzle.
The original word list did not contain the two-word phrase ‘NO ONE’ or the French word ‘MERCI’, so the completed puzzle as done in The Times couldn’t possibly be generated. That’s not to say another solution couldn’t be created, but crosswords do rely on a lot of short two- and three-word phrases and common foreign language words. So I suspect my word list is somewhat anemic. Other go-to items in crosswords are acronyms. My original list didn’t contain any. If you solve the NYT crossword for a while you realize that the initials of New Deal programs - TVA, CCC, WPA - come to the rescue of many a puzzle creator. This is before we even consider presidential initials - LBJ, FDR, DDE, HST… And then there are slang terms, texting abbreviations, pop culture references, etc.
With these sorts of ‘words’ in mind I’ve been beefing up my word list by creating lists of new words in my phone Notes application in odd moments of free time. When I have a few hundred I send the text file to myself in an email and use the Add Batch function of my Word Search utility to add them to the master word list.
Additionally, the original list does not contain plural forms of countable nouns, or past tense or gerund verb forms. Crap. ChatGPT can identify the words of these forms, but to use this effectively I’m going to have to learn something about its API, so that’s an item on my todo list.
I’ve also found other lists of words online that will probably provide useful additions but they require another utility to massage them into the form that my current XML list uses.
The Interesting Part
All of this is just a prelude to the actual puzzle creation, the filling in of the empty squares that will agree with the constraints of the initial themed entries. I’ve been considering a couple ways to represent the data internally and a couple different ways of approaching the solution search. It’s still pretty hand wavy, picture drawing at this point. Right now I’m thinking that whatever algorithm I work with, I’ll set up the code to stop at various stages of completion, 50%, 75%, 90% and display the candidate fill at that point. Examining the fill state at various stages should give me an idea if my approach is on track.
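Whatever fill algorithm ends up being used, the stage checks themselves are simple to sketch. The example below is one possible shape for them, not a settled design. It assumes the grid is a `char[,]` in which `#` is a black square, `.` is an empty white square, and anything else is a filled letter. The `FillCheckpoints` class and its members are hypothetical names.

```csharp
using System;
using System.Collections.Generic;

public sealed class FillCheckpoints
{
    private readonly double[] thresholds;
    private int nextIndex = 0;

    // Thresholds are fractions of white squares filled, e.g. 0.50, 0.75, 0.90.
    public FillCheckpoints(params double[] thresholds)
    {
        this.thresholds = (double[])thresholds.Clone();
        Array.Sort(this.thresholds);
    }

    // Fraction of white squares that currently hold a letter.
    public static double FillFraction(char[,] grid)
    {
        int white = 0, filled = 0;
        foreach (char ch in grid)
        {
            if (ch == '#') continue;   // black square
            white++;
            if (ch != '.') filled++;   // '.' marks an empty white square
        }
        return white == 0 ? 1.0 : (double)filled / white;
    }

    // Returns the thresholds reached since the previous call.
    // Each threshold is reported at most once.
    public List<double> Update(char[,] grid)
    {
        double fraction = FillFraction(grid);
        var crossed = new List<double>();
        while (nextIndex < thresholds.Length && fraction >= thresholds[nextIndex])
        {
            crossed.Add(thresholds[nextIndex]);
            nextIndex++;
        }
        return crossed;
    }
}
```

A search loop could call `Update` after each word placement and pause to display the candidate fill whenever the returned list is not empty. Reporting each threshold only once is a deliberate choice here: a search that backtracks will drop below and re-cross the same percentage many times.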
I definitely will have a check for a 95% fill state so I can see the partial fill. I’m reminded of an interview with a veteran crossword constructor who complained of all the times he would have been saved by a word like ‘ENEMA’ or ‘URINE’. The Times has become a lot less prissy about what is an acceptable word. I’m not necessarily thinking of saving a puzzle with something tacky, but I wouldn’t want to toss a perfectly cromulent puzzle when a bit of regionally accented and phonetically spelled English, or perhaps a common short phrase, might save the day.
SHONUFF. DONTWANTTHAT.