# The Target puzzle: searching wordlists with tries

February 25, 2020I was thinking about the Target puzzle from a certain Australian newspaper the other day: you're given 9 letters, and you have to create as many words as possible from them (length 4+). There's always at least one 9 letter word you can form.

This is going to be a short post, as I basically wanted to check whether you could use a trie to solve this.

**Trie**s are a way of storing strings (or other sequences) so they can be searched quickly: basically, you build a tree where each letter in the string becomes a node, with the next letter in the string becoming a child node. The time to search for a string of length \(m\) is \(O(m)\).

A simple trie storing a small list of words looks like:

```
def build_trie(words):
trie = {}
for word in words:
current_dict = trie
for index, c in enumerate(word):
if index == (len(word) - 1):
current_dict[c] = True
else:
if c not in current_dict:
current_dict[c] = {}
current_dict = current_dict[c]
return trie
simple_trie = build_trie(['act', 'bat', 'bag', 'bin'])
print(simple_trie)
```

```
{'a': {'c': {'t': True}}, 'b': {'a': {'t': True, 'g': True}, 'i': {'n': True}}}
```

Note that here, we're just looking for 9 letter words, so I just assume each word is equal length and terminate the branches with a `True`

.

To check if a word occurs in the trie we start at the root, then for each letter in the word, we check if that letter is a child of the current node. If it is, we step down to that node and move to the next letter:

```
def word_in_trie(word, trie):
current_dict = trie
for letter in word:
if not letter in current_dict:
return False
current_dict = current_dict[letter]
return True
word_in_trie('bat', simple_trie)
```

```
True
```

To solve the Target puzzle, we'll use the standard word list built into the OS, and grab all the 9 letter words. We'll create a trie, as well as a standard word list that we'll use when comparing the trie approach to alternatives.

```
def generate_words(filename, word_length=9):
with open(filename) as word_list:
for line in word_list:
word = line.strip().lower()
if len(word) == word_length:
if word.isalpha():
yield word
word_trie = build_trie(generate_words('/usr/share/dict/words'))
word_list = list(generate_words('/usr/share/dict/words'))
```

```
word_list[:3]
```

```
['abernathy', 'abyssinia', 'accenture']
```

To find a 9-letter word using the trie approach, we just generate permutations of the available letters, and check if any of them form a full word:

```
from itertools import permutations
def solve_target(letters, trie):
letters = list(c.lower() for c in letters)
for comb in permutations(letters, 9):
if word_in_trie(comb, trie):
return ''.join(comb)
solve_target('WDPARUPNE', word_trie)
```

```
'unwrapped'
```

This seems pretty fast, especially given that it's potentially having to run through a large number of permutations:

```
sum(1 for comb in permutations('WDPARUPNE'))
```

```
362880
```

If my (admittedly fuzzy) knowledge of big-O notation is right, the runtime for this trie-based search should be something like \(O(pm)\) where \(p\) is the number of permutations, and \(m\) is the length of the words ($m = 9$ here).

Unfortunately when I tested it against a set based implementation, it seems like the set wins out:

```
def solve_with_set(letters, word_set):
letters = list(c.lower() for c in letters)
for comb in permutations(letters, 9):
if comb in word_set:
return ''.join(comb)
```

```
%%timeit -n 10 -r 1
solve_target('WDPARUPNE', word_trie)
```

```
62 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 10 loops each)
```

```
%%timeit -n 10 -r 1
solve_with_set('WDPARUPNE', set(tuple(w) for w in word_list))
```

```
28.6 ms ± 0 ns per loop (mean ± std. dev. of 1 run, 10 loops each)
```

The set-based solution should also have a worst-case runtime of $O(pn)$, where $n$ is the total number of words in the list, which should mean our trie wins out as the word list gets longer and longer. However, set lookups are often $O(1)$, and the Python implementation of them is pretty efficient, so we might only see the advantages of tries for this kind of puzzle under specific circumstances. I might have to put this down as an idea that didn't quite pay off.