
Generating word-sequences

Nested Words

In response to this post by Qwantz, I generated a list of words of the type he suggested: namely, ones that you can form by adding a letter to each end of a shorter word of that same form (with “a” and “I” counting as words of that form).

For instance, the example he came up with was

I->lie->alien->salient
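
To make the rule concrete, here is a minimal Python sketch (not my actual scripts, which are in Perl and use a database) that checks whether each word in a chain is the previous one with one letter added at each end. The function names are made up for illustration, and it only checks the shape of the chain, not whether each word is in a dictionary.

```python
def is_nested_step(inner, outer):
    """True if outer is inner with exactly one letter added on each end."""
    return len(outer) == len(inner) + 2 and outer[1:-1] == inner


def is_nested_chain(chain):
    """True if every word in the chain wraps the word before it."""
    words = [w.lower() for w in chain]  # treat "I" and "i" alike
    return all(is_nested_step(a, b) for a, b in zip(words, words[1:]))


print(is_nested_chain(["I", "lie", "alien", "salient"]))  # True
print(is_nested_chain(["I", "line"]))                     # False
```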

I used the SCOWL dictionary, which can be got here.
The problem with this is that it is a little too big: many of the words are nowhere else to be
found, and don’t seem to mean much. But at least my program generates a finite list of
words that you can check through.

Here are my wordlists of length n (words of even length are required to be generated from a 2-letter
word in the SCOWL wordlist). Of course, the only ones worth looking at are the 7+ ones.

Words of length 1 (2 words)
Words of length 2 (283 words)
Words of length 3 (504 words)
Words of length 4 (4421 words)
Words of length 5 (1843 words)
Words of length 6 (3742 words)
Words of length 7 (392 words)
Words of length 8 (205 words)
Words of length 9 (10 words)
Words of length 10 (1 word)

Of course, a generalisation that might give more interesting cases would be to allow the sequence to
terminate at three, four, or five letters instead of just 1 or 2, or to allow for not-so-symmetric
adding. But I don’t feel like doing those. If you generate them and come across anything
interesting, I’d love to hear!
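
If you do want to try the first of those, a rough Python sketch of the idea might look like this. It is an in-memory stand-in rather than my database scripts, the name build_nested and the seed_lengths parameter are made up for illustration, and the toy wordlist is just a handful of words picked by hand.

```python
from collections import defaultdict


def build_nested(wordlist, seed_lengths):
    """Return every word that is a seed or wraps an already-nested word.

    A word of a length in seed_lengths is accepted outright; any other
    word is accepted only if removing its first and last letter leaves
    a word that was accepted earlier.
    """
    by_length = defaultdict(set)
    for word in wordlist:
        by_length[len(word)].add(word)

    nested = set()
    # Shortest first, so the interior of a word has always been decided already.
    for length in sorted(by_length):
        for word in by_length[length]:
            if length in seed_lengths or word[1:-1] in nested:
                nested.add(word)
    return nested


toy_words = {"ran", "brand", "tin", "stint", "at", "rate"}
print(sorted(build_nested(toy_words, {3})))  # ['brand', 'ran', 'stint', 'tin']
print(sorted(build_nested(toy_words, {2})))  # ['at', 'rate']
```

Passing {1, 2} as the seed lengths would correspond to the lists above, except that here every 1-letter entry in the wordlist would count as a seed.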

Source Code (uncommented, but short)

In Perl. Obviously, usernames and passwords have been removed. To run, pipe some wordlist (1 word per
line) into the program. Note that you will have to have the table created already; I included the
command in the comments at the start. For example:

perl addwords.pl < wordlist.txt

addwords.pl

Then you create another table (I called it “nestedwords”), and add all words of length 1 and 2 to it.
Note that in my dictionary, all letters were counted as words of length 1, so I just added “i” and
“a” manually to stop things from going too crazy, but length 2 is dealt with properly by the
following command.

To generate the words of length n, given that you already have words of length n-2 in nestedwords,
just modify the $length variable in generate.pl, and run it.

All it does is go through all words of length n in the wordlist table, chop off the first and last
character, and see if the word that is left is in the table nestedwords. (I think this is the best
way to go about generating these words.)
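
For anyone who would rather not set up a database, the same step can be sketched in a few lines of Python with sets standing in for the two tables. This is only an illustration under that assumption, not a translation of generate.pl; the name next_level and the tiny wordlist are made up.

```python
def next_level(candidates, nested, length):
    """Words of the given length whose interior is already a nested word.

    The interior is the word with its first and last character removed.
    """
    return sorted(
        word for word in candidates
        if len(word) == length and word[1:-1] in nested
    )


# Tiny made-up wordlist; the seeds "i" and "a" are added by hand, as above.
wordlist = {"lie", "die", "alien", "salient", "cat", "scats"}
nested = {"i", "a"}

for length in (3, 5, 7):
    found = next_level(wordlist, nested, length)
    nested.update(found)  # these become the base for length + 2
    print(length, found)
# 3 ['cat', 'die', 'lie']
# 5 ['alien', 'scats']
# 7 ['salient']
```

Each pass only needs the words two letters shorter to be in place already, which is why the lengths have to be generated in increasing order.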

generate.pl

To output the list of words, just run outputwords.pl