24 points
stone shone shine chine chins coins corns cores cones coney money
All of these words that make up the ladder can be found in a dictionary (or a data file specified by the user).
This assignment provides practice with pointers, lists, queues, and recursion.
cp ~rcduvall/cps100e/assign/assign4/* .
If you type the Unix command ls you should see the files listed below.
Additionally, two large data files are provided for you to test your program. So you can access these files without specifying a long path-name, create a link to the data files in your assign4 directory by typing:
ln -s ~rcduvall/data/knuth.dat ln -s ~rcduvall/data/words5.dat
For this assignment, you'll be using a database of all the English five-letter words compiled by Don Knuth over the past thirty years. The file knuth.dat has 5,757 words in it as well as some extraneous information in it. Code to read this file is already written for you to use. The code ignores lines that begin with *, and only processes the first five characters on other lines. Knuth asks that the file not be altered, hence these restrictions.
There is also a smaller file of data (3,200 words). You can, of course, make your own data files (strongly recommended for debugging!): the format will be one 5-letter word per line.
Your program will find the shortest word ladder between two words entered by the user when running the program. There may be more than one shortest ladder, your program must find one, but not all such ladders.
Your program should:
The following is a sample run of the program:
> doladder Enter two 5-letter words (length != 5 to end): smart brain smart start stark stack slack black blank bland brand braid brain Enter two 5-letter words (length != 5 to end): angel devil There is no path from angel to devil Enter two 5-letter words (length != 5 to end): no more
A ladder is found by putting the starting word on a queue, then putting all words one letter away from this word on the queue, then putting all words two letters away on the queue, then all words three letters away, etc. As each word is taken off the queue, if the last (target) word is found the process can stop (there may be other words on the queue, but they'll be ignored). This process is guaranteed to find the shortest word ladder (please explain why in your README file).
A word w is not actually stored on the queue, but instead a pointer to a struct containing w is stored. The other fields of the struct are a pointer to the word that is one letter away from w and that caused w to be put on the queue (i.e., the word's predecessor in a potential ladder), and a boolean that records whether or not this word has been put onto queue (i.e., if the algorithm has yet visited this word). For example, if w is bears, then pointers to structs containing dears, fears, gears, beard (and so on) are enqueued with each struct pointing to bears since this word preceded the others and caused them to be enqueued. The first word does not have a predecessor, so its ancestor field is 0/NULL. The boolean field prevents a word from being put on the queue multiple times and potentially creating a circular ladder.
This is diagrammed below
Thus, store all of the words from the file in a vector of type LadderNode * (this is already done in Ladder::Load). The LadderNode struct is used to store the word from the data file, the previous LadderNode that it may be one letter apart from (this link is formed during the construction of the ladder), and a sentinel that marks that the node has already been considered on the queue. Thus after Ladder::Load has been called, the vector may look like the following:
The first word (entered by the user) is looked up in the list of words, and a pointer to the struct containing the word is enqueued. If the first word is not in the list, you should create a LadderNode struct containing the first word and add it to the queue (as if it had been in the list of words). Once you have dealt with the special case of adding the first word onto the queue (because it has to be a special case is why you can fake that the first word was in the list already), then repeat the dequeue/enqueue process below.
When the target word is derived, you will need to print out the ladder from the first word to the target word. The ancestor pointer in the LadderNode stores information that will allow the ladder to be recreated, you may need to use recursion or a vector since the ladder will be backwards (but should be printed properly). Instead, you could try to search backwards, so to find a ladder from smart to brain search from brain to smart.
It should also be possible to find target words that are not initially in your list of words by using the same trick, of creating a new temporary LadderNode, that you did for the first word.
The code provided you reads a set of words from a user specified file (either on the command line or via prompt) and prompts the user for two words to check if they can form a word ladder. It then calls some empty member functions of the class Ladder. You must develop the class Ladder that has been started for you, by filling in the public member functions and perhaps adding more private member functions.
You must implement the public functions, Find and Print, as specified in the Ladder header file. You may implement as many other private member functions as you need. A private member function is a helper function for other member functions, but should not be available to be called by the user of the class. Making a helper function private ensures that only other member functions can access the helper function, but client programs cannot.
Use these helper functions liberally, because points will be taken off of your program for each function you write that is longer than twenty lines (and even more for each additional five lines above fifteen). You get in the habit of writing small, single purpose, functions that are easier to write and thus easier to debug.
Additionally, you should not include any "magic" values in your program. If you must use magic values, make a constant and refer to the constant within your code. Again, points will be taken off for each magic value found within your code.
You should use the standard templated queue class documented here.
When your program compiles and produces the correct output, you should create a README file for this and all assignments. In this file, include your name, the date, and an estimate of how long you worked on the assignment. You must also include a list of names of all those people (students, professor, TAs, tutor) with whom you consulted on the assignment. See the rules for collaboration in the CPS 100e syllabus. Finally you must explain why the algorithm described in this handout is guaranteed to generate the shortest word ladder.
To submit your assignment, type:
submit100e assign4 README doladder.cpp ladder.h ladder.cpp
You should receive a message telling you that the program was submitted correctly.
To see that files were submitted correctly, you can use the submit100e command with just the assignment name and it will list the files submitted. That is, type submit100e assign4
Adding the following functionality will earn you extra credit.
Replace the queue with a stack and discuss (in your README file) the differences in behavior between the stack and queue versions of the program.
Write a new version of the Ladder class. This program should process the words so that only "good" matches are tried when ladders are found. The preprocessing step will take a long time, but word ladders will be found very quickly.
The idea is that for each word, all words one-letter away are determined (and stored somehow) when the words are loaded. When looking for candidate words to enqueue, only words that are one-letter away (these are already known) are checked for previous use. This saves searching through the entire list of words and checking whether each is one letter away.
To submit the extra credit assignment, use the name assign4X.