Computer Science 15-111 (Sec A+B), Spring 2007

Computer Science 15-111 (Sections A & B), Spring 2007

Homework #8

Due: Wed 25-Apr-2007 at the start of class
(in-class and online submission).

Submit a printed copy of your assignment as you enter class. Also, for your online submission (which must be the same as your printed submission), place your programs in hw8.zip. Include the written work in a file hw8.doc (not hw8.txt) in the top level of the zip file. Remember to just submit your .java files, and not your .class files, random .jpg files, or miscellaneous Eclipse or other IDE-related files.

This is another TRUE PAIR-PROGRAMMING ASSIGNMENT. Pairs are optional, but if you work in pairs, then you submit only one set of solutions, and you each receive the same grades (you may not work in pairs on only part of the assignment – it’s all or nothing). Be sure to include BOTH NAMES at the top of ALL FILES!

This time, you may work with a partner who you have previously worked with. Besides these changes, the rules for “Pair Programming Lite” apply – see the online course syllabus for details.

Note: if you would like to have a partner randomly assigned, please email your request as soon as possible. These requests cannot be satisfied more than 2 days after the assignment is posted.

Reminder: Style counts! Write your code according to our Style Guide!

Note: At a minimum, your submissions must abide by the test cases in Hw8SimpleTestCases.zip. Failure to meet these simple test cases will result in a 10-point penalty. (Also, note that the CA’s wrote these without regard to style; if the test cases themselves include an error, please bring this to your CA’s attention – thanks!).

Note: at no penalty, you may omit all parameterized classes from this assignment. If you do write parameterized classes, you will receive up to 5 points of bonus for your efforts. In general, if you are not parameterizing your classes, you should convert the given signatures in the standard fashion, replacing parameterized classes with “Object”. If you have any questions about this, please ask your instructor or CA.

1.      MyCollections Algorithms
For this problem, you will write your own implementations of some of the algorithms included in the Collections framework, creating them as public static methods in the MyCollections class. You must write these to operate over parameterized Collections classes. So, for example, the following code should work:
   ArrayList<String> al = new ArrayList<String>();
   al.add("foo");
   al.add("bar");
   MyCollections.reverse(al);
   System.out.println(al); // prints [bar, foo], not [foo, bar]
Write the following methods, adhering very precisely to their descriptions including throwing the right exceptions at the right times (

a. binarySearch
Note 1: Use this unparameterized signature:
public static <T> int binarySearch(List<T> list, T key, Comparator<T> c);
This indicates that the method returns an “int”, and takes 3 arguments: a parameterized list, a key of that same parameter type, and a parameterized Comparator. The “<T>” at the front of the signature is what allows the syntax to indicate that all three parameters use the same parameterized type.

Note 2: Compare the simplified signature to the real one. While not required, see if you can understand the differences.

Note3: As you should carefully read in the online API, this version takes a Comparator as an argument, and the Comparator can be null, in which case it casts the elements to (parameterized) Comparables and uses compareTo. Also note that this method checks if the list implements RandomAccess, and if so this runs in O(logn) time, otherwise it runs in O(n) time.

b. disjoint
Note: Assuming both collections are of size N, then you must use an algorithm that runs in worst-case O(N) time. Thus, you cannot in general simply iterate over one collection and check if each element is in the second collection by calling the latter’s contains() method N times – if the second collection’s contains() method is worst-case O(logN), then this would run overall in O(NlogN) time, which is too slow.

c. indexOfSubList

d. reverse
Note: this method must run in worst-case O(N) time.

e. shuffle
Note: carefully follow the given algorithm, and in particular run in O(N) time even over sequential access (non-random-access) lists.

2. MyCollections Algorithms Testers
Using the test framework from the previous homework, include a suite of tests for the MyCollection algorithms implemented above in a class MyCollectionTester which extends the Tester class. You should test every boundary case for every method. Be sure to test each possible outcome of each conditional, and be sure to make each possible exception be thrown.

3.      Collection ADT’s
For this problem, you will provide efficient implementations of several Collection ADT’s, though you only need to implement the abbreviated interfaces described here (check the respective online API’s to see how these methods should work, though in a couple of cases the method signatures were modified slightly here to be a little clearer). Your classes should not extend any classes, and need not implement any interfaces from the JCF (just implement the abbreviated interfaces given here). Thus, these implementations will not work with the JCF, but that’s ok for our purposes here.

Note that your implementations cannot use any classes or methods from java.util.*, except for java.util.Iterable and java.util.Iterator, and once again your classes must be implemented efficiently.

Also note that your classes should use the following Exceptions:
     class MyException extends RuntimeException { }
     class MyEmptyCollectionException extends MyException { }

Note: by making MyException extend RuntimeException, you do not have to declare a method that throws this exception, nor must you use such methods inside a try/catch clause. That is the upside, but also the downside at the same time, as it “turns off” some exception handling at compile time, leading to a less-structured style of programming. In particular, see how MyStackInterface.pop() throws such an exception, yet it does not have to be declared in the interface or actual method.

Finally, here are the abbreviated interfaces that your methods should implement:

interface MyStackInterface<T> {
   public T push(T obj);
   public T pop(); // throws MyEmptyCollectionException
   public boolean isEmpty();
}

interface MyMapInterface<K,V> {
   // Maps keys of type K to values of type V
   public V put(K key, V value); // returns old value
   public V get(K key);
   public V remove(K key); // also returns old value
   public void clear();
   public Iterator<K> keyIterator();
}

interface MySetInterface<T> {
   // Maintains a set of objects of type T
   public boolean add(T object); // returns true if newly-added
   public boolean contains(T object);
   public boolean remove(T object); // returns true if it was present
   public void clear();
   public Iterator<T> iterator();
}

Here they are again without parameterization:

interface MyStackInterface {
   public Object push(Object obj);
   public Object pop(); // throws MyEmptyCollectionException
   public boolean isEmpty();
}

interface MyMapInterface {
   // Maps keys to values
   public Object put(Object key, Object value); // returns old value
   public Object get(Object key);
   public Object remove(Object key); // also returns old value
   public void clear();
   public Iterator keyIterator();
}

interface MySetInterface {
   // Maintains a set of objects
   public boolean add(Object object); // returns true if newly-added
   public boolean contains(Object object);
   public boolean remove(Object object); // returns true if it was present
   public void clear();
   public Iterator iterator();
}

a. MyStack<T> implements MyStackInterface<T>
Note: You should use a linked-list implementation here (though not java.util.LinkedList, as you cannot use any classes in java.util.*).

b. MyHashMap<K,V> implements MyMapInterface<K,V>, Iterable<K>
Note 1: This implementation must use a HashTable (of your own creation) and must provide O(1) operation both for put and get. The HashTable must use linear chaining (the buckets must be instances of linked lists, which again must be of your own creation and not from java.util), and you should double the size of the bucket array and rehash when the average bucket size exceeds 10. You should use the hashCode() method to obtain your hash values of the keys.

Note 2: Note that the method keyIterator does not occur in java.util.Map. In that version, you use getKeys() which returns a Set. Here, we simplified the process by returning an Iterator (and not, to be clear, a ListIterator), that iterates over all the key values in this map. In this implementation, the keys can occur in any order.

Note 3: If you want to write this class prior to our HashTable lecture next week, you can temporarily use java.util.HashMap to back this implementation. That class obviously does nearly all the work for you, and thus allows you to have a “working” MyHashMap for the rest of this assignment then to return later next week to replace java.util.HashMap with your own hashtable implementation.

c. MyHashSet<T> implements MySetInterface<T>
Note: this implementation must be backed by a MyHashMap, where you can use “null” (or any other value, really) as the mapped value, as here we are only really interested in the keys. As with MyHashMap, the iterator can iterate over the values in any order.

d. MyTreeMap<K,V> implements MyMapInterface<K,V>
Note: This implementation must use a Binary Search Tree (of your own creation) and must run in O(logn) average-case time both for put and get. (though O(n) worst-case time is acceptable for these methods) Also, the iterator returned by keyIterator must iterate over the keys in sorted order (that being the whole point of using a Binary Search Tree instead of a HashTable)..

e. MyTreeSet<T> implements MySetInterface<T>
Note: this implementation must be backed by a MyTreeMap, where you can use “null” (or any other value, really) as the mapped value, as here we are only really interested in the keys. As with MyTreeMap, the iterator must iterate over the values in sorted order.

4. Collection ADT Testers
Using the test framework from the previous homework, include a suite of tests for the Collection ADT classes implemented above in a class CollectionADTTester which extends the Tester class. You should test every boundary case for every method. Be sure to test each possible outcome of each conditional, and be sure to make each possible exception be thrown.

5.      MyPrefixDictionary
Here you will implement a specialized collection class that will be useful for writing the game of Boggle (see below):
interface MyPrefixDictionaryInterface {
   public void addWord(String word); // add the word and all its prefixes
   public boolean isPrefix(String word); // must be O(logn) or better
   public boolean isWord(String word); // must be O(logn) or better
}
As this API suggests, a Boggle engine needs to know not only whether or not a word is in the dictionary, but whether or not a word (or, more precisely, a sequence of letters) is a prefix of a word in the dictionary.

Note: the semantics of this interface assume that elements are not added twice to the underlying data structures. Adding a word that is already present has no effect, and adding a prefix that is already present also has no effect.

Here, you will implement this interface not once but twice, as follows:

a.      MyPrefixDictionary1 implements MyPrefixDictionaryInterface
In this version, you will have two private instance variables:
   private MyHashSet<String> prefixes;
   private MyHashSet<String> words;
For each call to addWord, you will add the word to this.words and you will add each of its proper prefixes (not including the word itself) to this.prefixes.

b.      MyPrefixDictionary2 implements MyPrefixDictionaryInterface
Here, you will implement the dictionary as an N-ary Tree (that is, a tree where each node can have many children, and not just two, as in a binary tree), where edges are labeled with characters and thus each path from the tree root to some node represents a prefix (or an entire word). To accomplish this, we will use specialized tree nodes:
   class MyPrefixTreeNode {
      public char edgeChar;
      public boolean isWord;
      public LinkedList<MyPrefixTreeNode> children;
   }
Again, the “edgeChar” is the character along that edge (except at the root, where the edgeChar is null, as the root represents the empty string). Seeing as a sequence of characters can be both a word and a prefix of another longer word, we need a way to distinguish these cases. Here it is:

A sequence of characters is a word if you can follow a path from the root of the tree described by that sequence of characters, and in the final node of that path the “isWord” boolean is true.

A sequence of characters is a proper prefix (of a longer word) if you can follow a path from the root of the tree described by that sequence of characters, and in the final node of the path the “children” list is non-empty.

6.      MyPrefixDictionary Tester
As is now the standard, write tester methods for the preceding classes in the class MyPrefixDictionaryTester. Also, analyze the memory usage and runtimes of the two classes, and include a comment in your tester code wherein you argue which of the two classes is superior and why it is so. You will use your preferred class in your implementation of Boggle (see below).

Note: you may wish to use the file “words.txt” (69,310 words in a 712kb file) as a source of words to load into your dictionary, with code such as this:

public void addWords() {
   Scanner scanner = null;
   try { scanner = new Scanner(new java.io.File("words.txt")); }
   catch (Exception e) {
         System.out.println("File not found.");
         System.exit(1);
   }
   while (scanner.hasNext()) {
         String word = scanner.next();
         if (word.length() < MAX_WORD_LENGTH)
               myDictionary.addWord(word);
   }
}

While it is probably best to set MAX_WORD_LENGTH to a large number (say, 15, or even 1500 (which is effectively infinity, of course!)), if you find that you are running out of memory when creating your dictionaries, you can test your code with shorter max-lengths (say, 5), which will significantly decrease the number of unique prefixes. Alternatively, you can simply use a subset of the words.txt file.

7.      Boggle Engine
Here, you will write a Boggle engine. This is the code that plays the game of Boggle, only without any user interface (that comes later). First, do this:

a) Read the Wikipedia page on Boggle;
Note that you will write a version of Boggle that has these restrictions:
     1) It is single-player only;
     2) It does not have a timer;
     3) It does not include the “qu” piece (but it does have a “q” piece)
Note that these restrictions are lifted in the bonus problems.

b) Play Boggle
Boggle applets are fairly easy to find on the web. Do a Google search for your favorite, or use this randomly-chosen Boggle applet (which, by the way, does not have the best user interface but is good enough to learn the rules). Play it for a while to get a better feeling for the game.

Now, onto writing the Boggle engine. Basically, the BoggleEngine class should expose all the methods a user interface might need to play a game of Boggle. These include:

public BoggleEngine(String dictionaryPath)
Construct a new BoggleEngine using the given path to the file containing the dictionary as a list of words one-per-line. Typically, this path will just be “words.txt”, which should be stored in the same directory as your Java files. Your constructor should create an instance of MyPrefixDictionary and load it with all the words from the given file and all their prefixes.

public char[][] newBoard()
This constructs a new board for a new game, and it returns the newly-constructed board as a two-dimensional array of chars. To do this, you must randomly place the 16 dice into the 4x4 board locations. To do this, you must place your dice into a Java.util.List of some kind, and then you must use the MyCollections.shuffle method you wrote above to randomize the order in the List. As for each individual die, you can use an instance of Random to choose which face of the die to use. Note that the 16 dice must have exactly these letters (each die has 6 faces, and so 6 letters, so here are the 16 dice’s letters):

mcouti estosi wootta eltytr eilrdx gneeaa wngeeh ffkspa
nhiumq oojbab ytdtis caosph lnhnrz edlyvr eesuni rwtevh

Important side-effect: When you create a newBoard, you should also find every legal word on the board, and store these in a TreeSet<String> of boardWords. This is best done by iterating over every board position as a starting position, then from each possible starting position calling a recursive depth-first search from that position. The depth-first search proceeds by checking if the word formed so far is in fact a word (of 3 or more letters) in the prefixDictionary, and if so adding it to boardWords, and then checking if the word formed so far is also a prefix in the prefixDictionary, and if so then recursively calling depth-first search on all the neighboring tiles (that is why we used a prefix dictionary! Be sure not to recurse if the word is not a prefix!). Note that you should also keep a two-dimensional array of booleans indicating whether or not a given tile has already been used in the word so far. Each time you use a tile, set the respective boolean to true, do the recursive search, then set it back to false and return. This will prevent tiles from being used twice in one word. Note: this algorithm is really the crux of our implementation of Boggle. While the general approach is laid out here, some details are left to you. You are expected to work out these details on your own.

public void newBoard(char[][] board)
This version of newBoard is strictly for testing purposes. Here, we send in a 4x4 array of char containing a board, and a new game is started using the given board. This call has the same side-effect as the previous newBoard method of creating the TreeSet<String> of boardWords.

public Iterator<String> getBoardWords()
Returns an Iterator over the TreeSet<String> of boardWords which were generated as a side-effect of the call to newBoard(). Iterating over this Iterator will produce, in alphabetical order, all the legal words that occur on the current board. Note that you should not use this method as a means of determining whether or not a word is in the dictionary. Use getPoints() for that. Instead, use this method if you wish to display all the legal words in your user interface, perhaps at the end of a round. The boardWords are a TreeSet rather than a HashSet so that this iterator will produce the words in alphabetical order, which is presumably preferable for your user interface.

public int getPoints(String word)
This returns the number of points the given word should score on the current board. If the word has fewer than 3 characters, it returns 0. Also, if the word does not occur (as a word, not a prefix) in the prefixDictionary, it returns 0. Otherwise, this returns the number of points Boggle awards words of this length, as such:

Letters   Points
   3        1
   4        1
   5        2
   6        3
   7        5
   8+      11

You should feel free to add additional methods as you desire, but you must at least implement the methods listed above exactly as described.

8. Boggle Engine Tester
As is now the standard, write tester methods for the preceding class in the class BoggleEngineTester.

9. Boggle Text-UI
Write a text-based Boggle game that runs in the Java console (using System.out calls for output and a Scanner for input). Of course, you should use an instance of your BoggleEngine to actually play the game. Your user interface should be clear and easy-to-use. While a text UI can not, in general, be as elegant and easy-to-use as a GUI, you are expected to make your text UI as elegant and easy-to-use as possible.

10. BONUS [5 pts] Parameterized Classes
As noted above, if you write the preceding code using parameterized classes, you will receive up to 5 points of bonus.

11. BONUS: [5 pts] Better Boggle
The version of Boggle above has the following 3 restrictions:
     1) It is single-player only;
     2) It does not have a timer;
     3) It does not include the “qu” piece.
Remove each of these restrictions here. For the “qu” piece, you are to treat the “q” piece as though it is “qu” – display it using both letters in a single cell, and when it is used in words, treat it as two letters, not one.

12. BONUS: [10 pts] Boggle GUI
Write a GUI-based version of Boggle that also uses your BoggleEngine from above. The grading for this GUI will be similar to the grading for your Tetris GUI’s (see that assignment for more details). For the full 10 points, you should extend BetterBoggle (see previous problem), and your GUI must be very elegant and include a number of clever design features. Be sure to list all your GUI’s features in the header file of your GUI class so they will not be missed by the grader!

13. BONUS: [10 pts] 3D Boggle
Write a 3d version of Boggle that uses a 3x3x3 array. You do not need to have a true 3d display, but can instead show three 2d panels, say slices from top-to-bottom. It would be nice (and worth more of the 10 possible points) if you could slant these back a bit (using Java2D translations – check out Graphics2D for more details), or otherwise come up with a clever way to portray the board. In any case, you would need to adapt your BoggleEngine code so that it works over a 3d 3x3x3 board in order to get this working.

14. BONUS: [5 pts] Huffman Encoding
Review the class notes on Huffman Encoding, and read what online materials you need to further understand it. Then, write a program that asks whether to encode or decode a file. If the response is to encode a file, the program reads in the text from the file “Huffman1.txt” (into which you may place whatever sample text you wish), then constructs the prefix tree according to the Huffman algorithm. Next, create the file “Huffman2.txt” that contains the variable-length code in this form:
first line: the number of encoded letters N
next N lines: the ASCII of encoded letter, a single space, the variable-length code of that letter.
After that, create the file “Huffman3.bin” [not “.txt”, as this file is binary, not text data] that contains the encoding of the raw data (Huffman1.txt) using the code in Huffman2.txt. Note that the data must be stored in an array of bytes, and that you will use an ObjectOutputStream to write the data. You will have to do a little research to figure out how this works. In any case, do not store this file as an ASCII file, but as a binary file, otherwise you will not get the compression you seek. Finally, print out a report showing the size of the input file, the size of the output file, and the percentage of the compression.

Now, if the user instead opts to decode a file at the start of the program, your program should read in the code stored in Huffman2.txt, generate the prefix tree from this, then read in the data (using an ObjectInputStream) from Huffman3.bin, and decode the data, producing the file Huffman4.txt. Finally, your program should compare the files Huffman1.txt and Huffman4.txt to confirm that they are the same: that is, that we compressed and then decompressed the data without any loss of information. Cool!

15. BONUS: [25 pts] Network Boggle
If anyone finishes all of the preceding problems (including all the bonus), and generally at an “A” level of work, before class on April 18^th, and is interested in writing a networked version (of some kind, be it TCP/IP or a servlet or some other approach), they must contact me before class that day and arrange a time to speak with me after class that day about the project. Note that the start of class on April 18^th, is the latest time you can place such a request, but you can certainly do so earlier (assuming you have completed everything else first) and get a bigger jump on this substantial bonus opportunity.

16. EVEN MORE BONUS: [? pts]
If anyone finishes everything including Network Boggle before class on April 18^th, then we can arrange for yet another bonus project on this assignment. Discuss with me as appropriate.