Assignment 2: Lazy Auto-Sorting Collection

Description

In this assignment, you will write a generic component (a lazily ordered collection, hereafter known as LazyColl). The goal is to make this a reusable component suitable for inclusion in a local library. The LazyColl should fit into the Java Collections framework by extending AbstractCollection ; this means that you also automatically implement the Collection interface, since AbstractCollection implements it.  

Aside: The reason for AbstractCollection's existence is to make it easy to create new kinds of collections through extension; it provides a framework with a lot of default logic for building a collection class.  The minimum the subclass needs to do is provide an iterator(), size(), and add() methods; AbstractCollection can implement all other Collection methods by delegating to these three.

The motivation for creating LazyColl is to have a container that automatically sorts the objects it contains and provides for efficient lookup of individual objects through a contains() method (it could easily be extended to provide a find() method as well). In this respect it is much like the TreeSet class, with the following differences: (1) duplicate objects (two objects comparing equal() to each other) are allowed; and (2) internally, it does not maintain sorted order on every addition, only when someone asks to look at the data. Clients aren't generally aware of (2), though -- the class hides this design decision.

The LazyColl class is intended to be efficient when used by client programs which have different phases: Adding  many individual pieces of data, then doing many queries against that data. An example of such a client would be a program that reads a set of records from a file, then performs queries against the records. This collection is not very efficient if a client alternates between appending and reading many times.

See http://java.sun.com/products/jdk/1.2/docs/api/java/util/AbstractCollection.html.

For additional background on Collections, see:

An in-depth introuduction to the Collections framework: http://developer.java.sun.com/developer/onlineTraining/collections/Collection.html

A design FAQ: http://java.sun.com/products/jdk/1.2/docs/guide/collections/designfaq.html

Implementation

Suggestions: Implement the LazyColl with an ArrayList internally. Keep a flag _isSorted, initialized to true. When an client appends new elements by calling add(), delegate this to the ArrayList, and also set _isSorted to false. When any call is made that cares about the order of objects (for example, iterator()), check _isSorted; if it's false, first sort the objects in the internal ArrayList and then set _isSorted to true. Use Collections.sort() to sort the elements. The goal is that clients shouldn't be aware of the sorting taking place behind the scenes -- things should just work.  All of this should be well-factored so that there isn't a lot of repeated code in your methods.

You will need to override size(), iterator(), and add() in AbstractCollection in order to insert your checks; however, the actual implementation of most can be delegated back to AbstractCollection. You can avoid writing your own Iterator subclass by returning your internal ArrayList's iterator from iterator().

For the method contains(), you should use a binary search to find the element rather than the linear search provided by AbstractCollection. NOTE:  LazyColl does lookups using by-value rather than by-reference; this means that the elements inside a LazyColl must implement the Comparable interface.  It also means that, to maintain the postcondition provided by Collection, the elements' equals() method must be compatible with their compareTo() method.  To get around this restriction, see the extra credit below.  

Using binary serach turns contains() from an O(n) time operation to an O(log n) time operation, assuming the collection is already sorted (which the rest of your methods should ensure). You should use the Collections.binarySearch() method to perform the search. Be sure that the sorting method is O(n log n), and that contains() is O(log n) in the case where no sorting is needed.

Document all of your preconditions, postconditions, and invariants. Check your preconditions and invariants where possible. If you have any class invariants, provide a checkOK() method which throws a RuntimeException if any invariant is broken, but otherwise does nothing; use this in your tests.

Test your code with a series of unit tests using the Integer class and Strings. Be sure to test edge cases (what are they?).  Include your test code with your class.

Tests

Your code should be able to pass (at least) the following test suite:

    /**
     * Does tests for a LazyColl collection of String objects.
     * @param c LazyColl to test
     * @throws RuntimeException If a test fails.
     */
    private static void stringTests(LazyColl c) {      
        c.checkOK();
        assertTrue(c.isEmpty());
        assertTrue(c.add("Sneezy"));
        c.checkOK();
        assertTrue(c.add("Dopey"));
        assertTrue(c.add("Grumpy"));
        assertTrue(!c.isEmpty());
        Iterator i = c.iterator();
        c.checkOK();
        assertTrue(i.hasNext());
        assertTrue(c.size()==3);
        assertTrue(c.remove("Dopey"));
        c.checkOK();
        assertTrue(c.size()==2);
        assertTrue(c.contains("Grumpy"));
    }

    /**
     * Does tests for a LazyColl collection of Integer objects.
     * @param c LazyColl to test
     * @throws RuntimeException If a test fails.
     */
    private static void basicTests(LazyColl c) {
        ArrayList testData = new ArrayList();
        testData.add(new Integer(50));
        testData.add(new Integer(48));
        testData.add(new Integer(55));
        testData.add(new Integer(54));
       
        c.checkOK();
        assertTrue(c.isEmpty());
        assertTrue(c.add(new Integer(5)));
        c.checkOK();
        assertTrue(c.add(new Integer(3)));
        assertTrue(c.add(new Integer(4)));
        assertTrue(!c.isEmpty());
        Iterator i = c.iterator();
        c.checkOK();
        assertTrue(i.hasNext());
        assertTrue(c.size()==3);
        assertTrue(c.remove(new Integer(4)));
        c.checkOK();
        assertTrue(c.size()==2);
        assertTrue(c.contains(new Integer(5));

        LazyColl c2 = new LazyColl(testData);
        c2.checkOK();
        c2.addAll(testData);
        assertTrue(c2.size() == 2*testData.size());
        assertTrue(c2.remove(new Integer(54)));
        assertTrue(c2.contains(new Integer(55)));
        c2.checkOK();
        c2.removeAll(testData);
        assertTrue(c2.isEmpty());
    }

    /**
     * Checks condition and throws RuntimeException
     * if the assertion fails.
     * @throws RuntimeException On assertion failure.
     */
    private static void assertTrue(boolean cond) {
        if (!cond) {
            throw new RuntimeException("Assertion failed!");
        }
    }

Extra Credit (1pt)

Add the capability to pass a Comparator object to the LazyColl constructor. Use the Comparator for all sorting and searching.

Deliverables

Commented source code for both the class and the unit tests, and output of the unit tests. No class diagram is needed for this assignment, since you are implementing a single class.

If you want to provide additional design documents -- for example, scenario diagrams for class interactions, or a state chart -- feel free to do so.

Be sure to comment your code appropriately, describing the general responsibilities of a class in a class header, and documenting the public methods. Comments within method bodies chould describe why the code is doing something, not what it's doing.  Use JavaDoc format for all public comments; see http://java.sun.com/j2se/javadoc/writingdoccomments/index.html for a detailed reference on the javadoc format.  You can run the javadoc tool on your source code to generate nice HTML documentation, but you need not turn this in.

I would prefer an email submission to jpanzer@acm.org; you can also copy jpanzer@netscape.com to ensure that it goes through. For diagrams, gif images are the best format. A paper submission is fine too, but you won't get as much feedback.