ELEC 372
Engineering Software Structures

Assignment 1: A Student Data Base

Solution



The purpose of this assignment was to expose and practice the process of understanding and modifying modules within an existing code base (application) using the object-orientation paradigm and with attention to the software architecture.  The design is given in terms of interfaces and implementations, represented by Java interfaces and an implementing class hierarchy, together with some sample 'client' classes.  You were to change only the implementation of the interfaces StudentList and CourseList and thereby guarantee the correct behaviour of the rest of the system, due to its modular architecture.

The maintenance task given was to change the list representation strategy from arrays (as given) to linked lists.  As it happens, the given application's architecture gives a strong hint on how to proceed.  The array strategy is almost invisible outside the base class ListImpl.  For this reason, by far the easiest kind of solution is to replace the internals of ListImpl with a linked-list version.  The definitions of the subclasses CourseListImpl and StudentListImpl can remain almost unchanged.  If the change is carried out correctly, the rest of the application remains unchanged, too.

The linked list implementation requires a new class (I called it Node) to represent nodes of a linked list.  It is very simple, because its only purpose is to be a target for Java's dynamic memory allocation, which is only available through new expressions (and therefore requires a class definition).

Your submission is given a mark out ot 100, which will be reduced to a mark out of 10 when it contributes to the marking scheme.

Note on the Relationship Between Interfaces and Inheritance

It is the intention of the given program to exhibit a software architecture in which classes interact across interface boundaries, so that the 'public users' of a class
see the class only as the implementor of an interface.  By this means the users accept a 'view' of the class's functionality, but gain the advantage of freedom from the implementation details.  A similar advantage is gained by using inheritance (extends): then, the user can exercise the functionality of a subclass by means of the view provided by the superclass.  However, the interface method in this assignment is more flexible, because completely unrelated classes can all implement the same interface and therefore can all be used at once with the same 'view' in a client.  Subclasses, on the other hand, all share code with the base class, and therefore are much more tightly related on to another.

The disadvantage of this interface technique is that object creation is more complicated.  In Java, a new instance of Class is created by evaluating new Class.  There is no other way to create a new instance.  But an interface cannot be instantiated using new because interfaces are 'abstract', that is, they are a view of classes, not a kind of class in their own right.    What this means is that either

  1. users of the interface must know the name(s) of particular classes which implement them: this is easy and convenient but lacks flexibility in the design of implementing classes; or
  2. interfaces have to come in pairs, one of which being the interface of the class (e.g. CourseList) and the other describing a class whose job is creating and managing instances of the class (e.g. CourseListManager).  This second interface would specify a method like createCourseList to abstract the creation of a new instance, and also other useful things such as keeping track of different concrete classes which are known to implement the original interface.  These manager classes are sometimes called factories.  More on such design styles will be covered in this course starting in Week 6.
Unsurprisingly, the assignment code uses the first option above, because the second would be serious overengineering for the situation.  For this reason the client class studentDB, in the assignment , contains numerous explicit references to CourseListImpl and StudentListImpl.

The principal advantage of the linked-list strategy is that it allows memory allocation to be more flexible: if there are many lists, each only occupies as much as necessary; and if a list needs to be very large, then only the memory available to the Java engine limits it.  By contrast, the array strategy requires a fixed amount of space per list, as much as the maximum required: if most lists are much smaller than the maximum, this is inefficient use of space.

On the other hand, retrieval from a linked list, by element number, requires traversal from the beginning of the list, whereas retrieval from the array requires only indexing.  If there are many retrievals from long lists then the added time will be noticeable for the linked-list strategy.
 

  1. [25] UML diagram, see below.
  2. [60]  The new implementation consists of slightly modified versions of StudentListImpl and CourseListImpl, a completely revised (internally!) version of ListImpl, and a new class Node.   The principal advantage of the linked-list strategy is that it allows memory allocation to be more flexible: if there are many lists, each only occupies as much as necessary; and if a list needs to be very large, then only the memory available to the Java engine limits it.  By contrast, the array strategy requires a fixed amount of space per list, as much as the maximum required: if most lists are much smaller than the maximum, this is inefficient use of space.
  3. [0] I found one error, and one anomaly.   The error was: it dies with an exception condition when an input line contains no tokens.   This was handled by testing input more carefully in the class StudentDB.   The anomaly was: the method ListImpl. addElement had a second parameter sentinel, which is not used, and not needed for either the array or the linked list strategy.  I removed it and updated the subclasses accordingly.  The method was protected, and hence this didn't change the implementation of the interfaces.  The program also dies horribly if presented with a
  4. [15] Testing.   Some sample tests are
    1. sample.txt, the given sample input
    2. long.txt, with lots of lines (shows that there is no longer an upper limit on input size)
    3. zero.txt, with no lines (showing that there is no lower limit on input size)
    4. space.txt, with an  empty line (showing bug fix)
Some remarks on the given solutions:

UML Diagram  This part of the lab was poorly done.  Very few people seemed to understand the relationship between their code and the UML. In fact several people turned in the same diagram that was supplied with the lab. If you are showing attributes in the class box then showing role names as labels on associations is optional.  However it is an error to supply an erroneous name.  Many people correctly added a Node class while removing the original object class, however they left the original role name and multiplicity (elements 0..*) on the Node link.   The purpose of UML is to reveal the structure of a design or implementation.

Discussion  Some people stated that a linked list was always more efficient with respect to memory use.  However they failed to realize that each node in a linked list  stores one or possibly 2 pointers to other nodes  (if you define an inner class, that is, a class defined and used locally in a scope, and then you refer to the outer class members from within the inner class, you pay at least one more pointer). There were various incorrect statements about the complexity of various operations:

  • Insertion: Adding a new element to the end of a singly linked list is O(n), for a doubly linked list it is O(1).
  • Searching vs lookup: A linear search for either an array or a linked lists is O(n), wheras a lookup based on an index is O(1) for an array
  • and O(n) for a linked list.
  • Deletion:  Removing an element from an array (the lab's implementation) is O(n) because you have to reorganize the array.  Removing an element from a linked list is O(n) because you have to traverse the list to find the element to be removed.

  • Testing In addition to the test cases supplied with the lab and mentioned above you could also test for:

  • undergraduate with a thesis
  • postgraduate enrolled in courses
  • mangled inputs, by this I mean incorrectly formatted text from the input file

  • Code The isFull() method is interesting.  The easiest thing to do was to leave there but change it to just return null, meaning the linked list is never full.  It is also permissable to remove it completely since it was a private method (and therefore not part of the interface).  Had this method been a part of the interface then it is obligatory  to supply it in order not to 'break' the interface.

    Lots of solutions were basically right operationally, but completely failed to explain what they were doing.  In particular, when a linked list is represented with one "extra" node to make traversal easier, it definitely requires a comment.  In fact, for any abstract data type there should be a block of comments at the beginning clearly laying ouf how the abstract type (identified with the public operations' behaviour) is represented internally by data structures.  This necessity was almost universally ignored.

    If you are going to use meaningful names such as next or head then do so. However it is better to use some totally cryptic name rather than a misleading one.  For example if you have a reference to a single element call it element not elements. If you have a class that is a Node then call it a Node, Item, Element, etc.  Do not call it a Fred or LinkedListImpl, that's not what it is.

    This diagram was made by one of the TAs, Brad Graham.  Note that the roles played by Node and Object are shown clearly, and the cardinality is indicated diagramatically using open-diamonds and arrows, in UML style.