Sorting Arrays: Algorithms and Efficiency

Contents

  1. Introduction to Sorting
  2. Selection Sort
  3. Complexity of Sorting Algorithms
  4. Bonus Material: More Efficient Sorting with Quicksort
  5. Sorting Algorithms: Summary

1. Introduction to Sorting

Sorting is the process of arranging the items of a list into a particular order. Typically, sorting uses the numerical values of list items and arranges them into increasing order. All of the examples in these notes will sort arrays of integers into increasing numerical order. Thus, when sorting an array sortArray, the following condition should be true after the sorting operation is complete:
sortArray[0]<=sortArray[1]<=sortArray[2]<=...<=sortArray[sortArray.length-1]

For the sake of simplicity and clarity, all of the algorithms and Java methods presented here assume this particular sort. It would, however, be quite straightforward to adapt them, to sort into descending order, to sort double values, or to sort String values into alphabetic (or lexicographic) order.

These notes make one other simplifying assumption: that the array to be sorted is full. Prior to sorting, the array has been initialized so that every location, from sortArray[0] to sortArray[sortArray.length - 1] contains an appropriate integer value. Again, it would be relatively straightforward to relax this assumption, and include bounds on the portion of the array to be sorted in the algorithms.

In the past, devising sorting algorithms and analysing their relative efficiencies was a significant concern for computer scientists. As a result, many different sorting algorithms were developed. The discussion that follows considers just two of these: Selection Sort and Quicksort. Note that only Selection Sort is directly examinable material. Quicksort illustrates a contrasting approach to the sorting task, but is not examinable material for this course.

Each of these sorting algorithms has been implemented as a Java method. The selectionSort method has a single parameter - the array of int values to be sorted. The array is assumed to be full. A simple modification of the method would include one or two int parameters to specify bounds for the portion of the partially filled array to sort. The quickSort method does include, in addition to the array to be sorted, int parameters for the left and right bounds of the current portion of the array to sort.

The Sorters.java program may be used to demonstrate the algorithms. It includes the definitions of selectionSort and quickSort methods, plus the declaration and initialization of arrays, and a main method that calls the sorting methods. The program also includes a swap method to accomplish the exchanges of array location contents in the selectionSort and quickSort methods. Finally, the program includes a simple method for displaying the contents of an array.
Back to Top


2. Selection Sort

The first sorting algorithm is Selection Sort. As one of the easiest sorting techniques to understand and implement, it is a good starting point. Many applications require no more complicated or efficient approach. It will turn out to be roughly equivalent in efficiency to several other sorting algorithms.

Algorithm

The general approach of Selection Sort is to find one value and put it in its final place in the array, repeating the process for each of the other values. Here's a pseudocode description of the general algorithm:
//assume an array, a, containing integers, not sorted

for(int i = 0; i < a.length - 1; i++)

  put ith smallest remaining element in a[i]

Here's a somewhat more detailed description of the Selection Sort algorithm.
  1. scan the array to locate the smallest value
  2. put it in the first position
  3. scan the remainder of the array to find the next smallest value
  4. put it in the second position
  5. repeat the process for successive values until all are properly placed

Example

Here's an example of applying the algorithm to a sample array of integer values.
  original:     3   9   6   1   2

                3   9   6   1   2 --> smallest is 1

  swap 1&3 -->  1   9   6   3   2 --> smallest is 2

  swap 2&9 -->  1   2   6   3   9 --> smallest is 3

  swap 3&6 -->  1   2   3   6   9 --> smallest is 6

  swap 6&6 -->  1   2   3   6   9

  result:       1   2   3   6   9 

Back to Top

3. Complexity of Sorting Algorithms

To compare sorting algorithms, we need to adopt some way of measuring their performance. The rest of this discussion uses the term "metric" to mean a measure of algorithm performance. Historically, computer scientists have used both space (or memory requirements), and time (or the number of computational operations) as metrics for comparing different algorithms for performing a particular task. Since the algorithms discussed here sort an array "in place", without the need for additional data structures, comparisons between them focus on the number of computational steps they require to complete the sorting task. Moving elements of an array to re-arrange their order involves swapping locations, so that the number of swaps is a candidate metric. However, even more central to the sorting operation is the comparison of one array element to another to determine whether they are in the proper order relationship. Thus we adopt the number of comparisons between pairs of array elements as our metric for describing the complexity (or efficiency) of different sorting algorithms.

Having adopted a metric, we go one step further, and analyse an algorithm to develop a simple mathematical expression as a summary description of its performance in terms of that metric. The term "Computational Complexity" is used both to refer to this type of analysis and the resulting description of algorithm performance.

Selection Sort is one of a class of sorting algorithms (including, for example, Insertion Sort and Bubble Sort) that are all similar in computational complexity (or efficiency, if you like). They have outer loops that range over all array elements i, and inner loops that range over much of the array (for Selection Sort, from i to the end). That results in approximately n2 comparisons between pairs of elements in order to sort an array of length n. Thus, these algorithms are said to have order n2, or O(n2), complexity. Other sorting algorithms may be more efficient. Quicksort, for example, has O(nlog2n) complexity.
Back to Top


4. Bonus Material: More Efficient Sorting with Quicksort

Here is a brief description of a more efficient array sorting algorithm. A method that implements Quicksort is included in the Sorters.java program.

Quicksort relies on the observation that it would be preferable to make exchanges of array elements (swaps) over large distances within the array. In addition, it employs a form of "divide and conquer" strategy, partitioning the array to be sorted and repeatedly applying the same operations to the resulting (smaller) partitions. The algorithm involves picking a value to act as a partitioning, or pivot, value. Ideally, this would be the median, or the value that divides the list of sorted array elements in half. Since finding the median requires that the array already be sorted, some estimate of it must be used. A crude way to get an estimate, and the one used in the algorithm as it is implemented here, is to just pick the middle value of the (unsorted) array. Then the array is scanned beginning from opposite ends, and elements are swapped so that all those in one part of the array are greater than or equal to the partitioning value and all those in the other part are less than or equal to the partitioning value. By repeating this process on each partition (until the "partition" is just a single item), the algorithm accomplishes sorting.

As you might surmise from this description, the Quicksort algorithm is most easily implemented using a recursive method definition. This means that the method definition includes calls to itself. To be successful, these recursive calls must be made with reduced versions of the problem, so that the recursion process eventually ends.

There are nasty cases for which Quicksort becomes "Slowsort", with complexity of O(n2). Consider, for example, what happens if the value picked each time as the partitioning value happens to be the largest, or smallest, value in the array. In general, however, it has performance roughly O(nlog2n). Note that this is substantially better than Selection Sort.

Example

Here is an example of applying the algorithm to the same array as in the previous example.
  original:  3   9   6   1   2 --> partitionVal is 6

             3   2   6   1   9 --> swap 9 & 2

             3   2   1   6   9 --> swap 6 & 1

  sort l, j  3   2   1         --> partitionVal is 2

             1   2   3         --> swap 3 & 1

  sort i, r              6   9 --> partitionVal is 6 

  result:    1   2   3   6   9

Back to Top

5. Sorting: Summary

  1. To simplify our discussion of sorting algorithms, we made several assumptions. We specified that the arrays to be sorted were filled with int values, and that the sorts were into increasing numerical order.
  2. We presented one of a class of sorting algorithms with O(n2) complexity: Selection Sort.
  3. For illustrative purposes only, we presented an alternative array sorting algorithm with significantly better (O(nlog2n)) complexity: Quicksort.
Back to Contents
Originally Developed for CISC-101 & APSC-142
Revised by Jim Rodger
Last Updated: Fall 2003