YHSCS :: AP Computer Science :: Lessons

AP Computer Science :: Lessons :: Sorting Methods

Chapter 5:
Pages 271 - 272
Chapter 12
Pages 524 - 540

Fundamentals of Java
Chapter 11
Pages 415 - 421
Chapter 12
Pages 468 - 474, 478 - 485

Big-O Notation

Big-O notation is a quantitative way of describing the run time or memory management of an algorithm. If n is the number of elements to be processed a given algorithm's number of comparisons, exchanges, data movements, and primitive operations can be expressed in terms of a function of n. The type of function determines the order of the algorithm. An exponential function would be order 2ⁿ, or O(2ⁿ). The most common cases are listed below.

Function Type	Big-O Type
Constant	O(1)
Logarithmic	O(log n)
Linear	O(n)
Quadratic	O(n²)
Cubic	O(n³)
Exponential	O(2ⁿ)

Below is an example of a constant algorithm.

public boolean isFirstNull(String[] myString) 
{ 	
  if(myString[0] == null) 
    return true;
  return false; 
}

The algorithm is constant because it takes the same number of steps no matter how many elements are in myString.

This next example is of a linear algorithm.

public boolean Contains(String[] myString, String value)
{
  for(int i=0; i< myString.Length; i++)
  {
     if(myString[i] == value)
       return true;
  } 	
  return false;
}

A linear algorithm grows in direct proportion to the size on the input data. In the worst case, the above algorithm would have to look through every element of the myString array so it takes n cycles of the for loop to complete the method where n is the size of the array.

The next example is a quadratic algorithm.

public boolean ContainsDuplicates(String[] myString)
{
   for(int i=0; i < myString.Length; i++) 	
   {
     for(int j=0; j < myString.Length; j++)
     {
       if(i==j) // Don't compare with self 			
         continue; 			
       if(myString[i] == myString[j])
         return true; 			
     } 	
   }
   return false;
}

The algorithm above has two nested loops, which causes it to perform operations proportional to the square of the number of items in myString. A cubic algorithm can be caused by another nested loop.

An exponential algorithm is a situation you want to avoid because it cannot be used for large data sets. An exponential algorithm will eventually run out of memory since the processing requirements expand so rapidly.

Selection Sort

The selection sort method has a best case runtime of O(n²) and a worst case runtime of O(n²). Also known as a "search-and-swap" algorithm it is one of the easiest sort algorithms to implement. First, the smallest element in an array called a is found and swapped with a[0]. The next smallest element is then found and swapped with a[1]. This continues until just two elements remain and the larger element is placed in a[size - 1]. Below is a sample array showing the result of each sort.

8 1 9 2     Original Array
1 8 9 2     After First Pass
1 2 9 8     After Second Pass
1 2 8 9     After Third Pass

Selection sort always makes the same number of comparisons, making it one of the slowest sorting algorithms. Below is the code for a selection sort of integers in Java:

int[] a;
          
public void selectionSort()
{
  int maxPos, max;
          
  for (int i=0; i<a.length-1; i++)
  {
    max = a[i];
    maxPos = i;
    for (int j=i+1; j<a.length; j++)
    if (max < a[j])
    {
      max = a[j];
      maxPos = j;
    }
    swap(i, maxPos);
  }
}
          
public void swap(int i, int j)
{
  int temp = a[i];
  a[i] = a[j];
  a[j] = temp;
}

View the Selection Sort Gypsy Folk Dance

Insertion Sort

The insertion sort method has a best case runtime of O(n) and a worst case runtime of O(n²). The method moves elements from an unsorted portion of an array to a sorted portion. The array below has elements that are in the sorted portion of the array in red and the next element to be sorted is blue.

8 1 4 6     Original Array
1 8 4 6     After First Pass
1 4 8 6     After Second Pass
1 4 6 8     After Third Pass

Insertion sort is a little more efficient than selection sort if an array is almost sorted, but for an array that is in reverse sort order it will be just as bad. Selection and insertion sort are both inefficient for large arrays. Below is an implementation of insertion sort in Java.

int[] a;
  
public void insertionSort()
{
  int itemToInsert, j;
  boolean stillLooking;  
  
  for (int i=1; i<a.length; i++)
  {
    itemToInsert = a[i];
    j = i - 1;
    stillLooking = true;
    
    while ((j > 0) && stillLooking)
      if (itemToInsert < a[j])
      {
        a[j + 1] = a[j];
        j--;
      }
      else
        stillLooking = false;
    a[j + 1] = itemToInsert;
  }
}

View the Insertion Sort Romanian Folk Dance

Mergesort

The mergesort method always has a runtime of O(n log n), which makes it very efficient for large arrays. It splits an array in half and then sorts both halves, merging them at the end. It works recursively so each half is split until only one element remains, which is when the merging begins. The image to the right shows an array broken into subarrays and then the resulting merge.

Mergesort is more efficient than the non-recursive sort methods, but it has the disadvantage of creating temporary arrays, which could be a major problem. Below is the Java code for mergesort.

int a[];
  
public void mergeSort()
{
  int[] copy = new int[a.length];
  mergeSorter(a, copy, 0, a.length-1);
}

public void mergeSorter(int[] a, int[] copy, int low, int high)
{
  if (low < high)
  {
    int middle = (low + high) / 2;
    mergeSorter(a, copy, low, middle);
    mergeSorter(a, copy, middle + 1, high);
    merge(a, copy, low, middle, high);
  }
}

public void merge(int[] a, int[] copy, int low, int middle, int high)
{
  int i1 = low, i2 = middle + 1;

  for(int i=low; i<=high; i++)
  {
    if (i1 > middle)
      copy[i] = a[i2++];
    else if (i2 > high)
      copy[i] = a[i1++];
    else if (a[i1] < a[i2])
      copy[i] = a[i1++];
    else
      copy[i] = a[i2++];
  }

  for (int i=low; i<=high; i++)
    a[i] = copy[i];
}

View the Mergesort German Folk Dance