Sorting algorithmsHeapsort 
With its time complexity of O(n log(n)) heapsort is optimal. It utilizes a special data structure called heap. This data structure is explained in the following.
Definition: Let T = (V, E) an almost complete binary tree with a vertex labelling a : V M that assigns to each vertex u a label a(u) from an ordered set (M, ).
A vertex u V has the heap property if it has no direct descendant with a greater label, i.e.
v V : (u, v) E a(u) a(v)
T is a heap if all vertices have the heap property, i.e.
(u, v) E : a(u) a(v)
We call T a semiheap if all vertices except possibly the root r have the heap property, i.e.
(u, v) E, ur : a(u) a(v)
Example:
 
Figure 1: Heap with n = 10 vertices  
Observe that each leaf automatically has the heap property regardless of its label, since it has no descendants.
The data structure of the heapsort algorithm is a heap. The data sequence to be sorted is stored as the labels of the binary tree. As shown later, in the implementation no pointer structures are necessary to represent the tree, since an almost complete binary tree can be efficently stored in an array.
The following description of heapsort refers to Figure 2 (a)  (e).
 
 
Figure 2: Retrieving the maximum element and restoring the heap  
If the sequence to be sorted is arranged as a heap, the greatest element of the sequence can be retrieved immediately from the root (a). In order to get the nextgreatest element, the rest of the elements have to be rearranged as a heap.
The rearrangement is done in the following way:
Let b be a leaf of maximum depth. Write the label of b to the root and delete leaf b (b). Now the tree is a semiheap, since the root possibly has lost its heap property.
Making a heap from a semiheap is simple:
Do nothing if the root already has the heap property, otherwise exchange its label with the maximum label of its direct descendants (c). Let this descendant be v. Now possibly v has lost its heap property. Proceed further with v, i.e. make a heap from the semiheap rooted at v (d). This process stops when a vertex is reached that has the heap property (e). Eventually this is the case at a leaf.
Making a heap from a semiheap can conceptionally be implemented by the following procedure downheap:
procedure downheap (v)  
Input:  semiheap with root v 
Output:  heap (by rearranging the vertex labels) 
Method: 

Procedure downheap can be used to build a heap from an arbitrarily labelled tree. By proceeding bottomup, downheap is called for all subtrees rooted at inner vertices. Since leaves are already heaps they may be omitted.
procedure buildheap  
Input:  almost complete binary tree T of depth d(T) with vertex labelling a 
Output:  heap (by rearranging the vertex labels) 
Method: 

A call of buildheap is the first step of procedure heapsort, which can now be written down as follows:
procedure heapsort  
Input:  almost complete binary tree with root r and vertex labelling a 
Output:  vertex labels in descending order 
Method: 

This simulation illustrates the process of building a heap from an arbitrarily labelled tree.
An almost complete binary tree with n vertices has a depth of at most log(n). Therefore, procedure downheap requires at most log(n) steps. Procedure buildheap calls downheap for each vertex, therefore it requires at most n·log(n) steps. Heapsort calls buildheap once; then it calls downheap for each vertex, together it requires at most 2·n·log(n) steps.
Thus, the time complexity of heapsort is T(n) O(n·log(n)). The algorithm is optimal, since the lower bound of the sorting problem is attained.
An almost complete binary tree with n vertices and vertex labelling a can be stored most efficiently in an array a:
All vertices at positions 0, ..., n/21 are inner nodes, all vertices n/2, ..., n1 are leaves (integer division / ). The heap with n = 10 vertices of Figure 1 is shown in Figure 3 as an example.
 
Figure 3: Array representation of the heap of Figure 1  
Using this array representation of the heap, heapsort can be implemented as an inplace sorting algorithm. In each step of heapsort, the root label a(r) is not output but stored at the position of the leaf b that is deleted in the following. Deleting leaf b means to consider just the array elements left of b as the remaining heap.
In other words: the four steps
are replaced by an exchange of the root label and the label of b:
The following Java class HeapSorter encapsulates the functions downheap, buildheap and heapsort. In order to sort an array b, Heapsort is called with the statement HeapSorter.sort(b).
public class HeapSorter { private static int[] a; private static int n; public static void sort(int[] a0) { a=a0; n=a.length; heapsort(); } private static void heapsort() { buildheap(); while (n>1) { n; exchange (0, n); downheap (0); } } private static void buildheap() { for (int v=n/21; v>=0; v) downheap (v); } private static void downheap(int v) { int w=2*v+1; // first descendant of v while (w<n) { if (w+1<n) // is there a second descendant? if (a[w+1]>a[w]) w++; // w is the descendant of v with maximum label if (a[v]>=a[w]) return; // v has heap property // otherwise exchange(v, w); // exchange labels of v and w v=w; // continue w=2*v+1; } } private static void exchange(int i, int j) { int t=a[i]; a[i]=a[j]; a[j]=t; } } // end class HeapSorter 
With its time complexity of O(n log(n)) heapsort is optimal. Unlike mergesort, heapsort requires no extra space. On the other hand, heapsort is not stable. A sorting algorithm is stable, if it leaves the order of equal elements unchanged.
The heap data structure can also be used for an efficient implementation of a priority queue. A priority queue is an abstract list data structure with the operations insert and extractMax. With each element in the list a certain priority is associated. By insert an element together with its priority is inserted into the list. By extractMax the element with the highest priority is extracted from the list. Using a heap, both operations can be implemented in time O(log(n)).
Heapsort was originally published by J.W.J. Williams as Algorithm 232 in the journal Communications of the ACM [Wil 64].
[AHU 74]  A.V. Aho, J.E. Hopcroft, J.D. Ullman: The Design and Analysis of Computer Algorithms. AddisonWesley (1974)  
[CLRS 01]  T.H. Cormen, C.E. Leiserson, R.L. Rivest, C. Stein: Introduction to Algorithms. 2nd edition, The MIT Press (2001)  
[Sed 88]  R. Sedgewick: Algorithms. 2nd edition, AddisonWesley (1988)  
[Wil 64]  J.W.J. Williams: Algorithm 232: Heapsort. Communications of the ACM, 7, 6, 347348 (1964) 
Next: [Mergesort] or 
H.W. Lang FH Flensburg lang@fhflensburg.de Impressum © Created: 25.02.1997 Updated: 18.05.2010 