The Heapsort algorithm belongs to the family of efficient sorting algorithms, whose average complexity is the best that we can find **O**(*n *log(*n*)).

## Review

Name | Heapsort |

Best case | O(n log(n)) |

Average case | O(n log(n)) |

Worst case | O(n log(n)) |

Difficulty | Easy |

Family | Efficient sorts |

## Heap data structure

The algorithm introduces another algorithm design technique: it uses a data structure called heap, but what is that?

In a binary tree, each parent node can have at most two children.

There are two important proprieties to mark...

**Full binary tree**

**Complete binary tree**

Our heap must adopt this property and be a **complete** binary tree, as shown below.

I talk about how to implement binary trees in my article on C++ tree implementation.

We can have two kinds of the heap: **max-heaps** and **min-heaps**.

We are going to use the max-heap structure for our algorithm.

**Max-Heap**

The max-heap has the property that for every node i

The value of a node is at most the value of its parent.

Thus, the largest element in the max-heap is stored at the root and the subtree contains values no larger than that contained at the node itself.

In this pic, it is represented the correlation between heap and array.

The number above the node is the corresponding index of the array.

We can summarize the structure in this way:

- Parent(i) = (i -1)/2
- Left (i) = 2i + 1
- Right(i) = 2i + 2

For example, watching the pic, in the position i = 3 there's the value 8.

Parent(3) --> i = 1 --> 14

Left (3) --> i = 7 --> 2

Right(3) --> i = 8 --> 4

## Algorithm

Now that we undestood what an heap is, we can analize the algorithm itself.

There are three steps

- Building the max-heap from the input data
- Now the largest element is stored in the root, we are going to replace it with the last element of the heap and decrease the size by 1
- Repeat the steps while the size of the heap is greater than 1

The algorithm was invented by Williams in the year 1964.

## Pseudocode

**Maintain the max-heap**

In order to maintain the max-heap propriety, we are going to write the function Heapify.

In this way, the max element is always of the array is placed in the root of our heap.

**Heapify**

`heapify(array[], n, index)`

`n is the length of the current heap`

`largest = index`

`left = 2*index`

`right = 2*index+1`

`if left < n AND array[left] > array[index]`

`largest = left`

`if right < n AND array[right] > array[index]`

`largest = right`

`if largest != index{`

`swap array[index] with array[largest]`

`heapify(array, largest)`

`}`

At each step it is determined the largest element between array[index], array[left(index)] and array[right(index)] and its index is stored in largest.

**Base case**

If array[index] is the largest, then the subtree rooted in i is already a max-heap tree and the procedure ends.

**Recursive step**

Otherwise one of the two subtrees has the largest element and array[index] is swapped with array[largest].

The subtree rooted at largest now might violate the propriety and we need to call the function recursively on that subtree.

**Build the max - heap from the input array**

In case of a complete tree, the index of a **non**-leaf node is given by

Where size is the length of the array.

All the node after that index are leaf nodes and thus they don't have subtrees to convert to max-heap.

So we can build a max heap from

`for (i = size/2-1; i>=0; i--)`

`heapify(array, size, i)`

For example, let's consider the array A = { 3, 4, 1, 2, 5, 9} and see what's happening

size = 6

i = size/2 -1 = 6/2 - 1 = 2

A[2] = 9

heapify(A, 6, 2)

i = 1

heapify(A, 6, 1)

i = 0

heapify(A, 6, 0)

We have finally obtained our max-heap!

Once we have built the max heap from the input array, we can now go on with the heapSort.

**HeapSort**

Because of the max-heap propriety, the largest element is stored on the root.

So...

- Swap the first element with the last element of the array
- Remove the last element by decreasing the size of the array
- Heapify the tree by reducing the size by 1, so we can have the largest element in the root again
- Repeat the process while the size of the heap is greater than 1

`for( i = size-1; i >= 0; i--)`

`swap array[0] with array[i]`

`heapify(array, i, 0)`

## Programming language implementation

**C++**

**Heapify**

void heapify(int array[], int n_size, int index) { //Set the largest as the index int largest = index; //Left element in the array of max-heap int left = (2*index) + 1; //Right element in the array of max-heap int right = (2*index) + 2; //Check if the left children element is greater than its parent if (left < n_size && array[left] > array[largest]) largest = left; //Check if the right children element is greater than its parent if (right < n_size && array[right] > array[largest]) largest = right; //If the largest is not the parent if (largest != index) { //Swap it with the largest one std::swap(array[index], array[largest]); //Recursively check the subtrees heapify(array, n_size, largest); } }

**heapSort**

//heapSort function void heapSort(int arr[], int n) { //Let's build the max-heap from the input array for (int i = n / 2 - 1; i >= 0; i--) heapify(arr, n, i); //Now we are sure that the first element is the max n--; //While cicle from the last element do{ std::swap(arr[0], arr[n]); //Let's swap the first element heapify(arr, n--, 0); //Recall the heapify function to rebuild the max-heap } while(n > 0); }

**Complete program**

// // HeapSort - Array implementation - C++ // // Created by Andrea Cappelletti on 15/04/2018. // // Explanation: https://www.cmprogrammers.com/post.php?id=55 #include #define size 5 void heapify(int array[], int n_size, int index) { //Set the largest as the index int largest = index; //Left element in the array of max-heap int left = (2*index) + 1; //Right element in the array of max-heap int right = (2*index) + 2; //Check if the left children element is greater than its parent if (left < n_size && array[left] > array[largest]) largest = left; //Check if the right children element is greater than its parent if (right < n_size && array[right] > array[largest]) largest = right; //If the largest is not the parent if (largest != index) { //Swap it with the largest one std::swap(array[index], array[largest]); //Recursively check the subtrees heapify(array, n_size, largest); } } //heapSort function void heapSort(int arr[], int n) { //Let's build the max-heap from the input array for (int i = n / 2 - 1; i >= 0; i--) heapify(arr, n, i); //Now we are sure that the first element is the max n--; //While cicle from the last element do{ std::swap(arr[0], arr[n]); //Let's swap the first element heapify(arr, n--, 0); //Recall the heapify function to rebuild the max-heap } while(n > 0); } //This function allow us to print the array so we can check if the merge was successfull void print(int array[]){ for(int i = 0; i< size; i++){ std:: cout <<array[i]<<" "; } std::cout<<std::endl; } int main() { //Let's fill our array of size defined in the beginning of the file int array[size]={1,4,3,8,0}; std::cout<<"Array before heapSort"<<std::endl; print(array); heapSort(array, size); std::cout<<"Array after heapSort"<<std::endl; print(array); }

The full code is available on my GitHub.

Feel free to contribute!

## Properties

This algorithm has a limited use compared to Merge sort and Quick sort.

However, it uses the heap structure, that it's very common!

## Complexity

The heapSort is a really efficient algorithm for a large amount of input.

The time complexity of healpify is **O**(log(n)) and the of heapSort is **O**(n), thus the complexity is

in each case (average, worst, best).

The basic operations on heaps run in time at most proportional to the height of the tree and thus take **O**(log(n)) time.