Heaps

Introduction to Heaps

The heap property

Note that a heap is not simply a "sorted" tree or linked list. The largest value is first (root), but the rest of the tree may (and usually will) not be completely ordered and may contain duplicates. (Children are less than or equal to their parent.)

This is a desirable property when we're only interested in the largest value. Much easier to simply locate the largest item rather than to sort the entire structure. This is the idea behind priority queues.

Priority Queue example using arrays and linked lists. The complexity of the priority queue is directly related to the data structure used to represent it. Some algorithms that make use of priority queues are Dijkstra's algorithm, Kruskal's algorithm, and the A* (A-Star) algorithm.

Example:

This tree is a heap because it has the properties described above:

Removing the largest element

To heapify the tree after removing the root and replacing it, simply swap the root with the larger of its children:

Continue this process until the swapping no longer results in a child node being larger than its parent:

Removing the largest element again

After removing the root node R from the previous tree, we have to heapify the tree.

Step 1:Step 2:
Step 3:Step 4:


Inserting into a heap

Inserting F into the tree causes no problems. The tree maintains the heap property. Inserting into the right-most position on the bottom level preserves the completeness as well.

Inserting W into the tree causes the heap property to be lost: (W > B)

Heapifying the tree is similar to how it was done before except that we will work from the bottom up (instead of the top down).

W is larger than B, so we swap the nodes:

W is larger than L, so we swap the nodes:

W is larger than R, so we swap the nodes.

Implementing Complete Binary Trees Using Arrays

Example:

Again, if we number the nodes by position in the complete binary tree using level-order (breadth-first) traversal we arrive at these values:

Given this parent-child relationship, we can simply represent the complete binary tree above as an array.

We will leave the first slot empty to make the arithmetic easier.


The root of the tree is V and its left and right children are M and R, respectively:


Node J has left and right children A and D, respectively:


Node L is a leaf and has no children: 2i > N (i = 7 and N = 13)

Visualizing Trees vs. Arrays

Removing the root

Original heap:



Remove largest element:



Move last node to root:



R is larger than M, so swap B and R:



L is larger than H, so swap B and L:

B is a leaf, (2 * 7 > 12), so we're done.


Inserting into the heap

Insert F, heap property is preserved (F < H)



Insert W, heap property is violated (W > B)



Swap W and B, heap property is still violated:



Swap W and L, heap property is still violated:



Swap W and R. W is at the root, so we are done.



Notes:

Self check Implementing a heap using an array is left as an exercise for the student.


Using the STL to create a heap from an array:
std::make_heapstd::priority_queue
void f1()
{
  srand(10);
  std::vector<int> v;
  for (int i = 0; i < 20; i++)
    v.push_back(RandomInt(10, 99));

  std::cout << "vector: ";
  PrintArray(&v[0], v.size());

  std::cout << "  heap: ";
  std::make_heap(v.begin(), v.end());
  PrintArray(&v[0], v.size());

  std::cout << "\npop, print, heapify\n";
  while (!v.empty())
  {
      // print "top", pop, re-heapify
    std::cout << v[0] << "  ";

      // Expensive operation: O(N) due to shifting
    v.erase(v.begin()); 

    std::make_heap(v.begin(), v.end());
  }
}
void f2()
{
  srand(10);
  std::vector<int> v;
  for (int i = 0; i < 20; i++)
    v.push_back(RandomInt(10, 99));

  std::cout << "Using a priority queue:\n";
  std::priority_queue<int> pq(v.begin(), v.end());
  while (!pq.empty())
  {
    std::cout << pq.top() << "  ";

      // Cheap operation: O(lg N), no shifting
    pq.pop();
  }
}
Output:

vector: 81  79  42  24  27  36  72  52  36  19  64  21  13  24  43  92  51  79  74  40
  heap: 92  81  72  79  64  36  43  52  79  40  27  21  13  24  42  24  51  36  74  19

pop, print, re-heapify...
92  81  79  79  74  72  64  52  51  43  42  40  36  36  27  24  24  21  19  13

Using a priority queue:
92  81  79  79  74  72  64  52  51  43  42  40  36  36  27  24  24  21  19  13
With 40,000 elements:
f1() runs in about 5.2 seconds
f2() runs in about 0.007 seconds

with 50,000,000 items, f2() runs in about 5.2 seconds.

An example showing how to use operator> with a priority queue:

std::priority_queue (with operator>)
void f2b()
{
  srand(10);
  std::vector<int> v;
  for (int i = 0; i < 20; i++)
    v.push_back(RandomInt(10, 99));

  std::cout << "Using PQ with operator> instead of operator<:";

    // Use operator> instead of operator<  (orders the elements from small to large)
  std::greater<int> comparor;

    // Construct PQ with desired sort order
  std::priority_queue<int, std::vector<int>, std::greater<int>> pq(v.begin(), v.end(), comparor);
  while (!pq.empty())
  {
    std::cout << pq.top() << "  ";
    pq.pop();
  }
}
Output:

Using PQ with operator> instead of operator<:
11  15  27  27  28  41  45  48  58  58  68  70  73  81  84  87  93  93  95  98
Final thoughts:

Self check: Fill in the array from the tree below such that it will allow efficient (constant-time) access to a node's children and parent. Yes, there will be "holes" in the array for the missing nodes. Note: This isn't a heap, just a mapping of tree nodes into array slots.

Tree:
Array: