Data Structure VII Heap&PriorityQueue

Heap is a tree-like data structure. Every node must satisfy the heap property (max/min/etc). In java, heap is implemented as PriorityQueue<>(size, Comparator). Due to this property, the time complexity of insertion and deletion are both O(logn).

Article Language: English
Example Language: Java
Reading Time: 11min

Implement a Heap Using Array

Constructor takes different parameters, with default capacity and heapify input array. Implements basic operations, offer, poll, peek, size, and isEmpty. With resize function, the priority queue can expand its capacity when it’s half full. The amortized time complexity for offer and poll remain constant.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
public class MyPQ {
private int size;
private int SCALE = 2;
private int INITIAL_CAP = 15;
private int[] array;
public MyPQ(){
array = new int[INITIAL_CAP];
size = 0;
}

public MyPQ(int[] array) {
this.array = array;
size = array.length;
heapify();
}

public void offer(int e) {
if (size == array.length / 2) {
resize();
}
if (size == 0) {
array[0] = e;
} else {
array[size] = e;
}
size++;
percolateUp();
}

public Integer poll(){
if (size == 0) {
return null;
}
int peek = array[0];
array[0] = array[size - 1];
size--;
percolateDown(0);
return peek;
}

public Integer peek() {
if (size == 0) {
return null;
}
return array[0];
}

public int size() {
return size;
}

public boolean isEmpty() {
return size == 0;
}

private void percolateUp() {
int index = size - 1;
while (index >= 0) {
int parent = (index - 1) / 2;
if (array[index] < array[parent]) {
swap(array, index, parent);
} else {
break;
}
index = parent;
}
}

private void percolateDown(int index) {
while (index <= (size - 2) / 2) {
int left = index * 2 + 1;
int right = index * 2 + 2;
int smallest = left;
if (array[right] < array[left]) {
smallest = right;
}
if (array[index] < array[smallest]) {
break;
} else {
swap(array, smallest, index);
}
index = smallest;
}
}

private void heapify() {
for (int i = size / 2 - 1; i >= 0; i--) {
percolateDown(i);
}

}

private void resize() {
int[] newArr = new int[array.length * SCALE];
for (int i = 0; i < array.length; i++) {
newArr[i] = array[i];
}
array = newArr;
}

private void swap(int[] arr, int one, int two) {
int temp = arr[one];
arr[one] = arr[two];
arr[two] = temp;
}
}

public class MyPQTest {

private MyPQ pq = new MyPQ();
private MyPQ pq2 = new MyPQ(new int[]{3, 5, 7, 0, 1, 2, -10});

@Test
public void offer() {
pq.offer(5);
pq.offer(-1);
pq.offer(0);
pq.offer(9);
assertEquals(-1, (Object)pq.peek());
assertEquals(-1, (Object)pq.poll());
assertEquals(3, pq.size());
assertEquals(false, pq.isEmpty());
assertEquals(0, (Object)pq.peek());
assertEquals(0, (Object)pq.poll());
assertEquals(2, pq.size());
assertEquals(5, (Object)pq.poll());
assertEquals(1, pq.size());
assertEquals(9, (Object)pq.poll());
assertEquals(null, pq.poll());
assertEquals(true, pq.isEmpty());

assertEquals(-10, (Object)pq2.peek());
}
}

Heapify

input: an int array.
output: none.

A min heap must satisfy following property, for a list of n elements {k1, k2… ki…kn}, (ki <= k2i, ki <= k2i+1)或者(ki >= k2i, ki >= k2i+1), (i = 1, 2, 3, 4… n/2).
In other words,

  • left child (node * 2 + 1)
  • right child (node * 2 + 2)
    must not greater than node.
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
public class Heapify {
/**
* @param A: Given an integer array
* @return: nothing
*/
public void heapify(int[] A) {
// write your code here
if (A == null || A.length <= 1) {
return;
}

for (int i = A.length / 2; i >= 0; i--) {
shiftDown(A, i);
}
}

private void shiftDown(int[] arr, int curr) {

int length = arr.length;

while (curr < length) {
int smallestIndex = curr;
int leftChild = curr * 2 + 1;
int rightChild = curr * 2 + 2;

if (leftChild < length && arr[leftChild] <= arr[smallestIndex]) {
smallestIndex = leftChild;
}

if (rightChild < length && arr[rightChild] <= arr[smallestIndex]) {
smallestIndex = rightChild;
}
//whether curr is smaller than its children or curr is leaf;
if (smallestIndex == curr) {
break;
}

swap(arr, smallestIndex, curr);
//from current node down
curr = smallestIndex;
}
}

private void swap(int[] arr, int one, int two) {
int temp = arr[one];
arr[one] = arr[two];
arr[two] = temp;
}
}

Merge K Sorted Lists

input: lists of sorted linked list.
output: one sorted linked list.

The most intuitive way of solving this problems is that merge lists two by two. Since there are soreted linked list, recursively call a helper function and merge. Problem is that every node is visited more than one time.
Another approach is using heap. Only keep the reference of head of the linked list. It taks O(k) to build min heap. Offer and poll operation take O(lok). There a k lists and every list has n nodes. The time complexity is O(knlogk). Space complexity is O(k) for list nodes in the priority queue.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
public class MergeKLists {
/**
* @param lists: a list of ListNode
* @return: The head of one sorted list.
*/
public ListNode mergeKLists(List<ListNode> lists) {
if (lists == null || lists.size() == 0) {
return null;
}

ListNode dummy = new ListNode(-1);
ListNode newHead = dummy;
PriorityQueue<ListNode> pq = new PriorityQueue<>(lists.size(), new Comparator<ListNode>(){
@Override
public int compare(ListNode o1, ListNode o2) {
return o1.val < o2.val ? -1 : (o1.val == o2.val ? 0 : 1);
}
});

for (ListNode node : lists) {
if (node != null) pq.offer(node);
}

while (!pq.isEmpty()) {
ListNode cur = pq.poll();
newHead.next = cur;
newHead = newHead.next;
if (cur.next != null) {
pq.offer(cur.next);
}
cur.next = null;
}
return dummy.next;
}

Merge K sorted arrays

input: list of sorted arrays.
output: one sorted array.

Similar to merge linked list. Only difference is that element in the array doesn’t have reference to its neightbor. Creating a cell class with all information about an element’s neighbor. Using priority queue to store first element of each row because the rows in the 2D array may not have the same number of column.

The time complexity is O(Nlogk) where N is the totol number of input and k is the number of arrays.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
ublic class MergeKArrays {

class Cell {
int val, row, col;

public Cell(int val, int row, int col) {
this.val = val;
this.row = row;
this.col = col;
}
}

/**
* @param arrays
* @return
*/

public int[] mergekSortedArrays(int[][] arrays) {
if (arrays == null || arrays[0] == null || arrays.length == 0 || arrays[0].length == 0) {
return new int[]{};
}

PriorityQueue<Cell> pq = new PriorityQueue<>(arrays.length, new Comparator<Cell>() {
@Override
public int compare(Cell o1, Cell o2) {
return o1.val < o2.val ? -1 : o1.val == o2.val ? 0 : 1;
}
});
//O(array.length)
int length = 0;
for (int i = 0; i < arrays.length; i++) {
pq.offer(new Cell(arrays[i][0], i, 0));
length += arrays[i].length;
}

int index = 0;
int[] result = new int[length];
while (!pq.isEmpty()) {
Cell cur = pq.poll();
result[index++] = cur.val;
if (cur.col + 1 < arrays[cur.row].length) {
pq.offer(new Cell(arrays[cur.row][cur.col + 1], cur.row, cur.col + 1));
}
}
return result;
}
}

High Five

input: list of Records with id and score.
output: return the average of top 5 score for each id.

Extra data structure needs to be used here to record scores for each id, and the average of score for the id. Using two maps, for list of scores, and another one for final result.
It asks top 5 score. Intuitively, top number of something fits the property of max heap. Using priority queue to record scores, calculate averate of top 5 numbers. Time compexity is O(n) for store all data from the input. O(knlogn) for poping k times (5) for each id, and calculate the average. Therefore, the total time complexity is O(knlogn). Space used here besides the ouput is the map that used to store data, which is O(n).

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
/**
* public class Record {
* public int id, score;
* public Record(int id, int score){
* this.id = id;
* this.score = score;
* }
* }
*/
public class HighFive {
/**
* @param results
* @return
*/
public Map<Integer, Double> highFive(Record[] results) {
if (results == null) {
return new HashMap<>();
}

Map<Integer, Double> map = new HashMap<>();
Map<Integer, PriorityQueue<Integer>> score = new HashMap<>();

for (Record record : results) {
if (!score.containsKey(record.id)) {
PriorityQueue<Integer> pq = new PriorityQueue<>(new Comparator<Integer>() {
@Override
public int compare(Integer o1, Integer o2) {
return o1 > o2 ? -1 : o1 == o2 ? 0 : 1;
}
});

pq.offer(record.score);
score.put(record.id, pq);
} else {
score.get(record.id).offer(record.score);
}
}

for (Map.Entry<Integer, PriorityQueue<Integer>> pair : score.entrySet()) {
PriorityQueue<Integer> pq = pair.getValue();
int count = 0;
double sum = 0;
while (!pq.isEmpty() && count < 5) {
sum += pq.poll();
count++;
}
map.put(pair.getKey(), sum / count);
}

return map;
}
}