Algorithm Notes
1 Introduction
This is my journey into Algorithms and Data Structures. I implemented quite a bunch of classical algorithms based on the holy bible, Introduction to Algorithms. I ended up getting offers from Amazon Alexa (Applied Scientist), Uber (Software Engineer), Google (Software Engineer), and Microsoft (Software Engineer) with a Mechanical Engineering background.
2 Algorithms
2.2 Asymptotic notation
- Θ(g(n)) asymptotically tight bound: there exist positive constants c1, c2, and n0 such that 0 ≤ c1*g(n) ≤ f(n) ≤ c2*g(n) for all n ≥ n0.
- O(g(n)) asymptotic upper bound: there exist positive constants c and n0 such that 0 ≤ f(n) ≤ c*g(n) for all n ≥ n0.
- Ω(g(n)) asymptotic lower bound: there exist positive constants c and n0 such that 0 ≤ c*g(n) ≤ f(n) for all n ≥ n0.
2.3 Master Theorem
Given the recurrence T(n) = a*T(n/b) + f(n), with constants a ≥ 1 and b > 1:
- If f(n) = O(n^(log_b a - ε)) for some constant ε > 0, then T(n) = Θ(n^(log_b a)). That is, if f(n) is polynomially slower than n^(log_b a), then the time is dominated by n^(log_b a).
- If f(n) = Θ(n^(log_b a) * log^k n) for some constant k ≥ 0, then T(n) = Θ(n^(log_b a) * log^(k+1) n).
- If f(n) = Ω(n^(log_b a + ε)) for some constant ε > 0, and if a*f(n/b) ≤ c*f(n) for some constant c < 1 and all sufficiently large n (the regularity condition), then T(n) = Θ(f(n)). That is, if f(n) is polynomially faster than n^(log_b a), then the time is dominated by f(n). The three cases can be translated as:
- Too many leaves: the number of leaf nodes outweighs the sum of the internal evaluation costs, so the total running time is Θ(n^(log_b a)).
- Equal work per level: each level costs Θ(n^(log_b a)) time, and there are log_b n levels, for a total time of Θ(n^(log_b a) * log n).
- Too expensive a root: the internal evaluation cost of the root dominates, giving Θ(f(n)).
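As a quick sanity check, consider merge sort: T(n) = 2T(n/2) + Θ(n), so a = 2, b = 2, and n^(log_b a) = n. Since f(n) = Θ(n) = Θ(n^(log_b a) * log^0 n), case 2 applies with k = 0, giving T(n) = Θ(n log n). For binary search, T(n) = T(n/2) + Θ(1) and n^(log_b a) = n^0 = 1, so case 2 again gives Θ(log n).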
2.4 Sort
2.4.1 Insertion Sort
def insertion_sort(arr):
    if len(arr) == 1:
        print(arr)
        return arr
    for i in range(1, len(arr)):
        # Get the latest unsorted value
        key = arr[i]
        j = i - 1
        # Check the bound before indexing, so arr[-1] is never inspected
        while j >= 0 and arr[j] > key:
            # When the current value is larger than the unsorted value, shift
            # the current value to the right
            arr[j + 1] = arr[j]
            print(arr)
            j -= 1
        # The current value arr[j] is either smaller than the unsorted value,
        # or the current position is -1.
        arr[j + 1] = key
        print(arr)
    return arr

a = [2, 5, 3, 8, 7, 1]
insertion_sort(a)
insertion_sort([1])
insertion_sort([2, 2, 2, 1])
2.4.2 Merge Sort
def mergeSort(A):
    return mergeSortHelper(A, 0, len(A) - 1)

def mergeSortHelper(A, left, right):
    if left == right:
        return A[left:right + 1]
    mid = left + (right - left) // 2
    A_left = mergeSortHelper(A, left, mid)
    A_right = mergeSortHelper(A, mid + 1, right)
    return mergeArray(A_left, A_right)

def mergeArray(A, B):
    i = 0
    j = 0
    merged = []
    while i < len(A) and j < len(B):
        if A[i] < B[j]:
            merged.append(A[i])
            i += 1
        else:
            merged.append(B[j])
            j += 1
    if i == len(A):
        merged.extend(B[j:])
    else:
        merged.extend(A[i:])
    return merged

A = [6, 7, 3, 2, 9, 6, 7]
print(mergeSort(A))
2.4.3 Quick Sort
- Simple Quick Sort
def quick_sort(A, left, right):
    if left < right:
        mid = partition(A, left, right)
        print('Pivot Index: {}'.format(mid))
        print(A)
        # The pivot is already in its final position, so exclude it
        quick_sort(A, left, mid - 1)
        quick_sort(A, mid + 1, right)

def partition(A, left, right):
    print('Partition {} to {}'.format(left, right))
    pivot = A[right]
    print('Pivot value: {}'.format(pivot))
    i = left
    for j in range(left, right):
        if A[j] < pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[right] = A[right], A[i]
    # i is the final position of the pivot
    return i

A = [2, 8, 7, 1, 3, 5, 6, 4]
quick_sort(A, 0, len(A) - 1)
print(A)
A = [1]
quick_sort(A, 0, len(A) - 1)
print(A)
- Randomized quick sort
import random

def quick_sort(A, left, right):
    if left < right:
        mid = random_partition(A, left, right)
        print('Pivot Index: {}'.format(mid))
        print(A)
        quick_sort(A, left, mid - 1)
        quick_sort(A, mid + 1, right)

def random_partition(A, left, right):
    # Swap a randomly chosen pivot into the last position, then partition
    pivot_index = random.randint(left, right)
    A[pivot_index], A[right] = A[right], A[pivot_index]
    return partition(A, left, right)

def partition(A, left, right):
    print('Partition {} to {}'.format(left, right))
    pivot = A[right]
    print('Pivot value: {}'.format(pivot))
    i = left
    for j in range(left, right):
        if A[j] < pivot:
            A[i], A[j] = A[j], A[i]
            i += 1
    A[i], A[right] = A[right], A[i]
    # i is the final position of the pivot
    return i

A = [2, 8, 7, 1, 3, 5, 6, 4]
quick_sort(A, 0, len(A) - 1)
print(A)
A = [1]
quick_sort(A, 0, len(A) - 1)
print(A)
2.4.5 Counting Sort
Count the occurrences of each element, then place each element at the position given by its accumulated count.
- NOTE: the lower bound for comparison sorting is Ω(n log n); counting sort is not a comparison sort, so it can sort the array in linear time.
- The time complexity of counting sort is O(n + k), given that the elements in the array are integers from 0 to k.
- It is stable, but not in place. It takes O(n + k) memory.
def counting_sort(A, B, k):
    """Counting sort

    Sort A and put the sorted results into B. The elements in A are assumed
    to be from 0 to k.
    """
    # Array to save counts
    C = [0 for i in range(k + 1)]
    for a in A:
        C[a] += 1
    print('Counted array: {}'.format(C))
    # Accumulate the counts
    for i in range(1, len(C)):
        C[i] += C[i - 1]
    print('Accumulated counts: {}'.format(C))
    # The reason to start from the end is to maintain the stability of the
    # sort, because the count number starts from the largest position
    for i in range(len(A) - 1, -1, -1):
        B[C[A[i]] - 1] = A[i]
        C[A[i]] -= 1
        print('Current sorted array B: {}'.format(B))
        print('Current counting array C: {}'.format(C))

A = [2, 5, 3, 0, 2, 3, 0, 3]
B = [-1 for i in range(8)]
counting_sort(A, B, 5)
print('Sorted array B: {}'.format(B))
2.4.6 TODO Radix sort
- Split each number into its digits, and perform a STABLE sort on each digit, from the least significant digit to the most significant digit.
- Pseudo code: for i = 1 to d: use a stable sort to sort array A on digit i
- If counting sort is used as the underlying stable sorting algorithm, the time complexity is O(d(n + k)).
def radix_sort(A, d):
    # TODO
    pass
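Until the TODO above is filled in, here is a minimal sketch, assuming non-negative base-10 integers and using a stable counting sort on each digit; the helper name counting_sort_by_digit is mine.

from operator import floordiv  # only used implicitly via //; standard library

def counting_sort_by_digit(A, exp, base=10):
    # Stable counting sort keyed on the digit selected by exp (1, 10, 100, ...)
    count = [0] * base
    output = [0] * len(A)
    for a in A:
        count[(a // exp) % base] += 1
    for i in range(1, base):
        count[i] += count[i - 1]
    for a in reversed(A):  # traversing from the end keeps the sort stable
        d = (a // exp) % base
        output[count[d] - 1] = a
        count[d] -= 1
    return output

def radix_sort_sketch(A, d):
    # Sort d-digit non-negative integers, least significant digit first
    exp = 1
    for _ in range(d):
        A = counting_sort_by_digit(A, exp)
        exp *= 10
    return A

print(radix_sort_sketch([329, 457, 657, 839, 436, 720, 355], 3))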
2.4.8 Patience Sort
A very efficient algorithm for solving the Longest Increasing Subsequence problem. https://leetcode.com/problems/longest-increasing-subsequence/ https://www.cs.princeton.edu/courses/archive/spring13/cos423/lectures/LongestIncreasingSubsequence.pdf https://en.wikipedia.org/wiki/Patience_sorting
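A minimal sketch of the patience-based O(n log n) computation of the LIS length, using bisect from the standard library; piles[i] holds the smallest top card among piles of size i + 1. The names are mine.

from bisect import bisect_left

def length_of_lis(nums):
    piles = []  # piles[i]: smallest top card of pile i
    for x in nums:
        # Place x on the leftmost pile whose top card is >= x
        i = bisect_left(piles, x)
        if i == len(piles):
            piles.append(x)  # start a new pile
        else:
            piles[i] = x
    return len(piles)

print(length_of_lis([10, 9, 2, 5, 3, 7, 101, 18]))  # 4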
2.5 Maximum-subarray problem
Find the subarray that has the maximum sum.
def find_max_subarray(A):
    low, high, max_sum = helper_find_max_subarray(A, 0, len(A) - 1)
    return low, high, max_sum

def helper_find_max_subarray(A, low, high):
    if low == high:
        return low, high, A[low]
    mid = (low + high) // 2
    left_low, left_high, left_sum = helper_find_max_subarray(A, low, mid)
    right_low, right_high, right_sum = helper_find_max_subarray(A, mid + 1, high)
    cross_low, cross_high, cross_sum = helper_find_cross_max_subarray(A, low, mid, high)
    if left_sum >= right_sum and left_sum >= cross_sum:
        return left_low, left_high, left_sum
    elif right_sum >= left_sum and right_sum >= cross_sum:
        return right_low, right_high, right_sum
    else:
        return cross_low, cross_high, cross_sum

def helper_find_cross_max_subarray(A, low, mid, high):
    left_sum = A[mid]
    left_low = mid
    tmp_sum = A[mid]
    i = mid - 1
    while i >= low:
        tmp_sum += A[i]
        if tmp_sum > left_sum:
            left_sum = tmp_sum
            left_low = i
        i -= 1
    right_sum = A[mid + 1]
    right_high = mid + 1
    tmp_sum = A[mid + 1]
    i = mid + 2  # A[mid + 1] is already in tmp_sum, so start past it
    while i <= high:
        tmp_sum += A[i]
        if tmp_sum > right_sum:
            right_sum = tmp_sum
            right_high = i
        i += 1
    return left_low, right_high, left_sum + right_sum

A = [13, -3, -25, 20, -3, -16, -23, 18, 20, -7, 12, -5, -22, 15, -4, 7]
low, high, max_sum = find_max_subarray(A)
print(low)
print(high)
print(max_sum)
A = [13, -3]
low, high, max_sum = find_max_subarray(A)
print(low)
print(high)
print(max_sum)
2.6 Stack
class Stack:
    def __init__(self, size):
        self.size = size
        self._array = [0 for i in range(self.size)]
        self.top = -1

    def isempty(self):
        return self.top < 0

    def push(self, x):
        if self.top == self.size - 1:
            raise Exception('Stack overflow!')
        self.top += 1
        self._array[self.top] = x

    def pop(self):
        if self.top < 0:
            raise Exception('Stack underflow!')
        self.top -= 1
        return self._array[self.top + 1]

s = Stack(3)
s.push(3)
s.push(9)
s.push(4)
print(s.pop())
print(s.pop())
s.push(5)
print(s.pop())
print(s.pop())
2.7 Queue
- NOTE: use an array of length n + 1 to represent a queue of size n.
Tail is the pointer to the position where the next element will be enqueued.
class="example"> Head 0 Tail 1 [1, -1, -1, -1, -1, -1] Head 0 Tail 2 [1, 2, -1, -1, -1, -1] Head 0 Tail 3 [1, 2, 3, -1, -1, -1] Head 0 Tail 4 [1, 2, 3, 4, -1, -1] Head 0 Tail 5 [1, 2, 3, 4, 5, -1] [1, 2, 3, 4, 5, -1] Head 1 Tail 5 1 [1, 2, 3, 4, 5, -1] Head 2 Tail 5 2 [1, 2, 3, 4, 5, -1] Head 3 Tail 5 3 [1, 2, 3, 4, 5, -1] Head 4 Tail 5 4 [1, 2, 3, 4, 5, -1] Head 5 Tail 5 5 9class Queue: def __init__(self, size): """ Use a (n + 1) length array to implement a queue of size n """ self.size = size self.head = 0 self.tail = 0 self._array = [-1 for i in range(self.size + 1)] def enqueue(self, x): if self.head == self.tail + 1 or (self.head == 0 and self.tail == self.size): raise Exception("Queue overflow!") self._array[self.tail] = x if self.tail == self.size: self.tail = 0 else: self.tail += 1 print("Head {}".format(self.head)) print("Tail {}".format(self.tail)) print(self._array) def dequeue(self): if self.head == self.tail: raise Exception("Queue underflow!") val = self._array[self.head] if self.head == self.size: self.head = 0 else: self.head += 1 print(self._array) print("Head {}".format(self.head)) print("Tail {}".format(self.tail)) return val q = Queue(5) q.enqueue(1) q.enqueue(2) q.enqueue(3) q.enqueue(4) q.enqueue(5) print(q.dequeue()) print(q.dequeue()) print(q.dequeue()) print(q.dequeue()) print(q.dequeue()) q.enqueue(9) print(q.dequeue()) # print(q.dequeue())
2.8 Binary Search
This is a less clean version, but its advantage is that the left and right pointers finally converge to the same position. Sometimes this is handy when you are trying to locate a certain position at the end of the loop.
def binarySearch(nums, target):
    if not nums:
        return -1
    left = 0
    right = len(nums) - 1
    while left < right:
        mid = left + (right - left) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            # mid could be equal to left; in this situation, left should be
            # mid + 1 to force the search onto the right-hand side. To test
            # this, use a test case of [0, 1] and search for 1.
            left = mid + 1
        else:
            right = mid
    return left if nums[left] == target else -1

nums = [1, 4, 6, 9]
target = -1
print(binarySearch(nums, target))
nums = [1, 4, 6, 9]
target = 4
print(binarySearch(nums, target))
nums = [0, 1]
target = 1
print(binarySearch(nums, target))
NOTE: This version of binary search terminates when r == l - 1, which might not be suitable for situations where we are trying to locate a certain position at the end of the loop.
def binarySearch1(nums, target):
    l = 0
    r = len(nums) - 1
    while l <= r:
        mid = l + (r - l) // 2
        if nums[mid] == target:
            return mid
        elif nums[mid] < target:
            l = mid + 1
        else:
            r = mid - 1
    return -1

nums = [1, 4, 6, 9]
target = 6
print(binarySearch1(nums, target))
nums = [0]
target = 6
print(binarySearch1(nums, target))
2.9 Linked List
- Doubly linked list: O(n) time for search, O(1) time for insert and delete
- Singly linked list
2.9.1 Singly linked list
- Search takes O(n) time
- Insert takes O(1) time
- Delete takes O(n) time
class="example"> 1 -> 2 -> 3 -> 4 -> 5 Node(2) 1 -> 2 -> 4 -> 5 9 -> 1 -> 2 -> 4 -> 5 None 9 -> 2 -> 4 -> 5class Node: def __init__(self, key): self.key = key self.next = None def __str__(self): return 'Node({})'.format(self.key) class SinglyLinkedList: def __init__(self, head): self.head = head @classmethod def fromList(cls, arr): head = Node(arr[0]) curr = head for k in arr[1:]: curr.next = Node(k) curr = curr.next return cls(head) def search(self, k): curr = self.head while curr and curr.key != k: curr = curr.next return curr def insert(self, node): node.next = self.head self.head = node def delete(self, k): prev = None curr = self.head while curr and curr.key != k: prev = curr curr = curr.next if not curr: raise Exception('The key {} does not exist in the list.'.format(k)) if prev: prev.next = curr.next else: self.head = curr.next curr.next = None def __str__(self): s = "" curr = self.head while curr: if s != "": s += " -> " s += str(curr.key) curr = curr.next return s lst = SinglyLinkedList.fromList([1, 2, 3, 4, 5]) print(lst) print(lst.search(2)) lst.delete(3) print(lst) lst.insert(Node(9)) print(lst) print(lst.search(3)) lst.delete(1) print(lst)
2.9.2 Doubly linked list
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None

    def __str__(self):
        return 'Node({})'.format(self.key)

class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def insert(self, node):
        node.next = self.head
        node.prev = None
        if self.head:
            self.head.prev = node
        if not self.tail:
            self.tail = node
        self.head = node

    def search(self, key):
        curr = self.head
        while curr and curr.key != key:
            curr = curr.next
        return curr

    def delete(self, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev

    def __str__(self):
        s = ""
        curr = self.head
        while curr:
            if s != "":
                s += " <-> "
            s += str(curr.key)
            curr = curr.next
        return s

lst = DoublyLinkedList()
lst.insert(Node(1))
lst.insert(Node(2))
lst.insert(Node(3))
lst.insert(Node(4))
print(lst)
lst.delete(lst.search(1))
print(lst)
lst.delete(lst.search(4))
print(lst.tail)
print(lst.head)
print(lst)
2.9.3 Reverse linked list
class Node:
    def __init__(self, val):
        self.val = val
        self.next = None

def print_list(head):
    cur = head
    while cur:
        print(cur.val)
        cur = cur.next

def reverse_linked_list(head):
    cur = head
    prev = None
    while cur:
        # Remember the next node, then point the current node backwards
        next_node = cur.next
        cur.next = prev
        prev = cur
        cur = next_node
    return prev

head = Node(1)
head.next = Node(2)
head.next.next = Node(3)
head.next.next.next = Node(4)
print("Before reversing:")
print_list(head)
rhead = reverse_linked_list(head)
print("After reversing:")
print_list(rhead)
print("Edge Case when head is None")
print_list(reverse_linked_list(None))
print("Edge Case when head is orphan")
print_list(reverse_linked_list(Node(1)))
2.10 Hash table
The universe of keys is large, but the set of keys actually stored may be small relative to it. Suppose we maintain a table with m slots and a hash function h: U -> {0, 1, …, m - 1}. Given a key k, we locate T[h(k)]. If there is a key collision, i.e. h(k1) == h(k2), we chain the corresponding values as a linked list.
If a hash table uses m slots to store n values, we define the load factor α = n/m. The average search time is then O(1 + α), assuming simple uniform hashing.
It is often desired to have keys that are relatively close to map to hash values that are far apart (especially for linear probing).
class Node:
    def __init__(self, key):
        self.key = key
        self.next = None
        self.prev = None

    def __str__(self):
        return 'Node({})'.format(self.key)

class DoublyLinkedList:
    def __init__(self):
        self.head = None
        self.tail = None

    def insert(self, node):
        node.next = self.head
        node.prev = None
        if self.head:
            self.head.prev = node
        if not self.tail:
            self.tail = node
        self.head = node

    def search(self, key):
        curr = self.head
        while curr and curr.key != key:
            curr = curr.next
        return curr

    def delete(self, node):
        if node.prev:
            node.prev.next = node.next
        else:
            self.head = node.next
        if node.next:
            node.next.prev = node.prev
        else:
            self.tail = node.prev

    def __str__(self):
        s = ""
        curr = self.head
        while curr:
            if s != "":
                s += " <-> "
            s += str(curr.key)
            curr = curr.next
        return s

def divide_hash_fun(val, m):
    return val % m

class HashMap:
    def __init__(self, size, hash_fun=divide_hash_fun):
        self.size = size
        self.hash_fun = hash_fun
        self._slots = [DoublyLinkedList() for i in range(self.size)]

    def insert(self, node):
        self._slots[self.hash_fun(node.key, self.size)].insert(node)

    def search(self, key):
        return self._slots[self.hash_fun(key, self.size)].search(key)

    def delete(self, node):
        self._slots[self.hash_fun(node.key, self.size)].delete(node)

hash_map = HashMap(7)
hash_map.insert(Node(199))
hash_map.insert(Node(10))
hash_map.insert(Node(30))
hash_map.insert(Node(50))
print(hash_map.search(10))
hash_map.delete(hash_map.search(10))
print(hash_map.search(10))
- Universal hashing: select hash function randomly in a way that is independent of the keys.
- Open addressing: all elements occupy the hash table itself. No linked list. When searching for an element, we systematically examine table slots. The load factor α = n/m is always less than 1. (A linear probing sketch follows this list.)
- Linear probing
- Quadratic probing
- Double Hashing
- Perfect hashing: O(1) search time using 2 levels of hash tables. Similar to a chained hash table, but instead of a linked list, each slot uses a small secondary hash table.
- We can guarantee there is no collision at the secondary level.
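A minimal sketch of open addressing with linear probing, assuming a fixed-size table of integer keys and no deletion support; the class and names are mine.

class LinearProbingHashMap:
    def __init__(self, size):
        self.size = size
        self._keys = [None] * size

    def _probe(self, key):
        # h(k, i) = (h'(k) + i) mod m for i = 0, 1, ..., m - 1
        h = key % self.size
        for i in range(self.size):
            yield (h + i) % self.size

    def insert(self, key):
        for slot in self._probe(key):
            if self._keys[slot] is None or self._keys[slot] == key:
                self._keys[slot] = key
                return slot
        raise Exception('Hash table overflow!')

    def search(self, key):
        for slot in self._probe(key):
            if self._keys[slot] is None:
                return -1  # an empty slot ends the probe sequence
            if self._keys[slot] == key:
                return slot
        return -1

m = LinearProbingHashMap(7)
m.insert(10)   # 10 % 7 = 3
m.insert(17)   # collides with 10, probes to the next slot
print(m.search(17))  # 4
print(m.search(3))   # -1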
2.11 TODO Segment Tree
- Range Sum Query Problem
- Given an array [-2, 0, 3, -5, 2, -1], support two operations, update and querySumInRange.
- Naive solution: directly update the i th value. Sum up values one by one when doing the query. Update takes O(1) time, query takes O(n) time.
- This might be good enough when update happens a lot, but query happens less frequently.
- Precompute the cumulative sums; a query then just computes cumSum[j] - cumSum[i - 1]. Update takes O(n) time, query takes O(1) time.
- This is good, if update happens much less frequently than query.
- Segment Tree solution: update and query both take O(log n) time (a sketch follows below).
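A minimal array-based segment tree sketch for the range sum query problem above; the iterative, size-offset layout is one common implementation choice, and the names are mine.

class SegmentTree:
    def __init__(self, nums):
        self.n = len(nums)
        self.tree = [0] * (2 * self.n)
        for i, v in enumerate(nums):
            self.tree[self.n + i] = v  # leaves live at indices [n, 2n)
        for i in range(self.n - 1, 0, -1):
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def update(self, i, val):
        # O(log n): update the leaf, then fix its ancestors
        i += self.n
        self.tree[i] = val
        while i > 1:
            i //= 2
            self.tree[i] = self.tree[2 * i] + self.tree[2 * i + 1]

    def query(self, left, right):
        # O(log n): sum over the inclusive range [left, right]
        res = 0
        lo, hi = left + self.n, right + self.n + 1
        while lo < hi:
            if lo % 2 == 1:  # lo is a right child: take it and move right
                res += self.tree[lo]
                lo += 1
            if hi % 2 == 1:  # hi is exclusive: step left and take it
                hi -= 1
                res += self.tree[hi]
            lo //= 2
            hi //= 2
        return res

t = SegmentTree([-2, 0, 3, -5, 2, -1])
print(t.query(0, 2))  # 1
t.update(1, 2)
print(t.query(0, 2))  # 3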
2.12 Types of binary tree
- Full binary tree, every node has 0 or 2 children
- Complete binary tree, every level is completely filled except possibly the last, which is filled from left to right. Think about a binary heap.
- Perfect binary tree. All the interior nodes have two children, and all leaves have the same depth.
2.13 BFS
BFS guarantees the SHORTEST path (fewest edges) from the source node to the target node. NOTE: use the color to differentiate nodes, same as in DFS. Time complexity is O(V + E), given a graph G(V, E) with V vertices and E edges.
from collections import deque

class Vertex:
    def __init__(self, val, color='white', distance=-1, parent=None):
        self.val = val
        self.color = color
        self.dist = distance
        self.parent = parent

    def __str__(self):
        return "Val: {}, Color: {}, Distance: {}".format(
            self.val, self.color, self.dist)

class Graph:
    def __init__(self, adj_list):
        self.adj_list = adj_list
        self._reset_vertices()

    def _reset_vertices(self):
        self.vertices = {n: Vertex(n) for n in self.adj_list}

    def bfs(self, s):
        self._reset_vertices()
        v_s = self.vertices[s]
        v_s.color = 'grey'
        v_s.dist = 0
        queue = deque()
        queue.append(v_s)
        while queue:
            cur = queue.popleft()
            cur.color = 'black'
            print(cur)
            for n in self.adj_list[cur.val]:
                v = self.vertices[n]
                if v.color == 'white':
                    queue.append(v)
                    v.color = 'grey'
                    v.dist = cur.dist + 1
                    v.parent = cur

# Undirected Graph
adj_list = {1: [2, 5], 2: [1, 3, 4], 3: [2, 4], 4: [2, 3, 5], 5: [4, 1]}
g = Graph(adj_list)
g.bfs(1)
print('-' * 50)
# Directed cyclic Graph
adj_list = {1: [2], 2: [4], 3: [2], 4: [3, 5], 5: [1]}
g = Graph(adj_list)
g.bfs(1)
print('-' * 50)
# Directed acyclic Graph
adj_list = {1: [2, 5], 2: [3, 4], 3: [], 4: [3, 5], 5: []}
g = Graph(adj_list)
g.bfs(1)
print('-' * 50)
# Directed acyclic Graph
adj_list = {1: [2, 5], 2: [3, 4], 3: [], 4: [3, 5], 5: []}
g = Graph(adj_list)
g.bfs(4)
2.14 DFS
Parenthesis theorem: given two vertices u and v in the graph, let u.d denote the discovery time of u and u.f the finish time of u.
- If the intervals [u.d, u.f] and [v.d, v.f] are entirely disjoint, then neither u nor v is a descendant of the other in the depth-first forest.
- If the interval [u.d, u.f] is contained in [v.d, v.f], then u is a descendant of v.
- If the interval [v.d, v.f] is contained in [u.d, u.f], then v is a descendant of u.
Classification of edges:
- Tree edges, edges in the depth-first forest. Edge (u, v) is a tree edge if v was first discovered by exploring (u, v).
- Back edges, edges (u, v) connecting a vertex u to its ancestor v in a depth-first tree, including self-loops in a directed graph, where a vertex connects to itself.
- Forward edges, nontree edges (u, v) connecting a vertex u to a descendant v in a depth-first tree.
- Cross edges, all the other edges.
DFS has enough information to classify some of the edges it encounters. When we first explore an edge (u, v), the color of v tells us something about the edge
- WHITE indicates a tree edge
- GRAY indicates a back edge. NOTE: this is critical to determine whether the graph is cyclic or not.
- BLACK indicates a forward or cross edge.
For undirected graph, every edge is either a tree edge or a back edge.
class="example"> At time 1, Discovered node: (Val: s, Color: gray) At time 2, Discovered node: (Val: z, Color: gray) At time 3, Discovered node: (Val: y, Color: gray) At time 4, Discovered node: (Val: x, Color: gray) At time 5, Finish visiting node: (Val: y, Color: black) At time 6, Finish visiting node: (Val: z, Color: black) At time 7, Discovered node: (Val: w, Color: gray) At time 8, Finish visiting node: (Val: z, Color: black) At time 9, Finish visiting node: (Val: s, Color: black) At time 10, Discovered node: (Val: u, Color: gray) At time 11, Discovered node: (Val: v, Color: gray) At time 12, Finish visiting node: (Val: u, Color: black) At time 13, Discovered node: (Val: t, Color: gray)from datetime import datetime time = 0 # global timer class Vertex: def __init__(self, val, color='white', parent=None): self.val = val self.color = color self.parent = parent def __str__(self): return "(Val: {}, Color: {})".format(self.val, self.color) class Graph: def __init__(self): self.adj = {} self.nodes = {} def addNode(self, val): node = Vertex(val) self.nodes[val] = node self.adj[node] = [] def addMultiNodes(self, val_list): for v in val_list: self.addNode(v) def addEdge(self, u, v): u_node = self.nodes[u] v_node = self.nodes[v] self.adj[u_node].append(v_node) def addMultiEdges(self, edge_list): for e in edge_list: self.addEdge(e[0], e[1]) def dfs(self): global time time = 0 for u in self.nodes.values(): if u.color == 'white': self.dfs_visit(u) def dfs_visit(self, u): global time time += 1 u.color = 'gray' print('At time {}, Discovered node: {}'.format(time, u)) for v in self.adj[u]: if v.color == 'white': self.dfs_visit(v) u.color = 'black' time += 1 print('At time {}, Finish visiting node: {}'.format(time, u)) g = Graph() g.addMultiNodes(['s', 'z', 'y', 'x', 'w', 't', 'v', 'u']) g.addMultiEdges([['s', 'z'], ['z', 'y'], ['y', 'x'], ['z', 'w'], ['w', 'x'], ['x', 'z'], ['s', 'w'], ['v', 's'], ['v', 'w'], ['t', 'v'], ['t', 'u'], ['u', 'v']]) g.dfs()
The result shows the parenthesis theorem clearly: (s (z (y (x x) y) (w w) z) s) (t (v v) (u u) t).
2.15 Topological Sort
Topological sort can only be performed on a directed acyclic graph (DAG). A topological sort of a dag G = (V, E) is a linear ordering of all its vertices such that if G contains an edge (u, v), then u appears before v in the ordering.
Topological sort pseudo code:
- call DFS(G)
- as each vertex is finished, insert it at the front of a linked list
- return the linked list
class="example"> deque(['undershorts', 'pants', 'pants', 'socks', 'tie'])import logging from collections import deque logger = logging.getLogger(__name__) class Vertex: def __init__(self, val, color='white', parent=None): self.val = val self.color = color self.parent = parent def __str__(self): return "Val: {}, Color: {}".format(self.val, self.color) class Graph: def __init__(self): self.adj = {} self.nodes = {} def addNode(self, val): node = Vertex(val) self.nodes[val] = node self.adj[node] = [] def addMultiNodes(self, val_list): for v in val_list: self.addNode(v) def addEdge(self, u, v): u_node = self.nodes[u] v_node = self.nodes[v] self.adj[u_node].append(v_node) def addMultiEdges(self, edge_list): for e in edge_list: self.addEdge(e[0], e[1]) def dfs(self): self._reset_nodes() for u in self.nodes.values(): if u.color == 'white': self.dfs_visit(u) def dfs_visit(self, u): u.color = 'gray' logger.info('Discovered node: {}'.format(u)) for v in self.adj[u]: if v.color == 'white': self.parent = u self.dfs_visit(v) u.color = 'black' logger.info('Finish visiting node: {}'.format(u)) def _reset_nodes(self): for u in self.nodes.values(): u.color = 'white' u.parent = None def isDAG(self): self._reset_nodes() for u in self.nodes.values(): if u.color == 'white' and not self.isDAGTree(u): return False return True def isDAGTree(self, u): u.color = 'gray' for v in self.adj[u]: logger.info('Discovered node: {}'.format(v)) if v.color == 'gray': return False elif v.color == 'white' and not self.isDAGTree(v): return False u.color = 'black' return True def topo_sort(self): self._reset_nodes() sorted_list = deque() for u in self.nodes.values(): if u.color == 'white': self.topo_sort_helper(u, sorted_list) return sorted_list def topo_sort_helper(self, u, sorted_list): u.color = 'gray' for v in self.adj[u]: if v.color == 'white': v.parent = u self.topo_sort_helper(v, sorted_list) elif v.color == 'gray': ValueError( 'The graph has cycle. Cannot perform topological sort!') u.color = 'black' sorted_list.appendleft(u.val) g = Graph() g.addMultiNodes([ 'undershorts', 'pants', 'belt', 'shirt', 'tie', 'tie', 'jacket', 'socks', 'shoes', 'watch' ]) g.addMultiEdges( [['undershorts', 'pants'], ['pants', 'belt'], ['belt', 'jacket'], ['shirt', 'belt'], ['shirt', 'tie'], ['tie', 'jacket'], ['undershorts', 'shoes'], ['pants', 'shoes'], ['socks', 'shoes']]) print(g.topo_sort())
Here is a simpler version. Good as an interview template.
class="example"> deque([3, 2, 1, 0])from collections import deque def buildDirectedGraph(numNodes, edges): graph = {i: set() for i in range(numNodes)} for u, v in edges: graph[u].add(v) return graph def topo_sort(graph): def dfs(graph, node, visited, discovered, res): if node in visited: return if node in discovered: raise Exception( 'The graph has a cycle. It is not topologically sortable.') discovered.add(node) for nei in graph[node]: dfs(graph, nei, visited, discovered, res) discovered.remove(node) visited.add(node) res.appendleft(node) res = deque() visited = set() # nodes that are finished visiting discovered = set() # nodes that are discovered by dfs but not yet finished for node in graph: dfs(graph, node, visited, discovered, res) return res # graph = buildDirectedGraph(2, [[1, 0], [0, 1]]) # res = topo_sort(graph) graph = buildDirectedGraph(4, [[1, 0], [2, 0], [3, 1], [3, 2]]) res = topo_sort(graph) print(res)
Kahn's algorithm is BFS-like; a queue is used:
- Create the indegree map of the graph.
- Append the nodes with indegree 0 to the queue.
- While the queue is not empty: dequeue one node, increase the count by 1, and append the node to the result. For each of its neighbors, decrease the indegree by 1, and enqueue any neighbor whose indegree drops to 0.
- If count != the total number of nodes, the graph is not a DAG. Otherwise return the result, which is topologically sorted.
class="example"> [3, 1, 2, 0]from collections import deque def buildDirectedGraph(numNodes, edges): graph = {i: set() for i in range(numNodes)} for u, v in edges: graph[u].add(v) return graph def topo_sort(graph): indegree = [0] * len(graph) for neighbors in graph.values(): for nei in neighbors: indegree[nei] += 1 q = deque([n for n in graph if indegree[n] == 0]) res = [] while q: node = q.popleft() res.append(node) for nei in graph[node]: indegree[nei] -= 1 if indegree[nei] == 0: q.append(nei) if len(res) < len(graph): raise Exception('The graph has a cycle. It is not topologically sortable.') return res # graph = buildDirectedGraph(2, [[1, 0], [0, 1]]) # res = topo_sort(graph) graph = buildDirectedGraph(4, [[1, 0], [2, 0], [3, 1], [3, 2]]) res = topo_sort(graph) print(res)
2.16 Strongly connected components
In a strongly connected component, every pair of vertices are reachable from each other.
- call DFS(G) to compute the finishing times for each vertex
- compute GT, G transpose (reverse every edge)
- call DFS(GT), but in the main loop of DFS, consider vertices in order of decreasing finishing time u.f obtained in the first step.
- output the vertices of each tree in the depth-first forest obtained in step 3
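A minimal sketch of the procedure above (Kosaraju's algorithm) on a dict-of-lists adjacency representation; the function and variable names are mine.

def strongly_connected_components(graph):
    # graph: {node: [neighbors]}
    order = []  # nodes in increasing finishing time
    visited = set()

    def dfs(g, u, out):
        visited.add(u)
        for v in g[u]:
            if v not in visited:
                dfs(g, v, out)
        out.append(u)  # appended at finish time

    for u in graph:
        if u not in visited:
            dfs(graph, u, order)

    # Compute the transpose: reverse every edge
    gt = {u: [] for u in graph}
    for u in graph:
        for v in graph[u]:
            gt[v].append(u)

    visited = set()
    sccs = []
    for u in reversed(order):  # decreasing finishing time
        if u not in visited:
            comp = []
            dfs(gt, u, comp)
            sccs.append(comp)
    return sccs

g = {1: [2], 2: [3], 3: [1, 4], 4: [5], 5: [4]}
print(strongly_connected_components(g))  # [[2, 3, 1], [5, 4]]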
2.17 Binary Search Tree
- If node y is in the left subtree of x, then y.key <= x.key.
- If node y is in the right subtree of x, then y.key >= x.key.
- Inorder tree walk: print the left subtree -> print the node itself -> print the right subtree
- Preorder tree walk: print the root -> print the subtrees
- Postorder tree walk: print the subtrees -> print the root
class="example"> 2 5 9 12 15 18 19 Node(9) Node(9) Node(2) Node(19) None Node(5) 12 5 18 2 9 15 19 15 5 2 9from collections import deque class Node: def __init__(self, x): self.key = x self.left = None self.right = None self.parent = None def __str__(self): return "Node({})".format(self.key) class BinarySearchTree: def __init__(self): self.root = None def inorder_tree_walk(self): self._inorder_tree_walk_helper(self.root) def _inorder_tree_walk_helper(self, node): if node: self._inorder_tree_walk_helper(node.left) print(node.key) self._inorder_tree_walk_helper(node.right) def search(self, key): return self._search_helper(self.root, key) def _search_helper(self, node, key): if not node or node.key == key: return node if key < node.key: return self._search_helper(node.left, key) else: return self._search_helper(node.right, key) def iterative_search(self, key): curr = self.root while curr and curr.key != key: if key < curr.key: curr = curr.left else: curr = curr.right return curr def maximum(self): return self._maximum(self.root) def _maximum(self, node): curr = node while curr.right: curr = curr.right return curr def minimum(self): return self._minimum(self.root) def _minimum(self, node): curr = node while curr.left: curr = curr.left return curr def insert(self, node): p = None curr = self.root while curr: p = curr if node.key < curr.key: curr = curr.left else: curr = curr.right node.parent = p if p: if node.key < p.key: p.left = node else: p.right = node else: self.root = node def successor(self, node): """ Find the successor of the sorted array in the inorder traversal """ if node.right: return self._minimum(node.right) parent_node = node.parent while parent_node and parent_node.left != node: node = parent_node parent_node = node.parent return parent_node def predecessor(self, node): if node.left: return self._maximum(node.left) parent_node = node.parent while parent_node and parent_node.right != node: node = parent_node parent_node = node.parent return parent_node def transplant(self, u, v): """Replace the subtree rooted at u with the subtree rooted at v NOTE: u must not be None, but v can be None """ if not u.parent: self.root = v elif u.parent.left == u: u.parent.left = v else: u.parent.right = v if v: v.parent = u.parent def delete(self, node): """Delete the node, assuming node cannot be None """ if not node.left: self.transplant(node, node.right) elif not node.right: self.transplant(node, node.left) else: suc_node = self.successor(node) if suc_node.parent != node: self.transplant(suc_node, suc_node.right) suc_node.right = node.right suc_node.right.parent = suc_node self.transplant(node, suc_node) suc_node.left = node.left suc_node.left.parent = suc_node def _print(self, node, indent): s = "" if node: s = indent + str(node.key) + "\n" if node.left: s += self._print(node.left, indent + "|---") if node.right: s += self._print(node.right, indent + "|---") return s def __str__(self): if not self.root: return "" s = "" curr_level = [] curr_level.append(self.root) while curr_level: next_level = [] for n in curr_level: s += str(n.key) + " " if n.left: next_level.append(n.left) if n.right: next_level.append(n.right) s += "\n" curr_level = next_level return s t = BinarySearchTree() t.insert(Node(12)) t.insert(Node(5)) t.insert(Node(2)) t.insert(Node(9)) t.insert(Node(18)) t.insert(Node(15)) t.insert(Node(19)) t.inorder_tree_walk() print(t.search(9)) print(t.iterative_search(9)) print(t.minimum()) print(t.maximum()) print(t.successor(t.search(9))) print(t.predecessor(t.search(9))) print(t) 
t.delete(t.search(12)) print(t)
- Randomly built binary search tree: each of the n! permutations of the input keys is equally likely (this is different from each key being equally likely to be inserted at each step).
2.18 Red-Black Tree
Red-Black Properties:
- Every node is either red or black.
- The root is black.
- Every leaf (NIL) is black.
- If a node is red, then both its children are black.
- For each node, all simple paths from the node to descendant leaves contain the same number of black nodes.
Black-height of node x: the number of black nodes on any simple path from x (not including x) down to a leaf.
Lemma: a red-black tree with n internal nodes has height at most 2*lg(n + 1).
Rotation: Left-Rotate and Right-Rotate restructure the tree locally in O(1) time while preserving the binary-search-tree property; Right-Rotate is the mirror image of Left-Rotate.
class="example"> 11 (BLACK) 11 (BLACK) |---l: 3 (RED) 5 (BLACK) |---l: 3 (RED) |---r: 11 (RED) 5 (BLACK) |---l: 3 (BLACK) |---r: 11 (BLACK) |---|---r: 20 (RED) 5 (BLACK) |---l: 3 (BLACK) |---r: 11 (BLACK) |---|---l: 9 (RED) |---|---r: 20 (RED) 5 (BLACK) |---l: 3 (BLACK) |---|---l: 1 (RED) |---r: 11 (BLACK) |---|---l: 9 (RED) |---|---r: 20 (RED) 5 (BLACK) |---l: 2 (BLACK) |---|---l: 1 (RED) |---|---r: 3 (RED) |---r: 11 (BLACK) |---|---l: 9 (RED) |---|---r: 20 (RED) 5 (BLACK) |---l: 2 (RED) |---|---l: 1 (BLACK) |---|---r: 3 (BLACK) |---|---|---r: 4 (RED) |---r: 11 (BLACK) |---|---l: 9 (RED) |---|---r: 20 (RED) 5 (BLACK) |---l: 2 (RED) |---|---l: 1 (BLACK) |---|---r: 3 (BLACK) |---|---|---r: 4 (RED) |---r: 11 (RED) |---|---l: 9 (BLACK) |---|---|---l: 6 (RED) |---|---r: 20 (BLACK) 5 (BLACK) |---l: 2 (RED) |---|---l: 1 (BLACK) |---|---|---l: -1 (RED) |---|---r: 3 (BLACK) |---|---|---r: 4 (RED) |---r: 11 (RED) |---|---l: 9 (BLACK) |---|---|---l: 6 (RED) |---|---r: 20 (BLACK) 5 (BLACK) |---l: 2 (RED) |---|---l: 1 (BLACK) |---|---|---l: -1 (RED) |---|---r: 3 (BLACK) |---|---|---l: 3 (RED) |---|---|---r: 4 (RED) |---r: 11 (RED) |---|---l: 9 (BLACK) |---|---|---l: 6 (RED) |---|---r: 20 (BLACK) 6 (BLACK) |---l: 2 (RED) |---|---l: 1 (BLACK) |---|---|---l: -1 (RED) |---|---r: 3 (BLACK) |---|---|---l: 3 (RED) |---|---|---r: 4 (RED) |---r: 11 (RED) |---|---l: 9 (BLACK) |---|---r: 20 (BLACK)class Node: BLACK = "BLACK" RED = "RED" def __init__(self, key): self.key = key self.left = None self.right = None self.color = self.BLACK self.parent = None def __str__(self): return "Node(key={}, color={})".format(self.key, self.color) class RedBlackTree: def __init__(self): self.nil = Node(None) self.root = self.nil def left_rotate(self, x): # Assuming x.right is not self.nil y = x.right # set y x.right = y.left # turn y's left subtree into x's right subtree if y.left != self.nil: # no need to link y.left's parent if it is a sentinel y.left.parent = x y.parent = x.parent # link x's parent to y if x.parent == self.nil: self.root = y elif x == x.parent.left: x.parent.left = y else: x.parent.right = y y.left = x # put x into y's left x.parent = y def right_rotate(self, x): # Assuming x.left is not self.nil, symmetric to left_rotate y = x.left x.left = y.right if y.right != self.nil: y.right.parent = x y.parent = x.parent if x.parent == self.nil: self.root = y elif x == x.parent.left: x.parent.left = y else: x.parent.right = y y.right = x x.parent = y def insert(self, z): prev = self.nil curr = self.root while curr != self.nil: prev = curr if curr.key > z.key: curr = curr.left else: curr = curr.right if prev == self.nil: self.root = z z.parent = self.nil elif prev.key > z.key: prev.left = z else: prev.right = z z.parent = prev z.left = self.nil z.right = self.nil z.color = Node.RED self.insert_fixup(z) def insert_fixup(self, z): while z.parent.color == Node.RED: # z.parent.parent is guaranteed to exist, because the root is # black. If a z.parent is red then it is not the root, so it must # have parent. 
if z.parent == z.parent.parent.left: y = z.parent.parent.right # uncle node if y.color == Node.RED: z.parent.color = Node.BLACK y.color = Node.BLACK z.parent.parent.color = Node.RED z = z.parent.parent else: if z == z.parent.right: z = z.parent self.left_rotate(z) z.parent.color = Node.BLACK z.parent.parent.color = Node.RED self.right_rotate(z.parent.parent) else: y = z.parent.parent.left if y.color == Node.RED: z.parent.color = Node.BLACK y.color = Node.BLACK z.parent.parent.color = Node.RED z = z.parent.parent else: if z == z.parent.left: z = z.parent self.right_rotate(z) z.parent.color = Node.BLACK z.parent.parent.color = Node.RED self.left_rotate(z.parent.parent) self.root.color = Node.BLACK def transplant(self, u, v): if u.parent == self.nil: self.root = v elif u == u.parent.left: u.parent.left = v else: u.parent.right = v v.parent = u.parent def minimum(self, x): curr = x while curr.left != self.nil: curr = curr.left return curr def delete(self, z): y = z # y is either the node to be removed, or the node that replaces z y_original_color = y.color if z.left == self.nil: x = z.right # x moves into y's original position self.transplant(z, z.right) elif z.right == self.nil: x = z.left self.transplant(z, z.left) else: y = self.minimum(z.right) y_original_color = y.color x = y.right if y.parent == z: x.parent = y # TODO: why do we need this????? else: self.transplant(y, y.right) y.right = z.right self.transplant(z, y) y.left = z.left y.left.parent = y y.color = z.color if y_original_color == Node.BLACK: self.delete_fixup(x) def delete_fixup(self, x): while x != self.root and x.color == Node.BLACK: # x must be a "doubly-black" node in this loop if x == x.parent.left: w = x.parent.right # the sibling of x if w.color == Node.RED: # convert to case 2 or 3 or 4 by rotation w.color = Node.BLACK # case 1 x.parent.color = Node.RED # case 1 self.left_rotate(x.parent) # case 1 w = x.parent.right # case 1 if w.left.color == Node.BLACK and w.right.color == Node.BLACK: w.color = Node.RED # case 2 x = x.parent # case 2 else: if w.right.color == Node.BLACK: w.color = Node.RED # case 3 w.left.color = Node.BLACK # case 3 self.right_rotate(w) # case 3 w = x.parent.right # case 3 w.color = x.parent.color # case 4 x.parent.color = Node.BLACK # case 4 w.right.color = Node.BLACK # case 4 self.left_rotate(x.parent) # case 4 x = self.root # case 4, terminate the loop else: # Symmetric to left node case w = x.parent.left if w.color == Node.RED: w.color = Node.BLACK x.parent.color = Node.RED self.right_rotate(x.parent) w = x.parent.left if w.left.color == Node.BLACK and w.right.color == Node.BLACK: w.color = Node.RED x = x.parent else: if w.left.color == Node.BLACK: w.color = Node.RED w.right.color = Node.BLACK self.left_rotate(w) w.color = x.parent.color x.parent.color = Node.BLACK w.left.color = Node.BLACK self.right_rotate(x.parent) x = self.root # Outside the loop, x is either the root, or a "red-and-black" node # If x is the root, then root must be BLACK. # If x is a "red-and-black" node, then color it as black, because it # will have a RED child in case 2, and it might have RED parent. 
x.color = Node.BLACK def search(self, key): curr = self.root while curr: if curr.key == key: return curr elif curr.key < key: curr = curr.right else: curr = curr.left return None def pprint(self, node, indent): if node.parent == self.nil: s = "{}{} ({})".format(indent, node.key, node.color) else: side = "l" if node == node.parent.left else "r" s = "{}{}: {} ({})".format(indent, side, node.key, node.color) if node.left != self.nil: s += "\n" + self.pprint(node.left, "|---" + indent) if node.right != self.nil: s += "\n" + self.pprint(node.right, "|---" + indent) return s def __str__(self): return self.pprint(self.root, "") + "\n" t = RedBlackTree() t.insert(Node(11)) print(t) t.insert(Node(3)) print(t) t.insert(Node(5)) print(t) t.insert(Node(20)) print(t) t.insert(Node(9)) print(t) t.insert(Node(1)) print(t) t.insert(Node(2)) print(t) t.insert(Node(4)) print(t) t.insert(Node(6)) print(t) t.insert(Node(-1)) print(t) t.insert(Node(3)) print(t) # print(t.search(5)) t.delete(t.search(5)) print(t)
2.19 Trie Tree
Radix Tree
class="example"> False False False True ['1', '0', '10', '101', '1011'] ['0', '1', '10', '101', '1011']class Node: def __init__(self): self.children = [None] * 2 self.isEnd = False class RadixTree: """ This radix tree only contains binary path at each node, i.e. it only stores sting like "01100". """ def __init__(self): self.root = Node() def getIdx(self, s): return ord(s) - ord("0") def getStr(self, idx): return str(idx) def sort(self): sorted_arr = [] stack = [(self.root, "")] while stack: node, curr_str = stack.pop() # NOTE: need to iterate the children from right to left, so left # node will be traversed before right node for idx, n in reversed(list(enumerate(node.children))): if n: next_str = curr_str + self.getStr(idx) stack.append((n, next_str)) if n.isEnd: sorted_arr.append(next_str) return sorted_arr def sort_recursive(self): return self._preorder_travel(self.root, "") def _preorder_travel(self, node, s): sorted_arr = [] if node.isEnd: sorted_arr.append(s) for idx, n in enumerate(node.children): if n: sorted_arr.extend(self._preorder_travel(n, s + self.getStr(idx))) return sorted_arr def insert(self, seq): curr = self.root for s in seq: idx = self.getIdx(s) if not curr.children[idx]: curr.children[idx] = Node() curr = curr.children[idx] curr.isEnd = True def search(self, seq): curr = self.root for s in seq: idx = self.getIdx(s) if not curr.children[idx]: return False curr = curr.children[idx] return curr.isEnd t = RadixTree() t.insert("1011") t.insert("0") t.insert("100") t.insert("10") t.insert("011") print(t.search("00")) print(t.search("01")) print(t.search("011")) print(t.search("1011")) print(t.sort()) print(t.sort_recursive())
2.20 (Binary) Heap
- An array with two attributes: 1) length, the number of elements in the array. 2) heapsize, the number of elements in the heap. The elements in the heap are from A[1] to A[A.heapsize].
- The heap has two properties:
- Complete Binary Tree: the tree is completely filled in all levels except the lowest level. For the lowest level, the tree fills from left to right.
- Heap property: every parent value is larger (max-heap) / smaller (min-heap) than or equal to its children's values.
- Due to complete binary tree property, given a node index i, the parent index is floor(i/2), the left child index is 2*i, and the right child index is 2*i+1. NOTE: the root index is 1 (not 0).
NOTE: building a heap takes only O(n) time (it looks like O(n log n); however, a tighter upper bound of O(n) can be shown). https://www.geeksforgeeks.org/time-complexity-of-building-a-heap/
from collections import deque

class Heap:
    def __init__(self):
        self.heap_size = 0
        self._array = [0]

    def __getitem__(self, k):
        if isinstance(k, slice):
            return self._array[slice(k.start + 1, k.stop + 1, k.step)]
        else:
            return self._array[k + 1]

    def __len__(self):
        return self.heap_size

    def parent(self, i):
        return i // 2

    def left(self, i):
        return 2 * i

    def right(self, i):
        return 2 * i + 1

    def build_heap(self, A):
        # It can be proved that A[len(A) // 2] to A[len(A) - 1] are leaf nodes.
        self.heap_size = len(A)
        self._array = [0] + A
        # The time complexity is upper bounded by O(nlogn), but actually the
        # tighter upper bound is O(n)
        for i in range(self.heap_size // 2, 0, -1):
            # Build the heap bottom up
            self.max_heapify(i)

    def max_heapify(self, i):
        # Assumes the binary trees rooted at self.left(i) and self.right(i)
        # are max-heaps.
        # The time complexity of this operation is O(log n), or O(h), where h
        # is the height of the tree
        lft = self.left(i)
        rht = self.right(i)
        if lft <= self.heap_size and self._array[i] < self._array[lft]:
            largest = lft
        else:
            largest = i
        if rht <= self.heap_size and self._array[largest] < self._array[rht]:
            largest = rht
        if largest != i:
            self._array[i], self._array[largest] = self._array[largest], self._array[i]
            self.max_heapify(largest)

    def maximum(self):
        return self._array[1]

    def extract_max(self):
        # Time complexity O(logn)
        if self.heap_size < 1:
            raise Exception("Heap underflow!")
        max_val = self._array[1]
        self._array[1] = self._array[self.heap_size]
        self.heap_size -= 1
        self.max_heapify(1)
        return max_val

    def insert(self, x):
        # Time complexity O(logn)
        self.heap_size += 1
        # Placeholder smaller than any real key
        self._array.append(float("-inf"))
        self.increase_key(self.heap_size, x)

    def increase_key(self, i, k):
        # Time complexity O(logn)
        if k < self._array[i]:
            raise Exception("New key must be larger than the current key.")
        self._array[i] = k
        while i > 1 and self._array[i] > self._array[self.parent(i)]:
            self._array[self.parent(i)], self._array[i] = self._array[i], self._array[self.parent(i)]
            i = self.parent(i)

    def __str__(self):
        return str(self[0:len(self)])

def heap_sort(A):
    # Time complexity O(nlogn)
    h = Heap()
    h.build_heap(A)
    q = deque()
    while h.heap_size > 0:
        q.appendleft(h.extract_max())
    return list(q)

A = [7, 100, 3, 6, 90, 45, 1000]
h = Heap()
h.build_heap(A)
print(h)
sorted_A = heap_sort(A)
print(sorted_A)
h = Heap()
h.insert(7)
h.insert(100)
h.insert(3)
h.insert(6)
h.insert(90)
h.insert(45)
h.insert(32)
print(h.maximum())
print(h)
h.increase_key(4, 1000)
print(h)
2.21 Dynamic Programming
- Divide-and-conquer algorithms partition the problem into disjoint subproblems, solve the subproblems recursively, and then combine their solutions to solve the original problem.
- Dynamic programming applies when the subproblems overlap, that is, when subproblems share subsubproblems. In this context, divide-and-conquer does more work than necessary, repeatedly solving the common subsubproblems.
- Dynamic programming solves each subsubproblem just once and saves its answer in a table to avoid solving the same problem again.
- To develop a dynamic programming algorithm, we follow 4 steps:
- Characterize the structure of an optimal solution
- Recursively define the value of an optimal solution
- Compute the value of an optimal solution, typically in a bottom-up fashion
- Construct an optimal solution from computed information
2.21.1 Rod cutting
Top-down method utilizing memoization
class="example"> [1, -1, -1, -1, -1, -1, -1, -1, -1, -1] [1, 2, -1, -1, -1, -1, -1, -1, -1, -1] [1, 5, -1, -1, -1, -1, -1, -1, -1, -1] [1, 5, 6, -1, -1, -1, -1, -1, -1, -1] [1, 5, 6, -1, -1, -1, -1, -1, -1, -1] [1, 5, 8, -1, -1, -1, -1, -1, -1, -1] [1, 5, 8, 9, -1, -1, -1, -1, -1, -1] [1, 5, 8, 10, -1, -1, -1, -1, -1, -1] [1, 5, 8, 10, -1, -1, -1, -1, -1, -1] [1, 5, 8, 10, -1, -1, -1, -1, -1, -1] [1, 5, 8, 10, 11, -1, -1, -1, -1, -1] [1, 5, 8, 10, 13, -1, -1, -1, -1, -1] [1, 5, 8, 10, 13, -1, -1, -1, -1, -1] [1, 5, 8, 10, 13, -1, -1, -1, -1, -1] [1, 5, 8, 10, 13, -1, -1, -1, -1, -1] [1, 5, 8, 10, 13, 14, -1, -1, -1, -1] [1, 5, 8, 10, 13, 15, -1, -1, -1, -1] [1, 5, 8, 10, 13, 16, -1, -1, -1, -1] [1, 5, 8, 10, 13, 16, -1, -1, -1, -1] [1, 5, 8, 10, 13, 16, -1, -1, -1, -1] [1, 5, 8, 10, 13, 17, -1, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, -1, -1, -1] [1, 5, 8, 10, 13, 17, 18, 19, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, -1, -1] [1, 5, 8, 10, 13, 17, 18, 22, 23, -1] [1, 5, 8, 10, 13, 17, 18, 22, 23, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, -1] [1, 5, 8, 10, 13, 17, 18, 22, 25, 26] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 27] [1, 5, 8, 10, 13, 17, 18, 22, 25, 57] 57def cut_rod(price, n): memo = [-1 for i in range(n)] return helper_cut_rod(price, n, memo) def helper_cut_rod(price, n, memo): if memo[n - 1] != -1: return memo[n - 1] if n == 0: return 0 optimal_price = -1 for i in range(1, n + 1): optimal_price = max(optimal_price, price[i - 1] + helper_cut_rod(price, n - i, memo)) memo[n - 1] = optimal_price print(memo) return optimal_price price = [1, 5, 8, 9, 10, 17, 17, 20, 24, 30] print(cut_rod(price, 10))
Bottom-up method
class="example"> [0, 1, -1, -1, -1, -1, -1, -1] [0, 1, -1, -1, -1, -1, -1, -1] [0, 1, 2, -1, -1, -1, -1, -1] [0, 1, 1, -1, -1, -1, -1, -1] [0, 1, 5, -1, -1, -1, -1, -1] [0, 1, 2, -1, -1, -1, -1, -1] [0, 1, 5, 6, -1, -1, -1, -1] [0, 1, 2, 1, -1, -1, -1, -1] [0, 1, 5, 8, -1, -1, -1, -1] [0, 1, 2, 3, -1, -1, -1, -1] [0, 1, 5, 8, 9, -1, -1, -1] [0, 1, 2, 3, 1, -1, -1, -1] [0, 1, 5, 8, 10, -1, -1, -1] [0, 1, 2, 3, 2, -1, -1, -1] [0, 1, 5, 8, 10, 11, -1, -1] [0, 1, 2, 3, 2, 1, -1, -1] [0, 1, 5, 8, 10, 13, -1, -1] [0, 1, 2, 3, 2, 2, -1, -1] [0, 1, 5, 8, 10, 13, 14, -1] [0, 1, 2, 3, 2, 2, 1, -1] [0, 1, 5, 8, 10, 13, 15, -1] [0, 1, 2, 3, 2, 2, 2, -1] [0, 1, 5, 8, 10, 13, 16, -1] [0, 1, 2, 3, 2, 2, 3, -1] [0, 1, 5, 8, 10, 13, 17, -1] [0, 1, 2, 3, 2, 2, 6, -1] [0, 1, 5, 8, 10, 13, 17, 18] [0, 1, 2, 3, 2, 2, 6, 1] 18 [1, 6]def cut_rod(price, n): optimal_price, cut_points = cut_rod_helper(price, n) optimal_cut_points = [] curr_len = n while curr_len > 0: optimal_cut_points.append(cut_points[curr_len]) curr_len = curr_len - cut_points[curr_len] return optimal_price[n], optimal_cut_points def cut_rod_helper(price, n): memo = [-1 for i in range(n + 1)] cut_point = [-1 for i in range(n + 1)] memo[0] = 0 cut_point[0] = 0 for i in range(1, n + 1): opt = -1 cp = -1 for j in range(1, i + 1): tmp = price[j - 1] + memo[i - j] if tmp > opt: opt = tmp cp = j memo[i] = opt cut_point[i] = cp print(memo) print(cut_point) return memo, cut_point price = [1, 5, 8, 9, 10, 17, 17, 20, 24, 30] opt_price, opt_cut = cut_rod(price, 7) print(opt_price) print(opt_cut)
- Subproblem Graph
- Top-down approach performs a "Depth-First-Search" of the subproblem graph.
- Bottom-up approach solves the subproblem in a "reverse topological sort" order.
- The time complexity of the dynamic programming is O(V + E), where V are the nodes and E are the edges of the subproblem graph.
2.21.2 Matrix-chain multiplication
2D DP problem
class="example"> [[0, 15750, 7875, 9375, 11875, 15125], [-1, 0, 2625, 4375, 7125, 10500], [-1, -1, 0, 750, 2500, 5375], [-1, -1, -1, 0, 1000, 3500], [-1, -1, -1, -1, 0, 5000], [-1, -1, -1, -1, -1, 0]] [[0, 0, 0, 2, 2, 2], [-1, 0, 1, 2, 2, 2], [-1, -1, 0, 2, 2, 2], [-1, -1, -1, 0, 3, 4], [-1, -1, -1, -1, 0, 4], [-1, -1, -1, -1, -1, 0]] ((A_1(A_2A_3))((A_4A_5)A_6))def matrix_chain_order(p): n = len(p) - 1 m = [[-1 for i in range(n)] for i in range(n)] s = [[-1 for i in range(n)] for i in range(n)] for i in range(n): m[i][i] = 0 s[i][i] = 0 for d in range(1, n): # l is the difference between start and end indices for i in range(n - d): j = i + d # i is the start index for k in range(i, j): # k is the relative split position tmp = m[i][k] + m[k + 1][j] + p[i] * p[k + 1] * p[j + 1] if m[i][j] == -1: m[i][j] = tmp s[i][j] = k elif m[i][j] > tmp: m[i][j] = tmp s[i][j] = k return m, s def print_solution(s, i, j): sol = "" if i == j: sol += "A_{}".format(i + 1) else: sol += "(" sol += print_solution(s, i, s[i][j]) sol += print_solution(s, s[i][j] + 1, j) sol += ")" return sol p = [30, 35, 15, 5, 10, 20, 25] m, s = matrix_chain_order(p) sol = print_solution(s, 0, len(m) - 1) print(m) print(s) print(sol)
- Elements of dynamic programming
- Optimal substructure
- We build an optimal solution to the problem from optimal solutions of the subproblems.
- Common pattern in discovering optimal substructure:
- A solution to the problem consists of making a choice. Making the choice leads to one or more subproblems to solve.
- Suppose that, for a given problem, you are given the choice that leads to an optimal solution. You do not concern yourself yet with how the choice was determined; you just assume it has been given to you.
- Given the choice, you determine which subproblems follow, and how to best characterize the resulting space of subproblems (e.g. memo[i], memo[i][j]).
- You show that the solutions to the subproblems used within an optimal solution to the problem must be optimal themselves to the subproblems. You can prove it by contradiction.
- Optimal substructure varies across problem domains in two ways:
- how many subproblems an optimal solution to the original problem uses
- how many choices we have in determining which subproblems to use in an optimal solution.
- Informally, the running time of a dynamic-programming algorithm depends on the product of two factors:
- the number of subproblems overall, denoted as n (i.e. the vertices of the subproblem graph).
- how many choices we look at for each subproblem, denoted as m (i.e. the edges of each vertex, so the total edge number will be n * m).
- DP versus greedy:
- DP solves the subproblem first, and then use the solutions of the subproblems to solve the original problem
- Greedy first makes a "greedy" choice (the choice that looks best at the time), and then solves the resulting subproblem. It doesn't bother solving every possible smaller subproblem first. In some cases, this strategy works.
- Independence between subproblems
- If subproblems need to share information with each other in order to find the optimal solution to the original problem, then the problem cannot be solved using DP. Example: unweighted longest simple path in a directed graph.
- Overlapping subproblems
- When a recursive algorithm revisits the same problem repeatedly, then the optimization problem has overlapping subproblems.
Longest Common Subsequence
class="example"> 4 bdabdef longest_common_subsequence(x, y): m = len(x) n = len(y) c = [[0 for j in range(n + 1)] for i in range(m + 1)] d = [[-1 for j in range(n)] for i in range(m)] for i in range(1, m + 1): for j in range(1, n + 1): if x[i - 1] == y[j - 1]: c[i][j] = c[i - 1][j - 1] + 1 d[i - 1][j - 1] = 2 # 2 means go to left up corner for parent elif c[i - 1][j] > c[i][j - 1]: c[i][j] = c[i - 1][j] d[i - 1][j - 1] = 0 # 0 means go left for parent else: c[i][j] = c[i][j - 1] d[i - 1][j - 1] = 1 # 1 means go right for parent return c[m][n], d def print_lcs(x, d, i, j): if i < 0 or j < 0: return "" s = "" if d[i][j] == 2: s += print_lcs(x, d, i - 1, j - 1) s += x[i] elif d[i][j] == 0: s += print_lcs(x, d, i - 1, j) else: s += print_lcs(x, d, i, j - 1) return s x = "abcbdab" y = "bdcaba" ct, d = longest_common_subsequence(x, y) print(ct) print(print_lcs(x, d, len(x) - 1, len(y) - 1))
NOTE: if only the length is needed, then there is no need to keep the whole lookup table; only the latest two rows of the table are needed, as sketched below.
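A sketch of that space optimization, keeping just the previous and current rows; the names are mine.

def lcs_length(x, y):
    prev = [0] * (len(y) + 1)
    for i in range(1, len(x) + 1):
        curr = [0] * (len(y) + 1)
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                curr[j] = prev[j - 1] + 1
            else:
                curr[j] = max(prev[j], curr[j - 1])
        prev = curr  # only two rows are ever alive
    return prev[len(y)]

print(lcs_length("abcbdab", "bdcaba"))  # 4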
Optimal Binary Search Tree
class="example"> [[0, 0, 0, 0, 0, 0], [0.05, 0.45000000000000007, 0.9, 1.25, 1.75, 2.75], [0, 0.1, 0.4, 0.7, 1.2, 2.0], [0, 0, 0.05, 0.25, 0.6, 1.2999999999999998], [0, 0, 0, 0.05, 0.30000000000000004, 0.9], [0, 0, 0, 0, 0.05, 0.5], [0, 0, 0, 0, 0, 0.1]] [[0, 0, 0, 0, 0, 0], [0.05, 0.30000000000000004, 0.45, 0.55, 0.7000000000000001, 1.0000000000000002], [0, 0.1, 0.25, 0.35, 0.49999999999999994, 0.7999999999999999], [0, 0, 0.05, 0.15000000000000002, 0.3, 0.6], [0, 0, 0, 0.05, 0.2, 0.5], [0, 0, 0, 0, 0.05, 0.35], [0, 0, 0, 0, 0, 0.1]] 2.75 k2 Left: k1 Left: d0 Right: d1 Right: k5 Left: k4 Left: k3 Left: d2 Right: d3 Right: d4 Right: d5def optimal_bst(p, q, n): e = [[0 for j in range(n + 1)] for i in range(n + 2)] # e[1 to n + 1][0 to n] w = [[0 for j in range(n + 1)] for i in range(n + 2)] root = [[-1 for j in range(n + 1)] for i in range(n + 1)] for i in range(1, n + 2): e[i][i - 1] = q[i - 1] w[i][i - 1] = q[i - 1] for d in range(n): for i in range(1, n - d + 1): j = i + d e[i][j] = float("inf") w[i][j] = w[i][j - 1] + p[j - 1] + q[j] for k in range(i, j + 1): tmp = e[i][k - 1] + e[k + 1][j] + w[i][j] if tmp < e[i][j]: e[i][j] = tmp root[i][j] = k print(e) print(w) return e[1][n], root def print_sol(root, i, j): if i > j: print('d{}'.format(j)) return k = root[i][j] print("k{}".format(k)) print("Left:") print_sol(root, i, k - 1) print("Right:") print_sol(root, k + 1, j) p = [0.15, 0.1, 0.05, 0.1, 0.2] q = [0.05, 0.1, 0.05, 0.05, 0.05, 0.1] n = len(p) e_opt, root = optimal_bst(p, q, n) print(e_opt) print_sol(root, 1, n)
2.22 Greedy Algorithms
2.22.1 Interval-graph coloring problem
Meeting rooms II: https://leetcode.com/problems/meeting-rooms-ii/
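A minimal min-heap sketch for the problem, assuming intervals given as (start, end) pairs: sort by start time and reuse a room whenever the earliest-ending meeting is already over. The names are mine.

from heapq import heappush, heapreplace

def min_meeting_rooms(intervals):
    ends = []  # min-heap of end times, one entry per room in use
    for start, end in sorted(intervals):
        if ends and ends[0] <= start:
            heapreplace(ends, end)  # reuse the freed room
        else:
            heappush(ends, end)  # allocate a new room
    return len(ends)

print(min_meeting_rooms([(0, 30), (5, 10), (15, 20)]))  # 2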
2.23 Union Find
Union find with rank and path compression. The time complexity of union find with both rank and path compression is O(m α(n)), where m is the number of make-set, find, union operations, n is the number of make-set operations (only happens at the beginning, usually means n nodes). α(n) is a very slowly growing function, α(n) < 4 in most of the practical cases. Hence, the time complexity is almost linear to the number of operations. Each operation nearly takes constant time.
class UnionFind:
    def __init__(self, size):
        self.parent = list(range(size))
        self.rank = [0 for i in range(size)]

    def find(self, x):
        while x != self.parent[x]:
            # Path compression: connect each node to its grandparent on every find
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, x, y):
        print('Before Union')
        print('Parent: {}'.format(self.parent))
        print('Rank: {}'.format(self.rank))
        xroot = self.find(x)
        yroot = self.find(y)
        if xroot == yroot:
            return
        # Union by rank: always compare and attach the roots, not x and y themselves
        if self.rank[xroot] > self.rank[yroot]:
            self.parent[yroot] = xroot
        elif self.rank[xroot] < self.rank[yroot]:
            self.parent[xroot] = yroot
        else:
            # rank only increases when the two roots to join have the same rank
            self.parent[yroot] = xroot
            self.rank[xroot] += 1
        print('After Union')
        print('Parent: {}'.format(self.parent))
        print('Rank: {}'.format(self.rank))

uf = UnionFind(10)
uf.union(3, 4)
uf.union(9, 4)
uf.union(8, 0)
uf.union(2, 3)
print(uf.find(3))
print(uf.find(2))
print(uf.find(9))
uf.union(5, 6)
uf.union(5, 9)
uf.union(7, 3)
uf.union(4, 8)
uf.union(6, 1)
2.25 Quick Select
Time complexity: worst case O(n²) if the pivot is always the largest or smallest element. Best case O(n): if each partition halves the search range, the total work is n(1 + 1/2 + 1/4 + …) = 2n. Average case O(n).
def quickSelect(nums, k):
    # Partially sort nums so that the k smallest elements occupy nums[:k]
    left = 0
    right = len(nums) - 1
    while left < right:
        pivot_idx = partition(nums, left, right)
        if pivot_idx == k - 1:
            break
        elif pivot_idx < k - 1:
            left = pivot_idx + 1
        else:
            right = pivot_idx - 1
    return nums[:k]

def partition(nums, left, right):
    pivot = nums[right]
    i = left
    for j in range(left, right):
        if nums[j] < pivot:
            nums[i], nums[j] = nums[j], nums[i]
            i += 1
    nums[i], nums[right] = nums[right], nums[i]
    return i

nums = [5, 3, 9, 2, 7, 2]
print(quickSelect(nums, 3))
2.26 Graph Coloring Problem
The general graph coloring problem (finding the minimum number of colors such that adjacent nodes do not share the same color) is NP-complete. A greedy algorithm can only find an upper bound on the number of colors; it is not guaranteed to be optimal.
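A minimal sketch of the greedy heuristic, assuming the graph is given as an adjacency dict (names are mine). It uses at most max-degree + 1 colors, but not necessarily the minimum:

def greedy_coloring(adj):
    color = {}
    for u in adj:
        # Assign the smallest color not used by an already-colored neighbor
        used = {color[v] for v in adj[u] if v in color}
        c = 0
        while c in used:
            c += 1
        color[u] = c
    return color

# A 4-cycle is 2-colorable; greedy in this visiting order also finds 2 colors.
adj = {0: [1, 3], 1: [0, 2], 2: [1, 3], 3: [0, 2]}
print(greedy_coloring(adj))  # e.g. {0: 0, 1: 1, 2: 0, 3: 1}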
2.27 Tree in the context of graph
- A tree is an undirected graph in which any two vertices are connected by exactly one path. Any connected graph without simple cycles is a tree.
- Any connected graph that has n nodes and n - 1 edges is a tree.
- Diameter of a graph: the longest among the shortest paths between any pair of vertices.
- Remoteness: the length of the longest path from the node to any leaf.
- Center: the node with the smallest remoteness. Equivalently, treating that node as the root gives the tree of minimal height. (A sketch for finding the centers follows this list.)
- Theorem: There can be at most two centers for a tree.
- Theorem: the center of the tree must be on the diameter of the tree.
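The two theorems above suggest the classic way to find the center(s): peel off leaves layer by layer until at most two nodes remain (cf. the LeetCode Minimum Height Trees problem). A minimal sketch, with names of my own:

from collections import deque

def find_centers(n, edges):
    if n <= 2:
        return list(range(n))
    adj = {i: set() for i in range(n)}
    for u, v in edges:
        adj[u].add(v)
        adj[v].add(u)
    leaves = deque(i for i in range(n) if len(adj[i]) == 1)
    remaining = n
    while remaining > 2:
        # Remove the current layer of leaves; new leaves form the next layer
        for _ in range(len(leaves)):
            leaf = leaves.popleft()
            remaining -= 1
            nei = adj[leaf].pop()
            adj[nei].remove(leaf)
            if len(adj[nei]) == 1:
                leaves.append(nei)
    return list(leaves)

print(find_centers(6, [[0, 3], [1, 3], [2, 3], [3, 4], [4, 5]]))  # [3, 4]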
2.28 Bellman-Ford Algorithm
- Repeat |V| - 1 times: for each edge, relax the edge.
- The reason for |V| - 1 iterations: any simple path from u to v has at most |V| - 1 edges; a path with more edges must contain a cycle, and non-negative cycles never shorten a path.
- After |V| - 1 iterations, if relaxing all the edges one more time still updates some node, the graph contains a negative cycle. A sketch follows.
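A minimal Bellman-Ford sketch following the bullets above, assuming the graph is an edge list of (u, v, w) triples (names are mine):

def bellman_ford(n, edges, src):
    dist = [float('inf')] * n
    dist[src] = 0
    for _ in range(n - 1):
        for u, v, w in edges:
            if dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
    # One extra pass: any further relaxation implies a negative cycle
    for u, v, w in edges:
        if dist[u] + w < dist[v]:
            return dist, True
    return dist, False

print(bellman_ford(5, [(0, 1, 6), (0, 2, 7), (1, 2, 8), (1, 3, 5),
                       (2, 3, -3), (3, 1, -2), (1, 4, -4), (2, 4, 9)], 0))
# ([0, 2, 7, 4, -2], False)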
2.29 Dijkstra's Algorithm
- Works for directed and undirected graphs.
- Greedy, always find the node with the shortest distance so far and relax its edges.
- Does not work with negative edge weights.
class="example"> Shortest distance from 0: [0, 5, 3, 6, 2] All the shortest paths starting from 0: 0->2->2->1 0->2 0->2->2->1->1->3 0->4from heapq import heappush, heappop from collections import deque def dijkstra(n, edges, start): graph = buildGraph(n, edges) dist = [float('inf') for _ in range(n)] dist[start] = 0 parent = [-1 for _ in range(n)] parent[start] = start minHeap = [] heappush(minHeap, (0, start)) visited = set() while minHeap: d, node = heappop(minHeap) if node in visited: continue dist[node] = d visited.add(node) for nei, w in graph[node]: if nei not in visited: if dist[nei] > dist[node] + w: dist[nei] = dist[node] + w parent[nei] = node heappush(minHeap, (dist[nei], nei)) print('Shortest distance from {}:'.format(start)) print(dist) print('All the shortest paths starting from {}:'.format(start)) for i in range(n): print("->".join(map(str, reconstruct_path(parent, i)))) def reconstruct_path(parent, node): path = deque() while node != parent[node]: path.appendleft(node) node = parent[node] path.appendleft(node) return path def buildGraph(n, edges): graph = {i: set() for i in range(n)} for u, v, w in edges: graph[u].add((v, w)) return graph dijkstra(5, [[0, 1, 10], [0, 2, 3], [2, 1, 2], [2, 3, 5], [1, 3, 1], [0, 4, 2], [3, 4, 1]], 0)
2.31 Greatest Common Divisor
def gcd(a, b):
    if a < b:
        a, b = b, a
    # Now a >= b; apply Euclid's algorithm
    while b != 0:
        a, b = b, a % b
    return a

print(gcd(100, 30))  # 10
print(gcd(100, 11))  # 1
2.32 Find prime numbers less than n
2.32.1 Method 1 - brute force
def printPrime(n):
    res = []
    for i in range(2, n):
        if isPrime(i):
            res.append(i)
    return res

def isPrime(x):
    for i in range(2, x):
        if x % i == 0:
            return False
    return True

print(printPrime(20))
2.32.2 Method 2 - better brute force
def printPrime(n):
    res = []
    for i in range(2, n):
        if isPrime(i):
            res.append(i)
    return res

def isPrime(x):
    # Only trial divisors up to sqrt(x) are needed
    for i in range(2, x):
        if i * i > x:
            break
        if x % i == 0:
            return False
    return True

print(printPrime(20))
2.32.3 Method 3 - Sieve of Eratosthenes
Key idea: mark multiples as not a prime.
def printPrime(n):
    if n <= 2:
        return []
    res = [2]
    isPrime = [True for i in range(n)]
    for i in range(4, n):
        if i % 2 == 0:
            isPrime[i] = False
    for i in range(3, n, 2):
        if isPrime[i]:
            # Mark the multiples of i, starting from i * i
            # (smaller multiples were already marked by smaller primes)
            x = i * i
            while x < n:
                isPrime[x] = False
                x += i
    for i in range(3, n):
        if isPrime[i]:
            res.append(i)
    return res

print(printPrime(100))
print(printPrime(2))
print(printPrime(3))
print(printPrime(10))
2.33 Bit Manipulation
2.34 Knuth Shuffling (Fisher-Yates shuffle)
import random

def knuthShuffle(arr):
    # Swap each position with a uniformly random position at or before it
    for i in range(len(arr) - 1, 0, -1):
        j = random.randint(0, i)
        arr[i], arr[j] = arr[j], arr[i]

arr = list(range(10))
knuthShuffle(arr)
print(arr)
2.35 Reservoir Sampling
Sample k numbers from n numbers. Suppose the number array is so large that it cannot fit in memory.
- Can we sample the numbers in a streaming fashion?
- The answer is reservoir sampling!
- If fewer than k numbers have been selected so far, keep appending the new number to the selected set.
- Once k numbers are selected, when the i-th number arrives (1-indexed), keep it with probability k / i and replace one of the old numbers uniformly at random (otherwise, keep the old ones).
In the end, each number is kept in the selected set with probability k / n: the i-th number enters with probability k / i and survives each later step j with probability 1 - (k / j)(1 / k) = (j - 1) / j, so the product telescopes to (k / i)(i / n) = k / n.
class="example"> [5974, 4744, 4354, 2221, 6505] [1, 2, 6, 4, 5]import random def reservoirSampling(nums, k): """ Randomly draw k samples from a very large array nums. Assuming nums is a very large array that cannot fit in memory, we can only sample nums in a streaming fashion. """ reservoir = [] for i, x in enumerate(nums): if len(reservoir) < k: reservoir.append(x) else: rn = random.randint(0, i - 1) if rn < k: reservoir[rn] = x return reservoir nums = list(range(1, 10001)) print(reservoirSampling(nums, 5)) nums = list(range(1, 7)) print(reservoirSampling(nums, 5))
2.36 Random pick number with weight
- Compute the accumulated sum.
- Generate a random number in range of [0, accumsum[-1]).
- Perform a binary search to see which interval (defined by the accumulated-sum array) the random number falls in. A sketch follows this list.
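A minimal sketch of these three steps using bisect_right (the class name WeightedPicker is mine; cf. the LeetCode Random Pick with Weight problem):

import random
from bisect import bisect_right
from itertools import accumulate

class WeightedPicker:
    """Index i is returned with probability w[i] / sum(w)."""
    def __init__(self, w):
        self.prefix = list(accumulate(w))  # accumulated sums

    def pick(self):
        # Uniform in [0, total); find the first prefix sum strictly greater
        target = random.random() * self.prefix[-1]
        return bisect_right(self.prefix, target)

picker = WeightedPicker([1, 3, 6])
print([picker.pick() for _ in range(10)])  # index 2 appears ~60% of the time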
2.37 Monotonic Queue/Stack
https://leetcode.com/problems/shortest-subarray-with-sum-at-least-k/discuss/204290/Monotonic-Queue-Summary https://leetcode.com/problems/daily-temperatures/ https://leetcode.com/problems/sliding-window-maximum/
A queue where the elements from front to back are always strictly increasing (or nondecreasing) or strictly decreasing (or nonincreasing).
When a new value arrives, it kicks out the values smaller than it from the back of the queue (for a max queue, keeping the queue nonincreasing).
This enables finding the maximum of the queue in O(1) time.
import collections

class MaxQueue(object):
    def __init__(self):
        self.queue = collections.deque()
        self.max_q = collections.deque()  # nonincreasing deque of max candidates

    def append(self, val):
        self.queue.append(val)
        # Kick out smaller values at the back; they can never be the max again
        while self.max_q and self.max_q[-1] < val:
            self.max_q.pop()
        self.max_q.append(val)

    def popleft(self):
        val = self.queue.popleft()
        if self.max_q and self.max_q[0] == val:
            self.max_q.popleft()
        return val

    def getMax(self):
        return self.max_q[0] if self.max_q else None
- A monotonic (increasing) stack can be used to find the Previous Less Element and Next Less Element of every position in O(n) time (see the sketch after this list).
- This is very helpful for the histogram problem https://leetcode.com/problems/largest-rectangle-in-histogram/
- Similar problems: see the links listed at the top of this section.
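A minimal sketch of finding the Previous Less Element with a monotonic increasing stack (the function name is mine; Next Less Element is symmetric):

def previous_less(nums):
    res = [-1] * len(nums)  # res[i]: index of the previous element < nums[i]
    stack = []              # indices whose values are strictly increasing
    for i, x in enumerate(nums):
        while stack and nums[stack[-1]] >= x:
            stack.pop()
        if stack:
            res[i] = stack[-1]
        stack.append(i)
    return res

print(previous_less([3, 1, 4, 1, 5]))  # [-1, -1, 1, -1, 3]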
2.39 Dutch national flag problem (three-way-partition problem)
Like the quick sort partition, but with one more pointer so the array is partitioned into three chunks.
- This algorithm can be used to improve quick sort: partition the array into three parts, < pivot, = pivot and > pivot. This is useful when there are lots of duplicates in the array. A sketch follows.
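A minimal sketch of the three-way partition with three pointers (the function name is mine):

def dutch_flag_partition(nums, pivot):
    # Invariant: nums[:low] < pivot, nums[low:mid] == pivot, nums[high+1:] > pivot
    low, mid, high = 0, 0, len(nums) - 1
    while mid <= high:
        if nums[mid] < pivot:
            nums[low], nums[mid] = nums[mid], nums[low]
            low += 1
            mid += 1
        elif nums[mid] > pivot:
            nums[mid], nums[high] = nums[high], nums[mid]
            high -= 1
        else:
            mid += 1

nums = [2, 0, 2, 1, 1, 0]
dutch_flag_partition(nums, 1)
print(nums)  # [0, 0, 1, 1, 2, 2]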
2.40 Hamiltonian path and hamiltonian cycle
- A Hamiltonian path is a path that visits each vertex of the graph exactly once.
- A Hamiltonian cycle is a cycle that visits each vertex exactly once.
Backtracking can be used to search for a Hamiltonian path, as sketched below.
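A minimal backtracking sketch for the Hamiltonian path, assuming the graph is given as adjacency sets (names are mine); the search is exponential in the worst case:

def hamiltonian_path(n, adj):
    def backtrack(path, visited):
        if len(path) == n:
            return list(path)
        for nxt in adj[path[-1]]:
            if nxt not in visited:
                path.append(nxt)
                visited.add(nxt)
                found = backtrack(path, visited)
                if found:
                    return found
                # Undo the choice and try the next neighbor
                path.pop()
                visited.remove(nxt)
        return None

    for start in range(n):
        res = backtrack([start], {start})
        if res:
            return res
    return None

print(hamiltonian_path(4, {0: {1}, 1: {0, 2, 3}, 2: {1, 3}, 3: {1, 2}}))
# e.g. [0, 1, 2, 3]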
2.41 Least Recently Used (LRU) cache
class LRUCache(object):
    def __init__(self, capacity):
        self.capacity = capacity
        self.key_map = {}
        self.dlist = DoublyLinkedList()

    def put(self, key, value):
        if key not in self.key_map and len(self.key_map) == self.capacity:
            # Evict the least recently used entry (the tail of the list)
            tail = self.dlist.pop()
            self.key_map.pop(tail.key)
        if key in self.key_map:
            self.dlist.remove(self.key_map[key])
        node = Node(key, value)
        self.key_map[key] = node
        self.dlist.appendleft(node)

    def get(self, key):
        if key not in self.key_map:
            return None
        node = self.key_map[key]
        # Move the accessed node to the front
        self.dlist.remove(node)
        self.dlist.appendleft(node)
        return node.val

class DoublyLinkedList(object):
    def __init__(self):
        self.sentinel = Node(0, 0)
        self.sentinel.next = self.sentinel
        self.sentinel.prev = self.sentinel

    def appendleft(self, node):
        head = self.sentinel.next
        self.sentinel.next = node
        node.prev = self.sentinel
        head.prev = node
        node.next = head

    def pop(self):
        tail = self.getTail()
        if not tail:
            raise IndexError('The list is empty!')
        self.remove(tail)
        return tail

    def remove(self, node):
        node.prev.next = node.next
        node.next.prev = node.prev

    def getHead(self):
        return self.sentinel.next if self.sentinel.next is not self.sentinel else None

    def getTail(self):
        return self.sentinel.prev if self.sentinel.prev is not self.sentinel else None

class Node(object):
    def __init__(self, key, val):
        self.key = key
        self.val = val
        self.next = None
        self.prev = None

cache = LRUCache(3)
cache.put('a', 1)
cache.put('b', 2)
cache.put('c', 3)
cache.put('c', 4)
cache.put('c', 5)
cache.put('d', 6)
print(cache.get('a'))  # None ('a' was evicted when 'd' arrived)
print(cache.get('b'))  # 2
print(cache.get('c'))  # 5
cache.put('e', 7)
cache.put('e', 8)
print(cache.get('b'))  # 2
2.42 Interval problems
2.42.1 Count number of events happening concurrently at certain time point
from heapq import heappush, heappop

class EventCounter(object):
    def __init__(self, intervals):
        # Sweep line: +1 at each start time, -1 at each end time. At equal
        # times, (t, -1) pops before (t, 1), so an interval ending at t is not
        # counted as concurrent with one starting at t.
        pq = []
        for start, end in intervals:
            heappush(pq, (start, 1))
            heappush(pq, (end, -1))
        # count[i] = [time, number of concurrent events from this time on]
        self.count = [[-float('inf'), 0]]
        while pq:
            time, diff = heappop(pq)
            if time == self.count[-1][0]:
                self.count[-1][1] += diff
            else:
                self.count.append([time, self.count[-1][1] + diff])

    def query(self, time):
        # Binary search for the last entry whose time is <= the query time
        left = 0
        right = len(self.count) - 1
        res = 0
        while left <= right:
            mid = left + (right - left) // 2
            if self.count[mid][0] <= time:
                res = self.count[mid][1]
                left = mid + 1
            else:
                right = mid - 1
        print(res)
        return res

ec = EventCounter([[2, 5], [3, 8], [4, 8], [5, 10]])
ec.query(3)  # 2
ec.query(5)  # 3
3 Learning Resources
4 P, NP, NP-complete, NP-hard
- P: the set of problems that can be solved in polynomial time.
- NP (Nondeterministic polynomial time): the set of problems that can be verified (whether the provided solution is correct or not) in polynomial time given a potential solution.
- NP-complete: a problem p in NP is NP-complete if every other problem in NP can be reduced to p in polynomial time; that is, a problem that belongs to both NP and NP-hard. A solution to an NP-complete problem can be verified "quickly", but there is no known way to find a solution quickly.
- NP-hard: a decision problem p is NP-hard if every problem in NP can be reduced to p in polynomial time.
5 Catalan Number
\[C_0 = 1\] \[C_{n+1} = \sum_{i=0}^{n} C_i C_{n - i}\] The formula indicates a recursion scheme; see the sketch below.
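A minimal DP sketch that follows this recursion directly (the function name is mine):

def catalan(n):
    C = [0] * (n + 1)
    C[0] = 1
    # C[m] = sum over i of C[i] * C[m - 1 - i], per the recursion above
    for m in range(1, n + 1):
        C[m] = sum(C[i] * C[m - 1 - i] for i in range(m))
    return C[n]

print([catalan(i) for i in range(7)])  # [1, 1, 2, 5, 14, 42, 132]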
Related to the generate-parentheses problem, the full-binary-tree counting problem, and the monotonic lattice path problem (the number of "move right" steps must be at least the number of "move up" steps in every prefix, so the path always travels on or below the grid's diagonal). https://en.wikipedia.org/wiki/Catalan_number
6 ASCII
Pronunciation: ass-key :) Extended ASCII encodes 256 characters (0 - 255). Therefore, a char usually takes 1 byte (8 bits).
7 Convert int to byte string
def int_to_byte(num):
    # Assuming a 32-bit int: extract each byte, then order most significant first
    byte = [chr(num >> i * 8 & 0xff) for i in range(4)]
    byte.reverse()
    print(byte)
    return ''.join(byte)

def byte_to_int(byte):
    res = 0
    for i, c in enumerate(byte):
        res += ord(c) << (3 - i) * 8
    return res

b = int_to_byte(12)
print(byte_to_int(b))  # 12
8 Rotation Matrix
\[R = \begin{bmatrix} \cos \theta & -\sin \theta \\ \sin \theta & \cos \theta \end{bmatrix}\]
Therefore, turn left 90 degree (anti-clockwise): \[R = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}\]
\[R \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} -y \\ x \end{bmatrix}\]
Turn right 90 degree (clockwise): \[R = \begin{bmatrix} 0 & 1 \\ -1 & 0 \end{bmatrix}\]
\[R \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} y \\ -x \end{bmatrix}\]
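A tiny sketch applying the two matrices to a direction vector, which is handy for turning in grid-walk problems (the function names are mine):

def rotate_left(x, y):
    # 90 degrees counter-clockwise: (x, y) -> (-y, x)
    return -y, x

def rotate_right(x, y):
    # 90 degrees clockwise: (x, y) -> (y, -x)
    return y, -x

print(rotate_left(1, 0))   # (0, 1)
print(rotate_right(1, 0))  # (0, -1)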
Created: 2020-05-16 Sat 23:20