Python Algorithms - Mastering Basic Algorithms in the Python Language

Published: March 22, 2017
Tags:  Algorithms · Programming · Python



The book in...
One sentence:
A fairly detailed overview of many classic problems and their algorithmic solutions, expressed in Python.

Five sentences:
It starts out with an overview of big-O running times. Next, the basics of graphs and trees are discussed, with more advanced treatment later. Many of the common mathematical equations and concepts are covered, as well as more advanced topics including proofs, NP-hardness, and recurrence equations with their applications and running times. Searching and traversing are covered, from basic concepts like depth-first and breadth-first search up to more advanced algorithms. The appendixes list algorithms, problems, and terminology.



Thoughts

Not for beginners, but if you have a programming language under your belt (not necessarily Python, since Python reads close to pseudocode) and aren’t afraid of equations containing funny Greek squiggles, this can provide a lot of insight into problem solving in general.

Graphs are certainly nothing new, but with the advent of social media they have come into the “mainstream”. Even so, the graphs discussed here don’t have much in common with the typical social network; they are used to solve problems that are (sometimes) not intuitively seen as graph problems.

It includes a LOT of well-commented code snippets, as well as lists (in the appendixes) of common algorithms and problems. It seems like a worthy reference if you are looking for an algorithmic solution but aren’t sure which one you need. Heck, you might not even be sure what your problem is and how it might be tweaked so as to take advantage of a time-tested solution.


Table of Contents


· Chapter 1: Introduction

page 7:

· Chapter 2: The Basics

page 13:
page 14:
page 15:
page 16:
page 20:
page 21:
import cProfile
cProfile.run('main()')
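The call assumes a main() is already defined; a minimal self-contained session (my example, not the book's) might look like:

import cProfile

def main():                              # stand-in workload to profile
  return sum(i*i for i in range(10**6))

cProfile.run('main()')                   # prints call counts and per-function times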
page 23:
page 25:
page 26:
a, b, c, d, e, f, g, h = range(8)
N = [
  {b, c, d, e, f}, # a
  {c, e},          # b
  {d},             # c
  {e},             # d
  {f},             # e
  {c, g, h},       # f
  {f, h},          # g
  {f, g}           # h
]
page 27:
>>> b in N[a] # Neighborhood membership
True
>>> len(N[f]) # Degree
3
page 28:
a, b, c, d, e, f, g, h = range(8)
N = [
  {b:2, c:1, d:3, e:9, f:4}, # a
  {c:4, e:3},                # b
  {d:8},                     # c
  {e:7},                     # d
  {f:5},                     # e
  {c:2, g:2, h:2},           # f
  {f:1, h:6},                # g
  {f:9, g:8}                 # h
]
>>> b in N[a] # Neighborhood membership
True
>>> len(N[f]) # Degree
3
>>> N[a][b] # Edge weight for (a, b)
2
page 29:
N = {
  'a': set('bcdef'),
  'b': set('ce'),
  'c': set('d'),
  'd': set('e'),
  'e': set('f'),
  'f': set('cgh'),
  'g': set('fh'),
  'h': set('fg')
}
page 30:
page 34:
class Bunch(dict):
  def __init__(self, *args, **kwds):
    super(Bunch, self).__init__(*args, **kwds)
    self.__dict__ = self
>>> x = Bunch(name="Jayne Cobb", position="Public Relations")
>>> x.name
'Jayne Cobb'
Second, by subclassing dict, you get lots of functionality for free, such as iterating over the keys/attributes
or easily checking whether an attribute is present. Here's an example:
>>> T = Bunch
>>> t = T(left=T(left="a", right="b"), right=T(left="c"))
>>> t.left
{'right': 'b', 'left': 'a'}
>>> t.left.right
'b'
>>> t['left']['right']
'b'
>>> "left" in t.right
True
>>> "right" in t.right
False
page 37:
>>> s = ""
>>> for chunk in input():
...     s += chunk
>>> chunks = []
>>> for chunk in input():
...     chunks.append(chunk)
...
>>> s = ''.join(chunks)
>>> s = ''.join(input())
page 38:
>>> result = sum(lists, [])
>>> res = []
>>> for lst in lists:
...     res.extend(lst)
page 39:
>>> sum(0.1 for i in range(10)) == 1.0
False
>>> def almost_equal(x, y, places=7):
...     return round(abs(x-y), places) == 0
...
>>> almost_equal(sum(0.1 for i in range(10)), 1.0)
True
>>> from decimal import *
>>> sum(Decimal("0.1") for i in range(10)) == Decimal("1.0")
True

· Chapter 3: Counting 101

page 46:
page 47:
page 48:
page 49:
page 50:
page 51:
>>> from random import randrange
>>> n = 10**90
>>> p = randrange(10**90)

>>> p < n/2 #cut the search in half
True

>>> from math import log
>>> log(n, 2) # base-two logarithm
298.97352853986263 #how many guesses to always win
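That log says a halving strategy never needs more than about 299 guesses; a sketch of the strategy (my code, not the book's):

def guess(secret, lo=0, hi=10**90):  # bisect the half-open range [lo, hi)
  count = 0
  while True:
    count += 1
    mid = (lo + hi) // 2             # cut the search in half
    if mid == secret: return count   # found it after 'count' guesses
    if mid < secret: lo = mid + 1    # too low: keep the upper half
    else: hi = mid                   # too high: keep the lower half

With p as above, guess(p) stays within a guess or so of log2(10**90).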
page 53:
page 54:
page 56:
page 58:
page 59:
page 60:
page 65:

· Chapter 4: Induction and Recursion… and Reduction

page 71:
page 78:
page 79:
'''Listing 4-1. Recursive Insertion Sort'''
def ins_sort_rec(seq, i):
  if i==0: return                       #Base case -- do nothing
  ins_sort_rec(seq, i-1)                #Sort 0..i-1
  j = i                                 #Start "walking" down
  while j > 0 and seq[j-1] > seq[j]:    #Look for OK spot
    seq[j-1], seq[j] = seq[j], seq[j-1] #Keep moving seq[j] down
    j -= 1                              #Decrement j
'''Listing 4-2. Insertion Sort'''
def ins_sort(seq):
  for i in range(1,len(seq)):             #0..i-1 sorted so far
    j = i                                 #Start "walking" down
    while j > 0 and seq[j-1] > seq[j]:    #Look for OK spot
      seq[j-1], seq[j] = seq[j], seq[j-1] #Keep moving seq[j] down
      j -= 1                              #Decrement j
page 80:
page 81:
page 82:
>>> M = [2, 2, 0, 5, 3, 5, 7, 4]
>>> M[2] # the element at index 2 points to index 0
0
page 83:
'''Listing 4-5. A Naïve Implementation of the Recursive Algorithm Idea
   for Finding a Maximum Permutation'''
def naive_max_perm(M, A=None):
  if A is None:                   # The elt. set not supplied?
    A = set(range(len(M)))        # A = {0, 1, ... , n-1}
  if len(A) == 1: return A        # Base case -- single-elt. A
  B = set(M[i] for i in A)        # The "pointed to" elements
  C = A - B                       # "Not pointed to" elements
  if C:                           # Any useless elements?
    A.remove(C.pop())             # Remove one of them
    return naive_max_perm(M, A)   # Solve remaining problem
  return A                        # All useful -- return all

>>> naive_max_perm(M)
{0, 2, 5} # So, a, c, and f can take part in the permutation. 
page 84:
page 85:
from collections import defaultdict

def counting_sort(A, key=lambda x: x):
  B, C = [], defaultdict(list)       # Output and "counts"
  for x in A:
    C[key(x)].append(x)              # "Count" key(x)
  for k in range(min(C), max(C)+1):  # For every key in the range
    B.extend(C[k])                   # Add values in sorted order
  return B
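A quick check of counting_sort (my example):

>>> counting_sort([4, 2, 4, 1, 3])
[1, 2, 3, 4, 4]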
page 87:
page 88:
page 89:
page 91:
page 93:
for v in range(n):
  C[v] = float('inf')
for i in range(N):
  u, v = randrange(n), randrange(n)
  C[v] = min(C[v], A[u] + B[u][v]) # Relax
page 95:
page 97:
  1. Prove that this is correct using induction.

· Chapter 5: Traversal: The Skeleton Key of Algorithmics

page 104:
def walk(G, s, S=set()):            # Walk the graph from node s
  P, Q = dict(), set()              # Predecessors + "to do" queue
  P[s] = None                       # s has no predecessor
  Q.add(s)                          # We plan on starting with s
  while Q:                          # Still nodes to visit
    u = Q.pop()                     # Pick one, arbitrarily
    for v in G[u].difference(P, S): # New nodes?
      Q.add(v)                      # We plan to visit them!
      P[v] = u                      # Remember where we came from
  return P                          # The traversal tree
page 105:
def components(G):          # The connected components
  comp = []                 
  seen = set()              # Nodes we've already seen
  for u in G:               # Try every starting point
    if u in seen: continue  # Seen? Ignore it
    C = walk(G, u)          # Traverse component
    seen.update(C)          # Add keys of C to seen
    comp.append(C)          # Collect the components
  return comp
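A tiny check (my example) with two components:

>>> G = {0: {1, 2}, 1: {0}, 2: {0}, 3: {4}, 4: {3}}
>>> [sorted(c) for c in components(G)]
[[0, 1, 2], [3, 4]]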
page 108:
page 110:
def rec_dfs(G, s, S=None):
  if S is None: S = set()  # Initialize the history
  S.add(s)                 # We've visited s
  for u in G[s]:           # Explore neighbors
    if u in S: continue    # Already visited: Skip
    rec_dfs(G, u, S)       # New: Explore recursively
page 111:
def iter_dfs(G, s):
  S, Q = set(), []      # Visited-set and queue
  Q.append(s)           # We plan on visiting s
  while Q:              # Planned nodes left?
    u = Q.pop()         # Get one
    if u in S: continue # Already visited? Skip it
    S.add(u)            # We've visited it now
    Q.extend(G[u])      # Schedule all neighbors
    yield u             # Report u as visited
>>> list(iter_dfs(G, 0))
[0, 5, 7, 6, 2, 3, 4, 1]
page 112:
def traverse(G, s, qtype=set):
  S, Q = set(), qtype()
  Q.add(s)
  while Q:
    u = Q.pop()
    if u in S: continue
    S.add(u)
    for v in G[u]:
      Q.add(v)
    yield u
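The stack type used in the call below isn't a built-in; the book derives it from list so it has the same add interface as set, roughly:

class stack(list):
  add = list.append  # push; list.pop already pops the top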
>>> list(traverse(G, 0, stack))
[0, 5, 7, 6, 2, 3, 4, 1]
def dfs(G, s, d, f, S=None, t=0):
  if S is None: S = set()     # Initialize the history
  d[s] = t; t += 1            # Set discover time
  S.add(s)                    # We've visited s
  for u in G[s]:              # Explore neighbors
    if u in S: continue       # Already visited. Skip
    t = dfs(G, u, d, f, S, t) # Recurse; update timestamp
  f[s] = t; t += 1            # Set finish time
  return t                    # Return timestamp
page 113:
page 114:
page 115:
page 116:
def iddfs(G, s):
  yielded = set()                     # Visited for the first time
  def recurse(G, s, d, S=None):       # Depth-limited DFS
    if s not in yielded:
      yield s
      yielded.add(s)
    if d == 0: return                 # Max depth zero: Backtrack
    if S is None: S = set()
    S.add(s)
    for u in G[s]:
      if u in S: continue
      for v in recurse(G, u, d-1, S): # Recurse with depth-1
        yield v
  n = len(G)
  for d in range(n):                  # Try all depths 0..V-1
    if len(yielded) == n: break       # All nodes seen?
    for u in recurse(G, s, d):
      yield u
page 117:
def bfs(G, s):
  P, Q = {s: None}, deque([s]) # Parents and FIFO queue
  while Q:
    u = Q.popleft()            # Constant-time for deque
    for v in G[u]:
      if v in P: continue      # Already has parent
      P[v] = u                 # Reached from u: u is parent
      Q.append(v)
  return P
>>> path = [u]
>>> while P[u] is not None:
...     path.append(P[u])
...     u = P[u]
...
>>> path.reverse()
page 118:
page 119:
page 120:
def tr(G):                  # Transpose (rev. edges of) G
  GT = {}
  for u in G: GT[u] = set() # Get all the nodes in there
  for u in G:
    for v in G[u]:
      GT[v].add(u)          # Add all reverse edges
  return GT

def scc(G):
  GT = tr(G)                # Get the transposed graph
  sccs, seen = [], set()
  for u in dfs_topsort(G):  # DFS starting points
    if u in seen: continue  # Ignore covered nodes
    C = walk(GT, u, seen)   # Don't go "backward" (seen)
    seen.update(C)          # We've now seen C
    sccs.append(C)          # Another SCC found
  return sccs
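scc leans on dfs_topsort, which these notes don't copy; a sketch from memory of the book's version (finish-time ordering via recursive DFS):

def dfs_topsort(G):
  S, res = set(), []   # History and result
  def recurse(u):      # Traversal subroutine
    if u in S: return  # Ignore visited nodes
    S.add(u)           # Otherwise: Add to history
    for v in G[u]:
      recurse(v)       # Finish all children...
    res.append(u)      # ...before finishing u
  for u in G:          # Cover the entire graph
    recurse(u)
  res.reverse()        # It's all backward so far
  return res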

· Chapter 6: Divide, Combine, and Conquer

page 126:
page 127:
page 128:
def divide_and_conquer(S, divide, combine):
  if len(S) == 1: return S
  L, R = divide(S)
  A = divide_and_conquer(L, divide, combine)
  B = divide_and_conquer(R, divide, combine)
  return combine(A, B)
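Plugging a midpoint split and a sorted merge into this skeleton gives merge sort; my illustration, not a listing from the book:

from heapq import merge                # Merges two sorted iterables

def divide(S):                         # Split at the midpoint
  mid = len(S) // 2
  return S[:mid], S[mid:]

def combine(A, B):                     # Merge the sorted halves
  return list(merge(A, B))

>>> divide_and_conquer([3, 1, 4, 1, 5], divide, combine)
[1, 1, 3, 4, 5]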
page 129:
>>> from bisect import bisect
>>> a = [0, 2, 3, 5, 6, 8, 8, 9]
>>> bisect(a, 5)
4
page 131:
page 133:
class Node:
  lft = None
  rgt = None
  def __init__(self, key, val):
    self.key = key
    self.val = val

def insert(node, key, val):
  if node is None: return Node(key, val)  # Empty leaf: add node here
  if node.key == key: node.val = val      # Found key: replace val
  elif key < node.key:                    # Less than the key?
    node.lft = insert(node.lft, key, val) # Go left
  else:                                   # Otherwise...
    node.rgt = insert(node.rgt, key, val) # Go right
  return node

def search(node, key):
  if node is None: raise KeyError         # Empty leaf: it's not here
  if node.key == key: return node.val     # Found key: return val
  elif key < node.key:                    # Less than the key?
    return search(node.lft, key)          # Go left
  else:                                   # Otherwise...
    return search(node.rgt, key)          # Go right

class Tree:                                 # Simple wrapper
  root = None
  def __setitem__(self, key, val):
    self.root = insert(self.root, key, val)

  def __getitem__(self, key):
    return search(self.root, key)

  def __contains__(self, key):
    try: search(self.root, key)
    except KeyError: return False
    return True
>>> tree = Tree()
>>> tree["a"] = 42
>>> tree["a"]
42
>>> "b" in tree
False
page 134:
def partition(seq):
  pi, seq = seq[0], seq[1:]        # Pick and remove the pivot
  lo = [x for x in seq if x <= pi] # All the small elements
  hi = [x for x in seq if x > pi]  # All the large ones
  return lo, pi, hi                # pi is "in the right place"

def select(seq, k):
  lo, pi, hi = partition(seq)      # [<= pi], pi, [> pi]
  m = len(lo)                      
  if m == k: return pi             # We found the kth smallest
  elif m < k:                      # Too far to the left
    return select(hi, k-m-1)       # Remember to adjust k
  else:                            # Too far to the right
    return select(lo, k)           # Just use original k here
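Sanity check (mine): select returns the kth smallest, counting from zero:

>>> select([7, 3, 1, 5], 0)
1
>>> select([7, 3, 1, 5], 2)
5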
page 137:
page 138:
page 140:
page 144:
page 147:
page 149:

· Chapter 7: Greed Is Good? Prove It!

page 151:
page 153:
>>> denom = [10000, 5000, 2000, 1000, 500, 200, 100, 50, 25, 10, 5, 1]
>>> owed = 56329
>>> payed = []
>>> for d in denom:
...   while owed >= d:
...     owed -= d
...     payed.append(d)
...
>>> sum(payed)
56329
>>> len(payed)
14
page 156:
page 157:
page 158:
from heapq import heapify, heappush, heappop
from itertools import count
def huffman(seq, frq):
  num = count()
  trees = list(zip(frq, num, seq))      # num ensures valid ordering
  heapify(trees)                        # A min-heap based on freq
  while len(trees) > 1:                 # Until all are combined
    fa, _, a = heappop(trees)           # Get the two smallest trees
    fb, _, b = heappop(trees)
    n = next(num)
    heappush(trees, (fa+fb, n, [a, b])) # Combine and re-add them
  return trees[0][-1]
page 159:
>>> seq = "abcdefghi"
>>> frq = [4, 5, 6, 9, 11, 12, 15, 16, 20]
>>> huffman(seq, frq)
[['i', [['a', 'b'], 'e']], [['f', 'g'], [['c', 'd'], 'h']]]
def codes(tree, prefix=""):
  if len(tree) == 1:
    yield (tree, prefix)                    # A leaf with its code
    return
  for bit, child in zip("01", tree):        # Left (0) and right (1)
    for pair in codes(child, prefix + bit): # Get codes recursively
      yield pair
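Feeding the tree from above through codes assigns short bit strings to the frequent characters (my check, not the book's):

>>> sorted(codes(huffman(seq, frq)))
[('a', '0100'), ('b', '0101'), ('c', '1100'), ('d', '1101'), ('e', '011'),
 ('f', '100'), ('g', '101'), ('h', '111'), ('i', '00')]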
page 161:
page 162:
page 165:
def find(C, u):
  if C[u] != u:
    C[u] = find(C, C[u]) # Path compression
  return C[u]

def union(C, R, u, v):
  u, v = find(C, u), find(C, v)
  if R[u] > R[v]: # Union by rank
    C[v] = u
  else:
    C[u] = v
  if R[u] == R[v]: # A tie: Move v up a level
    R[v] += 1

def kruskal(G):
  E = [(G[u][v],u,v) for u in G for v in G[u]]
  T = set()
  C, R = {u:u for u in G}, {u:0 for u in G} # Comp. reps and ranks
  for _, u, v in sorted(E):
    if find(C, u) != find(C, v):
      T.add((u, v))
      union(C, R, u, v)
  return T
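A tiny triangle graph as a check (my example); kruskal keeps the two cheapest edges:

>>> G = {0: {1: 1, 2: 4}, 1: {0: 1, 2: 2}, 2: {0: 4, 1: 2}}
>>> sorted(kruskal(G))
[(0, 1), (1, 2)]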
page 167:
from heapq import heappop, heappush
def prim(G, s):
  P, Q = {}, [(0, None, s)]
  while Q:
    _, p, u = heappop(Q)
    if u in P:  #if we already added u to P, don't add a dupe
      continue
    P[u] = p
    for v, w in G[u].items():
      heappush(Q, (w, u, v))
  return P
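The same triangle graph through prim (my example); P maps each node to its parent in the spanning tree:

>>> G = {0: {1: 1, 2: 4}, 1: {0: 1, 2: 2}, 2: {0: 4, 1: 2}}
>>> prim(G, 0)
{0: None, 1: 0, 2: 1}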
page 168:
page 172:

· Chapter 8: Tangled Dependencies and Memoization

page 175:
page 176:
from itertools import combinations
def naive_lis(seq):
  for length in range(len(seq), 0, -1):    # n, n-1, ... , 1
    for sub in combinations(seq, length):  # Subsequences of given length
      if list(sub) == sorted(sub):         # An increasing subsequence?
        return sub                         # Return it!
page 177:
from functools import wraps
def memo(func):
  cache = {}                       # Stored subproblem solutions
  @wraps(func)                     # Make wrap look like func
  def wrap(*args):                 # The memoized wrapper
    if args not in cache:          # Not already computed?
      cache[args] = func(*args)    # Compute & cache the solution
    return cache[args]             # Return the cached solution
  return wrap                      # Return the wrapper
page 178:
>>> @memo
... def fib(i):
...     if i < 2: return 1
...     return fib(i-1) + fib(i-2)
...
>>> fib(100)
573147844013817084101
>>> def two_pow(i):
...     if i == 0: return 1
...     return two_pow(i-1) + two_pow(i-1)
...
>>> two_pow(10)
1024
>>> two_pow(100) # hangs -- on the order of 2**100 recursive calls
>>> def two_pow(i):
...     if i == 0: return 1
...     return 2*two_pow(i-1)
...
>>> print(two_pow(10))
1024
>>> print(two_pow(100))
1267650600228229401496703205376
page 179:

C(n, k) = C(n-1, k-1) + C(n-1, k)    (Pascal's rule for "n choose k")

page 180:
>>> @memo
... def C(n,k):
...     if k == 0: return 1
...     if n == 0: return 0
...     return C(n-1,k-1) + C(n-1,k)
...
>>> C(4,2)
6
>>> C(10,7)
120
>>> C(100,50)
100891344545564193334812497256
page 181:
>>> from collections import defaultdict
>>> n, k = 10, 7
>>> C = defaultdict(int)
>>> for row in range(n+1):
...     C[row,0] = 1
...     for col in range(1,k+1):
...         C[row,col] = C[row-1,col-1] + C[row-1,col]
...
>>> C[n,k]
120
page 183:
def rec_dag_sp(W, s, t):                   # Shortest path from s to t
  @memo                                    # Memoize d
  def d(u):                                # Distance from u to t
    if u == t:
      return 0                             # We're there!
    return min(W[u][v]+d(v) for v in W[u]) # Best of every first step
  return d(s)
def dag_sp(W, s, t):
  d = {u:float('inf') for u in W}
  d[s] = 0
  for u in topsort(W):
    if u == t: break
    for v in W[u]:
      d[v] = min(d[v], d[u] + W[u][v])
  return d[t]
page 185:
def rec_lis(seq):                           # Longest increasing subseq.
  @memo
  def L(cur):                               # Longest ending at seq[cur]
    res = 1                                 # Length is at least 1
    for pre in range(cur):                  # Potential predecessors
      if seq[pre] <= seq[cur]:              # A valid (smaller) predec.
        res = max(res, 1 + L(pre))          # Can we improve the solution?
    return res
  return max(L(i) for i in range(len(seq))) # The longest of them all
def basic_lis(seq):
  L = [1] * len(seq)
  for cur, val in enumerate(seq):
    for pre in range(cur):
      if seq[pre] <= val:
        L[cur] = max(L[cur], 1 + L[pre])
  return max(L)
page 187:
from bisect import bisect
def lis(seq):                           # Longest increasing subseq.
  end = []                              # End-values for all lengths
  for val in seq:                       # Try every value, in order
    idx = bisect(end, val)              # Can we build on an end val?
    if idx == len(end): end.append(val) # Longest seq. extended
    else: end[idx] = val                # Prev. endpoint reduced
  return len(end)                       # The longest we found
page 188:
def rec_lcs(a,b):                          # Longest common subsequence
  @memo # L is memoized
  def L(i,j):                              # Prefixes a[:i] and b[:j]
    if min(i,j) < 0: return 0              # One prefix is empty
    if a[i] == b[j]: return 1 + L(i-1,j-1) # Match! Move diagonally
    return max(L(i-1,j), L(i,j-1))         # Chop off either a[i] or b[j]
  return L(len(a)-1,len(b)-1)              # Run L on entire sequences
page 189:
def lcs(a,b):
  n, m = len(a), len(b)
  pre, cur = [0]*(n+1), [0]*(n+1)      # Previous/current row
  for j in range(1,m+1):               # Iterate over b
    pre, cur = cur, pre                # Keep prev., overwrite cur.
    for i in range(1,n+1):             # Iterate over a
      if a[i-1] == b[j-1]:             # Last elts. of pref. equal?
        cur[i] = pre[i-1] + 1          # L(i,j) = L(i-1,j-1) + 1
      else:                            # Otherwise...
        cur[i] = max(pre[i], cur[i-1]) # max(L(i,j-1),L(i-1,j))
  return cur[n] # L(n,m)
page 190:
def rec_unbounded_knapsack(w, v, c): # Weights, values and capacity
  @memo                              # m is memoized
  def m(r):                          # Max val. w/remaining cap. r
    if r == 0: return 0              # No capacity? No value
    val = m(r-1)                     # Ignore the last cap. unit?
    for i, wi in enumerate(w):       # Try every object
      if wi > r: continue            # Too heavy? Ignore it
      val = max(val, v[i] + m(r-wi)) # Add value, remove weight
    return val                       # Max over all last objects
  return m(c)                        # Full capacity available
page 191:
def unbounded_knapsack(w, v, c):
  m = [0]
  for r in range(1,c+1):
    val = m[r-1]
    for i, wi in enumerate(w):
      if wi > r: continue
      val = max(val, v[i] + m[r-wi])
    m.append(val)
  return m[c]
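Quick check (mine): weights 2 and 3 with values 3 and 5, capacity 7; best is two of the first item plus one of the second:

>>> unbounded_knapsack([2, 3], [3, 5], 7)
11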
page 192:
def rec_knapsack(w, v, c):                  # Weights, values and capacity
  @memo                                     # m is memoized
  def m(k, r):                              # Max val., k objs and cap r
    if k == 0 or r == 0: return 0           # No objects/no capacity
    i = k-1                                 # Object under consideration
    drop = m(k-1, r)                        # What if we drop the object?
    if w[i] > r: return drop                # Too heavy: Must drop it
    return max(drop, v[i] + m(k-1, r-w[i])) # Include it? Max of in/out
  return m(len(w), c)                       # All objects, all capacity
def knapsack(w, v, c):                    # Returns solution matrices
  n = len(w)                              # Number of available items
  m = [[0]*(c+1) for i in range(n+1)]     # Empty max-value matrix
  P = [[False]*(c+1) for i in range(n+1)] # Empty keep/drop matrix
  for k in range(1,n+1):                  # We can use k first objects
    i = k-1                               # Object under consideration
    for r in range(1,c+1):                # Every positive capacity
      m[k][r] = drop = m[k-1][r]          # By default: drop the object
      if w[i] > r: continue               # Too heavy? Ignore it
      keep = v[i] + m[k-1][r-w[i]]        # Value of keeping it
      m[k][r] = max(drop, keep)           # Best of dropping and keeping
      P[k][r] = keep > drop               # Did we keep it?
  return m, P                             # Return full results
>>> m, P = knapsack(w, v, c)
>>> k, r, items = len(w), c, set()
>>> while k > 0 and r > 0:
...   i = k-1
...   if P[k][r]:
...     items.add(i)
...     r -= w[i]
...   k -= 1
page 196:

· Chapter 9: From A to B with Edsger and Friends

page 200:
inf = float('inf')
def relax(W, u, v, D, P):
  d = D.get(u,inf) + W[u][v] # Possible shortcut estimate
  if d < D.get(v,inf):       # Is it really a shortcut?
    D[v], P[v] = d, u        # Update estimate and parent
    return True              # There was a change!
>>> D[u]
7
>>> D[v]
13
>>> W[u][v]
3
>>> relax(W, u, v, D, P)
True
>>> D[v]
10
>>> D[v] = 8
>>> relax(W, u, v, D, P)
>>> D[v]
8
page 201:
page 202:
page 203:
def bellman_ford(G, s):
  D, P = {s:0}, {}                    # Zero-dist to s; no parents
  for rnd in G:                       # n = len(G) rounds
    changed = False                   # No changes in round so far
    for u in G:                       # For every from-node...
      for v in G[u]:                  # ... and its to-nodes...
        if relax(G, u, v, D, P):      # Shortcut to v from u?
          changed = True              # Yes! So something changed
    if not changed: break             # No change in round: Done
  else:                               # Not done before round n?
    raise ValueError('negative cycle')# Negative cycle detected
  return D, P                         # Otherwise: D and P correct
page 205:
from heapq import heappush, heappop
def dijkstra(G, s):
  D, P, Q, S = {s:0}, {}, [(0,s)], set() # Est., tree, queue, visited
  while Q:                               # Still unprocessed nodes?
    _, u = heappop(Q)                    # Node with lowest estimate
    if u in S: continue                  # Already visited? Skip it
    S.add(u)                             # We've visited it now
    for v in G[u]:                       # Go through all its neighbors
      relax(G, u, v, D, P)               # Relax the out-edge
      heappush(Q, (D[v], v))             # Add to queue, w/est. as pri
  return D, P                            # Final D and P returned
page 206:
page 207:
from copy import deepcopy
def johnson(G):                 # All pairs shortest paths
  G = deepcopy(G)               # Don't want to break original
  s = object()                  # Guaranteed unused node
  G[s] = {v:0 for v in G}       # Edges from s have zero wgt
  h, _ = bellman_ford(G, s)     # h[v]: Shortest dist from s
  del G[s]                      # No more need for s
  for u in G:                   # The weight from u ...
    for v in G[u]:              # ... to v ...
      G[u][v] += h[u] - h[v]    # ... is adjusted (nonneg.)
  D, P = {}, {}                 # D[u][v] and P[u][v]
  for u in G:                   # From every u ...
    D[u], P[u] = dijkstra(G, u) # ... find the shortest paths
    for v in G:                 # For each destination ...
      D[u][v] += h[v] - h[u]    # ... readjust the distance
  return D, P                   # These are two-dimensional
page 208:
page 209:
def rec_floyd_warshall(G):                             # All shortest paths
  @memo                                                # Store subsolutions
  def d(u,v,k):                                        # u to v via 1..k
    if k==0: return G[u][v]                            # Assumes v in G[u]
    return min(d(u,v,k-1), d(u,k,k-1) + d(k,v,k-1))    # Use k or not?
  return {(u,v): d(u,v,len(G)) for u in G for v in G}  # D[u,v] = d(u,v,n)
page 210:
def floyd_warshall(G):
  D = deepcopy(G) # No intermediates yet
  for k in G:     # Look for shortcuts with k
    for u in G:
      for v in G:
        D[u][v] = min(D[u][v], D[u][k] + D[k][v])
  return D
def floyd_warshall(G):
  D, P = deepcopy(G), {}
  for u in G:
    for v in G:
      if u == v or G[u][v] == inf:
        P[u,v] = None
      else:
        P[u,v] = u
  for k in G:
    for u in G:
      for v in G:
        shortcut = D[u][k] + D[k][v]
        if shortcut < D[u][v]:
          D[u][v] = shortcut
          P[u,v] = P[k,v]
  return D, P
page 211:
def idijkstra(G, s):
  Q, S = [(0,s)], set()            # Queue w/dists, visited
  while Q:                         # Still unprocessed nodes?
    d, u = heappop(Q)              # Node with lowest estimate
    if u in S: continue            # Already visited? Skip it
    S.add(u)                       # We've visited it now
    yield u, d                     # Yield a subsolution/node
    for v in G[u]:                 # Go through all its neighbors
      heappush(Q, (d+G[u][v], v))  # Add to queue, w/est. as pri
page 212:
page 213:
from itertools import cycle
def bidir_dijkstra(G, s, t):
  Ds, Dt = {}, {}                             # D from s and t, respectively
  forw, back = idijkstra(G,s), idijkstra(G,t) # The "two Dijkstras"
  dirs = (Ds, Dt, forw), (Dt, Ds, back)       # Alternating situations
  try:                                        # Until one of forw/back ends
    for D, other, step in cycle(dirs):        # Switch between the two
      v, d = next(step)                       # Next node/distance for one
      D[v] = d                                # Remember the distance
      if v in other: break                    # Also visited by the other?
  except StopIteration: return inf            # One ran out before they met
  m = inf                                     # They met; now find the path
  for u in Ds:                                # For every visited forw-node
    for v in G[u]:                            # ... go through its neighbors
      if not v in Dt: continue                # Is it also back-visited?
      m = min(m, Ds[u] + G[u][v] + Dt[v])     # Is this path better?
  return m                                    # Return the best path
page 215:
from heapq import heappush, heappop
def a_star(G, s, t, h):
  P, Q = {}, [(h(s), None, s)]     # Preds and queue w/heuristic
  while Q:                         # Still unprocessed nodes?
    d, p, u = heappop(Q)           # Node with lowest heuristic
    if u in P: continue            # Already visited? Skip it
    P[u] = p                       # Set path predecessor
    if u == t: return d - h(t), P  # Arrived! Ret. dist and preds
    for v in G[u]:                 # Go through all neighbors
      w = G[u][v] - h(u) + h(v)    # Modify weight wrt heuristic
      heappush(Q, (d + w, u, v))   # Add to queue, w/heur as pri
  return inf, None                 # Didn't get to t
page 216:
from string import ascii_lowercase as chars

class WordSpace:                 # An implicit graph w/utils

  def __init__(self, words):     # Create graph over the words
    self.words = words
    self.M = M = dict()          # Reachable words

  def variants(self, wd, words): # Yield all word variants
    wasl = list(wd)              # The word as a list
    for i, c in enumerate(wasl): # Each position and character
      for oc in chars:           # Every possible character
        if c == oc: continue     # Don't replace with the same
        wasl[i] = oc             # Replace the character
        ow = ''.join(wasl)       # Make a string of the word
        if ow in words:          # Is it a valid word?
          yield ow               # Then we yield it
      wasl[i] = c                # Reset the character

  def __getitem__(self, wd):     # The adjacency map interface
    if wd not in self.M:         # Cache the neighbors
      self.M[wd] = dict.fromkeys(self.variants(wd, self.words), 1)
    return self.M[wd]

  def heuristic(self, u, v):               # The default heuristic
    return sum(a!=b for a, b in zip(u, v)) # How many characters differ?

  def ladder(self, s, t, h=None):  # Utility wrapper for a_star
    if h is None:                  # Allows other heuristics
      def h(v):
        return self.heuristic(v, t)
    _, P = a_star(self, s, t, h)   # Get the predecessor map
    if P is None:
      return [s, None, t]          # When no path exists
    u, p = t, []
    while u is not None:           # Walk backward from t
      p.append(u)                  # Append every predecessor
      u = P[u]                     # Take another step
    p.reverse()                    # The path is backward
    return p
>>> wds = set(line.strip().lower() for line in open("/usr/share/dict/words"))
>>> G = WordSpace(wds)
>>> G.ladder('lead', 'gold')
['lead', 'load', 'goad', 'gold']
page 217:

· Chapter 10: Matchings, Cuts, and Flows

page 221:
page 224:
from collections import defaultdict
from itertools import chain

def match(G, X, Y):                              # Maximum bipartite matching
  H = tr(G)                                      # The transposed graph
  S, T, M = set(X), set(Y), set()                # Unmatched left/right + match
  while S:                                       # Still unmatched on the left?
    s = S.pop()                                  # Get one
    Q, P = {s}, {}                               # Start a traversal from it
    while Q:                                     # Discovered, unvisited
      u = Q.pop()                                # Visit one
      if u in T:                                 # Finished augmenting path?
        T.remove(u)                              # u is now matched
        break                                    # and our traversal is done
      forw = (v for v in G[u] if (u,v) not in M) # Possible new edges
      back = (v for v in H[u] if (v,u) in M)     # Cancellations
      for v in chain(forw, back):                # Along out- and in-edges
        if v in P: continue                      # Already visited? Ignore
        P[v] = u                                 # Traversal predecessor
        Q.add(v)                                 # New node discovered
    while u != s:                                # Augment: Backtrack to s
      u, v = P[u], u                             # Shift one step
      if v in G[u]:                              # Forward edge?
        M.add((u,v))                             # New edge
      else:                                      # Backward edge?
        M.remove((v,u))                          # Cancellation
  return M                                       # Matching -- a set of edges
page 225:
  1. The number of paths going into any node except s or t must equal the number of paths going out of that node.

  2. At most one path can go through any given edge.

page 227:
page 228:
  1. The amount of flow going into any node except s or t must equal the amount of flow going out of that node.

  2. At most c(e) units of flow can go through any given edge. [c(e) is capacity of edge]

page 229:
page 230:
from collections import deque
inf = float('inf')

def bfs_aug(G, H, s, t, f):
  P, Q, F = {s: None}, deque([s]), {s: inf} # Tree, queue, flow label
  def label(inc):                           # Flow increase at v from u?
    if v in P or inc <= 0: return           # Seen? Unreachable? Ignore
    F[v], P[v] = min(F[u], inc), u          # Max flow here? From where?
    Q.append(v)                             # Discovered -- visit later
  while Q:                                  # Discovered, unvisited
    u = Q.popleft()                         # Get one (FIFO)
    if u == t: return P, F[t]               # Reached t? Augmenting path!
    for v in G[u]: label(G[u][v]-f[u,v])    # Label along out-edges
    for v in H[u]: label(f[v,u])            # Label along in-edges
  return None, 0                            # No augmenting path found
def ford_fulkerson(G, s, t, aug=bfs_aug): # Max flow from s to t
  H, f = tr(G), defaultdict(int)          # Transpose and flow
  while True:                             # While we can improve things
    P, c = aug(G, H, s, t, f)             # Aug. path and capacity/slack
    if c == 0: return f                   # No augm. path found? Done!
    u = t                                 # Start augmentation
    while u != s:                         # Backtrack to s
      u, v = P[u], u                      # Shift one step
      if v in G[u]: f[u,v] += c           # Forward edge? Add slack
      else: f[v,u] -= c                   # Backward edge? Cancel slack
page 231:
page 234:
def busacker_gowen(G, W, s, t):            # Min-cost max-flow
  def sp_aug(G, H, s, t, f):               # Shortest path (Bellman-Ford)
    D, P, F = {s:0}, {s:None}, {s:inf,t:0} # Dist, preds and flow
    def label(inc, cst):                   # Label + relax, really
      if inc <= 0: return False            # No flow increase? Skip it
      d = D.get(u,inf) + cst               # New possible aug. distance
      if d >= D.get(v,inf): return False   # No improvement? Skip it
      D[v], P[v] = d, u                    # Update dist and pred
      F[v] = min(F[u], inc)                # Update flow label
      return True                          # We changed things!
    for rnd in G:                          # n = len(G) rounds
      changed = False                      # No changes in round so far
      for u in G:                          # Every from-node
        for v in G[u]:                     # Every forward to-node
          changed |= label(G[u][v]-f[u,v], W[u,v])
        for v in H[u]:                     # Every backward to-node
          changed |= label(f[v,u], -W[v,u])
      if not changed: break                # No change in round: Done
    else:                                  # Not done before round n?
      raise ValueError('negative cycle')   # Negative cycle detected
    return P, F[t]                         # Preds and flow reaching t
  return ford_fulkerson(G, s, t, sp_aug)   # Max-flow with Bellman-Ford
page 237:

· Chapter 11: Hard Problems and (Limited) Sloppiness

page 242:
  1. The problem is intractable—any algorithm solving it must be exponential.

  2. We don’t know whether the problem is intractable, but no one has ever been able to find a polynomial algorithm for it.

page 256:
page 261:
page 262:
from collections import defaultdict
def mtsp(G, r):                     # 2-approx for metric TSP
  T, C = defaultdict(list), []      # Tree and cycle
  for c, p in prim(G, r).items():   # Build a traversable MST
    T[p].append(c)                  # Child is parent's neighbor
  def walk(r):                      # Recursive DFS
    C.append(r)                     # Preorder node collection
    for v in T[r]: walk(v)          # Visit subtrees recursively
  walk(r)                           # Traverse from the root
  return C                          # At least half-optimal cycle
page 265:
page 266:
from __future__ import division
from heapq import heappush, heappop
from itertools import count

def bb_knapsack(w, v, c):
  sol = 0               # Solution so far
  n = len(w)            # Item count
  idxs = list(range(n))
  idxs.sort(key=lambda i: v[i]/w[i], reverse=True) # Sort by descending unit cost

  def bound(sw, sv, m):                     # Greedy knapsack bound
    if m == n: return sv                    # No more items?
    objs = ((v[i], w[i]) for i in idxs[m:]) # Descending unit cost order
    for av, aw in objs:                     # Added value and weight
      if sw + aw > c: break                 # Still room?
      sw += aw                              # Add wt to sum of wts
      sv += av                              # Add val to sum of vals
    return sv + (av/aw)*(c-sw)              # Add fraction of last item

  def node(sw, sv, m):                  # A node (generates children)
    nonlocal sol                        # "Global" inside bb_knapsack
    if sw > c: return                   # Weight sum too large? Done
    sol = max(sol, sv)                  # Otherwise: Update solution
    if m == n: return                   # No more objects? Return
    i = idxs[m]                         # Get the right index
    ch = [(sw, sv), (sw+w[i], sv+v[i])] # Children: without/with m
    for sw, sv in ch:                   # Try both possibilities
      b = bound(sw, sv, m+1)            # Bound for m+1 items
      if b > sol:                       # Is the branch promising?
        yield b, node(sw, sv, m+1)      # Yield child w/bound

  num = count()                       # Helps avoid heap collisions
  Q = [(0, next(num), node(0, 0, 0))] # Start with just the root
  while Q:                            # Any nodes left?
    _, _, r = heappop(Q)              # Get one
    for b, u in r:                    # Expand it ...
       heappush(Q, (b, next(num), u)) # ... and push the children
  return sol                          # Return the solution
page 271:

· PEDAL TO THE METAL: ACCELERATING PYTHON (Appendix A)

page 272:
page 273:

· LIST OF PROBLEMS AND ALGORITHMS (Appendix B)

page 276:
page 277:
page 278:

· ALGORITHMS AND DATA STRUCTURES (Appendix B)

page 280:
page 281:
page 282:
page 283:

· GRAPH TERMINOLOGY (Appendix C)

page 286:
page 288:
  1. T is a tree (that is, it is acyclic and connected).
  2. T is acyclic, and has n - 1 edges.
  3. T is connected, and has n - 1 edges.
  4. Any two nodes are connected by exactly one path.
  5. T is acyclic, but adding any new edge to it will create a cycle.
  6. T is connected, but removing any edge yields two connected components.