20091005


Dynamic Programming

We commenced our study of dynamic programming with an example:

Consider a city with a completely rectangular street layout, with n+1 North-South streets and m+1 East-West streets.  All the streets are one-way: the North-South streets all run south and the East-West streets all run east.  The streets are all numbered in the natural way and each intersection is identified by the streets that meet there.  (0,0) is the northwest corner of the city, and (n,m) is the southeast corner.

(In class I described this as a map of Middle Earth, and the streets as path segments leading from Rivendell in the northwest to Mt. Doom in the southeast.  The city analogy is perhaps less strained.)

We want to get from (0,0) to (n,m).  It is easy to see that all routes we could take contain the same number of street segments: we must travel n blocks to the east and m blocks south.  To make the problem interesting, we add a positive cost to each street segment - and all costs might be different.   Now it makes sense to ask for the least expensive route from (0,0) to (n,m).  This is an example of an optimization problem.

It is not hard to see that there are exponentially many different routes from (0,0) to (n,m): every route consists of n+m street segments, and a route is completely determined by choosing which n of those segments go east, so there are C(n+m, n) routes in all.  Enumerating all possible solutions and choosing the least expensive one is therefore impractical.
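To get a feel for how fast the number of routes grows, we can evaluate C(n+m, n) for a few square grids (a quick sketch in Python):

```python
from math import comb

# Number of routes from (0,0) to (n,m): each route makes n+m moves,
# and is determined by choosing which n of them go east.
for n, m in [(5, 5), (10, 10), (20, 20)]:
    print(f"n = m = {n}: {comb(n + m, n)} routes")
```

Already for n = m = 20 there are over 10^11 routes, so brute-force enumeration is hopeless.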

Can we solve this with a Divide and Conquer algorithm?  If we knew for certain that the optimal solution had to go through some particular intersection (i,j) then we could divide the problem into "find the best route from (0,0) to (i,j)" and "find the best route from (i,j) to (n,m)" - but the trouble with this idea is that we have no way to know in advance that any particular (i,j) is part of the optimal solution.

We approach the problem by considering the final step of the solution.  There are only two possible final steps:  either we move from (n-1,m) to (n,m), or we move from (n,m-1) to (n,m).  If we knew the minimum cost to reach (n-1,m) and the minimum cost to reach (n,m-1), then we could easily determine the minimum cost to reach (n,m).

More formally, we can let Min_Cost(i,j) be the minimum cost of getting from (0,0) to (i,j).  Then

Min_Cost(n,m) = min{ Min_Cost(n-1,m) + "cost from (n-1,m) to (n,m)",
                     Min_Cost(n,m-1) + "cost from (n,m-1) to (n,m)" }

where the "cost from ... to ..." is just the cost of the single street joining the two intersections.

In fact, we can see that for any intersection (i,j) with i >= 1 and j >= 1 the same pattern holds:

Min_Cost(i,j) = min{ Min_Cost(i-1,j) + "cost from (i-1,j) to (i,j)",
                     Min_Cost(i,j-1) + "cost from (i,j-1) to (i,j)" }

because even though there may be exponentially many paths from (0,0) to (i,j), they all go through either (i-1,j) or (i,j-1) .

The intersections on the north and west edges of the city have an even simpler situation because each one has exactly one possible predecessor:
Min_Cost(0,j) = Min_Cost(0,j-1) + "cost from (0,j-1) to (0,j)"
Min_Cost(i,0) = Min_Cost(i-1,0) + "cost from (i-1,0) to (i,0)"
(Note that the east and south edges are not special in this way: an intersection such as (n,j) with j >= 1 can still be reached from both (n-1,j) and (n,j-1), so the general recurrence applies there.)

Given this recurrence relation, we could use recursion to compute the cost of an optimal solution.  Since in most cases the recurrence involves two subproblems, it might seem that the number of recursive calls will be exponentially large - and if we naively recompute each Min_Cost(i,j) every time it is needed, it is.  However, there are in fact only (n+1)*(m+1) distinct subproblems, and we only need to solve each of them once.
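The top-down version of this idea can be sketched in Python by memoizing the recursive function, so that each subproblem is solved at most once.  The cost arrays here are hypothetical: in this sketch east_cost[i][j] is the cost of the segment from (i,j) to (i+1,j), and south_cost[i][j] is the cost of the segment from (i,j) to (i,j+1).

```python
from functools import lru_cache

# A small hypothetical instance with n = m = 2.
n, m = 2, 2
east_cost  = [[3, 2, 5], [1, 4, 2]]    # east_cost[i][j]: (i,j) -> (i+1,j)
south_cost = [[2, 1], [6, 1], [2, 3]]  # south_cost[i][j]: (i,j) -> (i,j+1)

@lru_cache(maxsize=None)
def min_cost(i, j):
    """Minimum cost of any route from (0,0) to (i,j)."""
    if i == 0 and j == 0:
        return 0
    candidates = []
    if i > 0:   # arrive from the west, along an east-running street
        candidates.append(min_cost(i - 1, j) + east_cost[i - 1][j])
    if j > 0:   # arrive from the north, along a south-running street
        candidates.append(min_cost(i, j - 1) + south_cost[i][j - 1])
    return min(candidates)

print(min_cost(n, m))
```

The memoization guarantees that min_cost is evaluated at most (n+1)*(m+1) times, no matter how the recursion branches.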

Instead of using recursion to solve the problem from the top down, we use iteration to solve it from the bottom up.  Starting from Min_Cost(0,0) = 0, we compute Min_Cost(0,j) for j going from 1 to m, then we compute Min_Cost(1,j) for j going from 0 to m, and so on.

The last value we compute will be Min_Cost(n,m).
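The bottom-up computation can be sketched as follows, using the same hypothetical cost arrays as before (east_cost[i][j] is the cost of the segment from (i,j) to (i+1,j), south_cost[i][j] the cost from (i,j) to (i,j+1)):

```python
# A small hypothetical instance with n = m = 2.
n, m = 2, 2
east_cost  = [[3, 2, 5], [1, 4, 2]]    # east_cost[i][j]: (i,j) -> (i+1,j)
south_cost = [[2, 1], [6, 1], [2, 3]]  # south_cost[i][j]: (i,j) -> (i,j+1)

# cost[i][j] holds Min_Cost(i,j); Min_Cost(0,0) = 0.
cost = [[0] * (m + 1) for _ in range(n + 1)]
for j in range(1, m + 1):              # west edge: one predecessor
    cost[0][j] = cost[0][j - 1] + south_cost[0][j - 1]
for i in range(1, n + 1):
    cost[i][0] = cost[i - 1][0] + east_cost[i - 1][0]   # north edge
    for j in range(1, m + 1):          # general recurrence: two predecessors
        cost[i][j] = min(cost[i - 1][j] + east_cost[i - 1][j],
                         cost[i][j - 1] + south_cost[i][j - 1])

print(cost[n][m])
```

Each of the (n+1)*(m+1) table entries is computed once, in constant time, so the whole computation takes O(n*m) time.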

Now we have the cost of the optimal solution ... but we don't actually have the solution!  Fortunately we can reconstruct it by working backwards from (n,m), at each stage determining which of the possible predecessors was used.  This is very efficient because we never examine the predecessors of intersections that are not on the optimal route - and usually most of them are not.
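The backwards reconstruction can be sketched like this, again on the same hypothetical instance (the cost arrays and table fill are repeated so the sketch is self-contained):

```python
# Same hypothetical instance as before, n = m = 2.
n, m = 2, 2
east_cost  = [[3, 2, 5], [1, 4, 2]]    # east_cost[i][j]: (i,j) -> (i+1,j)
south_cost = [[2, 1], [6, 1], [2, 3]]  # south_cost[i][j]: (i,j) -> (i,j+1)

# Fill the table of Min_Cost values bottom-up.
cost = [[0] * (m + 1) for _ in range(n + 1)]
for j in range(1, m + 1):
    cost[0][j] = cost[0][j - 1] + south_cost[0][j - 1]
for i in range(1, n + 1):
    cost[i][0] = cost[i - 1][0] + east_cost[i - 1][0]
    for j in range(1, m + 1):
        cost[i][j] = min(cost[i - 1][j] + east_cost[i - 1][j],
                         cost[i][j - 1] + south_cost[i][j - 1])

# Walk backwards from (n,m): at each intersection, pick a predecessor
# that accounts for the value in the table.  Only intersections on the
# optimal route are ever examined.
i, j = n, m
route = [(i, j)]
while (i, j) != (0, 0):
    if i > 0 and cost[i][j] == cost[i - 1][j] + east_cost[i - 1][j]:
        i -= 1      # we arrived from the west
    else:
        j -= 1      # we arrived from the north
    route.append((i, j))
route.reverse()
print(route)
```

The walk visits exactly the n+m+1 intersections on one optimal route, so reconstruction takes only O(n+m) time on top of the table computation.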