Dynamic programming! Everyone’s favorite interview question nightmare. DP is a technique that never feels intuitive to use, and even if you’re told that a solution needs DP, that’s often not enough to take even a step toward solving it.
I’ve seen a lot of attempts at explaining DP, but none of them really felt like they got to the heart of the matter - why is dynamic programming so hard to learn to use? I think a lot of explanations start by showing something really simple like “and here’s the Fibonacci sequence” without connecting any more dots. I’m going to try to avoid that - here I want to try a different approach and explicitly show why DP is difficult (until it’s not), and how you can learn to better think about applying DP.
Quick note: I like to write in pseudocode because I think it’s easier to adapt to each scenario and harder to cheat yourself out of learning by copy-pasting. As always, the best way to learn is by doing, and that means rewriting my code in a real language.
Walkways: A motivating example
We’re trying to pave our walkway, and to do so we have a couple of different types of tiles to use - 1m long plain tiles [] and 2m long decorative tiles [~~].
Let’s say our walkway is 5m long - this gives us 8 different ways to pave it with tiles:
1. [][][][][] 5. [][][][~~]
2. [~~][][][] 6. [~~][~~][]
3. [][~~][][] 7. [~~][][~~]
4. [][][~~][] 8. [][~~][~~]
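This enumeration is small enough to check by brute force. Here's a quick sketch in Python (the helper name `tilings` is my own); it tries each tile that could come first and recurses on the rest:

```python
# Build every tiling of an n-meter walkway out of
# 1m plain tiles ("[]") and 2m decorative tiles ("[~~]").
def tilings(n):
    if n == 0:
        return [""]  # exactly one way: the empty walkway
    ways = ["[]" + rest for rest in tilings(n - 1)]
    if n >= 2:
        ways += ["[~~]" + rest for rest in tilings(n - 2)]
    return ways

print(len(tilings(5)))  # 8, matching the list above
```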
The problem:
How many ways are there to pave a 50m long walkway?
Without revealing the answer, it’s safe to say that it’s a huge number and you won’t be able to just enumerate all of the possibilities like we did above. We’re going to have to be more clever than that.
Where do we start with something like this? One useful way to break into the problem is considering what would happen if we changed the bounds of the problem marginally - what happens at the very end of our pathway? This breaks our problem into a couple of cases, namely the ways that we can add a new tile to complete our 5m walkway:
- We can add a 1m tile to any 4m walkway, or
- We can add a 2m tile to any 3m walkway.
So this covers all of the ways to make a 5m walkway: it’s the number of 4m walkways + the number of 3m walkways.
To generalize and formalize this, let’s think in terms of a function. Let f(n) be defined as the number of ways to make an n-meter long walkway. So we have:
f(5) = f(5-1) + f(5-2) = f(4) + f(3)
But in general:
f(n) = f(n-1) + f(n-2)
We’ve defined this function in terms of itself, so now we’re using recursion! Which also means that we need to consider one more thing - base cases, because otherwise we’d be claiming nonsensical things like f(1) = f(0) + f(-1).
There’s one way to make a 1m walkway (a single small tile), and one way to make a 0m walkway (no tiles). You can probably reason for yourself why we don’t need any more base cases than just these two. Now let’s write this function in pseudocode:
function f(n):
    if n == 0: return 1
    if n == 1: return 1
    return f(n-1) + f(n-2)
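If you'd rather run it yourself, here's the same function transcribed directly into Python:

```python
def f(n):
    if n == 0:
        return 1
    if n == 1:
        return 1
    return f(n - 1) + f(n - 2)

print(f(5))  # 8
```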
This code works, it’ll give you the right answer for f(5). But if you try to do f(50), it won’t work - not because it’s wrong, but because it’s extremely slow. Remember, the base cases each return 1 - if the answer to f(n) is 1,000,000, then you know your function had to add up 1,000,000 ones, making a million checks to see if n was 0 or 1, and a million calls to itself.
A lot of our computation is, in fact, just re-doing our own work: when you call f(50), we calculate f(49) and f(48). Then f(49) calculates f(48) again as well as f(47). This cascades into an exponential blowup that we really want to avoid.
To do this, we can save the results of our function somewhere and reuse them. In the language of dynamic programming, this is called memoization or caching.
-- Here my cache is a global variable, you can alternatively
-- modify f to take cache as an argument
cache = empty int -> int map

function f(n):
    if n == 0: return 1
    if n == 1: return 1
    -- If we've done this before, just return the answer
    if cache[n] exists: return cache[n]
    -- Otherwise get the answer, store it, and return it
    ans = f(n-1) + f(n-2)
    cache[n] = ans
    return ans
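In Python, the cache is just a dictionary (the standard library's functools.lru_cache is a shortcut for exactly this pattern, but writing it out makes the idea explicit):

```python
cache = {}  # maps n -> number of ways to pave an n-meter walkway

def f(n):
    if n == 0 or n == 1:
        return 1
    # If we've done this before, just return the answer
    if n in cache:
        return cache[n]
    # Otherwise get the answer, store it, and return it
    ans = f(n - 1) + f(n - 2)
    cache[n] = ans
    return ans

print(f(50))  # 20365011074
```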
Doing this will let us compute an answer really quickly: f(50) = 20,365,011,074, about 20 billion.
How quickly exactly? Our first function was something exponential (O(1.6^n), roughly). But our caching function runs in O(n): we only ever have to run it once for each input up to n. So we’ve turned an extremely slow function into an extremely fast one, just by adding a few lines.
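You can watch this happen by instrumenting both versions with a call counter (a sketch; `slow_f`, `fast_f`, and the counter variable are my own names):

```python
calls = 0

def slow_f(n):  # no cache: exponential blowup
    global calls
    calls += 1
    if n <= 1:
        return 1
    return slow_f(n - 1) + slow_f(n - 2)

cache = {}

def fast_f(n):  # cached: each subproblem computed once
    global calls
    calls += 1
    if n <= 1:
        return 1
    if n in cache:
        return cache[n]
    cache[n] = fast_f(n - 1) + fast_f(n - 2)
    return cache[n]

slow_f(20)
slow_calls = calls
calls = 0
fast_f(20)
fast_calls = calls
print(slow_calls, fast_calls)  # 21891 vs 39
```

Even at n = 20 the gap is dramatic, and it only widens from there.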
Now, you might have noticed that the function we’re computing isn’t some new complex thing. In fact, this is just the Fibonacci sequence with some pretext to make it cool and obscure.
The pretext, however, is important to making this a real DP problem. Implementing code to compute Fibonacci numbers is really easy, and adding the caching optimization afterward is trivial. But there isn’t a single DP problem out there that gives you a formula and tells you to implement it - figuring out the function is the entire hard part.
The what
Now that you’re primed, I can actually explain what dynamic programming is.
Dynamic programming is a technique for solving problems by breaking them into overlapping subproblems with optimal substructure.
“Overlapping subproblems” means that in trying to compute something, you’ll end up trying to compute the same subproblems multiple times (like when we did f(48) twice, which cascaded into doing f(1) millions of times). The problems overlap. And “optimal substructure” means that combining the optimal solutions to these subproblems leads to the optimal solution for the larger problem. In a counting problem like the example, that essentially means that you’re counting everything exactly once. We split the problem into two cases (walkways that end in a 1m tile and walkways that end in a 2m tile), and these cases cover all possibilities and are mutually exclusive, so we’re good.
I stole all of that from Wikipedia by the way - highly recommend reading the short bit on the history of dynamic programming because it explains the petty reason behind why it has such a stupid name.
Anyway, DP is one of those things that doesn’t really make sense just from the definition. It’s not an algorithm that you can just apply, it’s a problem solving technique which requires practice to learn. So let’s look more into what it takes to solve DP problems.
The how
Like I said earlier, the hard part about learning DP is learning how to see the recursion in a problem. When we learn about recursion, we tend to look at algorithms that don’t need to seriously consider overlap. For instance, in a tree traversal you don’t need to handle overlap at all, since it’s impossible to accidentally revisit the same node. In a graph traversal (like DFS/BFS), you can avoid running into previous nodes by just remembering which ones you’ve visited and ignoring those. However, we don’t often consider algorithms that rely on overlap.
Let’s look at an example that builds on the previous one.
Walkways 2: Defining subproblems
So we know how many ways we can pave our walkway now, but let’s say that for stylistic reasons we want to avoid putting two decorative tiles (the 2m ones) next to each other. With this added constraint, we have fewer ways to pave our 5m walkway:
1. [][][][][] 4. [][][~~][]
2. [~~][][][] 5. [][][][~~]
3. [][~~][][] 6. [~~][][~~]
Can we modify our original function to work with the constraint? If we try to do this naively, we run into a problem - originally, our logic was to make an n-meter walkway by incrementally adding one tile (plain 1m or decorative 2m). But we can’t add a 2m tile without knowing that we’re not putting it next to another 2m tile!
That’s a problem. We need to know something (whether the last tile was decorative), and there’s no clever way to divine this information from the one parameter that our function has (n). But you might already have an idea of how it can be solved: what if we just add a new parameter to the function? It’ll be a boolean called end2, and it’ll be true if we’re counting pathways that end on a 2m tile.
function f(n, end2)
Now f takes on a new meaning. We’re asking a slightly different question: how many ways can we pave an n-meter long walkway that ends/doesn’t end on a 2m tile?
Let me be explicit about the lesson here. The parameters of your function define the subproblem and provide all of the information you can use to solve it. So sometimes you need to pose weird questions just in order to propagate information. And, posing this weird question will actually let us use recursion in a useful way.
In this scenario, we’re using a very simple extra parameter (boolean end2) to split the problem into more cases. That means that we need to rewrite all of the logic, including the base cases, but it’ll feel pretty similar to solve:
- If we’re not ending on a decorative tile: we can add the 1m tile to any (n-1)-meter walkway
- If we are ending on a decorative tile: we can add the 2m tile to an (n-2)-meter walkway that doesn’t also end on a decorative tile
For the base cases, there is still 1 way to make a 0m/1m walkway that doesn’t end on a decorative tile, and 0 ways if it does end on a decorative tile.
So our entire function:
function f(n, end2):
    if (n == 0 || n == 1) && !end2: return 1
    if (n == 0 || n == 1) && end2: return 0
    if !end2:
        return f(n-1, false) + f(n-1, true)
    else:
        return f(n-2, false)
There’s one last caveat. Since we’ve changed the way our function works, we can no longer just call f(n) to get the answer we want. Instead, we need to add f(n, false) + f(n, true) in order to cover all cases. And if we run this, we can see that it’s correct - f(5, false) + f(5, true) gives us 6, like we figured earlier.
Now we just need to add caching to make it fast again, and we’ll have our solution:
cache = empty (int, bool) -> int map

function f(n, end2):
    if (n == 0 || n == 1) && !end2: return 1
    if (n == 0 || n == 1) && end2: return 0
    if cache[(n, end2)] exists: return cache[(n, end2)]
    ans = 0
    if !end2:
        ans = f(n-1, false) + f(n-1, true)
    else:
        ans = f(n-2, false)
    cache[(n, end2)] = ans
    return ans

-- Function to compute the real answer
function real_f(n):
    return f(n, false) + f(n, true)
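Here's the whole thing transcribed into Python, with a tuple-keyed dictionary as the cache:

```python
cache = {}  # maps (n, end2) -> count of walkways

def f(n, end2):
    if n == 0 or n == 1:
        return 0 if end2 else 1
    if (n, end2) in cache:
        return cache[(n, end2)]
    if not end2:
        # Last tile is plain: extend any shorter walkway
        ans = f(n - 1, False) + f(n - 1, True)
    else:
        # Last tile is decorative: the one before it can't be
        ans = f(n - 2, False)
    cache[(n, end2)] = ans
    return ans

def real_f(n):
    return f(n, False) + f(n, True)

print(real_f(5))   # 6
print(real_f(50))  # 178955183
```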
And we can use this to see that there are 178,955,183 ways - about 179 million - of re-paving our 50m walkway with the new constraint. Cool.
The time complexity hasn’t changed. Instead of having n subproblems to potentially go through, we have 2n - so that’s still O(n) work.
0/1 Knapsack: Optimization problems with DP
DP can be used for more than just counting - in fact, it’s probably more common to see DP problems that involve optimization, or finding the minimum/maximum of something. Knapsack is another classic problem solvable with dynamic programming, here’s the problem:
Imagine we have a knapsack and some items that we want to carry in it - each item has a weight and a value. The knapsack has a maximum capacity - it will break if you try to carry more than a certain amount of weight in it.
How can we choose items to hold the most amount of value in the knapsack without breaking it?
For example, we might have a knapsack with a capacity of 20 and the following items:
| Item | Weight | Value |
|---|---|---|
| A | 3 | 5 |
| B | 6 | 7 |
| C | 6 | 8 |
| D | 7 | 8 |
It might seem like a good heuristic to pick the items with the most value per weight, but this doesn’t work in general - the optimal solution here is to choose items {B, C, D} for a total value of 7 + 8 + 8 = 23 (with weight 6 + 6 + 7 = 19 < 20). Item A has the highest value per weight, but as you can see, it can’t be added to the optimal solution without sacrificing one of the more valuable items.
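A quick brute-force check (my own addition, feasible only because there are just 2^4 subsets here) confirms both claims:

```python
from itertools import combinations

items = {"A": (3, 5), "B": (6, 7), "C": (6, 8), "D": (7, 8)}  # name -> (weight, value)
capacity = 20

# Exhaustively try every subset that fits and keep the most valuable one.
feasible = [s for r in range(len(items) + 1)
              for s in combinations(items, r)
              if sum(items[i][0] for i in s) <= capacity]
best = max(feasible, key=lambda s: sum(items[i][1] for i in s))
print(sorted(best), sum(items[i][1] for i in best))  # ['B', 'C', 'D'] 23

# Greedy by value-per-weight takes A first and comes up short.
greedy, weight = [], 0
for name in sorted(items, key=lambda i: items[i][1] / items[i][0], reverse=True):
    if weight + items[name][0] <= capacity:
        greedy.append(name)
        weight += items[name][0]
print(sum(items[i][1] for i in greedy))  # 20
```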
If we follow somewhat-similar logic to how we solved the tiles problems, we might try solving “smaller” versions of the same problem: here, that could mean finding the solution for fewer items at a time.
We can imagine iterating over the list of items and just deciding, at each item, whether to keep or leave it. Formally, we might say that f(n) is the optimal answer when considering the first n items from the list. Then our overall answer (for the example) is f(4) - the optimal value after considering every item.
This doesn’t give us any way to choose an item, because we can’t even tell if a previous solution could be extended by a new item - we might go over capacity. So we definitely need a way to factor in the knapsack’s capacity, or its remaining capacity. We might write something like this: f(n, c) is the optimal answer when considering the first n items when we have a knapsack of capacity c.
This actually lets us figure out whether an item should be kept or not! We have two options for each item (with weight w, value v):
- Keep: new value = f(n-1, c-w) + v. Add the item’s value to the optimal answer for a smaller knapsack - this is kind of like how we counted walkways by “extending” a smaller walkway.
- Leave: new value = f(n-1, c). We just use the optimal answer from the subproblem that doesn’t consider this item but uses the same capacity.
Then the answer to f(n, c) is the max between the values for keeping/leaving the item. We also need base cases, but those are easy - f(0, c) = 0 for any c - if we can’t select any items then we have no value.
One edge case that hasn’t been mentioned is when w > c. This just means we can’t add the item, so we shouldn’t consider the possibility of keeping it. All together, we get something like this:
cache = empty (int, int) -> int map

function f(n, c):
    if n == 0: return 0
    if cache[(n, c)] exists: return cache[(n, c)]
    w, v = the weight and value of the nth item (1-indexed)
    keep = 0
    -- Only consider keeping the item if it actually fits
    if c >= w:
        keep = f(n-1, c-w) + v
    leave = f(n-1, c)
    ans = max of keep, leave
    cache[(n, c)] = ans
    return ans
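Running a Python transcription on the example items from earlier gives the expected answer (the `items` list format is my choice; items are (weight, value) pairs):

```python
items = [(3, 5), (6, 7), (6, 8), (7, 8)]  # (weight, value) for items A-D
cache = {}

def f(n, c):
    if n == 0:
        return 0
    if (n, c) in cache:
        return cache[(n, c)]
    w, v = items[n - 1]  # the nth item, 1-indexed
    keep = 0
    if c >= w:  # keeping is only an option if the item fits
        keep = f(n - 1, c - w) + v
    leave = f(n - 1, c)
    cache[(n, c)] = max(keep, leave)
    return cache[(n, c)]

print(f(len(items), 20))  # 23
```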
Our runtime is O(nw), where n is the number of items and w is the capacity of the knapsack. Much better than the O(2^n) runtime that brute-forcing all possible sets of items would give us.
The end?
This is certainly not all there is to dynamic programming - there are other general types of problems that can be solved with clever recursion, like problems involving strings, which weren’t covered here. I may follow up or edit this post with more examples, because the concept of DP is subtle and far-reaching. Furthermore, squeezing the last bit of performance out of your DP solutions often involves removing recursion altogether via bottom-up DP. But until then, hopefully this gave you a good introduction to the concepts involved in solving problems with dynamic programming.
This post was adapted from a presentation I gave for the UFPT a few years ago. Thanks to my friend Ethan for helping me iron out the details back then.