Log In Start studying!

Select your language

Suggested languages for you:
StudySmarter - The all-in-one study app.
4.8 • +11k Ratings
More than 3 Million Downloads
Free
|
|

Dynamic Programming

Dynamic Programming

Often in mathematics and related fields like computer science, it is important to be able to make decisions at multiple stages based on associated costs. These could be time costs, resource costs, monetary costs, or really any other cost at all (like feeding your friends pizza if they help you move furniture). How can you make sure to find an optimal path through a field of possible decisions? Well, one method is dynamic programming.

Definition of dynamic programming

So what exactly is dynamic programming?

Dynamic Programming is a recursive technique for finding optimal routes through multi-stage decision-making processes from a defined starting point to the desired endpoint.

So far, there has been a lot of talk of 'multi-stage decision making' and 'costs' and all sorts; it's all been rather abstract. Let's try and make this all bit more concrete by visualising it.

The diagram below represents a multi-stage decision process. Each circle with a letter inside is known as a node. The arrows indicate the ways in which you may move from node to node. The numbers represent the associated cost of each move. With this information, dynamic programming can be used to find an optimal path between \( A \) and \( G \), depending on what we are optimising for.

The simplest way to envisage cost at this stage is to simply imagine it as a distance (which it may often be). The aim may be to find the path from \(A\) to \(G\) over the shortest distance, or perhaps over the longest distance.

Dynamic programming, multistage decision network, studysmarterFig. 1 - Multi-Stage Decision Network

Before diving into exactly how this works though, let's compare dynamic programming with another sort of optimisation technique you may have come across.

Difference between dynamic and linear programming

Ok so hopefully now you've got a better idea of what exactly dynamic programming is, but how does it compare to linear programming?

You might remember that linear programming is also a method used in mathematics to find optimal solutions to problems. However, where linear programming is useful in finding a single optimal solution based on constraints such as supply, demand, or budget. Dynamic programming is useful in finding the optimal path through a network of decisions.

For instance, linear programming would be a great choice if you wanted to find the laptop in the store with the optimal amount of processing power for the minimal price.

Dynamic programming would be far more useful if you wanted to find the optimal route through a city to get from point \(A\) to point \(B\). Got that? Great! Well, let's move on to some dynamic programming examples to test what you've learnt.

General method of dynamic programming

Dynamic programming works on the principle that for a given problem, any part of an optimised path is itself an optimised path of its own sub-problem. This is known as the principle of optimality.

Take, for instance, the example below. If you wanted to get from \(A\) to \( E \) over the shortest distance, which route would you take? You'd likely say

\[S \to A \to C \to T.\]

and you would be right. What then, is the shortest path from \(S\) to \(T\)? The principle of optimality states that it will be a part of this larger optimal path, and we can see from inspection that this is true. The shortest path is

\[S \to A \to C.\]

dynamic programming, principle of optimality example, studysmarterFig. 2 - Principle of Optimality Example

Dynamic programming uses this principle to its advantage. By working backwards from the end-point, it can determine the optimal path.

But, you might be thinking, that's all well and good when there are only five nodes, but how on Earth am I supposed to keep that all in my head when there eight, ten, thirty? That's where dynamic programming tables come in. Let's take a look at an example to see how they work.

Dynamic programming tables

Take the network below. How can you find the optimal shortest path from \(S\) to \(T\)?

Dynamic programming, dynamic programming example, studysmarterFig. 3 - Dynamic Programming Example

First, construct a table with the following columns.

Table 1 - Start of dynamic programming table.

Stage

State

Action

Destination

Value

Starting off from the nodes adjacent to \(T\), fill in the table like so.

Table 2 - Dynamic programming table with nodes adjacent to \(T\).

Stage

State

Action

Destination

Value

\(1\)

\(D\)

\(DT\)

\(T\)

\(2^*\)

\(1\)

\(B\)

\(BT\)

\(T\)

\(4^*\)

\(1\)

\(E\)

\(ET\)

\(T\)

\(3^*\)

The first row that's filled in simply means, starting from \(D\) and moving to \(T\) has a value of \(2\). These are all stage \(1\) moves, as they one move away from the endpoint, \(T\). Mark the optimal action for each state with *.

Now do the same in your table for stage \(2\).

Table 3 - Dynamic programming table stage \(2\).

Stage

State

Action

Destination

Value

\(2\)

\(A\)

\(AD\)

\(D\)

\(3+2 = 5^*\)

\(2\)

\(B\)

\(BD\)

\(D\)

\(1+2 = 3^*\)

\(2\)

\(B\)

\(BT\)

\(T\)

\(4\)

\(2\)

\(B\)

\(BE\)

\(E\)

\(1+3 = 4\)

\(2\)

\(C\)

\(CE\)

\(E\)

\(4+3 = 7^*\)

Notice that the value from each action is the total value to get to \(T\). For the action \(AD\) in the first row, the value is \(3\) to get to \(D\), and then from \(D\) the optimal value to \(T\) is \(2\) (from stage \(1\)) - hence a final value of \(3+2 = 5\).

Now do the same in your table for stage \(3\).

Table 4 - Dynamic programming table stage \(3\)

Stage

State

Action

Destination

Value

\(3\)

\(S\)

\(SA\)

\(A\)

\(3+5 = 8^*\)

\(3\)

\(S\)

\(SB\)

\(B\)

\(5+3 = 8^*\)

\(3\)

\(S\)

\(SC\)

\(C\)

\(2+7 = 9\)

So, there are in fact two equally optimal paths through this network, each with a value (cost) of \(8\). By tracking back through the table, following the *s, you can find and write down the optimal path(s) from \(S\) to \(T\).

For example, looking at stage \(3\), you can see that one of the optimal paths goes \(S \to B\). Going to \(B\) in stage two, you can see that the optimal path from \(B\) goes \(B\to D\). And finally going to stage \(1\) the optimal path from \(D\) goes \(D\to T\).

The optimal paths are therefore

\[S \to A \to D \to T\]

\[S \to B \to D \to T.\]

This technique can be used both to minimise the value of the route or maximise the value of the route. But there's more; dynamic programming can also be used for maximin and minimax problems. What are maximin and minimax problems? Good question.

Minimax and maximin problems

Maximin and minimax problems are types of problems that are concerned not with the value of the entire route, but with the maximum or minimum values of each stage.

For minimax problems, the maximum value of a stage of the route must be as small as possible.

For maximin problems, the minimum value of a stage of the route must be as large as possible.

Luckily, these problems can both be solved in a similar fashion to the problem you solved previously, with a little tweak that is. Let's take a look at the same network and see how the route differs when we approach it as a minimax problem.

Dynamic programming, dynamic programming example, studysmarterFig. 4 - Minimax Example

Once again, construct the table.

Table 5 - Start of dynamic programming table.

StageStateActionDestinationValue

Now, fill in stage one as before.

Table 6 - Dynamic programming table stage \(1\).

Stage

State

Action

Destination

Value

\(1\)

\(D\)

\(DT\)

\(T\)

\(2^*\)

\(B\)

\(BT\)

\(T\)

\(4^*\)

\(E\)

\(ET\)

\(T\)

\(3^*\)

Now, at stage two things are a little different. Instead of adding the stage one and stage two values to find the total for each action, compare them instead, and choose the maximum. For each state, mark the action with the smaller value with a *.

Table 7 - Dynamic programming table stage \(2\).

Stage

State

Action

Destination

Value

\(2\)

\(A\)

\(AD\)

\(D\)

\(\text{Max}(3,2) = 3^*\)

\(B\)

\(BD\)

\(D\)

\(\text{Max}(1,2) = 2^*\)

\(BE\)

\(E\)

\(\text{Max}(1,3) = 3\)

\(C\)

\(CE\)

\(E\)

\(\text{Max}(4,3) = 4^*\)

Now finally do the same for stage three.

Table 8 - Dynamic programming table stage \(3\).

Stage

State

Action

Destination

Value

\(3\)

\(S\)

\(SA\)

\(A\)

\(\text{Max}(3,3) = 3^*\)

\(SB\)

\(B\)

\(\text{Max}(5,3) = 5\)

\(SC\)

\(C\)

\(\text{Max}(3,4) = 4\)

Now, following the *s back through the table, you can find the path that minimises the maximum stage value.

\[S\to A \to D \to T\]

See what you did? For each state, you find the maximum value of a stage that you will encounter from each potential action, and then choose the smaller one.

If the problem were instead a maximin problem, you would do the opposite. You would compare for the minimum value, and then choose the action with the largest one.

Dynamic Programming examples

Let's take a look at some more examples to make sure you've got this!

Using dynamic programming, find the maximised route through the following network.

dynamic programming maximise example, studysmmarterFig. 5 - Dynamic Programming Maximise Example

Answer:

Fill in the table, placing a * next to the optimal action from each state. In this instance, the optimal action is the one with the highest value.

Table 9 - Network dynamic programming table

Stage

State

Action

Destination

Value

\(1\)

\(F\)

\(FT\)

\(T\)

\(4\)

\(G\)

\(GT\)

\(T\)

\(2\)

\(H\)

\(HT\)

\(T\)

\(2\)

\(2\)

\(D\)

\(DF\)

\(F\)

\(1+4 = 5^*\)

\(DG\)

\(G\)

\(2+2 = 4\)

\(E\)

\(EG\)

\(G\)

\(1+2 = 3^*\)

\(EH\)

\(H\)

\(1+2 = 3^*\)

\(3\)

\(A\)

\(AD\)

\(D\)

\(2+5 = 7^*\)

\(B\)

\(BD\)

\(D\)

\(3+5 = 8^*\)

\(BE\)

\(E\)

\(2+3 = 5\)

\(C\)

\(CE\)

\(E\)

\(3+3 = 6^*\)

\(4\)

\(S\)

\(SA\)

\(A\)

\(4+7 = 11^*\)

\(SB\)

\(B\)

\(2+8 = 10\)

\(SC\)

\(C\)

\(4+10\)

Following the *s back through the table, the optimal route with the maximum value is

\[S \to A \to D \to F \to T.\]

Let's take another example.

A train company wishes to find the route between station \(S\) and station \(T\) with the largest minimum distance between any two stations. The stations and distances in miles between are shown below.

Dynamic programming, maximin example, studysmarterFig. 6 - Maximin Example

Find the optimal route for the train company's brief.

Answer:

First, identify that this is a minimax problem.

Now, construct and fill out the table accordingly.

Table 10 - Route dynamic programming table

StageStateActionDestinationValue

\(1\)

\(F\)\(FT\)\(T\)\(70^*\)
\(G\)\(GT\)\(T\)\(25^*\)
\(H\)\(HT\)\(T\)\(32^*\)
\(I\)\(IT\)\(T\)\(45^*\)
\(2\)\(C\)\(CF\)\(F\)\(\text{Min}(13,70) = 13\)
\(CG\)\(G\)\(\text{Min}(22,25) = 22^*\)
\(D\)\(DG\)\(G\)\(\text{Min}(12,25) = 12\)
\(DH\)\(H\)\(\text{Min}(29,32) = 29^*\)
\(E\)\(EH\)\(H\)\(\text{Min}(5,32) = 5\)
\(EI\)\(I\)\(\text{Min}(18,45) = 18^*\)
\(3\)\(A\)\(AC\)\(C\)\(\text{Min}(15,22) = 15\)
\(AD\)\(D\)\(\text{Min}(27,19) = 27^*\)
\(B\)\(BD\)\(D\)\(\text{Min}(23,29) = 23^* \)
\(BE\)\(\text{Min}(20,18) = 18\)
\(4\)\(S\)\(SA\)\(A\)\(\text{Min}(10,27) = 10\)
\(SB\)\(B\)\(\text{Min}(35,23) = 23^*\)

Following the *s back through the table, the optimal route with the largest minimum distance between two stations is

\[S \to B \to D \to H \to T.\]

This has been a brief overview of how you can use dynamic programming to tackle different sorts of problems. Why not head over to our other explanations on dynamic programming, like Shortest and Longest Path Problems, to see each of these types of problems covered in more depth?

Dynamic Programming - Key takeaways

  • Dynamic programming is a recursive technique for finding optimal routes in multistage decision processes.
  • Dynamic programming works on the principle that for a given problem, any part of an optimised path is itself an optimised path of its own sub-problem.
  • Dynamic programming can be used to solve shortest and longest path problems.
  • Dynamic programming can also be used to solve minimax and maximin problems.

Frequently Asked Questions about Dynamic Programming

Dynamic Programming is a recursive technique for finding optimal routes through multi-stage decision-making processes from a defined starting point to the desired endpoint. 

To practice dynamic programming, just do lots of questions and examples.

Dynamic programming is best used to solve optimisation problems which can be broken into subproblems, such that the solutions can be stored and used later.

There are two types of dynamic programming - the top-down approach and the bottom-up approach.l

Dynamic programming obtains solutions both to the overall problem and to subproblems, the solutions to these subproblems can be stored and used in other problems down the line.

Final Dynamic Programming Quiz

Question

What is dynamic programming?

Show answer

Answer

Dynamic Programming is a recursive technique for finding optimal routes through multi-stage decision-making processes from a defined starting point to the desired endpoint. 

Show question

Question

The ________ states that any part of an optimised path is itself an optimised path of its own sub-problem.

Show answer

Answer

Principle of optimality.

Show question

Question

What is a longest path problem?

Show answer

Answer

A problem that requires you to find the path of greatest value through a multi-stage decision process network.

Show question

Question

What is a shortest path problem?

Show answer

Answer

A problem that requires you to find the path of least value through a multi-stage decision process network.

Show question

Question

What is a minimax problem?

Show answer

Answer

A problem that requires you to find the path with the smallest maximum value between any two nodes.

Show question

Question

What is a maximin problem?

Show answer

Answer

A problem that requires you to find the path with the largest minimum value between any two nodes.

Show question

Question

How are nodes represented?

Show answer

Answer

A circle or point and a letter.

Show question

Question

_____, _____, _____, _____, and _____ are the five columns in the table for solving dynamic programming. problems?

Show answer

Answer

Stage, State, Action, Destination, Value

Show question

Question

What does stage refer to in the dynamic programming table?

Show answer

Answer

Stage refers to how far you currently are from the final destination node.

Show question

Question

What does state refer to in the dynamic programming table?

Show answer

Answer

State refers to the starting position for the move being considered.

Show question

Question

What does action refer to in the dynamic programming table?

Show answer

Answer

Action refers to the move toward the final destination node being considered.

Show question

Question

What does destination refer to in the dynamic programming table?

Show answer

Answer

Destination refers the node being moved to from the current state.

Show question

Question

What does value refer to in the dynamic programming table?

Show answer

Answer

Value refers to the value added to the route by a given action. This could be a distance or an additional cost, or many other things.

Show question

Question

How is the optimal action for a given state denoted in a dynamic programming table?

Show answer

Answer

*

Show question

Question

The optimal route can be found by following the ______ through each stage from the bottom of the table.

Show answer

Answer

*s.

Show question

60%

of the users don't pass the Dynamic Programming quiz! Will you pass the quiz?

Start Quiz

Discover the right content for your subjects

No need to cheat if you have everything you need to succeed! Packed into one app!

Study Plan

Be perfectly prepared on time with an individual plan.

Quizzes

Test your knowledge with gamified quizzes.

Flashcards

Create and find flashcards in record time.

Notes

Create beautiful notes faster than ever before.

Study Sets

Have all your study materials in one place.

Documents

Upload unlimited documents and save them online.

Study Analytics

Identify your study strength and weaknesses.

Weekly Goals

Set individual study goals and earn points reaching them.

Smart Reminders

Stop procrastinating with our study reminders.

Rewards

Earn points, unlock badges and level up while studying.

Magic Marker

Create flashcards in notes completely automatically.

Smart Formatting

Create the most beautiful study materials using our templates.

Sign up to highlight and take notes. It’s 100% free.

Get FREE ACCESS to all of our study material, tailor-made!

Over 10 million students from across the world are already learning smarter.

Get Started for Free
Illustration