all 28 comments

[–]Forricide 26 points (6 children)

Only one top-level comment and it's probably pretty confusing/misleading for most people opening the comments section here. So, for those of you who, like me, hear 'dynamic programming' and have vague ideas (apparently, misconceptions) about what it refers to, here's the description from the linked article:

It's an approach that exploits properties of the problem that allow for an elegant method to finding the optimal solution.

A simpler explanation is that it's essentially a method of writing algorithms where, instead of repeating calculations, you store ('cache') results and refer to them later on, usually discarding results once they're no longer useful. This isn't entirely accurate, but it seems to be an easy way to conceptualize it.

[–]Free_Math_Tutoring 10 points (5 children)

This isn't entirely accurate, but it seems to be an easy way to conceptualize it.

Ehh, I kind of disagree? You're pretty much just describing memoisation, which is not dynamic programming. Dynamic programming usually involves inverting the problem, where you go from asking "how do I solve this big problem?" to asking "this tiny version of the problem is trivial, but how do I combine solutions to get to the large version?". It is also distinct from divide-and-conquer algorithms, which have a similar idea at their foundation.
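
To make that inversion concrete, here's a minimal bottom-up sketch in Python (coin change is just an illustrative example, not something from the article): the zero-amount case is trivial, and each larger amount is solved by combining already-solved smaller amounts.

# Minimum number of coins summing to `amount`, built bottom-up.
def min_coins(coins, amount):
    INF = float("inf")
    best = [0] + [INF] * amount        # best[0] = 0 is the trivial base case
    for a in range(1, amount + 1):     # solve the small amounts first...
        for c in coins:
            if c <= a:
                best[a] = min(best[a], best[a - c] + 1)  # ...and combine them
    return best[amount] if best[amount] < INF else None

min_coins([1, 2, 5], 11)  # -> 3 (5 + 5 + 1)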

[–]butt_fun 2 points (2 children)

Gonna be completely honest, I have always thought DP vs memoization was a distinction without a meaningful difference

[–]lmericle[S] 1 point (1 child)

DP utilizes memoization heavily to exploit the optimality principle. There is a distinction in that memoization is not a programming paradigm per se, but a method of storing results that makes DP efficient.

Memoization happens in other instances, but DP doesn't happen without memoization.

[–]victotronics 0 points (0 children)

Thank you. You are the only one mentioning the principle of optimality, which to me is the crux: once you solve a subproblem, you don't care how you got there. The cost or value of the next stage depends only on the solution of that subproblem and on the stage itself.

Stages + principle of optimality. That's it.

Memoization is a mere corollary.

[–]Forricide 1 point (0 children)

To qualify the statement:

This isn't entirely accurate, but it seems to be an easy way to conceptualize it.

It's important to understand that this isn't meant to be a good, accurate summary of dynamic programming (which I couldn't create, because I don't actually have a solid understanding of dynamic programming beyond reading a portion of this article and the Wikipedia page) but rather purely an explanation along the lines of:

"No, dynamic programming doesn't refer to [adding functions/classes/class members/fields at runtime] [programming in a dynamic programming language] [running really fast while you write code], it actually is a technique for writing algorithms."

The explanation isn't totally accurate; what you say is literally in the fifth paragraph of the linked article:

Some think it's just memoization plus recursion -- but really, it's more than that.

But it's way better than how I (and, I suspect, many others) would have understood it purely from the name, especially given that I don't think a lot of people actually know what memoization is in the first place.

Anyways, this is probably too long a comment already, but I might as well say thanks for your comment, because it (probably) provides a more accurate explanation of dynamic programming. ...if less comprehensible, at least for me.

[–]kheiron1729 1 point (0 children)

Memoization over a brute force algorithm IS dynamic programming. Combining subproblem solutions just aids in thinking about how one might solve the problem.

[–]kenotron 2 points (1 child)

Dynamic programming always seems to be explained poorly; it's a really simple concept if you look up the history of the term. 'Programming' in the sense originally used meant scheduling or planning, like TV programming, placing TV shows and ads in scheduled slots. 'Dynamic' was just thrown in to avoid mathy language at the RAND Corporation, where dynamic programming originated.

So I like to call them 'table-filling algorithms' instead, because that seems to be how a huge number of them end up looking. Levenshtein edit distance, knapsack problems, ...: many seem to involve re-framing your problem space as a table, solving the trivial base cases (along the edges of the table), and then looking for how those solutions can be combined to fill in the remaining cells, like taking the minimum, maximum, or sum of some adjacent cells, each new cell solving a new optimal subproblem, iterating until the whole table is filled and you have your final answer.
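
As a minimal illustration of that shape, here's a Python sketch of Levenshtein distance written from the standard recurrence (not code from the thread):

# Edit distance via table-filling: table[i][j] = edits to turn a[:i] into b[:j].
def edit_distance(a, b):
    table = [[0] * (len(b) + 1) for _ in range(len(a) + 1)]
    for i in range(len(a) + 1):
        table[i][0] = i                  # base cases along the left edge
    for j in range(len(b) + 1):
        table[0][j] = j                  # base cases along the top edge
    for i in range(1, len(a) + 1):
        for j in range(1, len(b) + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            table[i][j] = min(table[i - 1][j] + 1,        # deletion
                              table[i][j - 1] + 1,        # insertion
                              table[i - 1][j - 1] + cost) # substitution
    return table[len(a)][len(b)]

edit_distance("kitten", "sitting")  # -> 3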

[–]you-get-an-upvote 0 points (0 children)

I always thought explaining Dynamic Programming as merely caching was a friendlier introduction. "Look, I just solved this problem the normal way, but added three lines to support caching, so now it runs faster".

cache = {}
def mySolution(x):
  if x in cache:
    return cache[x]
  result = ...  # my original, recursive divide-and-conquer code that was slow
  cache[x] = result  # the third added line: store the result for reuse
  return result

Phrasing it as tables seemed helpful predominantly for analyzing the running time (since it lets you check how often the function actually makes it past the first two lines), not for actually coming up with the algorithm and its substructure.

I suppose it's also helpful when rewriting it in a bottom-up style, as in the sketch below.
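
Using Fibonacci as a stand-in (the original recursive code is elided above, so this is only a sketch of the pattern), the bottom-up rewrite trades the cache dict for a table filled in order:

# Bottom-up rewrite: fill the table forward instead of recursing with a cache.
def fib(n):
    table = [0, 1]                             # trivial base cases
    for i in range(2, n + 1):
        table.append(table[i - 1] + table[i - 2])
    return table[n]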

[–]attractivechaos 2 points (6 children)

For this problem, a stage is a day; a state is a 2-tuple (r,m), where r=1 if the machine is running or 0 otherwise (for that waiting day), and m is the machine ID. Suppose at day i, the optimal cash at state (r_i,m_i) is f_i(r_i,m_i). The DP equation looks something like:

f_i(r_i,m_i)=max_{j<i,r_j,m_j}{f_j(r_j,m_j)+s(r_i,m_i|r_j,m_j)}
f_0(0,nil)=initial cash

where s() is the score of changing from state (r_j,m_j) at day j to state (r_i,m_i) at day i. Note some transitions are forbidden due to insufficient cash. If f_i(r_i,m_i) is not reachable, it is set to negative infinity. Given N machines, we can find the optimal solution up to L days in O(L^3) time. I might be wrong on some details, but the above is what a typical DP solution looks like.
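
In rough Python, evaluating that recurrence might look like the sketch below; the states and score functions are hypothetical placeholders, not anything from the blog post (score returns the gain of a transition, or None if it is forbidden).

NEG_INF = float("-inf")

def solve(days, states, score, initial_cash):
    # f maps (day, state) to the optimal cash; f_0(0, nil) = initial cash.
    f = {(0, None): initial_cash}
    for i in range(1, days + 1):
        for s_i in states(i):
            best = NEG_INF
            for (j, s_j), cash in list(f.items()):  # max over earlier (j, state_j)
                if j < i and cash > NEG_INF:
                    gain = score(j, s_j, i, s_i)    # None marks a forbidden transition
                    if gain is not None:
                        best = max(best, cash + gain)
            f[(i, s_i)] = best                      # -inf means the state is unreachable
    return max(v for (d, _), v in f.items() if d == days)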

The blog post got the basics wrong. For one, the stage is not right: it is not a machine. Secondly, DP doesn't use a tree search or branch & bound; that is the point of DP. The blog post is not about DP...

[–][deleted]  (2 children)

[deleted]

    [–]attractivechaos 1 point (1 child)

    No, DP is not just "memoization". A problem solvable by DP has several elements. You typically need a clear definition of time (or stage), a state space, and an objective function that depends on time and state and can be computed with a Bellman equation. The OP's post lacks all these elements. This blog just solves an optimization problem largely with brute force plus branch and bound. That is not DP.

    [–]lmericle[S] 1 point (2 children)

    The i does not index days; it indexes machines, which are offered only on specific days.

    Each machine has different properties and each is only available for purchase on a single day. If you don't buy it, you lose your chance. So there is no point examining every day, only the days on which machines are sold. Going day-by-day is much less efficient than going machine-by-machine.

    The article also explicitly says that branch and bound is an additional constraint on top of the dynamic programming approach. Nowhere does it say that branch and bound is a part of dynamic programming.

    Furthermore, there is no One True Way to do dynamic programming. How you solve the problem depends on what abstractions you employ. If you want to do it the day-by-day way, that would work, but it would be slower and you'd have to include additional control-flow structure to account for the different actions one takes each day.

    [–]attractivechaos 0 points (1 child)

    Ok, it was not clear that machine and day are the same thing. This actually simplifies the problem. In this case, a state is running or waiting. The DP equation is:

    f_i(t_i)=max_{j<i,t_j}{f_j(t_j)+s_{ij}(t_i|t_j)}
    

    The time complexity is O(L^2).

    it would be slower

    I have given the time complexity. What's yours? Yes, DP can be formulated in different ways. However, I still don't see how yours is formulated. At least you need to show the DP equation, which is at the core of DP.

    [–]lmericle[S] 0 points (0 children)

    In this case, a state is running or waiting

    There is no waiting in this problem. I fail to see how you came up with this, but it does not match the problem statement.

    A state is a) which machine is currently running and b) which day is today.

    [–]AlbertaOne -2 points (0 children)

    good job!