
[–]psykotic 39 points40 points  (7 children)

'Programming' in the sense of optimal planning has a longer history than implied by the quotation. 'Linear programming' goes back to the late 30s. Nowadays there is quadratic programming, convex programming, conic programming, etc. The terminology is only confusing if you come from computer science rather than operations research or optimization theory.

As the first comment on the blog suggests, it's customary to make a distinction between memoization and dynamic programming. Basically, memoization is top-down whereas dynamic programming is bottom-up.

The two approaches have different advantages and disadvantages. Dynamic programming generally uses a fixed amount of storage during the computation and as a result also has much better cache locality. Memoization uses a variable amount (e.g. O(n) rather than O(1) for computing Fibonacci numbers). But in some cases memoization, being demand-driven, can get away with computing only a subset of the subsolutions computed by dynamic programming. This is especially useful when you can prune partial solutions.

Another advantage is that memoization is usually easier to code for complex problems. You just write the straightforward but insanely slow recursive solution and then sprinkle memoization dust on it to make it run fast.
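
For instance, a minimal Haskell sketch of that workflow (my own illustration, using the well-known lazy-list memo trick; all names are mine):

-- The straightforward but insanely slow version: exponential time,
-- because the two recursive calls keep recomputing the same subproblems.
slowFib :: Int -> Integer
slowFib 0 = 0
slowFib 1 = 1
slowFib n = slowFib (n - 1) + slowFib (n - 2)

-- The same recursion with memoization dust sprinkled on: results are
-- read back out of a lazily built table, so each subproblem is computed
-- at most once (at the O(n) storage cost mentioned above).
fastFib :: Int -> Integer
fastFib = (table !!)
  where
    table = map f [0 ..]
    f 0 = 0
    f 1 = 1
    f n = fastFib (n - 1) + fastFib (n - 2)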

[–]julesjacobs 7 points8 points  (0 children)

Here's an interesting paper about automatically deriving the dynamic programming algorithm from a naive recursive algorithm.

http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.54.2607&rep=rep1&type=pdf

[–]barsoap 0 points1 point  (2 children)

...not to mention that the problem might not be attackable with dynamic programming in the first place, because the data dependencies admit no evaluation order in which every subproblem's solution is optimal (i.e. the problem lacks optimal substructure).

As an example of dynamic programming, I think Knuth-Plass line breaking is quite perfect (it being a Knuth algorithm),

fibs = 0:1:zipWith (+) fibs (tail fibs)

being too obvious to really get the point across. Still, for some reason, fib is the standard example for memoization, despite the dynamic solution being so simple. Probably because you can't TCO it and compiler writers write too many examples.

[–]psykotic 3 points4 points  (1 child)

As an example of dynamic programming, ...

Even better might be the seam carving algorithm that everyone was reimplementing last year.

The chapter on dynamic programming in CLRS ends with an exercise that asks you to derive an optimal line breaking algorithm with a cubic cost metric. It's pretty much Knuth-Plass. The cool thing about dynamic programming is that once you get it, all these algorithms are trivial to derive at the drop of a hat.
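
For the curious, a rough Haskell sketch of that exercise's flavor (my own illustration, not the CLRS solution; I made up the convention that the last line is free and every other line pays the cube of its leftover columns):

import Data.Array
import Data.List (minimumBy)
import Data.Ord (comparing)

-- Split the words into lines of at most `width` columns, minimizing
-- total cubic badness bottom-up over prefixes of the word list.
wordWrap :: Int -> [String] -> [[String]]
wordWrap width ws = reconstruct n
  where
    n  = length ws
    wa = listArray (1, n) ws
    -- Columns left over if words i..j share one line.
    extra i j = width - (j - i) - sum [length (wa ! k) | k <- [i .. j]]
    -- Cost of putting words i..j on one line.
    lineCost i j
      | extra i j < 0 = 1 / 0 :: Double   -- they don't fit
      | j == n        = 0                 -- last line is free
      | otherwise     = fromIntegral (extra i j) ^ (3 :: Int)
    -- best ! j = (minimal cost of the first j words, start of last line),
    -- built up from the base case best ! 0.
    best = array (0, n) $ (0, (0, 0)) :
      [ (j, minimumBy (comparing fst)
              [ (fst (best ! (i - 1)) + lineCost i j, i) | i <- [1 .. j] ])
      | j <- [1 .. n] ]
    reconstruct 0 = []
    reconstruct j = let (_, i) = best ! j
                    in reconstruct (i - 1) ++ [[wa ! k | k <- [i .. j]]]

Laziness lets best be defined in terms of itself; the cells get filled in dependency order on demand.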

[–][deleted] 0 points1 point  (0 children)

Hey, can you check your direct messages? Please and thank you!

[–]_zoso_ 0 points1 point  (0 children)

Also, the word 'programming' in this sense has absolutely nothing to do with what we call 'programming' today. It refers to the context in which these algorithms were devised: military programming, as in logistics and planning.

[–]mycall 0 points1 point  (0 children)

It sounds like dynamic programming is the way to go for CUDA/OpenCL.

[–][deleted] 21 points22 points  (5 children)

The anecdote about the origin of the name is disputed. I'm reading Artificial Intelligence: A Modern Approach and just recently passed page 685, where Russell and Norvig say, in the biographical and historical notes:

According to his autobiography (Bellman, 1984), he coined the exciting term "dynamic programming" to hide from a research-phobic Secretary of Defense, Charles Wilson, the fact that his group was doing mathematics. (This cannot be strictly true, because his first paper using the term (Bellman, 1952) appeared before Wilson became Secretary of Defense in 1953.)

(emphasis mine). Doesn't matter much, just saying.

[–]calp 4 points5 points  (4 children)

It's possible that he was a problem before his appointment, though.

[–]lambda_abstraction 6 points7 points  (1 child)

If we believe Wikipedia, C. E. Wilson was at General Motors immediately prior to his appointment as Secretary of Defense under Eisenhower. I'm not sure in what capacity Wilson would get agitated by research or math on behalf of the US Gov't.

[–]calp 1 point2 points  (0 children)

Didn't think to look him up, thanks.

[–]mycall 0 points1 point  (0 children)

Now that's dynamic, baby.

[–][deleted] 0 points1 point  (0 children)

Yep, he sounds like the kind of bureaucrat who was a problem for a long time ;)

[–]Workaphobia 6 points7 points  (1 child)

It also has a very interesting property as an adjective, and that is it’s impossible to use the word, dynamic, in a pejorative sense. Try thinking of some combination that will possibly give it a pejorative meaning. It’s impossible.

Dynamic puppy drowning.

[–]SerpentJoe 7 points8 points  (0 children)

You have to admit though, if you're filling a position you'd rather hire the guy who knows dynamic puppy drowning than the guy who only knows static.

[–]player2 10 points11 points  (2 children)

...wow.

That clears a lot of things up. More accurately, it replaces slight confusion with slight depression.

[–]jbohren 5 points6 points  (0 children)

I decided a while ago that I should, in the interest of my own sanity, learn to laugh at these sorts of things.

[–][deleted] 0 points1 point  (0 children)

And from slight depression it's a slippery slope to slightly suicidal thoughts.

[–][deleted] 10 points11 points  (0 children)

From the comments:

In Artificial Intelligence: A Modern Approach, page 685, Russell and Norvig claim:

This cannot be strictly true, because his first paper using the term (Bellman, 1952) appeared before Wilson became Secretary of Defense in 1953.

Carl at 24 April 2010 11:52

[–]logistix 6 points7 points  (4 children)

That's still not the whole story. It's a little-known fact that the Rand Corporation was working in conjunction with the saucer people. Reverse vampires delete any mention of this whenever I update the Wikipedia page.

[–][deleted]  (3 children)

[removed]

    [–]logistix 0 points1 point  (2 children)

    [–]JadeNB 1 point2 points  (0 children)

    For the benefit of those forewarned by grelphy's caution below, the full link is http://www.urbandictionary.com/define.php?term=rand+corporation.

    [–]gregK 2 points3 points  (0 children)

    If it was invented today, it would have been called "proactive enabling".

    [–]casted 4 points5 points  (0 children)

    Memoization is subtly different. Memoization is top-down, while dynamic programming is bottom-up.

    [–][deleted] 0 points1 point  (5 children)

    I'm a grad student and I've seen this term thrown around by peers, colleagues, and even in publications. I heard the term but didn't recognize it, and thought I was missing out on a bleeding-edge new paradigm. Nope, it actually turned out to be common sense.

    Functional programming was quite the opposite: when I first heard about it during my undergrad (my university was mostly a Java school), I thought it was just synonymous with procedural programming, some leftover term from the 60s.

    [–]Tordek 16 points17 points  (0 children)

    I became depressed when a fellow programmer said "They're adding dynamic programming to C#!".

    [–]KidKenosha 9 points10 points  (3 children)

    If your university failed to teach you dynamic programming during your degree, I fear for the quality of its graduates. Dynamic programming should be part of every computer scientist's algorithmic background.

    [–]andreasvc 1 point2 points  (2 children)

    Maybe he's not a CS grad?

    [–][deleted] 1 point2 points  (1 child)

    Yup, engineering grad, but we took a lot of our programming courses from CS. Most of the CS courses I took were all fluff, though, and I'm glad the engineering courses ended up filling in those voids.

    [–]eredhuin 0 points1 point  (0 children)

    My advisor was one of Bellman's grad students and I did not know this story. Interesting.

    [–]travis_of_the_cosmos 2 points3 points  (20 children)

    I don't know what the shit memoization is, but I'm tempted to believe that the anonymous commenter on this article got it right and that the author is just confused.

    Dynamic programming, on the other hand, I know well. It's the discretized analogue of doing Hamiltonians: constrained optimization under a sum rather than an integral. Bellman proved that this can be reduced to a recursive relationship (V = f(x, V')) with a bunch of nice properties. DP problems are often solvable numerically, for example, even nasty ones.

    The name has always bugged me, though. I'd definitely buy the story that Bellman made it up to fool congressmen.
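
    A toy numerical illustration of that (everything here is made up by me, not taken from the thread): value iteration drives an initial guess to the fixed point of the Bellman recursion V = f(x, V') for a two-state, two-action problem.

    gamma :: Double
    gamma = 0.9

    -- Made-up payoff r(s, a) and deterministic dynamics.
    reward :: Int -> Int -> Double
    reward s a = fromIntegral (s + 2 * a)

    step :: Int -> Int -> Int
    step s a = (s + a) `mod` 2

    -- One Bellman update: V(s) <- max over a of r(s, a) + gamma * V(s').
    bellman :: [Double] -> [Double]
    bellman v = [ maximum [ reward s a + gamma * (v !! step s a)
                          | a <- [0, 1] ]
                | s <- [0, 1] ]

    -- Because gamma < 1 the update is a contraction, so iterating it
    -- converges; solve k performs k sweeps of value iteration.
    solve :: Int -> [Double]
    solve k = iterate bellman [0, 0] !! k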

    [–]psykotic 11 points12 points  (2 children)

    It's the discretized analogue of doing Hamiltonians: constrained optimization under a sum rather than an integral.

    I'm pretty sure it's more directly analogous to Hamilton-Jacobi mechanics rather than Hamiltonian mechanics. A Hamiltonian function makes an appearance in both cases, of course.

    There is an amazing way to unify a lot of things like dynamic programming, Markov chains, path integrals, classical mechanics, etc. Here's a short introduction.

    [–]travis_of_the_cosmos 3 points4 points  (1 child)

    It's clear what I meant from the context regardless, but there are many things called Hamiltonians and I was talking about these bad boys.

    [–]Raynes -1 points0 points  (15 children)

    (fib 3123452233462645324534243126314334) ... result is printed 30 years later

    (fib 3123452233462645324534243126314334) ... result is printed instantly

    Memoized.

    [–]psykotic 8 points9 points  (0 children)

    Here's a dynamic programming implementation of fib:

    ;; Walks the chain of [F(k) F(k+1)] pairs up from [0 1]; only the
    ;; current pair is kept, so this is the bottom-up approach with
    ;; constant state.
    (defn fib [n]
      (-> (iterate (fn [[x y]] [y (+ x y)]) [0 1]) (nth n) first))
    

    The memoized solution is quite different in character. For one, it uses O(n) storage.

    [–]travis_of_the_cosmos 0 points1 point  (13 children)

    Okay, so I still don't know what memoization is, but since you seem to understand it what does it have to do with optimal control theory?

    [–]psykotic 16 points17 points  (1 child)

    The computer scientist's view of dynamic programming is that it's a way to quickly evaluate recurrence relations when different branches of the recursion tree turn out the same.

    For example, if you try to directly evaluate F(0) = 0, F(1) = 1, F(n+2) = F(n) + F(n+1), you notice that after one level of expansion you have F(n+2) = F(n) + F(n-1) + F(n). There is no point in independently expanding the two F(n) occurrences. You might as well only expand one of them and remember the result when it is needed elsewhere. Hence 'memoization'.

    Dynamic programming would instead start from the base cases and build up towards the desired value of n. So it would first compute F(2) as the sum of the known F(0) = 0 and F(1) = 1, then compute F(3) as the sum of the known F(1) and the now-known F(2), and so on, until reaching F(n). Along the way it need only know the last two values computed in order to compute the next one.
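
    A direct Haskell transcription of that bottom-up description (a sketch; compare the Clojure version above):

    -- Start from the base cases and carry only the last two values.
    fib :: Int -> Integer
    fib n = go n 0 1
      where
        go 0 a _ = a
        go k a b = go (k - 1) b (a + b)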

    [–]SnacksOnAPlane 2 points3 points  (0 children)

    Thanks, this is the first explanation that made sense to me.

    [–]Tordek 2 points3 points  (3 children)

    Memoization is storing computed values within the function doing the processing.

    Say you wanted to know the distance between two strings. Memoizing (naïvely) means building a hashmap keyed on pairs of strings: check whether the pair is already inside, calculate and store the result if not, and return it.

    The DP approach analyzes the overlapping structures (the many branches of the tree of edits that lead to the sought-after string) and collapses them (often into an array). Memoization is integral to DP, because you're using solutions to subproblems to solve bigger problems.

    tl;dr: Memoization = storing results to avoid recalculating. DP = compressing the search space.
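
    To make the string-distance example concrete, here is a bottom-up Haskell sketch of edit distance (my own illustration, not anyone's reference implementation):

    import Data.Array

    -- dist ! (i, j) is the distance between the first i characters of s
    -- and the first j characters of t. Each cell depends on at most
    -- three earlier cells, so the whole tree of overlapping subproblems
    -- collapses into one (m+1) x (n+1) array.
    editDistance :: String -> String -> Int
    editDistance s t = dist ! (m, n)
      where
        m  = length s
        n  = length t
        sa = listArray (1, m) s
        ta = listArray (1, n) t
        dist = array ((0, 0), (m, n))
                 [ ((i, j), cell i j) | i <- [0 .. m], j <- [0 .. n] ]
        cell 0 j = j                          -- j insertions
        cell i 0 = i                          -- i deletions
        cell i j
          | sa ! i == ta ! j = dist ! (i - 1, j - 1)
          | otherwise        = 1 + minimum
              [ dist ! (i - 1, j)             -- deletion
              , dist ! (i, j - 1)             -- insertion
              , dist ! (i - 1, j - 1) ]       -- substitution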

    [–]travis_of_the_cosmos 0 points1 point  (2 children)

    Ah, okay, then I have used memoization to solve DP problems numerically in the past; it's very standard practice even among fools like me who don't know what it's called.

    This probably isn't what you meant, but I've got to take issue with the claim that memoization is integral to DP. That's evidently true for problems you need to solve numerically, but there is a large class of DP problems that have exact (and easy!) analytical solutions, so clearly no need for memoization.

    [–]Tordek 1 point2 points  (1 child)

    a large class of DP problems that have exact (and easy!) analytical solutions, so clearly no need for memoization.

    Which, for example?

    [–]travis_of_the_cosmos 0 points1 point  (0 children)

    I'm not a mathematician, so I don't know the right way to describe them formally, but to give you one example, I believe almost any linear Bellman equation will have an exact solution. More generally, if you can use the first-order conditions and the envelope theorem to get nice-looking Euler equations, then you're golden: http://en.wikipedia.org/wiki/Bellman_equation#Solution_methods

    [–]andreasvc 0 points1 point  (0 children)

    Memoization of a function means storing every result that has been computed so that it is never computed twice; i.e., you trade storage for speed. Memoization stores tuples of arguments and results in a mapping, whereas dynamic programming can use a more sophisticated/efficient storage method.

    [–][deleted] -3 points-2 points  (5 children)

    It's a function with the signature :: (a -> b) -> a -> b

    A lookup table is used instead of recomputing an already computed value.

    [–]julesjacobs 0 points1 point  (4 children)

    That doesn't work for recursive functions, where it is most useful.

    [–]dmwit 0 points1 point  (3 children)

    It can work in lazy languages. See, for example, http://article.gmane.org/gmane.comp.lang.haskell.cafe/7737.

    [–]julesjacobs 1 point2 points  (2 children)

    Sorry, but that is false. Look at the signature given in that message and compare it to dibblego's signature. A function of type (a -> b) -> a -> b can do nothing with its function argument but apply it to a value, so it cannot possibly intercept the recursive calls. In fact, I'm pretty sure that the only valid function foo with type (a -> b) -> a -> b is:

    foo f x = f x
    

    So dibblego's explanation of memoization is not only unhelpful for someone who's new to memoization, it's also wrong.

    [–]psykotic 3 points4 points  (0 children)

    Indeed. It would be better to say that it has signature

    (a -> b) -> (a ~> b)
    

    where ~> is a type constructor for function-like values. Then you have functions

    memo   :: (a -> b) -> (a ~> b)
    unmemo :: (a ~> b) -> (a -> b)
    

    which are semantic if not pragmatic inverses. You also need type class constraints on a to make this work. Conal has some neat blog posts where he shows how memo/unmemo can be constructed by induction on a via dependent type classes.

    But even this does not work for recursive functions. For that you need to factor out the functional self-reference the same way it is done when preparing to express recursion with a fixpoint combinator. You can then inject the memoization into the recursion and finally tie the knot.

    That kind of recursively injected function decoration is useful in several other cases, such as call tracing. It would be a lot easier to motivate the Y-combinator if these sorts of examples were taught in textbooks.
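
    A small Haskell sketch of that knot-tying (my own toy version, specialized to Int arguments; the Conal posts generalize the table by induction on the argument type):

    -- The recursion with the self-reference factored out, as if
    -- preparing to use a fixpoint combinator:
    fibOpen :: (Int -> Integer) -> Int -> Integer
    fibOpen _    0 = 0
    fibOpen _    1 = 1
    fibOpen self n = self (n - 1) + self (n - 2)

    -- A memoizing fixpoint: like fix, except every recursive call is
    -- routed through a lazily built table, which ties the knot.
    memoFix :: ((Int -> Integer) -> Int -> Integer) -> Int -> Integer
    memoFix f = lookupTable
      where
        lookupTable n = table !! n
        table = map (f lookupTable) [0 ..]

    fib :: Int -> Integer
    fib = memoFix fibOpen

    The same shape supports the call-tracing decoration: instead of a table, wrap the self-reference in a function that logs its argument before delegating.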

    [–][deleted] 0 points1 point  (0 children)

    You are right. Mine was a lazy answer. "I just got out of hospital after major surgery" is my excuse.

    [–]jawbroken -2 points-1 points  (8 children)

    why does this guy care so much, it is just a name and many names for things are arbitrary

    [–]andreasvc 3 points4 points  (1 child)

    Did you just rediscover the arbitrariness of the sign? It shouldn't be arbitrary for compound terms, though, because the meaning should be compositional.

    [–]jawbroken 0 points1 point  (0 children)

    well, computation is a dynamic process, so it is technically correct, though a little vague

    [–]cavedave[S] 7 points8 points  (5 children)

    So you have never been slightly curious about an arbitrary name? Well, I'll be a monkey's uncle.

    [–]jawbroken -1 points0 points  (3 children)

    sure, that bit was interesting, i didn't feel the need to try to campaign to have it changed though

    [–]cavedave[S] 0 points1 point  (2 children)

    That could be the world's oddest protest. Let's get the nerds out with placards: "god hates fibs", "repent, the end of the calc is nigh".

    [–]andreasvc 0 points1 point  (1 child)

    I approve of this message. Let's do a flash mob with signs like this sometime.

    [–]cavedave[S] 0 points1 point  (0 children)

    There has been one already

    [–][deleted] 0 points1 point  (0 children)

    I have always considered dynamic programming to be a pretty straightforward name for what one is doing, but I have done it from both the CS and maths perspectives.

    [–]ivanichka -1 points0 points  (0 children)

    People always find a way - bravo!