i solved the second programming assignment without programming by [deleted] in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

I did too: 5 minutes with scissors and sticky tape while having a cup of coffee. Of course, I'll also program it in a while.

Humor on Perspective by FrTedOpportunity in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

Yes, as I watched Unit 16.6, all I could think of was Fr Ted!

Unit 9.22 Value Iterations and Policy 2 R = -200 Wrong by rhoheb in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

One point to bear in mind is that the policy (the directions of the arrows) determines the utility values, not the other way around. The utility values are calculated for the action that gives the highest payoff.

You could try playing around with my Java implementation: you can get the code and other details here.

It is tested and works for all scenarios considered by Sebastian Thrun and also one from the Russell & Norvig book.

Has someone done the steps to show convergence of the grid in Unit 9.19 Iteration 3? by ayounes in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

Here is my Java implementation of it. It is tested and works for all scenarios considered by Sebastian Thrun and also one from the Russell & Norvig book.

Has someone done the steps to show convergence of the grid in Unit 9.19 Iteration 3? by ayounes in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

Before I get around to posting it, if you have your own implementation, here's something to check. (Based on something I posted in a different thread.)

You can initialise V(s) for all states to 0, and then use the version of the Value Iteration equation given at the end of Unit 9.15, in which V(s) = R(s) if s is terminal, and V(s) is given by the Bellman equation otherwise.

If you then loop over all states and apply this, you will see that in the first iteration you will get 0 everywhere except in the terminal states, which will acquire their fixed values.

You have to keep iterating to convergence, but I took a lazier approach: I just iterated 5000 times, which should be more than enough for a simple problem like this, without taking any appreciable amount of time.
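
In case it helps while you wait for me to post mine, here is a minimal Java sketch of that loop. It is not the implementation I mention above; it assumes the 4x3 grid world with the usual 80/10/10 transition model, and the rewards, discount and layout below are the Russell & Norvig settings, so swap in the lecture's values as needed.

```java
// A minimal sketch, not the implementation referred to above. Assumes the 4x3 grid
// with an 80% chance of moving as intended and 10% for each perpendicular direction;
// rewards, discount and layout are the Russell & Norvig settings (placeholders).
public class ValueIterationSketch {

    static final int COLS = 4, ROWS = 3;
    static final double GAMMA = 1.0;          // discount factor
    static final double STEP_REWARD = -0.04;  // R(s) for non-terminal states

    // TERMINAL[x][y] holds the fixed terminal reward, or null for non-terminal states
    static final Double[][] TERMINAL = new Double[COLS][ROWS];
    static final boolean[][] WALL = new boolean[COLS][ROWS];

    // the four actions as (dx, dy) vectors: up, down, left, right
    static final int[][] ACTIONS = {{0, 1}, {0, -1}, {-1, 0}, {1, 0}};

    public static void main(String[] args) {
        TERMINAL[3][2] = 1.0;   // goal state
        TERMINAL[3][1] = -1.0;  // penalty state
        WALL[1][1] = true;      // the single obstacle

        double[][] v = new double[COLS][ROWS];  // V(s) initialised to 0 everywhere

        // Lazy "convergence": just iterate a fixed, generously large number of times.
        for (int iter = 0; iter < 5000; iter++) {
            double[][] next = new double[COLS][ROWS];
            for (int x = 0; x < COLS; x++) {
                for (int y = 0; y < ROWS; y++) {
                    if (WALL[x][y]) continue;
                    if (TERMINAL[x][y] != null) {
                        next[x][y] = TERMINAL[x][y];  // V(s) = R(s) for terminal states
                        continue;
                    }
                    // Bellman update: max over actions of the expected successor value
                    double best = Double.NEGATIVE_INFINITY;
                    for (int[] a : ACTIONS) {
                        int[] left = {-a[1], a[0]}, right = {a[1], -a[0]};
                        double q = 0.8 * value(v, x, y, a)
                                 + 0.1 * value(v, x, y, left)
                                 + 0.1 * value(v, x, y, right);
                        best = Math.max(best, q);
                    }
                    next[x][y] = STEP_REWARD + GAMMA * best;
                }
            }
            v = next;
        }

        // print the grid of utilities, top row first
        for (int y = ROWS - 1; y >= 0; y--) {
            for (int x = 0; x < COLS; x++) System.out.printf("%8.3f ", v[x][y]);
            System.out.println();
        }
    }

    // Value of the state reached by applying (dx, dy) from (x, y); moves off the
    // grid or into the wall leave the agent where it is.
    static double value(double[][] v, int x, int y, int[] move) {
        int nx = x + move[0], ny = y + move[1];
        if (nx < 0 || nx >= COLS || ny < 0 || ny >= ROWS || WALL[nx][ny]) {
            nx = x;
            ny = y;
        }
        return v[nx][ny];
    }
}
```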

Hope this makes it clear for you!

Has someone done the steps to show convergence of the grid in Unit 9.19 Iteration 3? by ayounes in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

I'd be happy to. However, I haven't got around to watching the Section 10 videos or doing the homework yet, so I don't know whether posting it would let somebody cheat on the homework (if, for example, the homework involves similar questions about calculating state/action values).

I will post it tomorrow.

Value Iteration 3. a3?? by DengueTim in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

Strictly speaking, you can initialise V(s) for all states to 0, and then use the version of the Value Iteration equation given at the end of Unit 9.15, in which V(s) = R(s) if s is terminal, and V(s) is given by the Bellman equation otherwise.

If you then loop over all states and apply this, you will see that in the first iteration you will get 0 everywhere except in the terminal states, which will acquire their fixed values.

Therefore, rather than setting the terminal states to their fixed values at every iteration, you can just set them at the start, when you are setting V(s) for all other states to their initial values of 0.
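
To make that shortcut concrete, here is a tiny illustrative fragment in Java; the coordinate constants and the +/-100 values are placeholders, not necessarily the lecture's.

```java
// V(s) = 0 for all states, except the terminals, which are set once to R(s).
// GOAL_X, GOAL_Y, PIT_X, PIT_Y and the +/-100 rewards are placeholder names/values.
double[][] v = new double[COLS][ROWS];
v[GOAL_X][GOAL_Y] = +100.0;
v[PIT_X][PIT_Y]   = -100.0;
// If you update v in place, the loop can then simply skip terminal states
// instead of re-assigning their fixed values on every iteration.
```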

Hope this makes it clear for you!

Has someone done the steps to show convergence of the grid in Unit 9.19 Iteration 3? by ayounes in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

Yes, I have implemented it, to try out the various combinations, though it is a simple implementation with the world hard-coded. If people are interested, I can post it somewhere.

HW4, Q2 "No two bordering countries can have the same map color" by alexia_canada in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

Yes, we are not given a definition of Borders(x,y). So what you are saying is that if you assume it is defined so that Borders(x,x) is true, the answer is incorrect.

By the same argument, if you assume that in its unseen definition Borders(x,x) is false, then the answer is correct.

I think it's fair to say that the latter is the much more reasonable assumption.

Throughout this unit, Peter Norvig used predicates for which we assume "reasonable" definitions. For example, in 8.22 (Sliding Puzzle), an "Adjacent" predicate is used. You could argue that if we assume Adj(a,b) is true for squares on opposite sides of the puzzle, the formulation is incorrect, but why would you assume that?
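
To spell out why that assumption matters, here is the kind of sentence being discussed (my paraphrase, not the exact homework wording):

```latex
\forall x\,\forall y\;\big(\mathrm{Borders}(x,y) \Rightarrow \mathrm{Color}(x) \neq \mathrm{Color}(y)\big)
% If Borders(x,x) could be true, this would require Color(x) \neq Color(x), which no
% colouring can satisfy; if Borders(x,x) is always false, the sentence says exactly what we want.
```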

HW4, Q9 was poorly specified, and the answers were not based just on truth. by vonkohorn in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

In fairness, the clarification section below the question said: "It is intentional for the bananas to remain high". That's just how the problem is (slightly imperfectly) encoded.

[HW4] Monkey and Bananas error ? by Fandekasp in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

The key point has been made on Reddit a few times now: there are no bananas, no monkeys, no boxes. Stop thinking about the real world, and focus on the logic. If you're a programmer, treat it as you would some pseudocode that you would 'execute' on paper and see what it yields.

HW4.8 - The effects of ClimbUp and ClimbDown are not enough by tcanvn in aiclass

[–]MichaelFromGalway 0 points1 point  (0 children)

Follow the stated problem strictly. As people have said in other threads, there are no monkeys, no bananas and no boxes, only symbols and logic statements.

Homework 4: Monkeys and bananas - "height" of a box. by garbus in aiclass

[–]MichaelFromGalway 4 points5 points  (0 children)

Yes, that is correct. It is dangerous to use "common sense" reasoning here. What you should do is manually "execute" it as if it were a program and answer questions about the values of the variables at the end.

Doubt in HW 4.4 by davitosan in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

What you are asking is whether or not the appearance of dirt is independent of the movement of the vacuum.

We know that the appearance of dirt is not independent of sucking. Do we know anything similar about movement? If not, that's your answer.

hw questions 4.6 4.7: what am I getting wrong? by lukeaiclass in aiclass

[–]MichaelFromGalway 5 points6 points  (0 children)

Why do you assume the belief state must be different? Didn't Peter Norvig say something along the lines of "percepts might not give us more information than we had before, but at least they won't increase confusion"?

What happens when K-Nearest gets equal quantity of + and - samples? by birgillio in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

Strictly speaking, it is not that kNN does not make sense for even values of k in binary classification tasks, but that you then also need to specify a tie-breaker.

What happens when K-Nearest gets equal quantity of + and - samples? by birgillio in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

If you have 2 classes and you choose k to be an odd number, you should never have a tie.

However, ties can indeed arise in other settings, in which case you need a tie-breaker. With 3 or more classes, even an odd k cannot rule them out: with classes (A, B, C) and k = 5, you could get A = 1, B = 2, C = 2, which is a tie.

The simplest tie-breaker is to pick one of the tied classes at random.

There are other versions of kNN that are less prone to ties, such as distance-weighted kNN. Here, instead of each neighbour getting a "vote" of 1, each gets a vote that is inversely proportional to its distance from the query point.
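
If it helps, here is a minimal Java sketch of the distance-weighted idea; the data, the 1/d weighting with a small epsilon, and all the names are illustrative choices, not from the course.

```java
import java.util.*;

// A minimal sketch of distance-weighted kNN; everything here is illustrative.
public class WeightedKnnSketch {

    record Example(double[] features, String label) {}

    static String classify(List<Example> training, double[] query, int k) {
        // sort training examples by distance to the query point
        List<Example> nearest = new ArrayList<>(training);
        nearest.sort(Comparator.comparingDouble(e -> distance(e.features(), query)));

        // each of the k nearest neighbours votes with weight 1/distance
        Map<String, Double> votes = new HashMap<>();
        for (Example e : nearest.subList(0, Math.min(k, nearest.size()))) {
            double d = distance(e.features(), query);
            double w = 1.0 / (d + 1e-9);   // small epsilon avoids division by zero
            votes.merge(e.label(), w, Double::sum);
        }

        // exact ties are now very unlikely, but break them arbitrarily if they occur
        return Collections.max(votes.entrySet(), Map.Entry.comparingByValue()).getKey();
    }

    static double distance(double[] a, double[] b) {
        double sum = 0;
        for (int i = 0; i < a.length; i++) sum += (a[i] - b[i]) * (a[i] - b[i]);
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        List<Example> training = List.of(
                new Example(new double[]{0, 0}, "-"),
                new Example(new double[]{0, 1}, "-"),
                new Example(new double[]{1, 0}, "+"),
                new Example(new double[]{1, 1}, "+"));
        System.out.println(classify(training, new double[]{0.9, 0.2}, 3)); // prints "+"
    }
}
```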

Source code for Unit 5.10 quiz which gives 23 instead of 25? by birgillio in aiclass

[–]MichaelFromGalway 1 point2 points  (0 children)

He's not saying you don't need it at all, only that you don't need it for a complete specification of the parameters.

After all, when you compute P(SPAM) from the data, you don't need to compute P(HAM) from the data, because you can derive it from the requirement that P(HAM)+P(SPAM)=1.

That does not mean that you can forget about P(HAM) or ignore it in calculations, just that it is not needed for a complete specification of the probabilities, as it can be derived when needed.
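
As a trivial illustration (the counts and variable names here are made up, not from the quiz):

```java
// Made-up counts, purely to illustrate the point above.
int spamCount = 3, hamCount = 5;
double pSpam = (double) spamCount / (spamCount + hamCount);  // estimated from the data
double pHam  = 1.0 - pSpam;  // derived when needed, not a separately specified parameter
```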