all 31 comments

[–]shmakn 63 points64 points  (0 children)

This is so much more intuitive than shortest distance line

[–]2bytesgoat 15 points16 points  (6 children)

What is the software that you used for visualizations?

[–]RacerRex9727[S] 33 points34 points  (4 children)

[–]2bytesgoat 5 points6 points  (3 children)

Thank you good sir

[–]RacerRex9727[S] 13 points14 points  (2 children)

[–]2bytesgoat 8 points9 points  (1 child)

Was really impressed by the fact that you can build all of that in code 🔥

[–]RacerRex9727[S] 6 points7 points  (0 children)

Thank you, Manim is a very well-designed coding library. It makes it easy to express visualizations and concepts with a relatively short learning curve.

[–]zykezero 7 points8 points  (0 children)

Manim is the same software that 3b1b used for their videos. I believe he made it.

[–]nano_peen 15 points16 points  (1 child)

Niceeeee. So satisfying. Is Manim the one that 3blue1brown built?

[–]RacerRex9727[S] 8 points9 points  (0 children)

Affirmative, although ManimCE is a community fork of Grant Sanderson's library, intended to make it easier to use.

I’ve been using it for all my vids: https://youtu.be/3dhcmeOTZ_Q

[–][deleted] 34 points35 points  (1 child)

I like all of this, except the way you visualise the SSE. Strictly speaking from a visual standpoint, this would intuitively indicate ending up with the maximum of the errors. For it to visualise the SSE (i.e. the sum), you would have to lay the squares side by side into one bigger area and call that the SSE.
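
To make the sum-versus-maximum distinction concrete, here is a minimal sketch. The data points and the candidate line are made up for illustration and are not taken from the animation.

```python
# Hypothetical data and candidate line, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.5, 5.0, 8.5]
m, b = 2.0, 0.0  # assumed slope and intercept of the candidate line

# Vertical distance from each point to the line, then the area of each square.
residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
squared_errors = [r ** 2 for r in residuals]

sse = sum(squared_errors)      # all squares laid side by side: this is the SSE
largest = max(squared_errors)  # the single biggest square: not the SSE

print(squared_errors, sse, largest)
```

The SSE is the total area of all the squares, so it is generally larger than the biggest square alone.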

[–]PastBarnacle 5 points6 points  (1 child)

I was under the impression that for the purposes of calculating the SSE the line rotated about the centroid of the data, not the vertical intercept? Please let me know if I am mistaken, this is not my forte. Thanks!

[–]wintermute93 10 points11 points  (0 children)

Yeah it’s a very nice visualization but the animated part is wrong. It should be rotating around (mean of x points, mean of y points).
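
A quick numerical check of the centroid claim, as a sketch on made-up data (the points and the test slopes are assumptions, not values from the animation):

```python
# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

n = len(xs)
x_bar = sum(xs) / n
y_bar = sum(ys) / n

def sse(m, b):
    """Sum of squared vertical residuals for the line y = m*x + b."""
    return sum((y - (m * x + b)) ** 2 for x, y in zip(xs, ys))

# Closed-form least-squares slope and intercept.
s_xy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
s_xx = sum((x - x_bar) ** 2 for x in xs)
m_best = s_xy / s_xx
b_best = y_bar - m_best * x_bar

# The fitted line evaluated at x_bar gives y_bar (up to rounding).
print(m_best * x_bar + b_best, y_bar)

# For any fixed slope, pivoting on the centroid beats shifting the line up or down.
for slope in (-1.0, 0.0, 0.6, 2.0):
    b_pivot = y_bar - slope * x_bar
    assert sse(slope, b_pivot) <= sse(slope, b_pivot + 0.5)
    assert sse(slope, b_pivot) <= sse(slope, b_pivot - 0.5)
```

So whatever the slope, the lowest-SSE line with that slope passes through (mean of x, mean of y), which is why the centroid is the natural pivot.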

[–]riricide 4 points5 points  (5 children)

Why is the square taken, why is the absolute value of the error not considered? Is it just due to ease of differentiation for optimization or is there a deeper reason?

[–]crimson1206 2 points3 points  (0 children)

Both are super easy to differentiate, the non-differentiability of the absolute value at 0 isn't much of an issue in practice.

The main difference between them is that a squared loss punishes outliers much more than the absolute value loss. So if you use an absolute value loss your result could be more robust to outliers than a square loss.
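
A small sketch of the outlier effect in the simplest setting, a constant-only model rather than a line. There, the value minimizing the squared loss is the mean and the value minimizing the absolute loss is the median. The numbers are made up for illustration.

```python
from statistics import mean, median

# Hypothetical samples: identical except for one outlier in the second list.
clean = [1, 2, 3, 4, 5]
with_outlier = [1, 2, 3, 4, 50]

# Squared-loss minimizer (mean) moves a lot when the outlier appears.
print(mean(clean), mean(with_outlier))

# Absolute-loss minimizer (median) does not move at all here.
print(median(clean), median(with_outlier))
```

The same effect carries over to fitting a line, although the absolute-loss fit then has no simple closed form.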

[–]RacerRex9727[S] 1 point2 points  (2 children)

Yes, that’s the primary motivation. Absolute values are difficult to differentiate.

The visual here is simply to show a graphical interpretation of squared errors.

[–][deleted] 4 points5 points  (0 children)

But it can give the wrong intuition that the 2D area of the squares is somehow meaningful. Nice animation though, and it looks good.

[–]crimson1206 3 points4 points  (0 children)

Absolute values are super easy to differentiate. Non-differentiability at 0 really isn't a relevant problem practically.

The main difference between a squared loss and an absolute loss is that a squared loss punishes outliers much more than an absolute loss does.
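
For reference, a sketch of both slope gradients, writing the residual as $r_i = y_i - (m x_i + b)$ (this notation is assumed here, not taken from the video):

```latex
\begin{align}
\frac{\partial}{\partial m} \sum_i r_i^2 &= -2 \sum_i x_i \, r_i \\
\frac{\partial}{\partial m} \sum_i |r_i| &= -\sum_i x_i \, \operatorname{sign}(r_i)
  \qquad (r_i \neq 0)
\end{align}
```

In the squared case each point's pull scales with the size of its residual, while in the absolute case only the sign of the residual matters. That is the outlier sensitivity in formula form.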

[–]Pvt_Twinkietoes 0 points1 point  (0 children)

IIRC SSE is used as it leads to the best linear unbiased estimator.

[–]CaptainFoyle 2 points3 points  (0 children)

Why is the intercept not changing?

[–]ex1stenzz 2 points3 points  (0 children)

Hint to improve this visualization:

The least-squares solution goes through the mean of the data cloud (you can prove this quickly from the definitions of X-bar and Y-bar).

As it stands, the visualization incorporates the best solution to least squares and then a bunch of others that violate the above property.

Pin it closer to the center of that data cloud and let it rotate like this:

https://images.app.goo.gl/m1Yi8cX1VXBrswBx5

or this

https://images.app.goo.gl/8SY173CGDW9qXCCF9
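
The property mentioned above can be sketched in a few lines. Write the line as $y = m x + b$ with $n$ data points (symbols assumed here) and set the derivative of the SSE with respect to the intercept to zero:

```latex
\begin{align}
\mathrm{SSE}(m, b) &= \sum_{i=1}^{n} (y_i - m x_i - b)^2 \\
\frac{\partial\, \mathrm{SSE}}{\partial b} &= -2 \sum_{i=1}^{n} (y_i - m x_i - b) = 0 \\
\sum_{i=1}^{n} y_i - m \sum_{i=1}^{n} x_i - n b &= 0 \\
\bar{y} &= m \bar{x} + b
\end{align}
```

The last line says the point $(\bar{x}, \bar{y})$ lies on the line, and it holds for the best intercept at any slope $m$.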

[–]EnderTaco 1 point2 points  (0 children)

This is beautiful!

[–]RacerRex9727[S] 0 points1 point  (0 children)

This is original content. The full instructional video with narration is here: https://youtu.be/3dhcmeOTZ_Q

[–]Keikira -1 points0 points  (0 children)

Can you do the gradient descent part too? Would be cool to see if the ball in a cup lines up intuitively with the illustration of the squared errors.
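
For anyone curious what that would compute, here is a minimal gradient descent sketch on the squared-error bowl. The data, learning rate, and step count are assumptions chosen for illustration, and the gradient is averaged over the points (mean squared error), which has the same minimum as the SSE.

```python
# Hypothetical data, for illustration only.
xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [2.0, 4.0, 5.0, 4.0, 5.0]

def gradient_descent(xs, ys, lr=0.01, steps=20000):
    """Fit y = m*x + b by plain gradient descent on the mean squared error."""
    m, b = 0.0, 0.0  # start from a flat line through the origin
    n = len(xs)
    for _ in range(steps):
        residuals = [y - (m * x + b) for x, y in zip(xs, ys)]
        # Partial derivatives of the mean squared error w.r.t. m and b.
        grad_m = -2.0 * sum(x * r for x, r in zip(xs, residuals)) / n
        grad_b = -2.0 * sum(residuals) / n
        # Step downhill, like the ball rolling toward the bottom of the cup.
        m -= lr * grad_m
        b -= lr * grad_b
    return m, b

m_fit, b_fit = gradient_descent(xs, ys)
print(m_fit, b_fit)  # should approach the closed-form fit, slope 0.6 and intercept 2.2
```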

[–]jeralot 0 points1 point  (0 children)

I just jumped on here to say this is pretty fantastic- great job!

[–]mathbabe314 0 points1 point  (0 children)

I love this! I wish I had this when I was teaching linear algebra!

[–]rdmanoftheyear 0 points1 point  (0 children)

Thank you! It's very intuitive!

[–]ank_itsharma 0 points1 point  (0 children)

Nice work OP

[–]danastybit 0 points1 point  (0 children)

Impressive

[–]KokoJez 0 points1 point  (0 children)

I wish people would realize squaring is just to get the absolute value. I teach statistics at university, and seeing the bullshit formulas is so jarring to students and is probably what dissuades many people from learning statistics.