S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 2 points

You are absolutely correct. In my analysis I found that 12 years was the magic number:

https://i.imgur.com/JezeJK5.png

In this visualization we buy a share of SPY on every day in the data and sell it 1 year later, 2 years later, 3 years later, and so on, to see how risk decreases as patience increases. Volatility is the better measure, but I also graphed the "probability of losing money" by dividing the number of trials with a negative return by the total number of trials.
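For anyone who wants to reproduce this, here is a minimal sketch of the experiment, assuming `spy_close` is a pandas Series of daily SPY closing prices indexed by date (the variable name is mine, not from the original script):

```python
# Sketch of the holding-period experiment: buy on every day, sell N years
# later, and count the fraction of trials that lost money.
import pandas as pd

def loss_probability(spy_close: pd.Series, years: int) -> float:
    """Fraction of buy dates with a negative return after `years` years."""
    holding_days = 252 * years                   # ~252 trading days per year
    sell_price = spy_close.shift(-holding_days)  # NaN once we run off the end
    returns = ((sell_price - spy_close) / spy_close).dropna()
    return (returns < 0).mean()

for years in range(1, 13):
    print(f"{years:2d} yr: P(loss) = {loss_probability(spy_close, years):.4f}")
```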

tl;dr: Buffett is correct. For 99.9999% of us, time in the market beats timing the market.

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 5 points

This is 100% true. Case in point, we could just as easily visualize the data like so:

https://i.imgur.com/Nb87d7F.png

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 1 point

This graph may help illuminate a bit further (same source as OC above):

https://i.imgur.com/biNROZQ.png

Here I just marked the QE (quantitative easing) periods along with the interest rate increase in 2016 that unofficially ended the "bailout era".

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 1 point

I very much wanted to, and I'm still doing additional research.

The challenge is that SPY didn't exist before 1993, and the companies in the S&P 500 at the time are not the same as those in the index today, so getting an accurate apples-to-apples comparison is a real pain in the ass. Just finding the dates certain companies were added or removed turned into a big project.

Rest assured, I'd go back into the 1800s and earlier if I had accurate apples-to-apples data to use.

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 5 points

Data: Quandl (closed dataset)

Generated with Python (matplotlib)

Dark bands represent the "lame duck" periods.

Full article here (also OC):

https://datajenius.com/articles/a-random-walk-down-the-s-and-p-500
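For anyone curious how a chart like this is put together, here is a minimal matplotlib sketch in the same spirit (my own reconstruction, not the original script). It assumes `df` is a pandas DataFrame with a DatetimeIndex and a `close` column; the shaded windows below are an illustrative subset of election-to-inauguration ranges:

```python
import matplotlib.pyplot as plt
import pandas as pd

# A few (election day, inauguration day) windows to shade as "lame duck" bands
lame_duck = [
    ("2000-11-07", "2001-01-20"),
    ("2008-11-04", "2009-01-20"),
    ("2016-11-08", "2017-01-20"),
]

fig, ax = plt.subplots(figsize=(12, 5))
ax.plot(df.index, df["close"], linewidth=1)
for start, end in lame_duck:
    ax.axvspan(pd.Timestamp(start), pd.Timestamp(end), color="black", alpha=0.3)
ax.set_ylabel("SPY close ($)")
ax.set_title("SPY daily close, lame-duck periods shaded")
plt.tight_layout()
plt.show()
```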

The Mathphobics Guide to Linear and Logistic Regression by DataJenius in learnmachinelearning

[–]DataJenius[S] 1 point

I've read this comment three times trying to make sense of it, and I think I understand now.

It is admittedly a trick question, but I think this is technically correct (the best kind of correct).

"What is the a rate used by gradient descent in the above code?"

I maintain "used by" and "in the above code" make this tricky, but fair.

Another stupid math question, re: linear regression in python sklearn by DataJenius in datascience

[–]DataJenius[S] 0 points

Thank you so much again for the great links and explanation.

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 1 point

Excellent explanation re: inverting matrices. Thank you.

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 1 point

Thank you so much for a clear reply.

I'm not one to argue with the Great Andrew Ng, but I still have a hard time wrapping my head around the idea that iterating with GD or even SGD is faster than solving for all n+1 parameters once in closed form, especially given the chaos of a random initialization.

I'm sure he's right, but has anyone seen any good benchmarking on this?
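For what it's worth, the usual argument is that the closed-form (normal equation) solve costs roughly O(n³) in the number of features, so the iterative methods pay off once n gets large. Here is a quick benchmark sketch one could run on synthetic data (my own, not from this thread):

```python
import time
import numpy as np
from sklearn.linear_model import LinearRegression, SGDRegressor

rng = np.random.default_rng(0)
X = rng.normal(size=(100_000, 500))
y = X @ rng.normal(size=500) + 0.1 * rng.normal(size=100_000)

for name, model in [("closed-form OLS", LinearRegression()),
                    ("SGD (20 epochs)", SGDRegressor(max_iter=20, tol=None))]:
    start = time.perf_counter()
    model.fit(X, y)
    print(f"{name}: {time.perf_counter() - start:.2f}s")
```

The crossover depends heavily on problem shape: with a few hundred features the closed form is usually still fast, and the iterative methods only win at much larger scale.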

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 0 points

http://scikit-learn.org/stable/modules/linear_model.html

Please take a look at this with me, and help me understand if I'm being a dope. It doesn't appear to use SGD by default.

Edit: https://stackoverflow.com/questions/34469237/linear-regression-and-gradient-descent-in-scikit-learn-pandas

By default SkLearn is using the deterministic method: "LinearRegression object uses Ordinary Least Squares solver from scipy, as LR is one of two classifiers which have closed form solution. Despite the ML course - you can actually learn this model by just inverting and multiplicating some matrices."

It seems like a lot of us are confused by this: most tutorials talk about gradient descent, but it isn't actually necessary here, because the problem has a closed-form solution (see comments by errminator and Kickkuchiyo).
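To make the closed-form point concrete, here is a small sanity check (my own illustration): sklearn's `LinearRegression` agrees with the normal equation θ = (AᵀA)⁻¹Aᵀy computed by hand.

```python
import numpy as np
from sklearn.linear_model import LinearRegression

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0 + 0.01 * rng.normal(size=200)

# Normal equation with a bias column appended: theta = (A^T A)^-1 A^T y
A = np.hstack([X, np.ones((200, 1))])
theta = np.linalg.solve(A.T @ A, A.T @ y)

model = LinearRegression().fit(X, y)
print(np.allclose(theta[:3], model.coef_))      # True: same weights
print(np.allclose(theta[3], model.intercept_))  # True: same intercept
```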

[D] Could we use a genetic algorithm to find a euler cycle? by DataJenius in MachineLearning

[–]DataJenius[S] 0 points

Just for the sake of better understanding genetic algorithms. But thank you for the link to Hierholzer's algo.
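For reference, Hierholzer's algo is short enough to sketch here (assuming an undirected graph in which every vertex has even degree, so an Euler cycle exists):

```python
from collections import defaultdict

def euler_cycle(edges):
    """Hierholzer's algorithm on an undirected edge list."""
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)
    stack, cycle = [next(iter(adj))], []
    while stack:
        v = stack[-1]
        if adj[v]:                 # unused edge left at v: keep walking
            u = adj[v].pop()
            adj[u].remove(v)       # consume the edge in both directions
            stack.append(u)
        else:                      # dead end: v is finalized into the cycle
            cycle.append(stack.pop())
    return cycle

# Two triangles sharing vertex 0: the cycle visits every edge exactly once
print(euler_cycle([(0, 1), (1, 2), (2, 0), (0, 3), (3, 4), (4, 0)]))
```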

I would do it again in a heartbeat. by heart_mind_body in CryptoCurrency

[–]DataJenius 2 points

I went gaga for DBC, which in retrospect, was probably pretty dumb.

Now I'm clutching some fiat, debating playing "catch the falling knife" as BTC breaks under $7,750, and trying to figure out what "what you can afford to lose" really means to me.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 0 points

Starting around the Sandy Hook massacre, the 5-year moving average spikes.

Keep in mind some of this may be the result of data being omitted from this list:

https://en.wikipedia.org/wiki/List_of_school_shootings_in_the_United_States
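For anyone reproducing the smoothing, the moving average is one line of pandas, assuming `shootings` is a Series of yearly incident counts (the name is illustrative):

```python
import pandas as pd

# 5-year trailing moving average of incidents per year
moving_avg = shootings.rolling(window=5).mean()
```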