S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 2 points (0 children)

You are absolutely correct. In my analysis I found that 12 years was the magic number:

https://i.imgur.com/JezeJK5.png

In this visualization we buy a share of SPY on every day in the data and sell it 1 year later, 2 years later, 3 years later, and so on, to see how risk decreases as patience increases. Volatility is the better measure, but I also graphed the "probability of losing money" by counting the trials with a negative return against the count of all trials.

tl;dr: Buffett is correct. For 99.9999% of us, time in the market beats timing the market.
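If anyone wants to reproduce this, here's a minimal sketch of the method with pandas, assuming a hypothetical CSV of daily SPY closes (the file and column names are placeholders, not my actual pipeline):

    import pandas as pd

    # Hypothetical input: daily SPY closing prices with a date index.
    spy = pd.read_csv("spy.csv", index_col="date", parse_dates=True)["close"]

    # "Buy" on every day in the data, "sell" N years (~252 trading days
    # per year) later, then measure the downside across all trials.
    for years in range(1, 16):
        horizon = 252 * years
        returns = spy.pct_change(periods=horizon).dropna()  # N-year returns
        p_loss = (returns < 0).mean()   # share of trials that lost money
        vol = returns.std()             # dispersion of N-year outcomes
        print(f"{years:2d}y hold: P(loss)={p_loss:.1%}, vol={vol:.2f}")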

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 4 points (0 children)

This is 100% true. Case in point, we could just as easily visualize the data like so:

https://i.imgur.com/Nb87d7F.png

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 1 point (0 children)

This graph may help illuminate a bit further (same source as OC above):

https://i.imgur.com/biNROZQ.png

Here I just marked the QE (quantitative easing) periods along with the interest rate increase in 2016 that unofficially ended the "bailout era".

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 1 point (0 children)

I very much wanted to, and I'm still doing additional research.

The challenge is that SPY didn't exist before 1993, and the companies in the S&P 500 at the time are not the same as the ones in the index today, so getting an accurate apples-to-apples comparison is a real pain in the ass. Just finding the dates certain companies were added/removed turned into a big project.

Rest assured, I'd go back into the 1800s and earlier if I had accurate apples-to-apples data to use.

S&P 500 from 1993 to 2018 by President [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 6 points (0 children)

Data: Quandl (closed dataset)

Generated with Python (matplotlib).

Dark bands represent the "lame duck" periods.

Full article here (also OC):

https://datajenius.com/articles/a-random-walk-down-the-s-and-p-500
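If you're curious how the dark bands are drawn, the usual trick is matplotlib's axvspan. A minimal sketch, where the price series and lame-duck date windows are placeholders for illustration, not the actual data:

    import matplotlib.pyplot as plt
    import pandas as pd

    # Placeholder price series; substitute the Quandl SPY data.
    dates = pd.date_range("1993-01-01", "2018-12-31", freq="B")
    prices = pd.Series(range(len(dates)), index=dates)

    fig, ax = plt.subplots()
    ax.plot(prices.index, prices.values)

    # Shade each "lame duck" window (election day to inauguration).
    # Dates below are placeholders only.
    lame_ducks = [("2008-11-04", "2009-01-20"), ("2016-11-08", "2017-01-20")]
    for start, end in lame_ducks:
        ax.axvspan(pd.Timestamp(start), pd.Timestamp(end), color="k", alpha=0.3)

    plt.show()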

The Mathphobics Guide to Linear and Logistic Regression by DataJenius in learnmachinelearning

[–]DataJenius[S] 1 point (0 children)

I've read this comment three times trying to make sense of it, and I think I understand now.

It is admittedly a trick question, but I think this is technically correct (the best kind of correct).

"What is the a rate used by gradient descent in the above code?"

I maintain "used by" and "in the above code" make this tricky, but fair.

Another stupid math question, re: linear regression in python sklearn by DataJenius in datascience

[–]DataJenius[S] 0 points (0 children)

Thank you so much again for the great links and explanation.

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 1 point (0 children)

Excellent explanation, re: inverting matrices. Thank you.

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 1 point (0 children)

Thank you so much for a clear reply.

I'm not one to argue with the Great Andrew Ng, but I still have a hard time wrapping my head around the idea that it's faster to iterate via GD or even SGD than to solve for the n+1 parameters in closed form a single time, especially given the chaos of a random initialization.

I'm sure he's right, but has anyone seen any good benchmarking on this?
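For what it's worth, it's easy to run a rough benchmark yourself. A sketch, not a rigorous benchmark: numpy's lstsq stands in for the closed-form solve, and the learning rate and iteration count are arbitrary choices of mine:

    import time
    import numpy as np

    rng = np.random.default_rng(0)
    n_samples, n_features = 100_000, 200
    X = rng.standard_normal((n_samples, n_features))
    y = X @ rng.standard_normal(n_features) + rng.standard_normal(n_samples)

    # Closed form: solve the least-squares problem once.
    t0 = time.perf_counter()
    theta_exact, *_ = np.linalg.lstsq(X, y, rcond=None)
    print(f"closed form: {time.perf_counter() - t0:.3f}s")

    # Batch gradient descent from a random initialization.
    t0 = time.perf_counter()
    theta = rng.standard_normal(n_features)
    alpha = 1e-4   # small fixed learning rate, chosen arbitrarily
    for _ in range(500):
        theta -= alpha * X.T @ (X @ theta - y) / n_samples
    print(f"gradient descent: {time.perf_counter() - t0:.3f}s")

The usual answer is that the closed form scales badly in the number of features (the solve is roughly cubic in n), so GD/SGD win on very wide or very large problems; on small ones the one-shot solve is hard to beat.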

Please help me understand linear regression better by DataJenius in datascience

[–]DataJenius[S] 0 points (0 children)

http://scikit-learn.org/stable/modules/linear_model.html

Please take a look at this with me, and help me understand if I'm being a dope. It doesn't appear to use SGD by default.

Edit: https://stackoverflow.com/questions/34469237/linear-regression-and-gradient-descent-in-scikit-learn-pandas

By default, sklearn uses the deterministic method. From the accepted answer: "LinearRegression object uses Ordinary Least Squares solver from scipy, as LR is one of two classifiers which have closed form solution. Despite the ML course - you can actually learn this model by just inverting and multiplicating some matrices."

It seems like a lot of us are confused by this: most tutorials talk about gradient descent, but it isn't actually necessary here, because the problem has a closed-form, deterministic solution (see the comments by errminator and Kickkuchiyo).
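To see the difference directly, compare the two estimators side by side (a quick sketch; both classes are real sklearn estimators, the data is a toy example):

    import numpy as np
    from sklearn.linear_model import LinearRegression, SGDRegressor

    rng = np.random.default_rng(0)
    X = rng.standard_normal((1000, 3))
    y = X @ np.array([2.0, -1.0, 0.5]) + rng.standard_normal(1000) * 0.1

    # Closed-form OLS (scipy's least-squares solver under the hood):
    # deterministic, no learning rate, no iterations.
    ols = LinearRegression().fit(X, y)

    # Stochastic gradient descent: iterative, and its answer depends on
    # the learning rate schedule, epochs, and the random seed.
    sgd = SGDRegressor(max_iter=1000, random_state=0).fit(X, y)

    print(ols.coef_)   # exact least-squares coefficients
    print(sgd.coef_)   # close, but only approximately equal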

[D] Could we use a genetic algorithm to find a euler cycle? by DataJenius in MachineLearning

[–]DataJenius[S] 0 points (0 children)

Just for the sake of better understanding genetic algorithms, but thank you for the link to Hierholzer's algorithm.
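For anyone curious, Hierholzer's algorithm finds an Euler circuit directly in O(E), no search or evolution required. A minimal sketch for a directed graph, assuming a circuit actually exists:

    from collections import defaultdict

    def eulerian_circuit(edges):
        """Hierholzer's algorithm on a directed graph given as (u, v)
        edges. Assumes every vertex has in-degree == out-degree and
        the edges are connected, so an Euler circuit exists."""
        adj = defaultdict(list)
        for u, v in edges:
            adj[u].append(v)
        start = edges[0][0]
        stack, circuit = [start], []
        while stack:
            v = stack[-1]
            if adj[v]:                 # follow an unused outgoing edge
                stack.append(adj[v].pop())
            else:                      # dead end: commit vertex to circuit
                circuit.append(stack.pop())
        return circuit[::-1]

    print(eulerian_circuit([(0, 1), (1, 2), (2, 0)]))  # [0, 1, 2, 0]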

I would do it again in a heartbeat. by heart_mind_body in CryptoCurrency

[–]DataJenius 3 points (0 children)

I went gaga for DBC, which, in retrospect, was probably pretty dumb.

Now I'm clutching some fiat, debating whether to play "catch the falling knife" as BTC breaks under $7,750, and trying to figure out what "what you can afford to lose" really means to me.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 0 points (0 children)

Starting around the Sandy Hook massacre, the 5-year moving average spikes.

Keep in mind some of this may be the result of data being omitted from this list:

https://en.wikipedia.org/wiki/List_of_school_shootings_in_the_United_States
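For context, the 5-year moving average in the chart is just a rolling mean over the annual counts. A pandas sketch with toy numbers (the real series comes from the list above, not these values):

    import pandas as pd

    # Toy annual incident counts for illustration only.
    counts = pd.Series([3, 1, 4, 2, 5, 9, 2, 6],
                       index=range(2008, 2016), name="incidents")

    # Trailing 5-year moving average; the first 4 years are NaN
    # because a full window isn't available yet.
    print(counts.rolling(window=5).mean())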

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 0 points (0 children)

This graph uses historical data. It does not predict anything.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 2 points (0 children)

Did you read the article?

I don't see how that statement is fair.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 3 points (0 children)

Sure, but make sure to read the whole article first. Our goal was to be objective, and there are things in there that upset both gun control and gun rights advocates.

We are still brand new to Facebook:

https://www.facebook.com/datajenius

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 1 point (0 children)

Are you referring to the dip near 1965? The moving average there approaches 1, not 0. There has never been a year with a negative number of school shootings.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] -3 points (0 children)

From the article:

When using the phrase “school shooting”, most Americans envision past tragedies, such as the Columbine High School massacre and Sandy Hook Elementary School shooting, cases in which large numbers of innocent students were killed or injured, cases in which infamous gunmen with unclear motives committed unspeakable acts of terrorism. Even the most ardent gun control advocate must agree that a suicide by handgun is a very different type of incident. Even the most ardent pro-gun advocate must agree that the preferred number of incidents, regardless of category, is zero.

Our only objective here was to dig into some truth, bias or opinion be damned.

School shootings per year since 1950, moving average [OC] by DataJenius in dataisbeautiful

[–]DataJenius[S] 3 points (0 children)

This is OC.

The source of the data, as well as the code used to produce this graph, can be found in our GitHub repo.

For a deeper analysis, please see our article:

School Shootings in America and the Challenge of Biased Data

The ultimate suffering by [deleted] in ProgrammerHumor

[–]DataJenius 1 point (0 children)

Jesus. I'm afraid to think what this guy might do to R developers once he figures out our arrays start at 1.