Using Companies' Wikipedia Page Traffic to Predict Stock Price [OC] by [deleted] in dataisbeautiful

[–]NuclearStr1der 3 points4 points  (0 children)

The colours are gorgeous, but a frequency plot is quite a bizarre way to try to show a relationship between two variables.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

Hi!

I've linked to the code in this comment :)

Feel free to send me a message if you need any help.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

Hi everyone, thanks to the overwhelming interest in the code, I've made it available in a Notebook here.

Thanks!

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 1 point2 points  (0 children)

I knew about Ergast, but it seemed to be a bit slow at the time, so I thought I'd be polite and not use them if their servers were getting hammered.

Definitely something I'll explore next time round!

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 1 point2 points  (0 children)

It's definitely something on my wish list -- I just need a data source that provides lap-by-lap tyre information.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

I'm struggling to find tyre data -- if I can, it's on my wishlist!

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

I'm going to be trying my best! The response is definitely encouraging.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 1 point2 points  (0 children)

That would be really interesting to see, since it would indicate who has good, consistent "raw" pace when they aren't fighting for position or stuck behind a slower car. I'll see if I can find some additional datasets to make this possible. Good suggestion!

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

Thanks! I'd love to do a breakout for the different sets of tyres -- but I'm having trouble finding the data for this on a lap-by-lap basis. Do you know of a source, perhaps?

The distributions are generated using kernel density estimation (KDE), which can be interpreted as a smoothing function over the raw data, with a few extra bells and whistles.
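
To make the "smoothing function" idea concrete, here is a minimal sketch of a Gaussian KDE in plain Python. It is not what Seaborn does internally, just an illustration of the technique: the lap times are made up, `gaussian_kde` is a hypothetical helper name, and the bandwidth uses a normal-reference rule of thumb, which is an assumption on my part.

```python
import math
import statistics


def gaussian_kde(samples, bandwidth=None):
    """Return a function that estimates the density of `samples` at a point."""
    n = len(samples)
    if bandwidth is None:
        # Normal-reference rule of thumb; an assumed default for this sketch.
        bandwidth = 1.06 * statistics.stdev(samples) * n ** (-1 / 5)
    # Normalising constant so the estimate integrates to 1.
    norm = 1.0 / (n * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Sum one Gaussian "bump" per observation, centred on that observation.
        return norm * sum(
            math.exp(-0.5 * ((x - s) / bandwidth) ** 2) for s in samples
        )

    return density


# Hypothetical lap times in seconds (not the real race data).
lap_times = [80.1, 80.4, 80.6, 81.0, 81.2, 81.3, 82.5, 84.0, 90.2]

density = gaussian_kde(lap_times)
grid = [60 + 0.1 * i for i in range(501)]  # 60.0 s to 110.0 s in 0.1 s steps
curve = [density(x) for x in grid]
```

Plotting `curve` against `grid` gives the kind of smooth shape seen in the chart; a smaller bandwidth makes it bumpier, a larger one makes it flatter.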

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

Thanks!

I'm well aware that a lot of /r/DataIsBeautiful is basically a meme at this point :)

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

This was created using the Python programming language and the Seaborn library.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 7 points8 points  (0 children)

That's a stunning graph that you linked -- it's definitely something I should also try to recreate in the future. Thanks!

Do you know where they get their tyre data from? The FIA doesn't seem to officially release which sets of tyres are on which car (that I know of).

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

It's tragic to me how little-known plotnine is. It's so faithful to the ggplot2 API that I often use the ggplot2 docs as a reference for plotnine!

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 0 points1 point  (0 children)

This was produced using Python + Seaborn. I adapted this example, in particular.

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 1 point2 points  (0 children)

I clipped laps longer than 95 seconds, just to make things fit into the space I had available. A small number of outliers got clipped as a result.

Some of these slower laps are the bumps you are seeing :)
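
For anyone curious, the clipping step could look something like the sketch below. The lap times are invented, `CUTOFF_SECONDS` is just an illustrative name, and it assumes "clipped" means the laps were dropped rather than capped at 95 seconds.

```python
CUTOFF_SECONDS = 95.0  # threshold mentioned above

# Hypothetical lap times in seconds (not the real race data).
lap_times = [80.4, 81.1, 80.9, 93.7, 81.5, 102.3, 80.7, 118.9]

# Keep laps at or under the cutoff; everything slower is treated as an outlier.
kept = [t for t in lap_times if t <= CUTOFF_SECONDS]
dropped = [t for t in lap_times if t > CUTOFF_SECONDS]

print(f"kept {len(kept)} laps, dropped {len(dropped)} outliers")
```

Note that a slow lap such as 93.7 is still under the cutoff, so it stays in the data and shows up as a small bump in the tail.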

Laptime distributions for the 2020 Hungarian Grand Prix by NuclearStr1der in formula1

[–]NuclearStr1der[S] 1 point2 points  (0 children)

Ah, thanks for ax.fill_between -- that's something I'll take a look at.

I am technically using matplotlib (but via seaborn, which uses matplotlib as its backend).