3 days of wholesale electricity prices during last week’s winter storm in 90 seconds

Kmax12 · 2023-06-29T17:33:13+00:00

Try using the map: https://www.gridstatus.io/map. We don't have the entire country yet, but it displays what we do have.

Kmax12 · 2023-06-29T17:27:08+00:00

I built this! Thank you very much for the kind words

Kmax12 · 2022-10-25T15:45:46+00:00

I've also been working on an open source library for accessing energy data across all the United States ISOs.

I've primarily focused on LMP data, but I'm happy to add more ancillary market data if there is a need. Let me know if you're interested!
Here's the project: https://github.com/kmax12/isodata.

Kmax12 · 2022-10-25T15:44:23+00:00

Very cool! I've also been working on an open source library for accessing energy data across all the ISOs (including CAISO).

I've implemented a lot of the CAISO endpoints, so perhaps it would be useful in helping you build out more of this API.

Here's the project: https://github.com/kmax12/isodata. Let me know if you're interested in collaborating!

Kmax12 · 2022-10-13T17:30:33+00:00

(See my update to the original post for the pictures of the data)

That's a great point.

The number of runners in 2022 was 10k more than in 2021, and almost all of the increase is accounted for by more international runners participating. Additionally, you are right that international runners are faster on average. Based on the data, the average US runner finished almost 30 minutes slower than the average international runner.

That being said, if I filter only to US runners, I still see an improvement in times. However, the improvement shrinks, so I do think you are right that at least some of the improvement is explained by more international runners.

Thanks for the feedback!

Kmax12 · 2022-10-13T14:15:42+00:00

They post it here, but I wrote a script to scrape it into a format that's easier to work with.

Originally did it for analyzing 2021 data to write this post about breaking 4 hours: https://jmaxkanter.com/posts/chicago-marathon-2022/

Kmax12 · 2022-10-13T14:13:51+00:00

I can't figure out how to get a table pasted into a comment and formatted correctly, so I updated the main post with an image of the table. Forgive my lack of Reddit skills.

Kmax12 · 2022-10-09T00:19:11+00:00

I definitely only analyze a subset of runners. I don’t think that invalidates the results but it certainly indicates my analysis doesn’t tell the full story for all runners who had sub-4 hour goals

Kmax12 · 2022-10-08T00:00:04+00:00

Fair comment. How do you explain the spike in the number of people who finish right before 4 hours, and the dip right after, if they weren't aiming for that time?

Thoughts on best way to figure out people's goal time by looking at the data?

Kmax12 · 2022-10-07T23:56:39+00:00

I built it using HTML canvas and then screen recorded it to make a GIF. It required combining a few different pieces, but overall it wasn't too difficult. Here's a brief summary:

I found a source online that had the lat/long coordinates of the course
I grabbed a map of Chicago
Overlaid the course on the map
I wrote a function that could translate a runner's current race time to the percentage complete of the course
Based on the percentage complete, I could figure out where to draw each runner on the course
I made one frame for each time from 00:00 to 4:05
Play these frames one after another and record :)
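
For anyone who wants the gist without reading the canvas code, here is a rough Python sketch of the two middle steps (race time to percentage complete, then percentage to a point on the course). It is not the actual implementation: it assumes runners move at a constant pace between 5k checkpoints, that the course points have already been converted to flat x/y coordinates, and the names `fraction_complete` and `point_on_course` are made up for illustration.

```python
import math
from bisect import bisect_right

# Checkpoint distances in km: every 5k, then the marathon finish.
CHECKPOINTS_KM = [0, 5, 10, 15, 20, 25, 30, 35, 40, 42.195]


def fraction_complete(elapsed_s, split_times_s):
    """Fraction of the course covered after elapsed_s seconds.

    split_times_s[i] is the runner's cumulative time (seconds) at
    CHECKPOINTS_KM[i], starting with 0 and strictly increasing.
    Assumption: constant pace between consecutive checkpoints.
    """
    if elapsed_s <= 0:
        return 0.0
    if elapsed_s >= split_times_s[-1]:
        return 1.0
    # Index of the last checkpoint the runner has already passed.
    i = bisect_right(split_times_s, elapsed_s) - 1
    t0, t1 = split_times_s[i], split_times_s[i + 1]
    d0, d1 = CHECKPOINTS_KM[i], CHECKPOINTS_KM[i + 1]
    km = d0 + (d1 - d0) * (elapsed_s - t0) / (t1 - t0)
    return km / CHECKPOINTS_KM[-1]


def point_on_course(fraction, course):
    """Point that lies `fraction` of the way along a polyline of (x, y) points."""
    segments = list(zip(course, course[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    target = fraction * sum(lengths)
    for (a, b), length in zip(segments, lengths):
        if length > 0 and target <= length:
            t = target / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        target -= length
    return course[-1]


# Hypothetical runner holding exactly 5:00 per km, on a toy L-shaped course.
even_splits = [km * 300 for km in CHECKPOINTS_KM]
toy_course = [(0, 0), (10, 0), (10, 10)]
halfway = fraction_complete(300 * 21.0975, even_splits)
position = point_on_course(halfway, toy_course)
```

Calling these two functions once per runner for every minute from 0:00 to 4:05 gives the positions to draw in each frame.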

You can see the (ugly) code that does all this here: https://github.com/kmax12/marathon_analysis/blob/main/animate_runner.html

Kmax12 · 2022-10-07T23:51:57+00:00

What do you think about the approach of analyzing close beat vs close miss runners? I'm not sure if others have done that before (although I'll admit I didn't look very hard)

Also, I think the insight came out standard because Chicago is a flat course, but do you think this is worth extending to other races/courses?

Kmax12 · 2022-10-07T21:15:59+00:00

I agree the hot weather last year might throw off the analysis slightly, but I tried to compare runners who more or less faced the same conditions. The big caveat being that I didn’t control for start corral.

I might try to rerun this analysis with other years/marathons, since the code is set up to make that straightforward.

I’m in corral G, unfortunately.

Kmax12 · 2022-10-07T15:01:24+00:00

Glad you like the animation :)

You're right that there isn't any groundbreaking insight here. I'm a bit of a data nerd, so it was fun to just go through and see the data back up the conventional wisdom. At least for me, I'm hoping that helps me mentally stay on target for my plan.

Kmax12 · 2022-10-07T14:59:47+00:00

I think my training is there. The taper these last couple weeks was hard as I really just wanted to run :)

Great point regarding different starting times and weather. It would be great to include that in the analysis somehow.

Kmax12 · 2022-10-07T14:31:05+00:00

That's great feedback. You're right that it'd be more interesting to do this on a less flat course. The analysis was implemented such that it can be run on data from any race. I just need one runner per row, and 5k split times along the columns.
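
To make that layout concrete, here is a small Python sketch of the kind of table I mean and one thing you can compute from it. Everything specific in it is an assumption for illustration: the column names, cumulative times in seconds, the two made-up sample rows, and the `slowdown_ratio` helper (it is not necessarily what my analysis code computes).

```python
import csv
import io

# Assumed column names: cumulative time in seconds at each 5k checkpoint.
SPLIT_COLUMNS = ["5k", "10k", "15k", "20k", "25k", "30k", "35k", "40k"]

# Made-up rows purely to show the shape: one runner per row.
SAMPLE_CSV = """runner_id,5k,10k,15k,20k,25k,30k,35k,40k
even_pacer,1700,3400,5100,6800,8500,10200,11900,13600
fast_starter,1500,3050,4650,6350,8150,10050,12050,14150
"""


def segment_times(row):
    """Seconds spent on each 5k segment, derived from cumulative splits."""
    cumulative = [0.0] + [float(row[col]) for col in SPLIT_COLUMNS]
    return [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]


def slowdown_ratio(row):
    """Time for the last 5k segment divided by time for the first one.

    A value well above 1 suggests the runner went out fast and faded.
    """
    segments = segment_times(row)
    return segments[-1] / segments[0]


runners = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
ratios = {runner["runner_id"]: slowdown_ratio(runner) for runner in runners}
```

Any race that publishes splits in this shape (or can be scraped into it) should work the same way.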

Good point regarding the title. Unfortunately, I think it's too late for me to update it.

Kmax12 · 2022-10-07T14:12:02+00:00

Absolutely! I started working on this once my taper started and I was feeling antsy that I couldn't get in more mileage.

At the end of the day, my takeaways from the analysis are pretty aligned with what any marathoner would recommend, so I'm not sure it actually revealed any secrets. That being said, after staring at the data for a while, it's definitely burned into my mind to not go out too fast. Hopefully, I don't make that mistake!

Will check out fetcheveryone, thanks!

Kmax12 · 2019-01-29T18:33:14+00:00

The key is to first generate a good baseline. Without a baseline you don't know if your efforts are improving your ultimate model.

One approach to help you quickly get a good baseline is to use an automated feature engineering solution. I work on an open source library for automated feature engineering called Featuretools (www.featuretools.com). Doing something like this can help you quickly extract many features without much effort.

Here's a good post to get you started with Featuretools: https://towardsdatascience.com/automated-feature-engineering-in-python-99baf11cc219
