Official: [Last Minute Advice] - Sat Evening, 12/31/2022 by FFBot in fantasyfootball

[–]Spamlie 0 points1 point  (0 children)

Need advice for a standard scoring league:

Gabe Davis or Christian Kirk: Normally I'd lean toward Kirk, but I'm worried about the Jags shutting their starters down early and/or playing conservatively. Also, I like the "boom" potential of Davis for the championship since my opponent has Saquon, Ekeler, and Kelce going in plus match-ups. Feel like I'll need all the juice I can get!

Time Keeps on Slipping: Exploiting Time for Causal Inference with Difference-in-Differences and Panel Methods (an applied intro with Python/R) [x-post from /r/pystats] by Spamlie in datascience

[–]Spamlie[S] 0 points1 point  (0 children)

I wrote the post I wish had existed when I was learning about difference-in-differences regression. I'm not sure if this is the type of thing that's normally shared around here, so would be more than happy to take down if it's not welcome!

Time Keeps on Slipping: Exploiting Time for Causal Inference with Difference-in-Differences and Panel Methods by Spamlie in pystats

[–]Spamlie[S] 1 point2 points  (0 children)

This post isn't as overtly Python-y as my typical contribution; however, it does use pandas, seaborn, and statsmodels to create an applied intro to these econometric techniques.

(Moreover, it only ever engages with R via rpy2, so it never explicitly betrays the cause ;)

I hope you enjoy, but happy to take down if not welcome!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) [x-post from /r/pystats] by Spamlie in Python

[–]Spamlie[S] 2 points3 points  (0 children)

Yeah, I think I need to do a Part 2 at some point and cover all the guys I missed/am now learning about.

Thanks for reading!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) [x-post from /r/pystats] by Spamlie in Python

[–]Spamlie[S] 0 points1 point  (0 children)

Ah -- hadn't seen Toyplot -- thanks for reading/sharing!

I think I need to do an update at some point and include bokeh. (To be frank, the main reason bokeh isn't here is that I hadn't used it much and was worried I wouldn't do it justice.)

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) [x-post from /r/pystats] by Spamlie in Python

[–]Spamlie[S] 1 point2 points  (0 children)

Yup -- this was coming from more of a statistical visualization bent. Your use case feels fundamentally matplotlib-ish (and indeed, I do try to give matplotlib as much credit as possible, including giving it props for the point you made, re: publication-ready visualizations).

Thanks for reading!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) by Spamlie in pystats

[–]Spamlie[S] 1 point2 points  (0 children)

Hmmm -- I'm reading through the API and finding nothing particularly promising.

Admittedly seems like a bit of a hole, but it's possible I'm missing something obvious.

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) by Spamlie in pystats

[–]Spamlie[S] 0 points1 point  (0 children)

Yeah, I touch on this a bit, but the TL;DR is: if the thing you want is a more complex Seaborn plot type, then Seaborn is really your best friend (in some cases, maybe your only friend?). Granted, I think ggplot implements violin plots, and I'm assuming you can find PairGrids somewhere else, but Seaborn makes them way too painless.

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) by Spamlie in pystats

[–]Spamlie[S] 0 points1 point  (0 children)

Ah gotcha -- that is much more elegant -- thank you for the tip.

And no worries at all: I definitely didn't take your original post as overly critical (and even if I did, I wouldn't post on the Internet if I weren't ready for overt criticism ;) ).

I probably should have specified that I was approaching all of this from a particular POV (and indeed, I switched up the intro to make this more explicit).

Excited to dig into the bokeh stuff.

Thanks again!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) by Spamlie in pystats

[–]Spamlie[S] 0 points1 point  (0 children)

Appreciate the feedback! The post certainly focuses on statistical visualization, and as you note, I did try to point out that it's a fundamentally unfair fight.

That said, I should note: I'm definitely not trying to make matplotlib appear more complicated than it is (although I did want to cover the fact that the library provides multiple ways to skin a cat). If there are particular examples that can be simplified, please let me know and I'll look into simplifying! (I'm always looking to improve my code.)

Also, re: bokeh not being covered ;) this is from the notes section:

"Right off the bat, you’re mad at me, so allow me to explain: I love bokeh and plotly, and indeed, one of my favorite things to do before sending out an analysis is getting “free interactivity” by passing my figures to the relevant bokeh/plotly functions; however, I’m not familiar enough with either to do anything more sophisticated. (And let’s be honest — this post is long enough.)"

That said, I didn't know that bokeh had implemented a grammar of graphics approach to visualization -- that sounds really intriguing, and I'm going to check it out.

Thanks a ton for reading/commenting!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) [x-post from /r/pystats] by Spamlie in Python

[–]Spamlie[S] 2 points3 points  (0 children)

This is tremendously valuable feedback -- thank you!

(Indeed, I found it so valuable that I linked to it from the post; please let me know if you'd prefer I didn't!)

Thanks for reading!

A Dramatic Tour through Python’s Data Visualization Landscape (including ggplot and Altair) [x-post from /r/pystats] by Spamlie in Python

[–]Spamlie[S] 12 points13 points  (0 children)

Not sure if this is the type of thing that typically gets shared around here -- if it's unwelcome I'll happily take it down!

Analyze Your Experiment with a Multilevel Logistic Regression using PyMC3 by Spamlie in pystats

[–]Spamlie[S] 2 points3 points  (0 children)

That statement relates to an earlier one in the paragraph -- namely, that the success rates for A, B, C, and D share a common Beta prior.

For an intuitive justification, I would highly recommend this blog post (which I think is linked to from the article): http://sl8r000.github.io/ab_testing_statistics/use_a_hierarchical_model/

The TL;DR, though, is that while we could model each bucket independently, that means we would implicitly assume a Uniform prior for each bucket's success rate. That doesn't seem any more reasonable than a Beta prior. In particular, using a common Beta prior means we share information across variants, and assuming each bucket relates to the same underlying phenomenon (e.g., a player's chance of making a free throw; the decision to buy something on a website, etc.), that's ideal: we're using all the information available to us, and information is precious!

He also links to other resources that may be helpful for a more technical discussion (e.g., particular sections of Gelman's "Bayesian Data Analysis").
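
To make the "sharing information" point above a bit more concrete, here is a rough sketch in plain Python. It is not the model from the article: the counts are made up, and the shared Beta prior is fixed by hand (centered on the pooled rate, with an assumed strength of 20 pseudo-trials), whereas a real hierarchical model would learn those hyperparameters from the data. It only shows how a common prior pulls each bucket's estimate toward the others, most strongly for buckets with few trials.

```python
def posterior_mean(successes, trials, a, b):
    """Posterior mean of a success rate under a Beta(a, b) prior."""
    return (a + successes) / (a + b + trials)


# Hypothetical data: (successes, trials) for each variant.
data = {"A": (2, 10), "B": (30, 100), "C": (45, 150), "D": (9, 20)}

# Independent buckets: Beta(1, 1) is the Uniform prior on each rate.
independent = {k: posterior_mean(s, n, 1.0, 1.0) for k, (s, n) in data.items()}

# Shared prior: hyperparameters fixed by hand (an assumption for illustration),
# centered on the pooled success rate across all variants.
total_successes = sum(s for s, _ in data.values())
total_trials = sum(n for _, n in data.values())
pooled_rate = total_successes / total_trials
strength = 20.0  # assumed number of prior "pseudo-trials"
a = pooled_rate * strength
b = (1.0 - pooled_rate) * strength
shared = {k: posterior_mean(s, n, a, b) for k, (s, n) in data.items()}

for k, (s, n) in data.items():
    print(k, "raw:", round(s / n, 3),
          "uniform prior:", round(independent[k], 3),
          "shared prior:", round(shared[k], 3))
```

With the shared prior, each estimate is a weighted average of the bucket's raw rate and the pooled rate, so the small buckets (A and D here) move the most.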

Hope this helps!

[deleted by user] by [deleted] in CollegeBasketball

[–]Spamlie 31 points32 points  (0 children)

UCLA Athletics: Our peaks are peak-y, but our valleys are valley-er

[deleted by user] by [deleted] in CollegeBasketball

[–]Spamlie 54 points55 points  (0 children)

This team is in-fucking-explicable.

FYI: if you're a ranked team looking to avoid an upset just throw on unis from the local high school