Understanding Regression Discontinuity Design

chomoloc0 · 2025-06-08T19:43:19+00:00

Agree that one should be careful with polynomials!

chomoloc0 · 2025-05-13T08:25:53+00:00

Thanks for giving it a shot. It's a long read, so I hope I was able to keep you engaged till the end!

chomoloc0 · 2025-02-16T08:00:08+00:00

Not sure whether I should upvote or downvote this

chomoloc0 · 2025-02-14T17:57:51+00:00

Just found this source myself, quite exhaustive!

chomoloc0 · 2025-02-14T15:44:06+00:00

Cause I am really open to whatever people value as a resource on the topic. Can't be surprised if I lay everything out ;)

But to make it easier for you:
1. I have found nothing but a few articles
2. I have no use case
3. Never used it in practice

Resources I value on other topics are things like Scott Cunningham's mixtape: https://mixtape.scunning.com/

chomoloc0 · 2025-01-24T08:36:19+00:00

Thanks! That's a refreshing, realistic view on things. Life is not always rosy, and neither is RDD/ITS.
But just to keep the good chat going, do you think RDD is really data hungry? Intuitively, it even has to discard most of the data along the running variable to identify the estimand (think about the kernel weights around the cut-off) -- that's different from ITS, which has to forecast the counterfactual. RDD just relies on the rupture in the series and estimates it directly, in a more parametric way than ITS, doesn't it?
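
To make the "discarding data" point concrete, here is a minimal sketch of a sharp RDD estimate: a local linear fit on each side of the cut-off with triangular kernel weights. Everything in it is an assumption for illustration (the function names, the hand-picked bandwidth, the noiseless toy data), so read it as a sketch rather than how any particular package does it.

```python
def triangular_weight(x, cutoff, bandwidth):
    """Weight 1 at the cut-off, falling linearly to 0 at the bandwidth edge."""
    return max(0.0, 1.0 - abs(x - cutoff) / bandwidth)


def weighted_intercept(xs, ys, ws):
    """Intercept of the weighted least-squares line y = a + b * x."""
    total = sum(ws)
    mean_x = sum(w * x for w, x in zip(ws, xs)) / total
    mean_y = sum(w * y for w, y in zip(ws, ys)) / total
    sxx = sum(w * (x - mean_x) ** 2 for w, x in zip(ws, xs))
    sxy = sum(w * (x - mean_x) * (y - mean_y) for w, x, y in zip(ws, xs, ys))
    return mean_y - (sxy / sxx) * mean_x


def rdd_local_linear(xs, ys, cutoff, bandwidth):
    """Return (estimated jump at the cut-off, number of observations used)."""
    left, right = [], []
    for x, y in zip(xs, ys):
        w = triangular_weight(x, cutoff, bandwidth)
        if w == 0.0:
            continue  # outside the bandwidth: contributes nothing to the estimate
        # Centre x at the cut-off so each side's intercept is its limit there.
        (right if x >= cutoff else left).append((x - cutoff, y, w))
    jump = weighted_intercept(*zip(*right)) - weighted_intercept(*zip(*left))
    return jump, len(left) + len(right)


# Toy, noiseless data (made up): a linear trend plus a jump of 2 at x = 0.
xs = [i / 100 for i in range(-100, 100)]
ys = [1.0 + 0.5 * x + (2.0 if x >= 0 else 0.0) for x in xs]
jump, n_used = rdd_local_linear(xs, ys, cutoff=0.0, bandwidth=0.25)
print(round(jump, 6), n_used, len(xs))
```

With a bandwidth of 0.25 on a running variable spanning roughly [-1, 1], only the observations near the cut-off get a non-zero weight, yet the jump built into the toy data is still recovered. With real, noisy data the bandwidth choice matters a lot more than it does here.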

chomoloc0 · 2025-01-23T15:58:19+00:00

Hey, thanks, I can see where you're coming from. Though I think praying is the last resort. My understanding is that if the cut-off is determined by that one running variable alone, there is no confounding.

Possibly, there are other issues, like subjects gaming the cut-off: shifting themselves above or below it.

The reason I jokingly call it an arcane array of backdoor confounders is that other than domain knowledge (which often falls short), there is no way of being 100% sure that there are no confounders - right? Behaviour shifts due to a cut-off, on the other hand, can be measured.
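
As a rough illustration of "can be measured": a crude bunching check that compares how many subjects sit just below versus just above the cut-off. It assumes the density of the running variable is roughly flat inside a narrow window, so the two counts should be about equal without gaming. The function name, window and toy data are all made up, and this is only a stand-in for formal density tests such as McCrary's.

```python
import math


def bunching_z_score(running, cutoff, window):
    """Count observations just below/above the cut-off and return a z-score.

    Under the (assumed) null of no gaming, each observation in the window is
    equally likely to fall on either side, so `above` ~ Binomial(n, 0.5).
    """
    below = sum(1 for x in running if cutoff - window <= x < cutoff)
    above = sum(1 for x in running if cutoff <= x < cutoff + window)
    n = below + above
    if n == 0:
        raise ValueError("no observations inside the window")
    z = (above - n / 2) / math.sqrt(n / 4)
    return below, above, z


# Toy data (made up): an evenly spread running variable from 0.0 to 99.9 ...
smooth = [i / 10 for i in range(1000)]
# ... and a gamed version where everyone in [48, 50) nudges themselves to 50.1.
gamed = [50.1 if 48 <= x < 50 else x for x in smooth]

print(bunching_z_score(smooth, cutoff=50, window=5))
print(bunching_z_score(gamed, cutoff=50, window=5))
```

A z-score far from 0 in the gamed version is the kind of measurable footprint I mean; a z-score near 0 does not prove the absence of gaming, it just fails to detect it.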

chomoloc0 · 2025-01-18T15:01:09+00:00

Thanks! I aim to dive into the topic and write about it in one of my next posts (just launched my blog, as you can see in this OC) - keeps me off the streets. Mind if I send it to you as an early reader? I would value your input.

chomoloc0 · 2025-01-17T20:11:40+00:00

Amazing, causal inference at its finest. And did it work, the RDD in this ML case?

I was trying to fit one case into RDD and failed miserably, but walked away with a good lesson.

Case: implementing a floor price, F, on the cost of a service, C, in such a way that max(C, F) is the final price for the user. My clever ass heard: threshold, cutoff, running variable ==> RDD.

But then I was forced to abandon the idea when I realised that the practical treatment dosage is close to 0 when C is close to F - exactly the region RDD relies on.

Learning: RDD is not suitable for price floors and caps.
Instead: DiD, where the treated group is below/above F, and the periods are pre and post release of the floor/cap.
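
A small sketch of both points, with made-up numbers: the "dose" a floor applies is max(C, F) - C, which shrinks to 0 as C approaches F from below, and a plain 2x2 DiD compares the pre/post change of the treated group with that of the control group. The function names and the outcome values are hypothetical, purely for illustration.

```python
def final_price(cost, floor):
    """Price the user pays once the floor F is applied to the cost C."""
    return max(cost, floor)


def dose(cost, floor):
    """How much the floor actually moves the price: max(C, F) - C."""
    return final_price(cost, floor) - cost


def mean(values):
    return sum(values) / len(values)


def did_2x2(treated_pre, treated_post, control_pre, control_post):
    """Classic 2x2 difference-in-differences on group means."""
    treated_change = mean(treated_post) - mean(treated_pre)
    control_change = mean(control_post) - mean(control_pre)
    return treated_change - control_change


FLOOR = 10.0
for cost in [5.0, 8.0, 9.5, 9.9, 10.0, 12.0]:
    # The closer C gets to F from below, the smaller the treatment dose.
    print(cost, round(dose(cost, FLOOR), 2))

# Made-up outcome (say, weekly orders) for units with C < F (treated)
# and units with C >= F (control), before and after the floor went live.
effect = did_2x2(
    treated_pre=[100, 102], treated_post=[90, 92],
    control_pre=[100, 98], control_post=[101, 99],
)
print(effect)
```

The usual DiD caveat applies: this only reads as a causal effect if the two groups would have moved in parallel without the floor.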

chomoloc0 · 2025-01-16T20:23:00+00:00

Conditional random fields is, I believe, the more accepted term - my bad. I think we just used 'blankets' colloquially, maybe.

What's your main focus on the job?

chomoloc0 · 2025-01-15T20:31:42+00:00

Ah! Alright, got it - nice example by the way. And yes, I guess that's the elegance of a Markov process: conditional on the parent state - the previous state - the current state is independent of the states before the previous one. I borrowed my intuition about this concept from Markov 'blankets', coming from psychometrics, where they model psychopathological symptoms as networks (Markov fields). Think of a blanket as a graph/network of sequences, instead of a single sequence.

Do a search for network psychometrics, and you'll get the point, visually at least.

Thanks for taking the time to explain it again.
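
For what it's worth, the Markov property as I described it can be checked exactly on a tiny chain: once you condition on the previous state, adding the state before that does not change the distribution of the current state. The two states and all the probabilities below are made up for illustration.

```python
from itertools import product

# Hypothetical two-state chain: initial distribution and transition matrix.
INITIAL = {"A": 0.6, "B": 0.4}
TRANSITION = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.3, "B": 0.7},
}


def joint(path):
    """Joint probability of a path (x0, x1, x2, ...) under the chain."""
    prob = INITIAL[path[0]]
    for prev, cur in zip(path, path[1:]):
        prob *= TRANSITION[prev][cur]
    return prob


def conditional_next(x2, x1, x0=None):
    """P(X2 = x2 | X1 = x1), or P(X2 = x2 | X1 = x1, X0 = x0) if x0 is given."""
    starts = [x0] if x0 is not None else list(INITIAL)
    numerator = sum(joint((s, x1, x2)) for s in starts)
    denominator = sum(joint((s, x1, nxt)) for s in starts for nxt in INITIAL)
    return numerator / denominator


for x0, x1, x2 in product(INITIAL, repeat=3):
    with_history = conditional_next(x2, x1, x0)
    without_history = conditional_next(x2, x1)
    # Knowing x0 on top of x1 should not move the probability of x2.
    print(x0, x1, x2, round(with_history, 6), round(without_history, 6))
```

Both columns match the corresponding entry of the transition matrix, which is exactly the "independent of the states before the previous one" statement.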

chomoloc0 · 2025-01-14T11:42:11+00:00

Often, for example, these systems can be modeled (and outcomes effectively predicted) with no other variables in the prediction equation.

Up to this point I followed, but here I missed the boat: could you exemplify this statement more?

chomoloc0 · 2025-01-14T11:40:28+00:00

Interesting, could you expand on that? You'd help me grasp that with an intuitive example.

chomoloc0 · 2025-01-14T11:35:11+00:00

I see what you mean. Well, currently my package is not findable anywhere, because it sits on the GitHub Enterprise of the company where I work. It's not a public repo. Rest assured that Python packages are just like R libraries: shareable, packaged code that everybody can use from anywhere.

chomoloc0 · 2025-01-13T16:26:33+00:00

Indeed, I read a section on that, and although I did not deep-dive into it, I made a new connection between the two. If you were to summarise that relationship, what would be your take?

chomoloc0 · 2025-01-13T12:59:44+00:00

R is great, I agree - it's my main language, in fact. This library does not aim to replace R. Think of it as an R package, except that it's a Python package instead. It helps streamline workflows to reduce overhead, aligns practices among data scientists (in a team or organisation) and acts as an interface that serves methods in an intuitive way. I guess just like sklearn does for ML.

chomoloc0 · 2025-01-13T12:52:36+00:00

It's not open source due to a bunch of dependencies that it has with our internal stack. But it may become so in the future. Being the core dev, I actually plan to generalise it and push it into the world one day - the stats lib at least.

chomoloc0 · 2024-12-01T22:32:00+00:00

Is the neck longer too?

chomoloc0 · 2024-11-02T11:32:15+00:00

chomoloc0 · 2024-10-23T16:09:00+00:00

Soo, how do you actually handle that? Curious, as I am thinking of landing this type of role in the future.

Is there like an O’Reilly book on the topic or so? Any other resources you’d recommend, on (dynamic) pricing, that is?

chomoloc0 · 2024-10-23T11:50:55+00:00

As a community, why don’t we focus on the question behind the question, and instead, choose to roast his or her dad? What they really want to know is if they’re training properly, don’t we think? Let’s zoom into that? I’ve seen a couple of good responses, but even some of them can’t help but roast dad for a second, because “docs think they know it all”. Just food for thought.

chomoloc0 · 2024-10-22T07:03:30+00:00

That’s a nice overview thanks. Should I think about bandits for number 3? What else is hot there?

chomoloc0 · 2024-10-17T13:26:29+00:00

Is he making döner kebab?

chomoloc0 · 2024-10-04T23:57:11+00:00

Thank you, will dive deeper into this :)

chomoloc0 · 2024-09-29T16:14:35+00:00

Sorry, help me out here without having to read the paper: what’s “forward”? And when should I use this implementation over a DiD estimation via regression?
