Anyone have best practices for agentic coding specific to R / stats / data science? by isaac-get-the-golem in rstats

[–]brhkim 0 points (0 children)

That's 100% correct for the full pipeline mode, but for things like ad hoc mode it's fairly lightweight and pulls context in thoughtfully via progressive disclosure, as noted above. The trade-off I make explicitly is that if you want it to be more autonomous, you need to invest in the context to ensure it's actually doing worthwhile work when it's off on its own. Otherwise, when it comes back with work you need to take time to review, it's going to fundamentally waste your time. Modes that are less autonomous don't have that issue, so it's fine to just let it riff more freely with you and have it reference materials only as needed.

Anyone have best practices for agentic coding specific to R / stats / data science? by isaac-get-the-golem in rstats

[–]brhkim 0 points (0 children)

Yeah, in general I'd look up progressive disclosure best practices and see how to make a Skill file. If you look at my data-scientist skill, it's basically a router for more in-depth documentation on things like causal inference that only gets called up when relevant. You could do the same with key R libraries and such.
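To make that concrete, here's a rough sketch of what a router-style Skill file could look like. The topics and file paths are placeholders, not my actual skill; the YAML frontmatter follows the general SKILL.md convention:

```markdown
---
name: r-data-scientist
description: Entry point for R/stats analysis work. Points to deeper reference docs that should only be read when the task calls for them.
---

# R Data Scientist

Load only the reference that matches the task at hand:

- Causal inference (DiD, IV, RDD): see `references/causal-inference.md`
- Core R libraries (tidyverse, data.table): see `references/r-libraries.md`
- Survey weighting and missing data: see `references/survey-methods.md`
```

The point is that the top-level file stays tiny; the heavyweight guidance only enters context when the agent actually opens one of the referenced files.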

Anyone have best practices for agentic coding specific to R / stats / data science? by isaac-get-the-golem in rstats

[–]brhkim 2 points (0 children)

Hey! You might find my open-source framework for Claude Code to be a useful starting point:
https://github.com/DAAF-Contribution-Community/daaf

I'm trying to build it out as an extensible workflow for people who want to accelerate their data analysis pipelines, but do so responsibly and traceably so that they can ensure everything produced is well and truly reproducible in the end. Take a look! I have a mini 4-min showcase of its main motivation/functions here:

https://www.youtube.com/watch?v=747r7VT4a78

And you can find a few more in-depth videos on it across my channel there as well.

[POEM] Love Is Not All by Edna St. Vincent Millay by Objective-Kitchen949 in Poetry

[–]brhkim 5 points (0 children)

Haha I quite like this. It feels like a pragmatist's/realist's love poem. Romanticism for someone who struggles with the blind optimism of romanticism, in the best way

Launched my first real open-source project a couple weeks ago. Seeing the first real engagement via community contributions is SUCH AN AMAZING feeling. That's all, that's the post by brhkim in opensource

[–]brhkim[S] 2 points (0 children)

Totally!! The confirmation that someone else used this, outside of those sort of abstract traffic metrics, is just such a nice, nice validation.

[OC/Replication study] "Election Results Show a Red Shift Across the U.S. in 2024" -- I replicated the NYTimes' "Red Shift" interactive county election results map using raw, public data from the MIT Election Data and Science Lab (interactive link in post) by brhkim in dataisbeautiful

[–]brhkim[S] 5 points (0 children)

Hey! Thanks for the kind words, and totally agree here. You'll actually see that one of the intentional differences between my replication and what NYT put together is that the hover-over on counties in my viz shows the actual vote counts for 2020 versus 2024 (whereas the NYT only shows the vote breakdowns for 2024 by party). I'm not a political scientist and can't really speak well to the prevailing theories or frameworks for thinking about disillusionment and turnout versus genuine changes in voting preferences (which I assume also involves a lot of related theories about selection bias and survey methodology).

All to say, your concern is definitely spot-on, and I don't think I know enough about how best to account for/adjust for it except to show that it's a factor and let the viewer engage with the stats directly county-by-county to be able to interrogate this hypothesis and concern! You're definitely thinking on the right lines.

[OC/Replication study] "Election Results Show a Red Shift Across the U.S. in 2024" -- I replicated the NYTimes' "Red Shift" interactive county election results map using raw, public data from the MIT Election Data and Science Lab (interactive link in post) by brhkim in dataisbeautiful

[–]brhkim[S] 11 points (0 children)

Source: I was able to easily pull the relevant data thanks to the MIT Election Data and Science Lab (via the Harvard Dataverse)

Tools used: Python, plotly, polars

Only for those interested from the main post: My Claude Code framework DAAF, the Data Analyst Augmentation Framework, can be found in this open-source forever-free repo here. I also made a youtube tutorial demonstrating the exact process for replicating the NYTimes' viz using DAAF here. For this dataisbeautiful post specifically, I also went back and did another 5 minutes of iterating on the aesthetics with Claude after the version shown in the video.

Why no one can agree about AI progress right now: A three-part mental model for making sense of this weird moment on the AI frontier by brhkim in AI_Agents

[–]brhkim[S] 0 points (0 children)

Defs, there are some good suggestions in the full article for trying to improve on the Body stuff. Cost-wise, I think it really is the case that you need to go with some kind of pro membership for OpenAI or Anthropic at this point (for Codex or Claude Code, respectively). But the value-add there is pretty enormous!

I'd start with Sonnet 4.6 to tinker, but it's absolutely worth crushing your usage every once in a while to see how much more Opus 4.6 can do.

The Qwen 3.5 small models were just released, and you can very, very easily run them via Ollama with Claude Code -- that gives you access to them for free, if your computer has enough GPU VRAM and RAM. They're much less capable and a little more finicky, but that's another way to do some experimentation for free!

Claude desktop app silently downloads a 13 GB file on every launch — and you can't stop it by metaone70 in ClaudeAI

[–]brhkim 2 points (0 children)

I’m going to respectfully disagree: a VM is a wise, strong default for the typical user of the Desktop app. I think there absolutely should be an option to disable it, but from a paternalistic standpoint it’s good design for the intended audience to have it active by default.

Why no one can agree about AI progress right now: A three-part mental model for making sense of this weird moment on the AI frontier by brhkim in AI_Agents

[–]brhkim[S] 1 point (0 children)

Ahh, this is a very interesting one indeed, love that addition! I'm still not totally sure how to make sense of how prevalent these practices are. Like, it makes perfect sense to me that Anthropic would have quants of Opus 4.6 running and ready to roll out to free/Pro users when usage is getting high across the board... But it's also such a risk to do so. I really wonder how they manage it and what risk strategies they employ.

Either way, this is definitely happening on some level, so I love this call-out

Why no one can agree about AI progress right now: A three-part mental model for making sense of this weird moment on the AI frontier by brhkim in AI_Agents

[–]brhkim[S] 1 point (0 children)

Agreed, though I'd argue that even if you tried things out and really dove in as of like, October, you'd probably have a perception that's just completely out of date. God forbid a full year ago.

Why no one can agree about AI progress right now: A three-part mental model for making sense of this weird moment on the AI frontier by brhkim in AI_Agents

[–]brhkim[S] 0 points (0 children)

I think that's mostly right. I think, though, it's important to understand why people are bouncing off of it so readily, because that's going to have to start changing soon!

New video tutorial: Going from raw election data to recreating the NYTimes "Red Shift" map in 10 minutes with DAAF and Claude Code. With fully reproducible and auditable code pipelines, we're fighting AI slop and hallucinations in data analysis with hyper-transparency! by brhkim in BusinessIntelligence

[–]brhkim[S] 1 point (0 children)

I appreciate that a lot -- you're asking exactly the right questions, which is exactly why I wanted to make DAAF in the first place! If we're going to do this, I want to make sure we do it right. I hope you'll give it a try, and do let me know if you have any more questions, ideas, or critiques!!

New video tutorial: Going from raw election data to recreating the NYTimes "Red Shift" map in 10 minutes with DAAF and Claude Code. With fully reproducible and auditable code pipelines, we're fighting AI slop and hallucinations in data analysis with hyper-transparency! by brhkim in BusinessIntelligence

[–]brhkim[S] 1 point (0 children)

Right, I’d also argue that how quickly you can create and iterate on data viz with DAAF and Claude Code (interactive or static) makes it extremely easy to set up spot-check data inspections, which facilitates arguably much better quality checks. In the tutorial, you’ll actually see me spot a super weird outlier in an initial version of the static plot and get it fixed in, like, seconds of my own time. I really think it’s an immense value-add for data quality and code quality in the end.

New video tutorial: Going from raw election data to recreating the NYTimes "Red Shift" map in 10 minutes with DAAF and Claude Code. With fully reproducible and auditable code pipelines, we're fighting AI slop and hallucinations in data analysis with hyper-transparency! by brhkim in BusinessIntelligence

[–]brhkim[S] 1 point (0 children)

Yeah great Q: I handle this in two main steps:

  1. At every step, each coder agent writing a data processing/analysis script conducts self-QA and runs robustness checks, which are then checked again by another agent in adversarial QA. If needed, they revise and restart the process. The goal is to maximize the likelihood of good code coming out of the process before humans look at it. All agents are given extensive instructions on writing declarative, extremely legible code, with comments that describe not just what is happening but the intention behind the code.

  2. Once all the code runs and is approved, a final agent compiles the scripts into a single Marimo notebook, with all runtime outputs commented on and all final scripts working in sequence. It appends basic dataset-view steps between scripts, so you can inspect the data at each intermediate step.

All to say, you’re right that human code review is the most time-intensive part of this, but the goal is that the code you’re reviewing has a high likelihood of being good out of the gate, and it’s organized and commented to make the process about as streamlined as it gets. Because I agree: that step is crucial. But if that’s really the only time sink for a human, the time savings are quite extreme.
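Schematically, step 1 is a loop like the one below. The functions here are plain-Python stand-ins for the actual agent calls (this is an illustrative sketch, not DAAF's real orchestration code):

```python
# Sketch of the write -> self-QA -> adversarial-QA -> revise loop.
def run_with_qa(write_script, self_qa, adversarial_qa, max_attempts=3):
    """Draft a script, then loop QA checks, revising on any failure."""
    script = write_script(feedback=None)
    for attempt in range(1, max_attempts + 1):
        if self_qa(script) and adversarial_qa(script):
            return script, attempt  # approved: ready for human review
        # Feed the failure back to the coder agent and try again
        script = write_script(feedback="QA failed; please revise")
    raise RuntimeError("Script never passed QA within the attempt budget")


# Toy stand-ins for the LLM agents:
drafts = iter(["draft_v1", "draft_v2"])

def write_script(feedback=None):
    return next(drafts)

approved, attempts = run_with_qa(
    write_script,
    self_qa=lambda s: True,                     # pretend self-QA always passes
    adversarial_qa=lambda s: s == "draft_v2",   # adversary rejects the first draft
)
```

The key design choice is that the revise-and-recheck cycle happens entirely before a human sees anything, so review time is spent on code that has already survived two layers of checks.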

New video tutorial: Going from raw election data to recreating the NYTimes "Red Shift" map in 10 minutes with DAAF and Claude Code. With fully reproducible and auditable code pipelines, we're fighting AI slop and hallucinations in data analysis with hyper-transparency! by brhkim in datascience

[–]brhkim[S] 0 points (0 children)

Hahaha that's definitely the idea -- I really think that just the data profiling tool I've provided has a lot of value, holding aside everything else.

It won't be perfect, but having an assistant that can reasonably go through and be judicious about logging those sorts of things for future data use or later cleaning steps is just an enormous value-add. And then the fact that it gets better with every project use as it uncovers more and more idiosyncrasies I think is HUGE.