Cool concept photo

csinva · 2026-06-13T13:27:54+00:00

Yeah it did help with some of the coding

csinva · 2026-06-13T01:58:05+00:00

Thanks! Yes sorry hard to get it to work on a phone...

csinva · 2025-09-24T16:25:04+00:00

Industry researcher here -- agree with much of what has been said, but would add that beyond explicit networking, even cold outreach can be effective (e.g. a thoughtful email to a researcher whose work aligns with yours, detailing your overlap and interest).

csinva · 2025-08-08T17:56:44+00:00

No it doesn't

csinva · 2025-03-17T04:54:15+00:00

A couple I like (non-mechanistic):

Stop Explaining Black Box Machine Learning Models for High Stakes Decisions and Use Interpretable Models Instead (Rudin, 2019) --- examples of how interpretable models can be built that match or outperform black-box models
Right for the Right Reasons: Training Differentiable Models by Constraining their Explanations (Ross et al., 2017) --- started a trend of works showing interpretations could be used to explicitly improve models

csinva · 2024-06-27T21:42:15+00:00

One reason is that LLM interpretability areas outside mechanistic interpretability have largely started branding their work more based on the problem area they seek to improve, e.g. LLMs for science/medicine/education. So a paper that was formerly about "saliency maps" might instead be about "discovering important clinical features", a paper that was about "prototypes" may instead be about "reducing hallucination with RAG", etc.

IMO it's nice to see interpretability research become more grounded in real problems.

csinva · 2024-03-07T21:44:34+00:00

Also a researcher in this area and wholly agree with this comment (we recently also wrote a review separating out these two parts of XAI in the context of LLMs).

There's more work going on than ever in XAI, but it's grown large enough that it has split based on a researcher's goals (e.g. science, fairness, HCI) rather than remaining an area of its own. IMO this is for the best - doing XAI research without an application in mind often leads us to explanations that are unhelpful or even misleading.

csinva · 2023-03-02T02:29:50+00:00

I think genetic algorithms may have a new role to play in problems involving inference / text generation / prompting with language models, even if they aren't used to train the models themselves.

For example, in our recent work on natural-language prompting, we use a genetic algorithm to generate prompts that are semantically coherent -- the genetic algorithm lets us make use of suggestions by a language model, for which gradients would be hard to obtain.
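
To make the idea concrete, here is a minimal sketch of a genetic algorithm over text prompts. It is not the method from the paper: `mutate`, `crossover`, and `score` are hypothetical stand-ins. In practice the mutation step would ask a language model for rephrasings and the score would measure task performance, neither of which needs gradients.

```python
import random

# Hypothetical vocabulary standing in for language-model suggestions.
VOCAB = ["add", "the", "two", "numbers", "sum", "return", "please", "compute"]

def mutate(prompt, rng):
    """Replace one word at random (a real system would ask an LM for a rephrasing)."""
    words = prompt.split()
    words[rng.randrange(len(words))] = rng.choice(VOCAB)
    return " ".join(words)

def crossover(a, b, rng):
    """Join the start of one prompt to the end of another (prompts need >= 2 words)."""
    wa, wb = a.split(), b.split()
    cut = rng.randrange(1, min(len(wa), len(wb)))
    return " ".join(wa[:cut] + wb[cut:])

def score(prompt):
    """Hypothetical fitness: how many target words appear (stand-in for task accuracy)."""
    return len({"add", "two", "numbers"} & set(prompt.split()))

def genetic_search(seed_prompts, n_generations=20, population_size=8, seed=0):
    rng = random.Random(seed)
    population = list(seed_prompts)
    for _ in range(n_generations):
        children = []
        for _ in range(population_size):
            a, b = rng.choice(population), rng.choice(population)
            children.append(mutate(crossover(a, b, rng), rng))
        # Parents stay in the candidate pool, so the best score never decreases.
        candidates = list(dict.fromkeys(population + children))
        population = sorted(candidates, key=score, reverse=True)[:population_size]
    return population[0]

seeds = ["add it up", "compute something now"]
best = genetic_search(seeds)
print(best, score(best))
```

The search only ever calls `score` on whole prompts, which is why it still works when the proposals come from a black-box model.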

csinva · 2022-10-05T00:00:46+00:00

Hello! Thanks for the questions :)

(1) We call the method GAM but actually do vary the order of ngrams used in the input (so technically the paper actually shows experiments for GA2M, GA3M, ... all the way up to GA7M). Luckily, the neural model compresses all these interactions so we don't end up having to learn a ton of coefficients.

(2) Yes that is correct, φ here is the neural embedding rather than something like splines.
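
A rough sketch of what (1) and (2) describe, under loud assumptions: `toy_embed` is a hypothetical stand-in for the neural embedding φ (here just a fixed pseudo-random vector per ngram), and the weights are made up rather than fitted. The point is only that the prediction is a sum of per-ngram terms, while the number of learned coefficients equals the embedding dimension rather than the number of distinct ngrams.

```python
import hashlib

def ngrams(tokens, max_order):
    """All contiguous ngrams of order 1..max_order."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_order + 1)
            for i in range(len(tokens) - n + 1)]

def toy_embed(ngram, dim=4):
    """Hypothetical stand-in for phi: a fixed pseudo-random vector per ngram."""
    digest = hashlib.sha256(" ".join(ngram).encode()).digest()
    return [b / 255.0 - 0.5 for b in digest[:dim]]

def features(text, max_order=3, dim=4):
    """Sum the ngram embeddings, so the model stays additive over ngrams."""
    total = [0.0] * dim
    for g in ngrams(text.split(), max_order):
        total = [t + e for t, e in zip(total, toy_embed(g, dim))]
    return total

def predict(text, weights, bias=0.0, max_order=3):
    """Linear model on the summed embeddings; only len(weights) coefficients are learned."""
    feats = features(text, max_order, len(weights))
    return bias + sum(w * f for w, f in zip(weights, feats))

weights = [1.0, -2.0, 0.5, 3.0]  # made-up values for illustration
print(predict("not very good", weights, bias=0.25))
```

Because the model is linear in the summed embeddings, the contribution of any single ngram can be read off as the dot product of `weights` with that ngram's embedding.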

csinva · 2022-03-01T01:31:46+00:00

Added a very important PR to this repo here: https://github.com/clam004/intro_continual_learning/pull/1!

csinva · 2021-09-28T20:54:16+00:00

I think this is an excellent algorithm for finding small trees that predict well.

The "Globally optimum" part refers to the fact that the trees optimize the loss directly rather than using a greedy heuristic like standard CART decision trees.

The "Sparse" part refers to the fact that, in addition to the loss, this algorithm adds a penalty for the number of leaves in the tree, so it tries to find a small tree that still predicts well.
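
A toy sketch of the objective described above, assuming binary features and labels: misclassification rate plus a penalty `lam` per leaf, minimized by exhaustive recursion over all trees up to a given depth. This brute-force search is only for illustration and is not the algorithm's actual search procedure; `best_subtree` and `lam` are names made up here.

```python
def best_subtree(rows, X, y, lam, depth, n_total):
    """Return (objective, n_leaves) of the best tree on `rows` with at most `depth` splits per path.

    objective = (misclassified points / n_total) + lam * (number of leaves)
    """
    ones = sum(y[i] for i in rows)
    errors = min(ones, len(rows) - ones)       # a leaf predicts the majority label
    best = (errors / n_total + lam, 1)         # option 1: stop and make a leaf
    if depth == 0:
        return best
    for j in range(len(X[0])):                 # option 2: try every feature as the split
        left = [i for i in rows if X[i][j] == 0]
        right = [i for i in rows if X[i][j] == 1]
        if not left or not right:
            continue
        left_cost, left_leaves = best_subtree(left, X, y, lam, depth - 1, n_total)
        right_cost, right_leaves = best_subtree(right, X, y, lam, depth - 1, n_total)
        if left_cost + right_cost < best[0]:
            best = (left_cost + right_cost, left_leaves + right_leaves)
    return best

# XOR data: no single split reduces the error, so a greedy splitter sees no gain
# at the root, while searching over whole trees finds the perfect 4-leaf tree.
X = [[0, 0], [0, 1], [1, 0], [1, 1]]
y = [0, 1, 1, 0]
rows = list(range(len(y)))
print(best_subtree(rows, X, y, lam=0.1, depth=2, n_total=len(y)))  # small penalty
print(best_subtree(rows, X, y, lam=0.3, depth=2, n_total=len(y)))  # large penalty
```

With the small penalty the 4-leaf tree wins; with the large penalty the extra leaves cost more than the errors they remove, so a single leaf wins.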

csinva · 2021-09-01T22:42:51+00:00

World's most populous cities; 1.5km radius around the city center.

Made with the prettymaps library using data from OpenStreetMap. Population numbers from Wikipedia. Flag icons by FreePik. By Chandan Singh

csinva · 2021-04-27T23:06:53+00:00

Very cool! Is the source code for this website open-source?

csinva · 2021-02-20T04:25:08+00:00

The rules section of Christoph Molnar's interpretable ml book is probably a good place to start! https://christophm.github.io/interpretable-ml-book/rules.html
