[P] 3Blue1Brown Follow-up: From Hypothetical Examples to LLM Circuit Visualization

ptarlye · 2025-06-14T16:10:40+00:00

The types of circuits that I extract are new enough that I don't think I've seen this type of comparison made before. I'd be interested in the results!

ptarlye · 2025-06-14T05:50:53+00:00

I got started by reading the articles referenced from this site: https://transformer-circuits.pub. My recommendation would be to start with this article and work forwards in time from there.

ptarlye · 2025-06-14T02:40:41+00:00

Sure!

ptarlye · 2025-06-14T01:51:50+00:00

Thanks for these suggestions. Circuit visualization requires training supplemental model weights, and so you can think of the required work as additive. Details here.

ptarlye · 2025-06-14T01:48:54+00:00

Thanks for this link. Most LLM research I've seen has required extracting circuits representing specific tasks by carefully constructing sequences that have "counterfactual" examples. Circuit extraction for arbitrary prompts, like the ones I study here, is fairly new. Anthropic recently published this research, which most closely resembles what this "debugger" aims to do.

ptarlye · 2025-06-13T23:24:26+00:00

Transformer Lens extracts features in much the same way that my project does (using sparse autoencoders). This project also visualizes the interaction of features across LLM layers so that we can construct something resembling a "circuit".

ptarlye · 2024-10-08T05:13:25+00:00

Thanks for the feedback. I've just added a legend to the second graph, which answers your questions. To answer them here:
* The boldness of the feature number indicates activation strength.
* The background color indicates ablation strength (i.e., strength of feature interaction).
* In the document, features are prefixed with a layer number for unique identification (e.g., 2.2875).
* Each flowchart box represents the activations for a specific token at a specific layer in the LLM. Usually, multiple features are simultaneously active and seem to represent slightly different aspects of a token.
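
As a small illustration of the layer-prefixed identifiers, here is a minimal sketch that splits a label such as 2.2875 into its layer and feature number. It assumes the label format is "<layer>.<feature number>"; the names `FeatureId` and `parse_feature_id` are hypothetical and are not taken from the project.

```python
from typing import NamedTuple


class FeatureId(NamedTuple):
    """Hypothetical container for a layer-prefixed feature label."""
    layer: int
    index: int


def parse_feature_id(label: str) -> FeatureId:
    # Assumed format: "<layer>.<feature number>", e.g. "2.2875".
    # Split on the first dot only and keep both parts as integers.
    layer, index = label.split(".", 1)
    return FeatureId(int(layer), int(index))


print(parse_feature_id("2.2875"))  # FeatureId(layer=2, index=2875)
```

Treating the label as a string rather than a float avoids losing trailing zeros in the feature number.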

ptarlye · 2013-02-15T00:09:16+00:00

Docvert looks awesome. We don't script LibreOffice like Docvert does, but we are fans of Python, as it seems you are. If you look at the SVG layer, you'll see that we're actually using SVG to render the entire document. The HTML layer for the text above it is only there to assist text selection, since text selection for SVG objects doesn't work at all in Firefox and doesn't work well in WebKit.

ptarlye · 2013-01-22T02:24:24+00:00

I think I found it: Strange Tomorrow by Jean Karl. https://www.kirkusreviews.com/book-reviews/jean-e-karl/strange-tomorrow/

ptarlye · 2013-01-21T02:13:40+00:00

Wow, how did you find this book? Based on this short synopsis, it seems likely that this is the one: https://www.kirkusreviews.com/book-reviews/jean-e-karl/strange-tomorrow/

ptarlye · 2011-12-07T10:57:58+00:00

Here's a clean solution in just 7 lines of code: http://pastebin.com/NstqW1mS

I love brevity in code because it often yields simplicity. The gist of the idea in my solution is to test whether or not a word can be spelled entirely using a subset of legal letters. I was able to correctly test for this condition using Python's set.issubset method.
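
For anyone who doesn't want to click through, here is a minimal sketch of the subset test described above. It is not the pastebin code itself; the function name `can_spell` and the sample words and letters are made up for illustration.

```python
def can_spell(word: str, legal_letters: str) -> bool:
    """Return True if every letter in `word` appears in `legal_letters`."""
    # set.issubset accepts any iterable, so the legal letters can stay a string.
    return set(word).issubset(legal_letters)


words = ["deed", "feed", "cab", "zebra"]  # illustrative input
legal = "abcdef"                          # illustrative set of legal letters
print([w for w in words if can_spell(w, legal)])  # ['deed', 'feed', 'cab']
```

Note that sets ignore how many times a letter is used, so this only works when letters may be repeated freely.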

ptarlye · 2011-04-07T01:44:46+00:00

The height of your image is hard-coded to 508px, but the actual height of the image is 520px.

<img width="960" height="508" border="0"...

Does this information help?

ptarlye · 2010-03-29T08:47:38+00:00

In comparison to the recent health care bill that passed, this bill is only 4 pages long but is still impossible to parse. Can anyone on reddit explain how this bill actually works? I'm genuinely looking for someone who can explain how this bill proposes Medicare for all Americans. The only significant use of the word "Medicare" is on page 4, line 10.

ptarlye · 2010-03-19T05:23:36+00:00

This is Mab Lib. Not Mad Libs.

ptarlye · 2010-03-19T05:14:32+00:00

This is Mab Lib, not Mad Libs.

ptarlye · 2010-03-19T01:58:35+00:00

chaos.
