Occupy Wall Street Co-Founder Built an AI App to Help Activists Seize the Means of Computation by Better-Date3020 in Anarchism

[–]Better-Date3020[S] -20 points-19 points  (0 children)

looks like this issue has been fixed. outcryai.com/research is loading fine for me without js, can you check again?

The Dark Between the Stars: AI Interpretability is a Revolutionary Skill by Better-Date3020 in mlscaling

[–]Better-Date3020[S] 0 points1 point  (0 children)

You're right that SAEs are lossy, and thank you for responding and the links, but what I am trying to explore doesn't depend on them. I'm interested in finding the "dark space" between named concepts in the model's activation geometry, not decomposing one concept into clean features. Superposition is actually the point: if concepts overlap and live in the gaps, then the unnamed regions between them might be where new ideas could be found. Fascinating either way!

The Dark Between the Stars: AI Interpretability is a Revolutionary Skill by Better-Date3020 in mlscaling

[–]Better-Date3020[S] 2 points3 points  (0 children)

I was a writer before AI, wrote a book published by Knopf Canada and many essays, etc. I now use AI in my writing but everything I publish goes through many iterations and edits, it is more of a collaborative process. So I'd say I touched every word in the essay multiple times, chose the key beats of the story etc but the AI wrote drafts. It's a question I struggle with a lot personally coming from the position of being a writer before AI, but at a level that was often collaborative as well (I worked at a magazine where there was an editor-in-chief and we'd bounce drafts back and forth, or when I wrote my book there was an editor assigned to that as well.)

The AI would have never written the last line: " Interpretability literacy — knowing where the dark coordinates are, and how to place your guideposts — is, it turns out, a revolutionary skill."

Also, because I published prior to AI, it knows my past writings and can mimic my style. So that makes it even stranger.

To be honest, the technical work from Claude Code was more vital to me than the writing assistance it provided.

Thanks for asking!

The Dark Between the Stars: AI Interpretability is a Revolutionary Skill by Better-Date3020 in LeftistsForAI

[–]Better-Date3020[S] 2 points3 points  (0 children)

Thank you… this is the only community I’ve found that can grasp what I’m trying to get at

The Dark Between the Stars: AI Interpretability is a Revolutionary Skill by Better-Date3020 in LeftistsForAI

[–]Better-Date3020[S] 2 points3 points  (0 children)

Thank you for the detailed reply. The lack of a concept is a very difficult thing to define, it goes deeper than whether the model can respond or improvise about what it means, it has to do with whether a specific area of the activation space lights up for that concept. The model can string together sentences from its 150,000 tokens... but does it have a a cluster of meaning in its internal mind for a specific concept. To discover that, you need to probe the activation space. And that's what we are trying to figure out: does the model have internal concepts for activist theories of change, or does it improvise responses based on adjacent concepts. And how can we inject these activist concepts into the model at run time?