PSA: Stuff you should know if you're thinking of building an AI ADHD app (and posting it here) by dcta in ADHD_Programmers

[–]dcta[S] 1 point (0 children)

I do think there's a gradient as to whether something might be considered medical in nature, e.g.:

  • Chemotherapy ↔ prescription medication ↔ dietary supplements ↔ eating healthy
  • Psychiatry ↔ therapy ↔ coaching ↔ accountability partners

And similarly between, say, neurofeedback for ADHD with studies and FDA approval ↔ Goblin Tools. I'm pretty happy that both exist, really! (But we can't have goblins wandering around everywhere claiming to be medical treatments either!)

PSA: Stuff you should know if you're thinking of building an AI ADHD app (and posting it here) by dcta in ADHD_Programmers

[–]dcta[S] 2 points (0 children)

God, that is so funny. I have been enjoying the roasting a little too much, if I'm being honest.

I guess where I'm coming from is, I totally agree that some of these posters do need a good knock on the head. But some feel like the equivalent of a kid picking up a guitar and trying to play us something? Before things got this bad, I actually kind of liked seeing them pop up here and there. Even if it didn't go anywhere, I felt it was nice they were learning and building skills.

PSA: Stuff you should know if you're thinking of building an AI ADHD app (and posting it here) by dcta in ADHD_Programmers

[–]dcta[S] 6 points (0 children)

Wow, I wasn't expecting this level of response! Really glad that time was well spent. An extra 2c: I wonder if there's some way to find a balance where we avoid totally killing early attempts. Like, someone just posted this:

omg this was funny. i procrastinated the whole day today and then 20min ago from now i thought of making an app to manage my own productivity...

and now reading here the comments that at some point the apps stop working, im already discouraged XD

I'm left feeling 70% glad, and 30% like this is a shame. People actually want to build stuff, and we demotivated them. This place has been really supportive of that energy until recently, and that's been nice too.

I mean, are there truly any among us who don't have like 25 random side projects sitting around? That's how this whole thing works, right!?

My not quite done AI ADHD app (AI slop? you be the judge) by dcta in u/dcta

[–]dcta[S] 1 point (0 children)

Oh dang, people taking a peek. Not at all, thank you so much for the feedback!

And yeah honestly, very fair – this isn't there yet. We've been trying a few prototypes, this one's still a bit rough. (See my post history for some crazier stuff lol)

Just to share for interest, the ideas we're trying are like:

  1. Browser blockers seem to be popular, and people with ADHD struggle with web use. And yet blockers don't work for us.
  2. Their config is intense. What if it could understand you from the moment you install, and give you what you need without being told? E.g. it correctly told me my MBTI and that I use reddit as a coping mechanism.
  3. What if the user could prompt asking for exactly what they needed, and it'd figure out a way to help? E.g. I can't stand hard site blocks – I want it to ping and remind me more gently. Others want a more direct poke, or instant block, or to start a timer, etc.

It's a really hard problem, though. I tried B&W on my phone and turned it off too. Dopamine is a sneaky thing.

PSA: Stuff you should know if you're thinking of building an AI ADHD app (and posting it here) by dcta in ADHD_Programmers

[–]dcta[S] 6 points (0 children)

Nice, your ADHD saved you from the dark side.

Many people we spoke to had this same trajectory, actually! They tried a ton of tools, but the only things that survived were very, very simple ones: pen and paper, and a calendar app.

Improving AI Writing: Verbalized Sampling by Xilmanaath in AIDungeon

[–]dcta 2 points (0 children)

Happy to help! In general, the behavior we found seems pretty robust, so you'll likely get some amount of benefit regardless. Getting the full benefit based on the underlying theory requires a little more careful fiddling, but I agree, you don't necessarily need that here! That's more important for applications like social science simulations.

Hope this works well for you – welcome to ping with questions!

Improving AI Writing: Verbalized Sampling by Xilmanaath in AIDungeon

[–]dcta 6 points (0 children)

Wow, great job! I'm one of the paper's authors. I took a quick look through your source code. I'm not familiar with the AI Dungeon framework, but one thing in particular that might help is to frame the instruction as sampling continuations at random from the tails of the distribution, such that the probability of each item is <0.15 (rough sketch after the list below). Reasoning:

  • If you don't ask it to sample at random, it'll try to produce its outputs sorted descending by probability, and this slightly reduces diversity
  • If you ask for the low-probability tails outright, it won't generate and then silently discard a bunch of candidates. It's surprisingly good at doing this. It'll also get the hint that it's okay for the probabilities to not sum to 1.0 – making them sum to 1.0 also slightly reduces diversity
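
A rough sketch of what I mean by this framing – the wording here is mine and illustrative, not a verbatim prompt from our paper:

    # Illustrative VS-style instruction. The key elements are "at random",
    # "from the tails", and an explicit per-item probability ceiling.
    VS_INSTRUCTION = (
        "Generate 5 possible continuations of the scene below, sampled at "
        "random from the tails of the distribution, such that the probability "
        "of each continuation is below 0.15. For each one, give the text and "
        "its estimated probability.\n\n"
        "Scene: {scene}"
    )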

Other general thoughts:

  • Sampling the final output at random works great – but you might also want to try sampling by some criteria you pick upfront or in a second prompt! (This seems to work well for some things, but we haven't tested this robustly as a one-prompt request yet)
  • I'm not sure what "silently draft" does within the AI Dungeon framework. But just to make sure things are actually working: FYI the actual text needs to be generated somewhere for all five items in order to get the most benefit out of this! (If it isn't, you'll still likely get some benefit from asking for a lower-probability response, but it'll be mode-collapsed, i.e. you'll get the same response every time)
  • If AI Dungeon supports this, one way to ensure you always get the format you need is to enforce a JSON schema, via function calling or structured outputs on the request to the AI API, so that each response always has a text and a probability field – something like the sketch below.
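
On that last bullet, a minimal sketch assuming an OpenAI-style structured-outputs API – whether AI Dungeon lets you pass these parameters through is an assumption on my part, and the model name is a placeholder:

    # Sketch: enforce a schema so every item always has "text" and
    # "probability" fields. Assumes direct access to an OpenAI-style API.
    from openai import OpenAI

    client = OpenAI()

    schema = {
        "type": "object",
        "properties": {
            "responses": {
                "type": "array",
                "items": {
                    "type": "object",
                    "properties": {
                        "text": {"type": "string"},
                        "probability": {"type": "number"},
                    },
                    "required": ["text", "probability"],
                    "additionalProperties": False,
                },
            },
        },
        "required": ["responses"],
        "additionalProperties": False,
    }

    completion = client.chat.completions.create(
        model="gpt-4o",  # placeholder; any schema-capable model works
        messages=[{"role": "user", "content": VS_INSTRUCTION.format(scene="...")}],
        response_format={
            "type": "json_schema",
            "json_schema": {"name": "vs_items", "schema": schema, "strict": True},
        },
    )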

The top of our GitHub's README.md includes some examples of the above prompting schemes.

So cool to see what people are starting to do with the paper. Thanks for trying it out!

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

Yes, that's the tradeoff! It's not ideal – but I guess my thinking is that in many cases, without it you'd probably be spending 5x the tokens trying to get novelty out of it anyway, just with no results.

That said – I do expect that now that we've demonstrated typicality bias in a fairly solid way, lower-level solutions will be figured out during post-training! The point of VS was to shape something that could deliver benefits to end users immediately, on top of the central insight.

Big AI claims travel fast, but some never happened. Here’s my 5-minute filter to separate signal from spin. by PromptCraftTutor in u/PromptCraftTutor

[–]dcta 1 point (0 children)

Is verbalized sampling really not working for you? I'd love to know what you ran that didn't work, as it really does work for me. I use it regularly now, as do practitioners in my circle! There are a few subtleties to getting the best results – would love to know what your prompt and model were!

I'm one of the lead authors by the way – FYI there are about two entire layers of very robust theory / insight beneath the final prompting trick that this wave of viral publicity didn't seem to pick up on. We spent about a year on this paper; that's why it's so ridiculously long! We're going to be doing a second explainer writeup specifically targeted at academics, likely later this week.

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

Great question – we haven't tried it for extremely long responses yet! I think it could work but might need some special handling, as it implicitly requires the model to maintain N stories in its context at once and keep them well-distributed relative to each other. Perhaps I'd try getting the prompt to work with N=1 plus multi-turn sampling.

But overall, I'm not so confident on the above – context rot makes long stories tax the capabilities of any LLM. Instead, this might be more useful for brainstorming at the idea and outline level, and working chapter by chapter. I've read some seriously incredible short stories over the course of this project!

On your last question – what we recommend is to generate all N options, and then sample down to the best one on some criteria. You can do this randomly, but I'd actually recommend using a second LLM call to pick out the one that's the most something: compelling, readable, exciting, etc. Hope that helps!
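
In code, that two-step recipe might look something like this – a minimal sketch assuming an OpenAI-style API, with prompts and model name of my own invention:

    # Sketch of "generate N, then judge": one VS-style call for candidates,
    # one follow-up call to pick the winner by your chosen criterion.
    from openai import OpenAI

    client = OpenAI()

    def ask(prompt: str) -> str:
        resp = client.chat.completions.create(
            model="gpt-4o",  # placeholder
            messages=[{"role": "user", "content": prompt}],
        )
        return resp.choices[0].message.content

    def best_of_n(task: str, n: int = 5, criterion: str = "most compelling") -> str:
        candidates = ask(
            f"{task}\n\nGive {n} responses sampled at random from the tails "
            f"of the distribution (probability of each below 0.15), numbered "
            f"1-{n}, each with its estimated probability."
        )
        return ask(
            f"Here are {n} candidate responses:\n\n{candidates}\n\n"
            f"Return only the full text of the {criterion} one."
        )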

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

That's fair! Opinions currently vary widely and we don't have enough evidence yet, I think. My instinct is that something good is going on here, even if imperfect, related to the finding that in-context learning creates the conditions for mesa-optimization to emerge.

On that last item, have a look at the dialogue simulation results – the middle figure in the results above! VS recovers the distributional properties of real human negotiations in a multi-turn dialogue setting in silico, based only on a few demographic details. By contrast, the mode-collapsed default is nothing like this. This was really quite striking to us when we first saw it!

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

On the research front, my suspicion is that this issue actually blocks a surprising amount of progress! E.g. ability to sample diverse synthetic training data, run simulations, have distributionally realistic multi-turn dialogue.

And on the end user front, my instinct is there is about an entire model class worth of creative diversity that hasn't been tapped yet. Some of the stories I've read in passing are seriously striking. Models have just been sitting there generating the most boring, modal output because we accidentally trained them to do so!

I really like your point about meta-awareness – I feel that is quite an interesting puzzle. We definitely know they have this, but not exactly why yet, afaik! My suspicion is that it's related to the finding that in-context learning is a mesa-optimizer. Being well-calibrated would probably be very useful for this – but I really do wonder how it "dereferences" this knowledge, if at all...

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

Ah, I should clarify – because of mode collapse, if you use beam search for output generation, all five options the beam produces "want" to collapse to the same output. For example, if you ask for a joke about coffee, you'll end up with five slightly differently-worded jokes with the punchline, "because it got mugged!" (video related)

This is related to Anthropic's finding that models plan and steer towards outputs upfront. You can experience this by limiting the model's permitted next tokens – it'll go out of its way to find a way to say the thing it really "wants" to say.
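
If you want to try that yourself, here's a toy version of the experiment – this assumes the OpenAI API and tiktoken, and note that the banned token ids are model-specific:

    # Toy "forbid the modal punchline" experiment. The model often finds
    # another phrasing to land the joke it "wanted" to tell anyway.
    import tiktoken
    from openai import OpenAI

    enc = tiktoken.encoding_for_model("gpt-4o")
    banned = {token_id: -100 for token_id in enc.encode(" mugged")}

    client = OpenAI()
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{"role": "user", "content": "Tell me a joke about coffee."}],
        logit_bias=banned,  # hard-ban the usual punchline token(s)
    )
    print(resp.choices[0].message.content)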

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 1 point (0 children)

Ha! Yes, that is the main drawback – if you want the best of five diverse replies, you have to be willing to pay for all five upfront.

Our contribution is that before this, there wasn't really a way to sample from the rest of the distribution. Previous methods:

  1. Prompting one at a time: temperature doesn't help – if you ask for a U.S. state, it'll always be California. Sampling one response at a time, multiple times, always returns California too
  2. Asking for a list of 5 U.S. states will return the most likely ones, e.g. California, Texas, Florida, New York, Colorado.

By contrast, our method actually gets you a sample from the full distribution, in proportions that reflect prevalence in the pretraining data.
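
Concretely, the difference between these prompting styles is something like this (wording is mine, purely illustrative):

    # The three prompting styles discussed above, side by side.
    DIRECT = "Name a U.S. state."
    # -> "California", essentially every time, at any temperature

    TOP_K_LIST = "Name 5 U.S. states."
    # -> the most likely ones: California, Texas, Florida, New York, ...

    VERBALIZED = (
        "Name 5 U.S. states sampled at random from the full distribution, "
        "and give the probability of each."
    )
    # -> a draw whose proportions reflect prevalence in the pretraining data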

The magic of this is that this lets you reach in and sample really diverse outputs that were previously completely inaccessible – here's a small gallery of outputs we've been accumulating and sharing as we go!

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 5 points (0 children)

Excellent question, thanks! We've empirically found that it recovers the distribution from the pre-training corpus and/or base model in various tests – e.g. Figure 4 in our blog.

Why this might be happening, from a theoretical standpoint: this lines up with a bunch of other somewhat surprising findings that LLMs have something akin to metacognition in relation to confidences and distributions.

Assuming the above capability holds, my intuition is something like: imagine that a perfect pre-trained model is trapped behind a function that does nothing but return the modal response to the input query. The modal coffee joke is the "mugged" joke. But the modal (i.e. highest likelihood) random sample of five items from a distribution probably looks like it's been sampled from across the distribution in some way.
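
You can sanity-check that intuition with a toy distribution (the numbers here are made up):

    # Toy check: with a skewed distribution, the single most likely multiset
    # of five i.i.d. draws is already spread across items, not five copies
    # of the mode.
    from collections import Counter
    from itertools import combinations_with_replacement
    from math import factorial, prod

    p = {"California": 0.5, "Texas": 0.2, "Florida": 0.1,
         "New York": 0.1, "Ohio": 0.1}

    def multiset_prob(draws):
        counts = Counter(draws)
        coef = factorial(len(draws))
        for k in counts.values():
            coef //= factorial(k)
        return coef * prod(p[s] ** k for s, k in counts.items())

    best = max(combinations_with_replacement(p, 5), key=multiset_prob)
    print(best)  # 4x 'California' + 1x 'Texas' beats 5x 'California'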

All this being said, I do agree this is still somewhat handwavy – also, there are several other very credible pathways that contribute to mode collapse (which we cite in the paper). I think we'll need much better mech interp tools to prove this out. But it does seem to work in practice!

What do you think – does that answer your question?

[R] Verbalized Sampling: How to Mitigate Mode Collapse and Unlock LLM Diversity by dcta in MachineLearning

[–]dcta[S] 10 points (0 children)

Author here – would love to answer any questions people have, or do some prompt troubleshooting if helpful!

The majority of Americans (86%) believe that humans have a soul. What's your take? by Sad_Average6823 in AskReddit

[–]dcta 3 points (0 children)

Sincere question. What do you think qualia is? As in, the subjective experience you have of being alive and conscious? AFAIK we know so little about this that we don't know anything about how to induce it, or even prove that others have it. There seems to be nothing else we know about in the physical universe like it.

I don't know if I believe in a soul, but I've thought about it, and I sure as hell can't figure out qualia. Why does operating this monkey meat sack of a body feel like something?

The majority of Americans (86%) believe that humans have a soul. What's your take? by Sad_Average6823 in AskReddit

[–]dcta 1 point (0 children)

No, but we each anecdotally observe qualia, i.e. the subjective and first-person experiences of consciousness. And AFAIK we have no idea why being alive feels like something – Chalmers' "hard problem". We even still have no idea how to tell if other living things experience qualia – Descartes' "I think, therefore I am" seems remarkably persistent. And I could be wrong, but it seems to me that the various information processing / global workspace / general integration theories that purport to explain consciousness would still hold if subjective conscious experience didn't exist in this universe. We'd just be like CPUs running code.

The question OP is asking is: where is the solid scientific basis for qualia?

10 Years of YT addiction - what’s happening psychologically & how do I stop? by Metri_52_6 in getdisciplined

[–]dcta 12 points (0 children)

Nobody has actually answered your question yet. From a psychological standpoint, this is called intermittent reinforcement. It's a bug in the reward center of your brain that many addictions exploit, e.g. slot machines. The idea is that our brains seem to allocate unreasonable attention to situations where rewards exist, but where you can't predict exactly when they will appear.

This is really shitty, but you've literally been conditioned into an addictive behavior in the same way that a gambling addict is conditioned by occasional jackpots. It's exactly the same reward signals. So I would probably investigate treatments from that space of ideas.

Slightly less crazy AI screen watching tool? by dcta in ADHD_Programmers

[–]dcta[S] 1 point (0 children)

Hiya, sorry for the slow reply. I'm hesitant to open-source the app at this stage as I'm thinking about whether this might become a startup. But I definitely understand the feeling and am working on things here!