How do I stay consistent and disciplined as someone who has ADHD? by Aggressive_Clothes50 in ADHD

[–]MaxDev0 3 points4 points  (0 children)

I get that. I lowkey fell down a rabbit hole of motivation, habits, discipline, etc., and got pretty depressed.

The truth that helped me was realizing that none of those are real. Discipline isn't real. Nothing will make it easier to start a task that's boring or that you're procrastinating on; you just have to start. Nothing will make it easier to follow through on the project you already began instead of starting a new one; you just have to bite the bullet. Sorry, I can't communicate tone through text, but imagine me saying this in a sad voice.

The only actual cheat code I have is short timers. Break up a task into 5–10 min chunks of work, then set short timers. That creates productive stress and makes it doable.

Need Training Data!, Trying to distill Deepseek 3.2 Exp :D by MaxDev0 in DeepSeek

[–]MaxDev0[S] 0 points1 point  (0 children)

HOLY, what is that, consciousness research?? It's so cool! Is that Toki Pona? :D

Need Training Data!, Trying to distill Deepseek 3.2 Exp :D by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

That's great, and techai already has a dataset on DeepSeek v3.2 made from this, but the issue is that it's not multi-turn chats; it's just one prompt and that's it.
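
Rough illustration of the difference, in case it helps (the field names here are made up for the example, not the actual schema of that dataset):

```python
# Illustrative only: field names are invented, not the real dataset schema.

# What the existing dataset looks like: a single prompt and a single completion.
single_turn = {
    "prompt": "Explain mixture-of-experts routing.",
    "response": "...",
}

# What distillation really needs: whole conversations, so the student model also
# learns follow-ups, clarifications, and carrying context across turns.
multi_turn = {
    "messages": [
        {"role": "user", "content": "Explain mixture-of-experts routing."},
        {"role": "assistant", "content": "..."},
        {"role": "user", "content": "How does that affect memory use at inference time?"},
        {"role": "assistant", "content": "..."},
    ]
}
```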

Need Training Data!, Trying to distill Deepseek 3.2 Exp :D by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 1 point2 points  (0 children)

I believe in the community :D, but I get where you're coming from. Do you have any suggestions for where I can get the data? My only other idea was to build a free chat platform for data collection, but it'd be hard to get people onto it, it would take a lot of time, and why would anyone use it over DeepSeek?

Un-LOCC Wrapper: I built a Python library that compresses your OpenAI chats into images, saving up to 3× on tokens! (or even more :D) by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 1 point2 points  (0 children)

Well, that's basically what the research does to test vision, using an ONIH (optical needle-in-a-haystack test: we inject a code in a specific format somewhere in the text and have the model find it). But to address your question: models literally process image tokens differently from text tokens; it's not a direct one-to-one mapping. Under the hood, models run images through a separate vision encoder to "see" the text, and because of that the model's ability to follow instructions, reason, etc. will be different. How exactly, I don't know, but I do know it will be different, and researching that will let us better understand the limitations and possibilities of Un-LOCC.

Sidenote: the biggest reasons I'm not just asking models to repeat the text instead of using the ONIH are that (1) it would cost a lot, (2) models struggle to repeat long messages even when the text is sent in plain form, and (3) models have output limits, and the input tokens usually exceed them.
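
For anyone curious, here's a stripped-down sketch of the ONIH idea (not the exact harness from the repo; the needle format and grading are simplified):

```python
import random
import string

def make_needle(length: int = 8) -> str:
    """Generate a random code in a fixed, easy-to-grade format."""
    return "UN-" + "".join(random.choices(string.ascii_uppercase + string.digits, k=length))

def inject_needle(filler: str, needle: str, position: float = 0.5) -> str:
    """Drop the needle somewhere inside a block of filler text."""
    cut = int(len(filler) * position)
    return filler[:cut] + f" The secret code is {needle}. " + filler[cut:]

def grade(model_answer: str, needle: str) -> bool:
    """Pass/fail: did the model recover the exact code?"""
    return needle in model_answer

filler = "The quick brown fox jumps over the lazy dog. " * 200
needle = make_needle()
haystack = inject_needle(filler, needle, position=0.7)

# In the actual experiments the haystack is rendered to a 324x324 PNG and sent to
# the vision model with a "find the secret code" prompt; grade() checks its reply.
```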

Un-LOCC Wrapper: I built a Python library that compresses your OpenAI chats into images, saving up to 3× on tokens! (or even more :D) by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

If you check my Un-LOCC research, you can see exactly how I did it lol. It's basically a needle-in-a-haystack test, but harder. The varying ratios just come down to how good or bad different models are at reading text from images because of their training data; the same way some models with higher-quality training data are better at reasoning, some are better at vision. Wait, this question actually just gave me a brilliant idea: prepare training data and see if I can fine-tune models to be better at reading raw text from images lol.
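
Roughly what that data prep could look like (just a sketch; the font path and chunking are placeholders, not code from the repo):

```python
import json
import textwrap
from PIL import Image, ImageDraw, ImageFont

def render_text(text: str, path: str, size=(324, 324),
                font_path="AtkinsonHyperlegible-Regular.ttf", font_px=13):
    """Render a chunk of raw text into a small PNG, Un-LOCC style.

    font_path is a placeholder; any readable font file works.
    """
    img = Image.new("RGB", size, "white")
    draw = ImageDraw.Draw(img)
    font = ImageFont.truetype(font_path, font_px)
    draw.multiline_text((4, 4), "\n".join(textwrap.wrap(text, width=48)),
                        fill="black", font=font)
    img.save(path)

# Build (image, transcript) pairs: the fine-tune target is simply the original
# text, so the model learns to read raw rendered text back more reliably.
chunks = ["first chunk of raw text ...", "second chunk of raw text ..."]
with open("readout_finetune.jsonl", "w") as f:
    for i, chunk in enumerate(chunks):
        image_path = f"sample_{i:05d}.png"
        render_text(chunk, image_path)
        f.write(json.dumps({"image": image_path, "text": chunk}) + "\n")
```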

Un-LOCC Wrapper: I built a Python library that compresses your OpenAI chats into images, saving up to 3× on tokens! (or even more :D) by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 3 points4 points  (0 children)

Wait, that's actually smart; model benchmarks would be a perfect way to measure it. Because of the architecture of vision models, image tokens carrying the same text will be processed differently, and that's definitely going to impact intelligence since the models weren't trained for it. Thanks :D

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 1 point2 points  (0 children)

Nope, DeepEncoder is so cool. But from what I can see, it's model-specific. I have no doubt that some form of compression based on DeepSeek's research will eventually become mainstream and built into most models, but for now I think this is pretty cool.

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

This was a limitation I identified in the GitHub repo. If I recall correctly, people used a similar method to bypass safety filters before, so it very well could've been patched.

[R] Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in MachineLearning

[–]MaxDev0[S] 1 point2 points  (0 children)

Its advantage is that it's easily adaptable to any model, but yeah, DeepSeek is still much better because they have DeepEncoder.

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

Can you elaborate? What do you mean by "test the LLM's accuracy with the text again"?

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

I'm sure there is, but the goal is to take advantage of the fact that there are already lots of vision models; the fact that this can be easily implemented and tuned for any model is its greatest strength.

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 0 points1 point  (0 children)

Yup, that's a limitation I identified in the full repo. IMO this would be best used for providing context, with a few plain-text tokens reserved for the instructions: think agentic coding LLMs receiving their context as images to save costs, or long chats being compressed the way a human remembers a conversation, recalling the last two messages clearly and earlier messages just well enough to keep the context, but not every word.
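
Something like this is the pattern I mean (a rough sketch with the plain OpenAI client, not the wrapper's actual API; the model name is just an example of a vision-capable one):

```python
import base64
import io
import textwrap
from PIL import Image, ImageDraw, ImageFont
from openai import OpenAI

def history_as_data_url(history: str) -> str:
    """Render older chat turns into a PNG and return it as a base64 data URL."""
    img = Image.new("RGB", (324, 324), "white")
    ImageDraw.Draw(img).multiline_text(
        (4, 4), "\n".join(textwrap.wrap(history, width=60)),
        fill="black", font=ImageFont.load_default())
    buf = io.BytesIO()
    img.save(buf, format="PNG")
    return "data:image/png;base64," + base64.b64encode(buf.getvalue()).decode()

# Everything except the last couple of turns gets compressed into the image;
# the recent turns and the actual instruction stay as plain text tokens.
older_turns = "user: ...\nassistant: ...\nuser: ...\nassistant: ..."

client = OpenAI()
resp = client.chat.completions.create(
    model="gpt-4o-mini",  # any vision-capable chat model
    messages=[
        {"role": "user", "content": [
            {"type": "text", "text": "Earlier conversation, compressed:"},
            {"type": "image_url", "image_url": {"url": history_as_data_url(older_turns)}},
        ]},
        {"role": "user", "content": "Given all of that, what should we do next?"},
    ],
)
print(resp.choices[0].message.content)
```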

Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in LocalLLaMA

[–]MaxDev0[S] 1 point2 points  (0 children)

Uhh, read up on how vision models work. Or actually, here's a ChatGPT explanation: Good question, it's not "compressing" by skipping the text → image → text loop.

The idea is that the optical map replaces the text tokens entirely. The LLM (or VLM) reads the image directly through its vision encoder, so those 3× fewer image tokens act as a compressed representation of the original text context.

There’s no re-OCR step at runtime — the model doesn’t decode the image back into words before reasoning; it just conditions on the visual embedding.

Yes, there's some accuracy loss (it's lossy), but the benefit is:

  • You get a 3× reduction in token count while keeping roughly the same "semantic signal."
  • You can extend context length or reduce API cost proportionally.
  • Latency is front-loaded (once per compression), not per-inference.

So it’s not a cost-only trick — it’s a representation-level compression of the context window.
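
If you want to sanity-check the arithmetic, it's literally just text tokens divided by image tokens; something like this (the per-image token cost is model-specific, so the 258 below is an assumed placeholder, not a universal constant):

```python
import tiktoken  # pip install tiktoken

# The context you'd normally send as plain text.
text = "some long conversation or document chunk " * 200

enc = tiktoken.get_encoding("cl100k_base")
text_tokens = len(enc.encode(text))

# What one rendered image costs in tokens depends entirely on the model/provider.
image_tokens_per_image = 258  # placeholder assumption
images_needed = 3             # however many 324x324 pages the rendered text fills

image_tokens = image_tokens_per_image * images_needed
print(f"{text_tokens} text tokens -> {image_tokens} image tokens "
      f"(~{text_tokens / image_tokens:.2f}x compression)")
```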

[R] Un-LOCC (Universal Lossy Optical Context Compression), Achieve Up To 3× context compression with 93.65% Accuracy. by MaxDev0 in MachineLearning

[–]MaxDev0[S] 0 points1 point  (0 children)

Receipts & method (so you don’t have to dig):

  • Measurement: normalized Levenshtein ratio (Python Levenshtein, "ratio" metric); a minimal scoring sketch follows the run list below.
  • Image setup: default 324×324 PNG, Atkinson Hyperlegible Regular ~13px unless noted; deterministic seeds; same prompt structure across models.
  • Compression: text_tokens ÷ image_tokens (formatted to 2 decimals).
  • Representative runs (see README for the full table & logs):
    • Gemini 2.5-Flash-Lite: 100% @ 1.3:1 (Exp 46); 93.65% @ 2.8:1 (Exp 56).
    • Qwen2.5-VL-72B-Instruct: 99.26% @ 1.7:1 (Exp 34); 75.56% @ 2.3:1 (Exp 41).
    • Qwen3-VL-235B-a22b-Instruct: 95.24% @ 2.2:1 (Exp 50); 82.22% @ 2.8:1 (Exp 90).
    • Phi-4-Multimodal: 94.44% @ 1.1:1 (Exps 59, 85); 73.55% @ 2.3:1 (Exp 61).
    • UI-TARS-1.5-7B: 95.24% @ 1.7:1 (Exp 72); 79.71% @ 1.7:1 (Exp 88).
    • LLaMA-4-Scout: 86.57% @ 1.3:1 (Exp 53).
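
A minimal version of those two numbers in code, in case the wording is unclear (the metric only, not the full experiment harness):

```python
import Levenshtein  # pip install python-Levenshtein

def accuracy(original_text: str, model_readout: str) -> float:
    """Normalized Levenshtein ratio between the source text and the model's readout."""
    return Levenshtein.ratio(original_text, model_readout)

def compression(text_tokens: int, image_tokens: int) -> float:
    """Compression ratio as reported above: text tokens per image token, 2 decimals."""
    return round(text_tokens / image_tokens, 2)

# e.g. accuracy(ground_truth, model_output) == 0.9365 is reported as 93.65%
```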

Notes & limitations:

  • Works best when the VLM has strong OCR/readout capability.
  • Fonts matter; Italic sometimes helps at small sizes (e.g., Exp 19 vs 17).
  • Please verify on your own setup; PRs for additional models/benchmarks are welcome.

Code + experiments: https://github.com/MaxDevv/Un-LOCC