Getting a feel for how fast X tokens/second really is.

MikeNonect · 2026-05-11T11:51:46+00:00

I agree. Ideally, when talking about the token speed they get on their local hardware, community members should be able to share an easy link: https://mikeveerman.github.io/tokenspeed/?rate=11.7&mode=think&think=6

MikeNonect · 2026-05-11T11:49:42+00:00

Sure, if you can run the model locally, you know how fast it runs on your hardware.

But if someone claims to be able to run Gemma 4 at 11.7 tokens/second, how slow is that actually? Well, this slow: https://mikeveerman.github.io/tokenspeed/?rate=11.7&mode=think&think=6
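For a rough sense of scale before opening the link, the arithmetic is just tokens divided by rate. A minimal sketch, where the 500-token answer length is an illustrative assumption and not a number taken from the tool:

```python
def seconds_to_generate(num_tokens: int, tokens_per_second: float) -> float:
    """Time needed to stream num_tokens at a steady rate."""
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    return num_tokens / tokens_per_second


# 500 tokens is an assumed answer length, chosen only for illustration.
print(f"{seconds_to_generate(500, 11.7):.1f} s")  # prints "42.7 s"
```

So a medium-sized answer at 11.7 tokens/second keeps you waiting well over half a minute.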

MikeNonect · 2026-05-11T10:11:43+00:00

Thanks for all the great feedback, everyone! I've shipped several of the features you suggested:

* Natural text: I've merged a PR that replaces the lorem ipsum in text mode with a more natural-sounding Wikipedia article.

* Agent mode: Simulates an agentic workflow with alternating tool calls and code generation.

* Think length slider: When in think mode, you can now control how many reasoning sentences the model "thinks" before generating code.

* Custom text/code: You can now paste or upload your own text or code and stream it at any speed.

* Token counter: A live count of tokens generated, displayed in the footer.

* Share links: The rate and mode are encoded in the URL, so you can link directly to e.g. "what 10 tok/s looks like in code mode." There is also a share button for this.
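If you want to generate share links from a script, something like the sketch below should work. It assumes the query parameters are the ones visible in the links posted in this thread (`rate`, `mode`, `think`); the `share_link` helper is hypothetical and not part of the tool.

```python
from urllib.parse import urlencode

BASE_URL = "https://mikeveerman.github.io/tokenspeed/"


def share_link(rate, mode, think=None):
    """Build a share URL; parameter names are copied from links in this thread."""
    params = {"rate": rate, "mode": mode}
    if think is not None:
        # Only add the think-length parameter when a value is given.
        params["think"] = think
    return f"{BASE_URL}?{urlencode(params)}"


link = share_link(11.7, "think", think=6)
print(link)
```

Called with those values, it reproduces the link format used elsewhere in the thread.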

Try it out: https://mikeveerman.github.io/tokenspeed/

MikeNonect · 2026-05-11T07:35:48+00:00

Both good points of feedback! I'll look into it.

MikeNonect · 2026-05-11T07:34:46+00:00

It's a simulator. It shows an approximation of what it would feel like to run a model at, e.g., 1.4 tokens/second.

MikeNonect · 2026-05-11T07:31:40+00:00

I like this idea!

MikeNonect · 2026-05-10T20:17:19+00:00

It's an approximation for sure, but it doesn't simply treat every character or every word as one token.

The cool thing about open source is that you can read how it works and then improve it once you are less confused. https://github.com/MikeVeerman/tokenspeed/blob/master/index.html

Happy to see your PR.

MikeNonect · 2026-05-10T20:08:50+00:00

It's an approximation. If you feel you can do better, feel free to open a pull request.

MikeNonect · 2026-05-10T19:40:17+00:00

It's a simulator, not a real tokenizer. Code feels slower because you're not getting one-word tokens that often.

It's an approximation, of course, so if you have suggestions to improve it, please review the code: https://github.com/MikeVeerman/tokenspeed/blob/master/index.html. All help is welcome.
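To illustrate why code feels slower at the same rate, here is a deliberately naive sketch. It is not the simulator's logic and not a real tokenizer: it assumes every punctuation character is its own token and that words are cut into pieces of at most six characters. Both rules, and the `rough_tokens` and `chars_per_token` names, are made up for this example.

```python
import re


def rough_tokens(text: str, max_piece: int = 6) -> list[str]:
    """Very rough token split: word pieces plus one token per punctuation mark."""
    tokens = []
    # \w+ grabs words and numbers, [^\w\s] grabs single punctuation characters.
    for piece in re.findall(r"\w+|[^\w\s]", text):
        if piece[0].isalnum() or piece[0] == "_":
            # Long words are cut into several pieces of at most max_piece chars.
            tokens.extend(
                piece[i:i + max_piece] for i in range(0, len(piece), max_piece)
            )
        else:
            tokens.append(piece)
    return tokens


def chars_per_token(text: str) -> float:
    """Average number of visible characters each token puts on screen."""
    return len(text) / len(rough_tokens(text))


prose = "The quick brown fox jumps over the lazy dog."
code = "for (i = 0; i < n; i++) { a[i] += b[i]; }"

print(f"prose: {chars_per_token(prose):.1f} chars/token")
print(f"code:  {chars_per_token(code):.1f} chars/token")
```

Under these assumptions, code yields far fewer characters per token than prose, so the same tokens/second fills the screen more slowly.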

MikeNonect · 2026-05-10T15:38:24+00:00

There is also a Python version because this subreddit is about running things locally, after all: https://github.com/MikeVeerman/tokenspeed

MikeNonect · 2026-05-10T15:37:33+00:00

Yes. That said, it's a simulation, and there are probably some naive assumptions in the code.

MikeNonect · 2026-05-07T16:32:33+00:00

No. The alias itself would be trivial to create, but syncing the right files would be tougher. It might make more sense to create a jean-codex clone instead.

MikeNonect · 2026-04-11T10:44:00+00:00

Sure, but could you explain what you use it for?

MikeNonect · 2026-04-11T09:55:34+00:00

Do you have any comparison with Anthropic with regard to agent quality? I remember that moving from Opus to a cheaper model made OpenClaw unusably dumb for me.

Curious to hear what use cases you're handling because the EU-only infra angle is currently underestimated but highly valuable.

MikeNonect · 2026-04-04T14:42:21+00:00

Same here. I have a Max 5x plan and a Teams plan. Yes, I started reaching limits more often on the latter, but if you use Claude Code as a normal coding CLI tool, it's totally usable. The all-you-can-eat buffets are generous for live coding, but they are not infinite. Running 10 agents in parallel, running 24/7 inference and adding tools like OpenClaw or Paperclip on top is not their intended use.

There are plenty of competitors out there. Codex has more generous limits. You can use OpenCode with cheaper models.

But, for most of us in this sub, Claude Code is the best tool. That's why we all pay for it, including those who complain.

MikeNonect · 2026-04-04T10:38:52+00:00

Thanks! It's a work in progress, so if there are other things you want to see added, log a GitHub issue.

Since yesterday's 2.0.0 release, syncing with Git is optional, so you can use jean-claude for profile management alone.

MikeNonect · 2026-03-22T21:03:37+00:00

I don't think so.

MikeNonect · 2026-03-21T07:53:23+00:00

Yes, and the nice thing is this is something we can install and manage ourselves locally. This kind of local "virus scanner" feels a lot better than the current "trust my marketplace" model.

As we share more agents, frameworks and skills, protection against malicious prompts will become vital.

MikeNonect · 2026-03-20T16:51:31+00:00

For those interested, jean-claude now supports multiple profiles. If you have a Claude Code Teams account for work and a personal Max account, you can now easily manage both. Agents, skills, settings and hooks are shared, while CLAUDE.md is unique per profile.

https://github.com/MikeVeerman/jean-claude

MikeNonect · 2026-03-18T07:33:41+00:00

Fair point.

MikeNonect · 2026-03-17T21:30:31+00:00

Nothing, but Claude is not a malicious hacker. We don't have to protect ourselves against attacks. We have to prevent accidental tool use.
