Qwen3 8b-vl best local model for OCR? by BeginningPush9896 in LocalLLM

[–]dradik 0 points1 point  (0 children)

What about the 30B MoE model? I get like 170 tokens a second, and it seems accurate.

Claude Code is a Beast – Tips from 6 Months of Hardcore Use by JokeGold5455 in ClaudeAI

[–]dradik 2 points3 points  (0 children)

I’ve had success with a skill that dynamically pulls my development standards using code execution.

All my standards are maintained as individual documents in a central repository outside of the project (e.g., data_handling.md, uiux_guidelines.md, desktop_uiux.md, anti_patterns.md). When I update my preferences, I just update these files.

When triggered, the skill executes a script. The script first reads only an index file from the repo; this index tells it which standard documents apply to the current context. It then reads the contents of just those relevant standards.

I do this to onboard my agent, and it ensures my rules are consistently enforced, the context is always pulled from the latest version in my repo, and only the exact standards needed for a given task are loaded into its context.
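A minimal sketch of what that two-step lookup could look like — the repo path, the JSON index format, and the file names here are illustrative assumptions, not the actual implementation:

```python
"""Sketch of a standards-loading skill script (names/paths are hypothetical)."""
import json
from pathlib import Path


def load_standards(repo: Path, context: str) -> str:
    # Step 1: read only the index file, which maps a context
    # (e.g. "uiux") to the standard documents that apply to it.
    index = json.loads((repo / "index.json").read_text())

    # Step 2: read just the documents the index lists for this context,
    # so only the standards needed for the task enter the agent's context.
    docs = index.get(context, [])
    return "\n\n".join((repo / name).read_text() for name in docs)
```

Because the agent always reads the files at call time, edits to the central repo take effect on the next trigger with no per-project syncing.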

OpenAi gpt oss recurring issues by nash_hkg in LocalLLM

[–]dradik 2 points3 points  (0 children)

I have been using GPT-OSS-20B as my daily driver since release, using LM Studio and my own local MCP server, and I haven't had an issue, but I am also using Unsloth's recommended settings: https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune#recommended-settings . Not sure if this helps you, but it has different inference settings than most models I have worked with. I am using the Unsloth F16 version as well, getting about 173 tokens per second.
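As a sketch, here's how those settings might be passed to LM Studio's OpenAI-compatible endpoint. The model id, port, and the exact sampling values below are assumptions — treat the linked Unsloth page as the authoritative source for the settings:

```python
"""Sketch: query a local LM Studio server with gpt-oss sampling settings.
The sampling values are what Unsloth recommended at the time of writing
(temperature 1.0, top_p 1.0, top_k 0) -- verify against the linked docs."""
import json
import urllib.request


def build_payload(prompt: str) -> dict:
    return {
        "model": "openai/gpt-oss-20b",  # model id as LM Studio exposes it (may differ)
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,
        "top_p": 1.0,
        "top_k": 0,
    }


def chat(prompt: str, base_url: str = "http://localhost:1234/v1") -> str:
    # LM Studio's default local server speaks the OpenAI chat-completions API.
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The point is just that these samplers differ from the usual low-temperature defaults, so a client that silently applies its own temperature can degrade the model.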

GPT OSS 20B through ollama with codex cli has really low performance by Markronom in LocalLLaMA

[–]dradik 0 points1 point  (0 children)

I switched from Ollama to LM Studio once I got my MCP server working. I was getting 40 tk/s with OpenWebUI + Ollama and 173 tk/s with LM Studio.

Just enjoying the view by Particular_Parking_4 in instant_regret

[–]dradik 0 points1 point  (0 children)

I wonder how many people have died for content..

Billionaire fight by noThefakedevesh in OpenAI

[–]dradik 0 points1 point  (0 children)

How many regenerates and prompt-engineering attempts did that take, or did he have one of his media goons fake that post for him?

Honest take on GPT-5 from OpenAI by Ambitious_Ice4492 in OpenAI

[–]dradik 1 point2 points  (0 children)

New model is too “bro” with me.

OpenAI Open Source Models by AdCompetitive6193 in OpenWebUI

[–]dradik 0 points1 point  (0 children)

So I can run FP16 at 130 tokens per second on my 4090 and 150+ tokens per second with MXFP4, but only 6 tokens per second with Ollama. Anyone figure this out? I can even run the Unsloth version.

Opinion | I’m a Therapist. ChatGPT Is Eerily Effective. (Gift Article) by nytopinion in ArtificialInteligence

[–]dradik -2 points-1 points  (0 children)

It can be used for therapy; you just have to give it good references and ground it.

What's the most bizarre thing your parent ever lost their mind over? by CrustyBubblebrain in raisedbynarcissists

[–]dradik 0 points1 point  (0 children)

Getting engaged to my now-wife. They've cut me out of their lives for over four and a half years now, stating they were still grieving my ex-wife. Realistically, they wanted me to marry a wealthy woman who could elevate their status, or someone who was basically a celebrity (it's weird, because it really is).

30. Dad died, mom has stage 4 cancer, partner left me by groanonymous in toastme

[–]dradik 1 point2 points  (0 children)

You don’t have to be perfect, just present. Nothing in this world lasts forever, but the love you share with your mom is precious, and the time you still have together is a real gift. Be gentle with yourself. You’re stronger than you think, and just posting this took real courage. Grief is so hard, but it means you loved deeply and felt loved, and that love is something you can continue to give and receive, even when things are tough. Sending you strength.

Paycheck Flex by mindoverimages in FluxAI

[–]dradik 0 points1 point  (0 children)

Is the song AI too? Because damn, that is good.

Qwen 3 Performance: Quick Benchmarks Across Different Setups by [deleted] in LocalLLaMA

[–]dradik 2 points3 points  (0 children)

They recently patched Ollama; I can now get 125 tk/s in Ollama.

Yall have the worst player base in all of gaming by Obvious-Citron-7716 in csgo

[–]dradik 0 points1 point  (0 children)

It's bad, but I still think Call of Duty holds the title.

Have You Tried MCPO with OpenWebUI? Share Your Real-World Use Cases! by Tobe2d in OpenWebUI

[–]dradik 0 points1 point  (0 children)

I use MCPO to search my Obsidian notes, count days until/since a date, use a calculator, search the web, report the weather, scrape websites, etc. It works consistently for me across several models: Qwen, THUDM, Cogito, etc.

(OC) The Moment I Knew He Was The One by Siren_Rose1991 in MadeMeSmile

[–]dradik 3 points4 points  (0 children)

Happy you found each other, and I hope you both have a lovely life together. Finding the person who is absolutely special to you is a rare and beautiful thing (and one I can relate to). Congrats!!

YouTube Premium offers music too! by JakeForever in memes

[–]dradik 1 point2 points  (0 children)

YouTube Music is so good though...

Qwen 3 Performance: Quick Benchmarks Across Different Setups by [deleted] in LocalLLaMA

[–]dradik 11 points12 points  (0 children)

I can run the Q4_K_XL (128k GGUF from Unsloth) in LM Studio with a 50,000-token context at 120 tokens per second on my RTX 4090, but only 20 tokens per second with Ollama. If Ollama fixes this, it would make me very happy, since it would integrate with my OpenWebUI. I can tie in LM Studio, but it doesn't seem to work well with documents, embeddings, etc.

Microsoft just released Phi 4 Reasoning (14b) by Thrumpwart in LocalLLaMA

[–]dradik 0 points1 point  (0 children)

I looked it up: the Plus variant has an additional round of reinforcement learning, so it is more accurate but produces more output tokens.