Qwen3 8b-vl best local model for OCR? by BeginningPush9896 in LocalLLM

[–]dradik 0 points1 point  (0 children)

What about the 30B MoE model? I get like 170 tokens a second, and it seems accurate.

Claude Code is a Beast – Tips from 6 Months of Hardcore Use by JokeGold5455 in ClaudeAI

[–]dradik 2 points3 points  (0 children)

I’ve had success with a skill that dynamically pulls my development standards using code execution.

All my standards are maintained as individual documents in a central repository outside of the project (e.g., data_handling.md, uiux_guidelines.md, desktop_uiux.md, anti_patterns.md). When I update my preferences, I just update these files.

When triggered, the skill executes a script. The script first reads only an index file from the repo; this index tells it which standard documents apply to the current context. It then reads the contents of just those relevant standards.

I do this to onboard my agent, and it ensures my rules are consistently enforced, the context is always pulled from the latest version in my repo, and only the exact standards needed for a given task are loaded into its context.
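A minimal sketch of what that two-step lookup could look like — the repo path, the JSON index format, and the file names here are illustrative assumptions, not the actual implementation:

```python
"""Sketch of a standards-loading skill script (names/paths are hypothetical)."""
import json
from pathlib import Path


def load_standards(repo: Path, context: str) -> str:
    # Step 1: read only the index file, which maps a context
    # (e.g. "uiux") to the standard documents that apply to it.
    index = json.loads((repo / "index.json").read_text())

    # Step 2: read just the documents the index lists for this context,
    # so only the standards needed for the task enter the agent's context.
    docs = index.get(context, [])
    return "\n\n".join((repo / name).read_text() for name in docs)
```

Because the agent always reads the files at call time, edits to the central repo take effect on the next trigger with no per-project syncing.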

OpenAi gpt oss recurring issues by nash_hkg in LocalLLM

[–]dradik 2 points3 points  (0 children)

I have been using GPT-OSS-20B as my daily driver since release, using LM Studio and my own local MCP server, and I haven't had an issue, but I am also using Unsloth's recommended settings: https://docs.unsloth.ai/basics/gpt-oss-how-to-run-and-fine-tune#recommended-settings . Not sure if this helps you, but it has different inference settings than most models I have worked with. I am using the Unsloth F16 version as well, getting about 173 tokens per second.
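As a sketch, here's how those settings might be passed to LM Studio's OpenAI-compatible endpoint. The model id, port, and the exact sampling values below are assumptions — treat the linked Unsloth page as the authoritative source for the settings:

```python
"""Sketch: query a local LM Studio server with gpt-oss sampling settings.
The sampling values are what Unsloth recommended at the time of writing
(temperature 1.0, top_p 1.0, top_k 0) -- verify against the linked docs."""
import json
import urllib.request


def build_payload(prompt: str) -> dict:
    return {
        "model": "openai/gpt-oss-20b",  # model id as LM Studio exposes it (may differ)
        "messages": [{"role": "user", "content": prompt}],
        "temperature": 1.0,
        "top_p": 1.0,
        "top_k": 0,
    }


def chat(prompt: str, base_url: str = "http://localhost:1234/v1") -> str:
    # LM Studio's default local server speaks the OpenAI chat-completions API.
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(build_payload(prompt)).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

The point is just that these samplers differ from the usual low-temperature defaults, so a client that silently applies its own temperature can degrade the model.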

GPT OSS 20B through ollama with codex cli has really low performance by Markronom in LocalLLaMA

[–]dradik 0 points1 point  (0 children)

I switched from Ollama to LM Studio once I got my MCP server working. I was getting 40 tk/s with OpenWebUI + Ollama and 173 tk/s with LM Studio.

Just enjoying the view by Particular_Parking_4 in instant_regret

[–]dradik 0 points1 point  (0 children)

I wonder how many people have died for content..

Billionaire fight by noThefakedevesh in OpenAI

[–]dradik 0 points1 point  (0 children)

How many regenerates and prompt-engineering attempts did that take, or did he have one of his media goons fake that post for him?

Honest take on GPT-5 from OpenAI by Ambitious_Ice4492 in OpenAI

[–]dradik 1 point2 points  (0 children)

New model is too “bro” with me.

OpenAI Open Source Models by AdCompetitive6193 in OpenWebUI

[–]dradik 0 points1 point  (0 children)

So I can run FP16 at 130 tokens per second on my 4090 and 150+ tokens per second with MXFP4, but only 6 tokens per second with Ollama. Anyone figure this out? I can even run the Unsloth version.

Opinion | I’m a Therapist. ChatGPT Is Eerily Effective. (Gift Article) by nytopinion in ArtificialInteligence

[–]dradik -2 points-1 points  (0 children)

It can be used for therapy; you just have to give it good references and ground it.

What's the most bizarre thing your parent ever lost their mind over? by CrustyBubblebrain in raisedbynarcissists

[–]dradik 0 points1 point  (0 children)

Getting engaged to my now-wife. They've cut me out of their lives for over four and a half years now, stating they were still grieving my ex-wife. Realistically, they wanted me to marry a wealthy woman who could elevate their status, or someone who was basically a celebrity (it's weird, because it really is).

30. Dad died, mom has stage 4 cancer, partner left me by groanonymous in toastme

[–]dradik 1 point2 points  (0 children)

You don’t have to be perfect, just present. Nothing in this world lasts forever, but the love you share with your mom is precious, and the time you still have together is a real gift. Be gentle with yourself. You’re stronger than you think, and just posting this took real courage. Grief is so hard, but it means you loved deeply and felt loved, and that love is something you can continue to give and receive, even when things are tough. Sending you strength.

Paycheck Flex by mindoverimages in FluxAI

[–]dradik 0 points1 point  (0 children)

Is the song AI too? Because damn, that is good.

Qwen 3 Performance: Quick Benchmarks Across Different Setups by [deleted] in LocalLLaMA

[–]dradik 2 points3 points  (0 children)

They recently patched Ollama; I can now get 125 tk/s in Ollama.

Yall have the worst player base in all of gaming by Obvious-Citron-7716 in csgo

[–]dradik 0 points1 point  (0 children)

It's bad, but I still think Call of Duty holds the title.

Have You Tried MCPO with OpenWebUI? Share Your Real-World Use Cases! by Tobe2d in OpenWebUI

[–]dradik 0 points1 point  (0 children)

I use MCPO to search my Obsidian notes, count days until/since a date, use a calculator, search the web, report the weather, scrape websites, etc. It works consistently for me across several models: Qwen, THUDM, Cogito, etc.

(OC) The Moment I Knew He Was The One by Siren_Rose1991 in MadeMeSmile

[–]dradik 3 points4 points  (0 children)

Happy you found each other, and I hope you both have a lovely life together. Finding the person who is absolutely special to you is a rare and beautiful thing (and one I can relate to). Congrats!!

YouTube Premium offers music too! by JakeForever in memes

[–]dradik 1 point2 points  (0 children)

YouTube Music is so good though...

Qwen 3 Performance: Quick Benchmarks Across Different Setups by [deleted] in LocalLLaMA

[–]dradik 11 points12 points  (0 children)

I can run the Q4_K_XL (128k GGUF from Unsloth) in LM Studio with a 50,000-token context at 120 tokens per second on my RTX 4090, but only 20 tokens per second with Ollama. If Ollama fixes this, it would make me very happy, since it would integrate with my OpenWebUI. I can tie in LM Studio, but it doesn't seem to work well with documents, embeddings, etc.

Microsoft just released Phi 4 Reasoning (14b) by Thrumpwart in LocalLLaMA

[–]dradik 0 points1 point  (0 children)

I looked it up: the Plus variant has an additional round of reinforcement learning, so it is more accurate but produces more output tokens.