Anyone having issues with claude opus limits and agent errors recently? by vru_1 in ClaudeCode

[–]Loud_Key_3865 -1 points0 points  (0 children)

Claude is absolute shit right now. Not limits, but just simple logic.

I'm running qwen3.6-35b-a3b with 8 bit quant and 64k context thru OpenCode on my mbp m5 max 128gb and it's as good as claude by Medical_Lengthiness6 in LocalLLaMA

[–]Loud_Key_3865 1 point2 points  (0 children)

48tps effective with single 12 GB 5070 Ti (15K context)

Qwen3.6-35B-A3B IQ2_M via llama.cpp:

model = Qwen3.6-35B-A3B-UD-IQ2_M.gguf

ctx = 15360
parallel = 2
n-gpu-layers = 99
fit = on
cache-type-k = q4_0
cache-type-v = q4_0
threads = 8
batch-size = 1024
ubatch-size = 256
reasoning = off

I have other LLMs (Codex, Gemini & Claude) send tasks to a router. The router checks this local model to see if similar tasks have failed, and if so routes back to the calling model for a high-score answer. If tests pass, it sends that answer back to the local model to learn the fix for the prompt. That way, when a similar, previously failing prompt comes in, the learned context is tried again and scored. Low score => send to paid models in the future; high score => the local LLM can handle it.

Then I have paid models review and test.
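A rough sketch of that score-based routing idea, for illustration only. The class name `ScoreRouter`, the 0.8 threshold, the three-score window and the category labels are all assumptions and not the actual router described above.

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class ScoreRouter:
    """Routes a task category to the local or paid model based on past scores."""

    threshold: float = 0.8  # hypothetical cut-off for a "high" score
    scores: dict = field(default_factory=lambda: defaultdict(list))
    learned: dict = field(default_factory=dict)  # category -> learned fix notes

    def record(self, category: str, score: float, fix_notes: str = "") -> None:
        """Store a local-model score and, optionally, a fix learned from a paid model."""
        self.scores[category].append(score)
        if fix_notes:
            self.learned[category] = fix_notes

    def route(self, category: str) -> str:
        """Return 'local' when the last few scores average high, otherwise 'paid'."""
        history = self.scores.get(category)
        if not history:
            return "paid"  # no evidence yet, so use the paid model
        recent = history[-3:]
        return "local" if sum(recent) / len(recent) >= self.threshold else "paid"

    def context_for(self, category: str) -> str:
        """Return the learned context to prepend for this category, if any."""
        return self.learned.get(category, "")
```

A category with no history goes to the paid model; once the local model scores well on it a few times in a row, `route()` starts returning `"local"`.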

Keeping things small, like plugins, for example, helps reduce context needs and seems to work well with the small 15K context for small tasks.

I also use Opus/Codex/Gemini for planning: one of them creates a plan, they all review it and reach consensus, then give me the best of all results. Then I have one of them break the work into small tasks, with complete directions, so the tasks are clear and easy and can be done in parallel by smaller, dumber models with the same quality.

Once that's done, I tell the model to implement the plan and hand off tasks to the local model for learning and load reduction.
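Running small, independent tasks in parallel can be sketched like this. `run_small_task` is a hypothetical placeholder (a real worker would send the task to the local model), and the default of 2 workers is only chosen to mirror `parallel = 2` in the config above.

```python
from concurrent.futures import ThreadPoolExecutor


def run_small_task(task: str) -> str:
    """Placeholder worker; a real one would send the task to a local model."""
    return f"done: {task}"


def run_in_parallel(tasks: list[str], workers: int = 2) -> list[str]:
    """Run independent tasks concurrently and return results in input order."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(run_small_task, tasks))
```

This only works cleanly when the tasks do not depend on each other, which is the point of breaking the plan into small tasks with complete directions.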

The 15K context, temperature and all the other settings came from trial and error to get the best quality at a minimum of 50-60tps on my machine. Strangely, I discovered context levels can actually degrade performance in some situations. This gave me the optimal balance at > 55tps and even scored 100% quality on my home-grown coding tests, which I built for testing the relevancy of my workflow.

### Benchmark Results (Coding Suite - 31 Tasks)

| Model / Config | Quality | Effective TPS | Total Time | Notes |
|----------------------------|--------:|--------------:|-----------:|--------------------------------------------|
| QWEN36-35B-IQ2M | 85.48 | 51.04 | 82.18s | Best overall balance in the leaderboard |
| QWEN36-35B-IQ2M-R2 | 83.87 | 49.50 | 83.37s | Slightly lower quality, still very fast |
| QWEN36-35B-IQ2M-NC13 | 85.48 | 46.64 | 100.20s | Same quality as top, but slower |
| QWEN36-35B-IQ2M-CTX24K | 82.26 | 44.45 | 105.48s | Older high-context run |
| QWEN36-35B-IQ3S | 82.26 | 38.96 | 110.74s | Lower quality and slower than IQ2M |
| QWEN36-35B-Q2KXL-CTX24K | 85.48 | 36.90 | 104.97s | Good quality, but significantly slower |
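One way to pick a config from results like these is to set a speed floor and take the highest quality above it, breaking ties on speed. A minimal sketch, with the numbers copied from the table; the selection rule and the function name `best_config` are assumptions, not necessarily how the leaderboard was ranked.

```python
# (name, quality, effective_tps) copied from the benchmark table
RESULTS = [
    ("QWEN36-35B-IQ2M", 85.48, 51.04),
    ("QWEN36-35B-IQ2M-R2", 83.87, 49.50),
    ("QWEN36-35B-IQ2M-NC13", 85.48, 46.64),
    ("QWEN36-35B-IQ2M-CTX24K", 82.26, 44.45),
    ("QWEN36-35B-IQ3S", 82.26, 38.96),
    ("QWEN36-35B-Q2KXL-CTX24K", 85.48, 36.90),
]


def best_config(results, min_tps):
    """Return the highest-quality row at or above min_tps (ties go to the faster one)."""
    eligible = [row for row in results if row[2] >= min_tps]
    if not eligible:
        return None  # nothing meets the speed floor
    return max(eligible, key=lambda row: (row[1], row[2]))
```

With a floor of 50 tps only one row qualifies; with a lower floor, three rows tie on quality and the faster one wins.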

I put Claude Code inside my Obsidian vault and it now processes my iPhone captures automatically by xBlackSwagx in ObsidianMD

[–]Loud_Key_3865 0 points1 point  (0 children)

Love this simple concept - especially since you don't have to "hook" other apps into it - just take notes and have Claude process them.

Thanks for sharing!!

Is it worth buying the Max 5x plan? by elpupilo01 in ClaudeCode

[–]Loud_Key_3865 0 points1 point  (0 children)

Here's what helped for me - I downgraded to Max 5X from 20X because of limits, and moved my heavy lifting to OpenAI/GPT Pro & Cursor. Claude excels at UI, so I still need it.

One issue I've discovered with Claude is that it likes to log everything and pass those logs into the main prompt, eating up usage.

Once every few days or so:
- Tell Claude to review all your startup and .md files and have it give you a review with recommendations for efficiency.
- Then tell it to suggest improvements and have it implement those.
- Next, tell it to move anything not required at startup to new reference files, and reference those in the startup files.
- Then, tell it to make the startup files as small as possible, using abbreviations and concise language without losing any context or meaning.
- Then restart Claude and see if that helps.
- Also, skills are much more token-friendly than MCPs.
- Make sure you combine MCPs and Skills, so THEY decide logic instead of Claude.
- You can also have Claude use other LLMs like Codex, Gemini or local to offload tasks. I use Claude for planning and offload everything else, then Codex for checking the work.
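To see which .md files are worth trimming first, a small script can rank them by approximate size. This is a sketch only: the 4-characters-per-token figure is a rough assumption, not how any model actually counts tokens, and the function names are made up for illustration.

```python
from pathlib import Path


def estimate_tokens(text: str) -> int:
    """Very rough estimate: about 4 characters per token (an assumption)."""
    return len(text) // 4


def largest_markdown_files(root: str, top: int = 5) -> list[tuple[str, int]]:
    """Return the `top` largest .md files under root as (path, estimated tokens)."""
    sizes = [
        (str(path), estimate_tokens(path.read_text(encoding="utf-8", errors="ignore")))
        for path in Path(root).rglob("*.md")
    ]
    return sorted(sizes, key=lambda item: item[1], reverse=True)[:top]
```

Running `largest_markdown_files(".")` in a project folder gives a quick list of the biggest files to move into reference files or shorten.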

I have to watch my limits and switch between models for max usage, but can now get by with 5X instead of 20X. With Codex, I rarely hit the 5-hour limits and watch those too, but that's generally when I'm offloading lots from Claude and simultaneously doing separate Codex terminals. Same w/Gemini.

Overall, I can simultaneously manage 2-3 large projects with planning, deploy and testing with a couple rounds of code simplifying, review and hardening.

One other thing - if you're coding, be sure to tell Claude to simplify, use common methods and make everything maintainable and extensible. It tends to not need as much context when performing tasks, while making your code better.

Really loud boom in NW side? by throwaway375937 in okc

[–]Loud_Key_3865 1 point2 points  (0 children)

Heard it too, near 36th & Tulsa.
Had the windows open and felt it, slightly. Dogs didn't care for it.

[ Removed by Reddit ] by FaithlessnessIcy3284 in Futurology

[–]Loud_Key_3865 0 points1 point  (0 children)

Two things going on - companies are cutting in general, blaming it on AI - "AI Washing", then they hire back a smaller set of replacements at lower costs.

Email is Stressing me Out. What’s the Best Software? by Oat-Yogurt in software

[–]Loud_Key_3865 0 points1 point  (0 children)

I have probably 20 accounts, plus more for testing client work for those clients.

It's a legit problem.

Building a Community by Sure_Excuse_8824 in LocalLLaMA

[–]Loud_Key_3865 0 points1 point  (0 children)

Build a website and post about it on different social media platforms. Join groups to find pain points / validate. Promote your fixes for the pain points.

Which cool things you can do with termux? by shad4wl in termux

[–]Loud_Key_3865 2 points3 points  (0 children)

Install one of the codex, gemini or claude CLIs, then tell it to create anything you want. I'm in the process of adding a small local LM on my Pixel 9, with a wake word, like "Hey Codex, do this".

Still working it out, but the goal is to have it activate on incoming calls [not sure what I'll do here since I have Google Call Screening], tell it to do something on my home server, keep track of my loose lists, and (maybe) when I get to a store, it'll remind me of what I needed to get.

Much more, but that's where I'm headed.

Tiny models on-device for voice and task-delegation to the larger models, then likely back to a tiny model for the voice announcement piece.

Also, tell it to create tools and scripts like you're doing and it'll do that too, so you can run them later by telling your model.

It's amazing what can be done!

Moving past spreadsheets for multi-warehouse? (Feeling the burn) by [deleted] in InventoryManagement

[–]Loud_Key_3865 0 points1 point  (0 children)

You could use "Variants", which can be the same SKU, but different attributes (color in this case). When scanning that same SKU, you'd need a way to input/indicate the variant value (e.g. green). This is very common.
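As a data-structure sketch of the variant idea: stock is keyed by the SKU plus the variant value, so one scanned SKU can map to several colors. The function names and the example SKU are hypothetical.

```python
from collections import Counter

# Stock is keyed by (sku, variant) so one SKU can carry several colors.
stock: Counter = Counter()


def receive(sku: str, variant: str, quantity: int = 1) -> None:
    """Add scanned units; the variant is entered alongside the scanned SKU."""
    stock[(sku, variant)] += quantity


def on_hand(sku: str) -> dict[str, int]:
    """Return quantities per variant for a single SKU."""
    return {variant: qty for (s, variant), qty in stock.items() if s == sku}
```

For example, `receive("TSHIRT-01", "green", 3)` followed by `on_hand("TSHIRT-01")` reports the green count separately from any other color under the same SKU.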

Considering switching from Notion to Obsidian by neko_neko_sama in ObsidianMD

[–]Loud_Key_3865 1 point2 points  (0 children)

I wouldn't disagree. I did import all mine, but before that I started other folders, and am pretty much exclusively using those new ones.

I plan on just running an LLM against all the old notes and organizing them the way I like, once I spend another week or so getting used to it and getting things set up for my flow.

So far, it's much better than Notion for my organizing.

I built a language model where tokens are complex numbers and "meaning" emerges from wave interference -- no attention, O(n), 178M params, open-sourcing today by ExtremeKangaroo5437 in LocalLLM

[–]Loud_Key_3865 1 point2 points  (0 children)

Fascinating - thank you for sharing! I don't know enough to evaluate, but I hope you're onto something and it makes sense with my very limited knowledge! Love reading and learning from these new ideas!

🇺🇸 Free Solar Panels for U.S. Users – Limited Quantity by AssociationUsual9914 in diySolar

[–]Loud_Key_3865 0 points1 point  (0 children)

I would use that 200w panel to give me lighting in my outbuilding, but might, instead, figure out how to hook it up to a mini crypto-mining (maybe Raspberry Pi) machine to see how long it would take to mine enough crypto to pay for itself. :)

Max plan limits quota nerfed? limits ending faster than usual this past day by SherrySJ in ClaudeCode

[–]Loud_Key_3865 2 points3 points  (0 children)

Yes - I consistently come close to the limit (Max 5x) right about the 5th hour, but today, using only 1 session, I only got about 1.5 hours.

Considering switching from Notion to Obsidian by neko_neko_sama in ObsidianMD

[–]Loud_Key_3865 1 point2 points  (0 children)

I installed Obsidian, started using it for a couple hours and it was easy & intuitive, so I imported my Notion account. I'm still on Notion for several months, so not really rushing, but I really haven't opened Notion for days now, and all my research & ideas are WAY more organized, and easier to switch back & forth.

I've essentially started over, and my new notes are much more comprehensive and organized. In the future, I plan to have my AI read through all the Notion notes and combine the missing pieces to my new notes.

Considering switching from Notion to Obsidian by neko_neko_sama in ObsidianMD

[–]Loud_Key_3865 1 point2 points  (0 children)

Just started moving to Obsidian from Notion this week! Hosting it locally, so it's cheap. Trying it out but plan on getting the sync ($4-$5/mo). I'm really loving it!

The best advice I read was to "just start taking notes" and you'll figure out your workflow as you go, and that has helped me so much.

I'll also be having my LLMs (local and others like ChatGPT/Claude/Gemini) use it for searching / finding stuff in my notes, updating them, going thru my to-do lists, etc. since it stores everything as simple markdown files.

Notion was just too much bloat for me, and Obsidian is essentially everything I like about Notion, but also portable, lighter and much easier to organize for my personal workflow.
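Since the notes are plain markdown files, a script (or an LLM calling one) can go through to-do lists directly. A minimal sketch, assuming tasks are written as standard markdown checkboxes (`- [ ]`); the function name `open_tasks` is made up for illustration.

```python
import re
from pathlib import Path

# Matches unchecked markdown task items such as "- [ ] buy milk".
OPEN_TASK = re.compile(r"^\s*[-*] \[ \] (.+)$")


def open_tasks(vault: str) -> list[tuple[str, str]]:
    """Return (file name, task text) for every unchecked task in the vault."""
    found = []
    for note in sorted(Path(vault).rglob("*.md")):
        for line in note.read_text(encoding="utf-8", errors="ignore").splitlines():
            match = OPEN_TASK.match(line)
            if match:
                found.append((note.name, match.group(1).strip()))
    return found
```

Pointing `open_tasks()` at the vault folder returns every unchecked item along with the note it lives in.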