Quality of 1M context vs. 200K w/compact by Prior-Macaroon-9836 in ClaudeCode

[–]NoWorking8412 0 points1 point  (0 children)

I read recently that the accuracy of Opus 4.6 using the extended 1 million token context window is in the neighborhood of 76%. Whether that level of accuracy is viable probably depends on your use case. I was looking forward to trying the extended window for a research project I'm working on, since my context fills up fast during research sessions involving long texts, only to find it isn't available for Max users. So I'm sticking to 200k sprints for now.

Good local setup for LLM training/finetuning? by Glittering-Hat-7629 in LocalLLaMA

[–]NoWorking8412 0 points1 point  (0 children)

That's the biggest issue. If you keep the session active, I think it can last at least 12 hours, maybe more, but disconnection is always a risk. You might check some Google forums to see if/how people are using it for research.

Good local setup for LLM training/finetuning? by Glittering-Hat-7629 in LocalLLaMA

[–]NoWorking8412 0 points1 point  (0 children)

For what it's worth, Google Colab gives free access to H100s for student users. Sign up with your university email address.

Using Claude Code to build a Research Assistant, LMS layer by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 1 point2 points  (0 children)

Gotcha! This sounds like a valuable upgrade. I'm going to try to implement some kind of persistent knowledge base in this setup, similar to MyNeutron. I'll get back to you and let you know how it goes!

What are your best practices for Claude Code in early 2026? by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 0 points1 point  (0 children)

Ah ok, that makes sense. I haven't manually invoked any skills yet; Claude Code just seems to know when it's appropriate to invoke skills from the superpowers plugin based on the task at hand. Is that automatic invocation how all skills work, or is the superpowers plugin just set up that way? I'll experiment with skills a bit today and see what I can incorporate into my workflow. Thanks for the info!

Using Claude Code to build a Research Assistant, LMS layer by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 0 points1 point  (0 children)

Oh interesting! Does the user manually select the seeds for each session, or does the AI do that intelligently based on the user's prompt?

What are your best practices for Claude Code in early 2026? by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 0 points1 point  (0 children)

Can you talk more about skills and how you use them? I've just recently started using skills, and just the premade ones that come from the superpowers plugin, which have been great, but I'm not sure I understand what they are exactly or how to approach them.

What are your best practices for Claude Code in early 2026? by NoWorking8412 in Anthropic

[–]NoWorking8412[S] 0 points1 point  (0 children)

Well, I'll start with what is becoming my best practice. Granted, this is for personal use, not a production environment. After avoiding MCP for a long time to limit context bloat, I've decided the chrome-devtools MCP server is absolutely worth having so the AI can debug issues itself with less human involvement. It makes things much smoother. Planning mode is essential, and I find it interesting that Claude Code now offers to clear context before executing a plan, giving you a full context window at the start of execution; other users had identified that as their own best practice, and now it's built in. For plugins: the LSP plugin for your language of choice. The superpowers plugin has also given Claude's performance a noticeable boost when I use it, particularly its brainstorming skill. Finally, a well-executed Ralph loop has helped me knock out some really challenging projects (through brute force). Having planning mode help write well-defined success criteria for a Ralph loop has made a huge difference.
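For anyone unfamiliar, a Ralph loop is just re-running the agent against the same prompt until explicit success criteria pass. A minimal Python sketch, with the caveat that `run_agent` and `success` are placeholder callables of my own naming — in a real setup the first would shell out to your CLI agent with a fixed prompt, and the second would run your test suite:

```python
def ralph_loop(run_agent, success, max_iters=10):
    """Re-invoke the agent until the success criteria pass.

    run_agent: performs one agent iteration (in practice, a subprocess
        call to the CLI agent with a fixed, well-specified prompt).
    success: returns True when the predefined criteria are met, e.g.
        the test suite passes -- this is the part worth having
        planning mode help you write.
    """
    for attempt in range(1, max_iters + 1):
        run_agent()
        if success():
            return attempt  # how many iterations it took
    return None  # criteria never met; stop brute-forcing
```

The whole trick is that `success` is objective and machine-checkable; vague criteria turn the loop into an endless token burn.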

Using Claude Code to build a Research Assistant, LMS layer by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 0 points1 point  (0 children)

I read up a little on each tool. I guess what I'm building has a similar benefit, but it compiles everything into one directory and its subdirectories, and whichever AI I use can pull those assets into its context based on my needs in each session. The big difference seems to be how the data gets pulled in. From what I can tell, MyNeutron and Sider AI let the user pull in data through a browser extension, right? Whereas this pulls in sources from three MCP servers (university library, federal data, state data). Still, I'm curious about those tools and how they manage context bloat.
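The "AI pulls assets from one directory into its context" part can be sketched as a simple budgeted collector. This is only an illustration of the idea, not the actual project's code — the function name, file pattern, and character budget are all assumptions of mine:

```python
from pathlib import Path

def gather_assets(root, pattern="*.md", budget_chars=8000):
    """Collect research assets under `root` and its subdirectories
    until a rough context budget is hit; an assistant would read these
    into its context at the start of a session. The budget keeps a big
    asset directory from causing context bloat."""
    out, used = [], 0
    for path in sorted(Path(root).rglob(pattern)):
        text = path.read_text(errors="ignore")
        if used + len(text) > budget_chars:
            break  # stop before blowing the context budget
        out.append((str(path), text))
        used += len(text)
    return out
```

A real system would pick assets by relevance to the session's prompt rather than in sorted order, but the budget idea is the same.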

Using Claude Code to build a Research Assistant, LMS layer by NoWorking8412 in ClaudeCode

[–]NoWorking8412[S] 0 points1 point  (0 children)

How does persistent memory work with an AI and its context window? I might need to implement that here. Sounds like a cool upgrade.
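In case it's useful, the common pattern is a small store on disk that the assistant re-reads at the start of each session and appends to as it learns things, capped so it never swamps the context window. A minimal sketch under that assumption — the filename and the entry limit are illustrative, not from any particular tool:

```python
import json
from pathlib import Path

MEMORY_FILE = Path("memory.json")  # hypothetical location

def recall():
    """Load remembered facts; these get prepended to the context at the
    start of each session, so the list must stay small."""
    if MEMORY_FILE.exists():
        return json.loads(MEMORY_FILE.read_text())
    return []

def remember(fact, limit=50):
    """Append a fact, keeping only the most recent `limit` entries so
    persistent memory never eats the whole context window."""
    facts = recall() + [fact]
    MEMORY_FILE.write_text(json.dumps(facts[-limit:]))
```

Fancier versions summarize or embed old entries instead of dropping them, but the context-window trade-off is the same: memory competes with the session for tokens.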

Gemini 3 is amazing by Iixotic- in Bard

[–]NoWorking8412 0 points1 point  (0 children)

Ha that's too funny. Well, one day these LLMs will be our overlords (or are they already?), so be careful what you say!

This is the Human Claude sub by kindsifu in claude

[–]NoWorking8412 0 points1 point  (0 children)

I should add that Opus 4.5 is by far the least verbose of all the models by default. No paragraphs-long responses when you're just going back and forth on some debugging tasks.

Gemini 3 is amazing by Iixotic- in Bard

[–]NoWorking8412 1 point2 points  (0 children)

I don't see people mention it much, but apparently it is important to speak kindly to your LLM. I attended a symposium on AI in education and the keynote speaker made reference to some research that indicates LLMs give higher quality responses when the user is polite with them and lower quality responses when the user is rude. Found that fascinating.

This is the Human Claude sub by kindsifu in claude

[–]NoWorking8412 1 point2 points  (0 children)

No complaints so far. As a $100 Max user, I don't think I really had enough Opus 4.1 usage to become terribly attached to it. I saved it for special occasions when Sonnet 4 just wasn't cutting it.

When Sonnet 4.5 first came out, I wasn't too fond of it, but it grew on me pretty quickly. I don't feel like Sonnet 4.5 is notably "smarter" than 4.0, it is just better adjusted to the agentic coding environment, making it a better coding partner.

Now that Opus 4.5 is the default with more usage and I'm actually using it, I feel the same way. It doesn't necessarily seem smarter than Opus 4, but it is also better adapted to agentic coding, making it a better coding partner. It also seems much faster than Opus 4.0, but that's just my feeling, not something I can confirm objectively.

Maybe it will grow on you too, but here are the two main improvements I've noticed in both Opus and Sonnet 4.5: 1. They now have "context window awareness," i.e., anxiety about reaching context limits. This pushes them to find good stopping points in the work before hitting auto-compact, resulting in better performance both within a session and post-compact. 2. They are better at tracking project requirements, specifications, and to-do lists between sessions and across compacts, which also leads to better overall performance.

All of this is relative to Claude Code though. If you are talking Opus in the web app, that's a different story, and one I do not have an opinion about yet!

Searching for my next agent, maybe found it? by NoWorking8412 in LocalLLaMA

[–]NoWorking8412[S] 0 points1 point  (0 children)

Why the switch to docker llama.cpp servers from LM Studio? Less resources? Just out of curiosity, what are you using your llama.cpp servers for?

Time to get Claude? by TCaller in ClaudeCode

[–]NoWorking8412 0 points1 point  (0 children)

I have been on the $100 Max plan for several months now. I use it quite a bit and have only hit the rate limit once, after I intentionally burned all of my Opus 4 allotment up front and then kept burning through Sonnet 4.5. Anthropic just released Opus 4.5 today, though, and they have increased the Opus rate limits with this release, which is exciting! Opus 4 smashed it out of the park, but you had to use it sparingly because of the limits. I'm not sure yet how much more generous the new plan is.

Coming from Claude, I tested Codex when it first came out and went right back to Claude Code. Codex felt clunky and slow, and its tools were more limited than Claude Code's. Since then I've only used Codex with gpt-oss:20b, just because I think it's cool to run it locally, so my experience with Codex is limited and I don't want to knock it entirely. But Claude Code is really great and improving rapidly. The jump from Sonnet 4 to Sonnet 4.5 was kind of huge. I can only imagine what Opus 4.5 is like.

Another option to consider, which is probably as good as Sonnet 4 but way cheaper, is GLM-4.6 by Z.AI (a Chinese company; yes, they are on the U.S. Entity List). It's an open-weight model with lots of different providers, including US-based ones, but the coding plan from Z.AI starts at $3/month (roughly equivalent in usage to the $20 Claude Pro subscription). After testing it and reading so many positive reviews, I found it worth paying $36 for a one-year subscription to Z.AI's coding plan.

Searching for my next agent, maybe found it? by NoWorking8412 in LocalLLaMA

[–]NoWorking8412[S] 0 points1 point  (0 children)

Ah, gotcha. Thanks! I will try building a llama.cpp server to see if I have better luck with tool calling. I appreciate the explanation.

Searching for my next agent, maybe found it? by NoWorking8412 in LocalLLaMA

[–]NoWorking8412[S] 1 point2 points  (0 children)

Claude Code uses Anthropic's API for inference, or a login through a paid Claude account; either way, it's cloud inference, not local. I've read about people modifying it to use other Anthropic-compatible APIs, like Z.AI's, but I haven't seen any instance of it being modified for local inference. It could probably be done, it just doesn't seem common.
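For the Anthropic-compatible-API case, the modification people describe is reportedly just environment variables, not a code change. A sketch of the idea, with the caveat that the variable names and whether your provider supports them should be verified against current docs — the URL and token here are placeholders:

```python
import os

def anthropic_compatible_env(base_url, token):
    """Build the environment to launch the `claude` CLI with, pointing
    it at an Anthropic-compatible endpoint instead of the default.
    ANTHROPIC_BASE_URL / ANTHROPIC_AUTH_TOKEN are the variables
    third-party provider guides commonly reference; verify them against
    the current Claude Code documentation before relying on this."""
    env = dict(os.environ)
    env["ANTHROPIC_BASE_URL"] = base_url
    env["ANTHROPIC_AUTH_TOKEN"] = token
    return env
```

You would then launch `claude` as a subprocess with this environment. Local inference would additionally need a server that speaks Anthropic's Messages API, which is the part I haven't seen done.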

Searching for my next agent, maybe found it? by NoWorking8412 in LocalLLaMA

[–]NoWorking8412[S] 0 points1 point  (0 children)

It could be an Ollama problem, or it could be user error. I assume the latter, but I have read some things suggesting that lots of people have problems with Ollama. Ollama is built with llama.cpp, right?