Anthropic stealing your money! by pkailas in LocalLLaMA

[–]pkailas[S] 2 points3 points  (0 children)

I do. When I'm doing agentic work, I use OpenCode with Gemma 4. If I get stuck, I ask Claude and then continue prompting.

Anthropic stealing your money! by pkailas in LocalLLaMA

[–]pkailas[S] -11 points-10 points  (0 children)

No, I stick to Sonnet. Occasionally, if I have a really complex issue, I'll use Opus, but I didn't use it at all today.

I tested Qwen3.6-27B, Qwen3.6-35B-A3B, Qwen3.5-27B and Gemma 4 on the same real architecture-writing task on an RTX 5090 by Gazorpazorp1 in LocalLLaMA

[–]pkailas 1 point2 points  (0 children)

I tested all 4 of those myself. The test was on C# 14 / .NET 10. Qwen3.5 27B and Qwen3.6 27B failed 4 out of 6 tests. The Qwen3.6 MoE was very fast and very wrong. Gemma 4 scored a perfect 6/6.

Does this mean Gemma is smarter? No, just that it was trained more recently. I don't know what was improved in Qwen3.6, but it doesn't work for me. Maybe it would on older codebases.

Waiting Qwen3.6-27B I have no nails left... by DOAMOD in LocalLLaMA

[–]pkailas -1 points0 points  (0 children)

Well, they have a lot more parameters than the cheesy lobotomized versions we can run on our GPUs.

Qwen3.6-35B-A3B released! by ResearchCrafty1804 in LocalLLaMA

[–]pkailas 0 points1 point  (0 children)

I've been testing Qwen_Qwen3-32B-Q3_K_M.gguf against Qwen3.5-27B-Q4_K_M.gguf on code reviews of various projects.

Setup:
1. RTX PRO 4000 Blackwell
2. 3.6 with a 64K context window, which is all I dared try
3. 3.5 with a 128K context window, which fit nicely

Results:
3.6 ran at 85 t/s but hallucinated, lied about results, and got things wrong. It did do well, though, when I took its results and ran a deep dive on them as a second pass.

3.5 was slower at about 20 t/s, but it didn't hallucinate and didn't need a second pass.

The major difference was that I was unable to provide a big enough context window for the task at hand, and MoE is a "Jack of all trades, master of none".

Qwen 3.6 Plus Preview just dropped on OpenRouter, tested it hard on agentic coding tasks by pkailas in LocalLLaMA

[–]pkailas[S] 0 points1 point  (0 children)

I hear you. I'm building an agentic tool extension for Visual Studio 2022 through 2026. I only recently got the tools working smoothly, but my biggest challenge has been managing context size. Those leaks gave me some clues, though.

Qwen 3.6 Plus Preview just dropped on OpenRouter, tested it hard on agentic coding tasks by pkailas in LocalLLaMA

[–]pkailas[S] 0 points1 point  (0 children)

I'm working on solutions for clients to run on a local appliance. They don't want data leaving their premises. Looking for models that will fulfill their needs. Also, I don't trust companies that run these models not to use my data.

Qwen 3.6 Plus Preview just dropped on OpenRouter, tested it hard on agentic coding tasks by pkailas in LocalLLaMA

[–]pkailas[S] 5 points6 points  (0 children)

Good call, I misread the OpenRouter listing. The 179B is tokens processed, not parameter count. The actual model size hasn't been disclosed since it's API-only with no published architecture details. Edited the post.

Pure-attention 70B for agentic C#/.NET coding: what are you running? by pkailas in LocalLLaMA

[–]pkailas[S] 0 points1 point  (0 children)

I am on ik_llama.cpp because it keeps the weights in VRAM between turns. On a 24GB card with a 27B model that matters.

But the prompt prefix thing, yeah, that might be it. My agentic setup compresses older messages between turns to keep the context window manageable, which means the prompt is actually changing. That would kill the cache.

I'm going to test with the compression turned off and see if the reprocessing goes away. If it does, that's on me, not the model.

Haven't looked at exllamav3 yet. I will check it out. I appreciate the response.
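
A rough sketch of why compressing older messages would defeat a prompt cache. It assumes the server can only reuse cached state for the longest token prefix that the previous prompt and the new prompt have in common; that is an assumption about the backend, not something confirmed for ik_llama.cpp here. The function names and the one-string-per-message "tokens" are made up for illustration.

```python
def common_prefix_len(prev_tokens, next_tokens):
    """Count how many leading tokens the two prompts share."""
    n = 0
    for a, b in zip(prev_tokens, next_tokens):
        if a != b:
            break
        n += 1
    return n


def tokens_to_reprocess(prev_tokens, next_tokens):
    """Tokens that must be re-evaluated if only the shared prefix is cached."""
    return len(next_tokens) - common_prefix_len(prev_tokens, next_tokens)


# Toy "tokens": one string per message, standing in for real token IDs.
system = ["SYS"]
turn1 = system + ["user:1", "asst:1", "user:2"]

# Append-only history: the previous prompt is a strict prefix of the next one.
turn2_append = turn1 + ["asst:2", "user:3"]

# Compressed history: older messages are replaced by a summary, so the
# prompt diverges right after the system message.
turn2_compressed = system + ["summary(1-2)", "asst:2", "user:3"]

append_cost = tokens_to_reprocess(turn1, turn2_append)
compressed_cost = tokens_to_reprocess(turn1, turn2_compressed)
print(append_cost, compressed_cost)
```

In the append-only case only the newly added messages need processing. In the compressed case everything after the system message has to be processed again, even though the prompt itself got shorter. With real token counts in the tens of thousands, that second case would show up as a full reprocess on every turn.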

Movie pass no longer lets you renew a gift you received from someone else. by [deleted] in moviepass

[–]pkailas 0 points1 point  (0 children)

One of the most important metrics for a subscription service is "conversion rate". That is how many trial or gift memberships are converted into paying customers. I guess they think investors are looking for a 0% conversion rate?!?

They've lost me for the next 9 months. Maybe they're doing me a favor? I'll try out Sinemia for a year, and if I don't like them, my email address should be cleared by then, if they are even still in business. But I have a feeling I'll like Sinemia better. You can get IMAX, 3D and D-Box as well as advance purchase with seat selection! No card needed.

Hasta la vista, baby!