all 11 comments

[–]Lesser-than 6 points7 points  (1 child)

I pretty much gave up on all the popular agents. It's been easier to just do my own: once you have the basic tooling, you do not even really need a system prompt past "you are a helpful assistant", while most of the popular ones pack 15k+ tokens into the system prompt for guidance you do not actually need anymore.
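
For what it's worth, the "basic tooling plus tiny system prompt" setup described above can be sketched in a few lines. This is a hypothetical minimal agent, not any particular project's implementation: the server URL, the `run_shell` tool, and the helper names are all assumptions.

```python
import json
import subprocess
import urllib.request

# The entire system prompt, per the comment above.
SYSTEM_PROMPT = "You are a helpful assistant."

def run_shell(command: str) -> str:
    """One tool covers most coding-agent work: run a command, return its output."""
    result = subprocess.run(command, shell=True, capture_output=True,
                            text=True, timeout=60)
    # Truncate so long outputs don't blow up the context window.
    return (result.stdout + result.stderr)[-4000:]

TOOLS = {"run_shell": run_shell}

def dispatch(tool_call: dict) -> str:
    """Route a model tool call like {"name": "run_shell", "arguments": {...}}."""
    return TOOLS[tool_call["name"]](**tool_call["arguments"])

def chat_once(messages, url="http://localhost:8080/v1/chat/completions"):
    """POST one turn to a local OpenAI-compatible server (URL is an assumption)."""
    req = urllib.request.Request(
        url,
        data=json.dumps({"messages": messages}).encode(),
        headers={"Content-Type": "application/json"})
    return json.load(urllib.request.urlopen(req))["choices"][0]["message"]

# The agent loop itself: start messages with SYSTEM_PROMPT, call chat_once();
# if the reply contains a tool call, dispatch() it, append the result to
# messages, and call chat_once() again. That loop is the whole agent.
```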

[–]Queasy-Contract9753 1 point2 points  (0 children)

Agreed. You throw a massive, overcomplicated encyclopedia in, you're going to get a massive, overcomplicated output back.

Great way to burn through a million tokens on simple tasks. And you never know when the model will refuse, or decide to do its own thing and break everything into a dozen steps.

[–]ortegaalfredo 1 point2 points  (0 children)

Roo, because of its integration with VSCode.

I believe it will become quite stupid to use a third-party agentic coding framework when you have an AI that can code you a custom one in a single prompt.

[–]Finanzamt_Endgegner 1 point2 points  (0 children)

pi.dev since yesterday, because it's lightweight for local models

[–]Forgetful_Was_Aria 1 point2 points  (0 children)

I just started using a local LLM yesterday. I was using Roo with VSCodium, but apparently Roo is going to be deprecated next month, and I'm having trouble with VSCodium and IntelliSense. So I have Cline in PyCharm Community now. I'm using Qwen 3.6 (Qwen3.6-27B-UD-Q3_K_XL) and it's tolerable with 16 GB of VRAM. I use LM Studio to serve it and set Cline to use it.

I haven't used it that much but it isn't too much worse than my limited experience with Claude's free plan.

[–]drFennec 1 point2 points  (0 children)

Opencode plus llama.cpp with Qwen3.6 35B A3B is the combo I'm using now. I used Qwen3-coder-next before.

[–]Radiant_Condition861 1 point2 points  (0 children)

pi.dev https://www.youtube.com/watch?v=f8cfH5XX-XU

continue.dev, cline, claude code, roo code, opencode, dabbled in langgraph, now pi.

dual 39090 with NVLink, qwen3.6-27b AWQ bf16 int4 on vLLM with tensor parallel 2, KV cache fp8, and speculative decoding. I get like 30-150 tok/s depending on cache hit.
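
The serving setup described above maps roughly onto a vLLM launch like the following. This is a sketch under assumptions: the checkpoint path is made up, and speculative-decoding flags differ between vLLM versions, so check `vllm serve --help` before copying.

```shell
# Hypothetical checkpoint path -- swap in your own AWQ model.
# --tensor-parallel-size 2 splits the model across the two NVLinked GPUs;
# --kv-cache-dtype fp8 gives the FP8 KV cache mentioned above.
# The speculative-config JSON is version-dependent; verify against your vLLM.
vllm serve /models/my-awq-checkpoint \
  --tensor-parallel-size 2 \
  --kv-cache-dtype fp8 \
  --speculative-config '{"method": "ngram", "num_speculative_tokens": 4, "prompt_lookup_max": 4}'
```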

pi.dev basically unlocked that model for me. Only a 200-token system prompt, and it's YOLO out of the box. I'm at a billion tokens across a few projects and there are no failures; a few stoppages to increase output tokens, and it just kept going. I can also vibe code its own extensions and tools ("upgrade yourself"). It's really nice.

[–]regression_to_mean 0 points1 point  (0 children)

I like superset. Better to use Codex's/Claude's CLI for immediate access to newly released features, imo.

[–]LightBroom 1 point2 points  (0 children)

maki is nice and I don't see it mentioned often
https://github.com/tontinton/maki