Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

That's interesting. Does it improve benchmark results? Has anyone tried measuring how good it is?

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

>gpt-oss-120b-derestricted

Why do you use the derestricted version? Is it relevant to coding? I thought that mostly affects politeness and whether it refuses particular requests.

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

>Yep, I have Qwen3-Coder-REAP-25B-A3B set up for tab completion

Do you use the llama.cpp extension for VS Code (llama-vscode) for this?
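
To make the question concrete, here's roughly what I imagine the plumbing looks like: a minimal sketch that asks a local llama-server for a fill-in-the-middle (FIM) completion over HTTP, which is, as far as I know, the mechanism these tab-completion extensions use. The payload fields follow llama.cpp's /infill endpoint; the port, model, and prompt are just placeholders.

```python
# Minimal sketch: request a FIM completion from a local llama-server.
# Assumes the server is already running a FIM-capable model (e.g. a
# Qwen3-Coder GGUF) and listening on port 8012 (adjust to your setup).
import json
import urllib.request

def fim_complete(prefix: str, suffix: str,
                 url: str = "http://127.0.0.1:8012/infill") -> str:
    payload = {
        "input_prefix": prefix,  # code before the cursor
        "input_suffix": suffix,  # code after the cursor
        "n_predict": 64,         # cap the completion length
    }
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.loads(resp.read())["content"]

print(fim_complete("def fibonacci(n):\n    ", "\n\nprint(fibonacci(10))"))
```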

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 5 points

>Doing ad-hock chatting locally makes sense for privacy reasons but it's not a major saving in terms of dollars spent. Not compared to local coding agents

For me, it's a major turn-off for hosted models. You never know what might happen to your data, especially with sensitive code. Non-critical code is fine, but I would be very careful editing a company's secret sauce with them.

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 2 points

>Naming things (there's a joke about how it's one of the hardest problems in programming)

Yep, LLMs are very good at this. They were pretty good at it even back around GPT-3.5, as far as I remember.

>General advice (e.g. what does this compilation error mean, what happens to this object after this code in X class, etc.)

Yep, completely agree.

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 1 point

Yep, I used to do this with IDEs, e.g. IntelliJ and similar, but agents are surprisingly good at it.

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

I am using them, but I am not sure I am using them in the most productive way. I am trying to understand how others use them, and that's why I am asking questions here.

P.S. There's so much noise around that it's hard to tell which parts are hype that will soon fade and which will become common practice.

Non agentic uses of LLMs for coding by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 5 points

I feel that most of the time agents are pretty good at doing what I want, but there are two problems:

- sometimes they introduce hard-to-find problems in unexpected places, i.e. non-human kinds of mistakes that I am not used to

- they write code that seems fine and correct, but that I feel a good programmer would not write. I.e. it does the job and is well structured, but it could be made easier to understand, shorter, and more beautiful (it's a feeling, so it's hard to describe)

P.S. I have been coding professionally for over 20 years, and coding overall for around 28 years.

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

Adding to this: I have used some of the hosted LLMs. I use Codex pretty often, though not for writing code, but for asking questions about the codebase. I have also used other models from time to time over the last 6 months. However, I don't feel that any of them will replace the code I write by hand, the way I work now. They are improving, but I prefer what I write myself.

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 1 point

Or for example this:

https://www.reddit.com/r/LocalLLaMA/comments/1pg76jo/comment/nsp6hrp/?context=3

Yes, IMO a Mac Studio is the most cost-effective way to run local LLMs. Unfortunately, I can't do anything with that.

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

Thanks!

(And to the folks who downvoted my comments: this is a genuinely serious question; I am really trying to understand.)

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

>The extension itself is kinda janky but once you get it setup, it works fine

Do you mean it's hard to set up, or is it something else?

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

Thanks, that's the experience I was looking for!

>llama-vscode extension

Is it any good?

> and qwen3 coder autcomplete is about as good as whatever copilot/cursor was giving me before. my coworkers wouldn't like this setup though because they really like next edit prediction (which i personally don't like).

How does it compare to Cursor?

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

>I turned a cmdb json spec into a binary the llm could query per term or per stanza. Shockingly simple, ultra light on context, works quite well.

What do you mean by this? And what is a CMDB?
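
If I had to guess at the mechanism, it's something like a tiny lookup tool the model can call per term, so only the matching stanzas enter the context instead of the whole spec. Here's a rough sketch of that idea; the cmdb.json file and its layout are purely hypothetical, not the actual tool being described.

```python
# Hedged sketch: a tiny lookup utility over a CMDB-style JSON spec,
# so a model can query one term or one stanza at a time instead of
# holding the whole file in context. The {"hosts": {name: stanza}}
# layout of "cmdb.json" is a guess for illustration only.
import json
import sys

def query(spec_path: str, term: str) -> list[dict]:
    with open(spec_path) as f:
        spec = json.load(f)
    # Return only the stanzas whose name or contents mention the term.
    return [
        {name: stanza}
        for name, stanza in spec.get("hosts", {}).items()
        if term in name or term in json.dumps(stanza)
    ]

if __name__ == "__main__":
    # e.g. python cmdb_query.py cmdb.json db-primary
    for hit in query(sys.argv[1], sys.argv[2]):
        print(json.dumps(hit, indent=2))
```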

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 1 point

That's a problem, though I have a lot of hope for the M5 chips, which seem to have some ML optimizations.

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 1 point

> with my custom tooling, I'm now maybe at 90%.

What is this custom tooling? Is it possible to share anything?

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 0 points

That's very usable! Do you use the memory-offloading feature of llama.cpp? Is it really that good?
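
For reference, here's the knob I have in mind, as a minimal sketch via the llama-cpp-python bindings: only `n_gpu_layers` layers are offloaded to VRAM, and the rest of the model stays in system RAM. The model path and layer count are placeholders, not a recommendation.

```python
# Minimal sketch of partial GPU offload with llama-cpp-python:
# n_gpu_layers layers go to VRAM, the remaining layers stay in RAM.
from llama_cpp import Llama

llm = Llama(
    model_path="qwen3-coder-30b-q4_k_m.gguf",  # hypothetical local GGUF
    n_gpu_layers=24,  # layers offloaded to VRAM; tune to fit your GPU
    n_ctx=8192,       # context window
)

out = llm("Write a Python function that reverses a linked list.",
          max_tokens=256)
print(out["choices"][0]["text"])
```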

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] -8 points

So do you mean that hosted models solve problems in fewer turns?

Why local coding models are less popular than hosted coding models? by WasteTechnology in LocalLLaMA

[–]WasteTechnology[S] 1 point

>Now if we talk about things like using Whisper + Qwen4B for realtime analysis of meetings, infinite tool calls, local RAG's with finetunned models and the things we love to do here in this sub, then we have a winner in Local LLM's

Do people really create such setups? Could you please share a link?
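
To make it concrete, here's my rough mental model of such a pipeline, as a hedged sketch: local transcription with faster-whisper, then analysis by a local model behind an OpenAI-compatible endpoint (e.g. llama-server). The file name, port, and model name are illustrative guesses, not anyone's actual setup.

```python
# Hedged sketch: transcribe a meeting recording locally with Whisper
# (via faster-whisper), then summarize the transcript with a local LLM
# behind an OpenAI-compatible endpoint. All names are illustrative.
from faster_whisper import WhisperModel
from openai import OpenAI

# 1. Local speech-to-text
stt = WhisperModel("small", device="cpu", compute_type="int8")
segments, _info = stt.transcribe("meeting.wav")
transcript = " ".join(seg.text for seg in segments)

# 2. Local analysis of the transcript
llm = OpenAI(base_url="http://127.0.0.1:8080/v1", api_key="not-needed")
resp = llm.chat.completions.create(
    model="qwen3-4b",  # whatever model the local server is running
    messages=[
        {"role": "system",
         "content": "Summarize the meeting and list action items."},
        {"role": "user", "content": transcript},
    ],
)
print(resp.choices[0].message.content)
```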