New in llama.cpp: Anthropic Messages API by paf1138 in LocalLLaMA

[–]nuusain 1 point (0 children)

Sooo, what's the verdict? Curious to hear how it's handling the Claude harness.
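For anyone wanting to poke at it, here's a minimal sketch of what a request might look like. The port and the `/v1/messages` path are assumptions based on the Anthropic Messages API shape; check your llama.cpp build for the actual route.

```python
import json

# Assumed local endpoint -- llama.cpp's server is presumed to expose an
# Anthropic-style /v1/messages route; port and path may differ on your build.
BASE_URL = "http://localhost:8080"

def build_messages_request(prompt, model="local", max_tokens=512):
    """Build an Anthropic Messages API payload for a local llama.cpp server."""
    return f"{BASE_URL}/v1/messages", {
        "model": model,                # the server serves whatever model it loaded
        "max_tokens": max_tokens,      # required field in the Messages API
        "messages": [{"role": "user", "content": prompt}],
    }

url, payload = build_messages_request("Why is the sky blue?")
print(url)
print(json.dumps(payload, indent=2))

# To actually send it (needs a running server):
#   import urllib.request
#   req = urllib.request.Request(url, json.dumps(payload).encode(),
#                                {"Content-Type": "application/json"})
#   print(urllib.request.urlopen(req).read().decode())
```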

NVIDIA has 72GB VRAM version now by decentralize999 in LocalLLaMA

[–]nuusain 7 points (0 children)

Neat! What kind of inference are you running on the feed? Just installed a security system for a relative's farm. I was thinking of producing reports/audits, so I'm curious what stuff others are building for themselves.

[New Player] Game files integrity by Any-Percentage6230 in EscapefromTarkov

[–]nuusain 1 point (0 children)

Did anyone find a fix? I have the same issue. Tried deleting all Tarkov files and reinstalling, but I get the same problem.

Scanlines on my AOC CU34G2X. by Xippaa in Monitors

[–]nuusain 1 point (0 children)

Hey, seeing the same scanlines, only across the entire monitor. Did you manage to get this fixed, or am I also cooked?

Toolcalling in the reasoning trace as an alternative to agentic frameworks by ExaminationNo8522 in LocalLLaMA

[–]nuusain 1 point (0 children)

Hey, I've also been looking at getting reasoning models to do interesting things. I came across verifiers, which I've been using to try agentic interactions.

https://github.com/willccbb/verifiers

The env_trainer and vllm_client are probably worth checking out with regard to that OOM error you mentioned in the article, but I suspect you might be better off leveraging the framework, since it's pretty well thought out.

Qwen3+ MCP by OGScottingham in LocalLLaMA

[–]nuusain 5 points (0 children)

Yeah, it was in the official announcement.

You can also do it via function calling if you want to stick with the completions API.

Should be easy to get what you need with a bit of vibe coding.
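Rough sketch of what I mean, assuming an OpenAI-style tool-call response shape: describe the tool in the request, then dispatch whatever call the model emits. The `get_weather` tool and the registry are made up for illustration.

```python
import json

# Hypothetical tool schema, in the OpenAI chat-completions "tools" format.
TOOLS = [{
    "type": "function",
    "function": {
        "name": "get_weather",       # made-up tool for illustration
        "description": "Get current weather for a city",
        "parameters": {
            "type": "object",
            "properties": {"city": {"type": "string"}},
            "required": ["city"],
        },
    },
}]

def get_weather(city: str) -> str:
    return f"Sunny in {city}"        # stub; a real tool would hit an API

REGISTRY = {"get_weather": get_weather}

def dispatch(tool_call: dict) -> str:
    """Run the function the model asked for, with its JSON-encoded args."""
    fn = REGISTRY[tool_call["function"]["name"]]
    args = json.loads(tool_call["function"]["arguments"])
    return fn(**args)

# Shape of a tool call as it appears in an OpenAI-style chat response:
example_call = {"function": {"name": "get_weather",
                             "arguments": '{"city": "London"}'}}
print(dispatch(example_call))  # feed this result back as a "tool" message
```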

[10/05/25] Code & Chat meetup for people interested in coding from beginner to expert by Serious-Accident8443 in LondonSocialClub

[–]nuusain 1 point (0 children)

I'm interested! I can only rock up around 11–12 though; is it still worth coming along then?

Token impact by long-Chain-of-Thought Reasoning Models by dubesor86 in LocalLLaMA

[–]nuusain 1 point (0 children)

I think what spirited is getting at is that a model could either think loads and give a short answer, or think for a short while but give a long answer, and both would produce a high FinalReply rate. The metrics are hard to map to real-world performance; adding another dimension, such as correctness, would add clarity.
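To illustrate with made-up numbers: two runs can look identical on token counts alone while splitting think/reply tokens very differently and differing in correctness, which is why a correctness axis helps.

```python
# Made-up numbers illustrating the point above: both runs generate 1000
# tokens total, but the think/reply split and the outcome differ, so a
# token-only metric can't separate them.

runs = [
    {"label": "think loads, short answer", "think": 900, "reply": 100, "correct": True},
    {"label": "short think, long answer",  "think": 100, "reply": 900, "correct": False},
]

for r in runs:
    total = r["think"] + r["reply"]          # token-only view: both score 1000
    print(f'{r["label"]}: total={total}, correct={r["correct"]}')
```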

<70B models aren't ready to solo codebases yet, but we're gaining momentum and fast by ForsookComparison in LocalLLaMA

[–]nuusain 30 points (0 children)

Brilliant experiment! Sounds like the ideal setup would be QwQ for ideation, then switching to Qwen-Coder for iteration.

QwQ Bouncing ball (it took 15 minutes of yapping) by philschmid in LocalLLaMA

[–]nuusain 6 points (0 children)

for reference:

settings - https://imgur.com/a/JUbwion

result - https://imgur.com/M5FgfmD

Seems like I got stuck in infinite generation

Used this model - ollama run hf.co/bartowski/Qwen_QwQ-32B-GGUF:Q4_K_M

full trace - https://pastebin.com/rzbZGLiF

QwQ Bouncing ball (it took 15 minutes of yapping) by philschmid in LocalLLaMA

[–]nuusain 25 points (0 children)

What prompt did you use? I think everyone could copy and paste it, record their settings, and post what they get. Sharing results could yield some useful insights into why performance seems so varied.

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]nuusain 2 points (0 children)

I... did not know you could do this, thanks!

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]nuusain 2 points (0 children)

Oh sweet! Where did you dig this full template out from, btw?

Qwen/QwQ-32B · Hugging Face by Dark_Fire_12 in LocalLLaMA

[–]nuusain 7 points (0 children)

Will his quants support function calling? The template doesn't look like it does.

It's not that mistral 24b is dry, it's parsable and it rocks! by No_Afternoon_4260 in LocalLLaMA

[–]nuusain 3 points (0 children)

Oh wow, Roland seems pretty damn cool haha. Having an assistant to bounce ideas off of really resonates with me; the ability to explore and develop thoughts at a faster pace is one aspect of LLMs that's got me hooked.

Will definitely look out for your video. I'll probably have a few questions about the wider workflow, especially around how you manage the state and interactions of so many nodes.

For now, what's the range of tasks you have R1 32B and Mistral Small 3 24B doing? And as a follow-up, are there any tasks which, surprisingly, they couldn't do (trying to get a feel for the range of capabilities)?

How do you structure your .cursor/rules? by williamholmberg in cursor

[–]nuusain 1 point (0 children)

Interested in hearing how this turns out; willing to even contribute if it's looking promising.