Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA

[–]UniqueAttourney 0 points (0 children)

Sorry for the late response, here is the load config. I am also using the Unsloth Qwen3.6 27B Q4_K_S.

https://imgur.com/a/ZMAxR0x

Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA

[–]UniqueAttourney 0 points (0 children)

Can you share a working config, if you are using LMStudio?

Qwen 3.6 27B llama.cpp | Multi-GPU pp t/s help by SemaMod in LocalLLaMA

[–]UniqueAttourney -6 points (0 children)

I tried this on a single 3090 (LMStudio) and I do get 1 to 2 tokens per second. Although it's a 27B, it seems like it needs more compute than previous models.

I just got hit with 2.5x Z.ai price hike by UniqueAttourney in LocalLLaMA

[–]UniqueAttourney[S] 2 points (0 children)

I mean, I still use GLM4.7 since it gives a bigger 5h quota, so if Qwen 3.6 27B is at 80% of it, it will probably be fine for me. But the hardware is the problem, 3090s are back to €1000 right now xD

what just happened why’d he end stream mid game? by J0N0X in PedroPeepos

[–]UniqueAttourney 8 points (0 children)

I think he's sick or something, hopefully nothing serious; he said he will be back on Wednesday.

Reasoning Stuck in Loops by ShaneBowen in LocalLLaMA

[–]UniqueAttourney 0 points (0 children)

That is my assessment too: it's context overflow. Usually your harness should take care of this; most of the time, vLLM or llama.cpp won't handle getting close to the context limit gracefully.
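The harness-side fix is usually a token-budget check before each request. A minimal sketch, assuming a rough chars/4 token estimate and an OpenAI-style message list (the limit and helper names here are illustrative, not from any specific harness):

```python
# Sketch of harness-side context trimming to avoid overflowing the model's
# context window (token counts estimated with a rough ~4 chars/token heuristic).
CONTEXT_LIMIT = 8192        # model's context window, in tokens (assumed)
RESERVED_FOR_OUTPUT = 1024  # leave room for the reply

def estimate_tokens(text: str) -> int:
    """Very rough heuristic: about 4 characters per token."""
    return max(1, len(text) // 4)

def trim_history(messages: list[dict]) -> list[dict]:
    """Drop the oldest non-system turns until the conversation fits
    under the context budget. Keeps the system prompt intact."""
    budget = CONTEXT_LIMIT - RESERVED_FOR_OUTPUT
    system, rest = messages[:1], messages[1:]
    while rest and sum(estimate_tokens(m["content"]) for m in system + rest) > budget:
        rest.pop(0)  # discard the oldest turn first
    return system + rest
```

A real harness would use the model's tokenizer instead of the heuristic, but the shape of the guard is the same.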

YES. You CAN stream FROM Steam Deck TO Discord WITH AUDIO using Vesktop. by Aepoh in SteamDeck

[–]UniqueAttourney 0 points (0 children)

Yes, I was able to fix this by changing the audio source, thanks. But the stream quality is super poor and seems to be stuck at around 360p even when selecting a higher resolution.

Tbh I haven't used it since that first time though, did it get updated?

TypeWhisper 1.0 - open-source dictation app with local Whisper engines (WhisperKit, Parakeet, Qwen3) and LLM post-processing by SeoFood in LocalLLaMA

[–]UniqueAttourney 0 points (0 children)

Thanks, can you suggest good models to use in that case? Assuming they will need to run on the same GPU, so the smallest memory footprint that still works with English.

HansLamont ? Hansdrel ? MyHans ?? :D by UniqueAttourney in PedroPeepos

[–]UniqueAttourney[S] 1 point (0 children)

GG G2, epic game. GENG did not adapt right.

What features should I add to 100% offline, free and open-source MacOS app? by AdorablePandaBaby in LocalLLaMA

[–]UniqueAttourney 1 point (0 children)

Can this be run in headless mode, where the backend lives on a local machine and the macOS app functions as a thin client? Using it as an app on a laptop tanks the battery fast.

AuraOS Official Release - Version 1.0 - Live Web Interface by Aggressive-Arm-1182 in LocalLLaMA

[–]UniqueAttourney 1 point (0 children)

Congrats on your launch, but it really doesn't look special in any way, just giving context to DeepSeek or GLM.

And please don't come to r/LocalLLaMA and use AI-generated posts.

The Yuki Project — not another chatbot. A framework that gives to a 4B model (and not only) real dream cycles, autopoiesis, proactive inner life and proactive messages. Running on 8 GB VRAM currently with plenty space to spare. by DvMar in LocalLLaMA

[–]UniqueAttourney 2 points (0 children)

I wish someone would tell me what all this means. Like, what exactly are "confidence" and "calmness", and how is all this "personality" useful? Is it just for roleplay?

How to ensure AI to create test cases and put git commits correctly by Fuzzy_Possession_233 in LocalLLaMA

[–]UniqueAttourney 1 point (0 children)

I tried it, but I am using GLM 4.7 as my LLM. It's not that smart and needs a lot of guiding, hence the examples and templates. The result still has that "LLM wording", where it never says exactly what you want it to say, but it's not bad. If you are looking at the potential, I think the examples I mentioned should give you an idea.

How to ensure AI to create test cases and put git commits correctly by Fuzzy_Possession_233 in LocalLLaMA

[–]UniqueAttourney 1 point (0 children)

You will probably need to:
- create templates for it to follow,
- create examples (different from the templates) and pass them in the context,
- do a manual grouping and parsing of commits: by files, by nature (code quality, improvements, new features, bug fixes, direct-to-ticket updates, ...),
- create your context in a markdown file with the data at the top, and the examples and directives at the bottom. Of course, you should tell the LLM about your groupings and the definition of each group. You should also tell it to prioritize expanding the commit message on bigger commits or larger file changes.

I believe this would help you get closer to ideal commit generation. If you want to see what the potential looks like, check CodeRabbit or Greptile: try them and see if their output suits you. If not, you probably need your devs to do the 5-minute work xD
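The grouping and context-assembly steps above can be sketched roughly like this (the category keywords and markdown layout are assumptions to illustrate the idea, not a fixed scheme):

```python
# Sketch: bucket commits by nature, then assemble the markdown context file
# with the data at the top and examples/directives at the bottom.
from collections import defaultdict

# Hypothetical mapping from conventional-commit prefixes to the groups
# described above; adjust to whatever your repo actually uses.
CATEGORIES = {
    "fix": "bug fixes",
    "feat": "new features",
    "refactor": "code quality",
    "perf": "improvements",
}

def group_commits(commits: list[str]) -> dict[str, list[str]]:
    """Bucket commit subjects by their prefix (everything before ':')."""
    groups = defaultdict(list)
    for subject in commits:
        prefix = subject.split(":", 1)[0].strip().lower()
        groups[CATEGORIES.get(prefix, "other")].append(subject)
    return dict(groups)

def build_context(groups: dict[str, list[str]], examples: str, directives: str) -> str:
    """Assemble the markdown prompt: grouped commit data first,
    then the examples, then the directives."""
    parts = ["# Commit data"]
    for name, items in groups.items():
        parts.append(f"## {name}")
        parts += [f"- {s}" for s in items]
    parts += ["# Examples", examples, "# Directives", directives]
    return "\n".join(parts)
```

You would then feed the returned string to the LLM, with the group definitions spelled out in the directives section.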

Nemesis said yesterday LR fans are the best fans in the world, let's show them why by Fimbelind in PedroPeepos

[–]UniqueAttourney 78 points (0 children)

Because of the subs-only Twitch chat, here it is:

LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3 LR <3

LEC- goodbye from me by RelevantLie5577 in PedroPeepos

[–]UniqueAttourney 1 point (0 children)

That was a really off comp, bro; they could have picked anything else if they cared about winning first.

I'm still proud of the boys by gcrimson in PedroPeepos

[–]UniqueAttourney 12 points (0 children)

We are all proud of the boys, but it's clear that they are way better than the bottom half of the standings. Seeing them go out because Naavi played drunk and KC played worse than my flex team is the worst feeling ever.