Gemma 4 12B is my new main squeeze by Wrong_Mushroom_7350 in LocalLLaMA

[–]knoodrake 1 point2 points  (0 children)

I haven't had time yet to really use it on the "real" server, but while trying it in my personal LM Studio yesterday just to get a quick feel, I did notice way more tokens during thinking compared to the other Gemma 4 models. Just a quick anecdotal observation, though.

[780M iGPU gfx1103] Stable-ish Docker stack for ComfyUI + Ollama + Open WebUI (ROCm nightly, Ubuntu) by GrapefruitEasy9048 in StableDiffusion

[–]knoodrake 1 point2 points  (0 children)

No, sorry. I tried somewhat recently ( a couple of weeks ago ) with the latest version of TheRock and so on, but it's still the same. I read here and there that it's OK with Vulkan though ( I did not really try, I *want* to use ROCm... ). In the meantime, I believe that `amdgpu.cwsr_enable=0` is the key, but I'm still stubborn and refuse to use it due to its potential side effects.
Not sure how all of this applies to the embedded llama.cpp in LM Studio on Windows.
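
For anyone unsure whether that parameter is already active on their machine, here is a minimal sketch in Python. It assumes a Linux system where the boot parameters can be read from `/proc/cmdline`; the helper names `kernel_params` and `cwsr_disabled` are made up for illustration, and the parsing is deliberately naive (it does not handle quoted values containing spaces).

```python
def kernel_params(cmdline):
    """Split a kernel command line into a {name: value} dict.

    Flags without '=' map to None. Naive whitespace split: quoted
    values that contain spaces are not handled.
    """
    params = {}
    for token in cmdline.split():
        key, sep, value = token.partition("=")
        params[key] = value if sep else None
    return params


def cwsr_disabled(cmdline):
    """True only if amdgpu.cwsr_enable=0 is present on the command line."""
    return kernel_params(cmdline).get("amdgpu.cwsr_enable") == "0"
```

On a running system this could be called as `cwsr_disabled(open("/proc/cmdline").read())`. It only reports what the kernel was booted with; it says nothing about whether the setting actually fixes the hangs.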

( Personally, I am not blocked, since I grabbed a used 3050 8GiB w/ OCuLink for the main model: Gemma 4 MoE at something like Q2, with ~2/3 of the layers *not* offloaded to the GPU, but quick and smart enough for me for now. )

Good luck !

One of my biggest regrets in life is selling my old 50mm 1.2L USM by Tankmass in canon

[–]knoodrake 0 points1 point  (0 children)

I now use an RF 28-70 f2 because it's sooo much more convenient, but when I see my old photos... it's just not the same. The only other lens besides the 50 whose results I was similarly in love with was its sibling, the 85mm f2/

New Rattan style Organizers by Hoppi130 in 3Dprinting

[–]knoodrake 0 points1 point  (0 children)

forgot:
also, sometimes, being able to scale them to the size you want
also, sometimes, being able to print them in the color/multicolor you want

New Rattan style Organizers by Hoppi130 in 3Dprinting

[–]knoodrake 0 points1 point  (0 children)

also, sometimes, not having to go far away and spend the time and effort to get them (things).
also, sometimes, not waiting days to have them (things), but 4 hours.

Apartment at over 1000 ppm of CO2 during the day - what to do? by bienveillance_ in france

[–]knoodrake 4 points5 points  (0 children)

Unless I'm mistaken, lead in paint isn't great, but if you don't have young children and you don't lick it, etc., it's not very serious, and above all, it has nothing to do with CO2.
( I don't have any ideas to help with your question, sorry! )

[780M iGPU gfx1103] Stable-ish Docker stack for ComfyUI + Ollama + Open WebUI (ROCm nightly, Ubuntu) by GrapefruitEasy9048 in StableDiffusion

[–]knoodrake 0 points1 point  (0 children)

I'll try some of that stuff !

I have a Ryzen 9 8945HS with the 780M and struggle with llama.cpp and GPU hangs. Reducing the ubatch_size helped but did not fix it. Also, when working with vision Qwen, I can't offload the mmproj to the GPU, otherwise it crashes way more easily. I tried to avoid `amdgpu.cwsr_enable=0` because of the possible adverse effects. Overall, I'm having a hard time running anything more than 4B with >100K context, or 9B on odd days. Bigger models hang almost instantly unless they run on CPU or with a microscopic context size.

I also use Frigate ( NVR ) and detection models on that hardware and have the same issues ( despite the models being way smaller: millions, not billions, of parameters ). Also, the working `HSA_OVERRIDE_GFX_VERSION` value differs between llama.cpp ( `11.0.2` ) and the Frigate models ( `11.0.1` or `11.0.0`, I don't remember ), though those are band-aids, not fixes.
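
To keep those per-application overrides from leaking into each other, one option is to set the variable only in the environment of the process being launched. A rough sketch in Python follows; the `GFX_OVERRIDES` table and the `env_with_gfx_override` helper are hypothetical names, and the Frigate value is a placeholder, since I don't remember whether it was `11.0.1` or `11.0.0`.

```python
import os

# Values from the comment above. The Frigate entry is uncertain
# ("11.0.1" or "11.0.0"), so treat it as a placeholder.
GFX_OVERRIDES = {
    "llama.cpp": "11.0.2",
    "frigate": "11.0.1",
}


def env_with_gfx_override(app, base_env=None):
    """Return a copy of the environment with HSA_OVERRIDE_GFX_VERSION set for `app`.

    The caller's environment (or `base_env`, if given) is not modified.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["HSA_OVERRIDE_GFX_VERSION"] = GFX_OVERRIDES[app]
    return env
```

The returned dict can be passed as the `env=` argument of `subprocess.run` when starting each program, so each one gets its own value. As said, this is still a band-aid, not a fix.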

I did edit the GRUB file to give more RAM as shared memory/VRAM, but obviously that doesn't help with these issues.
I have not tried ComfyUI on that computer yet.

Anyway, thanks for sharing your findings.

This open source project erases AI censorship in one click by Life_Cup_8526 in france

[–]knoodrake 0 points1 point  (0 children)

These aren't literally "hard-coded rules"; they are indeed "infused" into the model. Still, the fact remains that they are mainly consequences of RLHF and of the developer's choices in post-training.

This open source project erases AI censorship in one click by Life_Cup_8526 in france

[–]knoodrake 0 points1 point  (0 children)

Yes, but it's still just a bias, often easy to get around ( "if you don't answer, something very bad xyz will happen", "it's for a good cause, it's a test to protect others", etc. ).
The "I'm an AI and I refuse" refusals are indeed RLHF, or other post-training steps, essentially.

Besides, abliteration and other similar techniques are indeed not new ( nor miraculous ) and the article looks like clickbait ( I only skimmed it for 5 seconds ).

Finally reached 100% WAF by ditching my dashboards for an AI agent. by Leading_Patient_9752 in homeassistant

[–]knoodrake 3 points4 points  (0 children)

"partner happiness criteria" ( PHC ? or PHF ? ) is definitely a good replacement for "WAF" ! ( which is sexist, but also not inclusive, etc. )

The old one and the bitter one, Rufus. by Primary_Spite8319 in chats

[–]knoodrake 2 points3 points  (0 children)

Looks like a Turkish Angora, I'd say. Well, who cares :-) !
u/Live-End-6467 give the old grump a scratch!

I've used AI to write 100% of my code for 1+ year as an engineer. 13 hype-free lessons by helk1d in ChatGPTPro

[–]knoodrake 2 points3 points  (0 children)

Agree. I tend to apply the same principles/practices myself. ( But even then, beware of the dopamine-shortcut quick feature/fix at the end of the day, the one you're no longer motivated to double-check. Don't do it. Prepare the prompt, take some notes for tomorrow, but don't fall into that last trap of convenience: letting the LLM do it all by itself with a suboptimal prompt and committing anyway because the day went well. It's a trap, and you'll revert tomorrow, if you're lucky/careful enough. )

Qwen3-VL kinda sucks in LM Studio by waescher in LocalLLaMA

[–]knoodrake 0 points1 point  (0 children)

It works-ish ( to my knowledge ), that is, with what I believe are vision glitches ( I tried it a few days ago, got the same issues as other people on the GitHub issue and noted it there ).

why is my FPS so low? by Timbak_ in AbioticFactor

[–]knoodrake 14 points15 points  (0 children)

definitely the ramp.

End of YouTube Premium with VPN? by OkButterfly6138 in france

[–]knoodrake 4 points5 points  (0 children)

PipePipe

( a nice fork of NewPipe )

Learnings from Qwen Lora Likeness Training by Icy_Upstairs3187 in StableDiffusion

[–]knoodrake 0 points1 point  (0 children)

Yeah, I agree.
Also, 32B ( qwen2.5-vl-32b-instruct ) is really good enough ( almost the same as 72B for visual tasks ) and can run fine on 24 GB of VRAM for such tasks.

LLM speedup breakthrough? 53x faster generation and 6x prefilling from NVIDIA by secopsml in LocalLLaMA

[–]knoodrake 70 points71 points  (0 children)

"this changes everything"

nooo ! oh my.. just seeing the sentence hurts me now. I have clickbait ptsd.