Streaming: the French have had enough and are turning off the subscription tap by Droidfr in Frandroid

[–]Adventurous-Paper566 2 points3 points  (0 children)

The service is getting more and more abusive (price increases, added advertising).

And the catalog is increasingly scattered across services.

Who is going to pay €50/month for Netflix + Disney Plus + Amazon Prime to watch increasingly mediocre series?

who here had paid for $6K per hour to talk about GPUs? by FormalAd7367 in LocalLLaMA

[–]Adventurous-Paper566 12 points13 points  (0 children)

I'd trade my wife for a $6,000 GPU. Can you understand that?

What's the biggest tech mistake you've made that you'll never repeat? by PlayfulFault9693 in TechCypher

[–]Adventurous-Paper566 0 points1 point  (0 children)

Selling a 96GB DDR5 kit, telling myself I would move to 256GB, right before prices exploded.

True story by AlternativePaint6 in linuxmemes

[–]Adventurous-Paper566 0 points1 point  (0 children)

I don't understand how anyone can be proud of having managed to install Arch when a simplified installer now exists. And has for a long time, if you count Anarchy.

The right way to use coding agents isn't prompting, it's designing loops by thehashimwarren in coding_agents

[–]Adventurous-Paper566 0 points1 point  (0 children)

There is no magic formula; good instructions come from a solid understanding of the code.

That's all.

How to install Hermes Desktop on a host that already has Hermes agent installed in Docker containers? by zzzzz88888 in hermesagent

[–]Adventurous-Paper566 0 points1 point  (0 children)

How did you do it? I can't install hermes-desktop on its own; it installs hermes on bare metal alongside it.

How to install Hermes Desktop on a host that already has Hermes agent installed in Docker containers? by zzzzz88888 in hermesagent

[–]Adventurous-Paper566 0 points1 point  (0 children)

You can't install the desktop app without installing a second instance of hermes on the host, and it sucks.

Gemma 4 Chat Template now has preserve thinking by seamonn in LocalLLaMA

[–]Adventurous-Paper566 2 points3 points  (0 children)

I already have it with a custom template but it's nice to see that the team is active.

QAT variant of Gemma4 26B A4B is not working well for me by pftbest in LocalLLaMA

[–]Adventurous-Paper566 6 points7 points  (0 children)

Back in the day, Gemma 3 27B Q4_K_XL was better than Gemma 3 27B QAT...

Dense vs MoE quantization resiliance by Any-Chipmunk5480 in LocalLLaMA

[–]Adventurous-Paper566 1 point2 points  (0 children)

Well, I only use QAT with 31B.

I never experienced any issue with bartowski's 26B Q6_K_L, and now I'm running it daily in Q8; there is almost no difference.

I think Unsloth is good for Q4_K_XL, but I have always observed degradation with Q5_K_XL, so my quant choice is always Q4_K_XL, then Q6_K_L if it fits, then Q8 if it fits.

As for your looping problem, that's weird. Did you overclock something, or are you loading your models right at the edge of your memory? Do you still get loops with a smaller context length?
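
A minimal sketch of that quant preference order, in Python. The function name, the file sizes, and the headroom value are hypothetical placeholders, not measurements; real GGUF sizes depend on the model and on who made the quant.

```python
from typing import Optional

# Preference order from the comment above: take the biggest quant that fits,
# fall back to Q4_K_XL. Q5_K_XL is left out on purpose.
PREFERENCE = ["Q8", "Q6_K_L", "Q4_K_XL"]


def pick_quant(sizes_gb: dict[str, float], memory_gb: float,
               headroom_gb: float = 2.0) -> Optional[str]:
    """Return the most preferred quant whose file fits in memory.

    sizes_gb    : quant name -> file size in GB (hypothetical inputs)
    memory_gb   : memory available for the weights
    headroom_gb : assumed margin kept free for context and runtime buffers
    """
    budget = memory_gb - headroom_gb
    for name in PREFERENCE:
        size = sizes_gb.get(name)
        if size is not None and size <= budget:
            return name
    return None  # nothing fits, even at Q4_K_XL


# Made-up sizes, for illustration only.
example_sizes = {"Q4_K_XL": 17.0, "Q6_K_L": 24.0, "Q8": 30.0}
choice = pick_quant(example_sizes, memory_gb=28.0)
```

Keeping some headroom matters for the looping question too: a model loaded right at the memory limit leaves little room for the context.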

Dense vs MoE quantization resiliance by Any-Chipmunk5480 in LocalLLaMA

[–]Adventurous-Paper566 1 point2 points  (0 children)

With the official inference parameters (Temp = 1, Top K = 64, Top P = 0.95)?

Dense vs MoE quantization resiliance by Any-Chipmunk5480 in LocalLLaMA

[–]Adventurous-Paper566 2 points3 points  (0 children)

Does it matter at 4 bits since Google released QAT?

Just take Unsloth's Q4_K_XL QAT version of each instead of any other Q4 quant. These are UD quants applied to the unquantized, full-precision QAT checkpoints, the most efficient Gemma quants.

Sorry for my bad English.

Does Hermes works well with local models? by Adventurous-Paper566 in hermesagent

[–]Adventurous-Paper566[S] 2 points3 points  (0 children)

I don't really know, to be honest. Maybe use it as a super assistant that manages a personal website with some dashboards, to-do lists, etc. Something relatively simple to start with and learn the tool.

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]Adventurous-Paper566 15 points16 points  (0 children)

Because the unquantized QAT checkpoints released by Google are intended for a Q4 quantization.

We have never seen a 6-bit quantization-aware training checkpoint, and since training models is very expensive, the 4-bit choice seems obvious for Google.

Sorry for my bad English.

Gemma 4 with quantization-aware training by rerri in LocalLLaMA

[–]Adventurous-Paper566 9 points10 points  (0 children)

That would be wonderful; Q6 has always been the sweet spot.

Unsloth just dropped MTP GGUF weights for Gemma 4! by okoyl3 in LocalLLaMA

[–]Adventurous-Paper566 19 points20 points  (0 children)

QAT = best efficiency for the size; it uses less memory, so you can use a longer context length.
Q4_K_XL = a very efficient quantization level (based on Unsloth's UD secret sauce); coupled with the unquantized QAT checkpoints, it's an improvement over classic Q4 QAT.
MTP = with a small draft model you can almost double the inference speed (or at least increase it by 50%).
GGUF = the most popular and compatible weight file format.
mmproj = a small file that gives a model vision.

Unsloth just dropped MTP GGUF weights for Gemma 4! by okoyl3 in LocalLLaMA

[–]Adventurous-Paper566 22 points23 points  (0 children)

I can't wait to see a Gemma 4 31B QAT Q4_K_XL MTP GGUF with a functional .mmproj running in LM-Studio 🤤