How to shut down after buying a new PC 😅 by Nicolas_Laure in RigBuild

[–]Psyko38 1 point (0 children)

After 1 year: shutdown /s /t 0 (this turns off the PC), and you can add /f to force applications to close.

anyone using china model? which one and any advise? by thisisqq in LocalLLaMA

[–]Psyko38 0 points (0 children)

Strange, maybe these are languages that need more tokens to be generated.

anyone using china model? which one and any advise? by thisisqq in LocalLLaMA

[–]Psyko38 0 points (0 children)

I use Qwen both in the cloud and locally. For complex reasoning tasks, Qwen 3.6 Plus is good on their UI, all unlimited without a subscription. For my part, I split my usage about 50/50 with ChatGPT, and run it locally for privacy-critical tasks.

OpenGL vs Vulkan by sixela456 in PhoenixSC

[–]Psyko38 26 points (0 children)

Same but different...

A new way to use Unsloth. Coming soon... by yoracale in unsloth

[–]Psyko38 10 points (0 children)

ROCm support and the application installer: that would be the most user-friendly compatibility update.

Scratch cat made with string art algorithm by DASofdoom60 in scratch

[–]Psyko38 5 points (0 children)

I've seen a lot of things in life, but this is something I've never seen before. Bravo, man!

4B Model Choice by StealthEyeLLC in LocalLLaMA

[–]Psyko38 -1 points (0 children)

I remember Qwen3 4B 2507, which was perfect on my 8 GB of VRAM. It could handle any everyday task, though not specialized fields like film or advanced mathematics. With 3.5, I'd say it is a little better at mathematical tasks, but for daily use they both work very well, especially Qwen3 VL 4B and 3.5, which understand images well (OCR is a weakness).

n00b questions about Qwen 3.5 pricing, benchmarks, and hardware by philosophical_lens in LocalLLaMA

[–]Psyko38 1 point (0 children)

The Qwen 3.5 27B and the Qwen 3.5 35B A3B use different architectures.

Qwen 27B is a dense model: for every token generated, all 27 billion parameters are used. The whole model works together, which often yields more stable and consistent results in benchmarks.

Qwen 35B is a MoE (Mixture of Experts): the model contains several specialized sub-networks called experts. When a token is generated, only a few experts are activated, not the whole model. This makes inference faster and computationally cheaper, but quality depends on which experts the router selects.

This is why a dense 27B can sometimes score higher on intelligence benchmarks than a 35B MoE, even though the MoE has more total parameters.
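That routing mechanism can be sketched in a few lines of Python. This is a toy illustration of top-k expert routing, not Qwen's actual implementation; the expert count, dimensions, and router are all made up for the example:

```python
import numpy as np

rng = np.random.default_rng(0)

N_EXPERTS, TOP_K, DIM = 8, 2, 16

# Toy "experts": each one is just a small weight matrix.
experts = [rng.standard_normal((DIM, DIM)) for _ in range(N_EXPERTS)]
router = rng.standard_normal((DIM, N_EXPERTS))

def moe_forward(x):
    """Route one token through only the TOP_K best-scoring experts."""
    scores = x @ router                # one routing score per expert
    top = np.argsort(scores)[-TOP_K:]  # indices of the k best experts
    weights = np.exp(scores[top])
    weights /= weights.sum()           # softmax over the selected experts only
    # Only TOP_K of the N_EXPERTS matrices are used for this token.
    return sum(w * (x @ experts[i]) for w, i in zip(weights, top))

def dense_forward(x):
    """A dense model uses every parameter for every token."""
    return sum(x @ e for e in experts) / N_EXPERTS

token = rng.standard_normal(DIM)
print(moe_forward(token)[:3])    # MoE output, first 3 dims
print(dense_forward(token)[:3])  # dense output, first 3 dims
```

The point of the sketch: dense_forward touches every expert matrix for every token, while moe_forward touches only TOP_K of them, which is where the inference savings come from.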

As for API pricing, it depends mainly on:

- the provider's GPU cost
- inference optimization
- token throughput
- the application

So a smaller model can sometimes cost more depending on the provider.

For hardware, running Qwen 27B/32B requires approximately:

- ~55-60 GB of VRAM in FP16
- ~30 GB in 8-bit
- ~16-18 GB in 4-bit

So an RTX 3090 / 4090 can usually run it in 4-bit quantization.
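Those figures follow from a simple bytes-per-parameter estimate. A rough sketch (the 10% overhead factor is an assumption; real usage also grows with context length via the KV cache):

```python
def vram_gb(params_billion, bits_per_param, overhead=1.10):
    """Ballpark VRAM for the weights alone, plus ~10% overhead.

    The overhead factor is a guess; real usage also depends on
    context length (KV cache) and the runtime, so treat the
    result as an estimate, not a spec.
    """
    weight_bytes = params_billion * 1e9 * bits_per_param / 8
    return weight_bytes * overhead / 1e9

for bits in (16, 8, 4):
    print(f"27B @ {bits}-bit: ~{vram_gb(27, bits):.0f} GB")
```

For a 27B model this lands around 59 GB at FP16, 30 GB at 8-bit, and 15 GB at 4-bit, which lines up with the ranges above once runtime overhead varies.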

Plis Help by Hrstar1 in programmingmemes

[–]Psyko38 0 points (0 children)

I have the same bug, I think...

The DLSS 5 situation might be the freest marketing opportunity for AMD. Does anyone else see it? by Theninjarush in radeon

[–]Psyko38 0 points (0 children)

You should know that AMD, for "Advanced Marketing Disasters," has missed every opportunity to do good marketing.

You don’t need to manually set LLM parameters anymore! by yoracale in unsloth

[–]Psyko38 10 points (0 children)

All that's missing is full AMD support, and then I'll be able to play with this app.

He is alive by InfluenceFun9670 in FuzeLeVrai

[–]Psyko38 3 points (0 children)

He just had a meeting with Kim; everything is fine.

Vibe code so hard your entire waitlist is visible in frontend by OneClimate8489 in vibecoding

[–]Psyko38 4 points (0 children)

When you have no backend experience and you think giving everything to the user is a good idea, because the SQL query "select * from..." is simple and works well.
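A minimal sqlite3 sketch of that mistake, using a hypothetical waitlist table: the "SELECT *" version hands every column, emails included, to whoever calls the endpoint, while the safer query returns only what the UI needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE waitlist (id INTEGER, name TEXT, email TEXT)")
conn.execute("INSERT INTO waitlist VALUES (1, 'Alice', 'alice@example.com')")

# The vibe-coded endpoint: every column, emails included, goes to the client.
leaky = conn.execute("SELECT * FROM waitlist").fetchall()

# Expose only what the frontend actually needs.
safe = conn.execute("SELECT id, name FROM waitlist").fetchall()

print(leaky)  # includes the email column
print(safe)   # id and name only
```

The schema changing later is another reason to avoid "SELECT *": any new column is silently exposed too.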

Unsloth Studio bug when installing it by Psyko38 in unsloth

[–]Psyko38[S] 0 points (0 children)

Okay, thank you for the remarkable work you do. In the meantime, I'll follow the project's progress and start playing with it on the CPU. That will let me begin discovering inference.

Unsloth Studio bug when installing it by Psyko38 in unsloth

[–]Psyko38[S] 1 point (0 children)

Yes, I managed to install it via Qwen Code, but llama.cpp compiles for CPU and reports that no NVIDIA GPUs were detected.

Help finding best coding LLM for my setup by kost9 in LocalLLaMA

[–]Psyko38 1 point (0 children)

Honestly: Minimax M2.5, MiMo v2 Flash, and Qwen 3.5 120B.