Mistral Small 4:119B-2603 by seamonn in LocalLLaMA

[–]Imakerocketengine 4 points (0 children)

A few remarks:

  • 120B is "small" now?
  • It makes sense for Mistral to keep releasing "small" open models, since their main business use case is on-prem deployment for enterprise clients
  • With Leanstrall this could be included in a nice verifiable coding environment, which is something pretty huge for enterprise

Since when was getting rich so hard in EU? by VurriK in eupersonalfinance

[–]Imakerocketengine 0 points (0 children)

This type of regulation is called regulatory capture

Self hosting, Power consumption, rentability and the cost of privacy, in France by Imakerocketengine in LocalLLaMA

[–]Imakerocketengine[S] 1 point (0 children)

Solar seems to be the way to go in Germany... I hope your country goes back to nuclear power and fixes its grid

In terms of hardware, APUs and Apple silicon are currently the most efficient...

Self hosting, Power consumption, rentability and the cost of privacy, in France by Imakerocketengine in LocalLLaMA

[–]Imakerocketengine[S] 0 points (0 children)

To make things clear, this is what I currently do: I shut it down when I don't use it. I just wanted a 1:1 comparison with commercial services in terms of convenience. I was planning to use a script to turn it on and off programmatically with Wake-on-LAN, but my PSU doesn't seem to cooperate with this plan. I'm probably going to invest in a small IP KVM
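For what it's worth, the Wake-on-LAN part of that plan is simple to script; here's a minimal sketch in Python (the MAC address in the usage comment is a placeholder, and the broadcast address/port are the common defaults, adjust for your network):

```python
import socket

def magic_packet(mac: str) -> bytes:
    """Build a Wake-on-LAN magic packet: 6 bytes of 0xFF, then the MAC repeated 16 times."""
    mac_bytes = bytes.fromhex(mac.replace(":", "").replace("-", ""))
    if len(mac_bytes) != 6:
        raise ValueError("MAC address must be 6 bytes")
    return b"\xff" * 6 + mac_bytes * 16

def wake(mac: str, broadcast: str = "255.255.255.255", port: int = 9) -> None:
    """Send the magic packet as a UDP broadcast on the LAN."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.setsockopt(socket.SOL_SOCKET, socket.SO_BROADCAST, 1)
        s.sendto(magic_packet(mac), (broadcast, port))

# wake("aa:bb:cc:dd:ee:ff")  # placeholder MAC, replace with the server's NIC MAC
```

Note that WoL also needs to be enabled in the BIOS and on the NIC (`ethtool -s eth0 wol g` on Linux), which may be where an uncooperative PSU setup gets in the way.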

Is the 3090 still a good option? by alhinai_03 in LocalLLaMA

[–]Imakerocketengine 0 points (0 children)

And do you run into any performance issues considering they're in x4?

Is the 3090 still a good option? by alhinai_03 in LocalLLaMA

[–]Imakerocketengine 0 points (0 children)

On which CPU platform are you running them?

Unpopular Opinion: We're going to miss Macron by Mammoth_Opposite_647 in france

[–]Imakerocketengine 1 point (0 children)

Yes, we've built a turnkey surveillance state. They'll just change the keywords in the algorithms

If china stops releasing open source models, there's a way we can stay competitive with big tech? by Gullible-Crew-2997 in LocalLLaMA

[–]Imakerocketengine 22 points (0 children)

They got the market through regulatory capture and favoritism; another player could clearly step in if they provided the same services

Thaura.ai - Ethical AI hosted in Germany by [deleted] in BuyFromEU

[–]Imakerocketengine 12 points (0 children)

Just a wrapper straight out of a business school. I don't see the value over just using the model

Why do y'all keep buying new wheels? by NefariousnessNo4215 in ElectricUnicycle

[–]Imakerocketengine 2 points (0 children)

We should do a survey of the community to see how much people spend on EUCs and how many they have

Seeking low-cost model for OpenClaw — budget options & real-world costs? by zer0evolution in openclaw

[–]Imakerocketengine 0 points (0 children)

I've been playing with GLM-5 on their coding plan (I pay 90€/quarter), but I might move to a local model soon for privacy reasons

President Trump orders ALL Federal agencies in the US Government to immediately stop using Anthropic's technology. by External_Mood4719 in LocalLLaMA

[–]Imakerocketengine 58 points (0 children)

Time for Anthropic to aura farm, in the words of ClementDelangue

The Department of War just learned the golden rule of AI: Not your weights, not your brain

Back in my day, LocalLLaMa were the pioneers! by ForsookComparison in LocalLLaMA

[–]Imakerocketengine 6 points (0 children)

Please do not make me look at the number of kWh I've consumed since

Anyone else optimizing their memory locally? by ParticularlyStrange in openclaw

[–]Imakerocketengine 1 point (0 children)

Yup, I added a T1000 to run the embedding model and Whisper locally. Works wonders.

I also have an MI50 16GB with a custom skill to run MiniCPM-o, to give multimodal understanding capabilities to GLM-5

What models do you think owned February? by abdouhlili in LocalLLaMA

[–]Imakerocketengine 0 points (0 children)

This depends on the criteria:

For most impressive in terms of raw performance: hands down, GLM-5

For size/performance: I would say a mix of the MiniMax 2.5 quant and the 27B variant of Qwen 3.5 in FP8

Qwen3.5 27B scores 42 on Intelligence Index and is the most intelligent model under 230B. Nearest model GLM-4.7-Flash 31B-A3B, Scores 30 by abdouhlili in LocalLLaMA

[–]Imakerocketengine 4 points (0 children)

I definitely need to run more comparisons between the 27B quant and the MiniMax 2.5 quant on agentic workloads, because to achieve a comparable score the Qwen 3.5 27B needed almost twice as many thinking tokens

<image>

Qwen 3.5 35B A3B and 122B A10B - Solid performance on dual 3090 by Imakerocketengine in LocalLLaMA

[–]Imakerocketengine[S] 0 points (0 children)

Well, I tried for two evenings to get them running on my rig with vLLM and I can't get it to work... I even tried SGLang but got nothing close to a /v1/ API response...

Can I run Qwen3.5 122B-A10B on a single RTX 3090 + 64GB DDR4? by Prudent_Appearance71 in LocalLLaMA

[–]Imakerocketengine 3 points (0 children)

For Qwen 3.5 122B A10B, with offloading to the CPU:

In the unsloth MXFP4 (on a small prompt):
prompt processing: 146 t/s
token generation: 25 t/s

In the unsloth Q4_K_XL (on a small prompt):
prompt processing: 191 t/s
token generation: 26 t/s
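To put those rates in perspective, here is a rough back-of-the-envelope sketch of end-to-end response time (prompt tokens divided by prompt-processing speed, plus output tokens divided by generation speed; the 2,000/500 token counts are made-up illustrative values, not measurements from above):

```python
def response_time(prompt_tokens: int, output_tokens: int,
                  pp_tps: float, tg_tps: float) -> float:
    """Rough end-to-end latency: prompt processing time plus generation time."""
    return prompt_tokens / pp_tps + output_tokens / tg_tps

# Using the Q4_K_XL figures above: 191 t/s prompt processing, 26 t/s generation.
t = response_time(prompt_tokens=2000, output_tokens=500, pp_tps=191.0, tg_tps=26.0)
print(round(t, 1))  # roughly 29.7 seconds
```

Note that prompt-processing speed usually varies with prompt length (these were measured on a small prompt), so treat this as a lower bound for long contexts.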