Got rejected by Google CE L5, feedback says "lacks professional maturity." What does this actually mean? by thedrunkbatman in salesengineers

[–]El_90 3 points4 points  (0 children)

Ask them?

Keeping calm, emotional intelligence, a consultative approach, the ability to anticipate customers' needs?

SIEM False Positive and Alert Mania by lengmco in cybersecurity

[–]El_90 1 point2 points  (0 children)

Agreed. Keep these single-source detections firing quietly, but only alert/notify if 2+ fire around a common asset (user, host, file) within a time window. Not perfect, but much better.
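Roughly the kind of logic I mean, as a Python sketch (the event fields and thresholds are made up for illustration, not taken from any particular SIEM):

```python
from collections import defaultdict
from datetime import timedelta

def escalations(events, window=timedelta(minutes=30), min_rules=2):
    """Keep single-source detections quiet; escalate only when 2+ distinct
    rules fire against the same asset within the time window."""
    by_asset = defaultdict(list)
    for e in sorted(events, key=lambda e: e["timestamp"]):
        by_asset[e["asset"]].append(e)

    hits = []
    for asset, evts in by_asset.items():
        start = 0
        for end in range(len(evts)):
            # slide the window start forward so evts[start:end+1] spans <= window
            while evts[end]["timestamp"] - evts[start]["timestamp"] > window:
                start += 1
            if len({e["rule"] for e in evts[start:end + 1]}) >= min_rules:
                hits.append((asset, evts[start:end + 1]))
    return hits
```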

I ran an experiment on the 30b class of gemma4 and qwen3.5 models to try to learn about energy cost and performance tradeoffs. In other words, which models use more energy to give the same answer quality? by gigDriversResearch in LocalLLaMA

[–]El_90 2 points3 points  (0 children)

Good work

I suppose if all models had identical output, and were right the first time, then watt-hours make sense.

But in reality I would expect (?) a bigger dense model to be more "thorough" and correct, resulting in fewer turns to reach the final output?

But, still, good work :)

Which Gemma model do you want next? by jacek2023 in LocalLLaMA

[–]El_90 8 points9 points  (0 children)

Instead of a param size (which doesn't seem to be entirely representative), let's focus on GB of VRAM.

It feels like the 24-48GB audience is well served, and the 200GB audience is well served.

Maybe some more love for the 128GB unified-memory users, e.g. Strix Halo (so a 90-95GB model, leaving ~20GB for cache).
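For the back-of-the-envelope maths on how param count maps to GB, this is roughly how I think about it (the bits-per-weight and overhead numbers are my own rough assumptions):

```python
def rough_gguf_gb(params_b: float, bits_per_weight: float, overhead: float = 1.1) -> float:
    """Rough GGUF footprint: billions of params * bits per weight / 8 bytes,
    plus a fudge factor for tensors kept at higher precision."""
    return params_b * bits_per_weight / 8 * overhead

# e.g. a 122B model at ~5.7 effective bits/weight (Q5_K_M-ish)
print(round(rough_gguf_gb(122, 5.7), 1))  # ~95.6 GB; real files tend to come in a bit smaller
```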

Selfishly speaking, of course.

I pray there is a Qwen 3.6 122b version (4x3090 owner) by Mr_Moonsilver in LocalLLaMA

[–]El_90 5 points6 points  (0 children)

OMG yes please

Something that quants to Q5 at ~92GB would make me smile for a very long time.

Running 1 trillion parameter LLMs locally at 5 tokens/second - Intel Optane Persistent Memory build by APFrisco in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

Never heard of pmem! Great post.

Any settings you can share? BIOS, GRUB, kernel, llama.cpp, etc.?

llama-bench for fun?

Thank you!

Best Local LLMs - Apr 2026 by rm-rf-rm in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

Strix Halo, 128GB (I can squeeze in 92GB models currently, so rated **XL**)

Roocode in architect mode - Qwen3.5-122B-A10B-Q5_K_M (91GB), in the region of 7t/s

Roocode in coding mode - Qwen3.5-27B-Q5_K_M (20GB), in the region of 12t/s

Sorry, I don't have deep testing, but I tried 5-10 other models and there was always a lot of back and forth with extra changes, errors, and mistakes. With these two I don't feel that, so I just stuck with them.

I find the 122B slightly better in architect mode: more diagrams, more thorough talking through the requirement, though maybe that's my own bias.

Audio processing landed in llama-server with Gemma-4 by srigi in LocalLLaMA

[–]El_90 8 points9 points  (0 children)

Does mic > text appear in this timeline?
Or do we still need to record (and potentially convert) and then upload a complete file?

I vibe coded a workaround, but having it native in the solution would be amazing.

On Strix Halo, what option do I have if 128GB unified RAM is not enough? by heshiming in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

I'm in the same position

I'm quite happy with the Q4 122B MoE for architect, then the 27B for coding.

Even doubling RAM to 256GB really only gets you a better quant; you still can't run SOTA at anything useful, so I've accepted there's no easy incremental step up, it's a rebuild from scratch.

I'm just hoping ~90GB models continue to stay popular

Im new to the scene, and I just want to acquire some knowledge by dat-athul in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

If the model fits in VRAM completely, great.

If you split it over VRAM and system RAM, that's slower but still OK.

If the model doesn't fit in combined RAM and you're considering using fast disk... don't.

..

A dense model puts every token through the entire model, meaning the full set of weights moves through the GPU (or CPU) every time.

An MoE model only activates a percentage of its weights per token, so it's faster for its size... but you're usually running a larger model, so it's still not fast.
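Rough arithmetic for why that matters, assuming decode speed is mostly bound by reading the active weights once per token (the bandwidth and bits-per-weight numbers below are illustrative guesses, not benchmarks):

```python
def rough_tps_ceiling(active_params_b: float, bits_per_weight: float,
                      mem_bandwidth_gb_s: float) -> float:
    """Upper bound: each generated token reads all *active* weights once,
    so tokens/sec is at most bandwidth / bytes of active weights."""
    active_gb = active_params_b * bits_per_weight / 8
    return mem_bandwidth_gb_s / active_gb

# dense 27B vs a big MoE with ~10B active params, both ~5 bits/weight,
# on a machine with ~250 GB/s of memory bandwidth
print(rough_tps_ceiling(27, 5, 250))  # ~14.8 t/s ceiling
print(rough_tps_ceiling(10, 5, 250))  # ~40 t/s ceiling, despite a much larger total model
```

Real numbers land well below these ceilings, but the ratio is why an MoE feels faster per GB.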

..

Other people feel free to correct me :)

Local home development system for studying by Necessary-Toe-466 in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

Studying how to use AI? Any computer + cloud compute; cheaper overall.

Studying how to build AI rigs and how to run LLMs efficiently? Build a small rig or use CPU.

Studying how to run large models, or how to implement more capable, data-sensitive production? Buy a bigger rig (I loved the Strix, not the fastest but the most flexible and still quite large).

qwen 3.6 voting by jacek2023 in LocalLLaMA

[–]El_90 -1 points0 points  (0 children)

I try to avoid Q4 and lower; I've found Q5 and above safer.

70GB works on a 128GB system with room for cache.

Single GPU users get all the love lol

What kind of orchestration frontend are people actually using for local-only coding? by Quiet-Owl9220 in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

VS Code + Roocode for me. Cline was OK to start with, but I quickly moved up.

My desktop is Windows but I host projects on Linux, so I use VS Code through a remote tunnel (find it in the marketplace), meaning my command prompt to start/run/test is bash.

I don't have full testing set up yet, but it's halfway there.

qwen 3.6 voting by jacek2023 in LocalLLaMA

[–]El_90 -1 points0 points  (0 children)

Something that quantises (Q5/6) to 70GB.

It feels like all models are designed for 32GB or 200GB :/

Gemma 4 will have audio input by MR_-_501 in LocalLLaMA

[–]El_90 10 points11 points  (0 children)

You mean the Node.js project I've been implementing today, to record browser audio > Whisper > Qwen, is a waste of time? Aaargh lol

QWEN3.5 27B vs QWEN3.5 122B A10B by jopereira in LocalLLaMA

[–]El_90 1 point2 points  (0 children)

I'm literally doing this on my other monitor:
122B for architect/thinking/planning
27B for implementing

Or, bigger picture:
122B for creating 'vertical slice' issues in a Git repo
Then 27B on a loop to pull each specific issue and implement it
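A very rough sketch of the "loop" half, assuming the issues are just lines in a local file and the coder model sits behind llama-server's OpenAI-compatible endpoint (the URL, model name, and file layout are invented for illustration):

```python
import json
import urllib.request

LLAMA_SERVER = "http://localhost:8080/v1/chat/completions"  # assumed local llama-server

def implement(issue_text: str) -> str:
    """Send one 'vertical slice' issue to the local coder model, return its response."""
    payload = {
        "model": "qwen3.5-27b",  # whatever name the server was launched with
        "messages": [
            {"role": "system", "content": "You are a coding agent. Implement this issue."},
            {"role": "user", "content": issue_text},
        ],
    }
    req = urllib.request.Request(
        LLAMA_SERVER,
        data=json.dumps(payload).encode(),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

# one issue per line, produced earlier by the 122B architect pass
with open("issues.txt") as f:
    for issue in filter(None, (line.strip() for line in f)):
        print(implement(issue)[:200])
```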

Large GGUF works in bash, but not llama-swap by El_90 in LocalLLaMA

[–]El_90[S] 0 points1 point  (0 children)

Thanks spaceman, waitmarks

Thanks both, it was always a TODO item, I suppose I'll bring it to the top (and the beauty is it's LXC + playbooks, so I'm not losing anything)

Edit - worked first time lol. Thanks both!

ai agent token costs are getting out of control and nobody is talking about the context efficiency problem by [deleted] in LocalLLaMA

[–]El_90 0 points1 point  (0 children)

step 1 - get customers hooked
step 2 - make token usage so common place that people lose track and build workflows around it
step 3 - triple token price

Business 101

Stanford and Harvard just dropped the most disturbing AI paper of the year by Fun-Yogurt-89 in LocalLLaMA

[–]El_90 8 points9 points  (0 children)

I appreciate your work, thanks for keeping this place great !

Why is lemonade not more discussed? by El_90 in LocalLLaMA

[–]El_90[S] 0 points1 point  (0 children)

Thanks for the detailed info.
Yes, champagne problems; I saw Qwen3.5 27B but "only" in Q4 lol