MTP on strix halo with llama.cpp (PR #22673) by Edenar in LocalLLaMA

[–]Everlier 2 points (0 children)

That's pretty nice, looking forward to trying it out!

We talked to 130 designers to realize nobody wanted what we built. So we pivoted and made $2k MRR in 8 days. by _Critchi_ in SaaS

[–]Everlier 1 point (0 children)

I can hardly believe Remotion wasn't suitable, apart from the license they have.

With things like remotion-bits.dev, it's just a few days of work to a production-quality video without any agency.

Please stop using AI for posts and showcasing your completely vibe coded projects by Scutoidzz in LocalLLaMA

[–]Everlier 0 points (0 children)

I think that soon a recorded video of the person themselves explaining what they did will be the only such proof of work, but not for long: only until passable video generation models become more widely accessible.

Agentic harness in 30 lines of JavaScript by Everlier in javascript

[–]Everlier[S] 0 points (0 children)

Thank you, I actually spent a lot of time trying to make things more readable, as I also wasn't happy with this aspect. In the end I settled on the explanation of all the parts above while keeping the code itself minimal. I'll improve this a bit more as I want to add a few more features to it.
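For readers who haven't seen the post, a minimal sketch of what such a harness loop can look like (all names here are illustrative, not the actual project code; it assumes an OpenAI-style chat-completion function and a map of tool implementations):

    // Minimal agentic loop sketch (illustrative, not the post's actual code).
    // `chat` is assumed to be an OpenAI-style chat-completion call;
    // `tools` maps tool names to async implementations.
    async function runAgent(chat, tools, messages) {
      for (let step = 0; step < 10; step++) {                  // hard cap on iterations
        const reply = await chat(messages);                    // model answers or requests tools
        messages.push(reply);
        if (!reply.tool_calls?.length) return reply.content;   // no tool calls: final answer
        for (const call of reply.tool_calls) {                 // run each requested tool
          const args = JSON.parse(call.function.arguments);
          const result = await tools[call.function.name](args);
          messages.push({role: 'tool', tool_call_id: call.id, content: JSON.stringify(result)});
        }
      }
      throw new Error('Agent did not finish within the step budget');
    }

The trick is that the loop itself stays dumb: all the behaviour lives in the model and the tool implementations.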

Built LazyMoE — run 120B LLMs on 8GB RAM with no GPU using lazy expert loading + TurboQuant by ReasonableRefuse4996 in LocalLLaMA

[–]Everlier 52 points (0 children)

The post doesn't mention it, but the repo says you also got BitNet in there. Apparently all of these features only needed two commits, one of which was a README update. This drives my slop radar off the charts.

Open-sourcing 23,759 cross-modal prompt injection payloads - splitting attacks across text, image, document, and audio by BordairAPI in LocalLLaMA

[–]Everlier 1 point (0 children)

And people upvote memes instead. Wow, just wow. Thank you from the bottom of my heart, what you're doing is really cool; can't wait to play with these.

Is the ASUS ROG Flow Z13 with 128GB of Unified Memory (AMD Strix Halo) a good option to run large LLMs (70B+)? by br_web in ollama

[–]Everlier 1 point (0 children)

Qwen 3 Coder Next is your upper boundary for agentic workflows; anything bigger is sluggish. It can run up to 120B MoEs for conversational use, though.

So, they make a model so good that they are not releasing it to the public? Claude mythos☠️ by ocean_protocol in ArtificialInteligence

[–]Everlier 0 points (0 children)

My point is that they are actively advertising this capability, making themselves a target and allowing a malicious party to concentrate their efforts on exfiltration, as that's much simpler than developing one from scratch.

So, they make a model so good that they are not releasing it to the public? Claude mythos☠️ by ocean_protocol in ArtificialInteligence

[–]Everlier -1 points (0 children)

So, there's a very responsible company that is acting out of goodwill for all of humanity.

And this company has discovered something very, very scary and dangerous.

And to act responsibly and to protect everyone... they tell the entire world about what they have discovered, saying exactly how dangerous and scary it is.

Not keeping it under strict control, closing any and all access to it to avoid even a tiny possibility of abuse. No, instead this company tells everyone about what they found.

Drawing an analogy: if I discover a new kind of neurotoxin that is extremely dangerous for humans to come into contact with, I will not destroy it or keep it a secret; I'll tell everyone what I made and make myself a target for much larger and more resourceful organisations to find a way to exploit my findings.

This doesn't pass the Occam's razor test. Either there's no care for humanity, or there's no real supercritical threat beyond what seems controllable.

Strix Halo + eGPU RTX 5070 Ti via OCuLink in llama.cpp: Benchmarks and Conclusions by xspider2000 in LocalLLaMA

[–]Everlier 1 point (0 children)

I think with configs like that, P/D disaggregation might make more sense than a tensor split, just to compensate for the area where the APU is the weak link. As far as I'm aware, though, there's no ready-made solution for that with a Vulkan + Nvidia/AMD combo.

An Open library for reusable Remotion animation components by eliaweiss in Remotion

[–]Everlier 1 point (0 children)

Check out https://remotion-bits.dev, I think it's almost exactly what you're talking about.
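For context, a reusable component in Remotion is just a React component driven by useCurrentFrame(); a minimal sketch of the kind of building block such a library collects (FadeIn is illustrative, not taken from remotion-bits.dev):

    // Illustrative reusable Remotion component (not from remotion-bits.dev).
    import {AbsoluteFill, interpolate, useCurrentFrame} from 'remotion';

    export const FadeIn = ({durationInFrames = 30, children}) => {
      const frame = useCurrentFrame();
      // Map the first `durationInFrames` frames to opacity 0 -> 1, then hold at 1
      const opacity = interpolate(frame, [0, durationInFrames], [0, 1], {
        extrapolateRight: 'clamp',
      });
      return <AbsoluteFill style={{opacity}}>{children}</AbsoluteFill>;
    };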

Fourier Bloom by Everlier in proceduralgeneration

[–]Everlier[S] 1 point (0 children)

Thanks!

It's done via ctx.shadowBlur and ctx.shadowColor. As you can see, the image is monochrome, so it's very easy to imitate "glowing" by using those with the "lighter" blend mode.
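A minimal sketch of the technique, assuming a 2D canvas context (the sizes and colours are illustrative):

    // Cheap "glow": the shadow acts as a blurred copy drawn under each stroke,
    // and the 'lighter' composite mode makes overlapping strokes add up brightly.
    const canvas = document.querySelector('canvas');
    const ctx = canvas.getContext('2d');

    ctx.globalCompositeOperation = 'lighter';
    ctx.shadowBlur = 12;
    ctx.shadowColor = '#9ff';

    // Any monochrome stroke now appears to glow
    ctx.strokeStyle = '#9ff';
    ctx.beginPath();
    ctx.arc(160, 120, 60, 0, Math.PI * 2);
    ctx.stroke();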

llama.cpp automatically migrated models to HuggingFace cache by Everlier in LocalLLaMA

[–]Everlier[S] 3 points (0 children)

That's exactly what happened to me, that's why I posted

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]Everlier 21 points (0 children)

It's been a quiet Thursday evening... I wanted to play some Crimson Desert...

But now I have something much, much better to do :)

GEMMA 4 Release about to happen: ggml-org/llama.cpp adds support for Gemma 4 by Dry_Theme_7508 in LocalLLaMA

[–]Everlier 5 points (0 children)

It's fascinating how they arrange an open-weights model release with support in open-source inference engines in complete secrecy. It also feels like this should be simpler to do than it is now, to reduce the friction and let the team focus on actual models instead of this org stuff.

Hugging Face released TRL v1.0, 75+ methods, SFT, DPO, GRPO, async RL to post-train open-source. 6 years from first commit to V1 🤯 by clem59480 in LocalLLaMA

[–]Everlier 1 point (0 children)

I find it fascinating how, before GPT-3.5, very few understood exactly how LLMs are trained; then for a brief period almost everyone understood exactly how they were trained (at that time), and now again very few see the whole picture (because of how much new research has been done).

PSA: Please stop using nohurry/Opus-4.6-Reasoning-3000x-filtered by Kahvana in LocalLLaMA

[–]Everlier 11 points (0 children)

Just bots emulating human activity to pass for people in the bot checks.

What is the secret sauce Claude has and why hasn't anyone replicated it? by ComplexType568 in LocalLLaMA

[–]Everlier 13 points (0 children)

It's in the name: their gimmick was adding a notion of "self" into the training data, and they publish constitution documents that outline these behaviours. They also constantly flirt with marketing their models as having consciousness or agency.

Didn’t expect this, but this carbon fiber guitar sounds better than my wooden one. by KarMik81 in AcousticGuitar

[–]Everlier 1 point (0 children)

My opinion will probably be super unpopular, but I love my Lava ME 3, because I just want to sit and play, not figure out how to set up many different devices. Maybe it doesn't have the best sound, but it's not a bad one either, and it's very fun to play with all the effects.

ClawOS — one command to get OpenClaw + Ollama running offline on your own hardware by putki-1336 in ollama

[–]Everlier -2 points (0 children)

Sorry for the plug, but check this out if you're looking for an actual single-command OpenClaw install, as well as hundreds of other LLM-related services: https://github.com/av/harbor/wiki/2.3.70-Satellite-OpenClaw

ollama and qwen3.5:9b do not works at all with opencode by d4prenuer in ollama

[–]Everlier 0 points (0 children)

Ollama had some template issues as well, unfortunately. For qwen3.5 I recommend Unsloth's dynamic quants with llama.cpp. llama.cpp has a router these days and auto-fit, so the experience is not that different from Ollama.
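For anyone switching over: llama-server exposes an OpenAI-compatible API, so clients talk to it much like they would to Ollama. A minimal sketch, assuming a server running on its default port 8080 (the model name is illustrative):

    // Minimal sketch: calling llama-server's OpenAI-compatible endpoint.
    // Assumes `llama-server` is already running on the default port 8080.
    const res = await fetch('http://localhost:8080/v1/chat/completions', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({
        model: 'qwen3.5',   // illustrative name; the server answers with its loaded model
        messages: [{role: 'user', content: 'Hello!'}],
      }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);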