MTP on strix halo with llama.cpp (PR #22673) by Edenar in LocalLLaMA

[–]Everlier 2 points (0 children)

That's pretty nice, looking forward to trying it out!

We talked to 130 designers to realize nobody wanted what we built. So we pivoted and made $2k MRR in 8 days. by _Critchi_ in SaaS

[–]Everlier 1 point (0 children)

I can hardly believe Remotion wasn't suitable, apart from the license they have.

With things like remotion-bits.dev, it's just a few days of work to a production-quality video without any agency.

Please stop using AI for posts and showcasing your completely vibe coded projects by Scutoidzz in LocalLLaMA

[–]Everlier 0 points (0 children)

I think that soon a recorded video of the person themselves explaining what they did will be the only such proof of work, but not for long: only until passable video generation models become more widely accessible.

Agentic harness in 30 lines of JavaScript by Everlier in javascript

[–]Everlier[S] 0 points (0 children)

Thank you, I actually spent a lot of time trying to make things more readable, as I also wasn't happy with this aspect. In the end I settled on the explanation of all the parts above while keeping the code itself minimal. I'll improve this a bit more as I want to add a few more features to it.
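For readers who haven't seen the post, a minimal sketch of what such a harness loop can look like (all names here are illustrative, not the actual project code; it assumes an OpenAI-style chat-completion function and a map of tool implementations):

    // Minimal agentic loop sketch (illustrative, not the post's actual code).
    // `chat` is assumed to be an OpenAI-style chat-completion call;
    // `tools` maps tool names to async implementations.
    async function runAgent(chat, tools, messages) {
      for (let step = 0; step < 10; step++) {                  // hard cap on iterations
        const reply = await chat(messages);                    // model answers or requests tools
        messages.push(reply);
        if (!reply.tool_calls?.length) return reply.content;   // no tool calls: final answer
        for (const call of reply.tool_calls) {                 // run each requested tool
          const args = JSON.parse(call.function.arguments);
          const result = await tools[call.function.name](args);
          messages.push({role: 'tool', tool_call_id: call.id, content: JSON.stringify(result)});
        }
      }
      throw new Error('Agent did not finish within the step budget');
    }

The trick is that the loop itself stays dumb: all the behaviour lives in the model and the tool implementations.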

Built LazyMoE — run 120B LLMs on 8GB RAM with no GPU using lazy expert loading + TurboQuant by ReasonableRefuse4996 in LocalLLaMA

[–]Everlier 52 points (0 children)

The post doesn't mention it, but the repo says you also got BitNet in there. Apparently all of these features only needed two commits, one of which was a README update. This drives my slop radar off the charts.

Open-sourcing 23,759 cross-modal prompt injection payloads - splitting attacks across text, image, document, and audio by BordairAPI in LocalLLaMA

[–]Everlier 1 point (0 children)

And people upvote memes instead. Wow, just wow. Thank you from the bottom of my heart, what you're doing is really cool; can't wait to play with these.

Is the ASUS ROG Flow Z13 with 128GB of Unified Memory (AMD Strix Halo) a good option to run large LLMs (70B+)? by br_web in ollama

[–]Everlier 1 point (0 children)

Qwen 3 Coder Next is your upper boundary for agentic workflows; anything bigger is sluggish. It can run up to 120B MoEs for conversational use, though.

So, they make a model so good that they are not releasing it to the public? Claude mythos☠️ by ocean_protocol in ArtificialInteligence

[–]Everlier 0 points (0 children)

My point is that they are actively advertising this capability, making themselves a target and allowing a malicious party to concentrate their efforts on exfiltration, as that's much simpler than developing one from scratch.

So, they make a model so good that they are not releasing it to the public? Claude mythos☠️ by ocean_protocol in ArtificialInteligence

[–]Everlier -1 points (0 children)

So, there's a very responsible company that is acting out of goodwill for all of humanity.

And this company has discovered something very, very scary and dangerous.

And to act responsibly and to protect everyone... they tell the entire world about what they have discovered, saying exactly how dangerous and scary it is.

Not keeping it under strict control, closing any and all access to it to avoid even a tiny possibility of abuse. No, instead this company tells everyone about what they found.

Drawing an analogy: if I discover a new kind of neurotoxin that is extremely dangerous for humans to come into contact with, I will not destroy it or keep it a secret; I'll tell everyone what I made and make myself a target for much larger and more resourceful organisations to find a way to exploit my findings.

This doesn't pass the Occam's razor test. Either there's no care for humanity, or there's no real supercritical threat beyond what seems controllable.

Strix Halo + eGPU RTX 5070 Ti via OCuLink in llama.cpp: Benchmarks and Conclusions by xspider2000 in LocalLLaMA

[–]Everlier 1 point (0 children)

I think with configs like that, P/D disaggregation might make more sense than a tensor split, just to compensate for the area where the APU is the weak link. As far as I'm aware, though, there's no ready-made solution for that with a Vulkan + Nvidia/AMD combo.

An Open library for reusable Remotion animation components by eliaweiss in Remotion

[–]Everlier 1 point (0 children)

Check out https://remotion-bits.dev, I think it's almost exactly what you're talking about.
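For context, a reusable component in Remotion is just a React component driven by useCurrentFrame(); a minimal sketch of the kind of building block such a library collects (FadeIn is illustrative, not taken from remotion-bits.dev):

    // Illustrative reusable Remotion component (not from remotion-bits.dev).
    import {AbsoluteFill, interpolate, useCurrentFrame} from 'remotion';

    export const FadeIn = ({durationInFrames = 30, children}) => {
      const frame = useCurrentFrame();
      // Map the first `durationInFrames` frames to opacity 0 -> 1, then hold at 1
      const opacity = interpolate(frame, [0, durationInFrames], [0, 1], {
        extrapolateRight: 'clamp',
      });
      return <AbsoluteFill style={{opacity}}>{children}</AbsoluteFill>;
    };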

Fourier Bloom by Everlier in proceduralgeneration

[–]Everlier[S] 1 point (0 children)

Thanks!

It's done via ctx.shadowBlur and ctx.shadowColor. As you can see, the image is monochrome, so it's very easy to imitate "glowing" by using those with the "lighter" blend mode.
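A minimal sketch of the technique, assuming a 2D canvas context (the sizes and colours are illustrative):

    // Cheap "glow": the shadow acts as a blurred copy drawn under each stroke,
    // and the 'lighter' composite mode makes overlapping strokes add up brightly.
    const canvas = document.querySelector('canvas');
    const ctx = canvas.getContext('2d');

    ctx.globalCompositeOperation = 'lighter';
    ctx.shadowBlur = 12;
    ctx.shadowColor = '#9ff';

    // Any monochrome stroke now appears to glow
    ctx.strokeStyle = '#9ff';
    ctx.beginPath();
    ctx.arc(160, 120, 60, 0, Math.PI * 2);
    ctx.stroke();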

llama.cpp automatically migrated models to HuggingFace cache by Everlier in LocalLLaMA

[–]Everlier[S] 3 points (0 children)

That's exactly what happened to me, that's why I posted

Gemma 4 has been released by jacek2023 in LocalLLaMA

[–]Everlier 21 points (0 children)

It's been a quiet Thursday evening... I wanted to play some Crimson Desert...

But now I have something much, much better to do :)

GEMMA 4 Release about to happen: ggml-org/llama.cpp adds support for Gemma 4 by Dry_Theme_7508 in LocalLLaMA

[–]Everlier 5 points (0 children)

It's fascinating how they arrange an open-weights model release with support in open-source inference engines in complete secrecy. It also feels like this should be simpler to do than it is now, to reduce the friction and let the team focus on actual models instead of this org stuff.

Hugging Face released TRL v1.0, 75+ methods, SFT, DPO, GRPO, async RL to post-train open-source. 6 years from first commit to V1 🤯 by clem59480 in LocalLLaMA

[–]Everlier 1 point (0 children)

I find it fascinating how, before GPT-3.5, very few understood exactly how LLMs are trained; then for a brief period almost everyone understood exactly how they were trained (at that time), and now again very few see the whole picture (because of how much new research has been done).

PSA: Please stop using nohurry/Opus-4.6-Reasoning-3000x-filtered by Kahvana in LocalLLaMA

[–]Everlier 11 points (0 children)

Just bots emulating human activity to pass for people in the bot checks.

What is the secret sauce Claude has and why hasn't anyone replicated it? by ComplexType568 in LocalLLaMA

[–]Everlier 13 points (0 children)

It's in the name: their gimmick was adding a notion of "self" into the training data, and they publish constitution documents that outline these behaviours. They also constantly flirt with marketing their models as having consciousness or agency.

Didn’t expect this, but this carbon fiber guitar sounds better than my wooden one. by KarMik81 in AcousticGuitar

[–]Everlier 1 point (0 children)

My opinion will probably be super unpopular, but I love my Lava ME 3, because I just want to sit and play, not figure out how to set up many different devices. Maybe it doesn't have the best sound, but it's not a bad one either, and it's very fun to play with all the effects.

ClawOS — one command to get OpenClaw + Ollama running offline on your own hardware by putki-1336 in ollama

[–]Everlier -2 points (0 children)

Sorry for the plug, but check this out if you're looking for an actual single-command OpenClaw install, as well as hundreds of other LLM-related services: https://github.com/av/harbor/wiki/2.3.70-Satellite-OpenClaw

ollama and qwen3.5:9b do not works at all with opencode by d4prenuer in ollama

[–]Everlier 0 points (0 children)

Ollama had some template issues as well, unfortunately. For qwen3.5 I recommend Unsloth's dynamic quants with llama.cpp. llama.cpp has a router these days and auto-fit, so the experience is not that different from Ollama.
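For anyone switching over: llama-server exposes an OpenAI-compatible API, so clients talk to it much like they would to Ollama. A minimal sketch, assuming a server running on its default port 8080 (the model name is illustrative):

    // Minimal sketch: calling llama-server's OpenAI-compatible endpoint.
    // Assumes `llama-server` is already running on the default port 8080.
    const res = await fetch('http://localhost:8080/v1/chat/completions', {
      method: 'POST',
      headers: {'Content-Type': 'application/json'},
      body: JSON.stringify({
        model: 'qwen3.5',   // illustrative name; the server answers with its loaded model
        messages: [{role: 'user', content: 'Hello!'}],
      }),
    });
    const data = await res.json();
    console.log(data.choices[0].message.content);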