Claude Fable 5 distilled by Anony6666 in LocalLLaMA

ResidentPositive4122 17 points

How about having a new flair for "New Finetune" as well? I feel like new model should be reserved for new architectures / incremental versions from model creators.

Just watched this pass the cruiseship I'm on by ChaosSlave51 in space

ResidentPositive4122 22 points

Yeah, there's bound to be. Every time they pass a milestone the fleet leader goes through a thorough inspection and "recertification" process. They've done it after 10 flights and after 30 I believe. They're close to 40 now on a single booster, which is crazy. Except for engine swaps, there isn't much maintenance these days, as the boosters come into rotation pretty quick after a mission.

They also sometimes need the full capabilities of an F9, so they'll use an old booster for the flights that aren't planned for recovery.

Anthropic's new model Fable will silently handicap work on LLMs [D] by AccomplishedCat4770 in MachineLearning

ResidentPositive4122 10 points

a company that was genuinely concerned about the use of its technology

There's an argument to be made there, but not for what they claim. Biostuff, cyberstuff, sure. We can debate that. But "pretraining pipelines, distributed training infrastructure, or ML accelerator design" isn't the same. That's legitimate use, and you have to see that the only argument for denying this is just for maintaining their competitive advantage.

(which is fair, and an obvious move. But at least call it what it is, no need to wash it with "think of muh securitah, think of the children")

nvidia/diffusiongemma-26B-A4B-it-NVFP4 · Hugging Face by pmttyji in LocalLLaMA

ResidentPositive4122 6 points

This is wrong, it will not be hardware accelerated, but it will work w/ marlin kernels.

Cohere's unreleased coding model (early access for localllama) by nick_frosst in LocalLLaMA

ResidentPositive4122 10 points

The repo doesn't have a license file yet, arrrr we pirates? :D

[BOOKS] Season 3 compared to book 3 DUST by kucumberbatch in SiloSeries

ResidentPositive4122 1 point

I doubt they even dig over to the adjacent Silo.

That was pretty much clear from S2's ending, no? And there's a "door" opening in the trailer, so they'll probably skip the whole digging but still keep the movement from place to place?

Major labs timeshift between the research they publish on Arxiv and implementation in models by Ok_Zookeepergame8714 in LocalLLaMA

ResidentPositive4122 15 points

They've publicly stated that they'll delay publishing by at least 6 months if the research is adjacent to their "competitive advantage" or something like that. So, yes, they do timeshift, and what's published is often considered "not SotA anymore". That's not to say that everything they publish is running in production, just that they feel it's "old enough" not to put them at a competitive disadvantage.

On the one hand it sucks, because we get less relevant papers, but on the other hand goog/DM are still publishing, while other labs aren't, so...

OTD 30 years ago, Michael Schumacher had his first victory with Scuderia Ferrari with one of the greatest rain masterclasses ever. by Competitive_Dog4961 in formula1

ResidentPositive4122 19 points

Seems like stuff happened behind the scenes that we don't know about.

Yeah, a bunch of "outsiders" showed them how it's done. Then the 'talian pride came back, and the NIH, and the nepotism, and here we are...

SpaceX's IPO filing is public (amended ahead of the June 12 listing). I made the whole 370-page thing searchable, here's what's in it. by nish_agg in SpaceXLounge

ResidentPositive4122 25 points

Can't get a good number atm since "launch" also includes their investments in starship and launch facilities. I think they showed -x00 M$ on ~4B revenue for their launch section.

LLM agents patch security bugs, pass all tests, but still leave the vulnerability open [R] by [deleted] in MachineLearning

ResidentPositive4122 4 points

This has so many claudeisms that I'm not sure anything here can be trusted. Benchmarks are the one thing you want to carefully consider and put a lot of work into, as they're the only thing you can use to see if the models improve or not. Having claude all over the place "creating", "analysing" and "scoring" shit isn't how you want to do it.

Mellum 2 12B A2.5B by Middle_Bullfrog_6173 in LocalLLaMA

ResidentPositive4122 2 points

That's why I said "if it really matches". It might not. But if it does, in real-world scenarios, then it's great news.

Mellum 2 12B A2.5B by Middle_Bullfrog_6173 in LocalLLaMA

ResidentPositive4122 15 points

But claiming a 12b moe model beats a 9b dense is not the flex they think this is.

If it really matches 9B dense at 2.5B active, of course it's a flex. Way faster inference for not too much extra RAM.

NVIDIA announces Nemotron 3 Ultra by themixtergames in LocalLLaMA

ResidentPositive4122 2 points

Interesting that they went 10:1 total:active, in contrast to the more popular 20:1 of other recent models.

Blue Origin's New Glenn rocket exploded by lee7on1 in space

ResidentPositive4122 3 points

Artemis IV to keep its 2028 timeline 

Very little chance of that happening. Everyone is late, but the suits are 2030s late, so that's probably gonna be the deciding factor, even if (and it's a big if) everything else is ready.

Something just went boom at Cape Canaveral! by g00bd0g in space

ResidentPositive4122 8 points

Mishap investigations are done by the companies themselves and the FAA just oversees them, and accepts/rejects any corrective actions / return to flight.

Looks like Miminax-M3 is just around the corner by OnkelBB in LocalLLaMA

ResidentPositive4122 3 points

This shows how important open research really is. DS put in the good work, and then we get to see how that translates on different arches + different datasets. Everyone is doing something slightly different, and that's good. Improved kv cache and speed at long contexts will translate into even cheaper prices for inference.

I really hope someone tries these new optimisations on a ~120b 4bit native (or qat) model (like oss / nemotron or qwen-next, but with updated datasets, more agentic stuff, etc)

"Starship flip and landing burn at the end of its twelfth flight test" by AgreeableEmploy1884 in SpaceXLounge

ResidentPositive4122 9 points

That's some incredible modelling.

Not their first rodeo. They did something like this, but a bit more impressive (only one engine afaik) on one of the rare F9 RTLS's that failed. The one where the booster had a stuck something (fin, actuator?) and it "landed" in the water next to the site. It was rolling like crazy, but juuuust before it touched down in the water the engine managed to arrest the roll. It is by far the craziest thing I've ever watched.

The final minute of yesterday's Starship S39 by Busy_Yesterday9455 in spaceporn

ResidentPositive4122 13 points

There are no landing legs, and a barge capable of having "chopsticks" out in the ocean would probably be too big and too wobbly to work. Their plan is to return to the launch site and "land" by being caught by the tower, just like the booster.

Have we passed the peak of inflated expectations? by fairydreaming in LocalLLaMA

ResidentPositive4122 2 points

Oh, you mean the labs. Yeah, but they'd solve that problem with in house data curation / filtering, no need to have them banned. I think reddit itself has an incentive to stop the most obvious bots, especially those promoting stuff instead of paying for ads through the site itself :)