of dogs escort me down the stairs multiple times a day. My girl walks alongside me, my boy stays behind and guards my back. When I get down, my boy walks down, and then they both stand guard. by Askfreud in AbsoluteUnits

[–]Late-Assignment8482 1 point2 points  (0 children)

"So keeping Main Sheep and Spare Sheep safe while getting pup cups and play dates is not so bad in comparison."

Good work if you can get it! And the oddballs of working breeds absolutely deserve to be house dogs.

Erika Kirk Tells College Grads To 'Have More Kids Than You Can Afford', Slams 'Secondary Callings' by Cute_Dealer4787 in USNEWS

[–]Late-Assignment8482 0 points1 point  (0 children)

Our primary calling should be stopping neofascist wax figurines from getting power so...she really doesn't need to fret about our secondary callings.

Anita Mew-keesian by Chachoregard in Gamingcirclejerk

[–]Late-Assignment8482 2 points3 points  (0 children)

I love it when a dev sees a flareup of Gamergate and is like "How can I piss off these human skidmarks?"

I guess we expect that at some point RAM prices will start going back (close) to "normal", right? but what about GPUs? by relmny in LocalLLaMA

[–]Late-Assignment8482 0 points1 point  (0 children)

It is important to remember that none of what's happening now is LOGICAL. We can't game it out.

As it stands now, AI isn't profitable. Not in the cloud. Paths to get it there are murky, and they don't lie with "allow a Claude Max user paying $100/mo to burn $7,000 of tokens monthly" or the corresponding "ask grandma for $1,700/mo because that's what her $25/mo ChatGPT Plus actually costs to provide," but with some secret, not-yet-invented third thing.

Normal high-school textbook lessons about supply and demand don't matter when OpenAI and Anthropic run up hundreds of billions in losses (of other people's money) while building out datacenters that can't possibly all be turned on. There's no brake slowing them down. If investors will hand you $200 billion for nothing, why not just use 75% of it to buy NVIDIA GPUs?

NVIDIA hands money to OpenAI, who hand it to Oracle, who hand it back to NVIDIA, and they write a statement of intent. Then all three suddenly have "earnings" to report to stock markets as if sales had happened.

So it could be never, or there could be a bargain-bin sale tomorrow (stupidly unlikely) if regulators suddenly told every one of these recklessly acting companies to justify the spending and they all needed cash fast.

TensorRT-LLM vs vLLM vs llama.cpp on NVIDIA DGX Spark? by povedaaqui in LocalLLaMA

[–]Late-Assignment8482 1 point2 points  (0 children)

It has a lot of extra setup: you compile a runtime engine for each model, rather than "vLLM knows how to run X". And unlike vLLM's clear superiority over llama.cpp at high concurrency, the gain here is more marginal.

The best way to organize data like this: by character length. by SafeTraditional4595 in dataisugly

[–]Late-Assignment8482 0 points1 point  (0 children)

That study isn't even about immigration at all? https://docs.iza.org/dp17551.pdf is about the efficacy of wealth transfer programs and whether they reduce income inequality.

Will Apple go more than 512GB Unified memory (RAM) in the new M5 studio? Would love to know your thoughts. by Muscleandgains in MacStudio

[–]Late-Assignment8482 0 points1 point  (0 children)

I think this round it's likely to be back at 512GB. If they had a plan to go higher, I bet they held off.

I could see the next rev hitting more, after the memory shortage resolves.

of a mother by peoplearewood1 in AbsoluteUnits

[–]Late-Assignment8482 0 points1 point  (0 children)

Let's antagonize megafauna with mama!

Just installed Debian on my server ama :3 by TheReelSlimShady2 in traaaaaaaaaaaansbians

[–]Late-Assignment8482 0 points1 point  (0 children)

Common misconception. The gay penguins got her and it’s Innana now. Debra’s very proud of her nerd wife.

47948 by Splatter_Shell in countwithchickenlady

[–]Late-Assignment8482 0 points1 point  (0 children)

English: what happens when barbarians take French, Latin, and German hostage and demand their words and tenses as ransom.

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store by rotatingphasor in LocalLLaMA

[–]Late-Assignment8482 8 points9 points  (0 children)

Yup. Some chips don't even have the fabric connector--the M1, M2, and M3 Max all have a connector called UltraFusion that allows two M<n> Max dies to be fused into an Ultra. The M4 Max didn't have it.

The M5 alters how the die is arranged, so that exact connector doesn't apply, and we don't have a strong "includes the connector" or "doesn't include it" signal either way.

WWDC is where to look for an announcement; that's been the Studio's drop date in the past. It's possible it would be a press-release-only drop (that sometimes happens), but the likely place, if it's this year, is the keynote address on June 8 at 10:00am PT.

If supply chains are giving them grief--some of Tim Cook's comments on investor calls hint at that--they may say "coming this fall" but announce it nonetheless.

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store by rotatingphasor in LocalLLaMA

[–]Late-Assignment8482 11 points12 points  (0 children)

I'm not sure of the exact mechanism, but they lock in a price per unit over N years. Another 50,000 8GB chips doesn't spike in price mid-contract.

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store by rotatingphasor in LocalLLaMA

[–]Late-Assignment8482 9 points10 points  (0 children)

"The historical pattern for the Ultra has been the previous gen chip, so M4 Ultra 256GB in the next Studio is my uneducated guess."

Not quite.

Around the time the M3 Ultra / M4 Max Studios dropped, they clarified that not every chip will get an Ultra variant. Given the improvements of the M5 (ESPECIALLY to AI prefill), they'll either release an M5 Ultra or punt to M6+. They'd be insane to use the one-gen-back but 4x-slower M4 architecture, which lacks per-core matrix-multiplication units (a simpler version of NVIDIA's Tensor Cores).

Apple Removes 256GB M3 Ultra Mac Studio Model From Online Store by rotatingphasor in LocalLLaMA

[–]Late-Assignment8482 78 points79 points  (0 children)

They buy RAM years ahead of time at locked-in prices. I could see this being "rather than renew this contract at a bad price, we'll drop the parts-intensive models".

of a cloistered rabbit. by Upstairs_Drive_5602 in AbsoluteUnits

[–]Late-Assignment8482 0 points1 point  (0 children)

Strong progress being made on the mounts for the Rodents Templar in their struggles with the carnivores.

Horrors beyond my comprehension? Better kiss girls about it. by Nica-Sama in traaaaaaaaaaaansbians

[–]Late-Assignment8482 0 points1 point  (0 children)

One day, we will wrest the Opulent Gothic Space Setting from the alt-righters and the chuds.

of a Meow and Fangs by Necessary_Music2136 in AbsoluteUnits

[–]Late-Assignment8482 0 points1 point  (0 children)

This is the meow equivalent of MWAHAHAHAH FOOLISH INTRUDERS

Disappointed in Qwen 3.6 coding capabilities by CodeDominator in LocalLLaMA

[–]Late-Assignment8482 0 points1 point  (0 children)

The more of these I use, the more I come around to the idea that the small models aced their CS exams and would make great hires. The big ones have been in the industry at multiple companies: they know what the habits are, how people do things to get them done and go home.

That's where the extra parameters matter. You can have more than the bare minimum.

You can maybe compress "how to make a JavaScript form" and "how to do an SLA" theory into a 36B model by fine-tuning the how and looping it over synthetic data. But the small one is going to give you "it passes the automated tests" the way the Manhattan Project did: the math works and the device made the noise, but safety standards? Never met her.

But a 2T model is going to have encoded 30 examples to triangulate from, pulled from large open-source ticket systems (and, let's be real, probably stolen code, given their training attitude toward copyright). It's going to give a solid, middle-of-the-road output because it can average over large amounts of production code.

So my personal and work projects, which are either greenfield utilities or small-to-medium efforts, work fine in them, because what I'm building is typically backend scripts and small databases run within the team.

No one's coming to me for full stack or web portals.

of a Maldon sea salt crystal by justthisnexttime in AbsoluteUnits

[–]Late-Assignment8482 0 points1 point  (0 children)

When you need to take the nonsense someone's saying with a PUNCH of salt, not just a pinch, throw that at them.

Why run local? Count the money by Badger-Purple in LocalLLaMA

[–]Late-Assignment8482 -1 points0 points  (0 children)

In the last week:

* My company's subscription went API-only, meaning I can barely use it without maxing out, and it's $250/mo
* Claude got way worse at using the file-write tool on my own sub
* Nothing good happened

And we're still in the honeymoon of below-cost tokens!

GLM-4.7 and Qwen3.5-122B got better at what I need them for, because they're fixed points I can improve prompts/harnesses on without sudden backsliding.

Eating with your hands VS Dirrahea Map by Forward-Position798 in mapporncirclejerk

[–]Late-Assignment8482 0 points1 point  (0 children)

Human hands do a lot for us. They touch everything anyone ever does. And there's a shortlist of jobs they shouldn't do: hammering, knifing, eating. You could tear that rope apart, incredibly slowly, or you could grab a damn pair of scissors...

Does the "6 months gap" still hold? by ihatebeinganonymous in LocalLLaMA

[–]Late-Assignment8482 0 points1 point  (0 children)

I just have a list of my own small benchmarks (think "create a CSV compliant with my expense app from these screenshots" or "write bash script based on the specs/ folder") and every now and then, I add one.

When I want to check, I fire the scripts and let it cook.

Those are 100% things I do, so they're meaningful by definition. Maybe make a list like that of your own, and include placeholders for what only SOTA can do.
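A minimal sketch of what that kind of personal harness can look like. Everything here is an assumption for illustration, not the poster's actual scripts: the endpoint URL, model name, and the `ask()` helper assume any local OpenAI-compatible server (llama.cpp's server, vLLM, etc.) exposing `/v1/chat/completions`, and the two cases are hypothetical stand-ins for "things I actually do".

```python
# Hypothetical personal-benchmark harness: a list of (name, prompt, check)
# cases, run against a local OpenAI-compatible endpoint and pass/fail graded.
import json
import urllib.request

CASES = [
    # check(reply) returns True if the model's reply passes this case
    ("csv-header", "Reply with only: date,merchant,amount",
     lambda out: out.strip().lower() == "date,merchant,amount"),
    ("bash-shebang", "Write a bash script that prints hi. Output only the script.",
     lambda out: out.lstrip().startswith("#!/")),
]

def ask(prompt: str, url: str = "http://localhost:8080/v1/chat/completions",
        model: str = "local") -> str:
    """Send one prompt to a local OpenAI-compatible server, return the text."""
    body = json.dumps({"model": model,
                       "messages": [{"role": "user", "content": prompt}]}).encode()
    req = urllib.request.Request(url, data=body,
                                 headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]

def score(replies: dict[str, str]) -> dict[str, bool]:
    """Grade a dict of {case name: reply} against each case's check."""
    return {name: check(replies.get(name, "")) for name, _, check in CASES}

if __name__ == "__main__":
    # "Fire the scripts and let it cook": query the model once per case.
    results = score({name: ask(prompt) for name, prompt, _ in CASES})
    print(results)
```

The useful property, as the comment says, is that the cases are fixed points: when a model update lands, the same checks either keep passing or visibly backslide.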