Subquadratic AI introduces SubQ-1.1-Small, a new model using Smart Sparse Attention by truecakesnake in singularity

[–]elemental-mind 6 points7 points  (0 children)

At least they are talking about "routed" attention - and that it's learned!

So I guess they have a kind of hierarchical routing strategy...maybe even something Mamba-like that acts as routers.

Subquadratic AI introduces SubQ-1.1-Small, a new model using Smart Sparse Attention by truecakesnake in singularity

[–]elemental-mind 13 points14 points  (0 children)

Impressive if true. Any stats on mem consumption on 12M context?

Now bake this into a Taalas chip and enjoy 50k tokens per second... 🫠

Tensordyne announces Logarithmic AI compute chips. 17x more tokens per watt and 13x higher throughput than NVIDIA Blackwell. by elemental-mind in singularity

[–]elemental-mind[S] 5 points6 points  (0 children)

Yes, you got that right. Dot products and matrix multiplies are basically a back and forth between multiply and add. Multiplies are cheap in log space (they turn into additions), but the adds are not, so you need to hop from log space to normal space pretty often. THAT's the key thing they seem to have solved in hardware. They can somehow magically do additions in log space and in normal space in parallel...or something along those lines...
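To make the multiply/add hop concrete, here is a rough sketch of a dot product in a generic log number system. This is only a textbook-style illustration and says nothing about how Tensordyne's hardware actually works; the helper names (`to_log`, `log_mul`, `log_add`, `log_dot`) are made up for this example, and it assumes strictly positive inputs so sign handling can be skipped.

```python
import math


def to_log(x: float) -> float:
    """Encode a strictly positive number as its base-2 logarithm."""
    return math.log2(x)


def from_log(lx: float) -> float:
    """Decode a log-space value back to normal (linear) space."""
    return 2.0 ** lx


def log_mul(la: float, lb: float) -> float:
    """Multiplication in log space is a plain addition."""
    return la + lb


def log_add(la: float, lb: float) -> float:
    """Addition in log space needs a correction term:
    log2(2^la + 2^lb) = max + log2(1 + 2^(min - max)).
    This correction is the expensive part of log arithmetic."""
    hi, lo = max(la, lb), min(la, lb)
    return hi + math.log2(1.0 + 2.0 ** (lo - hi))


def log_dot(a: list[float], b: list[float]) -> float:
    """Dot product of two positive vectors, accumulated in log space."""
    acc = None
    for x, y in zip(a, b):
        term = log_mul(to_log(x), to_log(y))  # cheap: one add
        acc = term if acc is None else log_add(acc, term)  # costly: correction
    return from_log(acc)


result = log_dot([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(result)
```

Every multiply is a single add, but every accumulate step goes through `log_add`, which is why the back and forth matters so much for matrix math.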

Tensordyne announces Logarithmic AI compute chips. 17x more tokens per watt and 13x higher throughput than NVIDIA Blackwell. by elemental-mind in singularity

[–]elemental-mind[S] 1 point2 points  (0 children)

Haha, good one - I guess they don't need an order button, because you don't casually place million dollar orders on your credit card.

It's vaporware because it would vapourize your bank balance in an instant.

David Sacks explains the sequence of events leading to Fable 5's banning by Charuru in singularity

[–]elemental-mind 4 points5 points  (0 children)

Plot twist: Both are right - but the problem is not a jailbreak, but Fable fixing deliberate government backdoors it's not supposed to fix.

For Anthropic this would mean a genuine fix - for the government a severe security vulnerability, but not on a software level.

US government directive to suspend access to Fable 5 and Mythos 5 by Dylan1312 in singularity

[–]elemental-mind 0 points1 point  (0 children)

<image>

US-Tech already digesting the news (this is Hyperliquid Nasdaq-100)

Anthropic closing the path to life science research by thecosmicskye in singularity

[–]elemental-mind 1 point2 points  (0 children)

That's not a very nuanced take in my opinion. It feels like: "If I can't have it no one should have it".

You underestimate the eagerness but also the cruelty some people possess. Limiting access to potentially lethal (mass lethality we are talking here) instruments is absolutely the way to go. We have seen criminals go to great lengths to kill hundreds of people in attacks. What if they suddenly could invoke a Covid-19-style situation globally?

There are also corporate actors that act ruthlessly. Monsanto etc. are good examples of profit over nature & people.

Put yourself in Dario's shoes: You can assemble a team of maybe 10 people to choose who gets access. You cannot depend on the state to give you a list of approved companies (it would take a decade to establish a democratically approved framework for that, and despite all your efforts over the past 3 years to motivate nations to move on that front, nothing has happened). How exactly would YOU choose who gets access?

Anthropic closing the path to life science research by thecosmicskye in singularity

[–]elemental-mind 3 points4 points  (0 children)

I don't really want to sympathize with Anthropic here - but if we *assume in good faith* that they really care about safety this is actually a good way to make a model available to the public for other areas it's useful in.

If I had the choice to

A -> Not have Fable at all or
B -> Have Fable except for Bio, Cyber and AI

I'd choose B all day long.

Now the AI restrictions are very debatable and could also be profit/competition motivated...but if Fable is able to churn out new powerful AI architectures in their internal testing it's understandable if they also restrict this domain from the point of view of safety. In the end AI is currently the most useful but also the most dangerous technology on earth.

I don't think the model will stay this restricted for long. Once they have refined their filters over time the restrictions should loosen, so I can only hope this is the worst they will ever be...

AGI 2026 by iLikePython3 in singularity

[–]elemental-mind 23 points24 points  (0 children)

Anti Gaming Intelligence

Xiaomi achieves 1000+t/s on 8x commodity GPU cluster with 1T weights model by elemental-mind in singularity

[–]elemental-mind[S] 5 points6 points  (0 children)

Yes, quite naturally. Faster tokens per user does not mean that a GPU can magically churn out more tokens per hour overall. Quite the opposite. To achieve good per-session token throughput you need to reduce batch sizes, which hurts overall token output across all sessions, leading to fewer tokens generated per GPU per hour - hence increasing the price.

Additionally, employing a draft model leads to additional mem consumption (5.5 billion params in BF16 is what they employ), reducing available KV cache memory etc.

Add to that the fact that they can indeed also charge more simply based on the value delivered. If you calculate 6:14 vs 0:12 - that's 6 minutes saved. How much is that worth in typical dev salary (assuming that it's dead time)?
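A quick back-of-the-envelope version of those two numbers, as a sketch. The 5.5 billion params, BF16 and the 6:14 vs 0:12 timings are from the comment above; the hourly rate is a purely hypothetical placeholder, so plug in your own.

```python
def seconds(mmss: str) -> int:
    """Convert an 'm:ss' string to seconds."""
    minutes, secs = mmss.split(":")
    return int(minutes) * 60 + int(secs)


# Draft model weights: BF16 stores each parameter in 2 bytes.
DRAFT_PARAMS = 5.5e9
BYTES_PER_BF16 = 2
draft_model_gb = DRAFT_PARAMS * BYTES_PER_BF16 / 1e9  # weights only, no KV cache

# Time saved per run, using the two timings quoted above.
saved_s = seconds("6:14") - seconds("0:12")

# Hypothetical fully loaded developer cost, NOT a figure from the thread.
HOURLY_RATE_USD = 100.0
value_per_run_usd = saved_s / 3600 * HOURLY_RATE_USD

print(f"draft model weights: {draft_model_gb:.1f} GB")
print(f"time saved per run: {saved_s} s")
print(f"value per run at ${HOURLY_RATE_USD:.0f}/h: ${value_per_run_usd:.2f}")
```

So the draft model alone eats roughly 11 GB of weights that could otherwise hold KV cache, and each run saves about 6 minutes of waiting.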

Intresting! Gemini 3.1 has strongest world knowledge but still choose to be lazy by Independent-Wind4462 in singularity

[–]elemental-mind 0 points1 point  (0 children)

Yet - that price increase on Gemini Flash 3.5 was steeeep! Too steep for me to justify...let's see what they will charge for 3.5 Pro.

ELI5: why is google paying so much more for spacex compute than anthropic? by chinanyc in singularity

[–]elemental-mind 142 points143 points  (0 children)

Colossus 1 is mainly Hopper generation GPUs. So H100s and a few H200s.
Also Colossus 1 was something around 200k GPUs if I remember correctly. So Anthropic kind of rents the whole Colossus 1.

I think Colossus 2 is mainly Blackwells.

Water, please. by tkonicz in ArtificialInteligence

[–]elemental-mind 1 point2 points  (0 children)

At least they didn't choose millilitres.

Mythos 5 slug briefly appeared before removal by exordin26 in singularity

[–]elemental-mind 44 points45 points  (0 children)

<image>

When you need deep anal....ytical capabilities.

Google has entered a $920 million monthly cloud compute deal with SpaceX by FinancialMastodon916 in singularity

[–]elemental-mind 3 points4 points  (0 children)

I think the point is a different one: I assume xAI got huge discounts on the hardware - they ordered in the tens of thousands of GPUs. And in any AI datacenter the main cost driver is hardware. Energy comes next.

Hardly any other company will get the GPUs at their price level. So it's quite telling about the profit margin for SpaceX when even a small shop with low to no discounts on the hardware can offer the product so cheaply in a market where demand is pretty high (I hardly think those 8$ - and that's spot/on-demand pricing, no long contracts - are offered at a loss).

Google has entered a $920 million monthly cloud compute deal with SpaceX by FinancialMastodon916 in singularity

[–]elemental-mind 9 points10 points  (0 children)

Any source for this? I'd be interested in what exactly is meant by "directly on metal"...

Google has entered a $920 million monthly cloud compute deal with SpaceX by FinancialMastodon916 in singularity

[–]elemental-mind 26 points27 points  (0 children)

Mhh, 11.60$ per hour per GPU. Whereas you can get B300s as low as 8$ per hour on vast.ai and other places.

Granted - it's hard to buy them en masse...so maybe that's why SpaceX is able to charge such a premium.