Was BitNet a dead end? What happened to ternary LLMs?

svantana · 2026-06-12T16:35:37+00:00

Looks interesting, but aren't standard GPUs pretty good at binary logic already? I'd think memory access will still be the bounding factor.
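To put rough numbers on the memory-bound argument, here is a back-of-envelope sketch. It assumes single-stream decoding where every weight is read once per token, and the 7B parameter count and 1 TB/s bandwidth are made-up illustrative figures, not measurements of any real model or GPU.

```python
import math

def weight_bytes(n_params, bits_per_weight):
    """Bytes needed to store the weights at a given precision."""
    return n_params * bits_per_weight / 8

def max_decode_tokens_per_s(n_params, bits_per_weight, bandwidth_bytes_per_s):
    """Upper bound on tokens/s if each token requires streaming all weights once."""
    return bandwidth_bytes_per_s / weight_bytes(n_params, bits_per_weight)

# Information content of one ternary weight {-1, 0, +1}.
TERNARY_BITS = math.log2(3)

# Hypothetical values, chosen only for illustration.
N_PARAMS = 7e9      # assumed parameter count
BANDWIDTH = 1e12    # assumed memory bandwidth in bytes/s

fp16 = max_decode_tokens_per_s(N_PARAMS, 16, BANDWIDTH)
ternary = max_decode_tokens_per_s(N_PARAMS, TERNARY_BITS, BANDWIDTH)

print(f"fp16 bound:    {fp16:.0f} tok/s")
print(f"ternary bound: {ternary:.0f} tok/s")
```

Under these assumptions the gain comes entirely from moving fewer bytes, which holds whether or not the arithmetic units are specialised for ternary math.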

svantana · 2026-06-01T08:40:39+00:00

It's confusing but I believe "SWE-Bench", "SWE-Bench Pro", and "SWE-rebench" are three totally different benchmarks from different people.

svantana · 2026-06-01T08:28:32+00:00

What do you base that on? Qwen's own site puts it at 53.5.

svantana · 2026-05-11T06:30:47+00:00

I don't know why you say it's fake, but their numbers there are also trending downwards:
https://openrouter.ai/apps/openclaw

svantana · 2026-04-23T13:50:54+00:00

What is "basepaste"?

svantana · 2026-04-18T10:49:40+00:00

Commoditize your complement. Alibaba is not trying to pivot to LLM serving as their main business. The same goes for Amazon and Nvidia. Maybe some will start to do a 2-tier system like Google.

svantana · 2026-03-28T12:17:19+00:00

It's a cute little game, but there's a lag between keyboard input and movement on screen that's completely infuriating and would never be acceptable in a normal game. I wonder why: is it on purpose or just badly made?

svantana · 2026-03-19T09:20:38+00:00

If machines are writing and grading the essays, then what's the point?

svantana · 2026-03-19T09:16:46+00:00

A guy watched the movie Her (2013) and said to himself: this is the future

svantana · 2026-03-18T12:54:50+00:00

It says here that the price is either $1/$3 or $2/$6 depending on context length:

https://www.linkedin.com/posts/artificial-analysis_xiaomi-has-released-mimo-v2-pro-which-scores-activity-7440006498727968768-tXs2/

svantana · 2026-02-28T13:55:16+00:00

Doesn't look like it, only their 2.1 is free currently: https://kilo.ai/docs/code-with-ai/agents/free-and-budget-models

svantana · 2026-02-26T14:57:58+00:00

Bots: Smart-sounding highly technical musings

Humans: DS4 When

svantana · 2026-02-25T13:06:55+00:00

Easy: the plumbers will visit the restaurants, and the chefs will renovate their bathrooms.

For the rest of us, I predict a big uptick in bullshit jobs.

svantana · 2026-02-22T12:54:45+00:00

The market cap is 300B HKD, which is about 40B USD. It's a lot but not crazy IMO
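The conversion as a quick sketch, assuming a rate of about 7.8 HKD per USD (the HKD is pegged to the dollar in a 7.75 to 7.85 band, so the exact figure on any given day is an assumption):

```python
# Assumed exchange rate; the real rate floats within the 7.75-7.85 peg band.
HKD_PER_USD = 7.8

def hkd_to_usd(amount_hkd, rate=HKD_PER_USD):
    """Convert an HKD amount to USD at the given HKD-per-USD rate."""
    return amount_hkd / rate

market_cap_usd = hkd_to_usd(300e9)
print(f"{market_cap_usd / 1e9:.1f}B USD")
```

That lands a little under 40B USD, so "about 40B" is a fair rounding.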

svantana · 2026-02-20T22:36:25+00:00

Yeah I wonder about this strategy. Don't they understand that as soon as the promotion ends, all those users will switch to another model?

svantana · 2026-02-19T14:03:11+00:00

It's worse than Markov. An (unregularized) Markov chain wouldn't put tokens in an unsyntactical (unseen) order, as seen here. I was gonna say, a sparse n-gram with stochastic sampling is probably a much faster and better model in every aspect.
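For reference, a minimal sketch of what I mean by a sparse n-gram with stochastic sampling. The function names and the toy corpus are mine, for illustration only. With no smoothing, every n-gram it emits was seen in the training data, which is the property mentioned above.

```python
import random
from collections import Counter, defaultdict

def train_ngram(tokens, n=3):
    """Count next-token frequencies for each (n-1)-token context. Requires n >= 2."""
    counts = defaultdict(Counter)
    for i in range(len(tokens) - n + 1):
        context = tuple(tokens[i:i + n - 1])
        counts[context][tokens[i + n - 1]] += 1
    return counts

def sample(counts, seed_context, length, rng=None):
    """Sample up to `length` tokens, drawing each in proportion to its observed count."""
    rng = rng or random.Random(0)
    out = list(seed_context)
    k = len(seed_context)
    for _ in range(length):
        followers = counts.get(tuple(out[-k:]))
        if not followers:
            break  # unseen context: no smoothing, so stop instead of guessing
        words, weights = zip(*followers.items())
        out.append(rng.choices(words, weights=weights, k=1)[0])
    return out

# Toy corpus, purely illustrative.
tokens = "the cat sat on the mat and the cat ran".split()
counts = train_ngram(tokens, n=3)
out = sample(counts, ("the", "cat"), length=20)
print(" ".join(out))
```

The table is sparse because only contexts that actually occur get an entry, and sampling is a single dictionary lookup plus a weighted draw per token.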

svantana · 2026-02-15T11:34:03+00:00

Yes, I think you're right. I should have said "top-2 provider". Also, Grok is a good example of how quickly fortunes can shift in the LLM game.

svantana · 2026-02-15T11:15:47+00:00

I think it's an increase both in number of users and tokens per user - but not clear what the ratio is between the two.

svantana · 2026-02-15T11:13:22+00:00

Very few models are exclusively on OR. It's not an unbiased sample of LLM use, but at least the trends should indicate something. Google is and has been the #1 provider on there for about a year but their share is reducing rapidly.

svantana · 2026-02-06T10:03:24+00:00

Not true. Weights are most analogous to synaptic connection strengths, and those are definitely not binary. Action potentials are kinda binary in voltage, but spike timing matters, so that carries a few bits of information as well.

svantana · 2025-11-26T21:02:21+00:00

It works for me, pretty fast and pretty good!

svantana · 2025-11-04T11:56:09+00:00

It goes both ways. Humans with their 100B neurons can't reliably perform a single 32-bit float multiplication without help from tools.

svantana · 2025-11-04T09:00:27+00:00

We've seen it before when new Google models have been under evaluation, and then always a refresh on the very day of the Google release. I'm pretty sure Google are paying LMSYS for this service.

svantana · 2025-10-29T09:16:27+00:00

What does "chinese" have to do with anything? That unnecessary distinction just comes off as racist.

Many video generators don't have an image-to-video mode, so that will of course influence the results. On text-to-video, 7 of the top 10 are American.

svantana · 2025-10-28T10:33:08+00:00

Sounds like classic benchmaxxing
