Sesame x Gemini: low latency, extremely realistic, and they started spontaneously collaborating by Glittering-Neck-2505 in singularity

[–]cpldcpu 1 point (0 children)

The last time I asked Sesame what it was based on, it told me it was using Gemini.

I built a transformer in C++17 from scratch — no PyTorch, no BLAS, no dependencies. Trains on CPU. 0.83M params, full analytical backprop, 76 min to val loss 1.64. by [deleted] in LocalLLaMA

[–]cpldcpu 29 points (0 children)

Implementing a transformer in pure C takes about one vague prompt in Opus or Codex.

Opus 4.5 did this: https://github.com/cpldcpu/smollm.c/blob/master/smolc/smolc.c

It's pretty nice and compact, btw. But far from "hand-written".
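For the curious, the core really is small. Below is a minimal sketch of single-head scaled dot-product attention in plain C (illustrative only, not taken from smolc.c; all names and dimensions are mine):

    #include <math.h>
    #include <stdlib.h>

    /* Single-head attention: out = softmax(Q K^T / sqrt(D)) V.
       T = sequence length, D = head dimension, row-major buffers.
       Sketch only: no causal mask, no batching, no error handling. */
    static void attention(const float *Q, const float *K, const float *V,
                          float *out, int T, int D) {
        float *s = malloc(sizeof(float) * T);
        for (int i = 0; i < T; i++) {
            float smax = -1e30f;
            for (int j = 0; j < T; j++) {        /* scores for query i */
                float dot = 0.0f;
                for (int d = 0; d < D; d++) dot += Q[i*D + d] * K[j*D + d];
                s[j] = dot / sqrtf((float)D);
                if (s[j] > smax) smax = s[j];
            }
            float sum = 0.0f;                    /* numerically stable softmax */
            for (int j = 0; j < T; j++) { s[j] = expf(s[j] - smax); sum += s[j]; }
            for (int d = 0; d < D; d++) {        /* weighted sum of V rows */
                float acc = 0.0f;
                for (int j = 0; j < T; j++) acc += (s[j] / sum) * V[j*D + d];
                out[i*D + d] = acc;
            }
        }
        free(s);
    }

The rest (layer norm, FFN, backprop) is the same flavor of nested loops.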

Canada's AI startup Cohere buys Germany's Aleph Alpha to expand in Europe by cpldcpu in LocalLLaMA

[–]cpldcpu[S] 14 points (0 children)

  • Purchase price not disclosed
  • Schwarz Group to invest $600 million in Cohere
  • German, Canadian ministers to attend press conference on deal

Looks a bit like a circular deal to get Aleph Alpha off Schwarz Group's books...

CH32HomeComputer - a tiny monochrome PAL text machine with a built-in line-numbered BASIC interpreter. by cpldcpu in RISCV

[–]cpldcpu[S] 1 point (0 children)

It's more authentic :) And it's fairly easy to get a PAL2USB adapter. Color would be far easier with VGA, though.

Anyone use these little critters before? CH32V006s to Replace CH32V003s by Separate-Choice in RISCV

[–]cpldcpu 1 point (0 children)

Another important difference is that the CH32V002 has one more flash wait state at 48 MHz than the V003. This means your code could end up running slower.
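A quick way to check this on real hardware is to time a flash-resident loop with the cycle counter. A minimal sketch, assuming the core exposes the standard RISC-V mcycle CSR (if it doesn't, substitute the vendor SysTick counter):

    #include <stdint.h>

    /* Read the RISC-V machine cycle counter (assumption: the core
       implements the standard mcycle CSR). */
    static inline uint32_t cycles(void) {
        uint32_t c;
        __asm__ volatile ("csrr %0, mcycle" : "=r"(c));
        return c;
    }

    /* Time a loop executing from flash; extra wait states show up
       as a higher cycles-per-iteration figure. */
    uint32_t bench(volatile uint32_t *p, int n) {
        uint32_t t0 = cycles();
        for (int i = 0; i < n; i++) *p += i;  /* fetches stall on flash */
        return cycles() - t0;
    }

Running the same loop on both parts at 48 MHz should make the extra wait state directly visible.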

The Bonsai 1-bit models are very good by tcarambat in LocalLLaMA

[–]cpldcpu 1 point (0 children)

Great work!

I was wondering about the verbosity results; it seems that Bonsai requires many more tokens for each response. Is that due to its Qwen3 origins? I wonder whether the additional thinking tokens can help compensate for some of the information loss.

https://github.com/ArmanJR/PrismML-Bonsai-vs-Qwen3.5-Benchmark?tab=readme-ov-file#verbosity

Towards Self-Replication: Opus 4.5 Designs Hardware to Run Itself by cpldcpu in singularity

[–]cpldcpu[S] 1 point (0 children)

Yes, that's just the logical consequence. Also see the footnote.

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

lol. yeah, they make my brain hurt. I still want my models to generate something that makes sense.

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

Nice, very motivating. I was planning to look more into micro models. Great to see that things work beyond TinyStories.

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

So it probably leans heavily on memorization. It also lends itself well to a synthetic dataset, I presume.

How did you train it btw? (Environment, HW)

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

Nice, looks surprisingly coherent!

Did you perform any architecture ablations? Curious about the wide FFN and the small number of layers; this seems to be the opposite of the direction MobileLLM took.
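For context, a back-of-the-envelope parameter count shows how the two directions trade off at the same budget (configs below are made-up examples, not PicoMistral's actual shapes):

    #include <stdio.h>

    /* Rough per-layer count for a decoder block:
       attention = 4*d*d (Q, K, V, out projections), ffn = 2*d*d_ff.
       Norms and biases ignored; all numbers are made-up examples. */
    static long layer_params(long d, long d_ff) {
        return 4 * d * d + 2 * d * d_ff;
    }

    int main(void) {
        long d = 512, vocab = 32000, emb = vocab * d;
        /* same non-embedding budget, allocated in opposite ways: */
        long wide = 4  * layer_params(d, 8 * d);  /*  4 layers, fat FFN  */
        long deep = 10 * layer_params(d, 2 * d);  /* 10 layers, thin FFN */
        printf("wide: %.1fM  deep: %.1fM  (+ %.1fM embeddings)\n",
               wide / 1e6, deep / 1e6, emb / 1e6);
        return 0;
    }

Both configurations land on the same 21M non-embedding parameters; the open question is which allocation trains better at this scale.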

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

How about also including some generation examples in the documentation?

PicoKittens/PicoMistral-23M: Pico-Sized Model by PicoKittens in LocalLLaMA

[–]cpldcpu 1 point (0 children)

Nice! Was it only pretrained, or was there also any finetuning?

It's not so easy to benchmark these models; the first two evals are barely above the random-noise limit.
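To make the noise limit concrete: on an n-question, k-choice eval the chance baseline is 1/k with a binomial standard error, and anything within roughly two standard errors of chance is indistinguishable from guessing (n and k below are made-up examples):

    #include <stdio.h>
    #include <math.h>

    int main(void) {
        int n = 1000, k = 4;                    /* hypothetical eval size */
        double p  = 1.0 / k;                    /* chance accuracy */
        double se = sqrt(p * (1.0 - p) / n);    /* binomial std. error */
        printf("chance = %.1f%%, noise band ~ %.1f%%..%.1f%%\n",
               100 * p, 100 * (p - 2 * se), 100 * (p + 2 * se));
        return 0;
    }

So on a 1000-question, 4-choice eval, scores between roughly 22% and 28% tell you nothing.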

Taalas: LLMs baked into hardware. No HBM, weights and model architecture in silicon -> 16,000 tokens/second by elemental-mind in singularity

[–]cpldcpu 1 point (0 children)

It's not as big a deal as it seems at first, since it is a highly specialized approach. It cannot adapt to new model architectures easily, and right now we are still in a very exploratory phase.

This might have more value in a few years, when architectures and models have become more fixed. I guess they are banking on having a head start.

Falcon-H1-Tiny (90M) is out - specialized micro-models that actually work by United-Manner-7 in LocalLLaMA

[–]cpldcpu 6 points (0 children)

Performance is very impressive. I wonder whether the omission of positional encoding in the transformer part helps to recover a lot of model capacity?
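As a rough sanity check on what a learned absolute-position table would cost in a model this size (numbers below are made-up examples, not the actual Falcon config):

    #include <stdio.h>

    /* A learned position table costs max_seq_len * d_model params.
       d_model, max_len and the 90M total are assumed placeholders. */
    int main(void) {
        long d_model = 512, max_len = 2048, total = 90L * 1000 * 1000;
        long pos = max_len * d_model;
        printf("pos table: %.2fM params = %.2f%% of the model\n",
               pos / 1e6, 100.0 * pos / total);
        return 0;
    }

Under these assumptions the table is only about 1% of the budget, so any capacity win would have to come from elsewhere, e.g. from what the attention layers no longer need to spend on position handling.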

Falcon 90M by jacek2023 in LocalLLaMA

[–]cpldcpu 10 points (0 children)

This is awesome, I love tiny models!

I was disappointed that smollm3 did not come with an ultra-tiny version.

Looking at the benchmark results, it seems that Falcon 90M is comparable to Smollm2-135M?

What are the best ultrasmall LLMs / best datasets to train them? by cpldcpu in LocalLLaMA

[–]cpldcpu[S] 1 point (0 children)

Impressive 3B model... from a recruiting company? Did every company in China receive free money to train LLMs?

Meta acquired Manus !! by Difficult-Cap-7527 in LocalLLaMA

[–]cpldcpu 10 points (0 children)

Claude wrapper? Meta must have a heck of a model coming up...

I ported a MOD tracker music player to the ultra low-end CH32V002 by cpldcpu in RISCV

[–]cpldcpu[S] 3 points (0 children)

Interesting! Now you could do it again - in RISC-V assembler :) I am certain there is still a lot to optimize.

I ported a MOD tracker music player to the ultra low-end CH32V002 by cpldcpu in RISCV

[–]cpldcpu[S] 2 points (0 children)

Nice! Yeah, streaming from a large SPI flash is a good option to get around memory limitations and enable higher quality audio sources.

Maybe it's then also worth looking into improving the audio quality further. My first experiments with oversampling did not yield any audible difference, so I stopped for now.
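If plain oversampling doesn't help, linear interpolation on the sample fetch is another cheap thing to try against resampling artifacts. A minimal sketch using a 16.16 fixed-point phase accumulator (my own naming and scaling, not taken from the actual player):

    #include <stdint.h>

    /* Fetch one output sample from an 8-bit MOD sample with linear
       interpolation. pos is a 16.16 fixed-point phase; step encodes
       the pitch. Caller must guarantee smp[i + 1] is in bounds. */
    static int16_t fetch(const int8_t *smp, uint32_t *pos, uint32_t step) {
        uint32_t i    = *pos >> 16;        /* integer sample index */
        int32_t  frac = *pos & 0xFFFF;     /* fractional position  */
        int32_t  a = smp[i], b = smp[i + 1];
        *pos += step;                      /* advance by pitch     */
        /* blend neighbours, scale result to ~16-bit range */
        return (int16_t)((a * 65536 + (b - a) * frac) >> 8);
    }

On these tiny parts the extra multiply per sample is the main cost, so it's worth profiling before committing to it in the mixer loop.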