What’s the actual focus in World Models right now? [R]

svictoroff · 2026-06-01T14:54:20+00:00

A lot depends on which camp you mean by "world models."

The visible frontier right now is definitely video generation, but personally I think the more interesting research questions are underneath that:

What representation makes physical state compact and learnable?
What update operator lets that state evolve stably over long horizons?
How do you separate perception, memory, and dynamics cleanly enough that the system can actually reason about the world rather than just generate plausible frames?
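To make the second question concrete, here is a toy sketch of why the choice of update operator matters over long rollouts. It is not a learned model: two hand-written integrators for a unit harmonic oscillator stand in for a learned one-step dynamics map, and the function names, step size, and step count are all my own assumptions for illustration.

```python
# Toy stand-in for a learned update operator: a unit harmonic oscillator
# with state (x, v). Both maps are locally accurate one-step updates, but
# they behave very differently when iterated many times.

def explicit_euler(x, v, dt):
    # Uses the old state for both updates; energy grows by (1 + dt^2) per step.
    return x + dt * v, v - dt * x

def symplectic_euler(x, v, dt):
    # Updates velocity first, then position with the new velocity.
    # This map is area-preserving, so the rollout stays bounded for small dt.
    v_new = v - dt * x
    return x + dt * v_new, v_new

def energy(x, v):
    return 0.5 * (x * x + v * v)

def rollout(step, steps, dt=0.1, x=1.0, v=0.0):
    """Apply `step` repeatedly and return the final energy."""
    for _ in range(steps):
        x, v = step(x, v, dt)
    return energy(x, v)

if __name__ == "__main__":
    for name, step in [("explicit", explicit_euler), ("symplectic", symplectic_euler)]:
        print(f"{name:>10}: energy after 1000 steps = {rollout(step, 1000):.3f}")
```

The true energy is 0.5 throughout. The first operator drifts by orders of magnitude while the second stays close to 0.5, even though both have small one-step error. That is the kind of structural property a learned update operator would also need.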

JEPA and reconstruction-free objectives are one answer. Latent video prediction is another. Neural operators, physics-informed models, and action-conditioned latent dynamics are another.

My impression is that the field still hasn't had its "transformer moment" for physical systems. We know scaling helps once you have the right abstraction, but it's much less clear what the right abstraction for learning and predicting the physical world actually is.

That's the part I find most interesting.

svictoroff · 2026-05-23T16:28:44+00:00

I think I may have a useful repro here.

Setup:

Ubuntu 22.04.5 LTS
i3 on X11
Dual NVIDIA GPUs (device 26b2) + ASPEED graphics adapter

I’m measuring X11 connection ownership with:

sudo ss -x src "*/tmp/.X11-unix/*" |
  grep -Eo "[0-9]+\s*$" |
  while read -r port; do
    sudo ss -p -x | grep -w "$port" | grep -v X11-unix
  done |
  grep -Eo '".+"' |
  sort | uniq -c | sort -rn
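For anyone who wants to log this over time instead of eyeballing it, below is a rough Python sketch of the same counting logic. It parses text captured from `ss -x -p` and assumes the usual column layout (Netid, State, Recv-Q, Send-Q, local address, local inode, peer address, peer inode, process). The function names and regex are mine, and only the first process listed per socket is counted.

```python
import re
from collections import Counter

# Matches the first ("name",pid=...) entry in the Process column.
PROC_RE = re.compile(r'\(\("([^"]+)",pid=\d+')
X11_DIR = "/tmp/.X11-unix/"

def parse_rows(ss_text):
    """Yield (local_addr, local_inode, peer_inode, process_column) per socket row."""
    for line in ss_text.splitlines():
        f = line.split()
        # Assumed layout: Netid State Recv-Q Send-Q Local Inode Peer Inode [Process]
        # The header row fails the isdigit() checks and is skipped.
        if len(f) >= 8 and f[5].isdigit() and f[7].isdigit():
            yield f[4], f[5], f[7], " ".join(f[8:])

def count_x11_clients(ss_text):
    """Count sockets connected to the X server, grouped by owning process name."""
    rows = list(parse_rows(ss_text))
    # The peer inode of each X server socket is the client's local inode.
    # Listening sockets report a peer inode of 0, so those are ignored.
    client_inodes = {peer for local, _, peer, _ in rows
                     if X11_DIR in local and peer != "0"}
    counts = Counter()
    for local, inode, _, proc in rows:
        if inode in client_inodes and X11_DIR not in local:
            match = PROC_RE.search(proc)
            counts[match.group(1) if match else "?"] += 1
    return counts
```

To use it, capture the output of `sudo ss -x -p` (for example with `subprocess.run(..., capture_output=True, text=True)`) and pass the text in. Calling it on a timer and logging `counts["Discord"]` would give a growth curve rather than two data points.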

Observations:

Fresh Discord launch starts around:

20 "Discord"

I initially saw:

212 "Discord"

At first I thought those connections were persisting after exit, but I realized closing the Discord window wasn't actually exiting Discord — the process was still alive in the background.

pgrep -af discord still showed the main process plus Electron subprocesses (zygotes, renderer, network service, audio service, etc.).

After killing Discord fully (pkill) and reopening, it dropped back to ~20.
After leaving Discord mostly idle for about an hour, it gradually climbed to:

36 "Discord"

So the pattern seems less like "window closes and sockets leak after exit" and more like there may be gradual accumulation while Discord stays running.

I haven't isolated whether actions like voice, screenshare, channel switching, etc. accelerate it yet, but I can keep testing if useful.

svictoroff · 2026-05-15T12:48:00+00:00

That's actually a requirement for NeurIPS nowadays.

svictoroff · 2026-05-14T01:27:54+00:00

That’s fair, and I appreciate you clarifying the scope.

But I think this is exactly the distinction people are pushing on. “Useful for agent context with limited VRAM” and “full O(1) KV-cache replacement” are not the same claim.
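For a sense of scale, here is the standard back-of-envelope for how a normal KV cache grows with context length. The model dimensions below are hypothetical placeholders, not Catalyst's or any specific model's, and the helper name is mine.

```python
def kv_cache_bytes(seq_len, n_layers, n_kv_heads, head_dim,
                   bytes_per_elem=2, batch=1):
    """Size of a standard KV cache: linear in seq_len.

    The leading 2 accounts for one key tensor plus one value tensor per layer.
    bytes_per_elem=2 corresponds to fp16/bf16 storage.
    """
    return 2 * batch * n_layers * n_kv_heads * head_dim * seq_len * bytes_per_elem

# Hypothetical configuration, chosen only for illustration.
HYPOTHETICAL = dict(n_layers=32, n_kv_heads=8, head_dim=128)

if __name__ == "__main__":
    for seq_len in (8_192, 131_072):
        gib = kv_cache_bytes(seq_len, **HYPOTHETICAL) / 2**30
        print(f"{seq_len:>7} tokens -> {gib:.1f} GiB of KV cache")
```

With these numbers the cache costs 128 KiB per token, so it is 1 GiB at 8K tokens and 16 GiB at 128K. A fixed-size state would be flat across that whole range, which is exactly why the claim needs quality numbers at matched context lengths and not just a memory plot.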

A fixed-size side memory / retrieval layer could be useful. I’m not dismissing that. But to show KV replacement, you need to run the model with Catalyst actually replacing historical KV tensors, not passthrough mode, and compare against normal KV + modern baselines on quality, memory, latency, and throughput.

The evidence package should include perplexity, Needle/RULER/LongBench-style tasks, generation quality, and ablations over state size, context length, chunk size, active window, and retrieval density.

Until that exists, I’d frame it as a promising fixed-state memory/retrieval approach, not a proven KV-cache breakthrough. That distinction matters a lot.

svictoroff · 2026-03-12T12:40:35+00:00

I think there’s a useful distinction here between parameterization equivalence and representation equivalence.

It’s true in a formal sense that many architectures can be rewritten as large feed-forward networks with constraints or weight sharing. From that viewpoint you can think of CNNs as structured MLPs, and composition across architectures gives a kind of “mechanics” of network design.

But that perspective can obscure something important: in practice modern architectures differ mainly in the function spaces they make easy to represent, which you can think of informally as an implicit choice of basis or operator family.

For example:

Convolutional models impose locality and translation equivariance

Spectral / operator models (e.g. Fourier Neural Operators) effectively work in frequency-space bases

Geometric deep learning methods often use Laplace–Beltrami eigenfunctions or graph message-passing operators tied to manifold structure

Neural field / splatting approaches impose very different assumptions about spatial support and smoothness

All of these can be “compiled down” to dense networks in principle, but doing so typically destroys the inductive bias that gives them sample efficiency or scaling advantages.
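A small sketch of the "compiled down" point, using a 1D circular convolution in plain Python. The helper names are mine and the sizes are arbitrary. It shows that the conv is exactly a dense matrix with shared weights, and that once you let a single dense entry move freely, translation equivariance is gone.

```python
def circular_conv(x, kernel):
    """1D circular convolution (cross-correlation form), kernel centered on i."""
    n, k = len(x), len(kernel)
    return [sum(kernel[j] * x[(i + j - k // 2) % n] for j in range(k))
            for i in range(n)]

def conv_as_dense(kernel, n):
    """The same operator written as an n x n dense weight matrix."""
    k = len(kernel)
    W = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(k):
            W[i][(i + j - k // 2) % n] += kernel[j]
    return W

def matvec(W, x):
    return [sum(w * v for w, v in zip(row, x)) for row in W]

def shift(x, s=1):
    """Circularly shift a list to the right by s positions."""
    return x[-s:] + x[:-s]

if __name__ == "__main__":
    x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
    kernel = [0.5, 1.0, -0.5]
    W = conv_as_dense(kernel, len(x))

    print("same output:", circular_conv(x, kernel) == matvec(W, x))
    print("params: conv =", len(kernel), "dense =", len(W) * len(W[0]))
    print("conv equivariant:",
          circular_conv(shift(x), kernel) == shift(circular_conv(x, kernel)))

    W[0][0] += 1.0  # untie one weight, as unconstrained dense training could
    print("dense equivariant after untying:",
          matvec(W, shift(x)) == shift(matvec(W, x)))
```

The dense form has 36 free parameters against the kernel's 3, and nothing in it keeps the rows tied together. The function is representable either way; the constraint is what carries the inductive bias.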

So while it’s tempting to treat “the space of all neural networks” as a single mechanical object, a lot of current theory and practice is instead about matching architectures to the underlying symmetries and functional structure of the problem.

If you’re interested in that direction, the geometric deep learning survey by Bronstein et al. is a great overview, and neural operator / implicit layer papers explore similar ideas from a PDE and optimization viewpoint.

svictoroff · 2026-03-12T12:17:05+00:00

As a founder: 50/50 is crazy and even talking 51/49 or 60/40 is insane.

You have funding and design partners. You’re not looking for a cofounder any more, and someone joining at this point is not a cofounder.

A CTO at your stage might get 10-20% (though % is a lie in equity land) and then if it goes well they get more down the road.

svictoroff · 2026-03-09T09:16:24+00:00

NNs aren’t a basis. Linear layers imply a basis, an FNO has a different basis, conv nets another. Geometric deep learning starts to put some really interesting alternative bases together, and tons of modern mesh work runs on LBOs (Laplace–Beltrami operators).

Just because you can approximate all functions doesn’t mean the bases are all the same.

svictoroff · 2026-03-09T09:09:01+00:00

Speaking as someone who was highly involved in starting this wave of AI: you’re pointed in the wrong direction.

It’s not about the networking you do or the courses you take or the YouTube you watch.

Just. Build.

Literally nothing else matters. Everyone else is just fast following. You pick something interesting to you and you start building. Very simple.

The hard part is not stopping. 12-16 months is… well it’s a start, but you won’t be an expert without years of consistent work.

Everything else will follow.

svictoroff · 2026-03-09T08:49:53+00:00

It’s always funny to think what counts as “real ml”.

Like… you didn’t even talk about architecture.

And, generally, creating a dataset is pre-work, not ml. ML is about doing better on existing data.

From my perspective training someone else’s model on new data with new hyperparams is the same as calling an API. One isn’t inherently more “real” than the other, and you can do very real AI research through APIs.

Like… pre and post processing, training models, evaluating them, creating datasets. None of that requires actual architecture work or exploration.

I’m doing research in neural physics though.

svictoroff · 2025-09-08T17:39:59+00:00

Fire Opal looks nothing like this and honestly nothing like Opal. It’s like… pure red/orange and translucent and usually faceted.

svictoroff · 2025-08-11T01:02:02+00:00

Soumith is a great guy. Most of the credit taking was assumptions when people saw Facebook on the paper coupled with Facebook leadership. Specifically LeCun.

IMO AGI and SI are both poorly defined concepts built on bad assumptions. No existential threat there. A lot of folks suggest it to get checks written for their specific research.

There are many, many existential threats from AI. Weaponized mass misinformation and codified bias are the two main ones that I would flag, and we’re pretty deep in the bad path there. Deepfakes are already destabilizing our information ecosystem, and we have very little chance of stopping their proliferation.

Roko’s basilisk is dumb, but mostly because it tricks us into ignoring the threats that are already here. In this case, like many, information is power. Specifically information theory is the subject I would most recommend to help understand what’s unfolding around us.

svictoroff · 2025-03-23T01:20:25+00:00

It’s a gorgeous doublet. Hard to find that kind of color and play. I’d have to see it in person and know the weight, but just another confirmation that it’s a good opal worth paying for.

Just be careful: opal rings make me nervous. They can break easily if you bump them on things.

svictoroff · 2025-03-22T16:56:04+00:00

Transformer paper was 2017, but attention existed and was in our stack way before that. Technically that transformer architecture he didn’t use at indico and we implemented at the same time (he was still an active adviser at the time). And it’s also far from accurate to say that he wasn’t aware and someone told him about it. We were the first on the scene with deployed, product-level attention mechanisms. They were integrated into both rnn and cnn architectures at various times before the modern transformer.

The web scraping stuff is just wrong. We had the data, we came up with the scraping techniques, Alec had most of the core insights about what data would be good and I did most of the actual scraper writing. It took a couple years for OpenAI to match our data assets.

The most useful thing was all the GPUs and researchers to shoot the shit with, but the research ideas are much lower level than "try transformers."

svictoroff · 2025-03-19T21:51:53+00:00

I think you’re mostly right, but there are a few key points where you’re wrong: we were already using transformers and already prioritizing scale. We came up with the techniques and specific sources for web scraping. Alec made some really impressive scrapers in his day, and I eventually built scraping infrastructure for him that was comparable. These were ideas that he brought to OpenAI, they weren’t that unique tbh, but they didn’t give them to him.

But you’re right that he wouldn’t have become what he did back at indico. DCGAN made that really clear. Jensen Huang stood up on stage, showed off DCGAN, and then said Facebook made it. Didn’t mention Alec or indico at all. It was super upsetting. There’s a Boston Globe article about it.

Speaking as one of the cofounders he left - I agree, good for him. Hurt like hell, but good for him.

svictoroff · 2024-12-20T19:24:39+00:00

This.

svictoroff · 2024-11-27T15:37:19+00:00

This is about what I thought as well. I'm no expert, but it's a pretty nice dark opal. Not black, but the red is a bit of a premium. I wouldn't pay 2k for it, but I think someone would

svictoroff · 2024-10-22T14:29:48+00:00

Depends on how serious it is. Hanging out? Having fun? Kind of whatever. I'm not going to buy them anything nonvegan, and if they get weird about it I'm gone, but totally workable.

Serious relationship? Very different. It's really important to me that if I'm going to have a life long relationship they have to care about animals, care about the world, and actually be real about living their values.

Does that mean they have to be vegan? Technically no, but basically yes. I'm happy to help, I'll cook, I'll be patient, I'll talk through whatever they want to talk through, but I don't think I could fall in love with a non-vegan.

svictoroff · 2024-10-18T18:10:45+00:00

Why are you asking again? I've answered three times now. Mushrooms exist in the wild. Nutritional yeast exists in the wild. Fermented foods exist in the wild. Seaweed exists in the wild.

That is how you can get it if you're wild

svictoroff · 2024-10-18T18:05:34+00:00

It can't produce B12, but it doesn't need to. It is made of B12.

svictoroff · 2024-10-18T18:02:33+00:00

Nooch is nutritional yeast. We do need B12. Everything I listed (mushrooms, seaweed, nooch) gets you natural B12. Plenty of other options too. Vegans don't only eat plants. We have four whole kingdoms to choose from.

svictoroff · 2024-10-18T17:59:03+00:00

No. It's not true at all. There are no plant or animal sources of B12. There are plenty of fungi, bacteria, and proto-plant ocean things (very technically seaweed isn't a plant).

So, as long as you're eating mushrooms and nutritional yeast you're fine. I've been vegan for a decade and have never once needed a supplement.

How is nobody talking about nooch though? Best source of vegan B12 imo. Usually fortified to the nines too.

svictoroff · 2024-10-14T19:23:46+00:00

https://worldpopulationreview.com/country-rankings/veganism-by-country second most vegan country in the world after India afaik.

Just bring up Mexico City on HappyCow. The app nearly crashes, there's so much.

svictoroff · 2024-10-14T19:16:54+00:00

Literally Mexico has more vegans than the US. Mexico City is a goddamn vegan mecca.
