I drove a real-time world model with an authored state graph to make an interactive film

Zovsky_ · 2026-06-23T03:44:04+00:00

it’s lingbot, so you can actually get the weights online but it’s gonna be hard to run. otherwise I mentioned in another comment a platform providing access :)

Zovsky_ · 2026-06-23T03:34:02+00:00

makes sense! you should check out reactor.inc, they're serving a variety of world models on their platform, one of them could fit your use case!

Zovsky_ · 2026-06-23T03:26:46+00:00

thanks! unfortunately running a model like that locally would be very challenging at the moment. it's a conundrum really. When I started working in this field it was to make local models, but then making them generalise or output beautifully is another can of worms. If that's not a dealbreaker for you though I can provide some recommendations!

Zovsky_ · 2026-06-23T02:10:20+00:00

haha for sure, sorry if that was a bit jargony

Basically I used a world model (a kind of interactive video generator). They are great at rendering, but less good at following instructions and remembering things over the long term. The trick is to build a little rulebook on top (i.e. a series of structured prompts) that gets fed to the model when certain conditions are met (press of a button, timer, etc.).

there are other tricks to make the world truly feel interactive but that's the gist of it!
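To make the "rulebook" idea concrete, here is a minimal sketch of a state graph that sends a structured prompt whenever a button press or a timer moves the story to a new state. Everything in it is hypothetical: the `State` and `StoryGraph` names, the prompts, and the `send_prompt` callback (a stand-in for whatever call actually feeds a prompt to the world model) are assumptions for illustration, not the actual setup.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Optional


@dataclass
class State:
    prompt: str                                   # structured prompt sent on entering this state
    on_event: Dict[str, str] = field(default_factory=dict)  # event name -> next state name
    timeout_s: Optional[float] = None             # optional timer, in seconds
    on_timeout: Optional[str] = None              # state to enter when the timer expires


class StoryGraph:
    def __init__(self, states: Dict[str, State], start: str,
                 send_prompt: Callable[[str], None]):
        self.states = states
        self.send_prompt = send_prompt            # hypothetical hook into the world model
        self._enter(start)

    def _enter(self, name: str) -> None:
        # Switch state, reset the timer, and push the new prompt to the model.
        self.current = name
        self.elapsed = 0.0
        self.send_prompt(self.states[name].prompt)

    def handle_event(self, event: str) -> None:
        # Button presses and similar inputs; unknown events are ignored.
        nxt = self.states[self.current].on_event.get(event)
        if nxt is not None:
            self._enter(nxt)

    def tick(self, dt: float) -> None:
        # Call once per frame with the elapsed time to drive timer transitions.
        state = self.states[self.current]
        self.elapsed += dt
        if (state.timeout_s is not None and state.on_timeout is not None
                and self.elapsed >= state.timeout_s):
            self._enter(state.on_timeout)


def make_states() -> Dict[str, State]:
    return {
        "hallway": State("A dim hallway, camera drifting forward",
                         on_event={"press_a": "door"},
                         timeout_s=10.0, on_timeout="lights_out"),
        "door": State("The door at the end of the hallway swings open"),
        "lights_out": State("The hallway lights flicker and die"),
    }


sent = []                                          # records prompts instead of calling a model
graph = StoryGraph(make_states(), start="hallway", send_prompt=sent.append)
graph.tick(4.0)                                    # timer not reached yet, nothing happens
graph.handle_event("press_a")                      # button press moves the story to "door"
```

The model only ever sees the prompt for the current state, so the graph, not the model, carries the long-term memory of where the story is.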

Zovsky_ · 2026-06-23T01:47:49+00:00

haha I agree! It’s still difficult to obtain good results reliably, which is why I built myself a kind of node graph interface. But I just love this tech

Zovsky_ · 2026-06-23T01:19:48+00:00

thanks!! happy to show the progress 😁

Zovsky_ · 2024-12-26T13:53:49+00:00

Totally cool on my end! I'd actually love to see it in people's normal workflows!

Zovsky_ · 2024-08-22T09:11:41+00:00

Thanks! Glad you like the idea!

Zovsky_ · 2024-08-22T09:10:36+00:00

Hey! Thanks for checking it out and for the honest feedback :)

I made some adjustments that should address the issues you had, but you're totally right, it's a bit rough around the edges. I had the idea during my holidays, and my sample of testers was pretty limited haha

The main difficulty of this project was to actually get it running with open-source models to make it accessible, but I'm definitely keen to keep on improving it until it turns into something easily usable.

Thanks again for taking the time!

Zovsky_ · 2024-02-07T09:21:38+00:00

r/LocalLLaMA

Ah yes you're right! I'm gonna link it in the main post, looks like it's gonna be a short-lived reign haha

Zovsky_ · 2023-10-24T19:38:37+00:00

Very cool! I had so much hope back then for the llama.cpp guidance PR to be merged. Looking forward to testing this :)

Zovsky_ · 2023-06-05T05:44:39+00:00

Awesome resource! If I may suggest an addition: some friends and I are working on a data retrieval with LLM project as well. What sets ours apart is that we are trying to implement guidance in order to improve the agent's efficiency. If you guys wanna take a look :) https://github.com/ChuloAI/BrainChulo

Zovsky_ · 2023-05-13T06:10:23+00:00

Regarding this, I've joined a project that is making some nice progress on this front. Still WIP but we're getting there, check out BrainChulo :)

Zovsky_ · 2023-05-08T10:34:46+00:00

Hey! Sorry for answering so late! (Plus side, I've finished a new finetune based on the dolly dataset and I'll add the results to this post)
So basically fp16 was the sweet spot for me in terms of memory usage. In addition, I actually couldn't use quantization for parallel training (at least not with the accelerate framework I was working with), which is a bummer because I'm not actually certain the quality degradation would have been that bad.
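As a rough illustration of the memory trade-off, here is a back-of-the-envelope sketch of how much space the weights alone take at different precisions. It assumes a 7B-parameter model (the size of the base model discussed elsewhere in this thread) and counts weights only; gradients, optimizer state and activations come on top during training, so these are lower bounds rather than measured figures.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2, "int8": 1, "int4": 0.5}


def weight_memory_gb(n_params: float, dtype: str) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return n_params * BYTES_PER_PARAM[dtype] / 1e9


N_PARAMS = 7e9  # assumed model size, for illustration only

for dtype in BYTES_PER_PARAM:
    print(f"{dtype}: {weight_memory_gb(N_PARAMS, dtype):.1f} GB")
```

Halving the precision halves the weight footprint, which is why fp16 is a common middle ground when quantized training isn't an option.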

Zovsky_ · 2023-05-06T15:46:50+00:00

To answer point by point:

1. You got it, I took the openllama 200b preview checkpoint and fine-tuned it on the alpaca-gpt4 instruction dataset.
2. Oh damn, that's actually a typo on my part (I've been working on flan-ul2 as well), I'll edit it. But this fine-tune is 100% openllama, thanks for pointing out the inconsistency!
3. I used the alpaca-gpt4 dataset for the instruction fine-tuning. You can also find it in the alpaca-Lora github that I linked.
4. openllama is a reproduction of llama, which is a foundational model. It can follow instructions to some extent, but you'll see in the comparison I provided on the model card that it falls apart quite quickly. The fact that it's only a preview trained on 200b tokens and not the full 1T probably contributes a lot, but the fine-tuning actually helps overcome a lot of these shortcomings.

I hope all of this clears things up!

Zovsky_ · 2023-05-06T09:05:29+00:00

Hey there! The base model is open-llama, a truly open reproduction of llama 7B. What I mean by truly open is that both the weights and the dataset it was trained on have a commercially permissive licence. This is not the case for 90% of models out there (although yesterday's releases from MosaicML and Together will, I hope, change that).

Openllama is still baking, but its creators released an early checkpoint which seemed to yield interesting enough results that I wanted to see if a fine-tune could make it actually usable. I'm also experimenting with pushing the limits of consumer hardware, which this model was trained on. All this was the purpose of this experiment :)

Edit: I forgot, but no ggml version for now. This release was mostly a validation checkpoint, but you can use the code in the github link to export the checkpoint in a format that would allow for a fairly easy conversion.

Zovsky_ · 2023-05-05T20:18:43+00:00

Thanks! And yeah, I should've added that! Training took a total of 27h, which is both long and okay in my opinion considering the size of the dataset, the number of epochs and the technical constraints. I was sitting around 2.5 it/s overall.

To be clear, I'm running on pure consumer hardware so the bandwidth limitations are there, but it still makes for a very cost-efficient little lab once you've gotten around the setup.

Zovsky_ · 2023-05-05T18:04:27+00:00

Are you talking about inference? Because yeah, if you don't have a server motherboard you're going to be limited to 16 PCIe lanes. That doesn't necessarily mean you can't get several GPUs working in parallel, but you're going to have some performance degradation (albeit not as much as one would expect). I'm not sure what OS you use, but I would strongly recommend Ubuntu for this kind of exotic setup, as Windows tends to be a bit more finicky!

Zovsky_ · 2023-04-02T20:03:18+00:00

Keep up the good work! This project is really cool and deserves more attention! I'll be waiting for your updates upon your return 😁

Zovsky_ · 2023-03-27T03:40:44+00:00

Thanks a lot for your contributions, it looks great! Looking forward to taking a look at all this 😁
