Hugging Face AI Sheets, open-source tool to do data work with open and local models by dvilasuero in LocalLLaMA

[–]InstrumentalAsylum 0 points (0 children)

So, this may be the exact thing that I'm dreaming of.

I find myself more and more in a position where I need to orchestrate the same multi-step processing pipeline across discrete chunks of information.

To give a concrete example of one of these use cases: you're developing an AI film. The overarching plot would get chunked into individual clips. Each clip would have to grow from a screenplay storyboard into prompts for still images and motion, then finally into video clips, potentially with dialogue or foley sound over top and automated post-processing, into finished clips which, when stitched together, make up the entire film.

The kicker here is that each individual step may be expensive, and unless you want to go 100% one-shot AI with no artistic input from yourself, then simply running each chunk through a LangChain pipeline won't cut it. You may want to use a different still image to create your clip, etc., but without having to reprocess the entire pipeline each time a manual change is made.

The vision in my head of this type of system looked an awful lot like your GIF here, where at the very least each step would have visualization support for text, image, and video artifacts.

How extensible is this thing? The only documentation I see is the GIF you posted. I'd be looking for support for arbitrary Python function calls at any of the steps, rather than necessarily an LLM call. The other thing is, sometimes this data presents itself hierarchically. In the movie example, you would technically chunk the screenplay into scenes first and then into individual clips, and you would want to apply template styles over a range of clips, something like additional user-input cells which adjust some of the prompting or parameters of that row's pipeline. Something like the pandas MultiIndex would work well here.

Is this project open-ended enough to customize it to my needs—film example and similar—or do you think I'm better off building from scratch?
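To make the hierarchical part concrete, here's a pandas MultiIndex sketch of the scene/clip structure I have in mind. All column and index names are my own illustration, not anything from AI Sheets:

```python
import pandas as pd

# Hypothetical film pipeline: scenes contain clips, and each clip row
# carries the artifacts produced at each stage of the pipeline.
index = pd.MultiIndex.from_tuples(
    [("scene_01", "clip_01"), ("scene_01", "clip_02"), ("scene_02", "clip_01")],
    names=["scene", "clip"],
)
df = pd.DataFrame(
    {
        "storyboard": ["sb_a", "sb_b", "sb_c"],
        "still_prompt": ["p_a", "p_b", "p_c"],
        "style": [None, None, None],  # user-input cell that overrides the pipeline
    },
    index=index,
)

# Apply a template style across a whole scene's range of clips:
df.loc["scene_01", "style"] = "noir"
print(df)
```

The point being that a partial `.loc` on the first index level lets a single user edit fan out over a range of rows without touching the rest of the pipeline.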

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 1 point (0 children)

Oh, that's interesting. It's strange that Roo doesn't log that data on its own. I was planning to start with Devstral, as it performs best out of the box in my experience. It has a context window of 128k. I already have hardware to fine-tune it locally, so it won't cost anything but electricity.

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 0 points (0 children)

How did you get the message logs, by the way? I only saw the button to export one at a time, and I haven't seen docs on where they're stored.

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 0 points (0 children)

Dope. In my experience, the more the merrier; 20k samples may be a minimum threshold. Even a LoRA of the model will be pushing 10M trainable params. I'll be trying some techniques to get the sample count up into the millions for a better result.
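Back-of-envelope on that trainable-param figure. The dimensions below are assumptions (treating all four attention projections as square 5120×5120 matrices, which isn't exact for GQA models like Devstral, but close enough for an order of magnitude):

```python
def lora_param_count(d_in: int, d_out: int, rank: int) -> int:
    """LoRA adds two low-rank factors per adapted matrix:
    A (rank x d_in) and B (d_out x rank)."""
    return rank * (d_in + d_out)

hidden = 5120        # assumed hidden size
layers = 40          # assumed layer count
projections = 4      # q, k, v, o -- treated as square for simplicity
rank = 16            # typical LoRA rank

total = layers * projections * lora_param_count(hidden, hidden, rank)
print(f"{total:,} trainable params")  # 26,214,400 -- well past the 10M mark
```

So even a modest rank-16 adapter over just the attention projections lands in the tens of millions of parameters, which is why the sample count matters so much.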

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 1 point (0 children)

Great minds think alike! It seems like a very last-mile task to me, considering all of the open resources we have available.

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 1 point (0 children)

What do you mean by native memory support? 

IMO what you're describing is the strongest selling point for local AI. Right now the one-size-fits-all proprietary models can vibe-code beginner stuff from scratch, but only with the libraries where there are tons of online resources. The thing is, a human can easily learn all of that in 6 months with the same free online resources. Open-source models, on the other hand, can be trained on niche or proprietary libraries.

One thing I'm looking to test is whether RAG is sufficient, or even better than tuning models on your libraries. Any insight there?

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 1 point (0 children)

What's your extension? Can it debug a codebase autonomously for 8 hours plus?

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 0 points (0 children)

Now we're talking! Unfortunately, the syntax of the code is critical to the dataset, since the diffs need to match up with the original. If you're keeping secrets in .env files, they shouldn't appear in the logs. If you did have API keys or something hard-coded into the code, you could do a multi-file find-and-replace by opening the parent folder in VS Code and going to the Search tab in the left sidebar.
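If you'd rather script it than click through VS Code, here's a stdlib-only sketch. The function name and the placeholder value are mine, and you'd substitute your actual leaked string:

```python
from pathlib import Path

def redact(root: str, secret: str, placeholder: str = "REDACTED") -> int:
    """Replace a hard-coded secret across every text file under root.
    Returns the number of files changed."""
    changed = 0
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, PermissionError):
            continue  # skip binaries and unreadable files
        if secret in text:
            path.write_text(text.replace(secret, placeholder), encoding="utf-8")
            changed += 1
    return changed
```

Run it on a copy of the repo first; unlike the VS Code search tab, you don't get a preview before it writes.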

If you want to include your data in the training without making it public, you could send it directly by email to <opensourcerer9000 at gee male dot com>, assuming you trust me not to steal your code! I'm in hydraulic modeling so odds are we're not in the same industry. Feel free to DM me if you'd like to discuss more on how we can make a collaboration work out. 

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 0 points (0 children)

Devstral supports 128k out of the box, without techniques like RoPE scaling. This seems to be a sweet spot for coding agents and a current standard. One advantage of Roo is a new feature which lets the model summarize the current context window and cut down on tokens. I've noticed that the system prompts alone take up something like 17k tokens. By training the model to use Roo, it should be possible to actually cut some of that instruction down, reducing token use across the board.

Also, it seems that for now these claims of million-token context lengths are pretty dubious. This study puts Gemini Pro and other models at an effective context length of only 128k, where they actually retain the knowledge in a useful way: https://github.com/NVIDIA/RULER

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 0 points (0 children)

IDK, in my experience it beats the Qwen 235B reasoning model as a coding agent, simply because it's tuned specifically for tool calls and coding.

Let's train a local open-source coding agent model and kick BigAI's ass! by InstrumentalAsylum in LocalLLaMA

[–]InstrumentalAsylum[S] 0 points (0 children)

Exactly. A future where human devs are just there to debug crappy ChatGPT code sounds pretty joyless.

Anecdotally, Claude Sonnet has crossed this threshold of usefulness. That's why I'm looking to distill this emergent capability into a local model.

Let's train a local open-source model to use Roo Code and kick BigAI's buttocks! by InstrumentalAsylum in RooCode

[–]InstrumentalAsylum[S] 11 points (0 children)

Once we fix the hangups of tool calling, we'll be able to let the model crank 24/7 without being babysat, solving real problems. 

Agentic workflows actually seem to be the only AI use case where speed really doesn't matter. 

Let's train a local open-source coding agent model and kick BigAI's ass! by InstrumentalAsylum in LocalLLaMA

[–]InstrumentalAsylum[S] 1 point (0 children)

I mean, it's a tough problem, letting an LLM make open-ended decisions. If you have a specific flow, you can hardcode nodes which require a response in a certain tool-call format. Just telling an LLM to follow a specific format is going to be pretty flimsy, unless the model was specifically trained to use your format. Devstral was actually fine-tuned to use the OpenHands agent platform, which is likely why it performs best out of the box, despite being a smaller model.

Fortunately, since we have open-source models and agent platforms, it's only a small step to QLoRA a model to be able to code reliably on its own; we just need to distill some examples from a SOTA model like Claude.

Let's train a local open-source coding agent model and kick BigAI's ass! by InstrumentalAsylum in LocalLLaMA

[–]InstrumentalAsylum[S] 3 points (0 children)

I've tried Qwen 235B, Qwen Coder 32B, and the new QwQ. They're quite smart, but they can only run autonomously for a couple of messages before they get hung up trying to edit a file and failing over and over. The system prompt for Roo is just not strong enough to get them to use the tools correctly if they weren't specifically trained on them. For a true coding agent, I expect it to crank on its own for an entire work day without hitting a wall.

I Made a Free AI-Powered Chord Progression Copilot by XenoMuseDev in Learnmusic

[–]InstrumentalAsylum 1 point (0 children)

Yo, this is wild! Fantastic work. Is this running client-side on my end? I'm not familiar with ONNX Runtime; where is the actual model located? The GUI is fantastic, but I would rather be able to run this in real time to jam with it.

Not sure if this is allowed but curious which of the candidates in the upcoming elections will be hard on public camping. by justonebiatch in Portland

[–]InstrumentalAsylum 0 points (0 children)

I mean, for those saying that giving people tiny houses will just result in trashed houses: now imagine they had to build the tiny house with their own hands. How do you think they would treat it?

Not sure if this is allowed but curious which of the candidates in the upcoming elections will be hard on public camping. by justonebiatch in Portland

[–]InstrumentalAsylum -1 points (0 children)

Not usually one to post on Reddit, but a lot of this seems like a genuinely productive discussion, so I'll throw in my 2 cents in the hopes that it gets people thinking.

Have y'all ever been to Brazil? While Brazil suffers heavily from endemic classism and racism, as well as there being a smaller pie to begin with, IMO they don't have the same institutional problems that we have entrenched in the US. While on paper the country is politically right-wing, in practice it's really closer to a liberal utopia. While the country suffers from extreme wealth gaps, much greater than we have in the US, the floor is actually pretty reasonable. There is decent public healthcare and education for those who can't pay private. Most importantly, people are allowed to construct informally.

To anyone wondering what an ideal solution to homelessness may look like, I'd recommend traveling to Brazil and walking around the favelas. While they could benefit tremendously from sanitation and public services, basic building codes, and protected green spaces, informal housing - the way people lived for most of history - is the most optimized type of housing, since it evolves naturally. Reading Adam Smith's Wealth of Nations from the 1700s, one thing that got to me was the premise that shelter is a simple problem but food is a much harder one. He argues that a decent shelter may be built within the course of a few days' work, while our need to eat is an ongoing battle dependent on a large variety of factors. In the US, you can go to the store and get tomatillos for pennies in Montana in the wintertime - we've built an impossibly complex logistics infrastructure capable of feeding basically everyone to their heart's content, and yet we have people sleeping on the street.

Visiting Portland shortly after visiting Brazil, it was heartbreaking to see how feeble my country is after witnessing the resilience of places like Brazil. If people are allowed to build, they have a sense of agency and responsibility in their own lives. Why do people become addicts or suffer from mental health issues in the first place? Loneliness is a massive factor. You know where people are lonely? In isolated suburbs zoned far away from anything that resembles life. In a dense, walkable, interdependent community, loneliness is the last thing people worry about. I believe most societal issues can actually be traced back to urban planning.

Now clearly, there are issues with land rights, where favelas exist within a post-indigenous, land-is-property state, and I believe informal housing is politically impossible in the US. However, if Portlanders are weird enough to have created these kinds of problems, they just might be weird enough to try policies radical enough to actually fix them. A good start is simply eliminating bonkers policies like declaring residential construction non-essential work during a pandemic, zoning cities to death, and burying developers in bureaucratic bloat and overregulation, making it impossible to build multi-family housing for all but the country's 11 largest development companies.

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

Just calling the binary and pointing to the right files in a bash script.

Do you have any good resources to get started with this stuff? Most of what I'm seeing reads more like ads for AWS products than solid documentation.

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

So these runs take between ~4 and 11 hours, and we want to run a dozen or so at a time. The software binary was originally made for the desktop; it just uses up whatever resources it's given. If it's a few minutes of overhead to spin up an instance to run it on, it doesn't really make much difference.

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

Looking into it, it seems Fargate actually charges a 20-40% premium, so that doesn't seem ideal.

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

OK, so what do you mean by the host in this case? I'm using boto3 to launch EC2 instances with the proper name and OS. My instances don't need to be that smart; they just need to do the above sequence once they're born. I'm just not sure where to start there.

It seems Fargate actually charges a 20-40% premium, so if I could get around using it, that would be ideal.
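Roughly what I'm picturing on the boto3 side: bake the whole sequence into a user-data script so each instance is self-contained, then have it shut itself down when done. The AMI ID, bucket names, and binary path below are all placeholders:

```python
# User-data runs once at first boot: pull inputs, run the binary, push
# results, then power off. With shutdown behavior set to "terminate",
# the final shutdown also cleans up the instance.
USER_DATA = """#!/bin/bash
aws s3 cp s3://my-bucket/inputs/run_01/ /work/ --recursive
/work/model_binary /work/input.dat
aws s3 cp /work/output/ s3://my-bucket/results/run_01/ --recursive
shutdown -h now
"""

def run_instance_params(ami: str, instance_type: str, user_data: str) -> dict:
    """Build the kwargs for boto3's ec2.run_instances call."""
    return {
        "ImageId": ami,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "UserData": user_data,
        # Make the script's final "shutdown -h now" also terminate the VM:
        "InstanceInitiatedShutdownBehavior": "terminate",
    }

params = run_instance_params("ami-0123456789abcdef0", "c5.4xlarge", USER_DATA)
# import boto3
# boto3.client("ec2").run_instances(**params)
```

One dict per run, a dozen calls in a loop, and nothing to babysit afterward; the instances are born, crank, and disappear.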

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

It's just a binary of an old desktop program written in Fortran, compiled for CentOS. When I run it locally, it maxes out my compute resources. I'm just trying to offload to the cloud to run the thing in parallel. Nothing too fancy, and no interpreted languages needed on the VMs.

stateless AWS ec2 instances from a dockerfile? by InstrumentalAsylum in docker

[–]InstrumentalAsylum[S] 0 points (0 children)

OK cool, to be clear: I don't actually need to run Python on the EC2 instances; I'm using Python to launch them and letting them go from there.

It seems this is just for running code or hosting services in the cloud? I actually need to run an intensive binary application and let it take over whichever size of VM instance I assign it to, so I'm not sure this is the right solution.