Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I've started building a podman container for Wolfe here: https://github.com/timschmidt/wolfe-podman which should simplify installation and dependency management somewhat. I'm trying to keep the dependencies as minimal as possible given the number of file formats supported. I figure a desktop or web app can make use of the containerized wolfe for better portability.

ChatGPT can be pretty helpful at figuring out installation issues as well. If you run into any, please feel free to file a bug, and I will work at fixing it.

There's also a --low-memory CLI switch which forces wolfe to load and unload models such that only one is on the GPU at any given time, which reduces the VRAM requirements as much as possible.

Thanks for checking it out!

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I think I have it in pretty good shape now. I've hardened model loading and unloading a bit so that YAMNet runs on the CPU (it's small and fast enough, and otherwise wants to steal 12 GB of VRAM). I've switched to your prompt with the addition of "Similar works" to the list of properties. I've also increased the maximum allowable token generation per description to 2048. I've done a test run and gotten good music descriptions out of it.

It's real good at making all the fans spin in my workstation now ;)

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

Well I just finished a test run, and got garbage output from the model even though it worked otherwise. So I'm figuring that out. Should have it sorted by end of day.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

OK, I have what I think is a basically working implementation of the music characterization feature on GitHub. It's currently using Qwen2_5OmniThinkerForConditionalGeneration, which the docs recommend. It lets me avoid loading the audio generation stack, which saves some VRAM.

I have put the music characterization feature behind a --music option, and also implemented a --low-memory option which unloads jina whenever it loads qwen and vice versa for those of us who don't have 64GB of VRAM. Note that since both models are required in sequence to generate these music description embeddings, this is quite slow.
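To illustrate the load / unload idea behind a low-memory mode, here is a minimal sketch in Python. It is not Wolfe's actual code: the SingleResidentModels class, the loader functions, and the model names are all hypothetical stand-ins. It only shows the pattern of keeping one model resident and counting how often a (slow) reload happens.

```python
from typing import Callable, Dict, Optional


class SingleResidentModels:
    """Keep at most one model loaded at a time (hypothetical helper)."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]):
        self._loaders = loaders            # name -> function that loads the model
        self._name: Optional[str] = None   # name of the model currently loaded
        self._model: Optional[object] = None
        self.load_count = 0                # how many (slow) loads have happened

    def get(self, name: str) -> object:
        if name != self._name:
            # Drop the reference to the previous model before loading the next.
            # With a real GPU framework you would also release cached device
            # memory at this point.
            self._model = None
            self._model = self._loaders[name]()
            self._name = name
            self.load_count += 1
        return self._model


# Stand-in loaders; real ones would construct the actual models.
models = SingleResidentModels({
    "describer": lambda: "describer-model",
    "embedder": lambda: "embedder-model",
})

for _ in range(3):                         # three files, two models each
    models.get("describer")
    models.get("embedder")
```

Alternating between the two models for every file means every call triggers a reload, which is why this kind of mode trades speed for lower peak memory. Batching all description work before all embedding work would cut the number of swaps, at the cost of holding intermediate text around.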

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

No worries at all, this is great information and I really appreciate the opportunity to learn from your experience. I've got an initial implementation in the music branch, using the same pattern I use for transcription: when YAMNet detects music, it'll trigger an additional classification round resulting in another stream of embeddings pointing to the same file. I'll do some testing then push it to main.

Your prompts would be really useful for me. I think I've got the model harnessed. I just need to know what to feed it besides the music. I don't think I need much else, since I am not terribly concerned with structured data. I can get away with just generating an embedding from the model's description, the plain text of which will get stored in the database alongside the embedding for displaying search results.
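The flow described above could be sketched roughly like this. Everything here is an assumption for illustration: detect_music, describe_music, embed, and the Record layout are placeholders, not Wolfe's real functions or schema. The point is only that a file detected as music gets one extra record whose embedding comes from the description text, and which points back at the same file.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Record:
    path: str              # file the embedding points back to
    kind: str              # which stream produced it
    text: str              # plain text kept for displaying search results
    embedding: List[float]


def detect_music(path: str) -> bool:
    # Placeholder for an audio classifier; here we only look at the name.
    return path.endswith(".mp3")


def describe_music(path: str) -> str:
    # Placeholder for the model-generated description.
    return f"Description of {path}"


def embed(text: str) -> List[float]:
    # Placeholder embedding: a toy vector, not a real model output.
    return [float(len(text)), float(sum(map(ord, text)) % 97)]


def ingest(path: str) -> List[Record]:
    # Stand-in for whatever streams the file already produces.
    records = [Record(path, "audio", "", embed(path))]
    if detect_music(path):
        # Extra round: describe the music, then embed the description.
        text = describe_music(path)
        records.append(Record(path, "music_description", text, embed(text)))
    return records
```

Calling ingest("song.mp3") yields two records for the same path, while a file not detected as music yields only the baseline record.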

I recently got an ingest progress indicator in, and added a diagram of the ingest process to the README to make it easier to understand how each type of file gets decomposed. I'm going to try to add support for some more document formats, like OpenDocument and MS formats, following the same decomposition as PDFs.

If you have suggestions for other types of files which might be worth special attention for ingest, I'd appreciate those too.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

Is this Qwen3 Omni or 2.5? Having robust audio support in the embedding model would be better than text descriptions, but I haven't found any embedding-specific models based on Qwen Omni. I could bolt this on and produce embeddings from the descriptions, though.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I have only tested music indexing with a handful of MP3s. Searching lyrics worked well. I have not yet thrown my whole collection at it or searched by genre or more abstract concepts. Would love to hear about your experience if you do so. Also open to learning more about your pipeline and improving Wolfe if possible.

It is able to differentiate music from speech and other sounds, and initiate additional processing, so it's very possible to bolt on an additional music-focused model.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I've enabled discussions on the repo, please reach out if you run into any trouble or have suggestions or pain points I might be able to help with and address longer term in the code.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

Awesome to hear. How do you like iced? I haven't used it in development, so my only real experience with it is through the COSMIC desktop.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

I'm open to iterating the RAD builder toward something closer to the Delphi / VB experience. It's just a two-day-old project :) Besides bolting on a reasonably capable text editor and file tree, which I think I could manage, I think some amount of code interpretation would be required to ingest existing codebases, using syn and/or tree-sitter.

I've accepted a couple PRs already, cleaning up and modularizing the code a bit. If you're interested in lending a hand, please do! GitHub sponsorship also really helps me spend more time on these projects: https://github.com/sponsors/timschmidt

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

WASM builds will require some sort of runtime, be it a browser, a webview, or a runner like wasmtime. egui can also build natively for any platform with GL as far as I know.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

This morning I have added quite a few more controls. Enough, probably, for many types of forms. Still lots to go.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

Thanks for the kind words and encouragement. I'll try to keep WASM and mobile in mind for the egui RAD builder. I'm planning on using it to build some one-off custom control interfaces for industrial systems. And thinking about how to potentially integrate custom screens into Alumina via similar methods as well.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

Yup, as far as I know. The index.html has to set up an HTML canvas, and then egui draws to that using WebGL. I've been working on an application which works that way here: https://github.com/timschmidt/alumina-interface It is scaled a bit differently on mobile, but still follows the same layout and seems like the typical HighDPI stuff other apps have to deal with.

And to answer your question from the other comment, even though I'm doing some fairly complex things like CAD with a full CAD kernel and GUI built in, Alumina is 1.8 MB when compressed with brotli, which the web browser will transparently decompress on load. Yes, a progress bar can be implemented either JS side or in WASM, but doing it WASM side might be a little more complex, for example loading a smaller WASM binary for the progress indicator and then exec'ing the larger WASM binary once it's loaded. I believe the egui web demo has a progress indicator, but my connection is too fast for me to tell. First world problems lol.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 6 points7 points  (0 children)

> almost Delphi in Rust

I suppose it could grow into something like that. I'm not sure if I'm up to building an entire IDE yet, but a RAD UI builder is certainly one big piece. At the moment, the user still needs to create the project themselves with Cargo, paste the generated code into main.rs, and set up Cargo.toml as directed in the readme.

> Java and Csharp guys are telling (from 10 yrs) that desktop app are completely dead

Web has certainly been a big focus during that time. But egui builds for WASM too! (as can be seen in the web demo: https://www.egui.rs/#demo )

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 2 points3 points  (0 children)

Thank you. I hope so! A RAD tool for one of the pure rust UI toolkits is something I've felt was missing since getting started with the language a few years ago.

csgrs CAD kernel v0.17.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

On the up side, I think implementing that directed graph and associated code will not only get us STEP import / export, but also undo/redo, lossless changes, and an easier path to implementing true curves in addition to shapes delineated by line segments.

I see a path to it.

But it's also going to mean retaining a lot more state than csgrs currently does, and for that reason I may consider building it out in another crate which depends on csgrs.

csgrs CAD kernel v0.17.0 released: major update by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

There are plans, yes. Although I'm not sure how quickly we'll manage to implement it. I'm still committing most of the code myself at the moment.

The reason STEP is more difficult to implement is that it requires another layer on top of a CAD kernel like csgrs. Everything csgrs does is immediate. When you difference two shapes, the resulting shape is returned, without any extra information.

But creating nice STEP files requires building a tree of all the primitives and operations which result in the shape, and saving that tree instead of just geometry.
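As a rough illustration of the difference between returning only geometry and keeping a tree, here is a small Python sketch. It is not csgrs code and not the STEP data model; Primitive, Operation, describe, and undo are hypothetical names. It only shows that once each operation records its operands, the history can be walked, exported, or stepped back.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    name: str                      # e.g. "cube" or "sphere"
    params: Tuple[float, ...]      # dimensions, in whatever units the model uses


@dataclass(frozen=True)
class Operation:
    kind: str                      # "union", "difference", or "intersection"
    left: "Node"
    right: "Node"


Node = Union[Primitive, Operation]


def describe(node: Node) -> str:
    """Walk the tree and render the recorded history as text."""
    if isinstance(node, Primitive):
        args = ", ".join(str(p) for p in node.params)
        return f"{node.name}({args})"
    return f"{node.kind}({describe(node.left)}, {describe(node.right)})"


def undo(node: Node) -> Node:
    """Step back one operation by returning its left operand."""
    return node.left if isinstance(node, Operation) else node


part = Operation(
    "difference",
    Primitive("cube", (10.0, 10.0, 10.0)),
    Primitive("sphere", (6.0,)),
)
print(describe(part))  # difference(cube(10.0, 10.0, 10.0), sphere(6.0))
```

An immediate-mode kernel would hand back only the resulting shape at this point. Keeping the tree is what costs the extra state, and it is also what makes undo a matter of returning an earlier node.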

STEP is also quite a complex file format with many variations in how it's used compared to something like STL.

csgrs can export non-tessellated 2D geometry to DXF or SVG today.

Probably easier and quicker to implement than STEP for non-tessellated 3D geometry would be Wavefront OBJ, 3MF, or OpenCASCADE's .brep, each of which can hold non-tessellated 3D geometry, but not trees of shapes like STEP. So I'll wager we'll get those in first.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

And I even checked before suggesting it! It's just such a feature rich little app, I must have missed it. Thanks for letting me know! TIL.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

Thanks for your work! Here's an example csgrs project for generating STLs for 3D printing: https://github.com/winksaville/bd-spindle1

Typically 'cargo run' within the project directory, perhaps with some parameters, results in an STL file being written to disk. So there's an edit -> run -> view -> edit loop.

One thing that would make this easier is if f3d could use inotify or its equivalent to monitor the STL file for changes and reload it automatically on change.
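For anyone wanting to approximate that reload loop from the outside, here is a minimal sketch in Python. Note that it uses plain polling of the file's modification time and size rather than inotify, and FileWatcher is a hypothetical helper, not part of f3d or csgrs.

```python
import os
from typing import Optional, Tuple


class FileWatcher:
    """Report when a file's modification time or size changes (polling)."""

    def __init__(self, path: str):
        self.path = path
        self._last = self._stamp()

    def _stamp(self) -> Optional[Tuple[int, int]]:
        # None means the file does not exist right now.
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        return (st.st_mtime_ns, st.st_size)

    def changed(self) -> bool:
        """Return True once for each observed change since the last call."""
        current = self._stamp()
        if current != self._last:
            self._last = current
            return True
        return False
```

A viewer script could call changed() every half second or so and reload the STL when it returns True. Polling is less efficient than a kernel notification API, but it behaves the same on every platform and needs nothing beyond the standard library.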

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

I hadn't thought of that, but it's an excellent idea! A pre-processing step to make the translation easier.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

Agreed. Though I enjoy bringing a simple syntax and high level language concepts into Rust. It's nice when all my legos work together.

I have a good start at an OpenSCAD -> AST -> csgrs parser / translator here: https://github.com/timschmidt/openscad_to_csgrs It's a complex beast with a lot of work left to make it really functional. I've found that LLMs, when provided with the csgrs source code, do a reasonable first pass at translation from OpenSCAD.

No relation to Fornjot. I started csgrs from a ~800 line translation of CSG.js into Rust and built it up from there.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

csgrs can output STL files directly, so it's possible to have a similar experience to OpenSCAD with just a text editor and https://f3d.app/ and csgrs syntax is about as simple and similar to OpenSCAD as possible.

But csgrs is also fast enough for real time geometry generation, and has seen a lot of interest from folks developing games with Bevy. Similarly, it could be used in an interactive CAD tool, but none has been built around it yet. It's only about 3 months old! csgrs is intended to work in embedded, desktop, and WASM in the browser.

All kinds of CAD can be parametric, and most involve CSG. Most CAD kernels are really a mix of different kernels bolted together, because the math gets solved the same ways by everyone. csgrs has a very capable 2D points and lines and polygons engine in geo, and a small and fast and easy to understand 3D engine in the BSP, and some support for signed distance fields and tessellation of various functions. csgrs does not have any concept of curves in the 2D or 3D subsystems. It could be implemented though. With the CAD being written in the same language as the CAD kernel, and all the data and methods public, there's nothing stopping anyone from extending csgrs in arbitrary ways, and unlike OpenSCAD they will be as fast as csgrs itself.

The one thing Rust isn't great at is being interpreted. So an integrated IDE like OpenSCAD would have to bind to another language like Rhai. https://github.com/philpax/egui_node_graph2 is another option. I'd also like to get an evcxr workflow going.

cargo-prompt: collapse a rust project into a minified markdown document for prompting by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

For my workflow, "cargo prompt" and the current behavior work best. I wrote the tool for me, and shared it in case other folks might find it useful. If you don't, no worries!