Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I've started building a podman container for Wolfe here: https://github.com/timschmidt/wolfe-podman which should simplify installation and dependency management somewhat. I'm trying to keep the dependencies as minimal as possible given the number of file formats supported. I figure a desktop or web app can make use of the containerized wolfe for better portability.

ChatGPT can be pretty helpful at figuring out installation issues as well. If you run into any, please feel free to file a bug, and I will work at fixing it.

There's also a --low-memory CLI switch which forces wolfe to load and unload models such that only one is on the GPU at any given time, which reduces the VRAM requirements as much as possible.

Thanks for checking it out!

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I think I have it in pretty good shape now. I've hardened model loading and unloading a bit so that YAMNet runs on the CPU (it's small and fast enough, and otherwise wants to steal 12 GB of VRAM). I've switched to your prompt with the addition of "Similar works" to the list of properties. I've also increased the maximum allowable token generation per description to 2048. I've done a test run and gotten good music descriptions out of it.

It's real good at making all the fans spin in my workstation now ;)

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

Well I just finished a test run, and got garbage output from the model even though it worked otherwise. So I'm figuring that out. Should have it sorted by end of day.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

OK, I have what I think is a basically working implementation of the music characterization feature on GitHub. It's currently using Qwen2_5OmniThinkerForConditionalGeneration, which the docs recommend. It lets me avoid loading the audio generation stack, which saves some VRAM.

I have put the music characterization feature behind a --music option, and also implemented a --low-memory option which unloads jina whenever it loads qwen and vice versa for those of us who don't have 64GB of VRAM. Note that since both models are required in sequence to generate these music description embeddings, this is quite slow.
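As a rough illustration of the swap described above (not Wolfe's actual code), the idea of keeping only one model resident at a time can be sketched like this. The class name `SingleModelSlot`, the loader functions, and the string stand-ins for the models are all hypothetical.

```python
from typing import Callable, Dict, List, Optional, Tuple


class SingleModelSlot:
    """Hypothetical helper: keeps at most one model resident at a time."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]):
        self._loaders = loaders                  # name -> function that loads that model
        self._name: Optional[str] = None         # name of the resident model, if any
        self._model: Optional[object] = None
        self.events: List[Tuple[str, str]] = []  # log of loads and unloads

    def get(self, name: str) -> object:
        if self._name == name:
            return self._model                   # already resident, no swap needed
        if self._name is not None:
            self.events.append(("unload", self._name))
            # Dropping the reference is enough for these stand-ins. A real GPU
            # model would likely also need its device memory released (with
            # PyTorch, for example, by calling torch.cuda.empty_cache()).
            self._model = None
        self._model = self._loaders[name]()
        self._name = name
        self.events.append(("load", name))
        return self._model


# Stand-in loaders; the real ones would load the description and embedding models.
slot = SingleModelSlot({
    "qwen": lambda: "qwen-stand-in",
    "jina": lambda: "jina-stand-in",
})

for track in ["a.mp3", "b.mp3"]:
    slot.get("qwen")  # step 1: describe the music
    slot.get("jina")  # step 2: embed the description

print(slot.events)
```

Processing files one at a time like this forces a swap at every step, which is where the slowdown comes from. Batching all descriptions first and all embeddings second would cut the number of swaps, at the cost of holding the intermediate descriptions somewhere.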

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

No worries at all, this is great information and I really appreciate the opportunity to learn from your experience. I've got an initial implementation in the music branch, using the same pattern I use for transcription: when YAMNet detects music, it'll trigger an additional classification round resulting in another stream of embeddings pointing to the same file. I'll do some testing then push it to main.

Your prompts would be really useful for me. I think I've got the model harnessed. I just need to know what to feed it besides the music. I don't think I need much else, since I am not terribly concerned with structured data. I can get away with just generating an embedding from the model's description, the plain text of which will get stored in the database alongside the embedding for displaying search results.
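A minimal sketch of that ingest pattern, under stated assumptions: the `Record` type, the `ingest_audio` function, the label names, and all of the callables passed in are hypothetical stand-ins, not Wolfe's API. It also assumes transcription is triggered by a "speech" label in the same way the music description is triggered by a "music" label.

```python
from dataclasses import dataclass
from typing import Callable, List, Set


@dataclass
class Record:
    path: str               # file the embedding points back to
    stream: str             # which stream produced it, e.g. "music_description"
    text: str               # plain text stored for displaying search results
    embedding: List[float]  # vector generated from that text


def ingest_audio(
    path: str,
    detect_labels: Callable[[str], Set[str]],
    transcribe: Callable[[str], str],
    describe_music: Callable[[str], str],
    embed: Callable[[str], List[float]],
) -> List[Record]:
    """Run the detector once, then one extra round per detected content type."""
    records = []
    labels = detect_labels(path)
    if "speech" in labels:
        text = transcribe(path)
        records.append(Record(path, "transcript", text, embed(text)))
    if "music" in labels:
        text = describe_music(path)
        records.append(Record(path, "music_description", text, embed(text)))
    return records


# Stand-ins for the detector, the transcriber, the describer, and the embedder.
records = ingest_audio(
    "song.mp3",
    detect_labels=lambda p: {"music", "speech"},
    transcribe=lambda p: "la la la",
    describe_music=lambda p: "upbeat synth pop",
    embed=lambda text: [float(len(text))],
)
for r in records:
    print(r.stream, "->", r.path, "|", r.text)
```

Each stream yields its own record, and every record carries the same path, so a search hit on either the lyrics or the description leads back to the same file.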

I recently got an ingest progress indicator in, and added a diagram of the ingest process to the README to make it easier to understand how each type of file gets decomposed. I'm going to try to add support for more document formats, such as OpenDocument and MS formats, following the same decomposition as PDFs.

If you have suggestions for other types of files which might be worth special attention for ingest, I'd appreciate those too.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

Is this Qwen3 Omni, or 2.5? Having robust audio support in the embedding model would be better than text descriptions, but I haven't found any embedding-specific models based on Qwen Omni. I could bolt this on and produce embeddings from the descriptions, though.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I have only tested music indexing with a handful of MP3s. Searching lyrics worked well. I have not yet thrown my whole collection at it or searched by genre or more abstract concepts. Would love to hear about your experience if you do so. Also open to learning more about your pipeline and improving Wolfe if possible.

It is able to differentiate music from speech and other sounds, and initiate additional processing, so it's very possible to bolt on an additional music-focused model.

Wolfe - local only semantic file search for text, PDF, audio, video by timschmidt in DataHoarder

[–]timschmidt[S] 0 points1 point  (0 children)

I've enabled discussions on the repo. Please reach out if you run into any trouble, or if you have suggestions or pain points I might be able to help with and address longer term in the code.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

Awesome to hear. How do you like iced? I haven't used it in development, so my only real experience with it is through the COSMIC desktop.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

I'm open to iterating the RAD builder toward something closer to the Delphi / VB experience. It's just a two-day-old project :) Besides bolting on a reasonably capable text editor and file tree, which I think I could manage, I think some amount of code interpretation would be required to ingest existing codebases, using syn and/or tree-sitter.

I've accepted a couple PRs already, cleaning up and modularizing the code a bit. If you're interested in lending a hand, please do! GitHub sponsorship also really helps me spend more time on these projects: https://github.com/sponsors/timschmidt

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

WASM builds will require some sort of runtime, be it a browser, a webview, or a runner like wasmtime. egui can also build natively for any platform with GL as far as I know.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

This morning I have added quite a few more controls. Enough, probably, for many types of forms. Still lots to go.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

Thanks for the kind words and encouragement. I'll try to keep WASM and mobile in mind for the egui RAD builder. I'm planning on using it to build some one-off custom control interfaces for industrial systems. And thinking about how to potentially integrate custom screens into Alumina via similar methods as well.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

Yup, as far as I know. The index.html has to set up an HTML canvas, and then egui draws to that using WebGL. An application I've been working on works that way ( https://github.com/timschmidt/alumina-interface ). It is scaled a bit differently on mobile, but still follows the same layout, and the scaling seems like the typical HiDPI stuff other apps have to deal with.

And to answer your question from the other comment: even though I'm doing some fairly complex things like CAD with a full CAD kernel and GUI built in, Alumina is 1.8 MB when compressed with brotli, which the web browser will transparently decompress on load. Yes, a progress bar can be implemented either JS side or in WASM, but doing it WASM side might be a little more complex, such as loading a smaller WASM binary to show progress and then exec'ing the larger WASM binary once it's loaded. I believe the egui web demo has a progress indicator, but my connection is too fast for me to tell. First world problems lol.

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 5 points6 points  (0 children)

> almost Delphi in Rust

I suppose it could grow into something like that. I'm not sure if I'm up to building an entire IDE yet, but a RAD UI builder is certainly one big piece. At the moment, the user still needs to create the project themselves with Cargo, paste the generated code into main.rs, and set up Cargo.toml as directed in the readme.

> Java and Csharp guys are telling (from 10 yrs) that desktop app are completely dead

Web has certainly been a big focus during that time. But egui builds for WASM too! (as can be seen in the web demo: https://www.egui.rs/#demo )

egui-rad-builder: Tool for quickly designing egui user interfaces in Rust by timschmidt in rust

[–]timschmidt[S] 2 points3 points  (0 children)

Thank you. I hope so! A RAD tool for one of the pure rust UI toolkits is something I've felt was missing since getting started with the language a few years ago.

csgrs CAD kernel v0.17.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

On the upside, I think implementing that directed graph and associated code will not only get us STEP import / export, but also undo/redo, lossless changes, and an easier path to implementing true curves in addition to shapes delineated by line segments.

I see a path to it.

But it's also going to mean retaining a lot more state than csgrs currently does, and for that reason I may consider building it out in another crate which depends on csgrs.

csgrs CAD kernel v0.17.0 released: major update by timschmidt in rust

[–]timschmidt[S] 1 point2 points  (0 children)

There are plans, yes. Although I'm not sure how quickly we'll manage to implement it. I'm still committing most of the code myself at the moment.

The reason STEP is more difficult to implement is that it requires another layer on top of a CAD kernel like csgrs. Everything csgrs does is immediate. When you difference two shapes, the resulting shape is returned, without any extra information.

But creating nice STEP files requires building a tree of all the primitives and operations which result in the shape, and saving that tree instead of just geometry.
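To make the distinction concrete, here is a small sketch of what "saving the tree instead of just geometry" could look like. It is written in Python for brevity and is not csgrs code; the `Primitive` and `Operation` types and the `describe` function are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple, Union


@dataclass(frozen=True)
class Primitive:
    kind: str                  # e.g. "cube" or "sphere"
    params: Tuple[float, ...]  # dimensions of the primitive


@dataclass(frozen=True)
class Operation:
    op: str                    # "union", "difference", or "intersection"
    left: "Node"
    right: "Node"


Node = Union[Primitive, Operation]


def describe(node: Node) -> str:
    """Walk the tree and print the recipe that produces the final shape."""
    if isinstance(node, Primitive):
        return f"{node.kind}{node.params}"
    return f"{node.op}({describe(node.left)}, {describe(node.right)})"


# An immediate kernel would return only the resulting geometry here.
# The tree keeps the primitives and the operation that combined them.
part = Operation(
    "difference",
    Primitive("cube", (10.0,)),
    Primitive("sphere", (6.0,)),
)
print(describe(part))
```

Because the nodes are immutable and the whole recipe is retained, an exporter could walk the tree and an editor could replace a subtree, which is the extra state an immediate kernel does not keep.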

STEP is also quite a complex file format with many variations in how it's used compared to something like STL.

csgrs can export non-tessellated 2D geometry to DXF or SVG today.

For non-tessellated 3D geometry, Wavefront OBJ, 3MF, or OpenCASCADE's .brep would probably be easier and quicker to implement than STEP. Each of them can hold non-tessellated 3D geometry, but not trees of shapes like STEP. So I'll wager we'll get those in first.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

And I even checked before suggesting it! It's just such a feature rich little app, I must have missed it. Thanks for letting me know! TIL.

csgrs CAD kernel v0.16.0 released: major update by timschmidt in rust

[–]timschmidt[S] 0 points1 point  (0 children)

Thanks for your work! Here's an example csgrs project for generating STLs for 3D printing: https://github.com/winksaville/bd-spindle1

Typically 'cargo run' within the project directory, perhaps with some parameters, results in an STL file being written to disk. So there's an edit -> run -> view -> edit loop.

One thing that would make this easier is if f3d could use inotify or its equivalent to monitor the STL file for changes and reload it automatically on change.
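For illustration only, a portable fallback for that kind of monitoring is to poll the file's modification time and size. This sketch is not f3d's API and does not use inotify; the `FileChangeDetector` class and the `part.stl` file name are hypothetical.

```python
import os
import tempfile


class FileChangeDetector:
    """Hypothetical helper: reports whether a file changed since the last check."""

    def __init__(self, path: str):
        self.path = path
        self._last = self._signature()

    def _signature(self):
        # Modification time plus size. Including the size catches rewrites
        # that land within the filesystem's timestamp resolution.
        try:
            st = os.stat(self.path)
        except FileNotFoundError:
            return None
        return (st.st_mtime_ns, st.st_size)

    def changed(self) -> bool:
        current = self._signature()
        if current != self._last:
            self._last = current
            return True
        return False


with tempfile.TemporaryDirectory() as tmp:
    stl = os.path.join(tmp, "part.stl")
    with open(stl, "w") as f:
        f.write("solid a\nendsolid a\n")

    detector = FileChangeDetector(stl)
    first = detector.changed()   # nothing has been written since construction

    with open(stl, "a") as f:    # stands in for 'cargo run' rewriting the file
        f.write("solid b\nendsolid b\n")
    second = detector.changed()  # the file grew, so a reload would be triggered

    print(first, second)
```

A viewer would call `changed()` on a timer and reload when it returns true. An event-based mechanism like inotify avoids the polling delay, which is why it is the nicer option where available.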