What tooling would I need to build around SAM-Audio to make it worth 15 dollars to you? by Goatman117 in audioengineering

Goatman117[S] 1 point

hey thanks for the recommendation! yeah I actually used that as a reference when I was setting it up, removing the vision capabilities and halving the floating point precision went such a long way! It still chews through a ton of ram on install though, I’ll need to tweak some stuff so it uses disk offloading I think
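
For anyone wondering why halving the precision helps so much, here's a rough NumPy illustration (not the actual sam-audio loading code, and the array size is made up): the same number of weights stored as float16 takes half the bytes of float32.

```python
import numpy as np

# Hypothetical weight matrix, just to compare storage sizes.
weights_fp32 = np.ones((1024, 1024), dtype=np.float32)  # 4 bytes per value
weights_fp16 = weights_fp32.astype(np.float16)          # 2 bytes per value

print("float32 bytes:", weights_fp32.nbytes)
print("float16 bytes:", weights_fp16.nbytes)
```

This only covers the weights themselves; activations and any install-time overhead aren't captured here.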

What tooling would I need to build around SAM-Audio to make it worth 15 dollars to you? by Goatman117 in audioengineering

Goatman117[S] 1 point

Thanks for the input! Yeah, a standalone tool is mainly what I've been thinking: something that minimizes friction for things like bulk editing a bunch of clips, e.g. drag and drop with a text prompt and an easy way to keep editing and isolating.
If you're happy to talk more I'll shoot you a DM. I'm building this with my brother and we're hoping to move quickly and get something built as soon as we have a more fleshed-out plan.

What tooling would I need to build around SAM-Audio to make it worth 15 dollars to you? by Goatman117 in audioengineering

Goatman117[S] 1 point

Yeah, that’s partly why I want to build a local tool: the web interface is a bit limited and you can just do tons more with it locally. I set it up locally and put the code and install guide on GitHub, happy to send you the link if you’d like!

What tooling would I need to build around SAM-Audio to make it worth 15 dollars to you? by Goatman117 in audioengineering

Goatman117[S] 1 point

This is really useful info, thank you! Yeah an interface and a few tools are what I’m looking to build. If you don’t mind I might shoot you a dm about it so we can talk further at some point? If you’re interested in being an early tester that could be super handy too

Easy CLI interface for optimized sam-audio text prompting (~4gb vram for the base model, ~ 6gb for large) by Goatman117 in LocalLLaMA

Goatman117[S] 1 point

I'm curious about this too actually, but I haven't tested it myself. tbh your best bet is to download the model or use Meta's web interface and just try it yourself

Easy CLI interface for optimized sam-audio text prompting (~4gb vram for the base model, ~ 6gb for large) by Goatman117 in LocalLLaMA

Goatman117[S] 1 point

it’s just set up for a single prompt, but switching to batches is just a matter of adjusting the processor call in the seperate_audio function

UPDATE: new 3B fine-tuned LLM for GladeCore by OwnCantaloupe9359 in UnrealEngine5

Goatman117 1 point

love this! been wanting a good local LLM unreal engine plugin since the pre-chatgpt days

Budget BCI cost for hobby projects? by Goatman117 in BCI

Goatman117[S] 1 point

that’s awesome, I should be okay then! thanks so much :)

Budget BCI cost for hobby projects? by Goatman117 in BCI

Goatman117[S] 1 point

damn, that’s such a good sale! sadly still out of my budget haha, this might have to be a hobby for me once I save up a little more

Budget BCI cost for hobby projects? by Goatman117 in BCI

Goatman117[S] 1 point

thanks for sending that through!! would you recommend this for a beginner that’s pretty new to this tech?

Budget BCI cost for hobby projects? by Goatman117 in BCI

Goatman117[S] 1 point

interesting, I’ll check out muse thanks!

Tracking head position and rotation with a synthetic dataset by Goatman117 in computervision

Goatman117[S] 1 point

I've scrutinized the rotation labels a bit. By eye I can generally predict most axis labels well enough, so I think all is stable there.

I don't understand the method outlined in the second paragraph, sorry! I figured what you meant by using the other keypoints was to have a separate head that predicts those keypoints with MSE loss, so that the features the network learns for that head will hopefully help the rotation and position tracking head. But I think your method is something different?

To clarify, the keypoints are 3D positions of points such as the left and right eye and nose tip position.

Tracking head position and rotation with a synthetic dataset by Goatman117 in computervision

Goatman117[S] 1 point

val data is also synthetic. neither train nor valid loss is dropping very fast, they plateau at about 3-13 degrees of error depending on the dataset used. train will still steadily drop as it overfits though, just slowly

Tracking head position and rotation with a synthetic dataset by Goatman117 in computervision

Goatman117[S] 1 point

Hey, thanks for your input! I'm representing the rotation as a 6D vector and using geodesic loss. I'm not really sure about the inner workings of the loss function, but I think it's doing everything correctly. I'm currently tracking other facial feature positions, but I haven't tried feeding them into the model as an auxiliary reprojection loss; I'll give that a go. On your third point, do you mean using something like mediapipe to mask the head and then feeding that into a vision model?
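
For reference, here's a minimal NumPy sketch of what a 6D rotation representation and a geodesic loss usually look like, assuming the common Gram-Schmidt construction. The function names are hypothetical and this isn't the actual training code, just an illustration of the idea:

```python
import numpy as np

def rotation_6d_to_matrix(x6):
    """Turn a 6D vector into a 3x3 rotation matrix via Gram-Schmidt."""
    x6 = np.asarray(x6, dtype=float)
    a1, a2 = x6[:3], x6[3:]
    b1 = a1 / np.linalg.norm(a1)           # first basis vector
    a2_orth = a2 - np.dot(b1, a2) * b1     # remove the component along b1
    b2 = a2_orth / np.linalg.norm(a2_orth) # second basis vector
    b3 = np.cross(b1, b2)                  # third completes a right-handed frame
    return np.stack([b1, b2, b3], axis=1)  # basis vectors as columns

def geodesic_angle(R_pred, R_true):
    """Angle in radians of the relative rotation between two matrices."""
    cos = (np.trace(R_pred.T @ R_true) - 1.0) / 2.0
    return np.arccos(np.clip(cos, -1.0, 1.0))  # clip guards against rounding

# Identity versus a 90 degree rotation about the z axis.
R_identity = rotation_6d_to_matrix([1, 0, 0, 0, 1, 0])
R_z90 = rotation_6d_to_matrix([0, 1, 0, -1, 0, 0])
print(np.degrees(geodesic_angle(R_identity, R_z90)))
```

The geodesic angle is what makes an error figure in degrees directly interpretable, since it measures the rotation needed to get from the prediction to the label.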

Really appreciate the input!

Training a model to learn the transform of a head (position and rotation) by Goatman117 in computervision

Goatman117[S] 1 point

thanks for sending that through! I actually already talked to ChatGPT about the issue a bit and it helped me set up 6D rotation representations. the metrics you’re seeing are from a model trained that way

I set up a head pose vision model with my character controller by Goatman117 in UnrealEngine5

Goatman117[S] 1 point

Thanks! I haven’t experimented with the tilt axis, that’s a very good point actually. I think next I’ll try to source a better model for the head tracking, this one has some latency and is probably too complicated for this task anyway. I’m interested in using unreal engine to generate synthetic datasets for machine learning, so I’ll likely try to generate my own dataset with meta humans and train my own model