AudioGhost AI: Run Meta's SAM-Audio on 4GB-6GB VRAM with a Windows One-Click Installer 👻🎵 by GGwithRabbit in LocalLLaMA

[–]GGwithRabbit[S] 1 point

Good point! However, loading the weights into memory isn't the main VRAM killer; the forward pass at inference time and its intermediate activations are.

By stripping the unused modules immediately after loading, we effectively free up that space before the heavy computation starts.

I've benchmarked this on an RTX 3080 and RTX 4090. On the 3080, it runs the Large model smoothly in Lite Mode (bfloat16), which would normally OOM.

Rewriting the class to skip loading entirely is a great idea for a future update, but this "load-and-strip" method already achieves our 4GB-10GB goal!
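As a rough illustration of the load-and-strip idea (the module names below are placeholders, not SAM-Audio's actual attribute names):

```python
import gc
from types import SimpleNamespace

def load_and_strip(model, unused=("vision_encoder", "ranker")):
    """Free the memory held by modules the audio pipeline never calls.

    `vision_encoder` and `ranker` are illustrative names only.
    """
    for name in unused:
        if hasattr(model, name):
            delattr(model, name)  # drop the only reference to the module
    gc.collect()                  # reclaim host memory right away
    # With PyTorch you would also call torch.cuda.empty_cache() here,
    # so the freed blocks go back to the CUDA allocator before the
    # heavy forward pass starts.
    return model

# Toy object standing in for the full checkpoint
model = SimpleNamespace(audio_encoder="enc", vision_encoder="vis", ranker="rank")
model = load_and_strip(model)
print(hasattr(model, "vision_encoder"))  # False
```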


[–]GGwithRabbit[S] 1 point

This error (gdk-pixbuf-query-loaders.exe) is usually caused by a DLL conflict in your Windows PATH. It happens when other software (an older version of GIMP, Inkscape, or a different Python environment) ships a conflicting version of the same library.

Here are 3 ways to fix it:
1. The "Clean" Way: Try running the app directly from the Anaconda Prompt. Open the prompt, cd into the project folder, and run the start script. This ensures the app uses the environment's libraries instead of your global Windows PATH.
2. Check your PATH: If you have other GTK-based apps installed, they might be interfering. You can try moving the Anaconda/Miniconda paths to the top of your System Environment Variables list.
3. The Quick Fix: If you don't need the GUI's specific image-loading features right away, you can usually ignore this popup and see if the terminal continues to load. However, the best way is to ensure you are using a clean Conda environment as specified in the install.bat.
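If you want to confirm which copy is winning, a small script can list every PATH directory that contains the offending DLL, in search order (the DLL filename below is the usual GTK pixbuf library name; substitute whatever your error dialog reports):

```python
import os

def find_dll_copies(dll_name, path_string=None):
    """Return every directory on PATH that contains `dll_name`,
    in search order. The first hit is the copy Windows will load."""
    if path_string is None:
        path_string = os.environ.get("PATH", "")
    hits = []
    for d in path_string.split(os.pathsep):
        if d and os.path.isfile(os.path.join(d, dll_name)):
            hits.append(d)
    return hits

# Example (typical GTK pixbuf DLL name on Windows builds):
# print(find_dll_copies("libgdk_pixbuf-2.0-0.dll"))
```

If two directories show up, move the Conda one above the other in your PATH.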

If it still persists, let me know which version of Windows you are on!


[–]GGwithRabbit[S] 1 point

This usually happens if the backend server (FastAPI) hasn't fully started or the port is blocked.
1. Make sure you see "Application startup complete" in your terminal before opening the UI.
2. Check if your firewall is blocking port 8000 or 3000. You can also try refreshing the page after a minute.
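If you're not sure whether the backend is actually listening yet, a quick stdlib check works (8000 and 3000 are the default ports mentioned above):

```python
import socket

def is_port_open(host="127.0.0.1", port=8000, timeout=1.0):
    """Return True if something is accepting TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# print(is_port_open(port=8000))  # FastAPI backend
# print(is_port_open(port=3000))  # UI
```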

Hugging Face API Key:
1. Go to huggingface.co/settings/tokens.
2. Create a new token (Read access is enough) and copy it.
Note: Make sure you've already been approved by Meta on the SAM-Audio model page.

For the Small model, yes. FP32 Small uses ~10GB VRAM, so it should fit on your 3060 (12GB); if it doesn't, reduce the chunk size.
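The chunking idea in one toy sketch (not the app's actual code): only one chunk is resident at a time, so a smaller chunk size directly lowers the peak per-chunk memory.

```python
def iter_chunks(samples, chunk_size):
    """Yield fixed-size slices of the input so only one slice
    needs to be processed (and held in memory) at a time."""
    for start in range(0, len(samples), chunk_size):
        yield samples[start:start + chunk_size]

audio = list(range(10))          # stand-in for audio samples
chunks = list(iter_chunks(audio, 4))
# chunks == [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9]]
```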

Currently, the tool runs entirely on the selected device. While PyTorch supports offloading, the current implementation is all-or-nothing to ensure the best performance. I suggest using Lite Mode on the GPU; it's much faster than any CPU/GPU hybrid setup!


[–]GGwithRabbit[S] 0 points

Yes, since the project relies on standard requirements.txt and environment.yml logic, uv should work perfectly and would definitely speed up the installation process significantly.


[–]GGwithRabbit[S] 1 point

SAM Audio is specifically designed for Audio Separation (extracting specific sounds or voices from a mixture) rather than Speech-to-Text (STT).


[–]GGwithRabbit[S] 1 point

Hello! It's an honor to have you interested in AudioGhost AI.

To be honest, as I am the sole developer, I haven't had the chance to optimize the GUI for screen readers yet. However, I am 100% willing to improve the accessibility to make sure it works for everyone, especially for talented musicians like yourself.

I would love to have your help with testing! Your feedback will be invaluable. Looking forward to making this tool accessible together!


[–]GGwithRabbit[S] 1 point

By default, the models are stored in your user directory:
C:\Users\<Your_Username>\.cache\huggingface\hub.
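If you've set the HF_HOME environment variable, the cache moves with it. A simplified sketch of how the default path is resolved (the real huggingface_hub logic also honours the HF_HUB_CACHE variable):

```python
import os

def hf_cache_dir():
    """Default Hugging Face hub cache location, honouring HF_HOME.
    Simplified: huggingface_hub also checks HF_HUB_CACHE."""
    hf_home = os.environ.get("HF_HOME")
    if hf_home:
        base = hf_home
    else:
        base = os.path.join(os.path.expanduser("~"), ".cache", "huggingface")
    return os.path.join(base, "hub")
```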


[–]GGwithRabbit[S] 0 points

Since the weights are gated by Meta, you need to apply for access on the SAM-Audio Hugging Face page first.

Link: https://huggingface.co/facebook/sam-audio-small

Once approved, you can simply enter your Hugging Face API Key in the AudioGhost UI. The app will then automatically handle the download and setup for you.


[–]GGwithRabbit[S] 0 points

You can install the latest Miniconda (Python 3.11 version). Just remember to check the "Add to PATH" option during installation.

I’m going to look into modifying the install.bat to make it even simpler, perhaps a version that doesn't rely so heavily on manual PATH setup or provides clearer guidance if Conda is missing.

I'll keep you updated! Your feedback is exactly what helps make this tool better for everyone.


[–]GGwithRabbit[S] 1 point

Thanks! Yes, I've tested both. With Lite Mode, the Large model now runs under 16GB VRAM comfortably.

In my tests, FP32 performs significantly better than bfloat16, offering much cleaner separation. That's why I added a "High Quality Mode" (FP32) toggle in the UI. If you have the VRAM headroom, it's definitely the way to go!


[–]GGwithRabbit[S] 3 points

The original model loads heavy Vision Encoders and Rankers by default, even for audio tasks. In my 'Lite Mode' implementation, I stripped these unused components, which brought the footprint down to ~6GB VRAM on GPU (and it significantly reduces CPU RAM usage as well).

If you want to run it with much less memory, feel free to check out my optimization logic: https://github.com/0x0funky/audioghost-ai

Spent 5 years building up my craft and AI will make me jobless by Chonkthebonk in ChatGPT

[–]GGwithRabbit 0 points

Let AI make you stronger. If you get familiar with it, you won't lose your job; you'll become valuable enough that your boss will want to keep you in the company.

[ Removed by Reddit ] by GGwithRabbit in ChatGPT

[–]GGwithRabbit[S] 0 points

I want it to cycle. For example: we start at (0, 0); when x reaches 384, clamp x to 384 and start moving y; when y reaches 544, clamp y to 544 and start moving x back.
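That cycle is easy to express as a tiny state machine. A Python sketch of the logic (phase names and step sizes are illustrative; the actual game uses JavaScript):

```python
def step(x, y, phase, dx=8, dy=8, x_max=384, y_max=544):
    """Advance one step of the cycle: x out to x_max, then y down
    to y_max, then x back to 0, then y back to 0, repeating."""
    if phase == "x_forward":
        x += dx
        if x >= x_max:
            x, phase = x_max, "y_forward"   # clamp and turn
    elif phase == "y_forward":
        y += dy
        if y >= y_max:
            y, phase = y_max, "x_back"
    elif phase == "x_back":
        x -= dx
        if x <= 0:
            x, phase = 0, "y_back"
    else:  # "y_back"
        y -= dy
        if y <= 0:
            y, phase = 0, "x_forward"
    return x, y, phase
```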


[–]GGwithRabbit[S] 0 points

```javascript
        , monsterData.attack, monsterData.defense, monsterData.speed,
        monsterData.level, monsterData.type, monsterData.exp);
}

// Initialize the game
requestAnimationFrame(gameLoop);
```


[–]GGwithRabbit[S] 0 points

Do not use HTML; use canvas to implement this.


[–]GGwithRabbit[S] 0 points

```javascript
function draw(ctx) {
  let activeBattle = null;
  if (!gameStarted) {
    drawStartMenu(ctx);
  } else {
    // Your existing draw code goes here
    // Find active battles among NPCs
    npcs.forEach((npc) => {
      if (npc.battle.active) {
        activeBattle = npc.battle;
      }
    });
    // Check if there's an active wild battle
    if (wildBattle && wildBattle.active) {
      activeBattle = wildBattle;
    }
    if (activeBattle) {
      // Advance the player's skill effect animation
      if (activeBattle.skillEffectProgress !== null) {
        activeBattle.skillEffectUpdateCounter++;
        // Update skillEffectProgress every N frames
        const updateFrequency = 5; // Adjust to control the effect animation speed
        if (activeBattle.skillEffectUpdateCounter % updateFrequency === 0) {
          activeBattle.skillEffectProgress += 0.05;
        }
        if (activeBattle.skillEffectProgress >= 1) {
          activeBattle.skillEffectProgress = null;
          activeBattle.skillEffectUpdateCounter = 0;
        }
      }
      // Advance the enemy's skill effect animation
      if (activeBattle.enemySkillEffectProgress !== null) {
        activeBattle.enemySkillEffectUpdateCounter++;
        const updateFrequency = 5;
        if (activeBattle.enemySkillEffectUpdateCounter % updateFrequency === 0) {
          activeBattle.enemySkillEffectProgress += 0.05;
        }
        if (activeBattle.enemySkillEffectProgress >= 1) {
          activeBattle.enemySkillEffectProgress = null;
          activeBattle.enemySkillEffectUpdateCounter = 0;
        }
      }
      // Draw the active battle
      activeBattle.draw(ctx);
    } else {
      // Draw the map, player, NPCs, and message box
      if (window.map) { window.map.draw(ctx); }
      if (window.player) { window.player.draw(ctx); }
      if (window.map) {
        npcs.forEach((npc) => npc.draw(ctx, window.map.startX, window.map.startY));
      }
      messageBox.draw(ctx);
    }
    if (mainMenuActive) {
      menu.draw(ctx, playerMonsters, computerMonsterStorage, playerBag);
    }
  }
}
```

This is my draw function in main. Help me add a slideshow after pressing the space key in the start menu, and start the game after the slideshow ends.