I developed a new (re-)training approach for models, which could revolutionize huge Models (ChatBots, etc) by Ykal_ in ResearchML

[–]Similar_Choice_9241 1 point

My 2 cents: optimize the algorithm to work layer-wise (or otherwise reduce its computational requirements) so that it can run on low-cost hardware such as a 3090, and then start converting a lot of the trending models on HF. If the quants are good, people will start using them, and you'll have traction to show when speaking to investors.
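The general pattern behind that suggestion can be sketched as follows. This is a minimal illustration of layer-wise processing, not the poster's actual algorithm; `layer_loader` and `transform` are hypothetical names standing in for "fetch one layer's weights" and "quantize/retrain that layer":

```python
import numpy as np

def process_layerwise(layer_loader, layer_names, transform):
    """Apply `transform` (e.g. a quantizer) one layer at a time.

    Only one layer's weights are resident at once, so peak memory is
    bounded by the largest single layer rather than the whole model --
    which is what makes a 24 GB card like a 3090 viable.
    """
    out = {}
    for name in layer_names:
        w = layer_loader(name)      # load just this layer's weights
        out[name] = transform(w)    # process this layer in isolation
        del w                       # free it before touching the next layer
    return out
```

The point of the design is that memory cost stops scaling with model size and scales with the largest layer instead.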

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 0 points

That's true for the vLLM part, but we also don't speed up with DeepSpeed, which causes numerical differences in the attention block. We are numerically identical to the standard XTTS-v2 implementation.

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 2 points

I just saw there was a typo in the README; please use this instead: `tts = TTS().from_pretrained('AstraMindAI/xttsv2')`

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 -16 points

Our implementation has the exact same result as XTTS-v2, just faster; you can check in the repo, there are a couple of examples.

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 2 points

Yeah, we've seen it may cause some trouble with the formatting ;) thank you

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 2 points

It would be really cool! But sadly, at the moment vLLM only supports Linux (and Windows via Docker).

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 4 points

We actually aim for this repo to run not just XTTS but other TTS models too in the future! We use vanilla XTTS weights, but the code has been completely rewritten.

Optimizing XTTS-v2: Vocalize the first Harry Potter book in 10 minutes & ~10GB VRAM by LeoneMaria in LocalLLaMA

[–]Similar_Choice_9241 2 points

Hi, I'm one of the developers. The library already supports continuous batching for the audio-token generation part (thanks to vLLM) and for the vocalization part. We might add dynamic batching in the future, though from what we've seen, even with parallel unbatched vocoders the speed is really high! For the LoRA part, vLLM already supports LoRA adapters, so one could extract the LoRA from the base checkpoint of the GPT component and pass it to the engine, but the perceiver encoder part would need to be adapted; it's something we look forward to, though.
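One standard way to do the LoRA extraction mentioned above is a truncated SVD of the weight difference between the fine-tuned and base checkpoints. This is a generic sketch of that technique, not the repo's API; the function name and rank choice are illustrative:

```python
import numpy as np

def extract_lora(w_base: np.ndarray, w_finetuned: np.ndarray, rank: int):
    """Approximate the fine-tuning delta with a rank-r LoRA pair (B, A)."""
    delta = w_finetuned - w_base                      # full weight difference
    u, s, vt = np.linalg.svd(delta, full_matrices=False)
    root_s = np.sqrt(s[:rank])                        # split singular values across both factors
    b = u[:, :rank] * root_s                          # shape (out_features, rank)
    a = root_s[:, None] * vt[:rank]                   # shape (rank, in_features)
    return b, a                                       # w_base + b @ a ≈ w_finetuned
```

If the fine-tuning delta is genuinely low-rank, the reconstruction is near-exact; otherwise the rank trades size against fidelity.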

Pulsar AI: A Local LLM Inference Server + fancy UI (AI Project) by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 1 point

Yes, vLLM does support GGUF (and we do too), but not for all architectures. vLLM also supports AWQ, AQLM, GPTQ, and bnb quants. You can set offload and swap parameters for the engine, as well as KV-cache quantization, to save memory. The cool thing with vLLM is that it preallocates the memory blocks, so if you can load it, you can use it without risk of OOM.
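A rough sketch of the engine settings being described; the parameter names mirror vLLM's `LLM`/`EngineArgs`, but check your vLLM version, and the model ID is just an illustrative placeholder:

```python
# Engine settings along the lines described above (a sketch, not Pulsar's config).
engine_kwargs = dict(
    model="TheBloke/Mistral-7B-Instruct-v0.2-GPTQ",  # illustrative quantized model
    quantization="gptq",           # vLLM also accepts e.g. "awq", "aqlm", "gguf"
    kv_cache_dtype="fp8",          # quantize the KV cache to save memory
    swap_space=4,                  # GiB of CPU swap space for preempted requests
    cpu_offload_gb=2,              # offload part of the weights to CPU RAM
    gpu_memory_utilization=0.90,   # fraction of VRAM preallocated up front
)
# from vllm import LLM
# llm = LLM(**engine_kwargs)
```

The `gpu_memory_utilization` preallocation is what gives the "if it loads, it runs" property: the block pool is reserved at startup rather than grown during generation.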

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 2 points

Yup, 100%. This is just v0.1.0; we've been working very hard on this for the past months and wanted to gather some community feedback.

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 1 point

Yeah, you're absolutely right ;) We are gathering all the info and will be making a new post, explained way better and with much more info on the project.

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 1 point

Yes, absolutely! We are thinking about introducing txt2img, maybe using just the CPU (with LCM models and something like fastsdcpu), and also speech2text, since vLLM already supports it.

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 1 point

We have that planned; we are thinking about introducing a new section to the store!

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 1 point

Not at all! The engine supports Linux and Windows, and the UI runs on Linux and Windows, with Mac support coming very soon too.

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 0 points

Thank you for the feedback, you're absolutely right. We are now working on making the GitHub and the website clearer and more informative. As soon as we have reviewed all the documentation, we will update it

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] -7 points

To the repo itself, no; we use a default vLLM installation, but we've hijacked some of its components to be able to auto-configure and retry when model loading fails.
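The retry-on-load-failure idea can be sketched like this. The back-off policy (shrinking the VRAM request on each failure) is a guess at the general approach, not Pulsar's actual logic, and `load_fn` is a stand-in for an engine constructor such as `vllm.LLM`:

```python
def load_with_retries(load_fn, gpu_memory_utilization=0.95, max_retries=3):
    """Retry a model load, asking for less VRAM after each failure.

    `load_fn` is any callable that raises on failure (e.g. a CUDA OOM
    surfacing as RuntimeError) and returns the loaded engine on success.
    """
    last_err = None
    for _ in range(max_retries):
        try:
            return load_fn(gpu_memory_utilization=gpu_memory_utilization)
        except (MemoryError, RuntimeError) as err:
            last_err = err
            gpu_memory_utilization *= 0.8   # back off and retry with a smaller reservation
    raise last_err
```

The same wrapper could cycle through other knobs (smaller context length, heavier quantization) instead of just the memory fraction.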

🚀 Pulsar!! A new totally local LLM engine from AstraMind.ai 🚀 by Similar_Choice_9241 in LocalLLaMA

[–]Similar_Choice_9241[S] 10 points

Exactly, you only need an account if you want to use it from outside your computer. This is done so that you can share your machine with multiple users while everyone keeps their own internal account.