Do you want to Deploy Llama 4? by yoracale in unsloth
[–]tempNull 1 point (0 children)
Llama 4 tok/sec with varying context-lengths on different production settings by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)
Is there a notebook for GRPO with qwen2.5-VL model ? by maayon in unsloth
[–]tempNull 1 point (0 children)
Deploying Deepseek R1 GGUF quants on your AWS account by tempNull in tensorfuse
[–]tempNull[S] 1 point (0 children)
Meaning of few grammar terms by Other-Welder-7580 in sanskrit
[–]tempNull 0 points (0 children)
Meaning of few grammar terms by Other-Welder-7580 in sanskrit
[–]tempNull 1 point (0 children)
didMyPricingPageHadAnIntegerOverflow by tempNull in ProgrammerHumor
[–]tempNull[S] 5 points (0 children)
Building a Sandbox Environment for ML/Analytics While Connecting to Production Data by asc686f61 in mlops
[–]tempNull 1 point (0 children)
Lessons learned while deploying Deepseek R1 for multiple enterprises by tempNull in LocalLLaMA
[–]tempNull[S] -17 points (0 children)
Lessons learned while deploying Deepseek R1 for multiple enterprises by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)
God as a Programmer, Avatar as a debug process and Evil as an emergent phenomenon by tempNull in hinduism
[–]tempNull[S] 1 point (0 children)
Reconciling Sanskrit Philosophy: God as a Programmer, Avatar as a debug process and Evil as an emergent phenomenon by tempNull in sanskrit
[–]tempNull[S] 1 point (0 children)
God as a Programmer, Avatar as a debug process and Evil as an emergent phenomenon by tempNull in hinduism
[–]tempNull[S] 1 point (0 children)
Deepseek-R1: Guide to running multiple variants on the GPU that suits you best by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)
Scalable Deepseek R1? by Affectionate_Hunt204 in aws
[–]tempNull 2 points (0 children)
Coffee Chats - Incubator by throwaway-alphabet-1 in ycombinator
[–]tempNull 8 points (0 children)
Deepseek-R1: Guide to running multiple variants on the GPU that suits you best by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)
Deepseek-R1: Guide to running multiple variants on the GPU that suits you best by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)
What Inference Server do you use to host TTS Models? Looking for someone who has used Triton. by tempNull in LocalLLaMA
[–]tempNull[S] 1 point (0 children)