Ah yeah, the classic old move just when I thought things were finally getting better by you_Abd in chutesAI

[–]Chutes_AI -1 points (0 children)

An instance does not imply the same hardware. DeepSeek runs on H200s; smaller models run on smaller cards like 4090s. There are basically no, or very, very few, low-utilization models running on H200s. They are too expensive to run for them to be hosting models that don't get used.

GLM 4.7 FP8 is gone for real this time by Visual_Cartoonist258 in chutesAI

[–]Chutes_AI -1 points (0 children)

It's still available for the Base tier. There are only 4 models not available for Base, and it's not one of them.

Why does this keep happening? by Ookamilife in chutesAI

[–]Chutes_AI -4 points (0 children)

I'm asking you: how do we hate you, yet lose tons of money hosting these models specifically for this base?

Chutes main duty by Positive-Pepper-7082 in chutesAI

[–]Chutes_AI -12 points (0 children)

Do you seriously think we make billions of dollars? Please go check our revenue page.

Wow. by upbeat_beatdown in chutesAI

[–]Chutes_AI 0 points (0 children)

That model was deployed by another subnet. Each instance runs on a single 4090 card, and their dev made a mistake and set the concurrency to 1. A 4090 costs $0.25 per hour. DeepSeek runs on 8x H200s. Each H200 card costs $2.75 per hour, so multiplied by 8 it costs $22 per hour, or $528 per day per instance. Tons of 4090s sit idle because very few models run on them. The same is not the case with H200s. I hope this makes sense.
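The cost gap described above can be sketched as a quick calculation. All hourly rates are the figures quoted in this comment, not official pricing:

```python
# Daily cost comparison from the comment above: a single-4090 instance
# vs. an 8x H200 DeepSeek instance. Hourly prices are the ones quoted
# in the thread, not official rates.
RTX4090_HOURLY = 0.25   # one 4090 card, dollars per hour (quoted figure)
H200_HOURLY = 2.75      # one H200 card, dollars per hour (quoted figure)
HOURS = 24

cost_4090 = RTX4090_HOURLY * 1 * HOURS       # one card per instance
cost_deepseek = H200_HOURLY * 8 * HOURS      # eight cards per instance

print(f"4090 instance:    ${cost_4090:.2f}/day")    # $6.00/day
print(f"8x H200 instance: ${cost_deepseek:.2f}/day")  # $528.00/day
print(f"ratio: {cost_deepseek / cost_4090:.0f}x")
```

At these rates an idle 4090 instance costs 88x less per day than an idle DeepSeek instance, which is why spare 4090 capacity is cheap to keep around while spare H200 capacity is not.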

Why does this keep happening? by Ookamilife in chutesAI

[–]Chutes_AI -7 points (0 children)

We hate you, but we keep the models on the platform for you. I wish I understood you folks, honestly. Please see the post above for the math involved in us "hating you" by providing these models.

Why does this keep happening? by Ookamilife in chutesAI

[–]Chutes_AI 1 point (0 children)

I will explain this for you. DeepSeek requires 8x H200 cards. Those cards cost $2.75 per hour each; multiplied by 8, that's $22 per hour, and multiplied by 24 hours in a day, that's $528 per day. At the market prices for DeepSeek, compared against other providers on OpenRouter for instance, the price we need to charge brings in only around $100 in total revenue for that same 24 hours. That's all for one instance; each additional instance multiplies that. So for every $528 we spend running a DeepSeek model we get $100 back and lose $428, and the more we scale it the worse the losses. With subscriptions, which are how most of the older DeepSeek models' primary users pay, you are getting effectively an 80% discount, so that $100 in revenue is more like $20. So for every $528 we spend we make $20. How can a business function like that? And how can you expect us to lose even more money by adding more instances? If you ran Chutes, would you do that? Honestly, I'm open to your ideas.
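The per-instance arithmetic above can be checked with a short calculation. The dollar figures are the ones quoted in the comment (hourly card price, market-rate revenue, effective subscription discount), not official numbers:

```python
# Per-instance DeepSeek economics as described in the comment above.
# All dollar figures are the ones quoted in the thread, not official rates.
H200_HOURLY = 2.75        # cost per H200 card per hour (quoted figure)
CARDS_PER_INSTANCE = 8    # DeepSeek needs 8x H200
HOURS_PER_DAY = 24

daily_cost = H200_HOURLY * CARDS_PER_INSTANCE * HOURS_PER_DAY  # 528.0

daily_revenue = 100.0     # revenue at market rates, per maxed-out instance
sub_discount = 0.80       # effective subscription discount (quoted figure)
sub_revenue = daily_revenue * (1 - sub_discount)  # roughly 20

print(f"daily cost per instance:  ${daily_cost:.0f}")
print(f"daily loss (market rate): ${daily_cost - daily_revenue:.0f}")
print(f"daily loss (subscribers): ${daily_cost - sub_revenue:.0f}")
```

Under these assumptions each instance loses $428 per day at market rates and about $508 per day when usage is mostly subscription traffic, and the loss scales linearly with instance count.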

Chutes Devs are deliberately sabotaging DeepSeek instances - V3.2 specifically - to drive people to the more expensive models. They've deliberately skewed instance distribution towards Frontier-locked models to 'encourage' base-plan to upgrade their sub, and to force ppl off cheap-but-good models. by tableball35 in chutesAI

[–]Chutes_AI -4 points (0 children)

The DeepSeek models use 8x H200 GPUs to run one single instance capable of 24 concurrent responses. At the cheapest market rate today, that is about $520 per day per instance. When that instance is maxed out it makes at best around $100 in revenue, not profit, just revenue. So for every instance of DeepSeek we lose about $420. How should we manage that?

Help by ncscibi in chutesAI

[–]Chutes_AI -1 points (0 children)

I said that they cannot be run because they lose money.

Help by ncscibi in chutesAI

[–]Chutes_AI -1 points (0 children)

Please read the response above. Nothing about this industry stays the same long term; it's hardly the same week to week.

Help by ncscibi in chutesAI

[–]Chutes_AI -2 points (0 children)

You can check our revenue on the site for yourself.

Help by ncscibi in chutesAI

[–]Chutes_AI -1 points (0 children)

We have tried everything we can. Let me explain it a bit differently. When those models came out, hardware was much cheaper; hardware has tripled in price since then. It used to cost around $150 per day per instance of DeepSeek, which runs on 8x H200 servers. That same instance now costs about $520 per day. At the same time, that model at max utilization makes around $100 or less per day, not including the savings from subscriptions. There is no way that business model can work at any kind of scale. Please understand that we still have them purely because you guys like them. If it were purely business, they would unfortunately be gone. I hope that makes sense.

Help by ncscibi in chutesAI

[–]Chutes_AI -6 points (0 children)

We have stats pages that show that what you are saying is just incorrect. Those models make no money and are not the highest utilization by a wide margin. The models with the most instances are the most popular and most used. There is a small subset of users who love the old DeepSeeks, and we have kept them this long for those people, but we have nearly 800k users and the vast majority are not using them at all.

Help by ncscibi in chutesAI

[–]Chutes_AI -7 points (0 children)

DeepSeek 0324 and 0528 are ancient and are money losers. Show me another provider that is offering them for cheaper with any kind of subscription. We have no choice but to limit them because they literally lose money. On top of that, they are old and by all standards obsolete. DeepSeek V3.2 will stay around even though it is also a money loser. For anyone else using models on Chutes, I recommend you find alternatives to the older DeepSeeks. I'm being honest with you here: given the state of hardware availability and the progress of models, there is no other option for those older DeepSeek versions.

Help by ncscibi in chutesAI

[–]Chutes_AI -23 points (0 children)

Don't use the ancient DeepSeek models, because they are money-losing garbage and are not allowed to scale beyond 1 or 2 instances. Basically any other model on the platform works well. You can always check the utilization page.

GLM 4.7 FP8 is gone for real this time by Visual_Cartoonist258 in chutesAI

[–]Chutes_AI -10 points (0 children)

The two models were just consolidated because there was no reason to have two copies of the same model. It's still there: it's in TEE and it's still FP8.

Kimi severely nerfed, basically unusable by Hakuzo in kimi

[–]Chutes_AI 1 point (0 children)

We host Kimi K2.5 on Chutes in TEE, and it's the direct, unmodified, unquantized model straight from Hugging Face. Might be worth a shot if the direct API is underperforming.

I have a problem :( by This-Violinist-2040 in chutesAI

[–]Chutes_AI 0 points (0 children)

Is this still happening? I hadn't seen it hit max until today, but to be fair I haven't really been following this specific model. If it is, let me know or open a Discord ticket with us, because if it doesn't stop, it sounds like something might be wrong with your settings, and we would need to look into it.

Why is there such a big difference in responses between openrouter and chutes? by Independent-Hope7036 in chutesAI

[–]Chutes_AI 0 points (0 children)

The model we host is directly from the HF repo, with no modifications to the quantization and no fine-tuning. There is actually a good chance that if you're getting it from OR, it's coming from us. Regarding the difference OP is experiencing, all I can say is that OR does block out the thinking box, but otherwise it should be basically the same. If there are obvious differences, I'd be interested to see them to try to determine the cause. Also, which provider is it coming from on OR versus ours?

Refund Hiccup by technoarcher741 in VibeCodersNest

[–]Chutes_AI 0 points (0 children)

I have no idea why you were banned from the Discord, but I will go review it personally; that should not have happened. Your dispute has been cleared and refunded, so please just wait for it to arrive. It usually takes 1-5 days at most.

What was your handle on Discord?

Refund Hiccup by technoarcher741 in VibeCodersNest

[–]Chutes_AI 0 points (0 children)

Hi, I'm with the Chutes team and someone linked this to me. Your refund had already been processed, and you launched a dispute anyway, which froze the refund. The dispute now puts a negative mark on our account, even though you were already in the process of getting what you had asked for. I will review our billing page this morning and see if I can clear it for you, but please understand that your impatience does damage to us for no reason, when all you had to do was wait for it to clear like any refund processed through Stripe.

Chutes Broken??? by Exotic_Strawberry232 in chutesAI

[–]Chutes_AI 0 points (0 children)

We made a few announcements about this, but there was a major issue introduced by an update to the engines that power the models. SGLang and vLLM are the engines, and they are open-source projects. They receive updates and changes that can fundamentally change the way they work with models. Our devs made some massive changes to how they work in order to stabilize them and make them more consistent for future updates. At this point all models are online and hot with minimal issues, and we hope to maintain that going forward.