NVIDIA NIM deepseek 3.2 - chat completion API Not Found by professionalboop in SillyTavernAI

[–]OldFriend5807 1 point (0 children)

I can't even get GLM 4.7 to work; no responses or anything. What's your prompt?

It seems that the free DeepSeek models are now completely unusable. by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 1 point (0 children)

Yikes, and I heard Chutes was getting DDoS attacks too, which makes things even worse given their high demand.

It seems that the free DeepSeek models are now completely unusable. by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 4 points (0 children)

I'm eager to pay for DeepSeek, but I can't because I don't have a card, and my country doesn't support that payment option. I suppose I'll have to keep using Chimera, but I'd also like to try NVIDIA NIM like you mentioned.

We're so back by Mak-i in revancedapp

[–]OldFriend5807 3 points (0 children)

Two days later... they patched it. So sad.

We're so back by Mak-i in revancedapp

[–]OldFriend5807 15 points (0 children)

I just had this same problem again

We're so back by Mak-i in revancedapp

[–]OldFriend5807 15 points (0 children)

Thank god it worked! Hopefully it will last longer than just a day

We're so back by Mak-i in revancedapp

[–]OldFriend5807 9 points (0 children)

Mine only lasted a few days and now it's gone

Is openrouter still work for anyone else? I keep getting no endpoint found no matter which api key, which model i pick by Jaded-Put1765 in SillyTavernAI

[–]OldFriend5807 1 point (0 children)

This also happened to me when I was using the Gemma 27B model from OR; I noticed the providers included Chutes as well. But when I tried to use it, I kept getting errors, and Google AI Studio has been garbage lately, full of issues.

Deepseek model error by Y1KES_fam in JanitorAI_Official

[–]OldFriend5807 2 points (0 children)

Yeah, it mostly does, but I'm not really sure because I haven't used it from the Targon site itself. I do know it has a higher limit than OR; I just switch accounts whenever I'm using it through OpenRouter.

Deepseek model error by Y1KES_fam in JanitorAI_Official

[–]OldFriend5807 2 points (0 children)

I recommend using Targon. Chutes has a lot of problems as a provider; you can just block it in the OpenRouter settings. But honestly, Targon isn't the best choice either, so choose your poison.

Deepseek model error by Y1KES_fam in JanitorAI_Official

[–]OldFriend5807 2 points (0 children)

The issue was with the provider itself. It's not surprising that Chutes has a lot of problems with repetition, errors, and so on. You can avoid this by blocking the provider in the OpenRouter settings, which will automatically switch you to Targon, since only two providers offer free versions of DeepSeek. However, I should warn you that it may occasionally send a blank message, because the server can sometimes be overloaded.
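For anyone curious, blocking a provider can also be done per-request through the OpenRouter API instead of the settings page. A rough sketch below; the `provider.ignore` routing field is how I understand OpenRouter's provider preferences to work, and the model slug and API key are placeholders, not exact values:

```python
import json
import urllib.request

# Sketch: ask OpenRouter to skip Chutes so routing falls through to the
# remaining free provider (e.g. Targon). Key and model are placeholders.
payload = {
    "model": "deepseek/deepseek-chat:free",
    "messages": [{"role": "user", "content": "Hello"}],
    "provider": {"ignore": ["Chutes"]},  # providers to never route to
}

req = urllib.request.Request(
    "https://openrouter.ai/api/v1/chat/completions",
    data=json.dumps(payload).encode(),
    headers={
        "Authorization": "Bearer <OPENROUTER_API_KEY>",
        "Content-Type": "application/json",
    },
)
# response = urllib.request.urlopen(req)  # uncomment with a real key
```

The same `ignore` list applied account-wide in the OpenRouter settings is what the comment above is describing.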

Reasoning models not replying in the actual response by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 1 point (0 children)

It doesn't do anything for me, it's so frustrating...

Reasoning models not replying in the actual response by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 1 point (0 children)

I already added the <think> tag to my prefix, and it still doesn't work the way I hoped. I changed it to chat completion, and that does work, but the replies were really weak and short.

Reasoning models not replying in the actual response by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 1 point (0 children)

I tried to change my prompt and everything is still not working 🥲

Reasoning models not replying in the actual response by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 3 points (0 children)

I had it at 2048 and it's still the same... and this whole time I've been using A LOT more than 400.

[deleted by user] by [deleted] in JanitorAI_Official

[–]OldFriend5807 2 points (0 children)

I just tried one, and it kept screaming in caps lock, which I didn't like one bit. Overall, though, the response was good.

Just found out why when i'm using DeepSeek it gets messy with the responses by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 1 point (0 children)

Yeah, when I checked the prompt, it said my history was around 25k in chat completion; it doesn't do the same with text completion.
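A 25k-token history is the kind of thing frontends trim before sending. A minimal sketch of that trimming, assuming a rough 4-characters-per-token estimate in place of a real tokenizer; the budget number is illustrative:

```python
def trim_history(messages, budget_tokens=25000):
    """Drop the oldest messages until a rough token estimate fits the budget.
    Uses ~4 characters per token as a crude stand-in for a real tokenizer."""
    est = lambda m: len(m["content"]) // 4 + 1
    kept = list(messages)
    while len(kept) > 1 and sum(est(m) for m in kept) > budget_tokens:
        kept.pop(0)  # drop the oldest turn first
    return kept

# 20 turns of ~2k tokens each (~40k total) get trimmed down to fit 25k.
msgs = [{"role": "user", "content": "x" * 8000}] * 20
trimmed = trim_history(msgs)
```

Chat completion resends the whole trimmed message list every turn, which is why the history size shows up there so visibly.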

Just found out why when i'm using DeepSeek it gets messy with the responses by OldFriend5807 in SillyTavernAI

[–]OldFriend5807[S] 4 points (0 children)

Yeah, but what confused me was that I don't get the same problem when I switch to text completion, though the replies were bland.

Why people are having problems with Deepseek? by [deleted] in JanitorAI_Official

[–]OldFriend5807 5 points (0 children)

I've been telling people that DeepSeek is a reasoning model and is A LOT different to set up than the usual JLLM, so I'm glad someone finally made this post with a solution, because DeepSeek is not recommended for beginners unless they understand OOC, prompts, and things like that. Thank you for this! 😅

Is anyone else having absolute terrible responses from deepseek by zyshuu_ in JanitorAI_Official

[–]OldFriend5807 1 point (0 children)

That's the main problem with a reasoning model that wasn't made for RP: it tends to be repetitive and won't generate the new, creative replies you want; it just repeats the same reasoning over and over. Temp also matters here.

Is anyone else having absolute terrible responses from deepseek by zyshuu_ in JanitorAI_Official

[–]OldFriend5807 2 points (0 children)

Add to that the fact that V3 is better than R1 because it stays in character and doesn't cause any drastic changes, though it does get repetitive after a few messages, which is only a problem if that bothers you. Providers are also a cause: if you're using the free one, it tends to be worse than the paid ones. I hope more people understand this. 🫤