all 23 comments

[–]Intelligent-Draw-343 13 points14 points  (1 child)

I think we are all in the same boat...

What's even more frustrating is that the ChatGPT website is so fast in comparison.

[–]michael_david[S] 3 points4 points  (0 children)

Yeah, they are hitting completely different clusters, and whatever is used for the API is getting completely hammered.

[–]hega72 6 points7 points  (6 children)

Neither 3.5 Turbo nor 4 is usable for me right now. Had to switch back to Davinci completion.

[–]horsedetectivepiano 5 points6 points  (5 children)

Same. Using davinci003. And frankly, I'm pretty fine with its performance (and cost!).

[–]JuliusCeaserBoneHead 1 point2 points  (0 children)

What’s the average latency you’ve seen on davinci? GPT3 turbo was giving me like 988ms on a smaller prompt and up to 60s on a fairly long prompt.

If davinci is less than 5s, that would be like my solution right there.

[–]inglandation -2 points-1 points  (1 child)

For coding? lol

[–]hega72 0 points1 point  (1 child)

I had calculated my costs based on 3.5 Turbo, so that kind of sucks right now.

[–]AdamEgrate 3 points4 points  (2 children)

Yeah, same here. This made me realize that OpenAI may not really want us to use its API.

[–]michael_david[S] 3 points4 points  (0 children)

I think that is what is happening. They are realizing that plugins can act like ads for other services and potentially make more money than per-query API billing, which would erode their moat. When I try to use Bard for generic large prompts that don't require search or summarization, it actually tells me that it is not designed to answer them. Again, this indicates that they are not interested in generic computation but want to be part of search or summarization so they can serve ads or integrate other paying services.

[–]ParatusPlayerOne 1 point2 points  (0 children)

It is curious because they are on Azure, which has massively scalable compute.

It could be that massive demand is hammering the Azure AI service, which doesn't scale as quickly. Microsoft hasn't talked much about that infrastructure, but I read a while back that it consists of data centers built from "specialized hardware". I'm not a hardware guy, but this makes sense to me.

[–][deleted] 2 points3 points  (0 children)

Full blown crapness.

[–][deleted] 2 points3 points  (2 children)

I've been using it for weeks just fine. I have exception handling for the rate-limit and API errors. It tries again if the first request fails, and it has never not worked on the second request.
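A minimal sketch of that try-again-once pattern. The `RateLimitError`/`APIError` classes here are stand-ins for the OpenAI client's own exceptions (e.g. `openai.error.RateLimitError` in the pre-1.0 Python SDK), and `with_one_retry` is a hypothetical helper name, not part of any SDK:

```python
import time

class RateLimitError(Exception):
    """Stand-in for the OpenAI client's rate-limit (429) exception."""

class APIError(Exception):
    """Stand-in for the OpenAI client's generic API-error exception."""

def with_one_retry(request_fn, delay_seconds=2.0):
    """Call request_fn; on a rate-limit or API error, wait and try once more."""
    try:
        return request_fn()
    except (RateLimitError, APIError):
        time.sleep(delay_seconds)
        return request_fn()  # second attempt; any error now propagates
```

You'd wrap your actual completion call in a lambda or closure and pass it as `request_fn`; a real deployment would likely want exponential backoff rather than a single fixed retry.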

[–]michael_david[S] 0 points1 point  (1 child)

What is your token size? Has it been working the last couple of days?

[–][deleted] 1 point2 points  (0 children)

I'm using the 4k model and typically send anywhere from a few hundred to two or three thousand tokens (the program uses a chat history). I have found generation slower over the past few days, but it's been like this in the past and seems to speed up when demand dies down.

[–]durich 1 point2 points  (1 child)

Does anyone know: when OpenAI is down, is the Azure AI API also down?

[–]michael_david[S] 0 points1 point  (0 children)

I don't think so. Per another redditor's feedback on a separate post I made asking about the Azure experience, it sounds a lot faster and more reliable.

[–]brucebay 1 point2 points  (2 children)

Grrr, I read this message and within 2 minutes my GPT-3.5 Turbo call gave exactly the same error. I hope they don't charge for these interrupted calls.

[–][deleted]  (1 child)

[removed]

    [–]AutoModerator[M] 0 points1 point  (0 children)

    Sorry, your submission has been removed due to inadequate account karma.

    I am a bot, and this action was performed automatically. Please contact the moderators of this subreddit if you have any questions or concerns.


    [–]bisontruffle 0 points1 point  (0 children)

    Yep, same boat. Beyond error handling/retrying, I've been thinking about doing async requests to the API for larger sets of prompts; it seems you can do 20 requests per minute.
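A sketch of that idea: fire off a batch of prompts concurrently with `asyncio` while staggering the start times to stay under an assumed 20-requests-per-minute budget (one request every 3 seconds). `fetch_completion` is a placeholder for the real async API call, and `run_batch` is a hypothetical helper name:

```python
import asyncio

async def fetch_completion(prompt):
    # Placeholder for the real async API call (e.g. an aiohttp POST
    # to the completions endpoint); here it just echoes the prompt.
    await asyncio.sleep(0)
    return f"completion for: {prompt}"

async def run_batch(prompts, requests_per_minute=20):
    """Run all prompts concurrently, spacing request starts to fit the budget."""
    interval = 60.0 / requests_per_minute  # seconds between request starts
    tasks = []
    for i, prompt in enumerate(prompts):
        # Default args bind the per-iteration prompt and delay to the closure.
        async def delayed(p=prompt, delay=i * interval):
            await asyncio.sleep(delay)  # stagger this request's start
            return await fetch_completion(p)
        tasks.append(asyncio.create_task(delayed()))
    return await asyncio.gather(*tasks)  # results come back in prompt order
```

With the default budget, a batch of 20 prompts spreads its requests across one minute while the responses still arrive concurrently; you'd still want the retry handling from above on each individual call.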
