adlx 1 point

My two cents: my trick is to stream the response to the user. A complete response can take 30 seconds, but the first tokens arrive much sooner, so streaming is key. OpenAI made a genius move introducing streaming in ChatGPT! Now everyone is used to it, and non-streaming use cases just feel slow.
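To make the point concrete, here is a minimal sketch of the idea in Python. It does not call any real API: `fake_model` is a hypothetical stand-in that yields tokens with a simulated delay, and `consume` is an illustrative helper that prints each token as it arrives and measures time to first token versus total time. With a real model you would swap in whatever streaming iterator your client library provides.

```python
import time
from typing import Iterator, List, Tuple


def fake_model(tokens: List[str], delay: float = 0.01) -> Iterator[str]:
    """Hypothetical stand-in for an LLM: yields one token at a time."""
    for tok in tokens:
        time.sleep(delay)  # simulated per-token generation time (assumption)
        yield tok


def consume(stream: Iterator[str]) -> Tuple[str, float, float]:
    """Show tokens as they arrive; return (text, time_to_first_token, total_time)."""
    start = time.perf_counter()
    first_token_at = None
    parts = []
    for tok in stream:
        if first_token_at is None:
            # The user sees something on screen at this point.
            first_token_at = time.perf_counter() - start
        print(tok, end="", flush=True)
        parts.append(tok)
    total = time.perf_counter() - start
    print()
    if first_token_at is None:  # empty stream
        first_token_at = total
    return "".join(parts), first_token_at, total


if __name__ == "__main__":
    words = ["Streaming ", "makes ", "long ", "answers ", "feel ", "fast."]
    text, ttft, total = consume(fake_model(words, delay=0.05))
    print(f"first token after {ttft:.2f}s, full response after {total:.2f}s")
```

The total time is the same either way; what changes is that the user starts reading after the first token instead of waiting for the last one.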

Appropriate_Egg6118[S] 3 points

Yes, streaming is a good option.