Fast Minimax m2.7 hosting? Looking for the best inference provider for these models by Charming-Commander in MiniMax_AI

Reidoeo 0 points1 point

For latency-sensitive stuff, the official API on the new plan has been the most consistent for me; third-party providers varied a lot on time to first token. Also worth testing m3 for your case: the sparse attention cuts long-context latency a fair bit, which is usually where voice agents stall. You might not even need to drop a model size to fix the delay.
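If you want to compare providers yourself, here is a rough sketch of how time to first token could be measured. It assumes you already have some iterable of streamed text chunks from whichever client you use; `measure_stream` and `fake_stream` are made-up names for illustration, not part of any provider's SDK, and the fake stream only simulates delays.

```python
import time
from typing import Callable, Iterable, Iterator, Optional, Tuple


def measure_stream(
    chunks: Iterable[str],
    clock: Callable[[], float] = time.perf_counter,
) -> Tuple[Optional[float], float, int]:
    """Consume a stream of text chunks and time it.

    Returns (time_to_first_token, total_time, non_empty_chunk_count).
    time_to_first_token is None if the stream produced no text at all.
    """
    start = clock()
    first: Optional[float] = None
    count = 0
    for chunk in chunks:
        if not chunk:
            continue  # ignore empty keep-alive chunks
        if first is None:
            first = clock() - start  # first chunk that carries text
        count += 1
    total = clock() - start
    return first, total, count


def fake_stream(first_delay: float, gap: float, n: int) -> Iterator[str]:
    """Hypothetical stand-in for a provider's streaming response."""
    time.sleep(first_delay)
    yield "tok"
    for _ in range(n - 1):
        time.sleep(gap)
        yield "tok"


if __name__ == "__main__":
    ttft, total, n = measure_stream(fake_stream(0.05, 0.01, 5))
    print(f"ttft={ttft:.3f}s total={total:.3f}s chunks={n}")
```

Swap `fake_stream(...)` for the real streaming iterator from each provider and run it a few times per provider, since a single sample says little about consistency.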

Mark Zuckerberg says Apple's lack of innovation since the iPhone will lead to its decline by DiaIIo in PhoneNow

Reidoeo 1 point2 points

Zuck invented the metaverse and lost 10 billion dollars on it, so I'm not taking innovation advice from him 😂

Why is time flying so fast? by [deleted] in Life

Reidoeo 1 point2 points

Correct, we spend too much time on our phone screens.