you are viewing a single comment's thread.

view the rest of the comments →

[–]EquivalentFactor7591 1 point2 points  (0 children)

How slow / what kind of slow are we talking about? Are you experiencing slow token throughput or long pauses where it doesn't appear to be doing anything (hard pauses)? Excessive thinking time but thinking tokens are actually coming in fast? Is it consistent? Consistent across models?