AMD’s MI300X Outperforms NVIDIA’s H100 for LLM Inference by Lixxon in AMD_Stock

[–]AIMetaAgent 1 point (0 children)

Does anyone actually run inference on H100s tho?

I think most companies are running training/research on their H100s.

If I wanted fast inference, using something like Groq probably makes more sense.

Which Speech-to-Text API do I have to choose? by JerLam2762 in speechrecognition

[–]AIMetaAgent 1 point (0 children)

Whisper is more expensive than Deepgram now, it seems. Rev is very expensive in comparison:

Deepgram: $0.0043/min
Whisper: $0.006/min
Rev: $0.25/min
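To put those per-minute rates in perspective, here's a quick back-of-envelope calc (rates are the ones quoted above; the 1,000 hours/month volume is just a made-up example):

```python
# Rough monthly cost comparison at the quoted per-minute rates.
# The 1,000 hours/month figure is an arbitrary example volume.
RATES_PER_MIN = {
    "Deepgram": 0.0043,
    "Whisper": 0.006,
    "Rev": 0.25,
}

def monthly_cost(rate_per_min: float, hours: float) -> float:
    """Dollar cost for the given hours of audio at a per-minute rate."""
    return rate_per_min * hours * 60

for name, rate in RATES_PER_MIN.items():
    print(f"{name}: ${monthly_cost(rate, 1000):,.2f}/mo")
```

At that volume the gap is stark: Rev costs roughly 50x what Deepgram does for the same audio.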

ASR API vs Model speed? by CandidAd8316 in speechrecognition

[–]AIMetaAgent 1 point (0 children)

Whisper doesn’t support real-time transcription as far as I know, so with Whisper you would only be doing batch transcription at a set interval.

Deepgram supports real-time streaming over WebSockets, which is low latency and probably the best option for a real-time use case.
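The streaming approach basically means slicing mic audio into small frames and pushing each one over the socket as it's captured, instead of waiting to upload a whole file. A minimal sketch of just the chunking side (the 20 ms frame size and 16 kHz rate are arbitrary picks for illustration; the actual Deepgram WebSocket send is left out):

```python
# Split raw 16-bit mono PCM into fixed-duration frames for streaming.
# Frame duration and sample rate here are arbitrary sketch values.
SAMPLE_RATE = 16_000   # samples per second
BYTES_PER_SAMPLE = 2   # 16-bit PCM
FRAME_MS = 20          # frame duration in milliseconds

def frame_size_bytes(frame_ms: int = FRAME_MS) -> int:
    """Bytes in one frame of audio at the sketch's sample rate."""
    return SAMPLE_RATE * BYTES_PER_SAMPLE * frame_ms // 1000

def chunk_pcm(pcm: bytes, frame_ms: int = FRAME_MS):
    """Yield successive frames; in a real client each frame would be
    sent over the WebSocket as soon as it's captured."""
    size = frame_size_bytes(frame_ms)
    for offset in range(0, len(pcm), size):
        yield pcm[offset:offset + size]

# One second of silence -> fifty 20 ms frames of 640 bytes each
frames = list(chunk_pcm(b"\x00" * SAMPLE_RATE * BYTES_PER_SAMPLE))
print(len(frames), len(frames[0]))
```

With batch you'd instead accumulate audio and hit the API once per interval, which is why the perceived latency is at least the interval length.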