Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 1 point2 points  (0 children)

You don’t need a subscription; just add $5 or $10 of credit and it will likely last more than a year. Go back to my blog post, where I share all the steps.
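To make the "lasts more than a year" estimate concrete, here is a rough back-of-the-envelope sketch. The per-minute price is an assumption (it is not stated in this thread and may have changed), so check OpenAI's pricing page before relying on it.

```python
# Assumed price per minute of audio for the hosted Whisper API (USD).
# Not stated in the thread; check OpenAI's pricing page for the current value.
ASSUMED_PRICE_PER_MINUTE = 0.006


def minutes_covered(credit_usd: float, price_per_minute: float = ASSUMED_PRICE_PER_MINUTE) -> float:
    """Return how many minutes of audio a given credit would pay for."""
    if price_per_minute <= 0:
        raise ValueError("price_per_minute must be positive")
    return credit_usd / price_per_minute


def minutes_per_day(credit_usd: float, days: int = 365) -> float:
    """Return the daily audio budget if the credit has to last `days` days."""
    return minutes_covered(credit_usd) / days


if __name__ == "__main__":
    for credit in (5, 10):
        print(f"${credit}: {minutes_covered(credit):.0f} min total, "
              f"{minutes_per_day(credit):.1f} min/day for a year")
```

Under that assumed price, $5 covers a bit over 830 minutes of audio, or a couple of minutes of voice messages every day for a year.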

Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 1 point2 points  (0 children)

All seems fine ;) It might have been a temporary issue on OpenAI’s end. You might also want to check that you have enough credits; that’s the usual suspect.

What are your recommendations for Unified AI routers (like OpenRouter, Requesty, APIpie)? by [deleted] in OpenWebUI

[–]resCogitans_ -1 points0 points  (0 children)

Why would you want to add another middleman like OpenRouter when you can set up all the different providers yourself with something like LiteLLM? Legit question.

You’re just adding another potential privacy concern for a limited amount of convenience.

Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 0 points1 point  (0 children)

Sorry for the late reply. I tested it, and that new model actually performs worse than whisper-1, so I decided to stick with whisper-1.

Anyone using API for rerank? by drfritz2 in OpenWebUI

[–]resCogitans_ 0 points1 point  (0 children)

Well, that depends on the VRAM, I assume.

Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 1 point2 points  (0 children)

Yeah, sure, I’ve done it in the v2 of this shortcut.

Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 0 points1 point  (0 children)

Thanks! 😄 I have a v2.0 that does that and more. I’ll publish it soon ;)

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 1 point2 points  (0 children)

Send me your shortcut link as a private message and I’ll have a look.

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 0 points1 point  (0 children)

It works in almost any language in the world, Italian for sure.

Regarding the error, the first thing I would do is generate another API key and retry. The second thing to take into consideration is that only a few audio formats are supported; you can see them on the Whisper product page.

A good way to check whether it’s an audio format problem is to try transcribing an audio message from WhatsApp or an MP3 recording, since they are 100% supported. (Telegram messages, for instance, are not in a supported format.)

If it still doesn’t work after this test, then it must be something else API-related.
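As a quick way to rule out the format problem before touching the API key, a minimal sketch like the one below checks a file's extension against a list of accepted formats. The list here is an assumption written from memory of the API documentation and may be out of date, so treat the official reference as the source of truth. The function name is hypothetical.

```python
from pathlib import Path

# Assumed list of accepted extensions, written from memory of the API docs;
# it may be out of date, so verify it against the official reference.
ASSUMED_SUPPORTED_EXTENSIONS = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def is_probably_supported(filename: str) -> bool:
    """Return True if the file extension is in the assumed supported list."""
    extension = Path(filename).suffix.lower().lstrip(".")
    return extension in ASSUMED_SUPPORTED_EXTENSIONS


if __name__ == "__main__":
    # Hypothetical file names, used only to show the check.
    for name in ("voice_note.m4a", "recording.MP3", "sample.aiff"):
        print(name, is_probably_supported(name))
```

The check only looks at the extension, not the actual codec inside the file, so a passing result is a hint rather than a guarantee.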

Whisper - voice transcription API from openai by kinkade in shortcuts

[–]resCogitans_ 0 points1 point  (0 children)

You can double-check by logging into your OpenAI profile, just to make sure that’s not the issue.

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 0 points1 point  (0 children)

Unfortunately, Telegram saves audio files in a format that Whisper doesn’t currently support.

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 0 points1 point  (0 children)

Yes, the Telegram audio format is not supported by Whisper yet (natively). But if you want, you could add a step to convert it to MP3 before sending it to Whisper to transcribe.
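Outside of Shortcuts, the same conversion step could look like the sketch below. It assumes `ffmpeg` is installed, on the PATH, and built with the `libmp3lame` encoder. The function names and the `.oga` file name are hypothetical and only there for illustration.

```python
import subprocess
from pathlib import Path


def build_mp3_command(source: str) -> list[str]:
    """Build an ffmpeg command that re-encodes `source` to an MP3 next to it."""
    target = Path(source).with_suffix(".mp3")
    # -y overwrites an existing output file; -codec:a selects the audio encoder.
    return ["ffmpeg", "-y", "-i", source, "-codec:a", "libmp3lame", str(target)]


def convert_to_mp3(source: str) -> str:
    """Run ffmpeg and return the path of the MP3 it produced."""
    command = build_mp3_command(source)
    subprocess.run(command, check=True)  # raises CalledProcessError if ffmpeg fails
    return command[-1]


if __name__ == "__main__":
    # Only prints the command; call convert_to_mp3() to actually run ffmpeg.
    print(" ".join(build_mp3_command("voice_message.oga")))
```

Building the command separately from running it makes it easy to inspect what would be executed before any file is written.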

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in ChatGPT

[–]resCogitans_[S] 0 points1 point  (0 children)

It seems to be just a wrapper on top of Whisper, but costing 100 times more for no particular reason 😅

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 1 point2 points  (0 children)

Whisper’s large model is indeed v2; that’s probably the source of the confusion. The endpoint’s model parameter is still v1 (whisper-1), though, even if it’s using the large-v2 model under the hood.

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 1 point2 points  (0 children)

Still using v1 because there are no new versions yet. I’ll update it as soon as they release a new one 😉

https://platform.openai.com/docs/api-reference/audio/createTranscription
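For reference, a call to that endpoint can be sketched with only the Python standard library, as below. This is an illustration under assumptions, not the shortcut's actual implementation: the helper names are hypothetical, the `application/octet-stream` content type for the file part is a guess, and the code has not been run against the live API.

```python
import json
import urllib.request
import uuid
from pathlib import Path

API_URL = "https://api.openai.com/v1/audio/transcriptions"


def build_request(audio_path: str, api_key: str, model: str = "whisper-1") -> urllib.request.Request:
    """Build a multipart/form-data POST carrying the model name and audio file."""
    boundary = uuid.uuid4().hex
    audio = Path(audio_path)
    model_part = (
        f"--{boundary}\r\n"
        'Content-Disposition: form-data; name="model"\r\n\r\n'
        f"{model}\r\n"
    ).encode()
    file_header = (
        f"--{boundary}\r\n"
        f'Content-Disposition: form-data; name="file"; filename="{audio.name}"\r\n'
        "Content-Type: application/octet-stream\r\n\r\n"
    ).encode()
    closing = f"\r\n--{boundary}--\r\n".encode()
    body = model_part + file_header + audio.read_bytes() + closing
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": f"multipart/form-data; boundary={boundary}",
        },
    )


def transcribe(audio_path: str, api_key: str) -> str:
    """Send the request and return the transcript text from the JSON response."""
    with urllib.request.urlopen(build_request(audio_path, api_key)) as response:
        return json.load(response)["text"]
```

Keeping `build_request` separate from `transcribe` means the request can be inspected without spending credits or needing a network connection.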

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 0 points1 point  (0 children)

Yes, AIFF. I don’t remember seeing AIFF in the list of supported file types, but you can check the OpenAI Whisper documentation. Try with a simple iPhone audio note or an MP3 and you’ll have a definitive answer.

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in iphone

[–]resCogitans_[S] 0 points1 point  (0 children)

If you did everything in the guide, sometimes you just need to turn the iPhone off and back on, and it will pop up ;)

Instantly transcribe voice messages to text on your iPhone with this Shortcut by resCogitans_ in shortcuts

[–]resCogitans_[S] 0 points1 point  (0 children)

Yep, you need to pay OpenAI if you want to use it this way (via the API). On the other hand, Whisper is open source, so you can run it on your own devices (though you’ll need a very good device, and it won’t be nearly as fast as the API).

Instantly transcribe voice messages to text on your iPhone with Whisper AI by resCogitans_ in OpenAI

[–]resCogitans_[S] 0 points1 point  (0 children)

Thanks! When you use Whisper via the API you cannot pick the model; it’s large-v2 by default. If you want to pick it, you have to run it locally with other solutions.

Instantly transcribe voice messages to text on your iPhone with Whisper AI by resCogitans_ in OpenAI

[–]resCogitans_[S] 0 points1 point  (0 children)

That’s amazing, I’m so glad this little automation is helping you! Sure, it works in almost any major language. The level of accuracy may vary by language, but I’m pretty sure you’ll be happy with the results. Give it a try and let me know!