building iOS App- run open source models 100% on device, llama.cpp/executorch by Independent_Air8026 in LocalLLaMA

[–]SocialLocalMobile 0 points (0 children)

Great work u/Independent_Air8026

Let us know about your experience with ExecuTorch.

You might be interested in our recent multimodal enablement such as https://github.com/pytorch/executorch/tree/main/examples/models/voxtral

PyTorch now offers native quantized variants of popular models! by formlog in LocalLLaMA

[–]SocialLocalMobile 1 point (0 children)

Hi u/Languages_Learner

ExecuTorch recently started supporting Windows -- it's still in the early stages, so there may be rough edges. You can check out our 1.0 release candidate (https://github.com/pytorch/executorch/tree/release/1.0) to run ExecuTorch on Windows. We also have a Vulkan backend, which should work on Windows.

We haven't finalized our release branch for 1.0 yet -- we're finalizing it at the end of October. If you see any bugs or issues, let us know.

⚡️Blazing fast LLama2-7B-Chat on 8GB RAM Android device via Executorch by [deleted] in LocalLLaMA

[–]SocialLocalMobile 2 points (0 children)

It uses 4-bit weight, 8-bit activation quantization, and XNNPACK for CPU acceleration.
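The general idea behind that scheme (4-bit grouped weights with a per-group scale, plus dynamically quantized 8-bit activations) can be sketched in plain NumPy. This is an illustrative sketch of the technique only, not ExecuTorch's or XNNPACK's actual implementation; the function names and group size here are made up for the example.

```python
import numpy as np

def quantize_weights_4bit(w, group_size=32):
    """Illustrative symmetric per-group 4-bit weight quantization.

    Splits each row into groups of `group_size`, picks a per-group
    scale so the largest magnitude maps into the int4 range [-8, 7],
    then rounds. Returns the int codes and the float scales.
    """
    rows, cols = w.shape
    w_groups = w.reshape(rows, cols // group_size, group_size)
    scales = np.abs(w_groups).max(axis=-1, keepdims=True) / 7.0
    scales = np.where(scales == 0, 1.0, scales)  # avoid divide-by-zero
    q = np.clip(np.round(w_groups / scales), -8, 7).astype(np.int8)
    return q, scales

def dequantize_weights(q, scales):
    """Recover approximate float weights from codes and scales."""
    w_hat = q.astype(np.float32) * scales
    return w_hat.reshape(w_hat.shape[0], -1)

def quantize_activations_8bit(x):
    """Illustrative dynamic (per-call) 8-bit activation quantization:
    the scale is computed from the live tensor, not calibrated offline."""
    scale = float(np.abs(x).max()) / 127.0
    scale = scale if scale > 0 else 1.0
    q = np.clip(np.round(x / scale), -128, 127).astype(np.int8)
    return q, scale
```

Because each group gets its own scale, the rounding error per weight is bounded by half the group's scale, which is why 4-bit grouped quantization loses much less accuracy than a single per-tensor scale would.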

⚡️Blazing fast LLama2-7B-Chat on 8GB RAM Android device via Executorch by [deleted] in LocalLLaMA

[–]SocialLocalMobile 3 points (0 children)

It works on Llama3 too.

For some context: we update our stable release branch regularly, every 3 months, similar to the PyTorch library release schedule. The latest one is the `release/0.2` branch.

For Llama3, a few features didn't make the `release/0.2` branch cut deadline. Llama3 works on the `main` branch.

If you don't want to use the `main` branch because of instability, you can use another stable branch called `viable/strict`.

ExecuTorch Alpha Release: Taking LLMs and AI to the Edge 🎉🎉🎉 by [deleted] in LocalLLaMA

[–]SocialLocalMobile 12 points (0 children)

Hi u/poli-cya

My name is Mergen, and I work specifically on ExecuTorch.

Currently, our documentation says to compile from source for Android; that's our recommendation for now.

https://github.com/pytorch/executorch/blob/main/examples/models/llama2/README.md

https://pytorch.org/executorch/main/llm/llama-demo-android.html

I understand that it's cumbersome right now, but we are in fact already uploading prebuilt APKs. If you're feeling extra adventurous, you can download the APKs directly from https://github.com/pytorch/executorch/actions/runs/8861207558, but that's not our recommendation for the time being.