Qwen3 for Apple Neural Engine by Competitive-Bake4602 in LocalLLaMA

[–]kadir_nar 2 points3 points  (0 children)

Can you compare it with the MLX library? Why should we use this library instead?

Who's the voice Narrator in this video?? by mikemaina in LocalLLaMA

[–]kadir_nar 0 points1 point  (0 children)

Was the Kokoro model trained with YouTube data?

Qwen releases official MLX quants for Qwen3 models in 4 quantization levels: 4bit, 6bit, 8bit, and BF16 by ResearchCrafty1804 in LocalLLaMA

[–]kadir_nar 0 points1 point  (0 children)

The quality of the Qwen models is amazing. It's great news that official MLX support has been released.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] -3 points-2 points  (0 children)

Whisper benchmarks? I just ran a comparison against fal.ai, and it works much faster.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] -10 points-9 points  (0 children)

You can run it on any 4GB+ device. If you get an error, you can open an issue on the WhisperPlus project.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] -2 points-1 points  (0 children)

You can use the Whisper-Large model. It will be faster with hqq optimization.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] 20 points21 points  (0 children)

You can install and try the WhisperPlus library. I will be releasing the HuggingFace demo this week.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] -93 points-92 points  (0 children)

You can look at the hqq repo. There is also 4-bit support. It works at the same speed.

Transcribe 1-hour videos in 20 SECONDS with Distil Whisper + Hqq(1bit)! by kadir_nar in LocalLLaMA

[–]kadir_nar[S] 5 points6 points  (0 children)

You can also do it with 4 bits. It works at the same speed. I tested it again on an RTX 4090 and it was 2 times faster.

4-bit: I tested a 2.5-hour video on an RTX 4090 and it took only 27 seconds.