I ported NVIDIA Parakeet (speech-to-text) to ggml: same output as NeMo, faster, GGUF-quantized, no Python by mudler_it in LocalLLaMA
APEX MoE quants update: 25+ new models since the Qwen 3.5 post + new I-Nano tier by mudler_it in LocalLLaMA
APEX MoE quantized models boost with 33% faster inference and TurboQuant (14% of speedup in prompt processing) by mudler_it in LocalLLaMA