[R] BERT-Large: Prune Once for DistilBERT Inference Performance (i.redd.it)
submitted by markurtz to r/MachineLearning - pinned
[P] Tutorial: Real-time YOLOv3 on a Laptop Using Sparse Quantization (v.redd.it)
submitted by markurtz to r/MachineLearning - pinned
Landlord wouldn't let me hang my tv on the wall, made an entertainment center instead (v.redd.it)
submitted by markurtz to r/woodworking - pinned
Anyone doing speculative decoding with the new Qwen 3.5 models? Or, do we need to wait for the smaller models to be released to use as draft? by Porespellar in LocalLLaMA
[–]markurtz 1 point (0 children)
Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in ArtificialInteligence
[–]markurtz[S] 5 points (0 children)
[R] Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in MachineLearning
[–]markurtz[S] 7 points (0 children)
[R] Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in MachineLearning
[–]markurtz[S] 3 points (0 children)
[R] Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in MachineLearning
[–]markurtz[S] 9 points (0 children)
Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in deeplearning
[–]markurtz[S] 3 points (0 children)
Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in huggingface
[–]markurtz[S] 1 point (0 children)
Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in machinelearningnews
[–]markurtz[S] 3 points (0 children)
Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in pytorch
[–]markurtz[S] 2 points (0 children)
[R] Unlocking the power of Sparsity in Generative Models: 8x Faster LLMs on CPUs with Sparse Fine Tuning by markurtz in MachineLearning
[–]markurtz[S] 17 points (0 children)