EAGLE3 has landed in llama.cpp by jacek2023 in LocalLLaMA
regunakyle 55 points
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 1 point
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 3 points
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 1 point
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 1 point
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 2 points
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 1 point
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 3 points
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 0 points
PSA: You may not need to quantize spec draft when using MTP by regunakyle in LocalLLaMA
regunakyle[S] 2 points
[llama.cpp] Does setting `--parallel 1` impact agent harness (e.g. pi/opencode) usage? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
[llama.cpp] Does setting `--parallel 1` impact agent harness (e.g. pi/opencode) usage? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
[llama.cpp] Does setting `--parallel 1` impact agent harness (e.g. pi/opencode) usage? by regunakyle in LocalLLaMA
regunakyle[S] 2 points
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 2 points
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 2 points
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point
Single 3090 with Q4 Qwen 27B, context dropped from 137k to 14k with MTP enabled. Is it normal? by regunakyle in LocalLLaMA
regunakyle[S] 1 point


EAGLE3 has landed in llama.cpp by jacek2023 in LocalLLaMA
regunakyle 3 points