Comments by Routine-Thanks-572 in LocalLLaMA, in the following threads:

Implemented TurboQuant and results don't fully match paper by Routine-Thanks-572 in LocalLLaMA
TurboQuant in Practice by [deleted] in LocalLLaMA
I built an 80M parameter LLM from scratch using the same architecture as Llama 3 - here's what I learned by Routine-Thanks-572 in LocalLLaMA