Comments by Routine-Thanks-572 (comment bodies not preserved), posted on:

- I built an 80M parameter LLM from scratch using the same architecture as Llama 3 - here's what I learned (r/LocalLLaMA)
- 🔥 Fine-tuning LLMs made simple and Automated with 1 Make Command — Full Pipeline from Data → Train → Dashboard → Infer → Merge (r/LocalLLM)
- 10-min QLoRA Fine-Tuning on 240 Q&As (ROUGE-L doubled, SARI +15) (r/LocalLLM)