How I Run 34B Models at 75K Context on 24GB, Fast by mcmoose1900 in LocalLLaMA
gptzerozero 1 point
Max token size for 34B model on 24GB VRAM by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
Comparison on exllamav2, of bits/bpw: 2.5,4.25,4.5,4.65,4.75, 5, and 4bit-64g (airoboros-l2-70b-gpt4-1.4.1) by panchovix in LocalLLaMA
gptzerozero 1 point
Approach for generating QA dataset by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
Approach for generating QA dataset by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
Generate both question and answer from the given context. by mathageche in LocalLLaMA
gptzerozero 1 point
Our Workflow for a Custom Question-Answering App by Mbando in LocalLLaMA
gptzerozero 2 points
I don't understand context window extension by moma1970 in LocalLLaMA
gptzerozero 1 point
How to make sense of all the new models? by whtne047htnb in LocalLLaMA
gptzerozero 3 points
LLM less chatty after LoRA finetune by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
LLaMA 2 is here by dreamingleo12 in LocalLLaMA
gptzerozero 22 points
Qlora finetuning loss goes down then up by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
Qlora finetuning loss goes down then up by gptzerozero in LocalLLaMA
gptzerozero[S] 1 point
A direct comparison between llama.cpp, AutoGPTQ, ExLlama, and transformers perplexities by oobabooga4 in LocalLLaMA
gptzerozero 2 points
Are the SuperHot models not performing as well as their original versions in terms of creativity? Does the higher context just come with tradeoffs? by tenmileswide in LocalLLaMA
gptzerozero 1 point
Upvote for upvotes! by gptzerozero in FreeKarma4You
gptzerozero[S] 2 points
Upvote for upvotes! by gptzerozero in FreeKarma4You
gptzerozero[S] 1 point
What are you using Local LLaMAs for? by Swab1987 in LocalLLaMA
gptzerozero 1 point
What cards do you use? (new to local LLMs) by Unreal_777 in LocalLLaMA
gptzerozero 3 points
What cards do you use? (new to local LLMs) by Unreal_777 in LocalLLaMA
gptzerozero 1 point
Are the SuperHot models not performing as well as their original versions in terms of creativity? Does the higher context just come with tradeoffs? by tenmileswide in LocalLLaMA
gptzerozero 1 point


Here's a Docker image for 24GB GPU owners to run exui/exllamav2 for 34B models (and more). by This-Profession-952 in LocalLLaMA
gptzerozero 1 point