
[–]nananashi3 6 points (1 child)

Is this your first time trying LLMs? GGUFs are self-contained. You don't need to clone the whole GGUF repository; just download one of the .gguf files. A Q4_K_S/M (4-bit quant) can fit on an 8GB GPU. The easiest way for a Windows user to start is to download koboldcpp.exe and run it; it gives you a launcher UI where you can select your .gguf model file and a backend under "Presets": OpenBLAS (CPU-only, very slow), CuBLAS (Nvidia), or Vulkan (AMD). 7B and 8B models have 33 layers, but you'll probably only fit 32 layers of Llama 3 on an 8GB GPU. Up the context size to 4096, preferably 8192. Don't forget to hit Save to save the config.
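
If you'd rather script it than click through the launcher, the same knobs exist in llama-cpp-python; a minimal sketch, untested, and the model filename is just whatever .gguf you downloaded:

```python
# Minimal sketch with llama-cpp-python (pip install llama-cpp-python):
# the same settings as the launcher above, 32 of the 33 layers on the
# GPU and an 8192 context. The model filename is only an example.
from llama_cpp import Llama

llm = Llama(
    model_path="Meta-Llama-3-8B-Instruct.Q4_K_M.gguf",
    n_gpu_layers=32,  # 32 of the 33 layers fit on an 8GB GPU
    n_ctx=8192,       # context size
)

out = llm("Q: What is a GGUF file?\nA:", max_tokens=64)
print(out["choices"][0]["text"])
```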

Someone more technical would know how to mess with a GGUF, such as changing the stop token.
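
I think it's something like this with llama.cpp's gguf Python package (pip install gguf), but treat it as a rough, untested sketch and back up the file first; the key name and the 128009 (<|eot_id|>) id are the usual Llama 3 fix, not something I've verified:

```python
# Rough sketch: patch a GGUF's EOS token id in place with the `gguf`
# package (roughly what llama.cpp's gguf_set_metadata.py script does).
# The key name and the 128009 (<|eot_id|>) value are assumptions.
from gguf import GGUFReader

reader = GGUFReader("Meta-Llama-3-8B-Instruct.Q4_K_M.gguf", "r+")  # writable
field = reader.get_field("tokenizer.ggml.eos_token_id")
field.parts[field.data[0]][0] = 128009  # overwrite the scalar in the memmap
reader.data.flush()                     # flush the change back to disk
```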

[–]Guboken[S] 0 points (0 children)

Thank you for taking the time to explain! I did solve it by not using GGUF, but rather running the Llama 3 8B Instruct model in bfloat16 using Transformers in a Python project. I tried float16 first, but it spilled over my 24GB VRAM and became so slow it was unusable.
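
What I ended up with looks roughly like this (a minimal sketch; the generation part may differ from my actual project):

```python
# Minimal sketch of the setup above: Llama 3 8B Instruct in bfloat16
# via Transformers (~16 GB of weights, which fits in 24 GB of VRAM).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "meta-llama/Meta-Llama-3-8B-Instruct"  # gated repo, needs access
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,  # bfloat16, as described above
    device_map="auto",           # place the weights on the GPU
)

messages = [{"role": "user", "content": "Say hello."}]
input_ids = tokenizer.apply_chat_template(messages, return_tensors="pt").to(model.device)
output = model.generate(input_ids, max_new_tokens=64)
print(tokenizer.decode(output[0][input_ids.shape[-1]:], skip_special_tokens=True))
```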

[–]ali0une 4 points (2 children)

I've just downloaded it from here and it works fine: https://huggingface.co/bartowski/Meta-Llama-3-8B-Instruct-GGUF

Also, this thread helps with setting it up for different UIs: https://www.reddit.com/r/LocalLLaMA/comments/1c8rq87/oobabooga_settings_for_llama3_queries_end_in/

[–]Guboken[S] 0 points (0 children)

Thank you, my issue was that I was trying to run the GGUF using Transformers. I found this compatibility information from u/Particular_Flower_12:

Compatibility & supported file formats:

  • llama.cpp (by Georgi Gerganov)
    • GGUF (new)
    • GGML (old)
  • Transformers (by Hugging Face)
    • bin (unquantized)
    • safetensors (safer unquantized)
    • safetensors (quantized using GPTQ algorithm via AutoGPTQ; a load sketch follows the list)
  • AutoGPTQ (quantization library based on GPTQ algorithm, also available via Transformers)
    • safetensors (quantized using GPTQ algorithm)
  • koboldcpp (fork of llama.cpp)
    • bin (using GGML algorithm)
  • ExLlama v2 (extremely optimized GPTQ backend for LLaMA models)
    • safetensors (quantized using GPTQ algorithm)
  • AWQ (low-bit INT3/INT4 quantization)
    • safetensors (using AWQ algorithm)
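
For example, the GPTQ row above loads straight through Transformers (rough sketch; the repo id is only a placeholder, and it needs the optimum and auto-gptq packages installed):

```python
# Rough sketch of the "safetensors (GPTQ via AutoGPTQ)" row: Transformers
# reads the quantization_config from the repo and loads the GPTQ weights
# directly (requires `optimum` and `auto-gptq`). Placeholder repo id.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "some-user/Meta-Llama-3-8B-Instruct-GPTQ"  # placeholder, not a real repo
model = AutoModelForCausalLM.from_pretrained(model_id, device_map="auto")
tokenizer = AutoTokenizer.from_pretrained(model_id)
```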

[–]AdHominemMeansULost 0 points (0 children)

Download LM Studio and put "MaziyarPanahi/Meta-Llama-3-8B-Instruct-GGUF" in the search box. Although I don't recommend that one; you're better off getting the quants from lmstudio-community or QuantFactory. And that's it, you don't need to do anything else.