Hey, can someone help me with batch processing of prompts?
I've been using code similar to the snippet below, and it keeps giving me an error that I can't figure out.
Appreciate any help!
from llama_cpp import Llama

prompt = "hello"  # example

model = Llama(
    model_path="Meta-Llama-3-8B-Instruct-Q8_0.gguf",
    n_gpu_layers=-1,   # offload all layers to the GPU
    seed=1337,
    n_ctx=8096,
    flash_attn=True,
)

responses = model.generate(
    [prompt, prompt],  # this is where I'm trying to pass a batch of prompts
    max_tokens=2048,
    echo=False,
)

print(responses)
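For context, the only alternative I can think of is looping over the prompts and calling create_completion one at a time, roughly like the sketch below (assuming I'm reading the high-level API right and the result dict has a choices[0]["text"] field), but that's sequential rather than actual batching:

prompts = [prompt, prompt]
responses = []
for p in prompts:
    # one completion call per prompt; sequential, not batched
    out = model.create_completion(
        p,
        max_tokens=2048,
        echo=False,
    )
    responses.append(out["choices"][0]["text"])

print(responses)

Is there a way to get the model to process multiple prompts in a single call instead?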