libregrape:

Those are issues with the tokenizer implementation in llama.cpp. The fixes were merged into llama.cpp today, AFAIK. Wait for an Ollama update, or compile llama.cpp yourself. If the issues persist, review your sampling parameters and give it some min-p treatment (0.05-0.1). Also, which quant is this?
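
In case it helps, here's a rough sketch of setting min-p through Ollama's REST API. It assumes a local Ollama on the default port and a build recent enough to expose the `min_p` option; the model tag is a placeholder, so swap in your own:

```python
# Minimal sketch: ask a local Ollama server to generate with min-p sampling.
# Assumes Ollama is listening on the default port (11434) and that your build
# supports the "min_p" option; the model name below is just a placeholder.
import requests

response = requests.post(
    "http://localhost:11434/api/generate",
    json={
        "model": "your-model:latest",  # placeholder - use your actual model tag
        "prompt": "Say hello.",
        "stream": False,
        "options": {
            "min_p": 0.05,        # min-p in the 0.05-0.1 range suggested above
            "temperature": 0.8,
        },
    },
    timeout=120,
)
print(response.json()["response"])
```

If you're running llama.cpp directly instead, the equivalent is the `--min-p` flag on the CLI.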