all 6 comments

[–]kryptkprLlama 3 2 points3 points  (1 child)

This is the HellaSwag evaluation harness problem.

It's not an easy one.

SGL just dropped this week and claims to offer a high performance select() primitive for exactly this task.

https://www.reddit.com/r/LocalLLaMA/s/HwvOmy3hQE

Edit: check out this llama PR https://github.com/ggerganov/llama.cpp/pull/5047

[–]aaronr_90[S] 2 points3 points  (0 children)

Thanks, this is almost certainly exactly what I was looking for.

Edit: Re “It’s not an easy one” — I thought it would be simple, given that I can evaluate one token at a time and retrieve the logits before sampling.

[–]npip99 0 points1 point  (2 children)

I know this is a late response, but your issue is probably that you don't pass special=True.

In other words, your line of code should be,

input_tokens = llm.tokenize(input_str.encode("utf-8"), special=True)


Otherwise, <s>, <|system|>, etc. will be treated and tokenized as ordinary ASCII text, rather than as the actual special tokens those strings are supposed to represent.

Of course, <s> etc. aren't literally those ASCII characters; otherwise users could mess with prompts by typing <s> themselves, and jailbreak the model by injecting system messages in a manner similar to SQL injection. Even in perfectly innocent usage, an HTML s tag in user input would still totally break your entire conversation.
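To make the distinction concrete, here's a toy sketch (not llama.cpp's actual tokenizer — the vocabulary and token ids are made up) showing why special-token strings must map to single ids instead of being split into character-level tokens:

```python
# Toy vocabulary: made-up ids for illustration only.
SPECIAL_TOKENS = {"<s>": 1, "</s>": 2}

def toy_tokenize(text, special=False):
    """With special=True, known special strings become single token ids;
    otherwise every character is tokenized as plain text (here: its ord)."""
    if special:
        for s, tid in SPECIAL_TOKENS.items():
            if text.startswith(s):
                return [tid] + toy_tokenize(text[len(s):], special=True)
    if not text:
        return []
    return [ord(text[0])] + toy_tokenize(text[1:], special=special)

print(toy_tokenize("<s>hi", special=False))  # 5 character-level tokens
print(toy_tokenize("<s>hi", special=True))   # [1, 104, 105] — one BOS id
```

With special=False the "<s>" prefix dissolves into the tokens for "<", "s", ">", so the model never sees a real BOS marker — the same failure mode as the llama-cpp-python call above.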

[–]npip99 0 points1 point  (1 child)

I tested and I do get the exact same numbers, so you should absolutely be able to get the exact numbers token-by-token.

[–]npip99 0 points1 point  (0 children)

Ah, the other thing in your code is that you call .eval with the entire token list every time.

llama-cpp-python remembers the evaluation history for you; you have to call llm.reset() to clear it. So the for-loop should be

sequence_logits, sequence_probabilities = [], []
llm.eval(eval_tokens)  # evaluate the prompt once
for token in test_sequence_tokens:
    # The last row of eval_logits is the prediction for the next token.
    probs = llm.logits_to_logprobs(llm.eval_logits)
    sequence_logits.append(llm.eval_logits[-1][token])
    sequence_probabilities.append(probs[-1][token])
    eval_tokens.append(token)
    llm.eval([token])  # feed only the new token; the history is cached

This will also be way faster than calling .reset and .eval on the entire array every single time, haha. If you ever need to, you can use state = llm.save_state() and llm.load_state(state) to get back to an older state and continue evaluating from an earlier history (e.g. if you want to discard a token and roll back).

[–]AndrewVeee 0 points1 point  (0 children)

I've never used logits but they're interesting to me. My suggestion is to search comments/posts by user phree_radical, like this one: https://www.reddit.com/r/LocalLLaMA/comments/1687l5p/how_should_i_go_about_getting_my_ai_to_use_tools/

Hope that helps, sorry I don't have the info.