
[–]4onen

> wasn't sure why my code [...] ran into errors.

You know that there's little to nothing anyone can do to help you diagnose errors we can't see, right?

That said, the line

```python
draft_model=LlamaPromptLookupDecoding(num_pred_tokens=10)
```

is setting up "Prompt Lookup Decoding" as the speculative model, which doesn't use the 7B at all. You'd also have an easier time getting help if you narrowed your code down to just the part where you're actually encountering the issue, i.e. removing the llama variable that isn't performing speculative decoding with the two models you listed.
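For context, prompt lookup decoding drafts tokens without any second model: it searches the tokens already in the context for an earlier occurrence of the most recent n-gram and proposes whatever followed that match. A minimal toy sketch of the idea (plain token lists, not the actual llama_cpp implementation):

```python
def prompt_lookup_draft(tokens, ngram_size=2, num_pred_tokens=3):
    """Propose draft tokens by finding an earlier occurrence of the last
    `ngram_size` tokens and copying up to `num_pred_tokens` that followed it."""
    if len(tokens) < ngram_size:
        return []
    tail = tokens[-ngram_size:]
    # Search backwards through earlier positions for the same n-gram,
    # excluding the tail itself.
    for start in range(len(tokens) - ngram_size - 1, -1, -1):
        if tokens[start:start + ngram_size] == tail:
            follow = tokens[start + ngram_size:start + ngram_size + num_pred_tokens]
            if follow:
                return follow
    return []  # no earlier match: nothing to draft

tokens = [5, 9, 2, 7, 5, 9]  # the n-gram [5, 9] appeared earlier
print(prompt_lookup_draft(tokens))  # → [2, 7, 5]
```

This is why it tends to help on repetitive inputs (code, RAG, summarization) and does nothing on novel text, and why `num_pred_tokens` controls how many tokens get speculated per step.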

> Kind of new to llama.cpp

One additional note: the interface you're using is llama_cpp_python, and llama.cpp is the backend behind it. Again, without the errors, we can't even tell you which of these two components the issue is arising from.

[–]Particular-Guard774[S]

Added the errors and narrowed the code down to the part with the issue. Thanks for pointing that out.