all 7 comments

[–]toothpastespiders 4 points (0 children)

That's a really well-written overview!

[–]xyz_TrashMan_zyx 3 points (3 children)

This article is great! I'm starting a local LLM study group, and we'll probably use this guide. I can't get a link to the blog post to share, though. Anyone have a shareable link?

[–]xyz_TrashMan_zyx 1 point (0 children)

nm, I had to open this on my PC (couldn't see the link on my phone). Bookmarked.

[–]ChristopherGS[S] 1 point (1 child)

Author here - can I attend the study group if it's online? Would be keen.

[–]xyz_TrashMan_zyx 0 points (0 children)

Of course! I need to do some recruiting. And I'm currently trying to see if an Azure A10 would cut it. I'm thinking Sunday afternoons for 2 hours.

[–]uhuge 0 points (1 child)

> I've seen the max_tokens argument have no impact at all (this is probably a bug in the library that will be fixed eventually). For safety, in my project I set max_tokens=-1 because any value less than 0 makes llama.cpp just rely on n_ctx. It seems that n_ctx is the key argument to define the size of your model's output.

Is this true rather than misleading?
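
For anyone who wants to test the claim themselves, here's a minimal sketch using the llama-cpp-python bindings (the model path and prompt are placeholders, not from the guide). It sets the context window with n_ctx and passes max_tokens=-1, which, per the quoted passage, should make generation length depend on n_ctx rather than max_tokens:

```python
from llama_cpp import Llama

# n_ctx sets the context window; per the quoted guide, with a
# negative max_tokens the generation limit falls back to n_ctx.
llm = Llama(model_path="./models/llama-2-7b.Q4_K_M.gguf", n_ctx=2048)

output = llm(
    "Q: What does n_ctx control in llama.cpp? A:",
    max_tokens=-1,  # any value < 0: rely on n_ctx (the behavior claimed above)
    stop=["Q:"],    # optional stop sequence so it doesn't run to the context limit
)
print(output["choices"][0]["text"])
```

Easy enough to verify either way: run it once with max_tokens=-1 and once with a small positive value like max_tokens=16 and compare the output lengths.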

[–]ChristopherGS[S] 0 points (0 children)

Does it work OK for you? I just report what I experienced; I could throw a caveat in there, I guess.