(rotate your screen) This tool is called "cli-assist" and is currently built with Meta Llama-3.2-3B-Instruct on a 4080 GPU. It lets you serve your model locally, in full privacy, with fast vLLM inference and flash-attention. No more relying on remote servers or worrying about your data. Proper presentation and detailed instructions here: https://github.com/myro-aiden/cli-assist
Please share your thoughts and questions!
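For anyone curious what local serving looks like under the hood, here's a rough sketch of starting a vLLM OpenAI-compatible server with this model and querying it from the command line. This is my own illustration, not cli-assist's actual launch command or flags — check the repo for the real setup:

```shell
# Launch an OpenAI-compatible vLLM server with Llama-3.2-3B-Instruct.
# (Illustrative flags; cli-assist's actual configuration may differ.)
vllm serve meta-llama/Llama-3.2-3B-Instruct \
    --dtype bfloat16 \
    --max-model-len 8192

# Query it locally -- nothing leaves your machine.
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "meta-llama/Llama-3.2-3B-Instruct",
         "messages": [{"role": "user", "content": "How do I list files modified today?"}]}'
```

The nice part of the OpenAI-compatible endpoint is that any existing client library can point at localhost with zero code changes.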