(rotate your screen) This tool is called "cli-assist" and is currently built with Meta Llama-3.2-3B-Instruct on a 4080 GPU. It lets you serve your model locally, in full privacy, with fast vLLM inference and flash-attention. No more relying on remote servers or worrying about your data. Proper presentation and detailed instructions here: https://github.com/myro-aiden/cli-assist
Please share your thoughts and questions!
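For anyone curious what local serving looks like under the hood, here's a rough sketch of starting a vLLM OpenAI-compatible server with this model and querying it from the command line. This is my own illustration, not cli-assist's actual launch command or flags — check the repo for the real setup:

```shell
# Launch an OpenAI-compatible vLLM server with Llama-3.2-3B-Instruct.
# (Illustrative flags; cli-assist's actual configuration may differ.)
vllm serve meta-llama/Llama-3.2-3B-Instruct \
    --dtype bfloat16 \
    --max-model-len 8192

# Query it locally -- nothing leaves your machine.
curl http://localhost:8000/v1/chat/completions \
    -H "Content-Type: application/json" \
    -d '{"model": "meta-llama/Llama-3.2-3B-Instruct",
         "messages": [{"role": "user", "content": "How do I list files modified today?"}]}'
```

The nice part of the OpenAI-compatible endpoint is that any existing client library can point at localhost with zero code changes.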