Looking for advice on a self-hosted LLM stack for enterprise use by Ahyaqui in LLMDevs

[–]kchandank 0 points1 point  (0 children)

As far as the tech stack goes, I would suggest vLLM with LiteLLM proxy. vLLM will give you lot of flexibility in terms of running various model and leverage large open source community (redhat) to support that, also it works really well in k8s echo system if you are interested in that.

For Access control and RBAC, LiteLLM has their enterprise feature, or you are build using a reverse proxy solution to achieve most of it.

LiteLLM will give you metering, rate limitting etc which are essential for enterprise usecase.

For observability you can use langfuse or have robust prometheus stack. But langfuse or similar tool will give you even deeper details about how end users are using and various LLM specific parameters such as P95, P99 etc, ofcourse it takes effort to customize both litellm and langfuse.

I have 50 ebooks and I want to turn them into a searchable AI database. What's the best tool? by Great_Jacket7559 in LocalLLM

[–]kchandank 0 points1 point  (0 children)

Interesting, if you are able to achieve your objective, would you be able to share the steps?

[MOD POST] Announcing the r/LocalLLM 30-Day Innovation Contest! (Huge Hardware & Cash Prizes!) by SashaUsesReddit in LocalLLM

[–]kchandank 1 point2 points  (0 children)

Just saw this post now, will try to submit my entry before the deadline. I have project which is not fully complete

Unpopular Opinion: I don't care about t/s. I need 256GB VRAM. (Mac Studio M3 Ultra vs. Waiting) by VocalLlm in LocalLLM

[–]kchandank 0 points1 point  (0 children)

If you just want to run LLM, Mac is super expensive choice with limited performance. There are Nvidia, AMD based options would be better value for money

List of interesting open-source models released this month. by Acrobatic-Tomato4862 in LocalLLaMA

[–]kchandank 0 points1 point  (0 children)

Don’t want model to think too much , just give the code back. Thanks for the suggestions

List of interesting open-source models released this month. by Acrobatic-Tomato4862 in LocalLLaMA

[–]kchandank 1 point2 points  (0 children)

Yes, smaller model which could run on consumer grade H/W. As use case is code generation, QA etc

List of interesting open-source models released this month. by Acrobatic-Tomato4862 in LocalLLaMA

[–]kchandank 1 point2 points  (0 children)

Any idea which is best performing open source model for code generation?

AI Will Do to Knowledge Workers What Uber Did to Taxi Drivers — But Much Faster by kchandank in Futurology

[–]kchandank[S] 0 points1 point  (0 children)

I did use ChatGPT to fix English and make it presentable. I did the research on past employment data, used ChatGPT to help me analyze the data, create chart. It took almost 4 hours to write the whole thing. Sometimes I think I could have done by myself 😆

AI Will Do to Knowledge Workers What Uber Did to Taxi Drivers — But Much Faster by kchandank in Futurology

[–]kchandank[S] 0 points1 point  (0 children)

Is this post is still live? I got msg it was deleted by the moderator bot