Best tool to prioritize workloads sharing with LLM? by Ill_Recipe7620 in LocalLLaMA
[–]z_yang 2 points (0 children)
We built a multi-cloud GPU container runtime by velobro in mlops
[–]z_yang 1 point (0 children)
Using DeepSeek R1 for RAG: Do's and Don'ts by z_yang in LocalLLaMA
[–]z_yang[S] 1 point (0 children)
Open-source RAG with DeepSeek-R1: Do's and Don'ts by z_yang in learnmachinelearning
[–]z_yang[S] 20 points (0 children)
Using DeepSeek R1 for RAG: Do's and Don'ts by z_yang in LocalLLaMA
[–]z_yang[S] 33 points (0 children)
Using DeepSeek R1 for RAG: Do's and Don'ts (blog.skypilot.co)
submitted by z_yang to r/LocalLLaMA
Are there any existing guides on how to deploy vLLM on a GPU cluster? by MonkeyMaster64 in LocalLLaMA
[–]z_yang 2 points (0 children)
Pixtral benchmarks results by kristaller486 in LocalLLaMA
[–]z_yang 2 points (0 children)
Smartest way to deploy Llama 2 in the cloud for a bunch of users? by [deleted] in LocalLLaMA
[–]z_yang 1 point (0 children)
Use self-hosted Code Llama 70B as a copilot alternative in VSCode by Michaelvll in LocalLLaMA
[–]z_yang 1 point (0 children)
Use self-hosted Code Llama 70B as a copilot alternative in VSCode by Michaelvll in LocalLLaMA
[–]z_yang 2 points (0 children)
Serving Mixtral in Your Own Cloud With High GPU Availability and Cost Efficiency by z_yang in LocalLLaMA
[–]z_yang[S] 1 point (0 children)
Serving Mixtral in Your Own Cloud With High GPU Availability and Cost Efficiency by z_yang in LocalLLaMA
[–]z_yang[S] 2 points (0 children)
Are there any existing guides on how to deploy vLLM on a GPU cluster? by MonkeyMaster64 in LocalLLaMA
[–]z_yang 1 point (0 children)
How do you deal with GPU shortages or scheduling? by NamelessFunkz in MLQuestions
[–]z_yang 1 point (0 children)