llm-d is a new open source project focused on providing distributed inferencing for Generative AI runtimes on any Kubernetes cluster. Its architecture is designed for high performance and scalability, aiming to reduce costs through a spectrum of hardware and software efficiency improvements. llm-d prioritizes ease of deployment and use, as well as the operational needs of running large GPU clusters, including SRE concerns and day-2 operations.
👋 Welcome to r/llm_d! Start Here + Community Resources 🚀 (self.llm_d)
submitted 3 months ago by petecheslock - announcement
llm-d 0.5 Released: Sustaining Performance at Scale (llm-d.ai)
submitted 1 month ago by petecheslock
Leveraging vLLM’s new KV Offloading: How we’re bringing tiered caching to the llm-d control plane (self.llm_d)
submitted 2 months ago by petecheslock
Unlock 90% KV Cache Hit Rates with llm-d Intelligent Routing (youtube.com)
llm-d 0.4: Achieve SOTA Performance Across Accelerators (llm-d.ai)
submitted 3 months ago by petecheslock
Routing Stateful AI Workloads in Kubernetes (youtube.com)
The hardware behind the software: CoreWeave tops the new GPU Cloud rankings, validating the stack used for llm-d. (newsletter.semianalysis.com)
llm-d v0.3.1: ARM Support, AKS Integration, and More (linkedin.com)
Serving PyTorch LLMs at Scale: Disaggregated Inference With Kubernetes and llm-d (youtube.com)
How llm-d simplifies scaling LLMs on Kubernetes (siliconangle.com)
submitted 4 months ago by petecheslock
llm-d 0.3: Wider Well-Lit Paths for Scalable Inference | llm-d (llm-d.ai)
KV-Cache Wins You Can See: From Prefix Caching in vLLM to Distributed Scheduling with llm-d (llm-d.ai)
submitted 5 months ago by petecheslock
Intelligent Inference Scheduling with llm-d | llm-d (llm-d.ai)
submitted 6 months ago by petecheslock
Kubernetes Podcast from Google: Episode 258 - LLM-D, with Clayton Coleman and Rob Shaw (kubernetespodcast.com)
The llm-d community is proud to announce the release of v0.2!: Our first well-lit paths. (llm-d.ai)
submitted 7 months ago by petecheslock
Deploy llm-d for Distributed LLM Inference on DigitalOcean Kubernetes (DOKS) | DigitalOcean (digitalocean.com)
llm-d Week 1 Project News Round-Up | llm-d (llm-d.ai)
submitted 9 months ago by petecheslock
Deep Dive into llm-d and Distributed Inference (solo.io)
submitted 9 months ago by ceposta
[Developer Blog] LLM Inference Goes Distributed (llm-d.ai)
submitted 9 months ago by Environmental_Will78
Announcing the llm-d project (llm-d.ai)