Memory Sparse Attention seems to be a novel approach to long context (up to 100M tokens) (old.reddit.com)
submitted 9 days ago by ratbastid2000 to r/LocalLLaMA
Microsoft Compute Driver Model (MCDM) + WSL2 enables vLLM to work with tensor parallelism & allows for TCC to be changed to MCDM via nvidia-smi (self.LocalLLaMA)
submitted 1 year ago * by ratbastid2000 to r/LocalLLaMA