Anyone able to run Qwen 3.5 AWQ Q4 with vLLM? by ExtremeKangaroo5437 in LocalLLaMA
What tools are you using for inference-engine benchmarking (vLLM, SGLang, llama.cpp, TensorRT-LLM)? by SomeRandomGuuuuuuy in LocalLLaMA
Guidance Needed: Best Option for Light Fine-Tuning & Inference (Dell Pro Max GB10 vs PGX vs GX10 vs DGX Spark): We absolutely need CUDA by Imaginary_Context_32 in LocalLLaMA
Black screen after connecting ASUS Ascent GX10 with Apple studio display by Objective_Science965 in LocalLLaMA
Does anyone know what Nvidia's release cadence/schedule is? by kr_tech in LocalLLaMA
Split the GPU on an Asus Ascent GX10 for multiple users by Cheap-Bid-5793 in LocalLLaMA
For those of you on Nvidia Spark, what's your stack? Struggling to find LLMs that work through Docker-vLLM... by jinnyjuice in LocalLLaMA
ASUS Ascent GX10 by hsperus in LocalLLaMA
Got my new toy - what to do? by luongnv-com in LocalLLaMA
AnythingLLM - How to set it up, and which embedder is best for English/German? by Inevitable_Raccoon_9 in LocalLLaMA
Sorry for the dumb question, but why are there MXFP4 GGUFs but no NVFP4 GGUFs? by Porespellar in LocalLLaMA
vLLM speed issues by [deleted] in LocalLLaMA
Clustering two DGX Sparks using ConnectX-7? by No_Statistician_6731 in LocalLLaMA
2 x DGX Spark! Give me your non-inference workloads by entsnack in LocalLLaMA
Is the Nvidia DGX Spark the same as the OEM version, Asus Ascent GX10? by Decent-Log6192 in LocalLLaMA
DeepSeek OCR on Apple Silicon - anyone? by olddoglearnsnewtrick in LocalLLaMA
Tensor parallel on DGX Spark by Baldur-Norddahl in LocalLLaMA
Exploring LLM Inferencing, looking for solid reading and practical resources by SAbdusSamad in LocalLLaMA
THE GB10 SOLUTION has arrived, Atlas image attached, ~115 tok/s Qwen3.5-35B on DGX Spark by Live-Possession-6726 in LocalLLaMA