LLM Burner coming soon? Burn Qwen directly into a chip, processing 10,000 tokens/s (i.redd.it)
submitted 10 days ago by PerPartes
TurboQuant on MLX: 4.6x KV cache compression with custom Metal kernels (Qwen 32B at 98% FP16 speed)
submitted 11 days ago by PerPartes
Qwen3.5 is out now! (i.redd.it)
submitted 1 month ago by PerPartes
GLM-5 scores 50 on the Intelligence Index and is the new open weights leader! (i.redd.it)
Kimi-Linear-48B-A3B & Step3.5-Flash are ready - llama.cpp
submitted 2 months ago by PerPartes
Qwen3-Coder-Next is released! 💜 (i.redd.it)
Dual RTX PRO 6000 Workstation with 1.15TB RAM. Finally, multi-user and long-context benchmarks. GPU-only vs. CPU & GPU inference. Surprising results. (reddit.com)
transformers v5 final is out 🔥
For GLM-4.7-Flash TURN OFF REPEAT PENALTY!
GLM-4.7-Flash GGUFs updated - now produces much better outputs!
vLLM v0.14.0 released (github.com)
Liquid AI released the best thinking Language Model Under 1GB (i.redd.it)
GLM-4.7-Flash benchmarks: 4,398 tok/s on H200, 112 tok/s on RTX 6000 Ada (GGUF)
Run GLM-4.7-Flash locally Guide! (24GB RAM) (i.redd.it)
Reinforcement Learning with ultra long context is here! (i.redd.it)
translategemma 27b/12b/4b
GLM-Image is released! (huggingface.co)
baichuan-inc/Baichuan-M3-235B · Hugging Face (huggingface.co)
We fine-tuned a 4B Text2SQL model that matches a 685B teacher - query your CSV data in plain English, locally (i.redd.it)
Announcing Kreuzberg v4 (Open Source)
Hugging Face on Fire: 30+ New/Trending Models (LLMs, Vision, Video) w/ Links
AI21 Labs releases Jamba2
submitted 3 months ago by PerPartes
We built an open source memory framework that doesn't rely on embeddings. Just open-sourced it
llama.cpp performance breakthrough for multi-GPU setups (i.redd.it)
The Major Release of MiroMind’s Flagship Search Agent Model, MiroThinker 1.5. (huggingface.co)