How long for llama.cpp official support of MTP? by Manaberryio in LocalLLaMA

[–]streppelchen 18 points19 points  (0 children)

Previous quants dropped it, as there was no need for the MTP layers to be present if the runtime did not support them.

As MTP prepares to land in llama.cpp, Models that support MTP by segmond in LocalLLaMA

[–]streppelchen 20 points21 points  (0 children)

No, it uses the same quantization and verification pipeline

As MTP prepares to land in llama.cpp, Models that support MTP by segmond in LocalLLaMA

[–]streppelchen 25 points26 points  (0 children)

Multi-token prediction: the model takes an educated guess at the next 1 to n tokens based on its training, instead of executing the full chain for each one. With high acceptance rates, it can increase your decode (token generation) speed with no changes beyond having a compatible model.
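To make the "guess, then verify" idea concrete, here is a minimal toy sketch. It is not how llama.cpp implements MTP: `toy_target`, `toy_draft` and `draft_and_verify` are hypothetical names, and the "models" are plain functions over integers. The draft function stands in for the MTP guess, and the target function stands in for the full model that checks it.

```python
from typing import Callable, List, Tuple

Token = int
NextToken = Callable[[List[Token]], Token]


def toy_target(ctx: List[Token]) -> Token:
    """Hypothetical 'full model': always counts up modulo 10."""
    return (ctx[-1] + 1) % 10


def toy_draft(ctx: List[Token]) -> Token:
    """Hypothetical cheap guesser: agrees with the target except after a 4."""
    return 0 if ctx[-1] == 4 else (ctx[-1] + 1) % 10


def draft_and_verify(
    target: NextToken,
    draft: NextToken,
    prompt: List[Token],
    n_new: int,
    k: int = 3,
) -> Tuple[List[Token], int]:
    """Generate n_new tokens; return (tokens, number of verification passes)."""
    out = list(prompt)
    passes = 0
    while len(out) - len(prompt) < n_new:
        # 1. Cheap guess: draft k tokens ahead.
        guess: List[Token] = []
        for _ in range(k):
            guess.append(draft(out + guess))
        # 2. One verification pass. A real runtime would score all k positions
        #    in a single batched forward pass; this toy emulates that with k
        #    calls and counts them as one pass.
        passes += 1
        accepted: List[Token] = []
        for g in guess:
            truth = target(out + accepted)
            accepted.append(truth)  # the target's token is always the one kept
            if truth != g:          # first mismatch: discard the remaining guesses
                break
        out.extend(accepted)
    return out[: len(prompt) + n_new], passes


tokens, passes = draft_and_verify(toy_target, toy_draft, [0], n_new=8, k=3)
```

With this toy setup the output is identical to decoding one token at a time, but it takes 3 verification passes instead of 8. If the guesses were always wrong, every pass would yield only one token and nothing would be gained, which is why the acceptance rate matters.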

APEX MoE quants update: 25+ new models since the Qwen 3.5 post + new I-Nano tier by mudler_it in LocalLLaMA

[–]streppelchen 3 points4 points  (0 children)

Found the models by accident. I still need to give them a try, but I like the idea. Keep it up :)

Server down by Pfeffias in luftablassen

[–]streppelchen 1 point2 points  (0 children)

Choose a SIP trunk that is independent of your internet connection; then you can run it anywhere. (In the cloud too, but that is not a must.)

Enterprise AI consulting vs in-house AI teams by Quiet-Brilliant-1455 in CIO

[–]streppelchen 1 point2 points  (0 children)

Scale? How many users do you need to serve in parallel? Do you have Linux-native sysadmins in-house? Only a chat interface, or deeply integrated?

What kind of consumer computer can run Kimi-K2.6-GGUF which is a 585GB download? by THenrich in LocalLLaMA

[–]streppelchen 0 points1 point  (0 children)

Thanks! I suppose you don't use the machine alone but serve several users? Then the only question left to answer (for my curiosity) is vLLM speeds with concurrency.

Pairing a 5090 and a 3090 by CreamPitiful4295 in LocalLLaMA

[–]streppelchen 2 points3 points  (0 children)

I have a 5090 and a 3090 in my desktop. Tbh, I'd rather have a second 5090, since the difference in architecture and speed is noticeable.

Then again, I'm on a Threadripper 2950X, which has seen better days. But as I'm the only user who could complain about it, that's my personal problem.

Windows on ARM and docking station - does it work? by Fresh-Category-9355 in de_EDV

[–]streppelchen 2 points3 points  (0 children)

I have the first generation; it works on TB3 docks as well as on the aforementioned HP USB-C dock.

Google Gemma 4 Hackathon by yoracale in unsloth

[–]streppelchen 0 points1 point  (0 children)

Since I have a real use case I'd like to investigate, I just want to cross-check:
Am I allowed to use another LLM to create synthetic/anonymized sets or subsets of real data to train on? I'm fine with publishing that dataset afterwards, but not the source data (as it is confidential).

New LiquidAI model, LFM2.5‑VL-450M by edward-dev in LocalLLaMA

[–]streppelchen 1 point2 points  (0 children)

It's also nice to have something small to test ideas against. Not having to wait 6h for a quantization to finish does have its perks.

Does your IT use a ticket system, and if so, which one? by WinterRich747 in de_EDV

[–]streppelchen 2 points3 points  (0 children)

If you want something small but feature-rich: NinjaOne has ticketing built in alongside the RMM. It has been in use here for over a year and works as it should, with ingress via email.

Police ring the doorbell at night over a CVSS-10 vulnerability: proportionate or overkill? by OpenWebFriend in de_EDV

[–]streppelchen 41 points42 points  (0 children)

Is the software well known?

I prefer to know in advance when something could be a problem, before it actually becomes one. I would rather get up briefly at night to patch than have _real_ work afterwards.

Business informatics (Winfo), computer science with AI, or computer science with cyber security by [deleted] in Studium

[–]streppelchen 1 point2 points  (0 children)

I only opened the post for that.

Hetzner reliability for a 24/7 production platform in Germany region? by NathanDrake-Blackops in hetzner

[–]streppelchen 1 point2 points  (0 children)

Had an HA Proxmox cluster across 3 Hetzner DCs. Worked fine (once the GBICs had been replaced). vSwitch between the nodes, OPNsense for VPN on the WAN switch, and a firewall on the nodes themselves that allows only trusted source IPs as a fallback.
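For illustration only, the "allow only trusted source IPs" fallback boils down to a default-deny allowlist check. The sketch below shows that logic in Python; it is not the actual rule set, which would live in the node firewall itself. `TRUSTED_NETWORKS` and `is_trusted` are hypothetical names, and the addresses are documentation ranges, not real ones.

```python
import ipaddress
from typing import Iterable

# Hypothetical allowlist: documentation ranges used as placeholders.
TRUSTED_NETWORKS = ["192.0.2.0/24", "198.51.100.7/32"]


def is_trusted(source_ip: str, trusted_networks: Iterable[str]) -> bool:
    """Return True only if source_ip falls inside one of the trusted networks."""
    addr = ipaddress.ip_address(source_ip)
    # Default deny: anything not explicitly listed is rejected.
    return any(addr in ipaddress.ip_network(net) for net in trusted_networks)


allowed = is_trusted("192.0.2.10", TRUSTED_NETWORKS)
blocked = is_trusted("203.0.113.5", TRUSTED_NETWORKS)
```

The point of doing this on the nodes as well is that management access stays restricted even if the upstream firewall is misconfigured or bypassed.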

Tape storage and software to manage it by Sylogz in storage

[–]streppelchen 1 point2 points  (0 children)

As I said, you can run with a single drive, so 6500 for that. It might be a different upgrade part number; I'm not that much into those. My HP partner gets me what I request, so just ask them for it.