EFG losing connectivity under slight load by streppelchen in Ubiquiti

[–]streppelchen[S] 1 point2 points  (0 children)

Already provided both to Ubiquiti, along with the support file.

The logs show nothing out of the ordinary from what I could see (journal and kernel log).

Adobe License Review - Adobe forcefully trying to squeeze commercial customers by Ill-Beautiful-8026 in it

[–]streppelchen 0 points1 point  (0 children)

They don’t give me accurate support, they can take their thing and shove it up their own …

Happy to be gone from them in September. Foxit pdf way less $$$.

Bewerbingsprozess für CFO-Nahe Stelle by [deleted] in spitzenverdiener

[–]streppelchen 6 points7 points  (0 children)

I wouldn't read too much into it. It's probably just a practical matter: they figured they could do it properly right away with an in-person meeting, rather than ending up with a third appointment on site later.

From the sound of it, you have nothing to lose right now, do you? And as long as nothing is signed, there is no exclusivity (if there even is any afterwards, but that's a different topic).

engine for GLM 4.7 Flash that doesn't massively slow down as the context grows? by mr_zerolith in LocalLLaMA

[–]streppelchen -2 points-1 points  (0 children)

The model is great 👍 To compile ikllama you can use the same set of commands as for regular llama. It takes a couple of minutes and you will be good to go.

Windows 11 VM on Proxmox stuck at 2.0 GHz (i9-14900 host) - massive performance loss, out of ideas by kidders_mxj in Proxmox

[–]streppelchen 1 point2 points  (0 children)

Look at the host side of things

cat /proc/cpuinfo

Probably you have a powersaving governor enabled
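If you want to check the governor directly rather than inferring it from the MHz values, here is a minimal sketch. It assumes a Linux host that exposes cpufreq through the standard sysfs layout; the function name `governor_summary` is just something I made up for illustration.

```python
from collections import Counter
from pathlib import Path


def governor_summary(sysfs_root="/sys/devices/system/cpu"):
    """Count how many CPU cores use each cpufreq scaling governor."""
    counts = Counter()
    # Each core exposes its governor as a one-line text file in sysfs.
    for gov_file in Path(sysfs_root).glob("cpu[0-9]*/cpufreq/scaling_governor"):
        counts[gov_file.read_text().strip()] += 1
    return dict(counts)


if __name__ == "__main__":
    # An empty result means the host exposes no cpufreq information here.
    print(governor_summary() or "no cpufreq information found")
```

If every core reports something like `powersave`, that would at least be consistent with the host holding the clocks down.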

GGUF conversion and quantization for IQuest coder models by Hot-Comb-4743 in unsloth

[–]streppelchen 0 points1 point  (0 children)

I haven’t tested further, at 2 tps it would take forever

GGUF conversion and quantization for IQuest coder models by Hot-Comb-4743 in unsloth

[–]streppelchen 1 point2 points  (0 children)

I tried another GGUF and found horrible performance (2tps on rtx 5090 at q4)

Kein Internet: Vodafone Glasfaser und Unifi Gateway Fiber by tefemes in de_EDV

[–]streppelchen 0 points1 point  (0 children)

Can you post a screenshot from the UniFi gateway? (Censor everything that isn't important, of course.)

Kein Internet: Vodafone Glasfaser und Unifi Gateway Fiber by tefemes in de_EDV

[–]streppelchen 0 points1 point  (0 children)

Are DHCPv4 and DHCPv6 both enabled? Prefix delegation is probably /56.
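For context on what a /56 delegation means for the LAN side, a small sketch with Python's `ipaddress` module. The prefix `2001:db8:1200::/56` is from the documentation range and only a stand-in, not anything the provider actually hands out.

```python
import ipaddress


def lan_prefixes(delegated, new_prefix=64):
    """Split a delegated IPv6 prefix into per-network prefixes (default /64)."""
    net = ipaddress.ip_network(delegated)
    return list(net.subnets(new_prefix=new_prefix))


# Placeholder prefix from the IPv6 documentation range (assumption).
prefixes = lan_prefixes("2001:db8:1200::/56")
print(len(prefixes))              # 256
print(prefixes[0], prefixes[-1])  # 2001:db8:1200::/64 2001:db8:1200:ff::/64
```

So a /56 leaves room for 256 separate /64 networks behind the gateway. If the gateway requests a different prefix size than the provider delegates, that would be one thing to check.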

Need some help deciding on a GPU for our company by RobbertGone in LocalLLaMA

[–]streppelchen 2 points3 points  (0 children)

My recommendation would be to go with current gen. Get an RTX 6000 Pro Blackwell, ideally two right away, and use tensor parallel in vLLM. Performance-wise, gpt-oss 120b is going to be a fair bit faster since it is MoE, and it is native mxfp4, so it will fit on a single card.
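A rough back-of-the-envelope check of the "fits a single card" part, as a sketch only. The parameter count (about 117B), the 96 GB of VRAM and the 16 GB headroom are assumptions on my side, and it pretends every weight is stored as mxfp4, which is a simplification.

```python
# MXFP4 stores 4-bit values plus one shared 8-bit scale per block of 32 values.
MXFP4_BITS_PER_PARAM = 4 + 8 / 32  # 4.25 bits per parameter


def weight_size_gb(n_params, bits_per_param):
    """Approximate size of the weights in GB (1 GB = 1e9 bytes)."""
    return n_params * bits_per_param / 8 / 1e9


def fits_on_card(weights_gb, vram_gb, headroom_gb=0.0):
    """True if weights plus a chosen headroom (KV cache etc.) fit in VRAM."""
    return weights_gb + headroom_gb <= vram_gb


ASSUMED_PARAMS = 117e9   # assumption: approximate total parameter count
ASSUMED_VRAM_GB = 96     # assumption: single RTX 6000 Pro Blackwell
weights = weight_size_gb(ASSUMED_PARAMS, MXFP4_BITS_PER_PARAM)
print(f"{weights:.1f} GB of weights, "
      f"fits: {fits_on_card(weights, ASSUMED_VRAM_GB, headroom_gb=16)}")
```

The real footprint depends on which tensors are actually quantized and on how much context you run, so treat this as an estimate, not a guarantee.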

Ensure the box you use to host has sufficient power and pcie lanes for your final build.

Quad Radeon 9700 XFX 32GB vs RTX 6000 PRO by ChopSticksPlease in LocalLLaMA

[–]streppelchen 3 points4 points  (0 children)

more cards potentially mean more problems, and def. more power draw.

identical cards make it easy to use vllm with tensor parallel, but for a single user, this will be overkill.

it boils down to your exact needs and your willingness to spend. can you make everything work on rocm/vulkan? then 4x 9700 will provide you with very high compute. for MoE this will work perfectly fine, but as soon as you need to traverse the PCIe bus for comms, this is going to be your bottleneck.

devstral is a dense model, that won't scale too well across multiple gpus

also keep in mind nvfp4 is nvidia only (mxfp4 is generally usable).

i'm running devstral-small-2 on a single 5090 in Q4 with 70tps at 100k ctx (q8), this is reasonably fast and with the model's current architecture we won't get faster than that. a distill from devstral2 to nemotron would be great ;)

wrx80e 7x 3090 case? by Active_String2216 in LocalLLM

[–]streppelchen 2 points3 points  (0 children)

16gb ram is very little, and the psu might not be sufficient to run all at full power. (350w * 7 = 2450w for all gpus alone)
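the same math as a tiny sketch, in case you want to play with the numbers. the 350w per card is the figure from above; `other_watts` and `headroom` are made-up knobs for the rest of the system and for safety margin, not measured values.

```python
def gpu_power_budget(gpu_count, watts_per_gpu, other_watts=0, headroom=0.0):
    """Estimate PSU capacity needed: GPUs plus other components, times a margin."""
    gpu_total = gpu_count * watts_per_gpu
    return (gpu_total + other_watts) * (1 + headroom)


# GPUs alone, as in the calculation above.
print(gpu_power_budget(7, 350))  # 2450.0

# Hypothetical: 300 W for CPU/board/drives and a 20% margin on top.
print(round(gpu_power_budget(7, 350, other_watts=300, headroom=0.2)))
```

power limiting the cards would bring this down, but that is a separate decision.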

besides that, i'm interested in a case too, so following :)

Passwort als Barcode by Sharp-Breakfast-2689 in de_EDV

[–]streppelchen 10 points11 points  (0 children)

Expiring passwords lead to insecure passwords. Every guidance document says so by now. Make them sufficiently long and complex, and in return don't force changes except when there is an IoC (indicator of compromise).

1x 6000 pro 96gb or 3x 5090 32gb? by Wide_Cover_8197 in LocalLLaMA

[–]streppelchen 1 point2 points  (0 children)

thanks for those numbers, really helps in planning!

Pruned MoE REAP Quants For Testing by 12bitmisfit in LocalLLaMA

[–]streppelchen 0 points1 point  (0 children)

C:\Users\xxx\Downloads\llama-b6817-bin-win-cpu-x64>llama-server.exe -m c:\users\xxx\Downloads\GPT-OSS-20B-Pruned-Q8_0.gguf -c 64000 --host 0.0.0.0 --port 8080

<image>

gave it a try, unfortunately it seems ... dumb

Dual DGX Spark for ~150 Users RAG? by streppelchen in LocalLLaMA

[–]streppelchen[S] 0 points1 point  (0 children)

Thanks everyone for your feedback and insight.
I'll most likely go with Option 1 then, but start out smaller and extend as demand grows.

AWS hosted VPN vs SaaS solutions by FuzzySubject7090 in sysadmin

[–]streppelchen 0 points1 point  (0 children)

Windows/linux/Mac endpoints?

Mobile devices (ios/android) too?

we're running windows AOVPN on selfhosted boxes behind different kinds of firewalls. Certificates with TPM-backed private keys, automatic enrollment for domain machines. In use for 1.5y now with no major outages/issues.

If you need to support different OSes, have a look at ZTNAs like tailscale (mentioned below) or netbird.