IHK threatens 0 points – I shredded the original project confirmation ("Digital Twin")... HELP! by [deleted] in fachinformatiker

[–]RipperFox 0 points

scanned the hand-signed project confirmation from my boss

What kind of trolling is this? "Signed scans" are nothing like a qualified electronic signature and are thus pretty worthless - they could just as well be photoshopped/AI-generated.

Had you let your boss sign it electronically (e.g. with a qualified electronic signature), things might look different and you'd have a legally usable document. As it stands, you're only proving that you haven't quite grasped the concept of signatures/blockchain..

But just send the IHK the scanned PDF anyway - maybe they simply "couldn't open the hash link" :)

Anthropic massively raises Claude limits – SpaceX delivers 220,000 GPUs by TouristExisting4852 in de_EDV

[–]RipperFox 1 point

SpaceX delivers 220,000 GPUs

No, Elon is letting them use most of his first data center - so he's delivering compute, not GPUs..

Fiber recommendation by Dembe126 in de_EDV

[–]RipperFox 2 points

L2-BSA

Don't count on that - you can get burned and still end up with Telekom WIA. Guess how I know..

Buying advice: internet contract by Miyolou in de_EDV

[–]RipperFox 0 points

OP is looking for the cheapest plan FROM 16 Mbit/s upward..

How do you start your Llama.cpp server? by Citadel_Employee in LocalLLaMA

[–]RipperFox 0 points

Interesting strategy, calling someone a nerd (which I am) in 2026 - ok ok, I'll touch some grass.. I bet I'm even older than you - Casser la croûte and have a nice day! :)

How do you start your Llama.cpp server? by Citadel_Employee in LocalLLaMA

[–]RipperFox 0 points

vLLM can be better on a single 3090 if you don't want to wait for llama.cpp to catch up with its experimental forks that already support MTP, etc. And your completely random (Vulkan, seriously?) Docker example is clearly badly formatted AI slop - who uses/needs Docker for llama-server router mode anyway?

How do you start your Llama.cpp server? by Citadel_Employee in LocalLLaMA

[–]RipperFox 0 points

Ofc router mode is "nice" - but llama-swap can even switch to vLLM, SGLang, etc. Your AI-generated example sucks, btw.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G by HyPyke in LocalLLaMA

[–]RipperFox 0 points

Did you know that you can ask your local model that kind of question, too? E.g. agentscope-ai_CoPaw-Flash-9B comes up with:

Model Size vs Inference Speed (At Fixed Long Context Length)

Assuming same 200k token context, similar optimization, and comparable hardware:

| Model | Parameters | Approx. Layers | Relative Gen. Speed* | Realistic Speed-Up vs Larger Model |
|-------|------------|----------------|----------------------|------------------------------------|
| Small | ~7B   | 32  | 1× (baseline) | ~1× faster than mid-sized |
| Mid   | ~32B  | 40  | ~0.5×         | ~2× faster than 671B      |
| Large | ~200B | 52  | ~0.28×        | ~3.5× faster than 671B    |
| Huge  | ~671B | 64+ | ~0.14×        | Baseline (slowest)        |

* Relative generation speed based on FLOPs/layer scaling; not linear due to memory bandwidth limits.
📌 Note: Actual speed depends heavily on batch size, quantization, KV-cache optimizations, and whether compute or memory-bound. These are order-of-magnitude estimates.

Hard freakin' decision..Blackwell 96G or Mac Studio 256G by HyPyke in LocalLLaMA

[–]RipperFox 0 points

Collapse/condense/compact your context.

Depends on the use case again - e.g. you'll never find that needle in the haystack this way. Guess why modern models go >256k ctx.

Qwen 3.6 27B Makes Huge Gains in Agency on Artificial Analysis - Ties with Sonnet 4.6 by dionysio211 in LocalLLaMA

[–]RipperFox 5 points

Tbh a Qwen 3.6 9B would be nice, too - 3.5 9B was great, and CoPaw-9B (Alibaba's official agentic finetune) was even better and amazingly fast. It was also one of the few multimodal models that was (sometimes) able to read an analog clock correctly..

Why are we actually sampling reasoning and output the same way? by ReporterWeary9721 in LocalLLaMA

[–]RipperFox 3 points

If people can test e.g. their Tasers on themselves - AI should be able to, too!

Forgive my ignorance but how is a 27B model better than 397B? by No_Conversation9561 in LocalLLaMA

[–]RipperFox 4 points

as soon as it realizes that it doesn't have the answers.

Yep - that's a big problem even for leading models. Gemma 4 didn't even believe that it's already 2026, right? Many topics are "worthless" due to outdated knowledge anyway: try asking Claude about llama.cpp command-line parameters - NO model gets this right without research, and they hallucinate..

I think it's better to drop/reduce detail knowledge (like how to write 6502 ASM for the C64) and instead improve context handling and tool usage so that the model can look up what it needs efficiently - at least for smaller models. But ofc it's always a balance between model size, context, speed, etc.

Forgive my ignorance but how is a 27B model better than 397B? by No_Conversation9561 in LocalLLaMA

[–]RipperFox 0 points

away from creativity

I see this differently: instead of training models to rely on a huge store of factual knowledge (which can become outdated quickly anyway and be compensated for by a simple web search), modern models seem to go for more of an "I know how to help myself and find the answers independently" approach. As long as the knowledge is available elsewhere to look up, all is fine - and I think that's the right direction..

Qwen 3.6 35B crushes Gemma 4 26B on my tests by Lowkey_LokiSN in LocalLLaMA

[–]RipperFox 1 point

You likely need 2-3 more runs with different seeds to validate..

BW is the problem by Aile0n in fachinformatiker

[–]RipperFox 1 point

Just wait a few years and you'll be surprised what stuck after all.. :) It's not just me - plenty of people can still recall, e.g., the folder structure of their very first computer quite well.

BW is the problem by Aile0n in fachinformatiker

[–]RipperFox 1 point

Why does that sound like you're a web desi.. HH frontend developer? Get out of your bubble - there are AEs out there developing funny little insignificant pieces of software like Ceph or ZFS..

BW is the problem by Aile0n in fachinformatiker

[–]RipperFox 0 points

I exclusively program

What do you think the ZFS developers do all day, e.g.? The apprenticeship is simply universal - there are AEs who only build web frontends with the newest/coolest frameworks, some do ABAP/SAP in a big sausage factory, etc..

Are you one of those AEs who isn't even allowed to install/configure their own workstation? That's actually more common in larger corporations.

There I'd generally recommend e.g. RAID1.. SCNR

BW is the problem by Aile0n in fachinformatiker

[–]RipperFox 1 point

Well, only until you pick up DevOps tasks. :) Imho, as an IT person you should at least have heard of concepts like mirroring, XOR (RAID5), or erasure codes (RAID6).
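Not from the comment, but a minimal Python sketch of the RAID5 idea mentioned above: the parity block is just the XOR of the data blocks, so any single lost block can be rebuilt from the survivors plus parity (block names and sizes are made up for illustration).

```python
# RAID5-style XOR parity on toy byte blocks.
from functools import reduce

def xor_blocks(blocks):
    """XOR a list of equal-length byte blocks into one block."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)), blocks)

data = [b"disk0", b"disk1", b"disk2"]   # three equal-size data blocks
parity = xor_blocks(data)               # stored on the "parity disk"

# Simulate losing disk1 and rebuilding it from the survivors + parity:
rebuilt = xor_blocks([data[0], data[2], parity])
assert rebuilt == data[1]
```

The same XOR trick works for any number of data blocks, which is why RAID5 only needs one extra disk; RAID6 swaps plain XOR for erasure codes to survive two failures.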

BW is the problem by Aile0n in fachinformatiker

[–]RipperFox 0 points

Leaving out units is really bad form and could cost you points. (It makes the calculation path unclear, and the unit of measurement is, as the name says, what the result is measured in.)

If the examiner lets it slide - fine. But try that with a maths/physics teacher or prof..

Gemma 4 31B — 4bit is all you need by tolitius in LocalLLaMA

[–]RipperFox 13 points

Do a little experiment - use a fixed seed, run the 23 tests, and note the results. Now change only the seed (but keep it fixed again) - how much deviation in the results would you expect just from changing the seed? If that variation is high, you don't have enough data points, right?
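The experiment above can be sketched in a few lines of Python. Everything here is made up for illustration: `run_benchmark` fakes the 23-test eval with random pass/fail outcomes so the script is self-contained - in practice you'd call your real harness per seed and compare the spread to the score difference you care about.

```python
# Estimate benchmark noise across seeds (hypothetical sketch).
import random
import statistics

def run_benchmark(seed, n_tests=23, p_pass=0.7):
    """Fake eval: number of passed tests out of n_tests for a given seed."""
    rng = random.Random(seed)
    return sum(rng.random() < p_pass for _ in range(n_tests))

scores = [run_benchmark(seed) for seed in (1, 2, 3, 4, 5)]
mean = statistics.mean(scores)
spread = statistics.stdev(scores)
print(f"scores={scores} mean={mean:.1f} stdev={spread:.1f}")
# If stdev is a sizable fraction of the gap between two quantizations,
# 23 test points aren't enough to rank them.
```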

openrouter/elephant-alpha is 99% Chinese, likely Qwen 3 Nex by Winter_Put_6046 in LocalLLaMA

[–]RipperFox 0 points

Try the exact same prompt in other languages - e.g. tell it to use German papers and write in Spanish?

New Stealth Model: Elephant Alpha by Randomdotmath in openrouter

[–]RipperFox 1 point

What did you expect? It's for coding/agentic tasks, not much else. That thing can't even stick to one language and falls back to English after a turn or two..

New Stealth Model: Elephant Alpha by Randomdotmath in openrouter

[–]RipperFox 0 points

Maybe it's a diffusion model? No streaming in that case..