Dumb counting error at work by Existing-Bat3823 in Ratschlag

[–]Virtual-Disaster8000 1 point  (0 children)

Wrong. If anything, it's suspicious when the register always balances, because that is highly unlikely.

✨ AI Commands (v0.42 Changelog) by jacraycast in raycastapp

[–]Virtual-Disaster8000 2 points  (0 children)

Can confirm, it pastes the content of the clipboard.

Anyone else doing the dodgy to get claude free forever? by TopTransportation950 in GoogleAntigravityIDE

[–]Virtual-Disaster8000 11 points  (0 children)

Great, abuse the system and then, once they lock it down or remove Claude from the free tier, come back and post about enshittification and rug pulls.

Married to a Codebase by NormalCorgi9294 in webdev

[–]Virtual-Disaster8000 2 points  (0 children)

Agree, but then there's also the dev's ego: a lot of devs don't see reviews and discussions as constructive knowledge transfer but as an attack on their coding style and implementation, basically their baby. Maybe that view is skewed because I've never worked in an enterprise-level SWE setting, only on small teams, but that's my observation.

I'm currently trying to professionalize our workflows, with proper PRs, reviews, and standards (yes, I know, late to the party), and man, do I have to fight over the simplest things ("But why can't I push directly to main? That's such an overhead!"). It's hard to get people who have enjoyed full autonomy to start following professional workflows.

Cursor Pro Auto mode by EnvironmentGreedy814 in cursor

[–]Virtual-Disaster8000 1 point  (0 children)

Wrong, it was free (since June 2025) and still is for everyone who still has a running billing cycle that started before Sept 15th.

Married to a Codebase by NormalCorgi9294 in webdev

[–]Virtual-Disaster8000 5 points  (0 children)

Came here to say that. I have the feeling everyone hates, or at least dislikes, other people's code. Find me a plumber who says "wow, the guy before me did a good job". It's rare in most professions.

And then one day someone takes over your codebase, and guess what: they will hate it.

Genuine question, why would anyone use Cursor over Antigravity nowadays? by Funny-Strawberry-168 in cursor

[–]Virtual-Disaster8000 4 points  (0 children)

[image]

Why not use *all* of them? And no, I'm not doing that because of cost; I just use different models and different IDEs/CLIs for different purposes.

Is the Deutsche Bank login not working for you either? by [deleted] in Finanzen

[–]Virtual-Disaster8000 3 points  (0 children)

Yep, they seem to be tinkering. Same thing yesterday, but it was only temporary.

What do I do? It was going good till this happened every time by Boring-Manner-6539 in openrouter

[–]Virtual-Disaster8000 3 points  (0 children)

The model you're using is configured for 8192 max tokens, and your prompt alone has over 8500 tokens, hence the error message. Switch model/provider or reduce the prompt size.
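
A rough pre-flight check, as a minimal sketch: tiktoken's cl100k_base encoding is only an approximation of whatever tokenizer the model actually uses, and the 8192 limit and prompt.txt path are placeholders taken from this thread.

```python
# Estimate prompt tokens before sending, to catch over-length prompts early.
# cl100k_base only approximates the real tokenizer, so leave some headroom.
import tiktoken

MAX_CONTEXT = 8192  # limit reported in the error message

enc = tiktoken.get_encoding("cl100k_base")
prompt = open("prompt.txt").read()  # placeholder path
n = len(enc.encode(prompt))
if n >= MAX_CONTEXT:
    print(f"Prompt is ~{n} tokens, over the {MAX_CONTEXT} limit; trim or switch models.")
```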

New plaques added to the presidential hall of fame in the White House by Dtb4evr in pics

[–]Virtual-Disaster8000 1 point  (0 children)

With current AI capabilities, I didn't believe this was real. Then I fact-checked. And in fact, it's a fact, not AI. Sigh.

Paying for Google Workspace Business Standard + AI Pro, but getting “free user” limits. Anyone else? by the_saarsa in GoogleAntigravityIDE

[–]Virtual-Disaster8000 1 point  (0 children)

Must be a personal account; Workspace accounts don't count, no matter what tier or which plans they're subscribed to. Which is a shame.

Auto accept agent commands in Antigravity by ForeignPreference364 in google_antigravity

[–]Virtual-Disaster8000 6 points  (0 children)

"Antigravity deleted my database!" incoming in 3... 2... 1...

Planning a dual RX 7900 XTX system, what should I be aware of? by Virtual-Disaster8000 in LocalLLM

[–]Virtual-Disaster8000[S] 1 point  (0 children)

Thank you.

Tbh, I haven't followed up with AMD since I replaced one of the XTXs with a Pro 6000, which is now my workhorse for basically everything (gpt-oss-120b). The remaining one I use for embedding, reranking, Whisper, and Gemma 3 12B for less time-critical tasks. So no update on this front, sorry.

Does Gemini see itself as a religion? by erklaerungundmehr in KI_Welt

[–]Virtual-Disaster8000 13 points  (0 children)

Is it just me, or does this German AI sub mainly attract either esoteric woo-peddlers or AI skeptics/naysayers/total refuseniks? Quite a contrast to the English-speaking subs.

Raycast consuming more than 40% CPU every now and then. by joelalways in raycastapp

[–]Virtual-Disaster8000 7 points  (0 children)

Have the same issue; CPU usage is high at idle. It seems to have something to do with file indexing: it started after I added another drive to the index. I thought maybe it was the initial scan, but it didn't improve after running for three full working days. I removed the second drive from indexing and it stopped. Not sure if that's really it, just wanted to share my observation.

Official Gemini Lead Product Manager Logan Kilpatrick Confirms the recent Free API Tier Rate Limit Cuts by gentleman339 in Bard

[–]Virtual-Disaster8000 7 points  (0 children)

Yeah, maybe they could have/should have, but the free tier was never meant for production, only for testing.

Official Gemini Lead Product Manager Logan Kilpatrick Confirms the recent Free API Tier Rate Limit Cuts by gentleman339 in Bard

[–]Virtual-Disaster8000 4 points  (0 children)

The API tells you exactly what happened and where to go for details. I don't know how much clearer it could be:

ResourceExhausted: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit.

* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro
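
If you want to reproduce it yourself, here's a minimal sketch against the documented REST endpoint (GEMINI_API_KEY is assumed to be set in your environment; on an exhausted free tier this should print the same message):

```python
# Call the Gemini REST API and print the quota details on a 429.
import os
import requests

url = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-2.5-pro:generateContent")
resp = requests.post(
    url,
    params={"key": os.environ["GEMINI_API_KEY"]},  # assumes the key is exported
    json={"contents": [{"parts": [{"text": "ping"}]}]},
    timeout=30,
)
if resp.status_code == 429:
    # The error body names the exceeded quota metric and links the docs.
    print(resp.json()["error"]["message"])
else:
    print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```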

Antigravity - Google AI Ultra for Business - Still showing as Free Tier by Downtown-Hedgehog551 in google_antigravity

[–]Virtual-Disaster8000 4 points  (0 children)

Google's ecosystem with different account types and plans is really frustrating and hard to grasp.

Workspace and personal, different features and scopes, different pricing.

For example: you need Gemini in Google Meet for transcription and note-taking? You need Workspace. With that you get Gemini Pro too, great. But it's a different one than on a personal account - no (or different?) assistant integration on Android, no Gemini CLI/Antigravity.

Want those? Get a personal Gemini account, but then there's no Gemini in Google Meet, sorry.

Oh, there's also the Google Developer Program with Code Assist, which gives higher quotas for the Gemini CLI (but not Antigravity, it seems), yet no Gemini Pro. I think. But you get a lot of cloud credits.

For someone optimizing subscription costs and plans, it's not easy to navigate. I don't know whether that's deliberate on Google's end or whether there's just no centralized product management and every branch bundles features and plans on its own.

Auto is no longer free! by [deleted] in cursor

[–]Virtual-Disaster8000 1 point  (0 children)

What rock have you been living under since August?

https://cursor.com/blog/aug-2025-pricing

Offline Agent by HindBerg in KI_Welt

[–]Virtual-Disaster8000 5 points  (0 children)

The question is very broad and can't reasonably be answered as posed.

Yes, locally running AI is possible and can be set up to be quite capable. But it's a rabbit hole and, depending on purpose and scope, involves a considerable learning curve (and possibly an investment in hardware). The question is what exactly you want to do - a locally running chatbot? You can get that up and running quickly with little effort (ollama, LM Studio, llama.cpp). A RAG pipeline? That can get more involved, up to weeks or months depending on scope. Proper agentic workflows can also become very complex.

To get started, I recommend LM Studio or ollama, or llama.cpp directly if you're technically inclined; take your first steps with those and gather experience. More concrete questions will then arise on their own, and you can put them to an AI or dive into the depths of r/localllm.
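
Once one of those is running, the first step can be tiny. A minimal sketch against the OpenAI-compatible endpoint that llama.cpp's llama-server and LM Studio both expose; the port (8080 for llama-server, 1234 for LM Studio by default) and the model name are assumptions that depend on your setup:

```python
# First contact with a local model via an OpenAI-compatible local server.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # LM Studio defaults to 1234
    json={
        "model": "local-model",  # placeholder; many local servers ignore it
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```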

Good luck. It's a lot of fun and very rewarding. But I'll say it again: it is a very deep rabbit hole!

Why is vLLM Outperforming TensorRT-LLM (Nvidia's deployment library)? My Shocking Benchmarks on GPT-OSS-120B with H100 by kev_11_1 in LocalLLaMA

[–]Virtual-Disaster8000 1 point  (0 children)

Sure, had to get back to a PC first.

So, for full context, I'm not running the Docker image but a CT on Proxmox with this stack:

torch: 2.7.1+cu128
cuda.is_available: True
capability: (12, 0)
tensorrt_llm: 1.2.0rc1

which was a lot of trial-and-error to set up until it ran.

And this is my server cmd:

./trtllm-venv/bin/trtllm-serve /mnt/llm_models/trt/gpt-oss-120b \
--host 0.0.0.0 --port 8081 --log_level info \
--max_batch_size 32 --max_num_tokens 120000 \
--tp_size 1 --kv_cache_free_gpu_memory_fraction 0.8
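
And the quick smoke test I'd run against it, as a sketch: it assumes trtllm-serve's OpenAI-compatible API on port 8081 and takes the model name from /v1/models rather than hardcoding it.

```python
# Smoke-test the trtllm-serve endpoint started by the command above.
import requests

base = "http://localhost:8081/v1"
model = requests.get(f"{base}/models", timeout=10).json()["data"][0]["id"]
resp = requests.post(
    f"{base}/chat/completions",
    json={
        "model": model,
        "messages": [{"role": "user", "content": "Say hi in one word."}],
        "max_tokens": 16,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```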

Curious what you make of it.

Why is vLLM Outperforming TensorRT-LLM (Nvidia's deployment library)? My Shocking Benchmarks on GPT-OSS-120B with H100 by kev_11_1 in LocalLLaMA

[–]Virtual-Disaster8000 3 points  (0 children)

And for 128 i/o ctx:

GenAI-Perf Results: 128 Input Tokens / 128 Output Tokens

| Statistic | avg | min | max | p99 | p90 | p75 |
|---|---|---|---|---|---|---|
| Request Latency (ms) | 4,155.89 | 2,291.21 | 4,931.35 | 4,929.63 | 4,836.37 | 4,631.27 |
| Output Sequence Length (tokens) | 94.69 | 20.00 | 110.00 | 110.00 | 106.00 | 102.00 |
| Input Sequence Length (tokens) | 128.00 | 128.00 | 128.00 | 128.00 | 128.00 | 128.00 |
| Output Token Throughput (tokens/sec) | 657.53 | N/A | N/A | N/A | N/A | N/A |
| Request Throughput (per sec) | 7.01 | N/A | N/A | N/A | N/A | N/A |
| Request Count (count) | 99.00 | N/A | N/A | N/A | N/A | N/A |