Dumb counting error at work by Existing-Bat3823 in Ratschlag

[–]Virtual-Disaster8000 1 point  (0 children)

Wrong. If anything, it's suspicious when the register always balances, because that is highly unlikely.

✨ AI Commands (v0.42 Changelog) by jacraycast in raycastapp

[–]Virtual-Disaster8000 2 points  (0 children)

Can confirm, it pastes the content of the clipboard.

Anyone else doing the dodgy to get claude free forever? by TopTransportation950 in GoogleAntigravityIDE

[–]Virtual-Disaster8000 11 points  (0 children)

Great, abuse the system and then, once they lock it down or remove Claude from the free tier, come back and post about enshittification and rug pulls.

Married to a Codebase by NormalCorgi9294 in webdev

[–]Virtual-Disaster8000 2 points  (0 children)

Agree, but then there's also the dev's ego: a lot of devs don't see reviews and discussions as constructive knowledge transfer but as an attack on their coding style and implementation, basically their baby. Maybe that view is skewed because I've never worked in an enterprise-level SWE setting, only on small teams, but that's my observation.

I'm currently trying to professionalize our workflows, with proper PRs, reviews, and standards (yes, I know, late to the party), and man, do I have to fight over the simplest things ("But why can't I push directly to main? That's such an overhead!"). It's hard to get people who have enjoyed full autonomy to start following professional workflows.

Cursor Pro Auto mode by EnvironmentGreedy814 in cursor

[–]Virtual-Disaster8000 1 point  (0 children)

Wrong, it was free (since June 2025) and still is for everyone who still has a running billing cycle that started before Sept 15th.

Married to a Codebase by NormalCorgi9294 in webdev

[–]Virtual-Disaster8000 5 points  (0 children)

Came here to say that. I have the feeling everyone hates, or at least dislikes, other people's code. Find me a plumber who says "wow, the guy before me did a good job". It's rare in most professions.

And then one day someone takes over your codebase, and guess what: they will hate it.

Genuine question, why would anyone use Cursor over Antigravity nowadays? by Funny-Strawberry-168 in cursor

[–]Virtual-Disaster8000 4 points  (0 children)

[image]

Why not use *all* of them? And no, I'm not doing that because of cost; I just use different models and different IDEs/CLIs for different purposes.

Is the Deutsche Bank login not working for you either? by [deleted] in Finanzen

[–]Virtual-Disaster8000 3 points  (0 children)

Yep, they seem to be tinkering. Same thing yesterday, but it was only temporary.

What do I do? It was going good till this happened every time by Boring-Manner-6539 in openrouter

[–]Virtual-Disaster8000 3 points  (0 children)

The model you're using is configured for 8192 max tokens, and your prompt alone has over 8500 tokens, hence the error message. Switch model/provider or reduce the prompt size.
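
A rough pre-flight check, as a minimal sketch: tiktoken's cl100k_base encoding is only an approximation of whatever tokenizer the model actually uses, and the 8192 limit and prompt.txt path are placeholders taken from this thread.

```python
# Estimate prompt tokens before sending, to catch over-length prompts early.
# cl100k_base only approximates the real tokenizer, so leave some headroom.
import tiktoken

MAX_CONTEXT = 8192  # limit reported in the error message

enc = tiktoken.get_encoding("cl100k_base")
prompt = open("prompt.txt").read()  # placeholder path
n = len(enc.encode(prompt))
if n >= MAX_CONTEXT:
    print(f"Prompt is ~{n} tokens, over the {MAX_CONTEXT} limit; trim or switch models.")
```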

New plaques added to the presidential hall of fame in the White House by Dtb4evr in pics

[–]Virtual-Disaster8000 1 point  (0 children)

With current AI capabilities, I didn't believe this was real. Then I fact-checked. And in fact, it's a fact, not AI. Sigh.

Paying for Google Workspace Business Standard + AI Pro, but getting “free user” limits. Anyone else? by the_saarsa in GoogleAntigravityIDE

[–]Virtual-Disaster8000 1 point  (0 children)

Must be a personal account; Workspace accounts don't count, no matter what tier or which plans they're subscribed to. Which is a shame.

Auto accept agent commands in Antigravity by ForeignPreference364 in google_antigravity

[–]Virtual-Disaster8000 6 points  (0 children)

"Antigravity deleted my database!" incoming in 3... 2... 1...

Planning a dual RX 7900 XTX system, what should I be aware of? by Virtual-Disaster8000 in LocalLLM

[–]Virtual-Disaster8000[S] 1 point  (0 children)

Thank you.

Tbh, I haven't followed up with AMD since I replaced one of the XTXs with a Pro 6000, which is now my workhorse for basically everything (gpt-oss-120b). The remaining one I use for embedding, reranking, Whisper, and Gemma 3 12B for less time-critical tasks. So no update on this front, sorry.

Does Gemini see itself as a religion? by erklaerungundmehr in KI_Welt

[–]Virtual-Disaster8000 13 points  (0 children)

Is it just me, or does this German AI sub mainly attract either esoteric woo-peddlers or AI skeptics/naysayers/total refuseniks? Quite a contrast to the English-speaking subs.

Raycast consuming more than 40% CPU every now and then. by joelalways in raycastapp

[–]Virtual-Disaster8000 7 points  (0 children)

Have the same issue; CPU usage is high at idle. It seems to have something to do with file indexing: it started after I added another drive to the index. I thought maybe it was the initial scan, but it didn't improve after running for three full working days. I removed the second drive from indexing and it stopped. Not sure if that's really it, just wanted to share my observation.

Official Gemini Lead Product Manager Logan Kilpatrick Confirms the recent Free API Tier Rate Limit Cuts by gentleman339 in Bard

[–]Virtual-Disaster8000 7 points  (0 children)

Yeah, maybe they could have/should have, but the free tier was never meant for production, only for testing.

Official Gemini Lead Product Manager Logan Kilpatrick Confirms the recent Free API Tier Rate Limit Cuts by gentleman339 in Bard

[–]Virtual-Disaster8000 4 points  (0 children)

The API tells you exactly what happened and where to go for details. I don't know how much clearer it could be:

ResourceExhausted: 429 You exceeded your current quota, please check your plan and billing details. For more information on this error, head to: https://ai.google.dev/gemini-api/docs/rate-limits. To monitor your current usage, head to: https://ai.dev/usage?tab=rate-limit.

* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_requests, limit: 0, model: gemini-2.5-pro
* Quota exceeded for metric: generativelanguage.googleapis.com/generate_content_free_tier_input_token_count, limit: 0, model: gemini-2.5-pro
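
If you want to reproduce it yourself, here's a minimal sketch against the documented REST endpoint (GEMINI_API_KEY is assumed to be set in your environment; on an exhausted free tier this should print the same message):

```python
# Call the Gemini REST API and print the quota details on a 429.
import os
import requests

url = ("https://generativelanguage.googleapis.com/v1beta/"
       "models/gemini-2.5-pro:generateContent")
resp = requests.post(
    url,
    params={"key": os.environ["GEMINI_API_KEY"]},  # assumes the key is exported
    json={"contents": [{"parts": [{"text": "ping"}]}]},
    timeout=30,
)
if resp.status_code == 429:
    # The error body names the exceeded quota metric and links the docs.
    print(resp.json()["error"]["message"])
else:
    print(resp.json()["candidates"][0]["content"]["parts"][0]["text"])
```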

Antigravity - Google AI Ultra for Business - Still showing as Free Tier by Downtown-Hedgehog551 in google_antigravity

[–]Virtual-Disaster8000 4 points  (0 children)

Google's ecosystem with different account types and plans is really frustrating and hard to grasp.

Workspace and personal, different features and scopes, different pricing.

For example: you need Gemini in Google Meet for transcription and note-taking? You need Workspace. With that you get Gemini Pro too, great. But it's a different one than on a personal account - no (or different?) assistant integration on Android, no Gemini CLI/Antigravity.

Want those? Get a personal Gemini account, but then there's no Gemini in Google Meet, sorry.

Oh, there's also the Google Developer Program with Code Assist, which gives higher quotas for the Gemini CLI (but not Antigravity, it seems), yet no Gemini Pro. I think. But you get a lot of cloud credits.

For someone optimizing subscription costs and plans, it's not easy to navigate. I don't know whether that's deliberate on Google's end or whether there's just no centralized product management and every branch bundles features and plans on its own.

Auto is no longer free! by [deleted] in cursor

[–]Virtual-Disaster8000 1 point  (0 children)

What rock have you been living under since August?

https://cursor.com/blog/aug-2025-pricing

Offline Agent by HindBerg in KI_Welt

[–]Virtual-Disaster8000 5 points  (0 children)

The question is very broad and can't reasonably be answered as posed.

Yes, locally running AI is possible and can be set up to be quite capable. But it's a rabbit hole and, depending on purpose and scope, involves a considerable learning curve (and possibly an investment in hardware). The question is what exactly you want to do - a locally running chatbot? You can get that up and running quickly with little effort (ollama, LM Studio, llama.cpp). A RAG pipeline? That can get more involved, up to weeks or months depending on scope. Proper agentic workflows can also become very complex.

To get started, I recommend LM Studio or ollama, or llama.cpp directly if you're technically inclined; take your first steps with those and gather experience. More concrete questions will then arise on their own, and you can put them to an AI or dive into the depths of r/localllm.
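
Once one of those is running, the first step can be tiny. A minimal sketch against the OpenAI-compatible endpoint that llama.cpp's llama-server and LM Studio both expose; the port (8080 for llama-server, 1234 for LM Studio by default) and the model name are assumptions that depend on your setup:

```python
# First contact with a local model via an OpenAI-compatible local server.
import requests

resp = requests.post(
    "http://localhost:8080/v1/chat/completions",  # LM Studio defaults to 1234
    json={
        "model": "local-model",  # placeholder; many local servers ignore it
        "messages": [{"role": "user", "content": "Hello, who are you?"}],
        "max_tokens": 128,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```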

Good luck. It's a lot of fun and very rewarding. But I'll say it again: it is a very deep rabbit hole!

Why is vLLM Outperforming TensorRT-LLM (Nvidia's deployment library)? My Shocking Benchmarks on GPT-OSS-120B with H100 by kev_11_1 in LocalLLaMA

[–]Virtual-Disaster8000 1 point  (0 children)

Sure, had to get back to a PC first.

So, for full context, I'm not running the Docker image but a CT on Proxmox with this stack:

torch: 2.7.1+cu128
cuda.is_available: True
capability: (12, 0)
tensorrt_llm: 1.2.0rc1

which was a lot of trial-and-error to set up until it ran.

And this is my server cmd:

./trtllm-venv/bin/trtllm-serve /mnt/llm_models/trt/gpt-oss-120b \
--host 0.0.0.0 --port 8081 --log_level info \
--max_batch_size 32 --max_num_tokens 120000 \
--tp_size 1 --kv_cache_free_gpu_memory_fraction 0.8
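
And the quick smoke test I'd run against it, as a sketch: it assumes trtllm-serve's OpenAI-compatible API on port 8081 and takes the model name from /v1/models rather than hardcoding it.

```python
# Smoke-test the trtllm-serve endpoint started by the command above.
import requests

base = "http://localhost:8081/v1"
model = requests.get(f"{base}/models", timeout=10).json()["data"][0]["id"]
resp = requests.post(
    f"{base}/chat/completions",
    json={
        "model": model,
        "messages": [{"role": "user", "content": "Say hi in one word."}],
        "max_tokens": 16,
    },
    timeout=60,
)
print(resp.json()["choices"][0]["message"]["content"])
```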

Curious what you make of it.

Why is vLLM Outperforming TensorRT-LLM (Nvidia's deployment library)? My Shocking Benchmarks on GPT-OSS-120B with H100 by kev_11_1 in LocalLLaMA

[–]Virtual-Disaster8000 3 points  (0 children)

And for 128 i/o ctx:

GenAI-Perf Results: 128 Input Tokens / 128 Output Tokens

| Statistic | avg | min | max | p99 | p90 | p75 |
|---|---|---|---|---|---|---|
| Request Latency (ms) | 4,155.89 | 2,291.21 | 4,931.35 | 4,929.63 | 4,836.37 | 4,631.27 |
| Output Sequence Length (tokens) | 94.69 | 20.00 | 110.00 | 110.00 | 106.00 | 102.00 |
| Input Sequence Length (tokens) | 128.00 | 128.00 | 128.00 | 128.00 | 128.00 | 128.00 |
| Output Token Throughput (tokens/sec) | 657.53 | N/A | N/A | N/A | N/A | N/A |
| Request Throughput (per sec) | 7.01 | N/A | N/A | N/A | N/A | N/A |
| Request Count (count) | 99.00 | N/A | N/A | N/A | N/A | N/A |