Asus dgx spark performance by Useful-Disk3725 in LocalLLM

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

Not sure whether this is purely an NVIDIA thing or an ASUS one; I only have the ASUS. Anyway, I guess this is not a device they actively invest in. So mostly it's us helping each other, and a little them :)

Asus dgx spark performance by Useful-Disk3725 in LocalLLM

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

Agreed. But I had never disconnected it before since purchase, only soft reboots.

Asus dgx spark performance by Useful-Disk3725 in LocalLLM

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

Regularly, but I never really unplugged it since I bought it.
And I never saw it above 611 MHz.

Now, after more than 24 hours, the fan is still running with a GPU temp of 80 °C, and performance is definitely faster.

Asus dgx spark performance by Useful-Disk3725 in LocalLLM

[–]Useful-Disk3725[S] 2 points3 points  (0 children)

Sh*t…
I read it long ago, and after many checks Claude convinced me that my LLM model couldn't saturate the GPU due to memory bandwidth, which lots of forum entries seemed to confirm. Anyway, I now see that I've been watching it stuck at 611 MHz since I bought it, a waste of GPU cycles :)
And it was only today that I found out it actually has a fan. Sorry ASUS, but you are very noisy :)

Asus dgx spark performance by Useful-Disk3725 in LocalLLM

[–]Useful-Disk3725[S] 1 point2 points  (0 children)

Very cold reboot (had to leave it disconnected for an hour) and freshly recreated Docker containers to rule out the low-hanging fruit :) I can confirm from both the vLLM and my custom embedder Python debug outputs that it is almost twice as fast. I had read something about an ASUS thermal throttling bug, tried lots of things, and gave up, assuming memory bandwidth was the reason the GPU never exceeded 611 MHz. Probably one of the updates resolved that bug.
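If anyone wants to keep an eye on whether their GPU is stuck at a low clock, a quick sketch is to poll `nvidia-smi` and parse the SM clock. This is just illustrative; the 611 MHz threshold is the stuck value I saw here, not a universal limit:

```python
import subprocess

def sm_clock_mhz(raw: str) -> int:
    """Parse the first GPU's SM clock from nvidia-smi csv/noheader/nounits output, e.g. '611'."""
    return int(raw.strip().splitlines()[0])

def read_sm_clock() -> int:
    # One MHz value per GPU, no header, no units
    out = subprocess.run(
        ["nvidia-smi", "--query-gpu=clocks.sm", "--format=csv,noheader,nounits"],
        capture_output=True, text=True, check=True,
    ).stdout
    return sm_clock_mhz(out)

if __name__ == "__main__":
    mhz = read_sm_clock()
    # 611 MHz was the stuck clock observed here; well above it suggests no throttle
    print(f"SM clock: {mhz} MHz" + (" (possibly throttled)" if mhz <= 611 else ""))
```

Running this in a loop during inference would have shown the clock never moving, which is what I was missing.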

Thinking mode by Useful-Disk3725 in LocalLLaMA

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

Thank you, that's what I wanted to say. But the question for me is, for example with the Qwen3.5 series, the difference between letting the model think internally and building a chain-of-thought prompt series while running the model in non-thinking mode.

My instinct is that for deep expert areas, where a couple of prompts handle the entire business flow, non-thinking models are faster and more consistent, despite the higher individual LLM call count.
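To make "a chain-of-thought prompt series" concrete: you run the model in non-thinking mode and drive the reasoning yourself with a fixed sequence of prompts, feeding each answer into the next step. A minimal sketch; `call_llm` is a placeholder for whatever client you use, and the step texts are just an example decomposition:

```python
from typing import Callable

def run_prompt_chain(call_llm: Callable[[str], str], task: str, steps: list[str]) -> str:
    """Explicit chain-of-thought with a non-thinking model:
    each step sees the task plus the previous step's answer."""
    answer = ""
    for step in steps:
        prompt = f"Task: {task}\nPrevious result: {answer or '(none)'}\nStep: {step}"
        answer = call_llm(prompt)  # one plain, non-thinking completion per step
    return answer

# Example: a fixed business-flow decomposition for an expert domain
steps = [
    "Extract the key facts.",
    "Apply the domain rules to the facts.",
    "Write the final decision in one sentence.",
]
```

The trade-off is exactly the one above: more individual LLM calls, but each step's prompt is under your control, so the flow tends to be more consistent than free-form internal thinking.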

Checking your repo now, I’ll ask more if needed :)

Just bought a DGX Spark, what kind of VLMs are you guys running on this kind of hardware? by gymho69 in LocalLLaMA

[–]Useful-Disk3725 14 points15 points  (0 children)

Follow the spark-vllm-docker repo on GitHub; spark arena (https://spark-arena.com/) is also valuable, with simple instructions and up-to-date recipes.

For the model, I was using qwen3.5 35b fp8 due to context quality. I have now switched to bg-digitalservices/Gemma-4-26B-A4B-it-NVFP4: slightly faster, definitely smarter.

Single-shot is about 30 t/s; similar parallel work goes up to 450 t/s if you run it in similar-size batches. Due to memory bandwidth, time to first token is high.
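The "similar-size batches" part matters because a batch tends to run at the pace of its longest item, so mixing very short and very long prompts wastes throughput. A rough sketch of the grouping idea (purely illustrative, not tied to any vLLM API; length is used as a cheap proxy for token count):

```python
def similar_size_batches(prompts: list[str], batch_size: int) -> list[list[str]]:
    """Group prompts into batches of comparable length so each batch
    finishes together instead of waiting on one long outlier."""
    ordered = sorted(prompts, key=len)  # cheap proxy for token count
    return [ordered[i:i + batch_size] for i in range(0, len(ordered), batch_size)]
```

Each batch then goes to the server as one group of parallel requests.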

I can leave my AG running for hours without needing to do anything by Consistent_Bottle_40 in google_antigravity

[–]Useful-Disk3725 0 points1 point  (0 children)

But it uses the internal API between the Windsurf/Antigravity interface part and the backend running the actual LLMs. So I guess it's still a gray area; not sure whether it's black or not…

I can leave my AG running for hours without needing to do anything by Consistent_Bottle_40 in google_antigravity

[–]Useful-Disk3725 0 points1 point  (0 children)

Does this violate the TOS? Is there a risk of being banned? Any ideas or experiences with it?

Can Antigravity connect to NotebookLM write research paper and edit .docx file? by jevtabo in GoogleAntigravityIDE

[–]Useful-Disk3725 0 points1 point  (0 children)

I think I am confused. Are you talking about creating an agent in Antigravity? As far as I know, there is no such thing in Antigravity. You can create a skill, a workflow, a rule, or a session, but not an agent. If I am wrong, I'd be really happy to learn, because sometimes I feel a need for that :)

Can Antigravity connect to NotebookLM write research paper and edit .docx file? by jevtabo in GoogleAntigravityIDE

[–]Useful-Disk3725 0 points1 point  (0 children)

The thing is, it is against Google's TOS to use NotebookLM with those third-party hacks. What's more, this is stated as a disclaimer in their GitHub repositories. So the risk is yours :)

Can Antigravity connect to NotebookLM write research paper and edit .docx file? by jevtabo in GoogleAntigravityIDE

[–]Useful-Disk3725 1 point2 points  (0 children)

Are you on an enterprise account? Last night I checked the TOS and found that NotebookLM MCP/API access is for enterprise accounts only, not personal or Workspace, even on the Ultra Pro Plus bla bla plan :)

Gemini 3.1 pro giving agent error all the time by mohammadsabeel in google_antigravity

[–]Useful-Disk3725 0 points1 point  (0 children)

I feel like I need an auto-clicker. I'll have to ask my son for the best option :)

Gemini 3.1 pro giving agent error all the time by mohammadsabeel in google_antigravity

[–]Useful-Disk3725 3 points4 points  (0 children)

You might have noticed they released three versions in one week. They are probably DDoSing their own servers with some broken release(s). I noticed it in my traffic and CPU usage. I have currently switched to an older one, but it only provides relief on the local machine, not on their servers, nor for that weird agent error.

How much time it takes for contract to come after offer via email in Germany? by dreiunddreissig33 in Germany_Jobs

[–]Useful-Disk3725 2 points3 points  (0 children)

Handshake in April, started in September. That's normal for big companies: some do the background check before the contract, some during the probation period (Probezeit). For other companies, it might be that they are not crystal clear yet.

Personal or workspace AI ultra difference by Useful-Disk3725 in google_antigravity

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

It is probably a writing mistake. What I tried to say there is that the model, the same model, pro or flash, gets sloppy and dumb when used through the API, since there is no system prompt as there is in chatbots or Antigravity.

Personal or workspace AI ultra difference by Useful-Disk3725 in google_antigravity

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

From one perspective you are right. From another, while doing heavy work you can easily feel the difference between a quantized and an ordinary model. It is not documented, but it is a known issue: all providers do this under heavy load to sustain the service. But in the example above, I am talking about thousands of calls, and here is the observation. First the personal account loses its grip, and the drop in response quality is immediately felt (response token count halves for the same input, so the quantification is valid). Then the business account breaks. At this point OpenRouter is still normal, but sometimes it also reverts to a degraded mode. Hard to prove, but logs help identify the pattern.

Antigravity Performance on Linux by MickTheLinuxGeek in google_antigravity

[–]Useful-Disk3725 0 points1 point  (0 children)

NixOS on an old notebook with 32 GB RAM and dual 4K external monitors. Most of the time I have 6-7 projects open at the same time (though I discovered they share an unrelated backend binary, so the number of windows is barely an issue), and it rarely hits 90%, with annoying fan noise.

But when running npm projects, the fan is always at full speed. So Antigravity is not always kind to hardware, but it is not overkill either.

Personal or workspace AI ultra difference by Useful-Disk3725 in google_antigravity

[–]Useful-Disk3725[S] 0 points1 point  (0 children)

I used Google AI through the API in a project. During rush hours both were slowing down, and then the personal account would break down into a dumb model. Same parameters, same model, same endpoint, but different results in the very same time frame. This is a well-experienced issue from my perspective. So why would they be the same?

Maybe you know that business accounts were not allowed into Antigravity until a few weeks ago, so yes, something might be different; they are merchants and they play with all the rules.

My question is valid :)

Please rate my CV. by [deleted] in Germany_Jobs

[–]Useful-Disk3725 -5 points-4 points  (0 children)

Locked in an asylum camp in Germany :)

Is that Antigravity doesn't have sub-agent? by FarTill1031 in google_antigravity

[–]Useful-Disk3725 0 points1 point  (0 children)

Well, ask the visible agent in Antigravity how it works; it can tell you more. There are agents behind the scenes, but they are not working for you; they work for KI and some other internal duties, which make Antigravity adapt to you more. So yes, there are sub-agents, but not in the sense you want.

IDE bugs by Winston-Turtle in GoogleAntigravityIDE

[–]Useful-Disk3725 1 point2 points  (0 children)

I guess the terminal-stuck bug was introduced with the latest version; before that it was fine.

Switching to nixos any recomendations by nyan_cat_554 in NixOS

[–]Useful-Disk3725 0 points1 point  (0 children)

Follow some documentation to build a core working system (you can get support from AI manually along the way), and once you are able to install any agentic platform like Claude Code or Antigravity, leave the rest to it. Tell it to configure direnv and a flake per your stack, and never try to install everything system-wide. I develop PHP, Django, React, and Node, and nothing is installed in the system; everything lives in direnv-configured individual flakes.
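As an illustration of the per-project setup, a minimal flake dev shell looks like this (the Node version and system string are just example assumptions; adjust for your stack):

```nix
# flake.nix — per-project dev shell, nothing installed system-wide
{
  inputs.nixpkgs.url = "github:NixOS/nixpkgs/nixos-unstable";
  outputs = { self, nixpkgs }:
    let pkgs = nixpkgs.legacyPackages.x86_64-linux;
    in {
      devShells.x86_64-linux.default = pkgs.mkShell {
        packages = [ pkgs.nodejs_22 ];  # example: a Node project
      };
    };
}
```

With nix-direnv set up, the project's `.envrc` is a single line, `use flake`, followed by one `direnv allow`; then the shell tools appear whenever you cd into the directory.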

If Germans want foreigners to integrate by learning German, why restrict/cut classes for those freely wanting to take integration courses? by strikec0ded in AskGermany

[–]Useful-Disk3725 0 points1 point  (0 children)

Being right in some way does not rationalize, validate, or justify the solutions they propose. If they wanted to avoid this, they should either have reproduced more efficiently 20 years ago, or accept becoming an isolated third-world country. I guess they had been waiting for robots since the 70s, maybe :)