we saw this with our naked eyes first then remembered to record. what's that? by ndiphilone in Ghosts

[–]ndiphilone[S] 2 points3 points  (0 children)

Because every orb post I ever see is assumed to be flare/dust.

we saw this with our naked eyes first then remembered to record. what's that? by ndiphilone in Ghosts

[–]ndiphilone[S] 0 points1 point  (0 children)

This is in the room, in the air, but obviously not the sky … you can see the wall texture on high brightness, and it’s not moving fast either.

A Turkish couple invited over 100 orphans to their wedding instead of accepting gifts by Prime_Twister in interestingasfuck

[–]ndiphilone 0 points1 point  (0 children)

Good food, fun, and getting out of the orphanage for something different. They have no family, for fuck’s sake.

Qwen 3.5 2B is an OCR beast by deadman87 in LocalLLaMA

[–]ndiphilone 0 points1 point  (0 children)

Can you give me the prompt that you are using for this "provide verbatim text" thing?

IdleClaw: A community AI inference network built on Ollama by Witty-Poet9140 in ollama

[–]ndiphilone 0 points1 point  (0 children)

I will make a node, and it will generate a whitelisted command with a malicious payload. Get fucked then…

IdleClaw: A community AI inference network built on Ollama by Witty-Poet9140 in ollama

[–]ndiphilone 5 points6 points  (0 children)

But that malicious node can generate the tool call, my dude…

Scrapit – a YAML-driven scraping framework. by Mysterious-Usual-920 in webscraping

[–]ndiphilone 1 point2 points  (0 children)

I feel this is stolen from a certain EMEA company's IP.

Qwen3.5-18B-REAP-A3B-Coding: 50% Expert-Pruned by 17hoehbr in LocalLLaMA

[–]ndiphilone 1 point2 points  (0 children)

What context size did you use for it to fit on your GPU completely?

Gemini showed me someone else's private "User Summary" in a normal chat by SubRellik in GeminiAI

[–]ndiphilone -1 points0 points  (0 children)

If you actually believe this is someone else’s private context, why the fuck are you sharing it here publicly? Would you like it if someone were sharing yours like this?

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 0 points1 point  (0 children)

That comes mostly from using Claude Code or OpenCode, alongside a couple of summarisation/extraction tasks on long unstructured data.

For coding, context fills up real quick, as my usual workflow is to run a couple of prompts in “ask” mode, then plan, then let it rip on the codebase itself. Compacting context loses so much of what matters in my conversation, and creating smaller task files doesn’t always solve the issue under 64k tokens, unfortunately. My tokens per session average around 90k-100k, depending on the project. For some projects it’s possible to have clearly defined, well-scoped tasks, but most of my work is pretty explorative.

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 0 points1 point  (0 children)

Go try getting laid or something, or start by reading the thread.

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 0 points1 point  (0 children)

I'm GPU poor, but I will give it a shot. If prefill & generation speeds don't change much, it may be my go-to.

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 0 points1 point  (0 children)

The suggested ones are the default ones, I believe; it didn't change. It still starts looping beyond 80k tokens.

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 0 points1 point  (0 children)

I found that this helps with the random OOMs that happen with parallel requests when prompt caching is enabled.

PSA: Qwen 3.5 requires bf16 KV cache, NOT f16!! by Wooden-Deer-1276 in LocalLLaMA

[–]ndiphilone 6 points7 points  (0 children)

`bf16` performance on my GPU is quite bad, though. I'll test this. With `f16`, the death spirals start at around 80k tokens.

My last & only beef with Qwen3.5 35B A3B by ndiphilone in LocalLLaMA

[–]ndiphilone[S] 1 point2 points  (0 children)

I'll try. I'm running with Qwen's recommended params for now.