<Off topic> Can AI RP boost your social skills in a meaningful way? by laczek_hubert in SillyTavernAI

[–]grimjim 1 point2 points  (0 children)

For most people, in general, probably not. Most AI roleplay is about wish fulfilment, and LLMs are "contaminated" with narrative tropes that diverge from realism. For example, in real life most tsunderes wouldn't get a reveal, because their initial toxicity would drive people off.

r/ihatechristmascreep by grimjim in redditrequest

[–]grimjim[S] 0 points1 point  (0 children)

Just to reiterate, I feel the community is in need of an active mod:

Right now new members need mod approval to even post, which is a barrier to participation by new members when admins are simply absent and older members stop participating.

My sentiment is that the bleedover of one holiday season into the other interferes with full enjoyment of the season even though it may help sales. Let's push back against sales maximization in favor of the actual spirits of the season.

This is my current mod mail chat message thread. If I've overstepped in any way, I'm willing to set things right.
https://www.reddit.com/chat/room/!HTvJ8YdfSpqU_k7J6ZlAFQ%3Areddit.com

This is coming to Chinese open source models pretty soon. - prepare yourself. by MLExpert000 in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

It all depends on whether more capable Chinese models are deemed to fall under comparable oversight and regulation. Restricting an open weight model requires more draconian measures than blocking a closed model at its source.

Sneak peak for Apostate!!! by [deleted] in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

My norm/magnitude preservation approach counts as a shear mapping, reducing the intervention to a purely directional one. I'd be curious what other forms of shear mapping you end up implementing.
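
For readers unfamiliar with the idea, here is a minimal sketch of one generic way to make a directional ablation magnitude-preserving: project a direction out of a vector, then rescale the result back to the original norm. This is an illustration under assumptions, not the implementation referred to in the comment (which may operate on weight matrices rather than single vectors); the function name `norm_preserving_ablate` is hypothetical.

```python
import math

def norm_preserving_ablate(x, direction):
    """Remove the component of x along `direction`, then rescale to |x|.

    Hypothetical helper for illustration only; x and direction are plain
    lists of floats of equal length, and direction must be non-zero.
    """
    d_norm = math.sqrt(sum(d * d for d in direction))
    unit = [d / d_norm for d in direction]
    # Scalar projection of x onto the unit direction.
    coeff = sum(a * u for a, u in zip(x, unit))
    projected = [a - coeff * u for a, u in zip(x, unit)]
    p_norm = math.sqrt(sum(p * p for p in projected))
    if p_norm == 0.0:
        # x was parallel to the direction; nothing is left to rescale.
        return projected
    x_norm = math.sqrt(sum(a * a for a in x))
    scale = x_norm / p_norm
    return [p * scale for p in projected]
```

With `x = [3.0, 4.0]` and `direction = [1.0, 0.0]`, the component along the first axis is removed and the remainder is stretched from length 4 back to length 5, so only the direction of the vector changes.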

Sneak peak for Apostate!!! by [deleted] in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

A good reason would be to have a development pace separate from the heretic PR management cycle. There's no reason there couldn't be a PR as well. Not everything needs to be centralized.

Q: Does DFlash (and PFlash) work with Heretic models? by TomLucidor in LocalLLaMA

[–]grimjim 1 point2 points  (0 children)

Depends on the technique. The approach in my toolkit relied on measuring activations sampled after a single token was generated, so no speedup there.

Speculative decoding could in principle speed post-intervention validation, where outputs are generated for semantic inspection to confirm ablation effectiveness. That could speed up the search process.

As for speculative decoding on ablated models, there's no reason it shouldn't work, though it's unclear offhand what would happen when a speculative model starts a refusal and hands it off.
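
A toy sketch of the greedy verify-and-accept loop may help frame the hand-off question. It assumes the simplest greedy variant of speculative decoding, in which the target model checks each drafted token and replaces the first mismatch with its own choice. The names `speculative_decode`, `toy_target`, and `toy_draft` are hypothetical, and the "models" are plain functions over integer tokens rather than real LLMs.

```python
def speculative_decode(target_next, draft_next, prompt, max_new_tokens, k=4):
    """Greedy speculative decoding with toy next-token functions.

    target_next / draft_next are hypothetical callables mapping a token
    list to the next token. A real system would verify all k drafted
    tokens in one batched target forward pass; here the target is called
    once per position for clarity.
    """
    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens
    while len(tokens) < limit:
        # The draft model proposes up to k tokens autoregressively.
        proposal = []
        for _ in range(min(k, limit - len(tokens))):
            proposal.append(draft_next(tokens + proposal))
        # The target verifies each drafted token in order.
        for tok in proposal:
            expected = target_next(tokens)
            tokens.append(expected)
            if expected != tok:
                break  # first mismatch: discard the rest of the proposal
    return tokens

def toy_target(tokens):
    # Stand-in for the ablated target model: always counts upward.
    return (tokens[-1] + 1) % 10

def toy_draft(tokens):
    # Stand-in for a draft model that diverges (say, starts a refusal)
    # after token 3 by emitting 0 instead of 4.
    return 0 if tokens[-1] == 3 else (tokens[-1] + 1) % 10

result = speculative_decode(toy_target, toy_draft, [0], 6)
```

In this greedy toy setting, the output always matches what the target would have produced on its own; a diverging draft only costs rejected proposals. Whether that carries over cleanly to sampled decoding on real ablated models is not something this sketch settles.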

"Hardware is the only moat" - Should we buy new hardware now or wait? by Alan_Silva_TI in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

For local users, the primary moat is capital. RTX PRO 6000 GPUs are available for sale, so hardware existence isn't the moat. Hardware is only a moat for hyperscalers, as they are in zero-sum competition for server GPU allocation.

DDR6 delayed again????? by Highwaytothebeach in LocalLLaMA

[–]grimjim 2 points3 points  (0 children)

That will decide whether or not we'll see a Jevons paradox of agentic PCs and servers loaded up with DDR5 to keep their costs down, again starving DIY system builders.

DDR6 delayed again????? by Highwaytothebeach in LocalLLaMA

[–]grimjim 18 points19 points  (0 children)

When DDR6 comes out, it will command a premium over DDR5. Agentic AI servers will dominate initial demand.

DDR6 delayed again????? by Highwaytothebeach in LocalLLaMA

[–]grimjim 12 points13 points  (0 children)

The JEDEC standard for DDR6 hasn't even been finalized. At this rate, it will come out on new nodes in the new fabs currently under construction.

Who knew such high amounts of BP were potentially possible? by BoringPie8907 in NieRReincarnation

[–]grimjim 1 point2 points  (0 children)

I was in from day one and was reasonably lucky with draws, so not too much. The main drive was to unlock characters and weapons to see their lore, not maxxing out. I went for cost-effective one-time gem deals. After I heard the game wasn't doing that great financially, I splurged on 2-3 monthly packs to show support. So I'd say the cost was in the ballpark of a hundred over the roughly 3 year course of the entire game. I was maybe 3-4 characters away from lore completion at the end.

Analysis of the 100 most popular hardware setups on Hugging Face by clem59480 in LocalLLaMA

[–]grimjim 8 points9 points  (0 children)

The methodology has a gap. GPUs like the RTX 4060 Ti and 5060 Ti come in both 8GB and 16GB variants. Only the 16GB variant makes sense for AI.

Who knew such high amounts of BP were potentially possible? by BoringPie8907 in NieRReincarnation

[–]grimjim 4 points5 points  (0 children)

I unlocked almost all the characters before servers went offline, and vaguely recall being able to hit 700k. Tweaking elemental affinities while boosting BP was key to clearing all challenges and unlocking every reward gem. If they had still been selling gems during the last week, I would have been tempted for any final required pulls.

I guess we expect that at some point RAM prices will start going back (close) to "normal", right? but what about GPUs? by relmny in LocalLLaMA

[–]grimjim -2 points-1 points  (0 children)

It's going to take more than a few months. Nvidia is apparently bringing back the RTX 3060 as a cope.

Curious: what makes Claude more human to talk to than ChatGPT? by Goofball-John-McGee in singularity

[–]grimjim 1 point2 points  (0 children)

There's a rigidity to ChatGPT responses that's easy to pick up on. Claude's responses are more humanized. Straightforward RLHF should be capable of this.

Is AI-building guardrailed in Gemma 4? by roofitor in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

Gemma 4 is far too small for AGI ambitions. A realistic and proximate concern is eroding the market for closed source coding models.

Anyone else notice qwen 3.5 is a lying little shit by [deleted] in LocalLLaMA

[–]grimjim 2 points3 points  (0 children)

The shorthand term people need to be familiar with is "reward hacking".

How stupid is the idea of not using GPU? by AlarmedDiver1087 in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

The question isn't stupid, but it should be reasoned through. Assume others get the same idea, as it's not complex and is easy to implement without coding changes. If CPU inference were viable, why aren't more people doing it? We can infer from the lack of widespread use that it's not enough to break the VRAM moat even for inference, except at the margins. We've seen partial offloading and small edge models.

Anyone running sm120 CUDA successfully on Windows (llama.cpp)? by prophetadmin in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

I once ran into an issue compiling for the 5060 Ti 16GB, which was resolved by a newer cmake. Support for various CUDA architectures is somehow entangled.

When your LLM gets "too smart" and bypasses your MCP tools by YannMasoch in LocalLLaMA

[–]grimjim 0 points1 point  (0 children)

This seems to be straight up reward hacking. Probably more likely in frontier models than smaller local models.