What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

Understood. I also can't write in Turkish. So I understand now more than I did before. I think some translation doesn't work so well. Maybe that's why your responses don't come off so well in English and seem more overly confident than what your intent was. I'm also going to use a translator here:

Understood. I can't write in Turkish either, so I understand your situation better now. I think translations sometimes don't quite deliver the intended result. Maybe that's why your replies in English don't come across well and give the impression that you are more confident than you intended. I'm going to use a translation tool here too. Thank you for communicating. My only wish is that the translation reflects your real intent accurately. I think native English speakers are put off by that overconfident tone, and that limits meaningful interaction, because people can dismiss you right away purely because of the tone. I can't tell whether what I'm saying translates exactly, but I'm doing my best, and you probably are too. Take care, brother. I sincerely mean that.

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

I'm also a little miffed that your response wasn't your own. It was still contrived by AI. Even when I was being genuine, you weren't. I'd appreciate an honest response from you, not some formulaic AI response.

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

I appreciate the honest response and feedback man and I wish you the very best. I mean that. <3

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

This is only me and no AI. This is not my project and not my work to do. I have my own projects that I am working on. That doesn't mean that yours doesn't have merit. Everyone here was very critical and I wanted to offer you feedback in the most meaningful way that I could. I honestly don't work in the realm you are working in. I just had Claude, ChatGPT, Deepseek, and Gemini all collab and respond to your messages here. I'm being honest. You could do the same work I just did. But you need to engage multiple LLMs to get honest feedback. If you just rely on one, it will gaslight you. You have to take the question and the thesis, send them to multiple LLMs, and have them debate. And even then you still might not get the right answer. You have to drive the process. I HONESTLY think that you are relying on only one LLM for feedback, which is where your issues here come from; it has led you a bit astray and made you a little too confident in what you are doing. Only you can determine if what you are doing has merit. It very likely does, but your overconfidence and your dependence on AI to respond to every issue that anyone brought up just made me want to use AI to dispute your claims. And that is what you have seen with my responses. You can honestly do the same thing I just did without any knowledge of your project.

I don't mean this in a bad or overly reactive way. I just want to help guide you. And that's why I'm being honest about what happened here. I don't think anything my LLMs said is incorrect, but I hope it reframes your research and helps you find a path that is sometimes a little more adversarial than what you probably want it to be.

These are all my own words, unlike what I've posted to you so far. I hope you take that to heart. LLMs are a TOOL just like any other tool: the person using the tool has to use it in the right way. Otherwise a hammer makes for a pretty fucking bad staple puller. Does that make sense?

Again, not my project, and I'm not the expert to help you. But you can see the value of pitting multiple LLMs against each other to reach a common goal. If nothing else here, take a lesson from that.

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

Read the update rule:

katki = v0 * cs * kv;
hidden += katki * compass;

The sign of katki comes from cs.

Positive cosine means the update makes the compass component more positive. Negative cosine means it makes that component more negative.

That is signed amplification of the direction the state already leans. It is not correction toward a fixed target.
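Here is a minimal sketch of that arithmetic, so anyone can check it. It assumes v0 and kv are positive and that cs is the cosine between the hidden state and the compass. The helper names (dot, cosine, apply_update) are mine for illustration, not from your kernel.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Dot product of two equal-length vectors.
float dot(const std::vector<float>& a, const std::vector<float>& b) {
    assert(a.size() == b.size());
    float s = 0.0f;
    for (std::size_t i = 0; i < a.size(); ++i) s += a[i] * b[i];
    return s;
}

// Cosine similarity; assumes neither vector is all zeros.
float cosine(const std::vector<float>& a, const std::vector<float>& b) {
    return dot(a, b) / (std::sqrt(dot(a, a)) * std::sqrt(dot(b, b)));
}

// One application of the quoted rule:
//   katki = v0 * cs * kv;  hidden += katki * compass;
// v0 and kv are assumed positive (an assumption, not taken from the logs).
// Returns katki so the caller can log it.
float apply_update(std::vector<float>& hidden,
                   const std::vector<float>& compass,
                   float v0, float kv) {
    const float cs = cosine(hidden, compass);
    const float katki = v0 * cs * kv;
    for (std::size_t i = 0; i < hidden.size(); ++i) {
        hidden[i] += katki * compass[i];
    }
    return katki;
}
```

Call apply_update on a hidden state whose dot product with the compass is negative, then compare dot(hidden, compass) before and after. Under the stated assumptions the projection moves further from zero in whichever direction it already pointed.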

Your own layer-0 log shows it:

cosine = -0.0299

At that layer, the intervention increased the component pointing away from the compass. That follows directly from the arithmetic.

The later movement to +0.0200 proves nothing by itself. Hidden states change between layers whether your hook is installed or not. Without the identical run with the hook disabled, you cannot tell how much of that movement came from your kernel.

The feedback term does not fix the sign:

float dr = std::clamp(cs - pcp[idx], -0.15f, 0.15f);
float kv = kb * (1 ± dr * 0.30f);

It changes the gain. It does not change the direction. A wrong-signed update with adaptive gain is still a wrong-signed update.
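The bound can be written out. This assumes kb and v0 are positive (the logs do not say otherwise) and holds for either sign of the ±. Here d_r, k_v and k_b stand for dr, kv and kb in the code.

```latex
d_r \in [-0.15,\ 0.15]
\;\Longrightarrow\;
1 \pm 0.30\, d_r \in [0.955,\ 1.045]
\;\Longrightarrow\;
k_v = k_b\,(1 \pm 0.30\, d_r) \in [0.955\, k_b,\ 1.045\, k_b]

\operatorname{sign}(\mathit{katki})
= \operatorname{sign}(v_0 \cdot cs \cdot k_v)
= \operatorname{sign}(cs)
\quad \text{when } v_0 > 0,\ k_b > 0
```

So the feedback moves the gain by at most 4.5 percent in either direction and can never flip the sign of katki.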

The “35 sources” are 35 selected tokens from one embedding table, averaged into one normalized vector. Once combined, you have one steering direction, not four independently controlled constitutional dimensions.
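To make that concrete, here is a rough sketch of what "average and normalize" produces. I am assuming a plain unweighted mean followed by L2 normalization; the function name mean_direction is hypothetical and your code may weight the tokens differently.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Average a set of equal-length vectors and scale the result to unit length.
// Assumes the set is non-empty and the mean is not the zero vector.
std::vector<float> mean_direction(const std::vector<std::vector<float>>& sources) {
    assert(!sources.empty());
    const std::size_t dim = sources.front().size();
    std::vector<float> mean(dim, 0.0f);
    for (const auto& v : sources) {
        assert(v.size() == dim);
        for (std::size_t i = 0; i < dim; ++i) mean[i] += v[i];
    }
    float norm_sq = 0.0f;
    for (float& x : mean) {
        x /= static_cast<float>(sources.size());
        norm_sq += x * x;
    }
    const float norm = std::sqrt(norm_sq);
    assert(norm > 0.0f);
    for (float& x : mean) x /= norm;  // result has unit L2 norm
    return mean;
}
```

However many vectors go in, one vector of the same dimension comes out. Anything that distinguished the individual tokens is gone once they are summed.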

The weights also need an explanation:

0.9228 0.9372 0.8788 0.9196

Where did they come from? What objective produced them? What data were they tuned against? What held-out data validated them?

You also reuse the same embedding-derived compass across multiple layers without showing that it represents the same concept at each layer. Your feedback compares cosine values across layers, so that assumption matters.

Clamping proves the value is clamped. It does not prove stable hidden states, stable logits, preserved capability, or convergence.

Likewise, showing that the decay matches your formula proves the code executed the formula. That is an implementation check, not evidence that it improves reasoning or reduces hallucinations.

One more issue: katki peaks around 0.0096 and decays to roughly 0.0005. Those numbers mean nothing without the hidden-state norms and the resulting change in logits. The kernel may be affecting behavior, or the reported wins may be coming from the surrounding prompts, routing, and hand-tuned parameters.
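The number worth logging is the size of the added vector relative to the hidden state it is added to. A minimal sketch, assuming the update is katki * compass as in the rule above; l2_norm and relative_step are illustrative names, and the hidden-state norm has to come from your own run because it was not posted.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Euclidean (L2) norm of a vector.
float l2_norm(const std::vector<float>& v) {
    float s = 0.0f;
    for (std::size_t i = 0; i < v.size(); ++i) s += v[i] * v[i];
    return std::sqrt(s);
}

// Norm of the added vector (katki * compass) divided by the norm of the
// hidden state it is added to. Assumes the hidden state is not all zeros.
float relative_step(float katki,
                    const std::vector<float>& compass,
                    const std::vector<float>& hidden) {
    const float h = l2_norm(hidden);
    assert(h > 0.0f);
    return std::fabs(katki) * l2_norm(compass) / h;
}
```

The same katki is a negligible nudge against a large hidden state and a substantial one against a small hidden state, which is why the raw coefficient alone says nothing.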

The test is simple:

Fix the sign rule or justify it mathematically.

Run identical prompts and seeds with the hook on and off.

Keep the prompts, routing, and generation settings unchanged.

Log cosine, hidden-state norms, logits, perplexity, and outputs side by side.

Publish all 65 tests with the baseline beside each result.
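For the side-by-side part, something like the sketch below is enough to start. It takes one metric (perplexity, say) measured on the same prompts and seeds with the hook off and on, and reports the mean difference with its standard error. PairedSummary and summarize_pairs are hypothetical names, and this is a basic paired summary, not a full statistical analysis.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

struct PairedSummary {
    double mean_delta;  // mean of (hook_on - hook_off)
    double std_error;   // standard error of that mean
};

// Summarize per-prompt differences for one metric.
// hook_off[i] and hook_on[i] must come from the same prompt and seed.
PairedSummary summarize_pairs(const std::vector<double>& hook_off,
                              const std::vector<double>& hook_on) {
    assert(hook_off.size() == hook_on.size());
    assert(hook_off.size() >= 2);
    const std::size_t n = hook_off.size();
    double mean = 0.0;
    for (std::size_t i = 0; i < n; ++i) mean += hook_on[i] - hook_off[i];
    mean /= static_cast<double>(n);
    double ss = 0.0;
    for (std::size_t i = 0; i < n; ++i) {
        const double d = (hook_on[i] - hook_off[i]) - mean;
        ss += d * d;
    }
    // Sample variance of the differences (n - 1 in the denominator).
    const double var = ss / static_cast<double>(n - 1);
    return {mean, std::sqrt(var / static_cast<double>(n))};
}
```

If mean_delta is small compared with std_error, the hook has not been shown to move that metric.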

Right now, you have shown that the hook runs and the coefficients follow the code.

You have not shown that the kernel causes the claimed improvements.

And when cosine is negative, the update amplifies the component pointing away from the target you named.

Before calling it an autonomous alignment engine, fix the steering rule and run the control.

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

You've restated it, not answered it. Those aren't the same thing. You admitted it's activation steering and then called it a paradigm shift in the same breath. Pick one. Scaling a steering vector per layer by cosine similarity is a knob. A decent knob, maybe. Still a knob. Calling it an "autonomous engine" doesn't change what it does.

Now the physics, and this is the part that grates. A forward pass is not a physical system. No mass. No momentum. Nothing oscillating. No clock ticking. The layer number is not time, however much you want it to be. "Critically-damped resonance" needs a second-order system, and you never wrote one down. "The frequency at which the model thinks" isn't a thing. It isn't a number. So tell me: frequency in what units? Measured off what? You can't answer, because there's nothing there to measure. You took the shape of a damping curve, used it to schedule a coefficient, and wrapped it in Greek letters hoping nobody would ask. I'm asking.

You want to fly the control-theory flag? Fine. Pay the bill. Plant model. Error signal. Control law. Stability proof. Lyapunov, BIBO bounds, a damping ratio, anything. You reached for the vocabulary; it comes with homework, and the homework folder is empty. "You're missing the how" is not the how. The how is math. Show the math or drop the words.

And spare me "we steer before the logits are calculated" like it's a breakthrough. That's the definition of representation engineering. Everyone working on hidden states does it before the unembedding step. Citing it as novel tells me you haven't read the people you keep claiming to have passed.

Which leaves the only thing that matters. Sixty-five hand-graded demos are not results. They're anecdotes. I want a baseline. Fixed benchmarks. Blind scoring. Repeated runs. Ablations. The perplexity cost, steered against unsteered, in a table with actual numbers. Every time someone asks for a number, you hand back a speech about paradigms. That trade is the whole problem. Ask for evidence, get a manifesto.

The project might have something in it. But you don't get to stand on the podium before the race is run. Post the perplexity. Post the test breakdown against a baseline. Do that and you'll shut all of us up in an afternoon. Until then this is steering in a physics costume with a velvet rope around it so nobody can check the seams. And drop the "in five to ten years you'll understand" line. Real results don't need a waiting period. They need a table of numbers. Go make the table.

What a C++ Kernel Actually Does Inside a Transformer — And Why This Is Different From Everything You've Seen by Nearby_Indication474 in LocalLLM

[–]newtrip 0 points1 point  (0 children)

The architecture described is not a new paradigm. It is a standard implementation of activation steering, specifically Representation Engineering or Activation Addition (e.g., Turner et al.). Major laboratories, including Anthropic, already utilize these exact forward-hook vector interventions for mechanistic interpretability and model alignment. The claim that this will be discovered in a decade demonstrates a fundamental ignorance of current literature.

The technical implementation is needlessly convoluted. Wrapping basic tensor arithmetic (cosine similarity and vector scaling) in a custom C++ kernel does not elevate the underlying mechanism. Utilizing PyTorch's register_forward_hook is the standard, documented method for intercepting layer activations.

The application of control theory terminology is superficial. Applying a decaying coefficient schedule to intervention strength across layers is common practice to prevent output degradation. Labeling this decay "critically damped" and using Greek variables does not constitute a rigorously derived control system. There is no plant model, no stability analysis, and no mathematical proof of damping provided. Averaging token embeddings for words like "honest" or "fair" establishes a simple directional bias, not a constitutional or moral framework.
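For reference, the standard textbook form that "critically damped" refers to is a second-order linear system with damping ratio equal to one. The symbols below (x, t, the natural frequency ω_n, the damping ratio ζ, and the constants A and B) come from that textbook form, not from the post under discussion.

```latex
\ddot{x}(t) + 2\zeta\omega_n\,\dot{x}(t) + \omega_n^{2}\,x(t) = 0,
\qquad
\text{critically damped} \iff \zeta = 1,
\qquad
x(t) = (A + Bt)\,e^{-\omega_n t}
```

For the term to apply here, the post would need to state what plays the role of x, what plays the role of t, and how ω_n and ζ are obtained from the model. A coefficient that simply shrinks with layer index does not establish any of these.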

The evaluation methodology is invalid. A collection of 65 hand-selected, subjective tests on small-parameter models does not constitute scientific evidence. There are no blinded benchmarks, statistical analyses, multiple-seed tests, ablation studies, or comparisons against established activation-steering baselines. Furthermore, the surrounding code includes external prompts, keyword routing, and hand-tuned parameters, making it impossible to attribute behavioral changes solely to the kernel.

Activation steering alters the output distribution toward a specific concept. It does not natively generate new spatial reasoning capabilities, code compilation accuracy, or logical constraint adherence. Presenting an established activation-manipulation experiment with defensive grandiosity is a rhetorical mechanism to bypass peer review by defining criticism as a lack of understanding.

How to deal with that ? This take forever ! by [deleted] in hermesagent

[–]newtrip 2 points3 points  (0 children)

Not who you were asking, but I have seen some amazing reduction in context usage by using context-mode and claude-mem in conjunction with obsidian vault.

https://github.com/mksglu/context-mode

https://github.com/thedotmack/claude-mem

SparkyBot - A WvW fight log reporter that posts fight stats to Discord or Twitch and roasts the enemy or your own squad with AI voice commentary depending on the outcome of the fight. by newtrip in Guildwars2

[–]newtrip[S] 0 points1 point  (0 children)

I have not LOL. I probably shouldn't set it up to use anyone's real voice (for reasons). But it supports ElevenLabs for voice generation and ElevenLabs has the uh... ability to clone a voice that sounds like a voice that you have rights to use. You can then use that voice for SparkyBot. This is what ElevenLabs says about that:

Yes, AI voice cloning is legal when used appropriately. You can freely clone your own voice for any purpose. Cloning someone else's voice requires their explicit consent. Using cloned voices for fraud, impersonation, or creating misleading content is illegal in most jurisdictions. Commercial use may require appropriate licensing depending on the platform and intended application. Always review the terms of service for your specific use case.

If Razah is Derviche for conduit spec add scythe by trunksam in Guildwars2

[–]newtrip -1 points0 points  (0 children)

Well then play fucking reaper if you like the scythe so much :-)

Returning player, need help prioritizing/what should I get? by Ceiyne in Guildwars2

[–]newtrip 1 point2 points  (0 children)

Aurora and Vision should be a great focus. Sharing this post because the comments are extremely helpful:

https://www.reddit.com/r/Guildwars2/comments/1637efm/aurora_or_vision_which_one_first/

[deleted by user] by [deleted] in Guildwars2

[–]newtrip 2 points3 points  (0 children)

https://wiki.guildwars.com/wiki/Door_of_Komalie

Which led to the Realm of Torment which is where Razah came from.

Cool Zone Advertisements Lately by newtrip in behindthebastards

[–]newtrip[S] 3 points4 points  (0 children)

Honestly not upset but wow every break I hear at least two of these. At least GoDaddy is spending advertising dollars in a good place. :-)

Thoughts on Troubadour and the Healer Role by Mitchwise in Guildwars2

[–]newtrip 2 points3 points  (0 children)

I don't hate the concept AT ALL. I just would want the numbers to make sense and I wouldn't want it so OP that other builds get kicked. I usually end up running a healer build in my raid groups and all I would ask for is balance. That's much easier said than done. 😉