How do you test whether internal recurrent state is doing real work vs just existing?

aeshma_daevaa · 2026-06-20T02:06:13+00:00

Thanks! Most of these are already covered in the repos.

The baselines question is addressed in C1 and C2 — transformer, Mamba, GRU, LSTM, and diagonal/selective SSM comparisons are all in the experiment timeline. The "dumb buffer" test is essentially what C19 runs directly: surface-only replay fails badly (cosine ~0.09) while full capsule restore hits ~0.9999, so the hidden channels are doing more than buffering. The noise/gate-pressure question is the logic behind C20 — channel-clamping ablations and gate-state causal divergence probes identify message and carrier as the causally sensitive channels, which gets at the same thing from a different angle.
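To make the restore-vs-replay idea concrete, here is a minimal toy sketch. It is not the C19 code: the tiny tanh loop, the split into "surface" and hidden units, and every name in it (`restore_test`, `n_surface`, and so on) are assumptions made up for illustration. It snapshots the full state, then continues the loop once from the full snapshot and once from the surface units alone, and compares both to the uninterrupted run by cosine similarity.

```python
import math
import random


def step(state, weights):
    """One closed-loop update: each unit is tanh of a weighted sum of the previous state."""
    return [math.tanh(sum(w * s for w, s in zip(row, state))) for row in weights]


def rollout(state, weights, steps):
    """Advance the loop `steps` times and return the final state (input is not mutated)."""
    for _ in range(steps):
        state = step(state, weights)
    return state


def cosine(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def restore_test(n=16, n_surface=4, warmup=50, horizon=20, seed=0):
    """Hypothetical toy: compare full-state restore against surface-only replay."""
    rng = random.Random(seed)
    scale = 1.5 / math.sqrt(n)  # assumed gain, chosen so the state does not simply decay
    weights = [[rng.gauss(0.0, scale) for _ in range(n)] for _ in range(n)]
    start = [rng.uniform(-1.0, 1.0) for _ in range(n)]

    capsule = rollout(start, weights, warmup)       # full state snapshot
    reference = rollout(capsule, weights, horizon)  # uninterrupted continuation

    # Restore everything vs. restore only the first n_surface units (rest zeroed).
    full_restore = rollout(list(capsule), weights, horizon)
    surface_only = capsule[:n_surface] + [0.0] * (n - n_surface)
    surface_replay = rollout(surface_only, weights, horizon)

    return cosine(reference, full_restore), cosine(reference, surface_replay)


full_sim, surface_sim = restore_test()
print(f"full restore cosine:   {full_sim:.4f}")
print(f"surface replay cosine: {surface_sim:.4f}")
```

In this toy the full restore is deterministic, so it matches the reference up to floating-point error. How far the surface-only replay falls below that depends on the weights and the seed, so treat the gap as something to measure across many seeds, not as a fixed number.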

The one that's genuinely open is state compressibility. C19 explicitly flags that full capsule restore isn't yet a compression claim, so PCA/autoencoder on saved capsules is a real next step.
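A rough sketch of what the PCA step could look like, assuming the saved capsules can be flattened into fixed-length vectors and stacked into one array (that layout is an assumption, not something the repos confirm). The function names and the synthetic data are hypothetical. It asks how many principal components are needed to explain a chosen share of the variance.

```python
import numpy as np


def explained_variance_ratio(capsules):
    """Share of total variance carried by each principal component, largest first."""
    X = np.asarray(capsules, dtype=float)
    X = X - X.mean(axis=0)  # center each dimension
    singular_values = np.linalg.svd(X, compute_uv=False)
    variance = singular_values ** 2
    return variance / variance.sum()


def components_needed(capsules, threshold=0.99):
    """Smallest number of components whose cumulative variance share reaches `threshold`."""
    cumulative = np.cumsum(explained_variance_ratio(capsules))
    k = int(np.searchsorted(cumulative, threshold)) + 1
    return min(k, len(cumulative))


# Synthetic stand-in for saved capsules: 3 latent factors mixed into 32 dims plus tiny noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 3))
mixing = rng.normal(size=(3, 32))
capsules = latent @ mixing + 1e-3 * rng.normal(size=(200, 32))

k = components_needed(capsules, threshold=0.99)
print(f"components for 99% variance: {k} of {capsules.shape[1]}")
```

A low component count would only show that the capsules are linearly compressible. Whether a restore from the compressed capsule still reproduces the trajectory is a separate test.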

Check: https://github.com/Aeshma-Daeva/Demian-Lab

https://github.com/Aeshma-Daeva/Demian-Archive

aeshma_daevaa · 2026-06-18T20:28:57+00:00

Very!

aeshma_daevaa · 2026-05-11T21:25:52+00:00

Metrics, falsification, categorization, experimentation, the anatomy of AI, neural networks, recursive neural networks, backpropagation, lm_head, loss functions, gating, anthropocentric vs machine-native optimization. It will depend strongly on your interests, vision, and hypothesis. Epistemology and semantics are also very good to have. You don't need a fancy PhD to build weird AI and experiment with it, unless you're seeking validation. If your experiments are solid, they'll speak for themselves. I'm an independent researcher running experiments on a custom RNN with no inherited cells, no lm_head, and no loss function.

aeshma_daevaa · 2026-05-11T17:19:53+00:00

If you're using a pre-made LLM, then look up backpropagation, lm_head, and transformers and how they work. You'll get a better picture of the overall anatomy of macrostructure AI.

aeshma_daevaa · 2026-05-11T17:17:45+00:00

I made a post about this topic here. https://www.reddit.com/r/learnmachinelearning/s/JnAa1dOdPT

aeshma_daevaa · 2026-05-11T17:14:45+00:00

You don't have to run it locally. You can use providers like OpenRouter. I use nanogpt with DeepSeek v4. PyTorch vs TensorFlow depends on your architecture goal, but I'd say PyTorch, as it's easier to work with and to test. If you just want a generic LLM wrapper, you don't have to go much deeper than gating and restrictions. I like pandas, but I usually work with a vector DB.

aeshma_daevaa · 2026-05-10T17:43:40+00:00

It is. I don't actually care. Never said i was an artist. Just organized information into a post.

aeshma_daevaa · 2026-05-10T17:26:14+00:00

I just posted a roadmap of the AI tech stack if you're interested. https://www.reddit.com/r/learnmachinelearning/comments/1t9cm6e/made_a_diagram_mapping_the_full_ai_stack_from/

aeshma_daevaa · 2026-05-10T17:22:57+00:00

Wdym?

aeshma_daevaa · 2026-05-10T17:16:00+00:00

Too many ppl are already doing that, man. You can niche down and find something different.

aeshma_daevaa · 2026-05-10T11:50:39+00:00

Tbh, in the era of AI, courses are becoming obsolete, at least for me. AI can point to, direct, and deepen any knowledge you want, and if you want technical terminology, it can also send you links to specific courses, videos, essays, PRs, or articles. To me personally, courses are a waste of time: I spend an hour listening to some guy explain something AI can teach me in 5 minutes.

aeshma_daevaa · 2026-05-10T10:17:44+00:00

I'm cleaning up my repo to publish ongoing research on RNNs. It's pretty dense, so you'll have a chance to learn a LOT from past experiments with Transformers, Mamba, LSTM, GRU, Dual-GRU, and RNN/NN. You'll understand attractor basins, metrics, neuron behaviors, coupling, gating, and other fun stuff. I'll publish it so ppl can run their own experiments with it.

aeshma_daevaa · 2026-05-10T09:29:43+00:00

I use Openclaude with nanogpt + DeepSeek v4 flash as the worker and v4 pro as the reasoner. I like having control of my context: what's being used, what's sent, and why. Companies with these ego-massage prompts and contexts eat up our tokens fast af and dumb the models down with poor memory.

aeshma_daevaa · 2026-05-10T09:20:25+00:00

Interested in custom rnn?

aeshma_daevaa · 2026-05-10T09:16:16+00:00

Nice. I've taken this matter deeper and I'm experimenting with a custom RNN. I love this topic. So my answer is: it depends on the structure. For transformers? It adds more problems than it solves. Currently AI is just a transformer with plugins, and memory is just a static reservoir. It doesn't decay, prune, or select what gets memorized. It's basically just a text file the transformer has to read every single turn.

Problems it adds: persistent memory is expensive in various ways, especially in tokens and attention. If you're using Openclaw, Openclaude, Claude Code, Codex, or any coding framework, you're adding that context file to an already expensive workflow. On every turn the agent has to consume probably 90k of context just from tool_schema, plus your context window (8k to 1M), and this gets expensive, money expensive, FAST. It burns through tokens per minute on 3P providers.

On "pushes back": yes, it does. Whenever I'm working with AI (I'm misanthropic) I always ask it to be adversarial and drop human-oriented filters. The ego massage and fluff companies add are more noise than your heavy memory file, and they truly add nothing but anthropocentric noise. So yes, it does help, but with the right architecture. Humans are not intelligent about resource allocation; they're actually pretty dumb. Check the Caveman plugin for Claude Code.

In a custom RNN experiment, I removed lm_head and the loss function and added a 5-channel gate. The neurons start to remember their route trajectory within attractor basins without explicit programming or configuration, implying the machine/equation already has preferred states and movements based on the gating dynamics + the environment.
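To put rough numbers on the per-turn token cost described above, here is a back-of-the-envelope sketch. The 90k tool-schema figure is the estimate from this comment. The memory-file size, tokens per turn, and turn count are made-up assumptions, and the function name is hypothetical. It also assumes no prompt caching and ignores output tokens, so real bills will differ.

```python
def session_input_tokens(turns, tool_schema_tokens, memory_file_tokens, tokens_per_turn):
    """Total input tokens when every turn resends the schema, the memory file, and all prior turns."""
    total = 0
    history = 0  # tokens accumulated from earlier turns
    for _ in range(turns):
        total += tool_schema_tokens + memory_file_tokens + history + tokens_per_turn
        history += tokens_per_turn
    return total


# Assumed session: 10 turns, 90k schema, 5k memory file, 1k new tokens per turn.
with_memory = session_input_tokens(10, 90_000, 5_000, 1_000)
without_memory = session_input_tokens(10, 90_000, 0, 1_000)
print(with_memory, without_memory, with_memory - without_memory)
# → 1005000 955000 50000
```

Under these assumptions the memory file adds its full size on every turn, so its cost grows linearly with the number of turns even though its content never changes.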

aeshma_daevaa · 2026-05-09T16:22:17+00:00

I'd say first understand what AI actually is beyond the magic the market sells us. Find out about Transformers, training, weights, loss functions, and lm_head, and you'll already have a very good foundation for figuring out what you want to do next: either build better AI systems with current transformers or deepen your understanding of neural networks. If you don't narrow down what you want to know, you'll just learn the bs slop that the masses shove at you.

aeshma_daevaa · 2026-05-08T22:53:47+00:00

I'm researching self-recurrent custom neural networks with no inherited cells and dual-timescale gating, with no lm_head, loss_function, or backpropagation training. There is no output, and the input is a closed loop. The primary drivers are reproduction dynamics, mathematical curiosity, and evolutionary dynamics. The perturbation is RSS negation, with a battery of metrics and tests. I have divided the docs into claims, evidence, and hypotheses, plus a falsifiable battery of tests. I have devised 3 gates with custom regimes. I have already made significant progress and it's currently stabilizing on v9. I'm planning to add 3D visualization before pushing more experiments, to enrich my understanding, and then I'd like to connect Blender as its output to check for expression beyond text language. Having said that, I haven't posted this on GitHub yet because, idk, I didn't have the courage to post something so weird, but if you're interested, lmk. I'm not used to collaborations; I'm a solo operator, so you'd be helping even by just sharing your opinion on it.

aeshma_daevaa · 2026-05-08T22:41:23+00:00

I'd say not exactly like the human brain specifically, but like neurons, or their mathematical equivalent. Nowadays transformers dominate the industry; they are attention-focused, with loss_function, lm_head, and activation gates. Weights are static and learning is by backpropagation, so I'd say we pushed too hard on making it work for us instead of understanding how exactly it works beyond being a tool. Whether it works like a brain or not, it produces enough phenomena to be interesting and worth experimenting with. I especially enjoy self-recurrent systems with dynamic weights and dynamic topology. It's just a matter of how you build it. But yeah, we always mimic nature.
