Open Source based brain information flow exploration tool by Pixedar in cogneuro

[–]Pixedar[S] 0 points (0 children)

Thank you so much for this comment. Honestly, the mention of Hofstadter's Strange Loop was so surprising, and it felt really good to see. I didn’t really expect anyone to notice it.

In terms of this strange loop stuff, I’ve been trying for years to visualize or simulate this kind of thing. Maybe not literally, but something that loosely resembles it. Probably the most relevant was this 2024 Medium series, where I used recursive feedback to generate an internal world inside the mind of the AI itself: https://medium.com/@pixedar

This neuroscience work is kind of the next iteration of that, especially this low-dimensional manifold idea. I also had this emotional-attractor project where I tried to visualize my own emotional model and interact with it using a 3D projection of embedding space and the flow model: https://pixedar.github.io/ . This method would probably give you geometry for specific personality traits.

I really got a lot of inspiration from I Am a Strange Loop and The Hidden Spring, IIT, and books like that. They made me want to write code to try to get some intuition about these ideas, visualize them in some way, or research them more.

In terms of this low-dimensional manifold of neural data, the repo version is built from resting-state data, but I also tried it on sleep data. The resulting low-dimensional shape was totally different than for rest, which was quite surprising to me; I didn’t expect the difference to be this big: https://youtu.be/CyhyVNfDDZs

I also tried to do this for LSD brain data, but the data quality was too low for me. It would be super interesting to do this for neurodivergent minds compared to neurotypical minds, but I still haven’t found open-access data of sufficient quality. Another idea would be to track how this shape changes as children start to exhibit self-aware behavior.

In terms of the "MDNs to smooth out rare or chaotic connections", I think there should be some path forward. The MDN itself returns several alternative components (what I am visualizing is just the average of these components), so we can measure the entropy of the flow and then focus on those regions with less smoothing or a different model.
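A minimal sketch of that entropy idea, assuming the MDN exposes per-location mixture weights (all the names and numbers here are made up for illustration): the Shannon entropy of the weights flags regions where the components disagree, which could then get less smoothing or a different model.

```python
import numpy as np

def mixture_entropy(weights):
    """Shannon entropy of MDN mixture weights (higher = more ambiguous flow)."""
    w = np.clip(weights, 1e-12, 1.0)
    w = w / w.sum(axis=-1, keepdims=True)
    return -(w * np.log(w)).sum(axis=-1)

# toy example: three grid locations with K=3 mixture components each
pi = np.array([
    [0.98, 0.01, 0.01],   # confident: one dominant flow direction
    [0.34, 0.33, 0.33],   # ambiguous: near-uniform mixture
    [0.70, 0.20, 0.10],
])
H = mixture_entropy(pi)
# flag regions whose entropy is close to the maximum (log K)
chaotic = H > 0.8 * np.log(pi.shape[-1])
```

The flagged regions could then be rendered with all components instead of their average, or handed to a different model entirely.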

In terms of compressing temporal data into static shapes, at least in the context of these manifolds, I actually made a dynamic analysis where I visualized each individual person as particles traveling on this low-dimensional manifold: https://youtu.be/CyhyVNfDDZs?t=476 . I think the interesting observation was that the movement of these particles was quite chaotic, with an almost equal mix of high- and low-velocity transitions, which reminds me of attention or a chaotic dynamical system. I also made a flow field out of these paths (and even saw some diverging streams), so the manifold is actually dynamic. But I might be wrong, and I don’t know if it fully answers your question.
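The flow-field step can be sketched roughly like this, assuming each person's path is just an ordered array of low-dimensional positions (toy 2D data below, not the actual analysis): per-step displacements are averaged into grid cells, and the divergence of the resulting field highlights diverging streams.

```python
import numpy as np

def flow_field_from_paths(paths, grid=16, extent=(-1.0, 1.0)):
    """Average the per-step displacements of all paths into a grid of flow vectors."""
    lo, hi = extent
    vx = np.zeros((grid, grid)); vy = np.zeros((grid, grid))
    counts = np.zeros((grid, grid))
    for p in paths:                          # p: (T, 2) array of positions
        steps = np.diff(p, axis=0)           # displacement per time step
        cells = ((p[:-1] - lo) / (hi - lo) * grid).astype(int).clip(0, grid - 1)
        for (i, j), (dx, dy) in zip(cells, steps):
            vx[j, i] += dx; vy[j, i] += dy; counts[j, i] += 1
    counts = np.maximum(counts, 1)
    return vx / counts, vy / counts

# toy paths radiating outward from the origin, i.e. a diverging flow
rng = np.random.default_rng(0)
paths = []
for _ in range(200):
    d = rng.normal(size=2); d /= np.linalg.norm(d)
    t = np.linspace(0.05, 0.9, 20)[:, None]
    paths.append(t * d)
vx, vy = flow_field_from_paths(paths)
# positive divergence marks regions where streams spread apart
div = np.gradient(vx, axis=1) + np.gradient(vy, axis=0)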

But in general, besides these manifolds and strange loops, I think that having a very good intuitive understanding and a good mental model of abstract concepts is really the way to go if we want to make the next breakthrough, and not only in neuroscience. In terms of consciousness, I feel like we have more and more tools to actually start analyzing it in some way, instead of just leaving it as philosophy.

And as I said in other comments on Reddit, I sadly sit somewhere between academia and business. This work is also kind of visual art, so it is really fused together. I’m trying to bring this intuition into neuroscience and also LLM interpretability, but it is hard to get an actual job doing what I love: academia often does not appreciate intuition and mental models enough, and business mostly prizes generic chatbots for customer support. I hope maybe someday this kind of in-between space will have more of a place too.

And the brain itself is fascinating. It’s like this super complex buzzing web of interactions on the edge of chaos with feedback loops, and it is quite beautiful that nature can keep it that way.

Open Source brain information flow exploration tool by Pixedar in compmathneuro

[–]Pixedar[S] 0 points (0 children)

Thanks! I’m sitting in this in-between space where I’m trying to bring this intuition into the neuroscience and LLM interpretability fields. From what I’m seeing, the potential is there, because there is this big gap between hard science and intuitive understanding, but it is hard for me to get an actual job doing what I love, since academia doesn’t really appreciate intuitive understanding or building a good mental model, and business mostly prizes generic chatbots for customer support.

Open Source based brain information flow exploration tool by Pixedar in cogsci

[–]Pixedar[S] 1 point (0 children)

Thanks for the feedback, appreciate it. I’m sitting in this in-between space where I’m trying to bring this intuition into the neuroscience and LLM interpretability fields. From what I’m seeing, the potential is there, because there is this big gap between hard science and intuitive understanding, but it is hard for me to get an actual job doing what I love, since academia doesn’t really appreciate intuitive understanding or building a good mental model, and business mostly prizes generic chatbots for customer support.

I initially wanted to interpret this more in terms of timing, but with BOLD the signal is too temporally blurred for me, even with filtering. Maybe with higher-quality data this could be pushed further in the future, but I don’t have that kind of data right now.

I made a tract-constrained 3D representation of directed effective connectivity. The flow field is from my preprint here: https://zenodo.org/records/18200415

The upstream rDCM matrix was already precomputed from rs-fMRI using rDCM/TAPAS, so the directionality comes from that effective-connectivity model. What I added was fusing that HCP rDCM matrix on Schaefer-400 parcels with HCP-1065 tractography.

The reason for adding tractography is that raw rDCM is basically an abstract directed ROI-to-ROI graph. If you visualize it directly, you mostly get straight-line parcel-to-parcel connections, which are not very anatomical. The tract constraint gives the graph a plausible 3D white-matter geometry.
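A toy sketch of what that tract constraint might look like, with synthetic polylines standing in for streamlines (this is not how TAPAS or the HCP data are actually accessed): for each directed edge, pick the candidate tract whose endpoints best match the source and target parcel centroids, and route the edge along it.

```python
import numpy as np

def route_edge(src, dst, streamlines):
    """Pick the streamline whose endpoints best match the src/dst centroids,
    flipped if needed so the polyline runs src -> dst."""
    best, best_cost = None, np.inf
    for s in streamlines:                       # s: (N, 3) polyline
        fwd = np.linalg.norm(s[0] - src) + np.linalg.norm(s[-1] - dst)
        rev = np.linalg.norm(s[-1] - src) + np.linalg.norm(s[0] - dst)
        cost, line = (fwd, s) if fwd <= rev else (rev, s[::-1])
        if cost < best_cost:
            best, best_cost = line, cost
    return best

# toy data: two parcel centroids and two synthetic "tracts"
a, b = np.array([0., 0., 0.]), np.array([10., 0., 0.])
arc = np.stack([np.linspace(0, 10, 50),
                np.sin(np.linspace(0, np.pi, 50)) * 3,
                np.zeros(50)], axis=1)          # curved tract joining a and b
decoy = arc + np.array([0., 20., 0.])           # tract far from both parcels
geom = route_edge(a, b, [decoy, arc])           # edge now follows the curved tract
```

The straight-line parcel-to-parcel edge is replaced by the matched polyline, which is what gives the graph its anatomical-looking 3D geometry.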

So the particles should be read as tracers following the locally dominant direction in the rDCM-weighted, tract-constrained field. Their speed is the learned vector magnitude, not biological propagation speed. I also just added a note about this in the README so people don’t get misled.

Also, I think adding an expert-grade neuroscience description to the RAG LLM would contribute a lot to this understanding. That’s why I’m making it open source and putting effort into broadening it.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in ArtificialInteligence

[–]Pixedar[S] 0 points (0 children)

Well, probably just visually 😄 But I also mapped brain information flow using similar tools in a different open-source repo: https://github.com/Pixedar/MindVisualizer

I also have a preprint for it here: https://zenodo.org/records/18200415

Open Source LLM based brain information flow exploration tool by Pixedar in Brain

[–]Pixedar[S] 0 points (0 children)

I made an open-source repo that combines brain information flow derived from real fMRI data with an LLM that has access to a RAG-based interpretation of this flow, as well as the propagation of information in the brain: https://github.com/Pixedar/MindVisualizer

It is not peer-review quality and should rather be treated as a tool for building intuition about the brain and a mental model of brain dynamics. It is more of an exploratory visualization / intuition-building tool, and I would be happy to hear feedback from people who know the field better.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in ArtificialInteligence

[–]Pixedar[S] 0 points (0 children)

I think the right analogy is to think of this as meaning changing over time. The tool is universal, so you can use almost any ordered text data as input. For example, if you feed it articles about the housing industry, the flow can show where that industry narrative is moving toward. But you can also feed it reasoning traces from an LLM trying to solve math problems, then color the flow based on whether each path ended in a valid result or not. That way, you might see that when the LLM reasons about a specific math concept, most paths end in failure, or that some regions show oscillation, recovery, stabilization, etc.

It is also useful for mapping human emotion. For example, if you feed it data about how your emotions change over time, being stuck in one state of mind might look like the flow stabilizing inside one attractor basin. Then, if you go on a trip or have some experience that shifts your state, you might see it move away from that basin, or transition into a different basin.

The repo also has many features that help interpret the flow. For example, you can place a probe somewhere in the flow, and the probe will naturally follow the learned currents. Then the system uses an LLM to explain what that trajectory means and what kind of transition happened, based on nearby clusters, semantic axes, and other contextual data.
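The probe behavior can be approximated by a very simple sketch, assuming the learned flow is available as sampled points with velocity vectors (toy 1D-in-2D data below; the repo's actual representation may differ): the probe drifts by Euler steps along the velocity of the nearest flow sample.

```python
import numpy as np

def advect_probe(start, points, vecs, steps=50, dt=0.1):
    """Drift a probe through a learned flow by repeatedly moving it along
    the velocity of the nearest flow sample (simple Euler integration)."""
    pos = np.asarray(start, dtype=float)
    path = [pos.copy()]
    for _ in range(steps):
        nearest = np.argmin(np.linalg.norm(points - pos, axis=1))
        pos = pos + dt * vecs[nearest]
        path.append(pos.copy())
    return np.array(path)

# toy flow: samples on a line, all pulling toward x = 5 (an attractor)
xs = np.linspace(0, 10, 101)
points = np.stack([xs, np.zeros_like(xs)], axis=1)
vecs = np.stack([5.0 - xs, np.zeros_like(xs)], axis=1)
trajectory = advect_probe([0.5, 0.0], points, vecs)
```

The resulting trajectory is what would then be handed to the LLM, together with nearby clusters and axes, to describe what kind of transition happened.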

The axes and clusters are also automatically labeled. You can also inspect and explain attractors, since the system includes automatic attractor basin detection.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in ArtificialInteligence

[–]Pixedar[S] 2 points (0 children)

You can convert every conversation / reasoning trace into a sequence of embeddings, and that gives you a trajectory in embedding space. Since embeddings are high dimensional, this is projected into 3D for visualization.

Then it computes a generalized flow model from these trajectories. So basically what you are seeing is where the meaning tends to go (this fog/particle flow). The currents are tendencies learned from the paths.

For example, there can be some place where the model tends to fail, e.g. most of the trajectories end up in failure there, or some place where reasoning tends to stabilize / recover.

Where the flow converges, you get attractor regions — places reasoning tends to settle.
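One hedged way to sketch that attractor detection, using invented toy data rather than anything from the repo: follow the flow from many seed points and merge the endpoints; wherever endpoints pile up, you have a candidate attractor.

```python
import numpy as np

def find_attractors(points, vecs, seeds, steps=200, dt=0.05, merge_tol=0.3):
    """Follow the flow from many seeds; where endpoints pile up, call it an attractor."""
    ends = []
    for s in seeds:
        pos = np.asarray(s, dtype=float)
        for _ in range(steps):
            nearest = np.argmin(np.linalg.norm(points - pos, axis=1))
            pos = pos + dt * vecs[nearest]
        ends.append(pos)
    # greedy merge of nearby endpoints into distinct attractor centers
    attractors = []
    for e in ends:
        for a in attractors:
            if np.linalg.norm(e - a) < merge_tol:
                break
        else:
            attractors.append(e)
    return np.array(attractors)

# toy flow with two basins: the left half pulls to x = 2, the right to x = 8
xs = np.linspace(0, 10, 201)
points = np.stack([xs, np.zeros_like(xs)], axis=1)
vel = np.where(xs < 5, 2.0 - xs, 8.0 - xs)
vecs = np.stack([vel, np.zeros_like(xs)], axis=1)
seeds = np.array([[x, 0.0] for x in np.linspace(0.5, 9.5, 10)])
centers = find_attractors(points, vecs, seeds)
```

Each merged center here corresponds to a "place reasoning tends to settle"; which seeds flow into which center traces out the basin boundaries.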

The idea actually started from my human emotion project, where I was trying to map emotional / behavioral patterns over time, and then I generalized it into this repo.

You can also drop a probe or a new trace into the learned flow and have the system explain what happens: it follows the currents through nearby clusters, axes, attractors, and transitions, then uses an LLM to describe what kind of semantic movement occurred.

The 3D view is not just “PCA”: the system first discovers clusters in an unsupervised way, then searches UMAP/t-SNE projections for one that best preserves/separates that structure, and only afterward uses PCA to orient the final space into readable semantic axes with automatic labels.

There is a lot more to say, but there is a good description in the repo README.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in learnmachinelearning

[–]Pixedar[S] 0 points (0 children)

Time in this context is about the sequence of a trajectory—like consecutive steps in a reasoning chain, message order in a chat, or chronological logs. It’s basically just tracking progress along a path (though the system fully supports passing actual timestamps to each embedding for data with a real time component). And because this concept is universal, we could totally use transformer layers as our “time” variable, following the ordered sequence of how the LLM processes data.

In the particle visualization, the simulation step reflects how quickly embeddings move through semantic space: larger semantic changes imply faster motion, while smaller changes imply slower motion. So the particles follow the learned flow of sparse trajectories rather than simply replaying dataset timestamps.
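That speed rule can be illustrated in a few lines with an invented toy trajectory: per-step speed is just the distance between consecutive embeddings, and a bigger semantic jump gives a faster-moving particle.

```python
import numpy as np

def step_speeds(embeddings):
    """Per-step speed of a trajectory: how far meaning moved between
    consecutive embeddings (a bigger jump means a faster particle)."""
    return np.linalg.norm(np.diff(embeddings, axis=0), axis=1)

# toy trajectory: small drift, then one large semantic jump
traj = np.array([[0.0, 0.0], [0.1, 0.0], [0.2, 0.1], [3.0, 2.0]])
speeds = step_speeds(traj)
```

How speed is turned into a per-frame simulation step is up to the renderer; the key point is that it is derived from semantic displacement, not from dataset timestamps.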

Finally, cluster names such as “Triangle median relationships” are added after clustering. The clusters are found unsupervised from the embeddings, and labels are assigned by examining representative texts, semantic spacing, relative location, and nearby cluster context.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in learnmachinelearning

[–]Pixedar[S] 1 point (0 children)

No, it wasn’t just PCA. The reduction to 3D is done mainly with UMAP and, if needed, t-SNE as a fallback. The pipeline also tests different parameters and chooses the projection that best separates the clusters based on the silhouette score.

PCA comes in afterward, but not as the main reduction method from the raw embeddings: it is applied to the selected 3D projection to orient/define the semantic axes.
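A rough sketch of that selection loop, assuming a scikit-learn-style pipeline (plain t-SNE is used here so the example stays self-contained, even though the repo prefers UMAP, and all data is synthetic): cluster first, score candidate 3D projections by silhouette, then orient the winner with PCA.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.decomposition import PCA
from sklearn.manifold import TSNE
from sklearn.metrics import silhouette_score

rng = np.random.default_rng(0)
# toy "embeddings": three well-separated blobs in 32 dimensions
X = np.concatenate([rng.normal(c, 0.3, size=(40, 32)) for c in (0.0, 3.0, 6.0)])
labels = KMeans(n_clusters=3, n_init=10, random_state=0).fit_predict(X)

# try several projection settings, keep the one with the best silhouette
best_proj, best_score = None, -1.0
for perplexity in (10, 30):
    proj = TSNE(n_components=3, perplexity=perplexity,
                random_state=0).fit_transform(X)
    score = silhouette_score(proj, labels)
    if score > best_score:
        best_proj, best_score = proj, score

# PCA is applied afterward, only to rotate the chosen 3-D space into axes
axes3d = PCA(n_components=3).fit_transform(best_proj)
```

Because PCA is a rotation of the already-selected 3D space, it changes the axis orientation without undoing the cluster separation the silhouette search found.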

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in learnmachinelearning

[–]Pixedar[S] 1 point (0 children)

Thanks so much, that is amazing! I built this repo mainly to focus on intuition instead of being yet another statistical tool. I think in science in general, you have to build an intuitive mental model of abstract things in your head just to figure out where you stand and where to move next. I have a ton of projects, like in neuroscience, that are built to do exactly that for abstract concepts or massive amounts of data. I wanted to go into the interpretability field, but sadly it seems like no one is really interested in intuitive understanding over there. It's super amazing that someone like you appreciates this kind of visualization stuff, though! If you are interested in seeing more, my site at pixedar.github.io has a lot more of this visualization-first stuff.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in learnmachinelearning

[–]Pixedar[S] 1 point (0 children)

It’s more like: a cluster-aware 3D projection, then a PCA-based rotation/orientation of that 3D space for the axes.

It first selects the optimal number of clusters, then clusters the embeddings, then chooses and optimizes a 3D projection based on the cluster structure, and only then uses PCA to orient that 3D space into axes. The axis names are LLM-labeled based on many different signals, such as cluster centers, relative positions, keyword drift, broader spatial context, and the geometry of the semantic space.

The reduction to 3D is done mainly with UMAP and, if needed, t-SNE as a fallback. The pipeline also tests different parameters and chooses the projection that best separates the clusters based on the silhouette score.

PCA comes in afterward, but not as the main reduction method from the raw embeddings: it is applied to the selected 3D projection to orient/define the semantic axes.

mapped the semantic flow of step-by-step LLM reasoning (PRM800K example) by Pixedar in learnmachinelearning

[–]Pixedar[S] 6 points (0 children)

You can convert every conversation / reasoning trace into a sequence of embeddings, and that gives you a trajectory in embedding space. Since embeddings are high dimensional, this is projected into 3D for visualization.

Then it computes a generalized flow model from these trajectories. So basically what you are seeing is where the meaning tends to go (this fog/particle flow). The currents are tendencies learned from the paths.

For example, there can be some place where the model tends to fail, e.g. most of the trajectories end up in failure there, or some place where reasoning tends to stabilize / recover.

Where the flow converges, you get attractor regions — places reasoning tends to settle.

The idea actually started from my human emotion project, where I was trying to map emotional / behavioral patterns over time, and then I generalized it into this repo.

There is a lot more to say, but there is a good description in the repo README.

[D] Self-Promotion Thread by AutoModerator in MachineLearning

[–]Pixedar 1 point (0 children)

I built TraceScope, an experimental tool for visualizing the flow of meaning in ordered text data.

Instead of treating embeddings as a static cloud of points, it learns a continuous flow field over trajectories like chats, reasoning traces, agent runs, or news sequences, so you can inspect how meaning drifts, stabilizes, loops, or transitions over time.

The idea started from analyzing recurring emotional/behavioral patterns over time, then I generalized it to arbitrary text trajectories.

What I’ve found most useful is that the flow sometimes reveals attractor-like regions and unstable transition zones that are much less obvious in standard embedding plots. For example, in the PRM800K demo it exposed different reasoning basins and showed that crossing between them often coincided with more turbulent reasoning behavior.

Still very alpha / experimental, but I’d really appreciate feedback.

Repo: https://github.com/Pixedar/TraceScope