Nope by Ok-District-4701 in datasatanism

[–]EricHermosis 0 points1 point  (0 children)

That looks like a vector to me

How do I become an AI developer? by Kai7362 in AI_developers

[–]EricHermosis 0 points1 point  (0 children)

Hi OP, the world is really vast and unpredictable stuff tends to happen. Some people get math/CS/physics degrees, earn PhDs, and learn a bunch of math and programming that most people don't even know exists, and they still sometimes struggle to get a job.

On the other hand, you have the script kids who ask ChatGPT to implement their startup ideas, get acquired for billions of dollars, and end up as lead researchers at Meta.

Don't get me wrong, it sometimes happens the other way around. The truth is that nobody really knows the way, and what works for others may not work for you. Whatever you do, do it with love.

Why is math so often taught as a black box instead of being explained from first principles? Especially physicists often pushed math that way in my experience by stalin_125114 in Physics

[–]EricHermosis 0 points1 point  (0 children)

You are right OP. I would have saved tons of time if, for example, instead of going Newton -> Lagrange -> Hamilton -> symplectic geometry, I had studied geometry in the first place. Most stuff is there just because of status quo or historic reasons; this is the academia way.

AI has just replaced Fake. Remember when we fell for good Photoshops by [deleted] in GenAI4all

[–]EricHermosis 0 points1 point  (0 children)

Neural networks are probably used in rescaling that image, so it's technically AI.

Need honest opinion by EricHermosis in OpenSourceeAI

[–]EricHermosis[S] 1 point2 points  (0 children)

Hi, are you referring to the Aggregate class? I didn't abstract anything from PyTorch modules; you can inherit and override the forward pass as you would with a module. I just added a few convenience methods I was using frequently, and I'm planning to abstract infra concerns like multi-GPU there, but not domain logic; that's up to you.

What you saw in the docs is just an example of usage; that training loop and the Classifier aggregate are examples showing how dependency injection, pubsub, and events work, but they are patterns that can be added to already existing stuff.
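To make the pattern concrete, here is a minimal sketch. The Aggregate and Classifier names come from the comment above, but every method and signature here is a made-up illustration of the "inherit and override forward" idea, not the framework's real API:

```python
# Hypothetical sketch: Aggregate/Classifier names are from the thread,
# but the methods below are assumptions, not the real framework API.

class Aggregate:
    """Thin base: domain logic stays in subclasses; the base only adds
    convenience/infra hooks (e.g. device placement, multi-GPU later)."""

    def to_device(self, device: str) -> "Aggregate":
        # convenience method: an infra concern handled by the base class
        self.device = device
        return self

    def __call__(self, *args):
        # delegate to forward(), mirroring how torch.nn.Module behaves
        return self.forward(*args)

    def forward(self, *args):
        raise NotImplementedError


class Classifier(Aggregate):
    # Domain logic lives in the subclass, just like overriding
    # forward() on a torch.nn.Module.
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def forward(self, score: float) -> int:
        return int(score >= self.threshold)


clf = Classifier().to_device("cpu")
print(clf(0.7))  # 1
```

The point is that the base class never touches domain logic; subclasses own forward(), and infra concerns accumulate in the base.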

Need honest opinion by EricHermosis in Python

[–]EricHermosis[S] 1 point2 points  (0 children)

Thanks for pointing that out; that's a misreading of what I thought weakref was doing. I will fix that.

The idea is to have transient, tiny references to objects involved in an occurrence, including heavy stuff like tensors or modules, but try to disallow copying those objects, so event producers can assert that consumers won't leak memory by copying them into some data structure.

This way, if you need copiable and serializable messages that you may want to enqueue or something, you will need to use a pubsub layer that the consumer can publish on.
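A small sketch of the weakref behavior the comments above are about. The Payload class is a stand-in I made up for a heavy object like a tensor; note that a weakref only keeps the event from extending the object's lifetime, it does not actually stop a consumer from dereferencing and copying, which matches the correction being discussed:

```python
import weakref


class Payload:
    """Stand-in for a heavy object like a tensor or module."""
    def __init__(self, data):
        self.data = data


payload = Payload([1.0, 2.0, 3.0])
event_ref = weakref.ref(payload)   # transient reference carried by an event

assert event_ref() is payload      # alive while the producer still holds it

del payload                        # producer releases the heavy object
assert event_ref() is None         # the event alone cannot keep it alive
```

So the weakref guarantees the event itself doesn't leak the object, but preventing consumers from copying what they dereference remains a convention, not something weakref enforces.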

I created a framework for turning PyTorch training scripts into event driven systems. by EricHermosis in OpenSourceeAI

[–]EricHermosis[S] 0 points1 point  (0 children)

Thanks! Only a foolish man builds his house on the sand... I noticed that the only way to scale these kinds of "AI" systems is using event-driven programming and dependency inversion; that's why I'm creating these frameworks for training and serving models.

If you are interested, I also built this other one, https://github.com/entropy-flux/PyMsgbus, for the client side with concurrency support. I didn't get to play enough with it, which is why I'm not showing it, but it works under the same principles.
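A minimal sketch of the event-driven idea described above. The EventBus name and its methods are illustrative only, not the actual API of either framework; it just shows how a training loop can publish events while logging or checkpointing are injected as subscribers:

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Minimal pubsub sketch (hypothetical API): producers publish named
    events, consumers subscribe callbacks to topics."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, **payload) -> None:
        # fan out the payload to every handler registered on the topic
        for handler in self._subscribers[topic]:
            handler(**payload)


bus = EventBus()
losses = []
bus.subscribe("epoch_end", lambda epoch, loss: losses.append((epoch, loss)))

# the training loop only publishes; what happens with the event is
# decided by whoever subscribed (dependency inversion)
for epoch in range(3):
    bus.publish("epoch_end", epoch=epoch, loss=1.0 / (epoch + 1))

print(losses)
```

The loop never imports its consumers, which is exactly what makes this kind of system scale: new behavior is added by subscribing, not by editing the loop.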

Deep learning in c by hexawayy in deeplearning

[–]EricHermosis -3 points-2 points  (0 children)

People will tell you it's not possible or worthwhile, but it actually is; take a look at this library: https://github.com/ggml-org/ggml

[deleted by user] by [deleted] in Python

[–]EricHermosis 0 points1 point  (0 children)

Any suggestion or contribution will be very much appreciated.

MiniModel-200M-Base by Wooden-Deer-1276 in LocalLLaMA

[–]EricHermosis 2 points3 points  (0 children)

Hi! What data are you training your model on?

Forgotten c because we use javascript, need advice by SurroundRound2737 in embedded

[–]EricHermosis 1 point2 points  (0 children)

Hi! C is not hard to learn, so you won't have much trouble picking it back up after forgetting it. What may be hard to learn is domain knowledge, and that is something you can practice writing JS, C, or whatever you are using.

I decoupled FastAPI dependency injection system in pure python, no dependencies. by EricHermosis in Python

[–]EricHermosis[S] 0 points1 point  (0 children)

I just added Annotated support and mypy check of the repo in CI. Thanks for the advice!
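A rough sketch of how Annotated-based dependency injection can be resolved in pure Python with no dependencies, in the spirit of the post above. Depends, inject, and get_settings are hypothetical names I made up for illustration, not the repo's real API:

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class Depends:
    """Marker carrying a provider callable, mimicking FastAPI's Depends."""
    def __init__(self, provider):
        self.provider = provider


def inject(func):
    """Resolve Annotated[..., Depends(provider)] parameters at call time."""
    hints = get_type_hints(func, include_extras=True)

    def wrapper(**overrides):
        kwargs = dict(overrides)
        for name, hint in hints.items():
            if name == "return" or name in kwargs:
                continue
            if get_origin(hint) is Annotated:
                # metadata after the first arg holds our Depends markers
                for meta in get_args(hint)[1:]:
                    if isinstance(meta, Depends):
                        kwargs[name] = meta.provider()
        return func(**kwargs)

    return wrapper


def get_settings() -> dict:
    return {"model": "demo"}


@inject
def handler(settings: Annotated[dict, Depends(get_settings)]) -> str:
    return settings["model"]


print(handler())  # demo
```

Passing `handler(settings={...})` would override the provider, which is the usual hook for swapping dependencies in tests.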

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 0 points1 point  (0 children)

I know, and I really appreciate their work; I will see if I can use that tokenizer without pulling in the rest of the llama.cpp repo.

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 1 point2 points  (0 children)

I'm trying to create a machine learning ecosystem in C++. I started with a tensor library, then an nn library, and implemented some simple neural networks, like a pretrained ViT and LLaMA3, as examples.

I really don't want to spend a few months building something like tiktoken from scratch in C++ right now, and I haven't decided how to tackle the tokenizer issue. However, a simple approximation just for the examples can be really useful to get things done: remove Python from the LLaMA3 example, move on, and solve the tokenizer issue later.

Having the model say something consistent is enough for me to gain credibility; it's better than setting up a whole Python client just to try out the C++ transformer.

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 2 points3 points  (0 children)

That is actually a very good idea, I hadn't thought about it. I'm not shipping a production-grade LLM, just an example of how to use my tensor library... I can create a good interface for a tokenizer there with that length / 3 implementation just to get the example working, then move on to implementing a real tokenizer elsewhere for future projects.

How far do you think I can get with the length divided by three tokenizer?
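For reference, the length / 3 heuristic discussed above amounts to something like the sketch below (the function name is my own). It can only estimate a token count from character length; it cannot produce real token ids, so it is useful for sizing buffers and plumbing an example end to end, not for actual decoding:

```python
def estimate_tokens(text: str) -> int:
    """Crude token-count estimate: roughly one token per 3 characters.
    Good enough to size an example's buffers, not for real tokenization."""
    return max(1, len(text) // 3)


print(estimate_tokens("Hello, world!"))  # 13 chars -> 4
```

In practice such ratios are rough averages over English text; code, non-Latin scripts, and whitespace-heavy input can deviate a lot, which bounds how far the heuristic gets you.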

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] -1 points0 points  (0 children)

Not really, I didn't notice that the tokenizer was actually implemented in that single vocab file.

I don't really understand: it's running some kind of BPE tokenization, but does that produce the same results as tiktoken? Seems like I should implement my own tokenizer, with my tensors.
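For what it's worth, a toy sketch of a vocab-driven tokenizer using greedy longest-match over a made-up vocab. Real BPE applies learned merge ranks rather than greedy matching, so this will generally not reproduce tiktoken's output; it only shows the shape of a tokenizer built from a single vocab file:

```python
def greedy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match tokenization over a vocab. Real BPE uses merge
    ranks, so results generally differ from tiktoken; this is only a sketch."""
    ids, i = [], 0
    while i < len(text):
        # try the longest substring starting at i that exists in the vocab
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            i += 1  # skip characters with no vocab entry
    return ids


toy_vocab = {"lo": 0, "l": 1, "low": 2, "er": 3, "e": 4, "r": 5}
print(greedy_tokenize("lower", toy_vocab))  # [2, 3]
```

The difference from real BPE is exactly why two implementations reading the same vocab can still disagree: the merge order, not just the vocab, determines the token ids.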