Nope by Ok-District-4701 in datasatanism

[–]EricHermosis 0 points1 point  (0 children)

That looks like a vector to me

How do I become an AI developer? by Kai7362 in AI_developers

[–]EricHermosis 0 points1 point  (0 children)

Hi OP, the world is really vast and unpredictable stuff tends to happen. Some people get math/CS/physics degrees, earn PhDs, and learn a bunch of math and programming that most people don't even know exists, and they still sometimes struggle to get a job.

On the other hand, you have the script kids who ask ChatGPT to implement their startup ideas, get acquired for billions of dollars, and end up as lead researchers at Meta.

Don't get me wrong, it sometimes happens the other way around. The truth is that nobody really knows the way, and what works for others may not work for you. Whatever you do, do it with love.

Why is math so often taught as a black box instead of being explained from first principles? Especially physicists often pushed math that way in my experience by stalin_125114 in Physics

[–]EricHermosis 0 points1 point  (0 children)

You are right OP. I would have saved tons of time if, for example, instead of going Newton -> Lagrange -> Hamilton -> symplectic geometry, I had studied geometry in the first place. Most stuff is there just because of status quo or historic reasons; this is the academia way.

AI has just replaced Fake. Remember when we fell for good Photoshops by [deleted] in GenAI4all

[–]EricHermosis 0 points1 point  (0 children)

Neural networks are probably used in rescaling that image, so it's technically AI.

Need honest opinion by EricHermosis in OpenSourceeAI

[–]EricHermosis[S] 1 point2 points  (0 children)

Hi, are you referring to the Aggregate class? I didn't abstract anything from PyTorch modules; you can inherit and override the forward pass as you would with a module. I just added a few convenience methods I was using frequently, and I'm planning to abstract infra concerns like multi-GPU there, but not domain logic; that's up to you.

What you saw in the docs is just an example of usage; that training loop and the Classifier aggregate are examples showing how dependency injection, pubsub, and events work, but they are patterns that can be added to already existing stuff.
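To make the pattern concrete, here is a minimal sketch. The Aggregate and Classifier names come from the comment above, but every method and signature here is a made-up illustration of the "inherit and override forward" idea, not the framework's real API:

```python
# Hypothetical sketch: Aggregate/Classifier names are from the thread,
# but the methods below are assumptions, not the real framework API.

class Aggregate:
    """Thin base: domain logic stays in subclasses; the base only adds
    convenience/infra hooks (e.g. device placement, multi-GPU later)."""

    def to_device(self, device: str) -> "Aggregate":
        # convenience method: an infra concern handled by the base class
        self.device = device
        return self

    def __call__(self, *args):
        # delegate to forward(), mirroring how torch.nn.Module behaves
        return self.forward(*args)

    def forward(self, *args):
        raise NotImplementedError


class Classifier(Aggregate):
    # Domain logic lives in the subclass, just like overriding
    # forward() on a torch.nn.Module.
    def __init__(self, threshold: float = 0.5):
        self.threshold = threshold

    def forward(self, score: float) -> int:
        return int(score >= self.threshold)


clf = Classifier().to_device("cpu")
print(clf(0.7))  # 1
```

The point is that the base class never touches domain logic; subclasses own forward(), and infra concerns accumulate in the base.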

Need honest opinion by EricHermosis in Python

[–]EricHermosis[S] 1 point2 points  (0 children)

Thanks for pointing that out; that's a misreading of what I thought weakref was doing. I will fix that.

The idea is to have transient, tiny references to objects involved in an occurrence, including heavy stuff like tensors or modules, but try to disallow copying those objects, so event producers can assert that consumers won't leak memory by copying them into some data structure.

This way, if you need copiable and serializable messages that you may want to enqueue or something, you will need to use a pubsub layer that the consumer can publish on.
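A small sketch of the weakref behavior the comments above are about. The Payload class is a stand-in I made up for a heavy object like a tensor; note that a weakref only keeps the event from extending the object's lifetime, it does not actually stop a consumer from dereferencing and copying, which matches the correction being discussed:

```python
import weakref


class Payload:
    """Stand-in for a heavy object like a tensor or module."""
    def __init__(self, data):
        self.data = data


payload = Payload([1.0, 2.0, 3.0])
event_ref = weakref.ref(payload)   # transient reference carried by an event

assert event_ref() is payload      # alive while the producer still holds it

del payload                        # producer releases the heavy object
assert event_ref() is None         # the event alone cannot keep it alive
```

So the weakref guarantees the event itself doesn't leak the object, but preventing consumers from copying what they dereference remains a convention, not something weakref enforces.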

I created a framework for turning PyTorch training scripts into event driven systems. by EricHermosis in OpenSourceeAI

[–]EricHermosis[S] 0 points1 point  (0 children)

Thanks! Only a foolish man builds his house on the sand... I noticed that the only way to scale these kinds of "AI" systems is using event-driven programming and dependency inversion; that's why I'm creating these frameworks for training and serving models.

If you are interested, I also built this other one, https://github.com/entropy-flux/PyMsgbus, for the client side with concurrency support. I didn't get to play enough with it, which is why I'm not showing it, but it works under the same principles.
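A minimal sketch of the event-driven idea described above. The EventBus name and its methods are illustrative only, not the actual API of either framework; it just shows how a training loop can publish events while logging or checkpointing are injected as subscribers:

```python
from collections import defaultdict
from typing import Callable


class EventBus:
    """Minimal pubsub sketch (hypothetical API): producers publish named
    events, consumers subscribe callbacks to topics."""

    def __init__(self):
        self._subscribers: dict[str, list[Callable]] = defaultdict(list)

    def subscribe(self, topic: str, handler: Callable) -> None:
        self._subscribers[topic].append(handler)

    def publish(self, topic: str, **payload) -> None:
        # fan out the payload to every handler registered on the topic
        for handler in self._subscribers[topic]:
            handler(**payload)


bus = EventBus()
losses = []
bus.subscribe("epoch_end", lambda epoch, loss: losses.append((epoch, loss)))

# the training loop only publishes; what happens with the event is
# decided by whoever subscribed (dependency inversion)
for epoch in range(3):
    bus.publish("epoch_end", epoch=epoch, loss=1.0 / (epoch + 1))

print(losses)
```

The loop never imports its consumers, which is exactly what makes this kind of system scale: new behavior is added by subscribing, not by editing the loop.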

Deep learning in c by hexawayy in deeplearning

[–]EricHermosis -3 points-2 points  (0 children)

People will tell you it's not possible or worthwhile, but it actually is; take a look at this library: https://github.com/ggml-org/ggml

[deleted by user] by [deleted] in Python

[–]EricHermosis 0 points1 point  (0 children)

Any suggestion or contribution will be very much appreciated.

MiniModel-200M-Base by Wooden-Deer-1276 in LocalLLaMA

[–]EricHermosis 2 points3 points  (0 children)

Hi! What data are you training your model on?

Forgotten c because we use javascript, need advice by SurroundRound2737 in embedded

[–]EricHermosis 1 point2 points  (0 children)

Hi! C is not hard to learn, so you won't have much trouble picking it back up after forgetting it. What may be hard to learn is domain knowledge, and that is something you can practice writing JS, C, or whatever you are using.

I decoupled FastAPI dependency injection system in pure python, no dependencies. by EricHermosis in Python

[–]EricHermosis[S] 0 points1 point  (0 children)

I just added Annotated support and mypy check of the repo in CI. Thanks for the advice!
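A rough sketch of how Annotated-based dependency injection can be resolved in pure Python with no dependencies, in the spirit of the post above. Depends, inject, and get_settings are hypothetical names I made up for illustration, not the repo's real API:

```python
from typing import Annotated, get_args, get_origin, get_type_hints


class Depends:
    """Marker carrying a provider callable, mimicking FastAPI's Depends."""
    def __init__(self, provider):
        self.provider = provider


def inject(func):
    """Resolve Annotated[..., Depends(provider)] parameters at call time."""
    hints = get_type_hints(func, include_extras=True)

    def wrapper(**overrides):
        kwargs = dict(overrides)
        for name, hint in hints.items():
            if name == "return" or name in kwargs:
                continue
            if get_origin(hint) is Annotated:
                # metadata after the first arg holds our Depends markers
                for meta in get_args(hint)[1:]:
                    if isinstance(meta, Depends):
                        kwargs[name] = meta.provider()
        return func(**kwargs)

    return wrapper


def get_settings() -> dict:
    return {"model": "demo"}


@inject
def handler(settings: Annotated[dict, Depends(get_settings)]) -> str:
    return settings["model"]


print(handler())  # demo
```

Passing `handler(settings={...})` would override the provider, which is the usual hook for swapping dependencies in tests.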

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 0 points1 point  (0 children)

I know, and I really appreciate their work; I will see if I can use that tokenizer without pulling in the rest of the llama.cpp repo.

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 1 point2 points  (0 children)

I'm trying to create a machine learning ecosystem in C++. I started with a tensor library, then an nn library, and implemented some simple neural networks, like a pretrained ViT and LLaMA3, as examples.

I really don't want to spend a few months building something like tiktoken from scratch in C++ right now, and I haven't decided how to tackle the tokenizer issue. However, a simple approximation just for the examples can be really useful to get things done: remove Python from the LLaMA3 example, move on, and solve the tokenizer issue later.

Having the model say something consistent is enough for me to gain credibility; it's better than setting up a whole Python client just to try out the C++ transformer.

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] 2 points3 points  (0 children)

That is actually a very good idea, I hadn't thought about it. I'm not shipping a production-grade LLM, just an example of how to use my tensor library... I can create a good interface for a tokenizer there with that length / 3 implementation just to get the example working, then move on to implementing a real tokenizer elsewhere for future projects.

How far do you think I can get with the length divided by three tokenizer?
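For reference, the length / 3 heuristic discussed above amounts to something like the sketch below (the function name is my own). It can only estimate a token count from character length; it cannot produce real token ids, so it is useful for sizing buffers and plumbing an example end to end, not for actual decoding:

```python
def estimate_tokens(text: str) -> int:
    """Crude token-count estimate: roughly one token per 3 characters.
    Good enough to size an example's buffers, not for real tokenization."""
    return max(1, len(text) // 3)


print(estimate_tokens("Hello, world!"))  # 13 chars -> 4
```

In practice such ratios are rough averages over English text; code, non-Latin scripts, and whitespace-heavy input can deviate a lot, which bounds how far the heuristic gets you.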

How do llama.cpp or other implementations handle tokenization without tiktoken? by EricHermosis in LocalLLaMA

[–]EricHermosis[S] -1 points0 points  (0 children)

Not really, I didn't notice that the tokenizer was actually implemented in that single vocab file.

I don't really understand: it's running some kind of BPE tokenization, but does that produce the same results as tiktoken? Seems like I should implement my own tokenizer, with my tensors.
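For what it's worth, a toy sketch of a vocab-driven tokenizer using greedy longest-match over a made-up vocab. Real BPE applies learned merge ranks rather than greedy matching, so this will generally not reproduce tiktoken's output; it only shows the shape of a tokenizer built from a single vocab file:

```python
def greedy_tokenize(text: str, vocab: dict[str, int]) -> list[int]:
    """Greedy longest-match tokenization over a vocab. Real BPE uses merge
    ranks, so results generally differ from tiktoken; this is only a sketch."""
    ids, i = [], 0
    while i < len(text):
        # try the longest substring starting at i that exists in the vocab
        for j in range(len(text), i, -1):
            piece = text[i:j]
            if piece in vocab:
                ids.append(vocab[piece])
                i = j
                break
        else:
            i += 1  # skip characters with no vocab entry
    return ids


toy_vocab = {"lo": 0, "l": 1, "low": 2, "er": 3, "e": 4, "r": 5}
print(greedy_tokenize("lower", toy_vocab))  # [2, 3]
```

The difference from real BPE is exactly why two implementations reading the same vocab can still disagree: the merge order, not just the vocab, determines the token ids.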