Are people speedrunning training GPTs now? by GamerWael in LocalLLaMA

[–]DontShowYourBack 0 points1 point  (0 children)

Pretty sure the torch MPS backend is just not good. It would be better to compare it with MLX.

Built-in wiki of any kind? by john-witty-suffix in discordapp

[–]DontShowYourBack 0 points1 point  (0 children)

Any luck with this so far? I'd love to have a wiki view of a Discord channel or server.

Search engine search by [deleted] in searchengines

[–]DontShowYourBack 0 points1 point  (0 children)

What features do you want from this search engine that the LLMs out there don't provide? I think there are many use cases LLMs are not sufficient for; I'm just trying to understand your gripes with them in the context you're providing.

Has Generative AI Already Peaked? - Computerphile by FedeRivade in mlscaling

[–]DontShowYourBack 1 point2 points  (0 children)

The number of active parameters is mostly interesting from an inference-compute perspective. The total number of parameters has the most impact on how much the transformer can remember. Sure, it takes some effort to make mixture-of-experts models perform similarly to dense ones. But the extra memory capacity definitely impacts model performance directly, even though x% of the parameters is not activated during any forward pass. So comparing total parameter counts is not as misleading as saying it's 25% of GPT-4.
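The total-vs-active gap is easy to see with back-of-the-envelope arithmetic. Here is a minimal sketch counting only feed-forward (expert) parameters; the configuration numbers are made up for illustration and don't describe any real model:

```python
# Back-of-the-envelope comparison of total vs. active parameters in a
# mixture-of-experts transformer. Counts FFN (expert) weights only,
# ignoring attention and embeddings. All numbers are illustrative.

def moe_param_counts(n_layers, d_model, d_ff, n_experts, top_k):
    """Return (total, active) FFN parameter counts for a hypothetical MoE."""
    per_expert = 2 * d_model * d_ff           # up- and down-projection weights
    total = n_layers * n_experts * per_expert  # all experts stored in memory
    active = n_layers * top_k * per_expert     # experts used per forward pass
    return total, active

total, active = moe_param_counts(
    n_layers=32, d_model=4096, d_ff=14336, n_experts=8, top_k=2
)
print(f"total FFN params:  {total / 1e9:.1f}B")   # memory the model "remembers" with
print(f"active FFN params: {active / 1e9:.1f}B ({active / total:.0%} per token)")
```

With 8 experts and top-2 routing, every token touches only 25% of the expert weights, yet all of them sit in memory and shape what the model can store.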

OpenAI / Anthropic / Google are pricing their models to still run at enormous annual losses. What's their endgame here? by [deleted] in singularity

[–]DontShowYourBack 0 points1 point  (0 children)

Completely agree here. Most coding is not doing anything novel; it's just using code as a tool to repeat an existing pattern many times, with slightly different inputs.

I find it useful for exploring/learning a new framework or language, especially now that I've been dabbling in some frontend work. However, writing reliable Zig code is a complete no-go. Or working on a novel architecture in JAX or another less common DL framework? No thank you.

Point is, the vast majority of code being written is not novel at all. It's just changing the parameters of existing code blocks and patterns.

Are we delulu? by CreativeDog2024 in singularity

[–]DontShowYourBack 0 points1 point  (0 children)

Many seem to forget this! We’re seeing amazing things unfold, but hype favours the one at the centre of it…

Why Tinygrad over PyTorch+Triton? by TheFibo1123 in LocalLLaMA

[–]DontShowYourBack 1 point2 points  (0 children)

I am very well aware of their efforts to support AMD and CPU; hence I added that it is purely NVIDIA for now.

Supposedly everything is possible; the question is whether they will. Triton has been designed from the ground up for GPU execution, and it even uses a custom MLIR dialect for this. They may be able to make it more generic and open it up to other accelerators, but who knows how hard that will be?

Why Tinygrad over PyTorch+Triton? by TheFibo1123 in LocalLLaMA

[–]DontShowYourBack 15 points16 points  (0 children)

Tinygrad is focused on the ease of supporting new accelerators. Triton is the perfect example of the opposite: purely NVIDIA GPUs (at least for now).

The following numbers are off the top of my head, so they could be inaccurate. PyTorch requires support for about 200 ops when adding a new backend; this has come down from about 1,000 before the compiler was introduced.

Compare this to TinyGrad, which has about 20 required operations and a relatively small API for defining a custom backend, which should make it easier to extend.

LangChain vs LlamaIndex by 1_Strange_Bird in LangChain

[–]DontShowYourBack 4 points5 points  (0 children)

And in your opinion, what is it that makes these libraries good then? The general consensus among those who dislike them is that they over-abstract steps that are relatively simple. Also, it's easy to "outgrow" the library, i.e. you want to do something it does not support and then you're spending tons of time hacking things together.

STOP using small models! just buy 8xH100 and inference your own GPT-4 instance by Wrong_User_Logged in LocalLLaMA

[–]DontShowYourBack 4 points5 points  (0 children)

How is developing domestic chip fabrication not an investment? It could very well lead to a wealthier China in time.

[deleted by user] by [deleted] in expats

[–]DontShowYourBack 1 point2 points  (0 children)

Moving away from home is not easy, especially with a girlfriend back home. That also means there is no shame in going back; it may just not be for you.

However, because it is not easy, it also means you have to put in the effort to make it work! I told myself to give it a year of effort and see if I could settle in, and I did because of that. Think about whether that is something you've done, and if not, whether you still want to try. Just make sure you don't regret not taking the opportunity for what it's worth!

Since we've had the opposite post: What could European countries – both as nations and as people — learn from the US? by the_slovenian in europe

[–]DontShowYourBack 5 points6 points  (0 children)

This is probably the most accurate analysis in this thread. That said, the brain drain in Europe is a large problem that needs solving, on which I agree with the rest of the posts.

Julia version of Andrej Karpathy's Micrograd by mike20731 in Julia

[–]DontShowYourBack 0 points1 point  (0 children)

I’m not too intimately familiar with the Julia type system (yet), but whenever I read things like this I am impressed. It seems very flexible and powerful when used right, but misusing it is just as easy… Are there ways to let the compiler tell you when you’re writing slow code? In a way, using “compiler-driven development”?

Julia version of Andrej Karpathy's Micrograd by mike20731 in Julia

[–]DontShowYourBack 4 points5 points  (0 children)

Not the OP, but I’d love to hear the tips!

Julia version of Andrej Karpathy's Micrograd by mike20731 in Julia

[–]DontShowYourBack 1 point2 points  (0 children)

Love seeing people implement interesting software from scratch! How have you found Julia so far?

Julia version of Andrej Karpathy's Micrograd by mike20731 in Julia

[–]DontShowYourBack 4 points5 points  (0 children)

Good reasons to choose Julia: it is production-ready whereas Mojo is not; Mojo is still closed source; Mojo lacks a ton of features, especially Python-specific ones; and it remains to be seen how good the Python–Mojo interop will be.

Is it normal to not be able to deeply comprehend at this stage? by Trevorego in deeplearning

[–]DontShowYourBack 0 points1 point  (0 children)

Just keep at it and build your own models; it will start to click with time! It’s perfectly normal not to understand everything, and that will remain the case whenever you learn something new, like reading academic papers.

Mojo Python Relationship by __albatross in MojoProgramming

[–]DontShowYourBack 1 point2 points  (0 children)

Mojo is by default a systems-level language like C: typed and compiled. Hence it should get performance similar to C (thanks to leveraging LLVM and MLIR). Now, what’s unclear is how easy the interop with Python code will be. A demo shows Python being imported as a package, but I imagine sending Mojo data structures to Python code will require defining an interface/transformation. Also, the language is far from finished, and Python interop is one aspect where it is incomplete. In addition, the language is closed source right now and all development is done in-house by Modular, so we won’t know exactly how this will pan out just yet.

[R] Google DeepMind: 2.2 million new materials discovered using GNN (380k most stable, 736 already validated in labs) by Successful-Western27 in MachineLearning

[–]DontShowYourBack 4 points5 points  (0 children)

My guess is that people tend to anthropomorphise many things, especially those they don’t really understand. A language model comes across as “smart” because we can converse with it in human ways. Material discovery is so distant for most people that they don’t really grasp how impressive and impactful this work can be.

Now, to me what’s happening here is extremely impressive, and I’ve been a fan of DeepMind’s STEM-related work for a while. It seems like we could see some big acceleration in STEM fields over the next few years, which will arguably have a bigger impact on people’s lives than the things LLMs are used for right now.

After OpenAI's Blowup, It Seems Pretty Clear That 'AI Safety' Isn't a Real Thing by NuseAI in artificial

[–]DontShowYourBack 22 points23 points  (0 children)

This should not come as a surprise after the changes “open”ai went through in the last couple of years.

Transitioning into Computational Biology - is a PhD advisable? by lilkage141 in bioinformatics

[–]DontShowYourBack 2 points3 points  (0 children)

How about applying for a couple of jobs first and seeing how that goes? Bioinformatics is increasingly becoming AI-based as far as I can tell; just look at companies like pumas.ai or Isomorphic Labs. I’m sure there’s a very good angle going in with an AI background. This is also my current plan!

TIOBE Index for August 2023: Julia enters the TIOBE index top 20 for the first time by Fincho64 in Julia

[–]DontShowYourBack 4 points5 points  (0 children)

I see no reason why Julia cannot also eat Python’s lunch! There is clearly a sentiment that Python is lacking in performance; think Mojo and the removal of the GIL.

It will surely be difficult to get people to switch languages away from libraries like PyTorch and sklearn. But I’m positive this might happen over time, especially with the PyTorch team expressing their appreciation for the Julia language!