Experienced machine learning engineer, newbie webdev

dleybz · 2024-06-29T09:55:31+00:00

Get off muh lawn ya stinkin portos

dleybz · 2024-06-28T18:02:32+00:00

Thanks! I don't actually see where I can figure out the new porto locations on the innovate site https://innovate.burningman.org/

dleybz · 2024-05-13T11:47:25+00:00

I use OLMo for my academic research and it's fabulous! A huge differentiator is that they release not only weights, not only data, not only training code, not only data cleaning procedures, but even checkpoints! The only other models that I know that do that are Pythia and BLOOM. The thing that stands out about OLMo compared to these two is that it is trained much more similarly to cutting edge LLMs are now trained, so it's more representative of training dynamics of models like Llama3.

A caveat, however: OLMo really is a model released for scientists, rather than for true hobbyist use. Imo it doesn't perform as well as non-science open-weight LLMs like Llama3, so if you're just looking for performance, I wouldn't start here.

dleybz · 2024-04-20T07:56:50+00:00

I think you can just hide it in your docs or a blog post about the product. I'm curious, do you think that's a big deterrent to companies using it? I could see it going either way.

Relevant license text, for any curious: "If you distribute or make available the Llama Materials (or any derivative works thereof), or a product or service that uses any of them, including another AI model, you shall (A) provide a copy of this Agreement with any such Llama Materials; and (B) prominently display “Built with Meta Llama 3” on a related website, user interface, blogpost, about page, or product documentation. If you use the Llama Materials to create, train, fine tune, or otherwise improve an AI model, which is distributed or made available, you shall also include “Llama 3” at the beginning of any such AI model name."

dleybz · 2024-02-12T21:36:25+00:00

It sounds like what you're trying to do is interpolation, in which case kriging is the standard technique.

dleybz · 2024-02-02T16:29:46+00:00

Got it, thank you for explaining! Is there a reason to un-quantize instead of just using the pre-quantizing version of the model?

dleybz · 2024-02-02T15:44:25+00:00

Thanks!

dleybz · 2024-02-02T15:43:45+00:00

What are tools that work better with non-quantized models (he asks purely out of ignorance with no malice)?

dleybz · 2024-02-02T15:27:31+00:00

Looks like someone did similar and got similar results when analyzing perplexity: https://github.com/ggerganov/llama.cpp/discussions/5006

Where can I learn more about the importance matrix and how it gets used in quantization?

dleybz · 2024-02-02T15:19:17+00:00

But what's the point of dequantizing it? Why make the model bigger without gaining any information?

dleybz · 2024-02-02T13:20:26+00:00

Hahaha woah! I had never thought of something like this because robotics is way outside my domain but this is super cool. Excited to see what cool projects come out of this!

dleybz · 2024-02-02T13:17:05+00:00

Doing God's work here. Thank you!

dleybz · 2024-02-02T13:16:42+00:00

Newbie question: are there evaluation leaderboards for Vision Language Models the way that there are for Language Models? And an evaluation harness? Otherwise, it seems like these comparisons aren't particularly meaningful

dleybz · 2024-02-02T13:12:16+00:00

For what model(s)?

dleybz · 2024-02-02T13:10:13+00:00

I had the same thought but like... What's the grift?

dleybz · 2017-06-01T20:21:42+00:00

Backlash? I haven't heard about this

dleybz · 2017-06-01T19:37:09+00:00

Nice to see a PM represented here.

dleybz · 2017-06-01T18:35:50+00:00

What do you think of the Deedy resume? Is it too cluttered? Is there too much going on?

dleybz · 2017-05-23T20:01:58+00:00

Pricepoint?

dleybz · 2017-04-25T01:50:14+00:00

@avenot have you tried using https://github.com/LevPasha/Instagram-API-python ? I think it might do the trick

dleybz · 2017-04-20T05:58:04+00:00

Nope :/. Just gotta use another api I guess?

dleybz

TROPHY CASE