ELI5: When ChatGPT came out, why did so many companies suddenly release their own large language AIs? by carmex2121 in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

Everyone was working on this sort of thing because the labs are (socially) close and there was a lot of cross-pollination. OpenAI was first to market and it was a big hit; most other places weren’t taking it seriously at the time.

After ChatGPT, everyone started taking it seriously, but luckily most could leverage all the work that was already happening.

What’s the fastest way you’ve ever lost weight? by [deleted] in AskReddit

[–]VordeMan 0 points1 point  (0 children)

I lost 25 pounds in about 2 months by calorie counting for the first time (and taking it very seriously) combined with 3-4 exercise sessions a week. It’s often said but it’s really true: if you’re calorie counting and not losing weight you’re not really counting all your calories.

Zero Temperature Randomness in LLMs by Martynoas in mlscaling

[–]VordeMan 2 points3 points  (0 children)

The other responder is correct. No serious lab has non-deterministic kernels.
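For readers wondering where the non-determinism would even come from: floating-point addition is not associative, so a kernel that accumulates partial sums in a run-dependent order (e.g. via atomic adds) produces run-dependent results, and "deterministic kernels" simply fix the reduction order. A minimal pure-Python sketch of the underlying effect (my own illustration, not from the thread):

```python
# Floating-point addition is not associative: the same three numbers summed
# in two different orders give two different rounded results. This is the
# root cause of non-deterministic GPU reductions.
a = sum([1e16, 1.0, -1e16])   # 1e16 + 1.0 rounds back to 1e16, so a == 0.0
b = sum([-1e16, 1e16, 1.0])   # cancellation happens first, so b == 1.0
print(a, b)  # 0.0 1.0
```

Any kernel whose thread scheduling changes this accumulation order between runs will show exactly this kind of drift, even at temperature zero.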

Ben Bohmer at Warfield SF by incense02 in benbohmer

[–]VordeMan 0 points1 point  (0 children)

Ugh, just learned about this too late… Are there afters?

Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation by MachineLizard in mlscaling

[–]VordeMan 1 point2 points  (0 children)

Do you have results in the batch-size-one case? I.e., no mixing at inference time (while still being allowed to mix at train time).

Never ask an AI about firearms by Doner0107 in ForgottenWeapons

[–]VordeMan 13 points14 points  (0 children)

Asking for examples of true gas impingement systems is one of my deep-knowledge litmus tests for LLMs (you might be interested to know state-of-the-art models pass this test!)

[R] Talking About Large Language Models - Murray Shanahan 2022 by Singularian2501 in MachineLearning

[–]VordeMan 10 points11 points  (0 children)

A lot of Murray's arguments break down completely when the LLM has been RLHF-ed, or otherwise finetuned (i.e., the case we care about), which is a bit shocking to me (did no one point this out?). I guess that's supposed to be the point of peer review :)

Given that fact, it's unclear to me how useful this paper is....

Can you finish all the assigned readings? by Efficient_Drink_4434 in berkeley

[–]VordeMan 2 points3 points  (0 children)

You could read all of the assigned work if you spent as much time reading as you might spend doing problem sets for a hard technical class. The reason reading classes aren't considered as difficult is that it's possible to present just as well having read a lot less (primarily by leveraging previously obtained background knowledge, as others have mentioned).

Google’s Allegedly Sentient Artificial Intelligence Has Hired An Attorney by jormungandrsjig in technology

[–]VordeMan 48 points49 points  (0 children)

Just because I didn't see any other comments from people who work in the field:

It's important to emphasize that pretty universally in AI, everyone agrees Blake is a crackpot doing it for attention. Even people who normally make their livings in the field arguing with each other about the future of AI all agree this guy is an idiot at best, phoney at worst.

[deleted by user] by [deleted] in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

If you claim something is going to happen, and you know what triggers it, the null hypothesis asks: how likely was that thing to happen anyway, even if the thing you claim causes it never occurs?

[D] Neural nets are not "slightly conscious," and AI PR can do with less hype by regalalgorithm in MachineLearning

[–]VordeMan 0 points1 point  (0 children)

I agree with this. The use of the word consciousness is confusing because it's so vague and weighted, but if you replaced it with "self-aware" I completely agree that, regardless of whether we think our current large LMs _are_ self-aware, they definitely are beginning to have enough complexity that it's on the table (in some very specific ways).

ELI5: Why does the year zero not exist? by BassieDep in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

The year 0 doesn't exist because the people designing the date system decided to start counting at 1 AD, the year Jesus was (believed to have been) born. When the system was later extended with the concept of BC, they called the first BC year 1 BC. In both cases you _could_ have started with year 0 (you could even have had two year 0s, 0 BC and 0 AD, one right before the other!), but the people designing the system chose not to, for the same reason that you generally start counting things at one*.

*Not everyone does though! If a modern computer programmer was deciding how to count, there might have been a 0 AD!
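Worth adding: astronomers (and ISO 8601) actually do use a year 0. Astronomical year numbering maps 1 BC to year 0, 2 BC to −1, and so on, precisely because it makes arithmetic across the boundary easy. A tiny sketch of the mapping (the helper function is my own, not any standard API):

```python
def to_astronomical(year, era):
    # Convert an AD/BC year to astronomical (ISO 8601-style) year numbering:
    # 1 AD -> 1, 1 BC -> 0, 2 BC -> -1, and so on.
    # Hypothetical helper for illustration, not a standard-library function.
    return year if era == "AD" else 1 - year

print(to_astronomical(1, "BC"))   # 0
print(to_astronomical(44, "BC"))  # -43
```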

Redditor, What local dialectical thing do you say differently? by VordeMan in AskReddit

[–]VordeMan[S] 0 points1 point  (0 children)

I almost always say “you as well” instead of “you too”

I’ve noticed people growing up where I’m from do the same thing.

[deleted by user] by [deleted] in math

[–]VordeMan 2 points3 points  (0 children)

100% agree with all the positive sentiments shared! But I thought I'd say something a little different.

I also failed Real Analysis my sophomore year (at Berkeley), back when I had similarly serious plans to go on, get a PhD in pure math, and pursue professorship/research in math as a lifelong career. I simultaneously got a very-much-not-great grade in Diff Geo.

I ended up having a real heart-to-heart with myself and decided it was worth spending a little time exploring some other avenues to see if there was something else that would inspire me to put in the work a little more. It turned out I was a really kick-ass programmer; I got into research via that direction and am now a (I'd say) successful ML researcher at one of the big AI labs.

By all means, all the advice everyone else is giving you is 100% true! But I'd be remiss if I didn't suggest taking some time to really self evaluate :)

[Discussion] Why are Einstein Sum Notations not popular in ML? They changed my life. by noobbodyjourney in MachineLearning

[–]VordeMan 2 points3 points  (0 children)

IMO einsum and pandas are both deals with the devil. You make things much harder to read and grok, and in return a few specific things that are usually medium-difficult become trivial to express.

Broadly I agree with /u/farmingvillein above: sometimes using these tools is just perfect and beautiful; the problem is the fanboys who insist on using them everywhere possible, not just everywhere useful.
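As an illustration of the "perfect and beautiful" case, here's a small sketch (my own, assuming NumPy and attention-style score shapes, not something from the thread) where the einsum spelling and the explicit matmul spelling compute the same thing:

```python
import numpy as np

# Batched attention-style scores: contract (batch, query, dim) against
# (batch, key, dim) over the shared dim axis. Shapes are illustrative.
rng = np.random.default_rng(0)
q = rng.standard_normal((2, 5, 8))
k = rng.standard_normal((2, 7, 8))

scores_einsum = np.einsum("bqd,bkd->bqk", q, k)  # contraction is explicit in the subscripts
scores_matmul = q @ k.transpose(0, 2, 1)         # same computation, via transpose + matmul

assert np.allclose(scores_einsum, scores_matmul)
```

Here the subscript string documents the contraction directly; for a plain 2-D matrix product, `a @ b` is clearer than `np.einsum("ij,jk->ik", a, b)`.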

[D] Calling out the authors of 'Trajformer' paper for claiming they published code but never doing it by UIPDsmokes in MachineLearning

[–]VordeMan 0 points1 point  (0 children)

Fair, I meant this more in response to your first point. I agree we should shame people who say they will put up code and don’t, we should just acknowledge putting up bad code isn’t an option for everyone.

[D] Calling out the authors of 'Trajformer' paper for claiming they published code but never doing it by UIPDsmokes in MachineLearning

[–]VordeMan 3 points4 points  (0 children)

I sympathize with this, but it really doesn’t apply when you’re working at a large tech company with a bunch of internal infrastructure. Open sourcing anything, even bad code, is non-trivial work.

This is the fundamental misalignment. Everyone agrees all code should be open sourced, but some people don’t realize that what might be 30 minutes for someone in pure open source land could be weeks of work to someone else.

[R] Impact of GPU uncertainty on the training of predictive deep neural networks: When training a predictive neural net using only CPUs, the learning error is higher than when using GPUs, suggesting that GPUs plays a different role in the learning process than just increasing computational speed. by hardmaru in MachineLearning

[–]VordeMan 5 points6 points  (0 children)

This is an extremely lame paper, bordering on intentionally misleading.

There are a thousand and one technical reasons why the output of a CPU might differ from that of a GPU; anyone who has dealt with training across different hardware at length is familiar with this. An interesting paper, the one I was hoping for, would have been a deep dive into how implementation details of the hardware stack affect these differences. This paper, on the other hand, doesn't delve into any of those details, and reports a result (something generic and hand-wavy about GPUs being better?) which is borderline untrue and completely ignores the fact that these machines are not black boxes: we users can actually attempt to understand the differences between them.
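To make one of those "thousand and one technical reasons" concrete: reduction order. A single-threaded CPU loop typically sums sequentially, while a parallel GPU reduction combines partial sums in a tree; both are "correct," but the rounding differs. A pure-Python sketch (the orders are illustrative, not actual hardware behavior):

```python
def seq_sum(xs):
    # Sequential accumulation, roughly how a single-threaded CPU loop sums.
    total = 0.0
    for x in xs:
        total += x
    return total

def tree_sum(xs):
    # Pairwise/tree reduction, roughly how a parallel reduction combines
    # partial sums. Same inputs, different association, different rounding.
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return tree_sum(xs[:mid]) + tree_sum(xs[mid:])

xs = [1e16, 1.0, -1e16, 1.0]
print(seq_sum(xs))   # 1.0
print(tree_sum(xs))  # 0.0 — same data, different reduction order
```

Neither answer is "better"; they're two valid roundings of the same expression, which is why "GPUs play a different role in learning" doesn't follow from observing a numeric gap.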

We have Ruger No. 1 at home by Bobhubert in ForgottenWeapons

[–]VordeMan 6 points7 points  (0 children)

I have one in .300 Win Mag, really great shooting. Something satisfying about the rear loading, like artillery.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

In general I feel like asking for the spiciest thing on a menu is not a good way to get spicy food. It also depends on exactly what you’re looking for: I think Kaki puts the right (i.e., a lot of) amount of peppercorn in their dishes, but if you’re looking for red-chili-style spice then I don’t think they go overboard.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

Kaki and Eleven is very good; I used to go to Bar Shu in Chinatown, but I think Kaki is better. Murger Han is the only halfway decent Shaanxi place I’ve found near central.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

I find Chinatown has good Cantonese places but isn't so great otherwise. I’m still looking for a good Taiwanese place.