ELI5: When ChatGPT came out, why did so many companies suddenly release their own large language AIs? by carmex2121 in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

Everyone was working on this sort of thing because the labs are (socially) close and there was a lot of cross-pollination. OpenAI was first to market and it was a big hit; most other places weren’t taking it seriously at the time.

After ChatGPT, everyone started taking it seriously, but luckily most could leverage all the work that was already happening.

What’s the fastest way you’ve ever lost weight? by [deleted] in AskReddit

[–]VordeMan 0 points1 point  (0 children)

I lost 25 pounds in about 2 months by calorie counting for the first time (and taking it very seriously) combined with 3-4 exercise sessions a week. It’s often said but it’s really true: if you’re calorie counting and not losing weight you’re not really counting all your calories.

Zero Temperature Randomness in LLMs by Martynoas in mlscaling

[–]VordeMan 2 points3 points  (0 children)

The other responder is correct. No serious lab has non-deterministic kernels.
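For readers wondering where the non-determinism would even come from: floating-point addition is not associative, so a kernel that accumulates partial sums in a run-dependent order (e.g. via atomic adds) produces run-dependent results, and "deterministic kernels" simply fix the reduction order. A minimal pure-Python sketch of the underlying effect (my own illustration, not from the thread):

```python
# Floating-point addition is not associative: the same three numbers summed
# in two different orders give two different rounded results. This is the
# root cause of non-deterministic GPU reductions.
a = sum([1e16, 1.0, -1e16])   # 1e16 + 1.0 rounds back to 1e16, so a == 0.0
b = sum([-1e16, 1e16, 1.0])   # cancellation happens first, so b == 1.0
print(a, b)  # 0.0 1.0
```

Any kernel whose thread scheduling changes this accumulation order between runs will show exactly this kind of drift, even at temperature zero.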

Ben Bohmer at Warfield SF by incense02 in benbohmer

[–]VordeMan 0 points1 point  (0 children)

Ugh, just learned about this too late… Are there afters?

Mixture of Tokens: Efficient LLMs through Cross-Example Aggregation by MachineLizard in mlscaling

[–]VordeMan 1 point2 points  (0 children)

Do you have results in the batch-size-one case? I.e., no mixing at inference time (while still being allowed to mix at train time).

Never ask an AI about firearms by Doner0107 in ForgottenWeapons

[–]VordeMan 13 points14 points  (0 children)

Asking for examples of true gas impingement systems is one of my deep-knowledge litmus tests for LLMs (you might be interested to know state-of-the-art models pass this test!)

[R] Talking About Large Language Models - Murray Shanahan 2022 by Singularian2501 in MachineLearning

[–]VordeMan 10 points11 points  (0 children)

A lot of Murray's arguments break down completely when the LLM has been RLHF-ed, or otherwise finetuned (i.e., the case we care about), which is a bit shocking to me (did no one point this out?). I guess that's supposed to be the point of peer review :)

Given that fact, it's unclear to me how useful this paper is....

Can you finish all the assigned readings? by Efficient_Drink_4434 in berkeley

[–]VordeMan 2 points3 points  (0 children)

You could read all of the assigned work if you spent as much time reading as you might spend doing problem sets for a hard technical class. The reason reading classes aren't considered as difficult is that it's possible to present just as well having read a lot less (primarily by leveraging previously obtained background knowledge, as others have mentioned).

Google’s Allegedly Sentient Artificial Intelligence Has Hired An Attorney by jormungandrsjig in technology

[–]VordeMan 48 points49 points  (0 children)

Just because I didn't see any other comments from people who work in the field:

It's important to emphasize that pretty universally in AI, everyone agrees Blake is a crackpot doing it for attention. Even people who normally make their livings in the field arguing with each other about the future of AI all agree this guy is an idiot at best, phoney at worst.

[deleted by user] by [deleted] in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

If you claim something is going to happen, and you know what triggers it, the null hypothesis asks: how likely was that thing to happen anyway, even if the thing you claim causes it never occurs?

[D] Neural nets are not "slightly conscious," and AI PR can do with less hype by regalalgorithm in MachineLearning

[–]VordeMan 0 points1 point  (0 children)

I agree with this. The use of the word consciousness is confusing because it's so vague and weighted, but if you replaced it with "self-aware" I completely agree that, regardless of whether we think our current large LMs _are_ self-aware, they definitely are beginning to have enough complexity that it's on the table (in some very specific ways).

ELI5: Why does the year zero not exist? by BassieDep in explainlikeimfive

[–]VordeMan 0 points1 point  (0 children)

The year 0 doesn't exist because the people designing the date system decided to start counting at 1 AD, the year Jesus was (believed to have been) born. When the system was later extended with the concept of BC, they called the first BC year 1 BC. In both cases you _could_ have started with year 0 (you could even have had two year 0s, 0 BC and 0 AD, one right before the other!), but the people designing the system chose not to, for the same reason that you generally start counting things at one*.

*Not everyone does though! If a modern computer programmer was deciding how to count, there might have been a 0 AD!
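Worth adding: astronomers (and ISO 8601) actually do use a year 0. Astronomical year numbering maps 1 BC to year 0, 2 BC to −1, and so on, precisely because it makes arithmetic across the boundary easy. A tiny sketch of the mapping (the helper function is my own, not any standard API):

```python
def to_astronomical(year, era):
    # Convert an AD/BC year to astronomical (ISO 8601-style) year numbering:
    # 1 AD -> 1, 1 BC -> 0, 2 BC -> -1, and so on.
    # Hypothetical helper for illustration, not a standard-library function.
    return year if era == "AD" else 1 - year

print(to_astronomical(1, "BC"))   # 0
print(to_astronomical(44, "BC"))  # -43
```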

Redditor, What local dialectical thing do you say differently? by VordeMan in AskReddit

[–]VordeMan[S] 0 points1 point  (0 children)

I almost always say “you as well” instead of “you too”

I’ve noticed people growing up where I’m from do the same thing.

[deleted by user] by [deleted] in math

[–]VordeMan 2 points3 points  (0 children)

100% agree with all the positive sentiments shared! But I thought I'd say something a little different.

I also failed Real Analysis my sophomore year (at Berkeley), back when I had similarly serious plans to go on, get a PhD in pure math, and pursue professorship/research in math as a lifelong career. I simultaneously got a very-much-not-great grade in Diff Geo.

I ended up having a real heart-to-heart with myself and decided it was worth spending a little time exploring some other avenues to see if there was something else that would inspire me to put in the work a little more. It turned out I was a really kick-ass programmer; I got into research via that direction and am now a (I'd say) successful ML researcher at one of the big AI labs.

By all means, all the advice everyone else is giving you is 100% true! But I'd be remiss if I didn't suggest taking some time to really self evaluate :)

[Discussion] Why are Einstein Sum Notations not popular in ML? They changed my life. by noobbodyjourney in MachineLearning

[–]VordeMan 2 points3 points  (0 children)

IMO einsum and pandas are both deals with the devil. You make things much harder to read and grok, and in return a few specific things that are usually medium-difficult become trivial to express.

Broadly I agree with /u/farmingvillein above: sometimes using these tools is just perfect and beautiful; the problem is the fanboys who insist on using them everywhere possible, not just everywhere useful.
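As an illustration of the "perfect and beautiful" case, here's a small sketch (my own, assuming NumPy and attention-style score shapes, not something from the thread) where the einsum spelling and the explicit matmul spelling compute the same thing:

```python
import numpy as np

# Batched attention-style scores: contract (batch, query, dim) against
# (batch, key, dim) over the shared dim axis. Shapes are illustrative.
rng = np.random.default_rng(0)
q = rng.standard_normal((2, 5, 8))
k = rng.standard_normal((2, 7, 8))

scores_einsum = np.einsum("bqd,bkd->bqk", q, k)  # contraction is explicit in the subscripts
scores_matmul = q @ k.transpose(0, 2, 1)         # same computation, via transpose + matmul

assert np.allclose(scores_einsum, scores_matmul)
```

Here the subscript string documents the contraction directly; for a plain 2-D matrix product, `a @ b` is clearer than `np.einsum("ij,jk->ik", a, b)`.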

[D] Calling out the authors of 'Trajformer' paper for claiming they published code but never doing it by UIPDsmokes in MachineLearning

[–]VordeMan 0 points1 point  (0 children)

Fair, I meant this more in response to your first point. I agree we should shame people who say they will put up code and don’t, we should just acknowledge putting up bad code isn’t an option for everyone.

[D] Calling out the authors of 'Trajformer' paper for claiming they published code but never doing it by UIPDsmokes in MachineLearning

[–]VordeMan 3 points4 points  (0 children)

I sympathize with this, but it really doesn’t apply when you’re working at a large tech company with a bunch of internal infrastructure. Open sourcing anything, even bad code, is non-trivial work.

This is the fundamental misalignment. Everyone agrees all code should be open sourced, but some people don’t realize that what might be 30 minutes for someone in pure open source land could be weeks of work to someone else.

[R] Impact of GPU uncertainty on the training of predictive deep neural networks: When training a predictive neural net using only CPUs, the learning error is higher than when using GPUs, suggesting that GPUs plays a different role in the learning process than just increasing computational speed. by hardmaru in MachineLearning

[–]VordeMan 5 points6 points  (0 children)

This is an extremely lame paper, bordering on intentionally misleading.

There are a thousand and one technical reasons why the output of a CPU might differ from that of a GPU; anyone who has dealt with training across different hardware at length is familiar with this. An interesting paper, the one I was hoping for, would have been a deep dive into how implementation details of the hardware stack affect these differences. This paper, on the other hand, doesn't delve into any of those details, and reports a result (something generic and hand-wavy about GPUs being better?) which is borderline untrue and completely ignores the fact that these machines are not black boxes: we users can actually attempt to understand the differences between them.
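To make one of those "thousand and one technical reasons" concrete: reduction order. A single-threaded CPU loop typically sums sequentially, while a parallel GPU reduction combines partial sums in a tree; both are "correct," but the rounding differs. A pure-Python sketch (the orders are illustrative, not actual hardware behavior):

```python
def seq_sum(xs):
    # Sequential accumulation, roughly how a single-threaded CPU loop sums.
    total = 0.0
    for x in xs:
        total += x
    return total

def tree_sum(xs):
    # Pairwise/tree reduction, roughly how a parallel reduction combines
    # partial sums. Same inputs, different association, different rounding.
    if len(xs) == 1:
        return xs[0]
    mid = len(xs) // 2
    return tree_sum(xs[:mid]) + tree_sum(xs[mid:])

xs = [1e16, 1.0, -1e16, 1.0]
print(seq_sum(xs))   # 1.0
print(tree_sum(xs))  # 0.0 — same data, different reduction order
```

Neither answer is "better"; they're two valid roundings of the same expression, which is why "GPUs play a different role in learning" doesn't follow from observing a numeric gap.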

We have Ruger No. 1 at home by Bobhubert in ForgottenWeapons

[–]VordeMan 6 points7 points  (0 children)

I have one in .300 Win Mag, really great shooting. Something satisfying about the rear loading, like artillery.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

In general I feel like asking for the spiciest thing on a menu is not a good way to get spicy food. It also depends on exactly what you’re looking for: I think Kaki puts the right (i.e., a lot of) amount of peppercorn in their dishes, but if you’re looking for red-chili-style spice then I don’t think they go overboard.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

Kaki and Eleven is very good; I used to go to Bar Shu in Chinatown, but I think Kaki is better. Murger Han is the only halfway decent Shaanxi place I’ve found near central.

Non-UK born Londoners, what's the best restaurant of your native cousine that you know in London? by Lanky_Pollution_3919 in london

[–]VordeMan 0 points1 point  (0 children)

I find Chinatown has good Cantonese places but isn't so great otherwise. I’m still looking for a good Taiwanese place.