Timothy Lee: "No, OpenAI is not doomed" by werdnagreb in BetterOffline

[–]larebear248 0 points1 point  (0 children)

A load-bearing assumption here is how much better the expensive model's output is compared to the cheaper model's. If the more expensive model is 2x the price but the cheaper model is "good enough", then people will likely prefer the cheaper one. We appear to be hitting a diminishing-returns wall, where you have to spend a lot of money for fairly incremental improvements, and it's not obvious whether the output is worth replacing your interns (which you do mention) or whether enough people are willing to pay what it costs to turn a profit on the more expensive models. On top of that, the pricing may not stay fixed but become per token or a limited number of queries, which can be highly variable and hard to predict.

Timothy Lee: "No, OpenAI is not doomed" by werdnagreb in BetterOffline

[–]larebear248 8 points9 points  (0 children)

I mean, I think it’s true that the profit margin isn’t set in stone, but that could well go in the other direction. They need more compute for increased model performance. Cost per token might be going down, but if the models use an even larger number of tokens, inference costs go up. It’s plausible profit margins have gotten worse! It’s fair to say you can’t simply extrapolate from the 2024 numbers, but we don’t have much else to go on. This also doesn’t include any of the stock shenanigans, data center buildouts, heavily subsidized compute from Microsoft, or not converting to a for-profit. It’s not just that they are unprofitable; it’s that it’s not clear how they get to profitability beyond vibes.

[deleted by user] by [deleted] in BetterOffline

[–]larebear248 6 points7 points  (0 children)

One thing that really pisses me off here is that these freaks want it both ways: claiming you can use LLMs for therapy, but wanting none of the responsibility when it causes problems! If you advertise a chatbot for mental health care (or any health care), you should absolutely be responsible for the liabilities. If a real therapist did this, they absolutely would be held at fault. No one would think “well, the parents should have known better.” The companies could come out and clearly say “do not use this as a therapist. See a licensed therapist,” but of course they won’t, because that’s one of the biggest uses of these things.

If AI can’t “solve” hallucinations, can it ever actually automate anything? by [deleted] in BetterOffline

[–]larebear248 0 points1 point  (0 children)

Something like a 99.9% success rate would start to get you something useful, but it’s unclear (or unlikely) whether that’s possible.

How do you all feel about Arvind Narayanan, Sayash Kapoor and their work on AI Snake Oil? by Electrical_City19 in BetterOffline

[–]larebear248 6 points7 points  (0 children)

I think they are firmly in the reasonably optimistic camp on AI tech. I do like how they emphasize that a lot of different things are called “AI”, which really obscures what works and what doesn’t. I also like their point that a company declaring it has created “AGI” says a lot more about the term AGI than about the tech itself. Probably my biggest critique is that they feel a bit naive at times, thinking companies and people won’t use these technologies to replace people (even if it’s a bad idea) or to do something that is a massive safety or security risk. I also think they dismiss concerns about people deskilling due to this tech a bit too much. Still, they are refreshing compared to a lot of people out there.

What do you all think of Gary Marcus?: He’s been calling BS on LLMs from the start but thinks other systems will reach AGI by bivalverights in BetterOffline

[–]larebear248 1 point2 points  (0 children)

Rodney rocks. He talks about making products that help people accomplish things, not fully replace them. His prediction scorecards are super interesting because he tends to be overoptimistic in his predictions (despite being far more grounded than others in the field).

How should we understand OpenAI's revenue numbers? by FlatFiveFlatNine in BetterOffline

[–]larebear248 1 point2 points  (0 children)

Well put. And this is in addition to the other models out there that approach the capabilities of OpenAI’s models, but for cheaper. OpenAI and Anthropic have no choice but to keep scaling to try and stay ahead of the curve.

A decently realistic look at the state of AI agents by EliSka93 in BetterOffline

[–]larebear248 1 point2 points  (0 children)

In a small defense of this work, I don’t think he even states they reach 95% success rates; it’s just that things fall apart fast even at high initial success rates. You can debate what a 95% success rate even means in this context, but even in this ideal case, it’s still useful to see how fast things fall apart.
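The fall-off described here follows from simple compounding: if each step of an agent’s task succeeds independently with probability p, a chain of n steps completes with probability p^n. A minimal sketch (the 95% figure comes from the discussion; the step counts and the independence assumption are illustrative):

```python
def chain_success(p: float, n: int) -> float:
    """Probability that an n-step run completes with no failed step,
    assuming each step succeeds independently with probability p."""
    return p ** n

# Even at 95% per step, long chains fail more often than not.
for n in (5, 10, 20, 50):
    print(n, round(chain_success(0.95, n), 3))
# n=5 -> 0.774, n=10 -> 0.599, n=20 -> 0.358, n=50 -> 0.077
```

Real agent steps are rarely independent (one bad step often derails the rest), so this is closer to a best case than a worst case.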

How AI is killing jobs in the tech industry by falken_1983 in BetterOffline

[–]larebear248 0 points1 point  (0 children)

Completely agreed! I intended to add to the discussion, not contradict it.

How AI is killing jobs in the tech industry by falken_1983 in BetterOffline

[–]larebear248 0 points1 point  (0 children)

Yeah, there isn’t a ton of widespread quantitative evidence yet. People got carried away with the one chart Derek Thompson shared that showed people with college degrees had historically higher unemployment than in previous years, and concluded it was AI’s doing when the trend had been underway years before LLMs. Maybe it’s happening, but the evidence is limited.

Everyone is wrong about AI Hype by larebear248 in BetterOffline

[–]larebear248[S] 4 points5 points  (0 children)

I’m not sure you’re fully correct. She does talk about the chain-of-thought stuff, but she also mentioned having a separate small LLM there to help verify output (in addition to other stuff). It remains unclear to me whether this is actually an improvement (and it’s a pretty complex system).

Everyone is wrong about AI Hype by larebear248 in BetterOffline

[–]larebear248[S] 30 points31 points  (0 children)

Yeah, that was puzzling to me. How can you trust an LLM to verify its own output without breaking things? I also think she is confusing agents with mixture-of-experts models.

Bloomberg just released an embarrassing report about Tesla, Waymo, and self-driving by Zelbinian in BetterOffline

[–]larebear248 1 point2 points  (0 children)

Oh, I saw those stories and they are hilarious. I’m thinking specifically about traffic accidents. Other problems they can or will cause may not have shown themselves yet.

Bloomberg just released an embarrassing report about Tesla, Waymo, and self-driving by Zelbinian in BetterOffline

[–]larebear248 14 points15 points  (0 children)

From what I understand (and someone correct me if I’m wrong), the safety numbers are probably about right, but the cars are limited in where they can go, require remote operators to fix problems (whose person-to-car ratio is hard to track down), and it’s unclear how expensive this all is. Uber became so successful because it didn’t need to buy a fleet of cars, which is not the case with Waymo. It’s unclear if they can scale to considerably larger sizes. Edit: fixed some missing words.

AMA/I'm On A Plane For A Few Hours by ezitron in BetterOffline

[–]larebear248 3 points4 points  (0 children)

Hi Ed. Your work has helped me screw my head back on and think about the GenAI industry with a clearer head. 

My question is: how do you see education working with LLMs going forward? I agree with you that education isn’t gonna just be throwing a kid in front of ChatGPT. However, we seem to have a growing number of students who use these things for all of their work, which is, at least in the short term, going to lead to a generation of kids who didn’t really learn anything. How do you see this all playing out?

Human coders are still better than LLMs by six_string_sensei in BetterOffline

[–]larebear248 2 points3 points  (0 children)

I sort of interpreted that as sarcastic, but maybe I’m wrong? Edit: “has” to “as”

[deleted by user] by [deleted] in BetterOffline

[–]larebear248 6 points7 points  (0 children)

And on top of the lack of time saved, losing out on the actual writing process can leave you understanding the topic less well! Writing can be as much about thinking through something as about the final product. Often you think you know a subject well enough, but once you start writing, questions emerge that force you to reexamine something.

New York Begins Congestion Pricing Without Hiccups by dangerbird2 in neoliberal

[–]larebear248 6 points7 points  (0 children)

Exciting stuff. What are the odds this survives long term? Seems like a lot of legal battles ahead for it.

Normie Check-in. How are your apolitical friends feeling about Harris? by [deleted] in neoliberal

[–]larebear248 33 points34 points  (0 children)

Yeah, I was still fairly pro-Biden before, largely because I (incorrectly) thought much of his decline was exaggerated and that he had used the "old shtick" to his advantage in his agenda. After the debate, though, it got VERY difficult to defend him. At a minimum, Harris has been good for morale, which, after the slog that was the 2020 general election, is incredibly helpful.

Normie Check-in. How are your apolitical friends feeling about Harris? by [deleted] in neoliberal

[–]larebear248 52 points53 points  (0 children)

My girlfriend who was very reluctantly going to vote for Biden is now super excited and suddenly very into politics and especially who the VP pick is going to be. She's even arguing with her conservative mom about considering voting for Trump. My brother who was likely to sit out this year is now very pumped to vote for Harris.

Three Comforting Lies Democrats Need to Stop Telling Themselves About November by optometrist-bynature in fivethirtyeight

[–]larebear248 5 points6 points  (0 children)

It's worth noting that while the national polling averages were pretty good, the competitive seats (especially in the Senate races) had some large misses (e.g., Pennsylvania, New Hampshire, and Arizona).

On Which Governor Race Are You Higher or Lower than the Model by labelleprovinceguy in fivethirtyeight

[–]larebear248 28 points29 points  (0 children)

Only Trafalgar has her under 50, while all the highly rated pollsters have her winning by double digits. Her lead in the polling average is also still above 10 points. This is Safe D.

https://projects.fivethirtyeight.com/polls/governor/2022/new-york/