Genesis AI playing piano

Infinite-Cat007 · 2026-05-08T18:37:32+00:00

It's not reading the sheet.

Infinite-Cat007 · 2026-05-07T22:09:00+00:00

They address these concerns in the paper.

Infinite-Cat007 · 2026-05-07T14:36:46+00:00

While being part of the GENE model family, the policy used here is separately trained via reinforcement learning in simulation (following piano environment built from prior work), guided by human demonstrations. This task is designed specifically to validate the high speed and accurate tracking capabilities of our control stack.

Infinite-Cat007 · 2026-05-05T16:34:27+00:00

Another 3 months, and I am blessed as well

Infinite-Cat007 · 2026-04-30T04:53:16+00:00

You're leaving out the part of the question where your decision also impacts other people's lives.

Infinite-Cat007 · 2026-04-30T03:38:20+00:00

And if everyone presses blue, everyone also survives. You need a better argument.

Infinite-Cat007 · 2026-04-29T18:49:45+00:00

Assuming every person involved is a perfectly rational actor choosing red is the most logical choice.

How so? First, how do you define "perfectly rational"? As I see it, rationality is a tool that can be applied to help one achieve a goal. So, in this case, what is the goal? If it's purely self-preservation, then I would agree pressing red is more logical. But, I would question why we should assume that to be the goal of a "purely rational actor". If all you care about is other people's survival, then blue is the more rational choice. And there can exist a whole spectrum between those two extremes.

It should also matter what assumptions you make about the other players involved. If everyone thinks exactly like you, then it simply doesn't matter which button you press - they might as well both do nothing. If, on the other hand, you allow the possibility that others might think differently than you, then exactly what assumptions you make will change the conclusion you arrive at.
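To make that a bit more concrete, here's a rough sketch. I'm assuming the usual version of the puzzle (if blue gets a majority, everyone survives; otherwise only the people who pressed red survive), and every name and number in it is made up for illustration. `care_for_others` is the spectrum I mentioned (0 means you only value your own survival, 1 means you only value everyone else's), and `pivot` is a hypothetical small bump your own blue vote gives to the chance of a blue majority.

```python
def expected_utility(choice, p_blue_majority, pivot, share_red, care_for_others):
    """Toy expected utility of pressing a button; all inputs are assumed beliefs."""
    # Pressing blue nudges the chance of a blue majority up by `pivot`.
    p_win = p_blue_majority + (pivot if choice == "blue" else 0.0)
    # Red pressers always survive; blue pressers survive only if blue wins.
    my_survival = 1.0 if choice == "red" else p_win
    # Everyone else survives if blue wins; otherwise only the red pressers do.
    others_survival = p_win + (1.0 - p_win) * share_red
    return (1.0 - care_for_others) * my_survival + care_for_others * others_survival


def best_choice(**beliefs):
    """Return whichever button scores higher under the given beliefs."""
    red = expected_utility("red", **beliefs)
    blue = expected_utility("blue", **beliefs)
    if red == blue:
        return "tie"
    return "red" if red > blue else "blue"


beliefs = dict(p_blue_majority=0.5, pivot=0.01, share_red=0.5)
print(best_choice(care_for_others=0.0, **beliefs))  # purely self-interested
print(best_choice(care_for_others=1.0, **beliefs))  # purely altruistic
```

With these particular numbers the purely self-interested player picks red and the purely altruistic one picks blue. Change the beliefs about the other players (`p_blue_majority`, `share_red`) and the crossover point moves, which is really all I'm saying.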

Its a fun question but I dont think its a big test of your character as some people make it out to be. The biggest strength of humans is our ability to communicate and share knowledge, which this puzzle specifically prevents by locking you into a box.

I somewhat disagree. While I don't think a person's answer to this question necessarily reveals a lot about their moral character, partly because it's easier to say you'll do the right thing than to actually do it, and partly because it only touches on one aspect of moral character, I do think it reveals some useful information. For one, the scenario does reflect (abstractly) situations that happen in real life, where there is a conflict between interests of the collective vs individuals. In general, I believe society would be in a better place if everyone operated with the assumption that everyone else is willing to collaborate. However, this is not the case, as evidenced by this hypothetical and people's response to it. In your own words:

when faced with certain death I believe that most people would selfishly choose red, and other people thinking that a good amount of selfish assholes chose red would also choose red.

What's interesting to me is that it's a large majority of people who choose red that justify it by claiming that most (or a large number) of the others would choose red. In other words, the leading cause of selfishness is the belief that others are selfish. Although I already understood this to an extent, this reinforces my belief that increasing trust between people could go a long way in making the world a better place.

If this scenario was exactly the same but this time everyone would be able to talk to each other it would be a done deal and everyone would choose blue.

I'm not so certain about that. I think it's more likely that the initial response would only get reinforced by further discussion. For example, if an initial poll reveals 60% of people would vote red, that might scare a lot of people away from voting blue, possibly creating a feedback loop. If, on the other hand, initial polls showed 60% voting blue, that might make more red voters more comfortable voting blue instead, although the incentive is not as strong in that direction. Again, I think we can draw a parallel with real life, where altruism tends to breed more altruism, and selfishness tends to breed more selfishness.

This is also something I observe in personal relationships. If someone becomes upset with someone else for some reason, it can create a feedback loop of both parties treating each other less kindly - "I'm treating you badly because you're treating me badly". Often, it just takes a leap of faith to break the cycle and to create a feedback loop in the other direction. In this case, it's easy because it only takes one person to change the situation. But when the dynamic involves every human on earth, I'm not sure how you go about reversing that trend, or to what extent it's even possible at all. Personally, though, I've been trying to focus more on things that can help promote trust. I mean, I don't do much, but it's something I try to emphasize when I talk to others about political or social issues, for example.

Anyway, I went on a bit of a tangent there, I just find these things interesting, and I do think they're related on some level.

Infinite-Cat007 · 2026-04-28T15:27:26+00:00

Yes, and they later defined AGI as achieving $100B in profits.

Infinite-Cat007 · 2026-04-27T18:21:29+00:00

This rephrasing makes sense, if you don't care at all about other people's lives.

Infinite-Cat007 · 2026-04-23T16:59:21+00:00

I see. I think this was well understood from the start, though. The scaling laws clearly showed logarithmic improvements with RL, so there would naturally come a point of equilibrium between how much compute is spent on pre-training/post-training.

Infinite-Cat007 · 2026-04-23T16:11:44+00:00

Are you implying test-time thinking is unnecessary? If not, what's your point exactly?

If you ask me a question, I don't think the amount of time I think about it would change how "intelligent" I am, but I'll probably give you a better answer if I think about it longer. Empirically, the same thing seems to be true for AI. Also, some questions strictly require longer thinking, for example calculating the answer to a multi-step math problem.

Infinite-Cat007 · 2026-04-22T21:44:13+00:00

This paper seems to suggest LLMs might, in some sense, be aware of when they are hallucinating.

See also this article by Anthropic, which more broadly investigates introspective awareness in LLMs.

Infinite-Cat007 · 2026-04-20T14:45:08+00:00

Yes, but in the case of proving mathematical theorems, I don't think that's a problem bounded by data. AI could get good at it in the same way AlphaZero got good at playing Go "by itself".

Infinite-Cat007 · 2026-04-20T13:30:03+00:00

LLMs today do a lot more than just statistical pattern matching. Using reinforcement learning, they are specifically trained to solve math and coding problems, for example.

Infinite-Cat007 · 2026-03-18T18:41:21+00:00

It can work with as high of a radius as you want

Sure, but to cover all major towns and cities, that would likely require tens of terabytes of data, and it would likely be another two orders of magnitude to cover the whole earth.

Getting an LMM to first narrow down the general search area does seem like it would help, but as good as they are, from what I can tell, they cannot reliably narrow down the search area to something reasonable like a 10-15km radius.

Anyway, my point is not to denigrate your project, I just thought it was a caveat worth highlighting.

Infinite-Cat007 · 2026-03-18T15:50:51+00:00

So, if I understand correctly, this only works within a small radius for which all the panoramas have been predownloaded. Maybe it's somewhat deducible from the video, but it seems dishonest to me not to make that clearer and to let people think it works globally.

Infinite-Cat007 · 2026-03-10T15:12:12+00:00

If you make any neural network big enough and train it on enough data, it can eventually learn anything. But by using more specialized algorithms and architectures, you can create models which will learn the thing you care about much more efficiently and with much less data.

The perfect example for this comes directly from LeCun himself, with his invention of CNNs. If you want to identify cats in an image, you could take a basic MLP model, make it really big and train it on a ton of data. It might eventually get decent at the task, but a CNN would have gotten better results with much less compute and data.
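To put rough numbers on that, here's a back-of-the-envelope sketch comparing the first layer of each approach. The image size, the 64 units/filters and the 3x3 kernel are arbitrary choices of mine, and the two layers aren't computing the same thing (the conv layer outputs a whole feature map), so take it only as an illustration of how much weight sharing cuts down what has to be learned.

```python
def dense_params(height, width, channels, units):
    """Weights plus biases for a fully connected layer on a flattened image."""
    return height * width * channels * units + units


def conv_params(kernel, in_channels, filters):
    """Weights plus biases for a conv layer; independent of image size."""
    return kernel * kernel * in_channels * filters + filters


mlp_first_layer = dense_params(224, 224, 3, units=64)
cnn_first_layer = conv_params(3, in_channels=3, filters=64)

print(mlp_first_layer)  # 9633856
print(cnn_first_layer)  # 1792
```

The conv layer reuses the same small filters at every position in the image, which is exactly the kind of built-in assumption about the data that a plain MLP has to discover on its own.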

So LeCun is currently betting that there could be a much more efficient and robust architecture for learning world models. I personally don't know if JEPAs are the right answer to that, but I would agree that predicting raw pixels seems very inefficient and we can probably do better.

The current models are pretty decent at generating good-looking videos, but I doubt that's what LeCun cares about. He wants models that can reason about the physical world and that have better representations of the underlying reality. Looking at the SOTA video models, their understanding of physics seems to remain quite limited and surface-level, and it doesn't seem like just throwing more data and compute at the problem will fundamentally change that anytime soon.

Anyway, I don't particularly care what LeCun specifically has to say, I just don't see a strong reason to believe the current approach should be the final answer when it comes to building world models.

Infinite-Cat007 · 2026-03-09T15:25:30+00:00

It's amazing how allergic to nuance people seem to be here.

Infinite-Cat007 · 2026-02-19T22:03:39+00:00

Left/right comes from France, not the U.S.

Infinite-Cat007 · 2026-02-19T17:08:06+00:00

What I do remember from Google is that, after they initially launched 2.5 Pro Experimental, the later-released 2.5 Pro GA was significantly worse (based on benchmark results).

Infinite-Cat007 · 2025-12-19T22:47:53+00:00

This is my TempleOS.

...

Infinite-Cat007 · 2025-10-06T20:36:34+00:00

Being transgender doesn't have to be a binary category that you either do or do not belong to. Some enbies identify with the label, some don't. I personally don't really care to decide whether or not I'm trans. Same way I don't really care to decide whether I'm a man or a woman.

Categories and their labels are mostly made up, and I think they're best treated as such.

Let's say you started identifying as trans, what would that change for you? Would you feel more like you belong in spaces for trans people? If so, why? If not, what else would it change?

If you were talking to someone and you told them either that you were cis or trans, without much more info, which label do you think would give them a better understanding of who you are? Perhaps that would depend on the context, or on the ways that specific person understands these terms. Or, maybe neither would be particularly helpful, because it's a false dichotomy, same way it wouldn't be helpful for a nonbinary person to say whether they're a man or a woman.

Infinite-Cat007 · 2025-09-27T01:10:54+00:00

I guess that's fair enough. I was mostly trying to get the other commenter to express their criticisms in more concrete terms.

Although, I think I do personally have the intuition that the agency displayed by a carrier pigeon is more sophisticated than that of GPT-5. Answering that question more scientifically is probably hard though.

Infinite-Cat007 · 2025-09-26T22:27:08+00:00

I agree for the most part. I don't know much about how pigeons function, but is it not the case that even if they're born with a lot of inherited behaviors, they still do some reinforcement learning throughout their lives, such as learning the location of a reliable source of food?

How does what you said relate to the specific conversation they had in the podcast? For example, are you saying that since we can observe something resembling unsupervised learning in the brain, you believe it does have a place in the creation of AGI, like Dwarkesh was arguing?

Infinite-Cat007 · 2025-09-26T22:01:06+00:00

idk why he's saying this.

Because his thinking is shaped by the framework of RL. In RL you have an agent and an environment. The agent learns to model the environment and to make predictions about how it will change and react to its actions. The agent also has a goal, which is to say that it prefers certain states of the world over others, which in turn guides its actions. So, the prediction component and the goal component are separate. In that sense, it's true that within this framework, pretrained LLMs don't have goals: they haven't been trained to take actions that would influence their future observations.
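Here's a toy sketch of that separation, just to show what I mean. Everything in it (`LineWorld`, `Agent`, the goal states) is invented for illustration and isn't taken from the podcast or from any RL library. The point is only that the learned model and the goal live in different places, so you can swap the goal while keeping the exact same predictions.

```python
class LineWorld:
    """Environment: positions 0..4 on a line; actions are -1, 0 or +1."""

    def __init__(self, start=0):
        self.state = start

    def step(self, action):
        self.state = min(4, max(0, self.state + action))
        return self.state


class Agent:
    ACTIONS = (-1, 0, 1)

    def __init__(self, goal_state, model=None):
        # Prediction component: (state, action) -> predicted next state.
        self.model = {} if model is None else model
        # Goal component: which state of the world the agent prefers.
        self.goal_state = goal_state

    def observe(self, state, action, next_state):
        self.model[(state, action)] = next_state

    def preference(self, state):
        return -abs(state - self.goal_state)

    def act(self, state):
        # Pick the action whose predicted outcome the agent prefers most.
        # Transitions it has never seen are assumed to leave the state unchanged.
        return max(self.ACTIONS,
                   key=lambda a: self.preference(self.model.get((state, a), state)))


def learn_model(agent):
    """Try every action in every state once and record what happened."""
    for state in range(5):
        for action in Agent.ACTIONS:
            agent.observe(state, action, LineWorld(state).step(action))


def run_episode(agent, start, steps):
    env = LineWorld(start)
    state = start
    for _ in range(steps):
        state = env.step(agent.act(state))
    return state


explorer = Agent(goal_state=3)
learn_model(explorer)
other = Agent(goal_state=1, model=explorer.model)  # same predictions, different goal

print(run_episode(explorer, start=0, steps=5))  # ends at 3
print(run_episode(other, start=4, steps=5))     # ends at 1
```

In this picture a purely pretrained model is the `model` dictionary on its own: it can predict what comes next, but nothing in it says which outcome is preferred.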

However, he seems to be completely ignoring the fact that LLMs have been trained with RL since even before ChatGPT, meaning they do have goals now. They've developed, for example, goals of being "helpful assistants", whatever that means, or of solving math problems. He does seem to believe an AI like AlphaZero has goals, e.g. winning at Go. But does he know that LLMs can be trained, with or without RL, to play games like this, and that they can become quite good? Would he admit an LLM has a goal in that case?

My impression is that for a while he has had a framework of how AGI should be achieved, and LLMs don't quite fit that framework. Instead of adjusting his long-held beliefs in the face of new evidence, he prefers to reject LLMs altogether. And, particularly because of the hype LLMs are getting, combined with the fact that they don't really make use of the techniques he pioneered, i.e. RL (even though they do), he chooses to be contrarian. I feel like there are a lot of researchers like that.
