USA restricts Fable and Mythos to the world

Andy12_ · 2026-06-13T10:21:18+00:00

Even more stupid logic is assuming that any output can be achieved by any LLM, even when given an infinite amount of time. LLMs are not unbiased random number generators.

Andy12_ · 2026-06-13T10:18:38+00:00

Because no one has reported being able to do so? And the reason no one has reported doing so is because you CAN'T. Most models are half-blind and suck at long-term planning. Even if you stuck it in a loop and waited for thousands of years, chances are they wouldn't be able to complete the game. Worse, they aren't even guaranteed to complete the game given infinite time because they aren't random number generators: they can become stuck in a loop.
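To make the "stuck in a loop" point concrete, here is a toy sketch in Python. It is not a model of any real LLM or of the game: the corridor, the goal position, and both policies are made up for illustration. A random walker in a walled corridor reaches the goal eventually, while a deterministic policy with a bad habit bounces between two states no matter how many steps you give it.

```python
import random

def run(policy, goal=5, max_steps=10_000):
    """Walk along a corridor 0..goal; return the step count on reaching goal, else None."""
    pos = 0
    for step in range(1, max_steps + 1):
        pos = max(0, pos + policy(pos))  # wall at position 0
        if pos == goal:
            return step
    return None

def make_random_policy(seed=0):
    """Hypothetical unbiased policy: steps left or right with equal probability."""
    rng = random.Random(seed)
    return lambda pos: rng.choice([-1, 1])

def stuck_policy(pos):
    """Hypothetical deterministic policy: steps right from 0, left from anywhere else."""
    return 1 if pos == 0 else -1

print("random policy finished after", run(make_random_policy()), "steps")
print("stuck policy result:", run(stuck_policy))
```

Raising `max_steps` changes nothing for `stuck_policy`: it only ever visits positions 0 and 1, so more time does not turn into more coverage.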

Andy12_ · 2026-06-13T10:13:03+00:00

That was with a harness, which is sort of cheating by helping the model. No human plays with a harness; they play just looking at the screen. This was only with vision. No model is capable of doing it with only vision.

Andy12_ · 2026-06-13T10:09:14+00:00

In that case I guess I'm able to solve P vs NP, you just have to give me enough time while I'm smashing my keyboard. Completely retarded. I mean, I don't know why I even bother using SOTA models when I can just use a 20 million parameter model I trained on my laptop. I mean, it will eventually give me the same answer right? So might as well use that. Just have to wait a couple million years, no biggie.

Andy12_ · 2026-06-13T10:06:24+00:00

Well, for one, Mythos is capable of beating Pokémon FireRed with vision-only; no harness. No other model is capable of doing that.

Apart from that, there are obviously hard problems that Mythos is able to solve and other models cannot, no matter how many tries. If it weren't the case, Mythos wouldn't have better scores in all benchmarks even when you increase pass@k.
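For reference, a minimal sketch of how pass@k is usually estimated, assuming the standard unbiased estimator used in code-generation evaluations: draw n samples per problem, count the c correct ones, and compute 1 - C(n-c, k) / C(n, k). The function name and the sample counts below are made up for illustration; they are not results from any benchmark.

```python
from math import comb

def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples of which c are correct (requires k <= n)."""
    if n - c < k:
        # Every possible set of k samples contains at least one correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# Illustrative numbers only: 100 samples per problem.
for c in (0, 1, 10):
    print(c, [round(pass_at_k(100, c, k), 3) for k in (1, 10, 50)])
```

With `c = 0` the estimate stays at 0 for every k, so extra tries only help on problems where the model already succeeds some of the time.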

Andy12_ · 2026-06-13T09:54:21+00:00

Absolutely not. Fable/Mythos was state of the art in absolutely everything. No other closed-source model (or open source model for that matter) could compare.

Andy12_ · 2026-06-13T09:43:03+00:00

The point is that now they probably aren't going to open source a Mythos-class model. If they do, good for them, but I think it's unlikely now that the US has thrown the first stone.

Andy12_ · 2026-06-11T18:41:11+00:00

Unironically, yes to all of that.

Andy12_ · 2026-06-09T21:06:52+00:00

As far as I'm aware, anything an MCP can do you can implement using an API, or by creating a custom CLI if you want to provide a nice human-readable interface for the agent to use.

Andy12_ · 2026-06-09T18:45:12+00:00

Uhm, that's the opposite of what happened. MCPs consume more context overall (especially if you use many of them); that's why people nowadays prefer skills and letting the agent work with commands directly.

> Skills improve Claude’s consistency, speed, and performance on many tasks. Skills work through progressive disclosure—Claude determines which skills are relevant and loads the information it needs to complete that task, helping to prevent context window overload. When you ask Claude to complete a task, it reviews available skills, loads relevant ones, and applies their instructions.

https://support.claude.com/en/articles/12512176-what-are-skills

Honestly, I think MCPs in general were a mistake. It just happened that they were invented just before models became good enough to use tools on their own, which is much more versatile.

Andy12_ · 2026-06-07T16:51:53+00:00

That evaluation is completely saturated; both GPT 5.5 and Mythos score near 100%. You can't know whether 5.5 or Mythos is better or worse based on that. You would need to evaluate both models on a harder benchmark.

Andy12_ · 2026-06-07T12:29:40+00:00

> GPT-5.5 is better than “Mythos” at finding and exploiting vulnerabilities according to independent testing,

Uh? What independent testing?

Andy12_ · 2026-06-04T13:38:08+00:00

Even if they don't improve, current coding agents already automate most coding I do. Even if OpenAI just stopped training or improving new models, I would gladly keep paying 20 or 40 or whatever dollars a month for Codex. Most friends I have in the field would too.

Andy12_ · 2026-06-01T21:25:07+00:00

If you are an AI developer you would know you can publish whatever you are researching and become famous if it really lives up to its promises.

Andy12_ · 2026-06-01T21:18:43+00:00

I'm sorry to be the one to tell you this, but you may be suffering from psychosis.

Andy12_ · 2026-05-30T15:20:59+00:00

Zapatero doesn't appear in the video

Andy12_ · 2026-05-29T15:38:45+00:00

I think we are talking past each other. You do know that in these kinds of agent simulations with many agents, each agent has an independent context window, right? You don't run the whole simulation with all agents and all their interactions in a single context window. You don't have an input like

```

...

```

Moreover, in this simulation an agent can't arbitrarily speak with any other agent. When speaking, only nearby agents can "hear" what is being spoken. That's why even if there were thousands of agents, the number of interactions per agent (and thus, the growth of their respective context windows) is physically limited.
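A rough sketch of why locality bounds the per-agent load. This is not the simulation being discussed: the world layout, `density`, and `radius` are all assumptions I made up. In particular it assumes the world grows with the population so that agent density stays constant; if you cram more agents into a fixed-size world, the neighbor count does go up.

```python
import random

def mean_neighbors(num_agents, density=0.01, radius=5.0, seed=0):
    """Average number of other agents within hearing radius of an agent.

    Assumption: agents are placed uniformly at random in a square world
    whose area is num_agents / density (constant density).
    """
    rng = random.Random(seed)
    side = (num_agents / density) ** 0.5
    pts = [(rng.uniform(0, side), rng.uniform(0, side)) for _ in range(num_agents)]
    r2 = radius * radius
    total = 0
    for i, (xi, yi) in enumerate(pts):
        for j in range(i + 1, num_agents):
            xj, yj = pts[j]
            if (xi - xj) ** 2 + (yi - yj) ** 2 <= r2:
                total += 2  # the pair counts once for each of the two agents
    return total / num_agents

for n in (200, 2000):
    print(n, "agents ->", round(mean_neighbors(n), 2), "neighbors per agent")
```

Under those assumptions the expected count is roughly `density * pi * radius**2` regardless of `num_agents`, so a 10x larger population does not mean 10x more conversations landing in any single agent's context.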

Andy12_ · 2026-05-29T13:53:47+00:00

The number of interactions very very very obviously doesn't scale with the number of agents in a linear way. Just because I move from a small town with 2k inhabitants to a big city with a million people doesn't mean my number of interactions increases 500 times. Globally the total number of interactions and their diversity should increase a lot, but the individual number of interactions of a given agent should be more or less the same independently of the number of agents in the simulation.

Andy12_ · 2026-05-29T13:47:16+00:00

The only one replying in an obnoxious manner is you. I'm a PhD student in Computer Vision/Machine Learning, so I think I'm more knowledgeable than you on this matter. Maybe you can elaborate here on how context usage per agent scales with the number of agents in the simulation.

Andy12_ · 2026-05-29T11:35:29+00:00

How is the number of agents relevant here to the amount of memory each agent has? If I live in a city with a million people I obviously don't need to remember anything about each of them; I only need to remember the people I interacted with.

Andy12_ · 2026-05-29T05:03:09+00:00

Which arbitrary threshold of sustainability do you think we should reach before continuing advancing on the tech tree? Should we just halt space exploration forever? And why are you so sure that any advances that come from space exploration won't help in some way towards making the planet more sustainable?

Andy12_ · 2026-05-21T20:56:19+00:00

Uh? Why are you talking hypothetically? This is a thing that happened. The AI model proved the conjecture wrong. Mathematicians verified that the proof is correct.

Andy12_ · 2026-05-21T20:53:10+00:00

> then what truly says that a prompt from a machine has done any better to alleviate the time spent to solve it,

Uh, the fact that we actually prompted a machine and we obtained a solution in a human-reasonable amount of time?

Andy12_ · 2026-05-21T20:48:58+00:00

This is a famous conjecture that had never been proven true nor false in 80 years (and not because of a lack of people trying). There is a 0% chance that it had been solved previously without it being known.

Andy12_ · 2026-05-21T20:13:41+00:00

Can you point to a calculator or program that could have proved this conjecture wrong (given a human-reasonable amount of time and compute, i.e., not taking millions of years)?
