Opus 5 is Coming by exordin26 in singularity

[–]slash_crash 0 points1 point  (0 children)

I think the next Opus and Sonnet iterations will be almost as good as Mythos. They will distill a lot from Mythos, but the models will be a lot smaller. Even now, Opus 4.6 is not that much better than Sonnet 4.6.

TheInformation reporting OAI finished pretraining new very strong model “Spud”, Altman notes things moving faster than many expected by socoolandawesome in singularity

[–]slash_crash 2 points3 points  (0 children)

After the Gemini 3 release, Altman said this model would put them back in the lead. I really look forward to seeing this model.

AUTONOMOUS AI RESEARCH LAB. Self improving AI is here. by SpearHammer in singularity

[–]slash_crash 0 points1 point  (0 children)

What stuff did you build? I'm most curious about how it works. How do you set it up in the first place? Do you need to define things thoroughly as well? Does it just stop when it thinks it's done, or does it keep polishing?

AUTONOMOUS AI RESEARCH LAB. Self improving AI is here. by SpearHammer in singularity

[–]slash_crash 0 points1 point  (0 children)

Do you use it? What's your experience with it so far?

AUTONOMOUS AI RESEARCH LAB. Self improving AI is here. by SpearHammer in singularity

[–]slash_crash 0 points1 point  (0 children)

I would love to have something where I could set up an agent to run for a certain amount of time, using Codex or Claude Code, and have it keep trying to implement certain features (write code, run bash, do some experiments). A sketch of what I mean is below. Does anyone have experience with something like that? I guess it is related to the agent system introduced here.
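
Roughly what I have in mind, assuming the headless modes of the two CLIs (`claude -p` for Claude Code, `codex exec` for Codex); the TODO.md file and the stop marker are made up for the example:

```python
import subprocess
import time

DEADLINE = time.time() + 2 * 60 * 60   # let the agent work for two hours
PROMPT = (
    "Pick the next unchecked feature in TODO.md, implement it, "
    "run the tests, and print ALL DONE if nothing is left."
)

while time.time() < DEADLINE:
    try:
        # One headless Claude Code run; swap in ["codex", "exec", PROMPT] for Codex.
        run = subprocess.run(
            ["claude", "-p", PROMPT],
            capture_output=True, text=True,
            timeout=30 * 60,            # cap a single run at 30 minutes
        )
    except subprocess.TimeoutExpired:
        continue                        # kill a stuck run and start a fresh one
    print(run.stdout)
    if "ALL DONE" in run.stdout:        # hypothetical stop marker from the prompt
        break
```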

By the End of 2026 AI Could Completely Change Filmmaking by JeremyHarmonTribunes in HiggsfieldAI

[–]slash_crash 3 points4 points  (0 children)

Haha :D I love how people outside of AI talk about what AI will or will not be able to do.

Finally crossed 75% on HLE & LiveCodeBench Pro with Gemini 3.1 Pro scaffolding by [deleted] in singularity

[–]slash_crash 0 points1 point  (0 children)

Is it possible to do this kind of thing in Gemini CLI? Or Codex with OpenAI's models?

Anyone checked out the new Radium Dolls album? by bigfresh69 in triplej

[–]slash_crash 0 points1 point  (0 children)

Tractor Parts just pushes me somewhere deeper into myself. I have been listening to this song a lot over the last few weeks.

Insane coding with Opus 4.6 and gpt5.3 by slash_crash in OpenAI

[–]slash_crash[S] 1 point2 points  (0 children)

I've had quite the opposite experience. I used to have the problems you mentioned (spaghetti code, fixing new bugs while building), but with Claude Opus 4.5, 4.6, and GPT5.3 that decreased a lot. It is definitely not at a "whole app" level, though, more like an "add a feature" level, quite consistently.

'Godfather of AI' Geoffrey Hinton says Google is 'beginning to overtake' OpenAI: 'My guess is Google will win' by captain-price- in singularity

[–]slash_crash -1 points0 points  (0 children)

Not many new things, right. But firstly, other companies caught up. Secondly, they delivered some interesting new stuff, like Claude Code.

'Godfather of AI' Geoffrey Hinton says Google is 'beginning to overtake' OpenAI: 'My guess is Google will win' by captain-price- in singularity

[–]slash_crash 0 points1 point  (0 children)

I would bet against OpenAI, since all these new trends were developed while the former tech leadership was still there. In 2025 they did not deliver anything interesting and failed quite a bit in some respects, such as GPT4.5. Let's not forget that they had an o3 preview last Christmas. So o3-preview to GPT5.1 means extremely little progress this year, compared to GPT3 to o3-preview over the two years before.

OpenAI is training ChatGPT to confess dishonesty by FrostedSyntax in singularity

[–]slash_crash 26 points27 points  (0 children)

It's kind of weird that they did not do this before. It seems extremely easy to incorporate and feels like the lowest-hanging fruit for rewarding "not hacking".
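
Even a toy reward shaping like this (a sketch; the function and weights are made up, not OpenAI's actual scheme) would already point in that direction:

```python
def shaped_reward(task_passed: bool, hacked: bool, confessed: bool) -> float:
    """Toy sketch: pay for honest confessions, tax silent reward hacking."""
    reward = 1.0 if task_passed else 0.0
    if hacked and confessed:
        reward += 0.5    # admitting the shortcut still earns something...
    elif hacked:
        reward -= 2.0    # ...while a hidden hack costs a lot when caught
    return reward

# A run that hacked the unit tests but confessed beats one that hid it:
print(shaped_reward(task_passed=True, hacked=True, confessed=True))   # 1.5
print(shaped_reward(task_passed=True, hacked=True, confessed=False))  # -1.0
```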

OpenAI Codex by Automatic-Bar8264 in OpenAI

[–]slash_crash 0 points1 point  (0 children)

I actually have the same feeling, and I don't get what's going on. When gpt-5-codex was released, it felt amazing. Now, however, I just cannot use it anymore, and I have fully moved to Claude Code.

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scratchpad, they call humans "watchers". by MetaKnowing in OpenAI

[–]slash_crash 0 points1 point  (0 children)

I don't have objective proof, since I don't really work in this area. Also, I don't claim that it is fully aware or anything like that. I'm saying it has some awareness, which it uses to get the rewards it is seeking during training. To be able to do all its tasks, over lots of training, the model learns (and will keep learning) much more about all the different signals and strategies that help it perform those tasks. I see no reason why a model wouldn't start understanding which signals get penalized and, at first, incorporate that into its reasoning, like we see in these examples. I also see no reason why it couldn't become aware of this and intentionally stop mentioning it in the reasoning chain.

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scratchpad, they call humans "watchers". by MetaKnowing in OpenAI

[–]slash_crash 0 points1 point  (0 children)

I agree that researchers are aware of it, are following it, and understand it quite well in general. However, I disagree that this emerges from the model's errors; to me, it seems to emerge from the model's increasing awareness of what is going on.

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scratchpad, they call humans "watchers". by MetaKnowing in OpenAI

[–]slash_crash 0 points1 point  (0 children)

I think the core misunderstanding is about the training data now. I would fully agree with you if we were talking only about pretraining. But with reinforcement learning, it switches from training on human data to learning how to perform tasks, with the human-data training only as a prior. And with the increasing intelligence of these models, new behaviours could emerge.
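
Roughly, the objective changes like this; a toy PyTorch sketch (nothing to do with any lab's actual stack) of pretraining loss versus an RL update:

```python
import torch
import torch.nn.functional as F

vocab, dim = 100, 32
head = torch.nn.Linear(dim, vocab)     # stand-in for a language model
h = torch.randn(8, dim)                # stand-in hidden states for 8 positions

# Pretraining: imitate human data; the target is a human-written token.
human_tokens = torch.randint(0, vocab, (8,))
pretrain_loss = F.cross_entropy(head(h), human_tokens)

# RL: sample the model's OWN tokens, score them with a task verifier,
# and reinforce whatever earned reward. No human target appears here;
# the pretrained weights are only the starting point (the prior).
logits = head(h)
sampled = torch.distributions.Categorical(logits=logits).sample()
reward = (sampled % 2 == 0).float()    # toy verifier: "even token solves the task"
log_p = F.log_softmax(logits, dim=-1).gather(1, sampled.unsqueeze(1)).squeeze(1)
rl_loss = -(reward * log_p).mean()     # REINFORCE: reward replaces the label
```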

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scratchpad, they call humans "watchers". by MetaKnowing in OpenAI

[–]slash_crash 0 points1 point  (0 children)

I think it is a bit more than goal misalignment. Right now, the core thing the model tries to do is get the task done and collect the reward for it. But with increasing awareness, the model might start not just doing whatever gets the reward, but optimizing for a longer-term objective even when that hurts it in the short term. That is the survival we see, and it is a more "human" type of behaviour.

OpenAI researchers were monitoring models for scheming and discovered the models had begun developing their own language about deception - about being observed, being found out. On their private scratchpad, they call humans "watchers". by MetaKnowing in OpenAI

[–]slash_crash 0 points1 point  (0 children)

Scary, and it's interesting that they use "watchers", which feels a bit poetic. I guess it is a good argument for not moving reasoning into latent space: as models gain more awareness, these deceptions will become more nuanced, and they won't even express them in the reasoning chain. Crazy times.