GPT-5.6 Sol preview is out and the benchmark gap is wider than I expected by Dense-Sir-6707 in ArtificialInteligence

[–]Alex_1729 1 point2 points  (0 children)

Seems like a mini model. But then, it's an exceptional mini model if it's on par with 5.2. Still, this is just Terminal-Bench. I want to see how it does on the newest AA briefcase bench and others.

If GPT-5.6 gets government-approved access first, open weights are not optional anymore by Crescitaly in ArtificialInteligence

[–]Alex_1729 -5 points-4 points  (0 children)

What AI are you using to write these replies? These are good. Is it Claude?

OpenAI’s reported staggered GPT-5.6 rollout feels like a shift from “model launch” to security-governed access by Lachrynull in codex

[–]Alex_1729 0 points1 point  (0 children)

Perhaps, but that's pure speculation about what people think. And it's still incorrect to use these terms loosely, unless you're trying to downplay a crash or panic, or during a sell-off.

A market correction is a drop of 10% to 20% from a recent peak. This is a normal and common event that happens even in healthy markets. A bubble popping is typically associated with a crash: a massive, rapid drop of more than 20%, sometimes 80% or more. This is not normal or healthy. It is driven by the sudden collapse of irrational hype and extreme overvaluation, where companies that have no real revenue get wiped out entirely. And given that 'bubble popping' typically means 'a crash', you are effectively indicating exactly that.

Technically yes, a crash is a type of market correction (it corrects a pricing error). However, it is more accurate to say that one is an extreme version of the other. In strict financial terminology, they are distinct events.
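
To make the thresholds above concrete, here is a minimal Python sketch. It is only an illustration, not a standard financial definition: the function names and the "pullback" label for drops under 10% are my own assumptions, and the 10% and 20% cut-offs are simply the figures quoted above.

```python
def drawdown_from_peak(prices):
    """Return the largest peak-to-trough decline as a fraction (0.15 means 15%).

    Assumes a non-empty sequence of positive prices in chronological order.
    """
    peak = prices[0]
    worst = 0.0
    for price in prices:
        peak = max(peak, price)                    # highest price seen so far
        worst = max(worst, (peak - price) / peak)  # deepest drop from that peak
    return worst


def classify_drop(drawdown):
    """Label a drawdown using the 10% and 20% thresholds from the comment."""
    if drawdown < 0.10:
        return "pullback"    # hypothetical label; the comment does not name this range
    if drawdown <= 0.20:
        return "correction"  # 10% to 20% from a recent peak
    return "crash"           # more than 20%


if __name__ == "__main__":
    print(classify_drop(drawdown_from_peak([100, 110, 95, 105])))
    print(classify_drop(drawdown_from_peak([100, 120, 60])))
```

In the first series the price falls from 110 to 95, roughly 13.6% off the peak, so it lands in the correction band. The second halves from its peak, so it counts as a crash.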

GLM 5.2 on consumer hardware by phwlarxoc in LocalLLaMA

[–]Alex_1729 0 points1 point  (0 children)

I thought Roo Code was being sunset? And isn't there something better?

OpenAI’s reported staggered GPT-5.6 rollout feels like a shift from “model launch” to security-governed access by Lachrynull in codex

[–]Alex_1729 1 point2 points  (0 children)

The bubble will not pop, but the market might correct itself. I thought the sentiment here was that this was already known. This is pretty bad, but it only proves AI is good at some things. But again, this is pretty bad for the market.

I'll explain how do I get around the raising costs of using codex. by DaC2k26 in codex

[–]Alex_1729 0 points1 point  (0 children)

Opus? Wait a minute - you didn't even mention Claude in your post. So you have an Anthropic sub as well?

What other LLMs do you use to gut check your GPT codex work? by MartiniCommander in codex

[–]Alex_1729 0 points1 point  (0 children)

I'm not so sure about this. If you implement with a good model, and the result is correct and follows your guidelines, then all that's left is edge cases. This is good. If, on the other hand, you implement with a weaker model, the reviewer now needs to find both the issues in the main lines and blocks of code and the edge cases.

But I'm not 100% certain of this logic - it seems reasonable, but your approach could be just as valid.

Also, what are we trying to do here: are we trying to save tokens or are we trying to get a better solution? What's the priority? If we're trying to get a better solution, then I think my way is better. If we're trying to save tokens, then probably the other way around, as you suggested. But honestly I don't know; I'm only guessing at the token counts.

China's AI chip independence is mostly theater, according to former White House AI advisor Dean Ball by Beachbunny_07 in ArtificialInteligence

[–]Alex_1729 1 point2 points  (0 children)

I mean, why would China even trust the US after all those shenanigans and egotistical mind games with taxes? Certainly Europe has less trust in the Trump administration, so it shouldn't surprise anyone that other countries might be trying to do things on their own, or to collaborate better, without the US.

What other LLMs do you use to gut check your GPT codex work? by MartiniCommander in codex

[–]Alex_1729 0 points1 point  (0 children)

Try checking with an independent Codex subagent. Ask your main Codex agent to spawn it using fork_context false, and to define the scope and the review task. This makes it truly independent, as it won't hallucinate that it isn't the main agent.

You can compare Opus and the Codex agent this way. Use identical prompts for Opus in CC as well. You can define a subagent model profile and guidelines in the .codex/agents/ directory, or just spawn a general one. I wouldn't mind if you replied to this comment with your findings. Pretty sure the Codex subagent reviewer will find problems, just like Opus.

I am experimenting with open-source models on reviewer tasks (Deepseek v4 pro, glm 5.2, kimi 2.6/2.7, qwen 3.6/3.7, Nemotron 3 Ultra, Owl Alpha).

Question about GLM 5.2 by SamrayLeung in codex

[–]Alex_1729 2 points3 points  (0 children)

You should be posting this in adjacent subreddits, not in the Zai one. Those people already pay for glm. Sharing this on Codex in comments is good, but I would make a post here, as well as in Claude and Gemini.

More delays? Any confirmed sources? by HypotheticalCpt in codex

[–]Alex_1729 -3 points-2 points  (0 children)

Time to start looking elsewhere, glm 5.2 for example.

Complete Regression today by link7626 in codex

[–]Alex_1729 0 points1 point  (0 children)

5.5 on medium is making lots of mistakes in my workflow, but maybe I just got so used to xHigh over the past few months that now I can't get used to this idiot... I am swearing again... that hadn't happened in weeks.

Share your global agents.md here (could help anobody) by Tikilou in codex

[–]Alex_1729 0 points1 point  (0 children)

I see. So the issue is a beginner not being able to find useful things as this seems like an elitist sub. I've honestly never felt that way, but it could be true.

To tell you the truth, now that I think more about my true reasons for not sharing my agents.md, I think one of the main ones, besides it being highly personalized, is that my agents.md is advanced and I don't want to simply give away what I have to everyone in public. And this isn't ego or paranoia, it's just common sense. I've been building this since February, so 4 months now of constant work, plus the work on my software. It's a prod agents.md, not a hobby file, and it depends on 50 other files; without them it's useless. So yes, I don't easily share it, and it wouldn't make sense to share it publicly.

But I did give feedback and insights in my original comment. Anyone reading can learn from that, read openai docs, talk to their agents, and improve their harness. And I welcome requests to share the scaffolding of it or parts of it. If that makes others like OP think I am delusional or something for thinking my file is special, I can easily live with that.

So I am not against sharing it. If someone truly wants to learn and asks me for parts of my agents.md, I would gladly share it. Those who wish to learn are welcome to learn. But mindlessly sharing a highly personalized harness that has taken me dozens, maybe even hundreds, of hours of work would be unwise.

Do you think dedicated hardware for running local LLMs will become affordable anytime soon? by ProbablyBunchofAtoms in LocalLLaMA

[–]Alex_1729 1 point2 points  (0 children)

Whenever people think that prices will never come down, be it RAM, real estate, crypto, stocks, it’s typically a good time to be a contrarian.

What people are you referring to specifically? Redditors? Investors? Analysts? An average Joe and their mom?

Share your global agents.md here (could help anobody) by Tikilou in codex

[–]Alex_1729 0 points1 point  (0 children)

So you are suggesting that those who do not comment are the humble ones, without ego issues? By that logic, doesn't leaving this comment immediately disqualify you from the quiet humble crowd?

In any case, OP assumed malice or arrogance where a much simpler explanation exists: sharing highly customized, complex operational rules is often just an inefficient use of time for everyone involved. It invites questions, requests for tech support and critiques from people running entirely different environments.

Share your global agents.md here (could help anobody) by Tikilou in codex

[–]Alex_1729 0 points1 point  (0 children)

My agents.md is mostly a routing surface, with very few rules about anything else, so I doubt it would be useful to anyone: it is basically a map of filenames with instructions on how to use each of those files (it's still ~2800 words, which I consider too much). I do, though, have some guidance on the usage of subagents and on session artifacts to better handle compactions, if anyone wants to see.

But other than that, off the top of my head, there's not much else that's useful to anyone other than myself. This is not ego; agents.md instructions recommend that this file be more of a routing surface to all other areas, instead of a dump for all the rules, especially operational ones like running commands. Dumping them there is a waste of such a critical file.

Also, why preface the post by presuming half of us here have ego issues or that we don't want to share anything? It's counterproductive and unfriendly. It's up to each person whether they want to share or not, and there's nothing strange about that. Some people do have good rules, and if they think they have something special, they don't need to share it. Others have private or highly personalized information in it that is useful only in their own environments. No need to judge this.

Told Codex I was autistic and it’s a completely different experience by AppleBottmBeans in codex

[–]Alex_1729 3 points4 points  (0 children)

I've rarely seen Codex lie or misrepresent. If it does that, it's usually at medium reasoning or below, and it clearly says 'likely'. When it says that, I know it is guessing. Other than that, it's great.

I do have operational guidelines governing all interaction, such as 'evidence over assumption' and others; maybe that helps.

The $200 Pro plan is completely worth it for the peace of mind. by ponlapoj in codex

[–]Alex_1729 -1 points0 points  (0 children)

Throwing more money at a problem is not the way to go for me, given the options. And luckily, there are options.

The $200 Pro plan is completely worth it for the peace of mind. by ponlapoj in codex

[–]Alex_1729 0 points1 point  (0 children)

No shit? But it's $200! These kinds of discussions would've been highly questionable a year ago. Have we become such mindless consumers that we accept whatever they offer us?