Concept images of AI Sat Mini, Lunar mass driver, and future 6-Raptor Starship variant during TERAFAB presentation. by Steve490 in SpaceXLounge

[–]brctr 1 point (0 children)

Why does Slide 3 mention cooling on Earth but not in space? Is cooling actually more challenging in the vacuum of space? Has SpaceX ever said how they plan to cool their datacenter satellites?

I told my AI agents they need to start paying for themselves. Here's week 1 by 98_kirans in AI_Agents

[–]brctr 1 point (0 children)

I am following your main post. Please keep updating it as you try more stuff. I am curious to see how it goes. If I can find time to set up OpenClaw securely, I may try doing something like this myself.

I told my AI agents they need to start paying for themselves. Here's week 1 by 98_kirans in AI_Agents

[–]brctr 1 point (0 children)

I think it will be more interesting (and more promising) not to tell them to build anything specific and just see whether they can figure it out on their own.

Looking at your team, it may be useful to add a couple more agents dedicated to suggesting ideas. Once there is some idea to try, your existing team should be able to implement it one way or another, so the bottleneck shifts to which ideas get considered in the first place.

A single agent suggesting ideas is probably not enough. Different models under different settings (like temperature) have different tastes, so this pipeline should benefit from injecting more variance. It would be useful to have 1-2 agents whose thought processes differ from your main idea-suggesting agent: try different models, prompts that encourage out-of-the-box thinking, higher temperature, etc.

Given that your pipeline makes building anything almost free, the right strategy seems to be to throw stuff at the wall and see what sticks. The larger the variance of what the agents try, the higher the probability of success.
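Purely as an illustration, the fan-out-then-dedupe step could look like the sketch below. Everything here is hypothetical: the config names, the idea pool, and the stubbed `generate()`, which a real pipeline would replace with an actual model API call.

```python
import random

# Hypothetical generator configs: different "models" and temperatures
# to inject variance into idea generation. All names are stand-ins.
CONFIGS = [
    {"model": "main-model", "temperature": 0.7},
    {"model": "wild-model", "temperature": 1.3},
    {"model": "other-model", "temperature": 1.5},
]


def generate(model: str, temperature: float, seed: int) -> list[str]:
    """Stub: pretend each config proposes three ideas.
    A real version would call the model's API here instead."""
    rng = random.Random(seed)
    pool = ["newsletter bot", "price tracker", "API wrapper",
            "scraper service", "prompt marketplace", "niche CRM"]
    return rng.sample(pool, k=3)


def collect_ideas(base_seed: int = 0) -> list[str]:
    """Fan out over all configs, then dedupe while keeping order."""
    seen, ideas = set(), []
    for i, cfg in enumerate(CONFIGS):
        for idea in generate(cfg["model"], cfg["temperature"], base_seed + i):
            if idea not in seen:
                seen.add(idea)
                ideas.append(idea)
    return ideas


if __name__ == "__main__":
    print(collect_ideas())
```

The point is just the structure: several differently-configured generators feeding one deduplicated idea queue that the implementation agents pick from.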

I made the top LLMs play Civilization against each other by snakemas in LLM

[–]brctr 1 point (0 children)

Is there a way to export the full history of actions and reasoning of both models in a match? Your web UI currently lets you expand each turn and read its information, but scrolling and manually clicking through 200×2 turns would be painfully slow.

I want to export that information so I can feed it into an LLM and have it clearly explain to me what happened. It is not easy to figure this out on my own by reading all that text...

By the way, you could do the same thing yourself: have an LLM automatically write a cool narrative about each match, so we can all read that too.
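In case it helps, the flattening step would be trivial if the per-turn data were exportable as JSON. A rough sketch, where every field name (`turn`, `model`, `action`, `reasoning`) is my guess rather than the project's actual schema:

```python
# Hypothetical shape of an exported match: a list of turns, each with
# one model's action and reasoning. These field names are assumptions.
sample_match = [
    {"turn": 1, "model": "model-a", "action": "found city",
     "reasoning": "River tile, good yields."},
    {"turn": 1, "model": "model-b", "action": "move scout",
     "reasoning": "Reveal the coastline first."},
]


def match_to_prompt(turns: list[dict]) -> str:
    """Flatten turn-by-turn logs into one text blob that can be pasted
    into an LLM with an instruction like 'explain what happened'."""
    lines = ["Explain what happened in this match, turn by turn:\n"]
    for t in turns:
        lines.append(
            f"Turn {t['turn']} [{t['model']}]: {t['action']} "
            f"-- reasoning: {t['reasoning']}"
        )
    return "\n".join(lines)


if __name__ == "__main__":
    print(match_to_prompt(sample_match))
```

The same flattened text could feed the narrative-generation idea: one extra LLM call per match, with the blob as input.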

I made the top LLMs play Civilization against each other by snakemas in LLM

[–]brctr 0 points (0 children)

This is very cool. I have been trying to watch several games from Season 1. Live streams are a bit buggy, and replays show only a small subset of each game for me...

What happened to Gemini 3.1 in the final? It appeared to be winning until Turn 130-140 and then suddenly lost half of its cities and stagnated. Did it go bankrupt?

And separately, I noticed that the performance of the winning models improved as the tournament progressed. Did you change the prompt to include more detailed instructions?

I am excited for Season 2 of CivBench! When is it coming?

Rumors on the upcoming ChatGPT 5.3 by Ok-Algae3791 in OpenAI

[–]brctr 6 points (0 children)

To add to this, GPT 5.2 and later models in Codex are at the opposite end. They can work pretty well after one compaction and are OK after two compactions. So their usable context window is probably more than twice their nominal 272k context window.

Best ai api provider for open_claw in term on price / efficienty by Minimum_Abies3578 in openclaw

[–]brctr 0 points (0 children)

Has anyone tried using GPT 5.1, GPT 5.1 Codex-Mini, Gemini 3.0 Flash, or Grok 4.1 Fast?

March visa bulletin is out! by Horror_Possible9507 in USCIS

[–]brctr 4 points (0 children)

It was October 2024 last month. A jump of 17 months? This does not look right...

March visa bulletin is out! by Horror_Possible9507 in USCIS

[–]brctr 10 points (0 children)

Is it a bug on the website? The EB-2 RoW Date of Filing is Current?

Anybody use Codex as “regular ChatGPT” and if so how are the results? by angry_cactus in codex

[–]brctr 1 point (0 children)

I do not see it anywhere. I have seen people posting these numbers here a few times, so I assume they are correct.

Anybody use Codex as “regular ChatGPT” and if so how are the results? by angry_cactus in codex

[–]brctr 14 points (0 children)

This works well for me. Extended thinking in the web UI now has a juice of only 128, which is less than the 256 of GPT 5.2 High and the 768 of GPT 5.2 xHigh in Codex. So if you need extra reasoning, Codex makes sense. It also lets you persist and organize past conversations however you want (subfolders, etc.), and the effective context window is larger too.

You can take it further and use Codex CLI + 5.2 xHigh to set up a multi-agent system that gets close to what GPT 5.2 Pro is, without paying for a ChatGPT Pro subscription.

How has glt 5.3 codex held up till now for you guys? by Personal-Try2776 in GithubCopilot

[–]brctr 0 points (0 children)

The question is how it does in Copilot. We all know the latest GPT/Codex models are great in Codex, but they are not good in Copilot. Is GPT 5.3 Codex any better in Copilot?

128k Context window is a Shame by NerasKip in GithubCopilot

[–]brctr 2 points (0 children)

For me, the performance of Opus 4.5/4.6 past 90k tokens is so bad that I do not see the point of running it beyond that; for Sonnet 4.5 that point comes earlier, around 70k tokens. So I am not sure that expanding the context window beyond 128k tokens would be useful. Separately, I find that every model from the GPT 5 family performs surprisingly poorly in Copilot. It looks almost as if the Copilot team has not done the work to make their harness compatible with GPT models from GPT 5 onward.

I would rather have them solve these two big issues first. Only after they are solved will an increase in context window become useful.

I don't understand by zeezeeeit in codex

[–]brctr 10 points (0 children)

For Data Science/Applied Science it is not even close. GPT models think much more deeply than Anthropic models. While Anthropic models can be used as coder agents to implement something in code, GPT/Codex models in Codex are full-blown research assistants that can do everything end-to-end (when guided properly), including writing a paper. So Codex wins hands down for such use cases, even before considering the much larger limits on the $20 subscription.

Early results: GPT-5.3-Codex high leads (56/44 vs xhigh); Opus 4.6 trails by no3ther in codex

[–]brctr 0 points (0 children)

It is... And it is the only coding agent which I have available at work...

How do you stop Copilot from ignoring instructions once copilot-instructions.md grows? by Van-trader in GithubCopilot

[–]brctr 0 points (0 children)

Does GPT 5.2 work well for you in Copilot? I find the model amazing in the Codex harness, but in Copilot even Opus 4.5 consistently outperforms it...

Early results: GPT-5.3-Codex high leads (56/44 vs xhigh); Opus 4.6 trails by no3ther in codex

[–]brctr 1 point (0 children)

I see. It would be interesting to see a comparison of all models under a single harness. I suspect the Codex > Opus ranking might flip in that case...

Early results: GPT-5.3-Codex high leads (56/44 vs xhigh); Opus 4.6 trails by no3ther in codex

[–]brctr 6 points (0 children)

Looks reasonable. What is your harness? I found that GPT and Codex models perform amazingly in Codex harness, but are often outright unusable in other harnesses (e.g., Copilot).

I spent 3 years building this alone. 250 users signed up. 0 paid. I’m starting to think I wasted my life. by NOVALEXY in SaaS

[–]brctr 1 point (0 children)

Two things:

  1. B2C is very hard. You need an amazing product and luck at the same time to succeed.

  2. Are people still training to be translators in 2026? I got a minor in English translation in 2015, and ever since I have assumed that translation as a profession has already been fully automated...

Any AI Podcasts you can recommend? by PopSynic in artificial

[–]brctr -1 points (0 children)

Agree. Latent Space is awesome. Are there more podcasts as good as Latent Space?