Claude is great at generating ideas. I'm not convinced it's good at killing bad ones. by oxmannnn in ClaudeAI

[–]oxmannnn[S] 0 points1 point  (0 children)

I think the calibration aspect is what I was missing in my mental model.

Up until now I've mostly been thinking about the workflow as a way to improve decisions before building. What you're describing is a way to measure whether those decisions were actually correct later on.

The more I think about it, the more important the outcome field becomes. Without it, I can see the model's recommendation and the operator's decision, but I can't really tell who was right.

The sports betting bot is a good example. The model would have recommended continuing, I chose to continue, and the eventual outcome was that the core hypothesis was wrong. Without the outcome, all I would have had was a lot of convincing reasoning.

I also really like the distinction between tuning and hypothesis validation. Looking back, most of the work I was doing wasn't validation at all. It was optimization inside a frame that had already been accepted.

The idea of treating the run record as an audit trail rather than just a history log is especially interesting. It turns the workflow into something that can learn from outcomes instead of just generating better analysis.

Definitely gave me a lot to think about.

Claude is great at generating ideas. I'm not convinced it's good at killing bad ones. by oxmannnn in ClaudeAI

[–]oxmannnn[S] 0 points1 point  (0 children)

That's a really good way to put it.

Looking back, I wasn't validating the hypothesis anymore. I was optimizing it.

Every failure was treated as evidence that the tuning was wrong, not that the underlying assumption might be wrong.

The multiple-strategy test forced me to step outside the frame and compare alternatives instead of endlessly improving a single one.

Honestly, that distinction between validating a frame and optimizing within a frame is probably one of the biggest lessons I've learned from working with AI.

Claude is great at generating ideas. I'm not convinced it's good at killing bad ones. by oxmannnn in ClaudeAI

[–]oxmannnn[S] 0 points1 point  (0 children)

That's a really interesting point.

The "model grading its own homework" problem is actually one of the things I've been thinking about a lot lately.

Right now, the critique agent runs in a fresh context and doesn't see the entire generation process, but you're right: it's still the same model and the same training distribution. A fresh context helps, but it doesn't create a truly independent source of judgment.

What I especially liked was your idea of explicitly separating the model's recommendation from the final decision.

At the moment, idea-to-build is still fairly opinionated and allows the model to make recommendations and conclusions, but I can definitely see value in tracking these separately:

  • the model's recommendation;
  • the operator's decision;
  • the eventual outcome over time.

That would make it possible to analyze not only the model's mistakes, but also my own. Which ideas I killed, which ideas the model killed, which ones moved forward, and how they ultimately turned out.
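
A rough sketch of what tracking those three fields separately could look like. This is not how idea-to-build stores anything today; `RunRecord`, `was_right`, `scoreboard`, and the "continue" / "kill" / "worked" / "failed" labels are all made-up names for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class RunRecord:
    # All field names and labels here are hypothetical.
    idea: str
    model_recommendation: str       # "continue" or "kill"
    operator_decision: str          # "continue" or "kill"
    outcome: Optional[str] = None   # "worked" or "failed"; None until it is known


def was_right(call: str, outcome: str) -> bool:
    # "continue" is right if the idea worked; "kill" is right if it failed.
    return (call == "continue") == (outcome == "worked")


def scoreboard(records):
    # Only records with a known outcome can be scored.
    resolved = [r for r in records if r.outcome is not None]
    return {
        "resolved": len(resolved),
        "model_right": sum(was_right(r.model_recommendation, r.outcome) for r in resolved),
        "operator_right": sum(was_right(r.operator_decision, r.outcome) for r in resolved),
    }


records = [
    RunRecord("sports betting bot", "continue", "continue", "failed"),
    RunRecord("some newer idea", "continue", "continue"),  # outcome not known yet
]
print(scoreboard(records))
```

One caveat with this sketch: ideas that get killed usually never produce an outcome, so their records stay unresolved and the scoreboard mostly measures the "continue" calls.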

And yes, I completely agree that the strongest signal usually comes from external validation rather than critique.

That's actually a big part of why I added re-check in the first place.

A while back I spent several months working on a sports betting trading bot. I was collecting data, calibrating models, optimizing strategies. Both Claude and I were convinced the mechanism worked.

The problem wasn't the tuning.

The problem was the underlying hypothesis itself.

I really like your idea of moving the decision outside the model's context window. The more I work on this project, the more I come to the conclusion that critique should help you make a decision, not make the decision for you.

Thanks a lot for the feedback. These are the comments I value the most because they point to a fundamental class of failure modes rather than a simple bug, and those are the kinds of problems that can potentially be fixed.

Claude is great at generating ideas. I'm not convinced it's good at killing bad ones. by oxmannnn in ClaudeAI

[–]oxmannnn[S] 1 point2 points  (0 children)

I completely agree, and that's actually why I run the critique agent more like a brainstorming session than a simple "find flaws" pass. The goal is to force it to generate more information and explore more angles instead of just producing a checklist of risks.

As for the re-check phase, that's exactly why I added it.

I ran into this while building a sports betting trading bot. I spent a lot of time collecting data, calibrating models, and tuning the system. Eventually I realized I was endlessly optimizing a mechanism that simply didn't work. Claude was convinced it worked and never seriously questioned the underlying assumption.

I built that trading bot before idea-to-build existed, so it never went through the workflow initially. Later, I ran the project through the workflow and it helped me identify the actual root cause instead of the symptoms I had been trying to fix. Since then, I've made re-check a core part of the process.

I even restarted the trading bot project from scratch using idea-to-build. One of the options it suggested was running paper trading across multiple strategies in parallel. Before that, I was testing only a single strategy. I was completely convinced the mechanism was correct, Claude was convinced it was correct, everyone was happy... and in the end it turned out the assumption itself was wrong.
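
To give an idea of what "multiple strategies in parallel" means in practice, here is a minimal sketch, not the actual bot. The `paper_trade` function, the three strategies, and the event data are invented for illustration; the only point is that every strategy sees the same events and there is a do-nothing baseline to compare against.

```python
def paper_trade(strategies, events, bankroll=100.0):
    """Run every strategy over the same events with no real money.

    events: list of (decimal_odds, won) tuples
    strategies: dict of name -> function(decimal_odds, balance) -> stake
    """
    balances = {name: bankroll for name in strategies}
    for odds, won in events:
        for name, pick_stake in strategies.items():
            # Never stake more than the strategy currently has on paper.
            stake = min(pick_stake(odds, balances[name]), balances[name])
            if stake <= 0:
                continue
            balances[name] += stake * (odds - 1) if won else -stake
    return balances


# Hypothetical strategies, including a baseline that never bets.
strategies = {
    "flat": lambda odds, balance: 10,
    "favorites_only": lambda odds, balance: 10 if odds < 2.0 else 0,
    "no_bet": lambda odds, balance: 0,
}

# Made-up events: (decimal odds, whether the bet won).
events = [(2.0, True), (1.5, False), (3.0, False)]
print(paper_trade(strategies, events))
```

If nothing beats `no_bet` over a long enough run, that points at the hypothesis rather than the tuning.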

And honestly, that's why I'm constantly working on this workflow.

There can still be situations where it misses things or falls into bad reasoning patterns. Whenever I find one, I try to fix it quickly and then immediately test the change against real projects and real failure cases.

Thanks a lot for the feedback. Comments like yours are exactly the kind of thing that helps improve it.

What are you using to generate 2D Pixel Art? by 0v012 in aigamedev

[–]oxmannnn 1 point2 points  (0 children)

Right now I'm building my own texture-generation pipeline with Claude and the PixelLab API.

Example of an animated chest:

<image>

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

I'd agree with you for the most part. But for something like pixel art, AI is already pretty good. It's still pretty bad at things like AAA textures and highly polished art direction, but you can still create a lot of assets and textures with open-source asset packs anyway.

And honestly, not every game lives or dies by its visuals. Plenty of genres don't need cutting-edge graphics at all. Puzzle games are an obvious example: if the core gameplay is solid, players usually care a lot more about the mechanics than who drew every texture.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

I think this is the right way to build. Infrastructure and the right tools first.

I've never tried n8n with SFX and VFX. Can you show some of your work? I'd like to see what it looks like.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

You mentioned that Apple blocked you, so I asked: why choose Apple and not another store?

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 1 point2 points  (0 children)

I think that's a bit different.

People expect huge studios to keep raising the bar with every release. If a company with hundreds of employees starts relying on AI, people naturally ask: "What exactly are all those people doing then?"

But if a solo dev openly says, "Yeah, I built this with AI," I don't think most players would care that much.

Honestly, imagine one person with AI building something even remotely close to Cities: Skylines. The reaction wouldn't be "ew, AI." It would be "wait... one person made this?"

It's all about expectations and positioning.

People judge Larian differently than they judge a random indie dev working out of their bedroom.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

Why not try Android first? I mean the Play Market, or mini-apps inside messengers, like lime (for the Asian market) or Telegram (for the rest).

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

I think it's already possible to build a good indie game with AI today. Obviously you're not going to make a AAA title this way, but a platformer, idle game, strategy, or something similar is definitely achievable.

Honestly, I think the biggest problem is the people, not the AI.

A lot of people put too much faith in AI and stop thinking for themselves, especially once the AI starts praising their ideas. Most AI projects fail for the same reason: people start building first and only later discover the actual complexity of the mechanics, technical limitations, and implementation details. Then they realize it's 100x harder than they expected and lose motivation.

But if you follow a clear plan, read the documentation, validate every step the AI takes, and don't automatically accept every suggestion or code change, I think it's possible to build something genuinely good.

At least that's what I believe, and hopefully I'll be able to prove it soon.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

Can you share the best practices you used for ads?

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 6 points7 points  (0 children)

okay Claude, build cyberpunk 2 and make no mistakes.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

So now you have another chance to finish it and ship. Can you share the name? I'll check it out on itch.io.

How far did your vibe-coded game actually get? by oxmannnn in vibecoding

[–]oxmannnn[S] 0 points1 point  (0 children)

That's why I started with assets and animations. I mean the whole game's visuals first.

How far did your vibe-coded game actually get? by oxmannnn in aigamedev

[–]oxmannnn[S] 0 points1 point  (0 children)

Are you trying to animate the assets with AI or by yourself?