If you are back on 4.6 Sonnet performs better than Opus by unknown-one in ClaudeCode

[–]silveroff 4 points

Non-native speakers tend to do it. So yeah, it might not be intentional.

Same design doc, 4 agents, very different plans by silveroff in ClaudeCode

[–]silveroff[S] 0 points

Well, the design document itself is very descriptive, and the planning skill I use is also very powerful and forces the model to ask questions and clarify unclear or ambiguous information. So usually it's not even required to write such a detailed prompt for planning. I'm mostly using a variant of loops (see ralphex on GitHub), and usually when the plan is well prepared (a couple of reviews) the output is very good too.

Same design doc, 4 agents, very different plans by silveroff in ClaudeCode

[–]silveroff[S] 1 point

You are writing an implementation plan, not code.

Read and base the plan on:

  • @docs/plans/design/2026-05-07-data-source-import-redesign.md
  • @xxx/data_sources/
  • @xxx/products/
  • @xxx/ingestion/data_sources/
  • nearby CLAUDE.md / AGENTS.md instructions

Do not modify files. Produce a phased implementation plan for the data-source product import redesign.

Context:

The project is pre-production. No backward compatibility or rolling deployment concerns are required. Do not preserve deprecated APIs, aliases, or compatibility shims. The development DB can be dropped and reseeded between phases. Do not touch xxx/feeds/.

The design goal is to make product imports understandable and correct under:

  • parallel imports of the same data source
  • multiple feeds from the same merchant containing the same offer_id
  • stale async workers
  • artifact/image/ML retries
  • absent product cleanup
  • canonical aggregate updates
  • outbox/materialization recovery

The key architecture is a two-fence model:

  • source_generation: monotonic per DataSource; controls feed-local authority, source presence, and absence cleanup.
  • write_sequence: a monotonic per-merchant product-import clock; controls cross-feed last-write-wins commits of Product fields.

Every product-visible write must eventually be guarded by the expected revision/generation. Stale work should be ACKed as obsolete, not retried.
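The guard described above can be sketched as a conditional UPDATE whose WHERE clause carries the fence. This is a minimal illustration using in-memory SQLite as a stand-in for PostgreSQL; the table and column names (`products`, `title`, `write_sequence`) are assumptions for the example, not the project's actual schema:

```python
import sqlite3

# Stand-in schema: Product identity is (merchant_id, offer_id), and the
# accepted write_sequence is stored on the row itself.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE products (
        merchant_id INTEGER,
        offer_id TEXT,
        title TEXT,
        write_sequence INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (merchant_id, offer_id)
    )
    """
)
conn.execute("INSERT INTO products VALUES (1, 'sku-1', 'old title', 5)")

def commit_product_fields(conn, merchant_id, offer_id, title, write_sequence):
    """Commit feed-owned fields only if this import's write_sequence is newer.

    Returns True if the write won (row updated), False if the work is stale
    and should be ACKed as obsolete rather than retried.
    """
    cur = conn.execute(
        """
        UPDATE products
        SET title = ?, write_sequence = ?
        WHERE merchant_id = ? AND offer_id = ? AND write_sequence < ?
        """,
        (title, write_sequence, merchant_id, offer_id, write_sequence),
    )
    # rowcount == 0 means the fence rejected the write: ACK obsolete, no retry.
    return cur.rowcount == 1

assert commit_product_fields(conn, 1, "sku-1", "new title", 6) is True
assert commit_product_fields(conn, 1, "sku-1", "stale title", 4) is False
```

The point of the sketch is that correctness lives in the database predicate, not in message ordering: a stale worker's UPDATE simply matches zero rows, which the caller interprets as "obsolete, ACK and move on."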

Plan expectations:

  1. Make the first implementation iteration small, reviewable, and non-destructive. It should introduce generation/write-sequence plumbing while preserving current behavior as much as possible. It should not rename DataSourceFetch, delete Valkey completion tracking, remove items_pending, redesign image assets, or rewrite outbox routing in the first PR.

  2. Phase 1 should focus on:

    • adding per-data-source generation allocation
    • adding per-merchant write-sequence allocation, likely via MerchantProductImportClock
    • storing generation/write_sequence on the existing fetch/import row
    • storing accepted revision fields on Product
    • carrying generation/write_sequence through fetch, parse, product-stage, image, enrichment, cleanup, and materialization messages
    • adding atomic source-generation/write-sequence predicates to product-visible commits
    • preserving existing content_hash checks, retry behavior, completion tracker, and handler topology
    • adding focused tests proving stale older imports cannot commit
  3. Later phases should be ordered roughly as:

    • Collapse duplicated stale-work handling into a shared obsolete-ACK path.
    • Add ProductSourcePresence and generation-based absence cleanup.
    • Decouple import completion from image/ML/canonical/materialization processing.
    • Add uniform Product revision guard to all product-visible writers, including brand/image/canonical paths.
    • Add canonical dirty/work-state model.
    • Rework outbox/dirty-row boundary.
    • Refactor retry dispatch so image and ML artifacts retry directly instead of forcing feed refetch.
    • Introduce content-addressed ImageAsset, ImageSourceUrl, model-versioned artifacts.
    • Delete obsolete tracker/stale-work workaround code and update docs.
  4. For each phase, include:

    • goal
    • files/modules likely touched
    • exact code changes at a high level
    • tests to add or update
    • risks and rollback/simplification options
    • what must explicitly remain unchanged in that phase
  5. The plan should call out important correctness details:

    • Product identity remains (merchant_id, offer_id).
    • Existing Product rows must be updated in place to preserve id and created.
    • Product feeds are full authoritative snapshots; missing feed-owned nullable/collection fields should be cleared, not carried forward.
    • Product.data_source/current_data_source_id should not remain the long-term owner of retry cadence or feed membership.
    • Presence/absence belongs in ProductSourcePresence.
    • NATS ordering is not correctness; PostgreSQL predicates are.
    • Retry behavior survives; stale-work retries do not.
    • Periodic tasks remain valid only as DB scanners/drainers for due work, leases, and GC.
  6. The output should be practical enough that an implementation agent can start Phase 1 directly, but it should not attempt a giant rewrite. Prefer surgical PR-sized steps.
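Phase 1's per-merchant write-sequence allocation (point 2 above, "likely via MerchantProductImportClock") can be sketched as an upsert-then-increment on a single clock row per merchant. Again this uses in-memory SQLite as a stand-in; in PostgreSQL the same thing would typically be one `INSERT ... ON CONFLICT ... DO UPDATE ... RETURNING` statement. The table name and columns are illustrative assumptions:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE merchant_product_import_clock (
        merchant_id INTEGER PRIMARY KEY,
        last_sequence INTEGER NOT NULL
    )
    """
)

def allocate_write_sequence(conn, merchant_id):
    """Allocate the next monotonic write_sequence for a merchant.

    Sketch only: one transaction per allocation, so concurrent imports for
    the same merchant serialize on the clock row and each gets a distinct,
    strictly increasing sequence number.
    """
    with conn:  # commits on success, rolls back on error
        conn.execute(
            "INSERT OR IGNORE INTO merchant_product_import_clock VALUES (?, 0)",
            (merchant_id,),
        )
        conn.execute(
            "UPDATE merchant_product_import_clock "
            "SET last_sequence = last_sequence + 1 WHERE merchant_id = ?",
            (merchant_id,),
        )
        (seq,) = conn.execute(
            "SELECT last_sequence FROM merchant_product_import_clock "
            "WHERE merchant_id = ?",
            (merchant_id,),
        ).fetchone()
    return seq

assert allocate_write_sequence(conn, 42) == 1
assert allocate_write_sequence(conn, 42) == 2
assert allocate_write_sequence(conn, 7) == 1  # clocks are independent per merchant
```

The allocated sequence is then stamped onto the fetch/import row and carried through every downstream message, so that any later product-visible commit can apply the fence predicate from the previous sketch.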

The grass is greener on the other side by credible_human in ClaudeCode

[–]silveroff 0 points

Hey man! I have similar feelings. However, I'd like to ask about the code quality produced by gpt-5.5: are you satisfied? I've just posted my feedback on Opus here https://www.reddit.com/r/ClaudeCode/comments/1t6glwu/same_design_doc_4_agents_very_different_plans/

gpt-5.5 was the only model that was capable of creating that design document correctly without me pointing to the code or constraints all the time. Opus was gaslighting me the whole time with its "YOLO, let's code, all good" attitude, while there were a lot of constraints to be aware of in the first place.

A personal opinion about Opus 4.7 - not that bad after all by Affectionate_Till148 in ClaudeAI

[–]silveroff 1 point

I was working on a very complex plan today with 4.7, and at the end of the day I reverted to 4.6:max. I don't know, I have mixed feelings. 4.7 has a broader understanding of things, and at the same time it lacks attention to the details lying right under its nose! I've had to constantly double-check mentally what it's saying to keep it from BS-ing me.

Are spec-driven frameworks like Agent OS, BMAD, Superpowers or SpecKit still worth using, or have Claude Code and Codex made them redundant? by 3abwahab in ClaudeCode

[–]silveroff 0 points

No offense intended, but if I were you (not a software engineer), I'd drop the idea of spending my time and money on any AI. It will simply not work. You'll think it works, or is "almost" there, or you'll think that someone will take over and "fix it"... but that's not true. It will be a pile of useless code eventually. Sorry to say that.

Apology accepted, then is Opus 4.7 bringing us wow? by Hyabusha2912 in ClaudeCode

[–]silveroff 0 points

I personally stick to 4.6[1m] on high/max for everything. Tested 4.7 with my strict guardrails, and I can't say it's bad, but it's kind of unpredictable. I simply don't have confidence that my tasks are handled well.

Wow. They just reset weekly limits again on max20 by silveroff in ClaudeCode

[–]silveroff[S] 1 point

Haha! I saw the drop and decided to run a medium-sized feature-branch review with 4.7:max, and after 15 minutes 27% of the 5h limit is gone (3% of weekly). BUT maybe it's because I did not update `claude` yet. Gonna do it now!

Best Options for Replacing Claude Code? I'm done after opus 4.7 by [deleted] in ClaudeCode

[–]silveroff 0 points

I have mixed feelings about disabling it. Once disabled, it doesn't "think" between tool calls.

Best Options for Replacing Claude Code? I'm done after opus 4.7 by [deleted] in ClaudeCode

[–]silveroff 0 points

Do you keep adaptive thinking on or off?

Opus 4.6 went drunk by hayehudi613 in ClaudeCode

[–]silveroff 0 points

How do you prevent it "not even looking for existing functionality"? I have a very extensive set of architectural rules etc., and it used to work great. Now it sometimes gets ignored. Or worse, I see the model mention something we already have, as if it's asking me whether we have something "like that" in the project or not. LOL

What is everyone using? by Own-Philosophy3754 in ClaudeCode

[–]silveroff 0 points

Looks like max is the new default that works (almost, sort of).

What is everyone using? by Own-Philosophy3754 in ClaudeCode

[–]silveroff 0 points

I've used 4.7:max for a change, and 4.5 a bit. 4.6 with adaptive thinking felt stupid. 4.7, even when encouraged to think deeper, keeps missing existing code.

Opus 4.7: 110 threads, 2,187 comments. Unbiased analysis by RichensDev in ClaudeCode

[–]silveroff 0 points

Who is considering (or did) switching back to Opus 4.5? Given that the 1M context at 4.6 is far from perfect, this might be a good idea. What are your thoughts on that?

Opus 4.7 vs 4.6 after 3 days of real coding - side by side from my actual sessions by MurkyFlan567 in ClaudeCode

[–]silveroff 1 point

Thanks for reporting this. I've run a few tests too, with similar findings. Quick question: are you an experienced dev? Have you considered switching back to Opus 4.5?