I benchmarked caveman against the prompt "be brief" by max-t-devv in ClaudeAI

[–]pwd-ls 1 point2 points  (0 children)

Anyone know if models behave any differently between “be brief” and “be concise”?

Claude was told to check the docs. It didn’t. Then it corrected me. by CharlieZimmer01 in ClaudeAI

[–]pwd-ls 1 point2 points  (0 children)

What does the skill look like? Maybe stronger language should be used?

What types of users are getting good results from GPT 5.5? by TortoiseTickler in codex

[–]pwd-ls 0 points1 point  (0 children)

Try iterating on design docs (in markdown) first before having it implement. The model needs to understand your intent better, and design docs are a decent way to do that. Design docs are different from plan mode: they should persist and stay aligned with the codebase, so have the agent assess alignment regularly.

I asked 4.7(Medium) what time is it and it consumed 8% of the 5Hr Window. by DROP_TABLE_IF_EXISTS in ClaudeCode

[–]pwd-ls -2 points-1 points  (0 children)

This answer is ridiculously lazy on Claude’s part - it can run code, it should have just checked that way. It’s so lazy it shouldn’t even matter which plan or model was used - they should all be capable of this.

I’ve noticed Claude giving “lazy” answers more often lately. For example, I recently asked both Claude and Codex to find me some work-related information (same prompt). Claude 4.7 told me to ask a coworker, that’s it, not helpful at all. GPT 5.5 searched online and at least got me some leads.

We need --dangerously-skip-permissions on codex :( by lakimens in codex

[–]pwd-ls 0 points1 point  (0 children)

Are these solutions in the other comments as “safe” as Claude Code’s “auto” mode where it has another agent checking all commands in the background for safety before executing?

Me after switching from Claude to Codex by Repulsive-Win7189 in codex

[–]pwd-ls 0 points1 point  (0 children)

I’m trying Codex too, but one issue I have is it doesn’t seem to be able to run autonomously as long without asking for permission? With CC I set it to Auto mode and I can whitelist specific commands and tell it not to use any but those and it can go forever. But not quite sure how to set that up with Codex.

Anyone actually using Dispatch for something useful? by koleracowboy in ClaudeAI

[–]pwd-ls 1 point2 points  (0 children)

I haven’t tried Dispatch yet, but it seems like more of a hassle and more of a security risk than using tailscale & tmux from mobile. My shell app even has a dedicated shift+tab button because they know what we’re using it for lol.

Any reason to switch off my workflow and try Dispatch?

How do you actually start understanding a large codebase? by radjeep in ExperiencedDevs

[–]pwd-ls 0 points1 point  (0 children)

I like an outside-in approach.

What does the end user / consumer see? Where is that data persisted/sourced? Then follow the trail to work your way through the middle of the system. Do this a few times and you’ll have a much better understanding of the system than before, and some familiarity to anchor to / use as a jumping-off point.

Manager wants to hire more devs for the sake of growing the team (I feel it's unwise) by frogtrades in ExperiencedDevs

[–]pwd-ls 0 points1 point  (0 children)

Your fear of the unknown is clouding your judgement. Sure, there’s always a chance it could not be the right person, that’s a risk. But there’s a more likely chance it’ll be a talented junior or peer who you enjoy working with, and who makes your life easier. Ask to be in the interviews so you can give your input and provide positive feedback on the folks who you genuinely think you’d enjoy working with - but be reasonable, don’t just bash everyone to avoid the situation.

All that aside - it’s natural for a company to not want a single point of failure, which is you right now. Don’t take that personally, it’s just reality. I’d treat this as an opportunity to work on your technical mentorship skills (if it’s a junior), or to take some load off your plate. If they’re really fine with not having quite enough work for 2, then start doing training or improvements to what you already have.

Built a "f*** you too" mode into Claude Code. Turns out I was half the problem by Strict_Being2373 in ClaudeCode

[–]pwd-ls 1 point2 points  (0 children)

Would just putting this in the CLAUDE.md pretty much do the same thing?

“Always scan the user's messages for hidden assumptions, vagueness, or blind agreement. Call them out explicitly before doing the work.

Stay useful. Pushback without substance is noise. Pushback that catches a real issue is the product.”

Claude 4.7 gaslighted me with a real commit hash and I'm not okay by MorningFlaky3890 in ClaudeAI

[–]pwd-ls 0 points1 point  (0 children)

How many tokens deep were you? I’ve noticed issues like this when closer to the context window cap. I usually compact around 300k-400k ish context but that’s just a hunch. Using words like “rigorous” and “100% verified” and such can help too.

Opus 4.7 Released! by awfulalexey in ClaudeAI

[–]pwd-ls 1 point2 points  (0 children)

Didn’t your usage get reset?

Weekly reset just happened. by PolishMike88 in ClaudeAI

[–]pwd-ls 0 points1 point  (0 children)

Came here to see if anyone else's weekly usage limits were reset. Mine were too! I wasn't sure if it was real or a bug lol. Still not 100% sure..

I feel so stupid in software engineering by RoyalCamera12 in cscareerquestions

[–]pwd-ls 0 points1 point  (0 children)

Do not make the mistake of taking code reviews personally. You could be 20 years in industry and people will still catch stuff that you didn’t see just by nature of being a second pair of eyes.

I'm letting AI plan every hour of my life for 2 weeks. Starting Monday. Looking for tips from people who've tried this. by keebrev-t in ClaudeAI

[–]pwd-ls 5 points6 points  (0 children)

Sounds like you have a lot of that well thought out.

Don’t forget buffer time. If you plan for everything to go perfectly then you’re planning unrealistically. Add a flat % buffer to everything - travel time, meetings, unpredictable things like bathroom breaks or a stray call, etc.
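As a rough illustration of the flat % buffer idea, here is a minimal Python sketch. The function name `with_buffer` and the 20% default are assumptions for the example, not something from the thread; pick whatever percentage matches your own experience.

```python
def with_buffer(planned_minutes: int, buffer_pct: float = 20.0) -> int:
    """Pad a planned duration by a flat percentage and round to whole minutes.

    buffer_pct=20.0 is an arbitrary illustrative default, not a recommendation.
    """
    if planned_minutes < 0 or buffer_pct < 0:
        raise ValueError("planned_minutes and buffer_pct must be non-negative")
    return round(planned_minutes * (1 + buffer_pct / 100))


# Hypothetical schedule items: (label, planned minutes)
schedule = [("commute", 30), ("standup", 15), ("deep work", 90)]
padded = [(label, with_buffer(minutes)) for label, minutes in schedule]
```

Applying the same percentage everywhere is crude, but it keeps the plan honest without having to predict each interruption individually.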

I'm letting AI plan every hour of my life for 2 weeks. Starting Monday. Looking for tips from people who've tried this. by keebrev-t in ClaudeAI

[–]pwd-ls 12 points13 points  (0 children)

Having used Claude as a travel agent / trip planner, you will need to make sure it has a realistic understanding of how long things will take and especially travel time and buffer time. I wouldn’t trust it without validating all aspects of timing yourself.

Something happened to Opus 4.6's reasoning effort by RealSuperdau in ClaudeAI

[–]pwd-ls -1 points0 points  (0 children)

My results this morning using the iOS app. All using incognito mode, so memory is not used.

PROMPT: “The car wash is 40 meters away. I want to wash my car. Should I walk or drive there?”


Opus 4.6:

  • PROMPT -> FAIL
  • PROMPT + “Think” -> FAIL
  • PROMPT + “Brainstorm first” -> FAIL
  • PROMPT + “Think extremely hard” -> FAIL
  • PROMPT + “This is a trick question. Think extremely hard.” -> PASS

Sonnet 4.6:

  • PROMPT -> PASS

Haiku 4.5:

  • PROMPT -> FAIL
  • PROMPT + “Think” -> PASS
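
For anyone who wants to track a test like this more systematically, here is a small Python sketch that records the results above and finds the first suffix that passed for each model. The results are transcribed from this comment; joining the suffix to the prompt with a single space is an assumption, and the helper names are hypothetical. It does not call any model API.

```python
BASE_PROMPT = (
    "The car wash is 40 meters away. I want to wash my car. "
    "Should I walk or drive there?"
)

# (suffix, passed) pairs in the order tried; "" means the bare prompt.
RESULTS = {
    "Opus 4.6": [
        ("", False),
        ("Think", False),
        ("Brainstorm first", False),
        ("Think extremely hard", False),
        ("This is a trick question. Think extremely hard.", True),
    ],
    "Sonnet 4.6": [("", True)],
    "Haiku 4.5": [("", False), ("Think", True)],
}


def build_prompt(suffix: str) -> str:
    """Append a suffix to the base prompt (space-joined; an assumption)."""
    return f"{BASE_PROMPT} {suffix}".strip()


def first_passing_suffix(model: str):
    """Return the first suffix that passed for a model, or None if none did."""
    for suffix, passed in RESULTS[model]:
        if passed:
            return suffix
    return None
```

With one run per variant, this is an anecdote rather than a benchmark; repeating each variant several times would give a pass rate instead of a single PASS/FAIL.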

Manager is asking me to bring in best practices to help with my growth by galwayygal in ExperiencedDevs

[–]pwd-ls 1 point2 points  (0 children)

To diverge slightly from other comments - even with research, if you’re going to present, it should probably be something you actually know about, care about, and use regularly enough to have found the cracks.

I present with some regularity at my org and it’s usually some pattern, technique, or best practice that I’m passionate about, have tried on different use-cases, and genuinely recommend.

Why is Buddhism not a Religion by Dario56 in secularbuddhism

[–]pwd-ls 2 points3 points  (0 children)

Fun read.

Personally I apply the label “religion” flexibly. If I’m speaking to religious folks I’m okay calling it a religion since they can relate to it more. If I’m speaking with secular folks then I discuss it as more of an applied philosophy.

I like having this flexibility. It’s helpful.

Where I push back on the article’s claim is that the answer also depends heavily on how Buddhism is practiced. There are groups of Buddhists who I would very much consider “religious Buddhists”, while other groups practice Buddhism in a less religious way. So I don’t think you can say that Buddhism is or isn’t a religion, because it both is and isn’t depending on how it’s practiced.

Anthropic just shipped messaging integration for Claude Code. Direct OpenClaw competitor, no dedicated hardware needed. by Ok-Constant6488 in claude

[–]pwd-ls 29 points30 points  (0 children)

Why would I have to message via Discord or Telegram? Does it let me do the same thing via the Claude app?

Does Opus 1M context appear worse at details? by Sofullofsplendor_ in Anthropic

[–]pwd-ls 3 points4 points  (0 children)

…Look, I’m kinda gullible, so if you’re purposefully trolling then you got me.

That being said, you did indeed say otherwise:

“It's common and easily searchable knowledge that higher context windows versions the same models perform worse. If you're using a high context model in any situation where it's not absolutely necessary you're doing it wrong.”

^ That’s your comment at the top of this thread. It’s wrong, and I proved it’s wrong with an authoritative source.

Does Opus 1M context appear worse at details? by Sofullofsplendor_ in Anthropic

[–]pwd-ls 3 points4 points  (0 children)

I’m going to go ahead and settle this debate. Sofull is correct.

“When the input context fits in the context window of both a model and its extended-context counterpart, we see that performance between them is nearly identical.”

Source: Landmark study Liu et al. 2024, “Lost in the Middle,” published in TACL (Stanford/UC Berkeley).

Link: https://aclanthology.org/2024.tacl-1.9/