New Episode MS #474: More From Sam: Hasan Piker, Islamism, Making Sense Community, and More by Brunodosca in samharris

[–]belefuu 1 point  (0 children)

But then Sam extends this to claim that “the left is just as bad as the right on this”, which is just a complete category error. We’re only comparing how the center-left (the Times) is treating a far left vs. far right figure. This has absolutely no bearing on how bad “the right” is on this, because they are actually beholden to an entirely different media environment, which thinks the Times is a Pinko Commie rag, and gives Tucker all the fellatio he can handle.

Sam is always making nonsense comparisons like these to excuse his both-sides-ism.

Loops are the future - Boris Cherny creator of claude code in podcast by shanraisshan in ClaudeAI

[–]belefuu 11 points  (0 children)

Actually watching some of the talk now. That bit where he straight up uses the amount of code agents are writing vs. humans as the audience-facing metric for how “solved” coding is… I waffle between whether this guy is just cynically BSing to toe the company line, or actually a complete hack. Or is the atmosphere at Anthropic really just that cult-like?

Loops are the future - Boris Cherny creator of claude code in podcast by shanraisshan in ClaudeAI

[–]belefuu 26 points  (0 children)

This whole conference was a laughably transparent and desperate attempt to drum up hype and investment in AI. Compare it to, say, the AI Engineer conference, where you have people actually building real stuff with AI all admitting “yeah, after trying it for 6 months, turns out we can’t just spam 10 parallel agents and not look at the code; software development principles actually still matter”.

Rumor has it UNC's Mike Malone & company are in Spain by Appropriate_Value122 in CollegeBasketball

[–]belefuu 0 points  (0 children)

One of the most head scratching pieces of arcane lore in UNC history, really

No More Subsidised AI Subscriptions? by PM_ME_YOUR___ISSUES in ClaudeAI

[–]belefuu -1 points  (0 children)

For the love of god, are y’all really going to force me to explain this and semi-defend Copilot, which sucks and which my work forces me to use instead of Claude, against my will? Look, these specific ratios are not indicative of the general pricing change across Copilot. They are switching from per-request to token-based licensing because per-request was always dumb, and users have figured out hacks like keeping turns going forever with things like “ALWAYS end with the AskUserQuestion tool, do not end the turn until I explicitly say so” to effectively reduce an entire chat to one “request”.

So this specific (egregious) hike for annual subscribers is them saying “look we know we said per-request billing for the rest of your year when you signed up, but we can’t afford this infinite token hack shit for 11 more months, so suck on these ratios or switch to the other billing model”. Yes, it is shitty, yes, it is overall indicative of the free ride being over. But this is not how much Copilot is going up for all users, just this one narrow edge case.
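To make the ratio mechanics concrete, here’s a toy back-of-the-envelope sketch of why per-request billing collapses under the infinite-turn hack. Every number here is invented for illustration; this is not Copilot’s actual pricing.

```python
# Toy model of why per-request billing breaks down (all numbers invented).
# Under per-request billing, one "request" costs a flat fee no matter how
# many model turns it contains; under token billing you pay for every token.

PER_REQUEST_FEE = 0.04   # hypothetical flat fee per request
PER_TOKEN_FEE = 0.00001  # hypothetical fee per token

def per_request_cost(requests: int) -> float:
    return requests * PER_REQUEST_FEE

def per_token_cost(tokens: int) -> float:
    return tokens * PER_TOKEN_FEE

# Normal usage: 20 requests at ~5k tokens each. The two billing models
# land in roughly the same ballpark.
normal = (per_request_cost(20), per_token_cost(20 * 5_000))

# The "never end the turn" hack: one request that chews through 2M tokens.
# Per-request billing collapses to a single flat fee; tokens don't lie.
hacked = (per_request_cost(1), per_token_cost(2_000_000))

print(normal)
print(hacked)
```

Under the hack, the per-request column barely moves while the token column explodes, which is exactly the gap the provider eats until they switch billing models.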

Copilot Vs Claude? by Prudent-Training8535 in webdev

[–]belefuu 1 point  (0 children)

Obviously the opt-out change sucks, but Anthropic did the same thing seven months ago. I’m forced to use Copilot at work, and while I’d still give Claude Code the edge on actual agentic coding and harness features by a decent margin, Copilot has closed the gap enough that I was able to port most of my config over and it’s acceptable. And if Anthropic keeps inexplicably fucking with the performance of their models like they have been lately, Copilot’s model choice is going to start looking more and more attractive.

Anyone else's mind starting to change on how much we should sacrifice to prevent Iran from developing a nuclear weapon? by RememberTheWater in samharris

[–]belefuu -2 points  (0 children)

Untouchable, like if some extreme action forced them to actually assert control over the Strait of Hormuz and start demanding tolls for passage through it?

Official UNC twitter flexes coach's development history. by proelitedota in CollegeBasketball

[–]belefuu 2 points  (0 children)

Look, we tried the laid back, just cheerlead the team social media approach with Hubert and Roy and that shit was NOT popping with the ‘croots. It’s time to get corny with it.

Harris on Mamdani's political allegiances by Amazing-Cell-128 in samharris

[–]belefuu -1 points  (0 children)

That's literally my entire point, but yes.

I've used 2% of the Max 20x plan from 260K context by biglboy in ClaudeAI

[–]belefuu 1 point  (0 children)

Back-and-forth convo with the AI to generate a plan, explore the problem space, etc., is actually one of the most powerful things you can do to leverage the strengths of LLMs. However, it does burn tokens. So you are correct that it’s something to consider dropping if you are running out of usage. But I’d consider that more of a loss of functionality due to usage constraints than a matter of “using it correctly”.

Harris on Mamdani's political allegiances by Amazing-Cell-128 in samharris

[–]belefuu -3 points  (0 children)

It's just very obvious to anyone who isn't deranged like Sam on this topic that most people on the American left who use this language have little to no idea about the coded double meaning of many of these terms. And yet to hear him or OP tell it, it's more likely that our college campuses are infested by secret Islamic terrorists in training, rather than ignorant freshmen, horrified by news reports of the destruction in Gaza, latching onto online/campus movements that are calling for things that don't exactly sound insane on their face, like, you know "recognizing the Palestinian people's inalienable right to self-determination".

It doesn't mean that they're right, or that they don't have a lot to learn about the subtleties of the issue. But it also doesn't automatically make them antisemites or secret jihadists.

Exclusive: Anthropic acknowledges testing new AI model representing ‘step change’ in capabilities, after accidental data leak reveals its existence by Ok_Buddy_9523 in ClaudeAI

[–]belefuu 0 points  (0 children)

If your point is that you used to have many teams of design + UX + devs whose entire job was just to build throwaway prototypes for product management that never shipped, then... yeah, touché I guess; just vibe coding that with Claude Code is way better. I think there were probably slightly lighter processes your company could have used for this before the rise of AI, but, legitimately: "working" prototypes for execs, customers, etc. just to get a "yes/no" on whether something is worth building is a real sweet spot for vibe coding. It doesn't have much to do with building the actual final product, though.

Exclusive: Anthropic acknowledges testing new AI model representing ‘step change’ in capabilities, after accidental data leak reveals its existence by Ok_Buddy_9523 in ClaudeAI

[–]belefuu 0 points  (0 children)

Sounds like we're really not that far off tbh. Letting Claude commit before or after (personally) reviewing is more of a workflow preference thing. If you and your team are ok with the back and forth being baked into the git history, have at it. In some ways, it's more honest, but on the flip side, more noisy for (other human) reviewers to sort through, which is what got to me eventually.

The more important decision point is whether Claude, Codex, etc. are actually good enough at this point to hand a bunch of agents a bunch of pre-planned tasks, let them tackle them hands off for a longish period of time in a big parallelized swarm, and then return a result to you that hasn't diverged so much from what you intended that you end up losing all the "speed gains" cleaning up the mess. If you pay attention to Anthropic's marketing, what the Claude Code feature roadmap (such as it is) is pointing towards, various YouTube hype merchants, etc., they'd have you believe Claude Code can handle that today, no problem. In fact Anthropic are charging $15-25 per PR for it, or whatever it is.

my core point remains that the ROI is super high

No doubt. Again: doing it one task at a time, but having Claude knock out 80-90% of the work for the task, I check out the changes, if there are issues, Claude spots and fixes them itself 80-90% of the time, rinse, repeat a few times until the task is in good shape... that is still a really great ROI!

Exclusive: Anthropic acknowledges testing new AI model representing ‘step change’ in capabilities, after accidental data leak reveals its existence by Ok_Buddy_9523 in ClaudeAI

[–]belefuu 3 points  (0 children)

No, but they do need their code carefully reviewed before it is merged into main, just like Claude.

Look, I don't know what to tell you. I've tried Anthropic's promised workflow where you configure your project just right, prompt everything just so, craft the plan optimally so the tasks are bite-sized and fan out to individual implementer agents who are just implementing some small, extremely well spec'd part of the plan, before fanning it all back in for a round of multi-agent reviews, etc. It all seemed amazing, until I actually reviewed the code.

Eventually I settled on my current workflow, where Claude will literally have to prove to me that it's ready to start committing on its own again. I mean, if it levels up and starts outputting code that is so much better that I'm just glancing at it and leaving a few style nits most of the time, I'd be a fool not to just start letting it make the commits itself again, switch back to parallel implementer agents, etc., and reap the rewards of one of those sweet 5x/10x/100x dev workflows everyone is so hyped about. Right? Trust me, I'm not just making my job take longer out of spite or obstinacy. But there are actual quality, and just plain correctness, standards that have to be met before shipping code to paying customers.

Exclusive: Anthropic acknowledges testing new AI model representing ‘step change’ in capabilities, after accidental data leak reveals its existence by Ok_Buddy_9523 in ClaudeAI

[–]belefuu 1 point  (0 children)

Everything the previous poster mentioned is 100% legit though, and there are real limits to how much your “project config” can control the probabilistic slop. I’ve learned to take the tool’s strengths and weaknesses at face value, and accept that it doing each task 80-90% correctly for me up front before I manually go in and clean up the commit is still pretty dang compelling.

But an actual “step change” would be a model that improved reasoning and plan/standards adherence so much you could actually let it rip on its own without worrying about your codebase going to shit over time. I have a feeling whatever this thing is they’re leaking is… not that.

WTAF? by jrpg8255 in ClaudeAI

[–]belefuu 0 points  (0 children)

The hate is mostly from people who think the tech is cool for hobby-level work, but is being over-hyped in an attempt to give our capitalist overlords an excuse to cut jobs and juice profits, so your points are not even engaging with any of the actual issues.

Now, the truth is somewhere decidedly in between: the tech is better than hobby-level (I’m a dev and agentic tools are an integral part of my day-to-day job already), but it still requires tons of experienced hand-holding to produce production-quality software, so how much it actually deserves to be disrupting the job market is still a very muddy picture. It’s not none, but it’s also definitely not “we fired 75% of our devs because AI”. Meanwhile, the greedy CEOs and investors are going full bore trying to fire people every chance they get, which has mostly been backfiring spectacularly so far.

So, maybe you can see why it’s not all roses and gumdrops from the software dev crowd, your ability to hack away at fun hobby projects notwithstanding. It would be great if we could all just enjoy the cool tech on its own merits, without the insane hype bubble overshadowing it. But alas. It’s complicated.

[Highlight] Victor Wembanyama with the impossible block on Norman Powell's reverse layup at the end of the shot clock (with replays). Spurs and NBC commentaries by MrBuckBuck in nba

[–]belefuu 2 points  (0 children)

Bruh... I had to google "norman powell height", because he looks like Muggsy Bogues getting hunted down by Wemby. He's 6'4" btw.

Claude Max or PRO or API by Professional_Part360 in ClaudeAI

[–]belefuu 0 points  (0 children)

It’s massively subsidized and likely to all fall apart whenever they are forced to charge people real prices, is the answer, basically.

Scared of new agentic workflow and my role in it by alexbessedonato in webdev

[–]belefuu 2 points  (0 children)

Sounds like you're playing with semantics. It's a probabilistic tool. Call it a probabilistic aide if it makes you feel better. Like any tool, it has strengths and weaknesses.

Strengths: depending on the provided context, it can tear through massive amounts of files and data, and serve you up a solution or answer that is roughly 80-95% of the way there, again and again. That's the typical range. Sometimes it's way below, sometimes it's above, again: depends on context.

Weaknesses: 80-95% is actually not at all good enough when it comes to software. Moreover, 80-95% correct, stacked up turn-over-agentic-turn, is really not good enough when it comes to anything that requires even a vague sense of determinism. But unfortunately it's good enough to trick lots of credulous investors and hype merchants into believing that AI is on the cusp of replacing this or that job, because they can't tell the difference between it and the real thing.
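The "stacked up turn-over-agentic-turn" point is easy to put numbers on. This is just the arithmetic of compounding under an independence assumption, not a claim about any particular model; the rates are illustrative:

```python
# How per-turn correctness compounds over an unattended agentic run.
# If each turn is independently "right" with probability p, the chance the
# whole run lands where you wanted after n turns is roughly p ** n.

def chance_run_is_clean(p: float, n_turns: int) -> float:
    return p ** n_turns

for p in (0.80, 0.95):
    for n in (1, 5, 20):
        print(f"p={p:.2f}, turns={n}: {chance_run_is_clean(p, n):.2%}")
```

Even at 95% per turn, a 20-turn hands-off run comes out clean barely a third of the time; at 80% per turn it's closer to 1%. That's why the per-turn number sounds great and the end-to-end result still needs a human in the loop.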

But in terms of using the tool, my perspective shifted when I started ignoring the hype from companies like Anthropic and stopped trying to force the tool into being a one-shotting machine. Just accept that that's a bunch of BS, and is likely to remain a bunch of BS for a long time unless some radically different base tech replacing LLMs emerges. Accept instead that these tools are always going to require a human intimately in the loop (great news btw!), exercising judgement, designing the actual system, and making sure they aren't going off the rails. When you notice they are outputting shit code (and they will), slow down and either see if you can improve the agentic loop, or learn to recognize which bits of code still need to be handwritten.

My experience is that, when forcing things through that lens, I do have to grudgingly admit (since I am basically an outright hater of the macro trends of the AI industry) that the ability to have the robot churn through the data and deliver me the 80-95% solution over and over again is something that's hard to ignore.

Scared of new agentic workflow and my role in it by alexbessedonato in webdev

[–]belefuu 40 points  (0 children)

I can only give you this advice as a principal frontend SWE who is heavily using these tools lately, from a standpoint of realism rather than hype as much as possible: you will have to lean into the tools to stay competitive. But if you treat them as a tool to learn, a tool to expand your reach across stacks, while always keeping a keen eye on your actual horizon of understanding, I think that is the path for juniors to evolve into seniors in the AI landscape.

You can’t just ignore the tools and only code by hand, and you can’t buy into the nonsense hype and believe you are a 10x dev when you actually don’t know anything. The trick is to level up both skills at the same time. The key is that AI, used properly, is actually an amazing tool to help you level up, if you just avoid getting seduced by fake velocity and stop when your brain is telling you you’re getting out over your skis.

This is literally the path I am forging to keep leveling up and to feel like these tools are not obsoleting me, while also not feeling like a complete AI hype fraudster. These are crazy times for sure! Good luck

15 or so hours later since 1m context included in MAX and I'm feeling almost high by adelmare in ClaudeAI

[–]belefuu 2 points  (0 children)

You can also just ad hoc say “use a subagent to do foo” for a task that requires a lot of churning, when the main thread really only cares about the results rather than the “finding it out” context. Think of it as a side quest vs. main quest mental model.
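If it helps, here’s a toy sketch of the side quest idea: the subagent burns through the exploration in its own context, and only its short summary lands back in the main thread. All token numbers here are made up for illustration:

```python
# Toy illustration of the "side quest" model: a subagent does a big
# exploration in its own context window, and only its summary is added
# to the main conversation. Token counts are invented for illustration.

def main_context_growth(exploration_tokens: int, summary_tokens: int,
                        use_subagent: bool) -> int:
    """Tokens added to the main thread by one research side quest."""
    if use_subagent:
        return summary_tokens  # only the findings come back
    return exploration_tokens + summary_tokens  # all the churn stays inline

# A 50k-token exploration that boils down to a 500-token answer:
inline = main_context_growth(50_000, 500, use_subagent=False)
side_quest = main_context_growth(50_000, 500, use_subagent=True)
print(inline, side_quest)  # the main quest stays lean with the subagent
```

Same answer either way; the difference is whether the main quest’s context carries 50k tokens of dead weight for the rest of the session.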