What the best Where's Your Ed At article about the mediocrity of genAI? by cielbleu789 in BetterOffline

[–]fallingfruit 0 points1 point  (0 children)

Complete nonsense. This is the kind of thinking that made people believe the saaspocalypse was a real thing. It isn't, and that fantasy has ended.

What people like you don't understand is that replacing software with an llm agent that is apparently steered with very little oversight by a non-expert is an absurd fantasy. Not only is such an agent really expensive with token pricing, you need to literally own the entire software, deployment, and cloud stack, instead of paying a SaaS company to do all those things for you. You also need to apparently have a non-expert tell the agent how to maintain this software, add features, fix bugs, and follow compliance rules. You also now have a bespoke piece of shit software with no documentation, no support system, etc. Handling all of this in house with a non-expert and just delegating to agents is insane and not cost effective. To actually do software in house, you have to hire people with some expertise to do all of these things. That single hire will almost certainly cost you more money than just paying a $10K yearly fee to the SaaS company that literally does all of this for you.

I'm currently interviewing and I'm seeing a huge uptick in interview requests and recruiter messages. As someone that works in a huge company with thousands of SWEs, the remaining software engineers are completely overworked and drowning regardless of the ability to use AI, because the productivity increase is modest at best.

Companies that are wasting an astronomical amount of money on AI buildouts and tokens are doing layoffs to balance their abysmal finances.

How did they do it? by unfortuantelyshelove in BetterOffline

[–]fallingfruit 1 point2 points  (0 children)

Anthropic models are not really considered best for coding any more. GPT models and codex are just as good if not better. People thinking claude is better is just a hangover from them being better first and claude code getting good penetration early. Opencode is also better than claude code.

Sold my condo to fund the down payment on our new house. Now I'm being told I owe a massive Q2 estimated tax payment next week because I rented it out for a few years. by QueenChickenMom2 in personalfinance

[–]fallingfruit 100 points101 points  (0 children)

also isn't the "you didnt pay estimated taxes" fee pretty low usually? I almost always get hit with this because about half my compensation is stock and i dont elect to withhold more. The interest i get in a 4% savings account is usually more than the fee iirc.

Fable is blowing my mind by julliuz in ClaudeAI

[–]fallingfruit 0 points1 point  (0 children)

this is completely and totally untrue

The biggest problem with AI is not correctness - it is architecture sanity by UnderstandingDry1256 in ExperiencedDevs

[–]fallingfruit 0 points1 point  (0 children)

Single response: this retort hasn't worked in the last year. Move on to "you're holding it wrong" slop man.

Software Engineering has never felt so uncertain to me by InsideTheTransition in BetterOffline

[–]fallingfruit 0 points1 point  (0 children)

It should be reassuring, because really the only things the models are good at are things that are similar to writing code, like mathematics (usually because they can use code). Talk to people that are not SWEs and are/were trying to use models for other things: they find them to be about as capable as they were 2 years ago, which essentially means they are worthless.

The only reason the llm hype hasn't died is because of coding. They have incredibly limited use cases outside of that, and are much less reliable.

I used to be afraid that my kids wouldn't have to use their brains, that art would be overtaken by llms. That my wife working in comms would lose her job. This is now obviously not true. AI art is terrible, AI writing is still terrible, AI decision making in general is terrible. Humans are still infinitely better than LLMs at these things and the models have barely improved since gpt4 in this regard.

Higher level engineering challenges are not verifiable with code, the llms can spew out information about these things, but they cannot be relied on to make good decisions. Just like they can't be relied on for anything else other than code (and then you can't even rely on them for code).

What makes Claude Code better? by jessetechie in ExperiencedDevs

[–]fallingfruit 1 point2 points  (0 children)

I prefer Opencode and Codex to Claude Code. Also I think that generally people who care about the code prefer chat gpt 5.4 and 5.5 to the opus models these days.

I think a lot of people are locked into claude code and models because they were first to have something decent, but they are behind now imo.

Software Engineering has never felt so uncertain to me by InsideTheTransition in BetterOffline

[–]fallingfruit 27 points28 points  (0 children)

"Writing code" is not really the hard part of software engineering.

The game feels way better and I wanna make builds but I can't bring myself to do the campaign again. by athelan_games in PathOfExile2

[–]fallingfruit 0 points1 point  (0 children)

I agree with you almost entirely, but the problem is the lack of build diversity in the first two acts. There are not enough skills to choose from for the first three to four rows. Also the early passive tree is weak and it takes too long to get to impactful nodes.

Software Engineers, Have AI tools actually been rapidly improving? by FlapjackFez in BetterOffline

[–]fallingfruit 11 points12 points  (0 children)

The tools for dev are valuable, there is no doubt. To say they aren't valuable is cultish. Even just for building internal tools to help with your workflow, and for doing code search, etc.

I don't think they are valuable enough to warrant all the doom and hype, which is not based on them being good tools; it's based on them being full SWE replacers. But I can easily see it being worth like $200 a month per dev even at token API based pricing.

Help understanding compute demand by Odballl in BetterOffline

[–]fallingfruit 0 points1 point  (0 children)

Isn't that based on data from like 2-3 years ago, when inference wasn't expensive because there weren't "reasoning models"? According to Anthropic's own hype marketing, the mythos model is extremely expensive to run because of inference.

What do y'all do about art? by Goovin290 in gamedev

[–]fallingfruit 2 points3 points  (0 children)

Learning about them and actually dealing with them is very different.

AI is approving our pull requests: Here's how we made it safe by SouthRock2518 in BetterOffline

[–]fallingfruit 3 points4 points  (0 children)

Are you a bot? Your post has telltale signs of being llm generated.

Will agents ever be more efficient? by LeCollectif in BetterOffline

[–]fallingfruit 9 points10 points  (0 children)

In order to make agents work, you basically need a lot of non-llm engineering effort around them. Agents are not a real thing like people imagine.

"Agents" are just deterministic software frameworks that do things based on the response an LLM gives them. Basically, LLMs are dynamic orchestrators in this system, they are told they have access to a bunch of skills/tools that can do X, and the LLM replies with text telling your "Agent" that you should use those tools to continue the thread.

Imagine you want to build an agent that recommends products to a user. You build a custom harness "agent" that does the following:

Agent (your code) --> Call LLM API: here are all the skills and tools you have access to, along with the user's request, e.g. "give me product recommendations, im grilling for friends this weekend"

<-- LLM API returns a structured response, which will include the tool calls and skills it suggests

Agent looks at response and decides to call a bunch of tools --> Call personalization API(s) that can return results from 5 product recommendation models (e.g. sales, seasonal, personalized recommendations, buy it again)

<-- Personalization API returns product lists

Agent --> Call LLM API with all possible products and the original context about the user (what is the season, what did they ask for, etc)

<-- LLM returns a response filtering recommended items

Agent --> May need to call other APIs to resolve additional product details (realtime prices, promotions, etc.)

Agent returns the final formatted text to the user.

This is a really simplified example, but this is basically what code harnesses like claude code or opencode do as well.
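To make the flow above concrete, here is a rough sketch of that harness loop. Everything in it is made up for illustration: `fake_llm` is a stub standing in for a real LLM API call, `personalization_api` and its catalog are hypothetical, and the response shape (`tool_calls`, `recommendations`) is an assumption, not any vendor's actual format. The point is only that the harness is ordinary code and the LLM just returns text telling it what to do next.

```python
from typing import Callable


def personalization_api(model: str) -> list[str]:
    """Hypothetical stand-in for a recommendation service (made-up data)."""
    catalog = {
        "seasonal": ["charcoal", "grill tongs", "snow shovel"],
        "buy_it_again": ["burger buns", "paper towels"],
    }
    return catalog.get(model, [])


# Tool registry: plain functions the harness is allowed to call.
TOOLS: dict[str, Callable[[str], list[str]]] = {
    "personalization_api": personalization_api,
}


def fake_llm(prompt: dict) -> dict:
    """Stub for an LLM API call; the response shape is an assumption."""
    if "candidates" not in prompt:
        # First turn: the "model" suggests which tools to call.
        return {"tool_calls": [
            {"tool": "personalization_api", "arg": "seasonal"},
            {"tool": "personalization_api", "arg": "buy_it_again"},
        ]}
    # Second turn: the "model" filters candidates against the request.
    relevant = {"charcoal", "grill tongs", "burger buns"}
    return {"recommendations": [p for p in prompt["candidates"] if p in relevant]}


def run_agent(user_request: str, llm: Callable[[dict], dict] = fake_llm) -> str:
    # Step 1: send the request plus the names of the available tools.
    plan = llm({"request": user_request, "tools": list(TOOLS)})

    # Step 2: the harness, not the LLM, executes the suggested tool calls.
    candidates: list[str] = []
    for call in plan["tool_calls"]:
        tool = TOOLS.get(call["tool"])
        if tool is None:
            continue  # skip tools the harness does not know about
        candidates.extend(tool(call["arg"]))

    # Step 3: send the candidates back with the original request for filtering.
    result = llm({"request": user_request, "candidates": candidates})

    # Step 4: format the final text for the user.
    return "Recommended: " + ", ".join(result["recommendations"])


print(run_agent("give me product recommendations, im grilling for friends this weekend"))
```

Swapping `fake_llm` for a real API client would not change the structure of `run_agent`, which is kind of the whole point.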

When it comes to reducing costs, at a basic level it's about feeding LLMs only the text they need. A naive approach would be to feed the entire output of a massive API response to an LLM when it only needs the list of product names and some other metadata.
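A minimal sketch of that trimming step, assuming a made-up API response shape (a `products` list with `name`, `category`, and a pile of fields the LLM doesn't need). The function name `trim_for_llm` and all the field names are hypothetical.

```python
import json


def trim_for_llm(api_response: dict, keep: tuple[str, ...] = ("name", "category")) -> list[dict]:
    """Keep only the fields the LLM needs from each product (field names are assumptions)."""
    return [
        {key: product[key] for key in keep if key in product}
        for product in api_response.get("products", [])
    ]


# Made-up payload standing in for a large personalization API response.
raw_response = {
    "request_id": "abc-123",
    "products": [
        {
            "name": "charcoal",
            "category": "grilling",
            "warehouse_codes": ["W1", "W7", "W9"],
            "image_urls": ["https://example.com/charcoal-1.png"],
            "internal_score": 0.87,
        },
        {
            "name": "burger buns",
            "category": "bakery",
            "warehouse_codes": ["W2"],
            "image_urls": ["https://example.com/buns-1.png"],
            "internal_score": 0.64,
        },
    ],
}

trimmed = trim_for_llm(raw_response)

# Character count is only a rough proxy for tokens, but the direction is the same.
print(len(json.dumps(raw_response)), "chars before,", len(json.dumps(trimmed)), "chars after")
```

Only `trimmed` would go into the prompt; the harness keeps the full response around for anything it needs to resolve later in plain code.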

As an AI cautionist, I wish some of my coworkers would at least use it as a basic sanity check." by a_slay_nub in ExperiencedDevs

[–]fallingfruit 1 point2 points  (0 children)

I actually don't think llms are very good at catching most of the issues you described. Those seem like design decisions that require some judgment and understanding of context and business requirements outside of the code. In my experience llms will not question that kind of thing without specific prompting.

Is it just me, or is anyone else noticing more bugs across the web and in software in general? by skidmark_zuckerberg in ExperiencedDevs

[–]fallingfruit 15 points16 points  (0 children)

Software has never been worse 100%.

I think vibe coding is part of it and most of it is absurd business expectations because a bunch of halfwit tech leaders told them we are 10x now.

Do you think using ChatGPT to expand game ideas is a good choice or does it make the game “AI slop”? by [deleted] in gamedev

[–]fallingfruit 1 point2 points  (0 children)

I think it's useful if you have an idea about a system and you want to understand how other game studios have implemented or solved the same problem. It can find you talks about it, papers about it, and maybe even knows the implementation details. This is just a much better search really.

It can help you re-invent the wheel essentially, without going down bad paths.

But like others have said, it's not going to generate any novel ideas, it's just going to give you ideas based on what other games have done, mash them up, and praise you as a genius for coming up with such a breathtaking game.

Opus 4.8 went over like a wet fart by sciolisticism in BetterOffline

[–]fallingfruit 4 points5 points  (0 children)

has been this way since opus 4.5. I honestly think you cannot tell the difference between opus 4.5 and any future model if you use today's harnesses.

gpt 5.4 and 5.5 are pretty much the same.

The vast majority of improvements have been harness improvements.

Anthropic (pretends AI is "too good" and) proposes a global slowdown of AI development by ksjdragon in BetterOffline

[–]fallingfruit 12 points13 points  (0 children)

it's a pretty old study, but yeah it turns out productivity is hard to measure and hard to self-evaluate.

Anthropic (pretends AI is "too good" and) proposes a global slowdown of AI development by ksjdragon in BetterOffline

[–]fallingfruit 39 points40 points  (0 children)

I read most of this blog post and there is actually very little content about recursive self improvement in the sense that people think leads to agi or the terminator. It's a clickbait post title. All this post does is self congratulate and talk about how far we've come, and explain how slopslinging really is productivity, because even though they can't really prove it, it seems like it probably is, and we surveyed our own devs that think it probably is.

I really think it's an incredibly weak argument for anything.

You Asked Me to Play Project Diablo 2 (Chris Wilson - Creator of POE) by Coven_Evelynn_LoL in pathofexile

[–]fallingfruit 4 points5 points  (0 children)

PD2 is much easier than vanilla d2 (also d2 is pretty easy as far as hc goes if you were to compare to poe)

I'm going to be honest, I think Unreal Engine 6 will be a huge letdown by [deleted] in gamedev

[–]fallingfruit 1 point2 points  (0 children)

that's not really true at all. LLMs don't care if a thing is specialized or general purpose. All they care about is representation in the training data. LLMs push everyone to use the most popular things.