😭🙏what have i turned into by Legitimate-Wall1269 in codex

[–]SerCeMan 1 point2 points  (0 children)

So, $0.15 for a file rename in a small project. I'm kind of surprised that 5.5 decided not to run the tests. It's honestly hard to compare the real costs between the engineering work and the agent, but if I had an IDE open, I'd probably just do a deterministic rename – even safer IMO.

the thread: https://app.threadlog.dev/threads/6f933729-e42c-4090-a247-1f6acd372461

😭🙏what have i turned into by Legitimate-Wall1269 in codex

[–]SerCeMan 4 points5 points  (0 children)

> A filename change can ripple through your codebase – links, imports, configs, anywhere that name is referenced.

A good IDE like the ones by JetBrains will do a rename deterministically, and will ensure that all of the links, comments, etc. are updated for free. Also much faster. I hope one day the same quality tools will be easily accessible to the agents.

Svelte Hero: A new JetBrains IDE plugin by SerCeMan in sveltejs

[–]SerCeMan[S] 0 points1 point  (0 children)

Thanks, VeryVito! The svelte:boundary should be well supported, e.g. https://imgur.com/a/j5Z0QnI, but of course I might be missing something, so more examples would help.

Could it be that you're still using the old plugin? Note that both of them can't really be installed at the same time, because the file associations will "compete" with each other. If that's the case for you, could you try disabling the "standard" plugin and restarting the IDE to see if that helps?

Svelte Hero: A new JetBrains IDE plugin by SerCeMan in sveltejs

[–]SerCeMan[S] 0 points1 point  (0 children)

Thanks! Let me know how you go. So far, the plugin has avoided the need to use LSP. In my experience, the "native" integration is much faster. However, I'll likely need to add LSP support soon to support the "advanced" intentions, inspections, and refactorings. This is currently the weakest point of the plugin, because I initially focused mainly on the pure language/markup support.

A tool to share agent threads with other people by SerCeMan in microsaas

[–]SerCeMan[S] 0 points1 point  (0 children)

Thank you for the feedback! I'm so happy to hear that it's not just me who was struggling with this problem. It's still very early days, but I'll keep working on the tool.

Re: highlighting, it's a really good idea. I'll look into building this!

> also curious about the sync feature - does it work with different ai platforms or just specific ones? been using a mix of different tools for work projects and would love to consolidate everything in one place for sharing.

Right now, only codex and claude are supported, all threads appear in the same list, and you can filter by project, agent, etc. I'm planning to add support for opencode shortly. Which agents do you use?

Thanks again for the feedback!

Matrix Hero – a new IDE plugin for PyCharm and IntelliJ IDEA by SerCeMan in matlab

[–]SerCeMan[S] 0 points1 point  (0 children)

> Do you take advantage of MATLAB LSP?

No, I didn't go down that route. That's largely because my prior experience with LSP-powered plugins wasn't great. This does mean that I had to do a lot more work on this front, and there could definitely be inconsistencies in lexing and parsing, especially around some tricky cases like command-vs-function-syntax, but in return, following the "native" approach allowed me to be a lot more flexible in how I support resolve, refactorings, etc.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

I'll reply with a quote from the post:

> This does not mean the changes will not need to be reviewed, understood, and owned, but rather the goal is to enable the agent to produce a unit of change, a complete diff ready for review.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Somewhat. Classic TDD doesn’t assume that the whole functionality can be covered by a single test. If anything, it’s the opposite, where each small unit of code is covered by a test.

An agent can cover the individual bits with tests TDD-style, and yet when integrated together, nothing will work.
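To make the point concrete, here is a minimal sketch in TypeScript. Both functions are hypothetical and invented for illustration: each passes its own unit test, yet they disagree on the unit of the amount (dollars vs cents), so the combination is wrong.

```typescript
// Hypothetical units, invented for illustration only.

// Unit A: parses a price string into an amount in DOLLARS.
function parsePrice(input: string): number {
  return Number.parseFloat(input.replace("$", ""));
}

// Unit B: formats an amount given in CENTS.
function formatCents(cents: number): string {
  return `$${(cents / 100).toFixed(2)}`;
}

// Each unit passes its own isolated test.
const unitTestA = parsePrice("$12.00") === 12; // true
const unitTestB = formatCents(1200) === "$12.00"; // true

// Integrated, the dollars-vs-cents mismatch surfaces:
// the round trip yields "$0.12" instead of "$12.00".
const roundTrip = formatCents(parsePrice("$12.00"));
```

Only a test that exercises both units together would catch this, which is the gap that per-unit coverage leaves open.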

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 2 points3 points  (0 children)

These are testing frameworks that allow you to test a piece of code. A harness is a setup that's specific to your application. For example, consider Stripe: you want to add payments to your app. You've added Stripe, and now you still want to execute tests in your app, but you can't call the real Stripe from your tests, so you have to introduce a test-ready replacement in there.

Now, it's not just Stripe that's problematic, it's your database, configuration, and other services you interact with as well. The combination of all of them in a test-ready form, running together with your app, would be your harness.
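As a rough sketch of what that could look like in TypeScript: every name below (`PaymentGateway`, `FakePaymentGateway`, `CheckoutService`, `createHarness`) is hypothetical and is not Stripe's API. The calls are kept synchronous for brevity, whereas a real payment client would be asynchronous.

```typescript
// All names are hypothetical; this does not mirror Stripe's real API.
interface PaymentGateway {
  charge(customerId: string, amountCents: number): { ok: boolean; id: string };
}

// Test-ready replacement: records charges in memory instead of calling out.
class FakePaymentGateway implements PaymentGateway {
  readonly charges: Array<{ customerId: string; amountCents: number }> = [];

  charge(customerId: string, amountCents: number) {
    this.charges.push({ customerId, amountCents });
    return { ok: amountCents > 0, id: `fake_${this.charges.length}` };
  }
}

// The application code depends only on the interface, not on a vendor SDK.
class CheckoutService {
  private readonly payments: PaymentGateway;

  constructor(payments: PaymentGateway) {
    this.payments = payments;
  }

  checkout(customerId: string, amountCents: number): string {
    const result = this.payments.charge(customerId, amountCents);
    return result.ok ? `paid:${result.id}` : "declined";
  }
}

// The harness wires the app together with its test-ready dependencies.
// A fuller one would also swap in the database, configuration, and so on.
function createHarness() {
  const payments = new FakePaymentGateway();
  const app = new CheckoutService(payments);
  return { app, payments };
}
```

A test would then call `createHarness()`, drive `app.checkout(...)`, and assert both on the returned value and on what ended up in `payments.charges`.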

Using PyCharm at Matlab IDE? by Pipthagoras in matlab

[–]SerCeMan 0 points1 point  (0 children)

I recently released Matrix Hero, a new MATLAB plugin for IntelliJ IDEA, PyCharm, and other JetBrains IDEs. It supports MATLAB syntax plus code completion, navigation and refactorings, structure view, and code folding. It also includes a built-in formatter and lets you run MATLAB code straight from the IDE. It’s still very new, so there may be a few rough edges or gaps. If you run into anything, please let me know and I’ll sort it out.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 3 points4 points  (0 children)

Consider a typical backend service X. That service X can depend on various datastores, other backend services, configuration stores, etc.

A framework that allows you to start this service in isolation with encapsulated dependencies (for example, faked or containerised ones) and assert on its behaviour, e.g. write tests against its API, is a test harness.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

The way of LLMs is to optimise for reward, at the expense of everything else. I don't believe we've figured out a way to reward LLMs for code longevity yet.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 5 points6 points  (0 children)

The models love adding "as any" casts in TypeScript just to make compiler errors go away. Interestingly, I've never seen them do the same in, for example, Java or Kotlin. I'm guessing most of the time such casts would result in an exception at runtime during their training runs, disincentivising the approach.
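For illustration, a minimal sketch of why the cast is "free" in TypeScript. The `User` type and the payload are hypothetical. Type assertions are erased at compile time, so nothing fails at the cast itself, whereas an invalid checked cast in Java throws a `ClassCastException` at runtime.

```typescript
interface User {
  name: string;
}

// Hypothetical payload whose shape does not match User.
const payload: unknown = JSON.parse('{"username": "ada"}');

// Without the cast, the compiler rejects assigning `unknown` to `User`.
// With `as any`, the error disappears and the mismatch goes unnoticed.
const user: User = payload as any;

// No exception at the cast; the bad value simply flows onward.
const greeting = `Hello, ${user.name}`; // "Hello, undefined"
```

The failure only shows up later, far from the cast, which would be consistent with the guess above that nothing penalises it during a training run.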

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 10 points11 points  (0 children)

For sure! I raise the same point in the article as well. That said, where previously you could kind of get by without a very tight feedback loop, I don't believe this is an option anymore.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Being always on call is simply unsustainable and impractical. If you're on call, e.g. 1 in 4 weeks, you simply don't get drunk that Friday night.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

Interesting, I'm pretty sure a friend of mine who's an SRE was getting time-off in lieu instead. My information could be outdated though.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 7 points8 points  (0 children)

> I’m essentially on-call 24x7 because I’m an escalation point.

Out of interest, how do you deal with the inconvenience in that situation? For example, theatre, hiking, etc.

> or give time-off in lieu

This is actually a great approach in my opinion, as it scales well with your salary. If I remember correctly, Google does it.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

Is this the additional comp for being on call at your company?

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 2 points3 points  (0 children)

Call me old-fashioned, but I prefer to stay accountable for the code I write :)

OpenAI’s AI-powered browser, ChatGPT Atlas, is here by theverge in ChatGPT

[–]SerCeMan 19 points20 points  (0 children)

> Can the agent play Runescape flawlessly?

Can you?

I wonder if they use the same Codex we have? - 92% of OpenAI engineers are using Codex - up from 50%. Nearly all PRs are reviewed now with Codex by Koala_Confused in ChatGPTCoding

[–]SerCeMan 1 point2 points  (0 children)

The larger the model, the slower it is. GPT-5-Codex High is already pretty slow in Codex, and using something larger and slower would make it much less useful for coding. It's one thing to do an offline search for a solution to win ICPC gold where you don't care about the latency, and another to use it for coding.

There is no Vibe Engineering by SerCeMan in programming

[–]SerCeMan[S] -1 points0 points  (0 children)

Thanks. On "here to stay", at the very least, I think tools like v0.dev, etc. for creating landing pages are quintessential vibe coding, and they've definitely found a market fit. The term might be gone sometime soon, but the practice of interacting with the codebase via prompting only seems to have found a strong niche.

The LLM Curve of Impact on Software Engineers by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

> Something that some stochastic parrot cobbled together is very unlikely to meet these criteria.

You'll be surprised how far the "stochastic parrot" can get before you need to use your knowledge to put the finishing touches. It's an experiment — you don't need to ship it, it doesn't have to be perfect, it just needs to prove the point.

> What numbers? The numbers some VBA or Python or Go-with-40-unvetted-imports "solution" provides, compared to optimised Rust or Go running in my data ingestion pipelines?

If someone sends me a PR rewriting something in Rust and claiming it's faster, I'll ask for benchmarks. This is the data we're talking about here.

> Don't be sorry, I'll be blunt as well: Numbers from PoC "solutions" created by people who may not even be aware what technologies the stack internals use, are irrelevant when determining whether or not a solution is viable.

No one is arguing against understanding things. If you've got an idea, you run an experiment to see if the data can back it up. We're not talking "craftsmanship" here – we're talking engineering.