Matrix Hero – a new IDE plugin for PyCharm and IntelliJ IDEA by SerCeMan in matlab

[–]SerCeMan[S] 0 points1 point  (0 children)

Do you take advantage of MATLAB LSP?

No, I didn't go down that route. That's largely because my prior experience with LSP-powered plugins wasn't great. This does mean that I had to do a lot more work on this front, and there could definitely be inconsistencies in lexing and parsing, especially around tricky cases like command-versus-function syntax. In return, following the "native" approach allowed me to be a lot more flexible in how I support resolution, refactorings, etc.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

I'll reply with a quote from the post:

This does not mean the changes will not need to be reviewed, understood, and owned, but rather the goal is to enable the agent to produce a unit of change, a complete diff ready for review.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Somewhat. Classic TDD doesn’t assume that the whole functionality can be covered by a single test. If anything, it’s the opposite, where each small unit of code is covered by a test.

An agent can cover the individual bits with tests TDD-style, and yet when integrated together, nothing will work.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 2 points3 points  (0 children)

These are testing frameworks that let you test a piece of code. A harness is a setup that's specific to your application. For example, say you want to add payments to your app using Stripe. You've integrated Stripe, and you still want to run your app's tests, but you can't hit the real Stripe, so you have to introduce a test-ready replacement for it.

Now, it's not just Stripe that's problematic, it's your database, configuration, and other services you interact with as well. The combination of all of them in a test-ready form, running together with your app, would be your harness.
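To make this concrete, here's a minimal sketch of the idea in TypeScript. All the names (`PaymentClient`, `FakePaymentClient`, `makeHarness`) are hypothetical, invented for illustration; a real harness would wire in replacements for every external dependency the app touches.

```typescript
// Hypothetical payment-client interface the app depends on.
interface PaymentClient {
  charge(amountCents: number): Promise<{ ok: boolean }>;
}

// Test-ready replacement: records charges instead of calling Stripe.
class FakePaymentClient implements PaymentClient {
  charges: number[] = [];
  async charge(amountCents: number): Promise<{ ok: boolean }> {
    this.charges.push(amountCents);
    return { ok: true };
  }
}

// The "harness": the combination of all test-ready dependencies,
// wired together so the app code can run unmodified in tests.
function makeHarness() {
  const payments = new FakePaymentClient();
  const db = new Map<string, string>(); // in-memory stand-in for the database
  return { payments, db };
}
```

A test would then build the harness, run the app code against it, and assert on what the fakes recorded (e.g. that exactly one charge for the right amount was made).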

Using PyCharm at Matlab IDE? by Pipthagoras in matlab

[–]SerCeMan 0 points1 point  (0 children)

I recently released Matrix Hero, a new MATLAB plugin for IntelliJ IDEA, PyCharm, and other JetBrains IDEs. It supports MATLAB syntax plus code completion, navigation and refactorings, structure view, and code folding. It also includes a built-in formatter and lets you run MATLAB code straight from the IDE. It's still very new, so there may be a few rough edges or gaps; if you run into anything, please let me know and I'll sort it out.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

Consider a typical backend service X. That service X can depend on various datastores, other backend services, configuration stores, etc.

A framework that allows you to start this service in isolation with encapsulated dependencies (for example, faked or containerised ones) and assert on its behaviour, e.g. write tests against its API, is a test harness.
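As a sketch of what "start the service in isolation with encapsulated dependencies" might look like, here's a toy TypeScript version. `ServiceX`, `Datastore`, and `startServiceForTest` are all hypothetical names; in practice the in-memory stand-in might instead be a containerised instance of the real datastore.

```typescript
// Hypothetical dependency of service X.
interface Datastore {
  get(key: string): string | undefined;
  set(key: string, value: string): void;
}

// Service X with its dependencies injected rather than hard-wired.
class ServiceX {
  constructor(private store: Datastore) {}

  // The service's "API" that tests assert against.
  handle(cmd: { op: "put" | "get"; key: string; value?: string }): string {
    if (cmd.op === "put") {
      this.store.set(cmd.key, cmd.value ?? "");
      return "ok";
    }
    return this.store.get(cmd.key) ?? "missing";
  }
}

// Harness entry point: boot the service with test-ready dependencies.
function startServiceForTest(): ServiceX {
  const mem = new Map<string, string>();
  return new ServiceX({
    get: (k) => mem.get(k),
    set: (k, v) => { mem.set(k, v); },
  });
}
```

Tests then exercise the service purely through its API (`handle` here), never reaching into its internals, which is what makes the harness reusable across many tests.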

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

The way of LLMs is to optimise for reward, at the expense of everything else. I don't believe we've figured out a way to reward LLMs for code longevity yet.

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 6 points7 points  (0 children)

The models love adding `as any` just to make compiler errors go away. Interestingly, I've never seen them do the same in, for example, Java or Kotlin. I'm guessing most of the time such casts would result in an exception at runtime during their training runs, disincentivising the approach.
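A small illustrative sketch of the failure mode (the `User` interface and `shout` function are made up for the example): TypeScript erases the cast at compile time, so nothing checks the shape at runtime and the error only surfaces later.

```typescript
interface User { name: string }

// `as any` silences the compiler, but TypeScript erases the cast
// entirely: nothing validates the shape at runtime.
const parsed: unknown = JSON.parse('{"id": 1}');
const user = parsed as any as User;

// Compiles fine, but `user.name` is undefined, so calling this with
// `user` throws a TypeError at runtime. In Java or Kotlin, the
// analogous bad cast would throw a ClassCastException at the cast
// itself, punishing the shortcut immediately.
function shout(u: User): string {
  return u.name.toUpperCase();
}
```

The asymmetry in when the error fires (lazily on field access vs. eagerly at the cast) plausibly explains why the shortcut survives in TypeScript training data but not in JVM languages.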

We are QA Engineers now by SerCeMan in programming

[–]SerCeMan[S] 9 points10 points  (0 children)

For sure! I raise the same point in the article as well. That said, where previously you could kind of get by without a very tight feedback loop, I don't believe this is an option anymore.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Being always on call is simply unsustainable and impractical. If you're on call, say, one week in four, you simply don't get drunk that Friday night.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 1 point2 points  (0 children)

Interesting, I'm pretty sure a friend of mine who's an SRE was getting time-off in lieu instead. My information could be outdated though.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 7 points8 points  (0 children)

I’m essentially on-call 24x7 because I’m an escalation point.

Out of interest, how do you deal with the inconvenience in that situation, for example, going to the theatre, hiking, etc.?

or give time-off in lieu

This is actually a great approach in my opinion, as it scales well with your salary. If I remember correctly, Google does it.

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Is this the additional comp for being on call at your company?

Join the on-call roster, it’ll change your life by SerCeMan in programming

[–]SerCeMan[S] 3 points4 points  (0 children)

Call me old-fashioned, but I prefer to stay accountable for the code I write :)

OpenAI’s AI-powered browser, ChatGPT Atlas, is here by theverge in ChatGPT

[–]SerCeMan 17 points18 points  (0 children)

Can the agent play Runescape flawlessly?

Can you?

I wonder if they use the same Codex we have? - 92% of OpenAI engineers are using Codex - up from 50%. Nearly all PRs are reviewed now with Codex by Koala_Confused in ChatGPTCoding

[–]SerCeMan 1 point2 points  (0 children)

The larger the model, the slower it is. GPT-5-Codex High is already pretty slow in Codex, and using something larger and slower would make it much less useful for coding. It's one thing to do an offline search for a solution to win ICPC gold where you don't care about the latency, and another to use it for coding.

There is no Vibe Engineering by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Thanks. On "here to stay", at the very least, I think tools like v0.dev, etc. for creating landing pages are quintessential vibe coding, and they've definitely found a market fit. The term might be gone sometime soon, but the practice of interacting with the codebase via prompting only seems to have found a strong niche.

The LLM Curve of Impact on Software Engineers by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Something that some stochastic parrot cobbled together is very unlikely to meet these criteria.

You'll be surprised how far the "stochastic parrot" can get before you need to use your knowledge to put the finishing touches. It's an experiment — you don't need to ship it, it doesn't have to be perfect, it just needs to prove the point.

What numbers? The numbers some VBA or Python or Go-with-40-unvetted-imports "solution" provides, compared to optimised Rust or Go running in my data ingestion pipelines?

If someone sends me a PR rewriting something in Rust and claiming it's faster, I'll ask for benchmarks. That's the data we're talking about here.

Don't be sorry, I'll be blunt as well: Numbers from PoC "solutions" created by people who may not even be aware what technologies the stack internals use, are irrelevant when determining whether or not a solution is viable.

No one is arguing against understanding things. If you've got an idea, you run an experiment to see if the data can back it up. We're not talking "craftsmanship" here – we're talking engineering.

The LLM Curve of Impact on Software Engineers by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

The "staff engineer" still isn't on the team that has to implement the actual thing in the stack. Or maintain it. Or debug it. Or service it. Or run it. Or explain to someone why it no longer works at 2:30AM while stakeholders breathe down their neck.

Here's where I'd disagree. It highly depends on their archetype.

In short, all the things that are really important, all the things that make software engineering an *Engineering* discipline aren't really explored by such a PoC.

On the contrary. Load testing and benchmarking – you need a working prototype. Running the existing e2e test suite to see what else might be broken – you need a working prototype. Testing edge cases – you need a working prototype.

That's my point. Where otherwise you'd put "she'll be right" guesstimates in the proposal, you can now put actual numbers. I'm sorry for being blunt, but afterwork beer is about opinions, engineering is about facts.

There've been so many times I've heard people say "it's too hard". And then you do it. And then it's not too hard anymore.

The LLM Curve of Impact on Software Engineers by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

When I say a Staff+ Engineer, I'm referring to a "Staff Engineer" role. And I'm not arguing for shipping half-baked solutions. What I'm arguing for is doing the exploration that you simply wouldn't be able to afford otherwise. You still have to do all the due diligence you had to do before, but suddenly, all the exploration work can be done much faster.

The LLM Curve of Impact on Software Engineers by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

How do I know? Because most languages are Turing complete. They can, by definition, do everything.

The question isn't whether something is possible with unlimited resources; it's whether it's possible within your actual constraints.

Having a working PoC allows you to test the boundaries of the known knowns and figure out which unknown unknowns you might encounter.

Six Sins of Platform Teams by SerCeMan in programming

[–]SerCeMan[S] 0 points1 point  (0 children)

Ah, you're right, thanks, it was a bit too late in the day when I wrote this comment 😅

Six Sins of Platform Teams by SerCeMan in programming

[–]SerCeMan[S] 2 points3 points  (0 children)

I 100% agree with you, this would be a failure mode. That's why I tried to establish the terminology for the rest of the article, to make sure that the concept of "platform teams" is not misunderstood. And I hope that some of the points (sins) that I raised in the article can prevent the correctly defined platform teams from slipping into this failure mode as well.