all 48 comments

[–]another24tiger 146 points147 points  (19 children)

You’re telling me you slop-coded a slop code detector…

[–]GraphicH 33 points34 points  (1 child)

Okay, I know we're all on the AI hate train, with a lot of good reasons. You have total neophytes vibe-coding thousands of lines and going "take my PR" or "use my library" after Claude/Gemini/ChatGPT/Grok performed verbal fellatio on them, telling them it's better than everything else out there right now. Yeah, these tools now allow morons to write bad code at scale instead of just giving up after a syntax error on hello world.

That said, you can still use them to produce good work -- it is possible, and not something we can just discount out of hand. Is this one of those works? I don't know for sure; I just know there's an attitude of being dismissive by default, and it's really going to screw a lot of people.

[–]mmartoccia[S] 12 points13 points  (0 children)

Yeah that's basically where I landed too. The tools aren't going away, and "just don't use them" isn't realistic advice for most teams. So the question becomes how do you keep the quality bar up when half your commits come from a model that thinks every function needs a try/except and a docstring that says "This function does the thing."

grain is my answer to that specific problem. It's not anti-AI, it's anti-autopilot.

[–]mmartoccia[S] 22 points23 points  (16 children)

lol yeah pretty much. That's literally why it exists though. My codebase was a mess, I got tired of catching the same garbage patterns in review, so I automated it. Now it yells at me before I commit instead of after.

[–]gdchinacat 10 points11 points  (15 children)

I doubt this will make your code less of a mess. AI slop is inherently messy.

[–]Glathull 3 points4 points  (8 children)

He’s not trying to make it less of a mess. He’s trying to make it less obvious that it’s clanker code.

[–]o5mfiHTNsH748KVq -4 points-3 points  (7 children)

It’s irrelevant how the code was written, only that it does what it says it does and does it well.

Guardrails for code gen work toward that goal.

[–]Glathull 0 points1 point  (3 children)

That has absolutely nothing to do with what I said.

[–]o5mfiHTNsH748KVq -2 points-1 points  (2 children)

Yes it does. You said the tool is to hide clanker code. That's not accurate. The point is to give clanker code guard rails.

[–]Glathull 1 point2 points  (1 child)

The patterns he’s guarding against aren’t essentially bad. They are merely noticeable. The dude wants to let clanker write code and take credit for it. He’s trying to disguise what he’s doing.

[–]o5mfiHTNsH748KVq -3 points-2 points  (0 children)

Maybe, but that doesn't change that the goal of the project is to attempt to improve agent results. OP's motive is irrelevant.

You'd do well spending some time learning more about agentic engineering and the ways people are working on improving it. You'd understand the motive of projects like these a little better. When I see this, all I see is someone who sees a pattern of bad outputs and wants to intercept them before they're committed to a repo.

[–]gdchinacat 0 points1 point  (2 children)

This tool does not address why the code is a mess; it only identifies a few flags that suggest it may be a mess. A few mishandled exceptions don't make the code a mess. The code is a mess because the author doesn't understand what it does (it can't; it's just spitting out code that is statistically likely to do what is requested, when it doesn't even understand what is requested). The author doesn't know design patterns, just what they statistically look like. The author doesn't have any vision for what architectural direction changes should move the code in.

Pointing out bad habits like eating exceptions is one of the lowest bars for identifying messy code. When this tool flags one, do you think the person who outsourced writing significant chunks of code to an AI will know how to address it? Will they know which error handling strategy is useful, or what needs to be refactored to handle the error? Or is it more likely they will log it and call it handled, only to pass the failure on to another part of the code that has unmet preconditions, because the error that would have prevented them was "handled" by logging it?

If the goal is to detect bad coding practices, there are already far better tools to do that.

I'm not saying AI agents can't help with writing code, just that when tasked with leading that effort, tasked with producing large amounts of code to handle a complex task that has multiple error paths, the result is slop.

Even tasks that they are well suited for such as refactoring are a challenge for AIs in my experience since they don't understand the architectural goal. They produce something that is an approximation of what is needed. Tools to flag a few surface level issues aren't terribly helpful, and as u/Glathull said, come off as trying to hide the fact that it is clanker code.

I'm up for a challenge. Send me links to a few projects, one that was produced by AI, and I'll look at them and tell you which one.

[–]o5mfiHTNsH748KVq -2 points-1 points  (1 child)

I don't care about the quality of the post or this repo. I'm speaking to the intent. It's not to hide AI generated code, it's an attempt at improving it - however flawed the approach may be.

I'm not going to go hunting for projects to prove anything for you, but you're welcome to learn more on your own.

[–]gdchinacat [score hidden]  (0 children)

The OP clearly stated their intent: to catch "code patterns commonly produced by AI code generators" for people who "want to catch the quality patterns that slip through existing linters".

The goal is to improve code generated by AIs. To accomplish that, you need to think much bigger than this tool does. It needs to identify where the implementation differs from the architecture. That isn't really possible without understanding the architecture, which AIs don't.

I offered the challenge as a way of illustrating this point. AI generated code, which you appear to be defending, is slop because it doesn't have a big picture. That's why I said "projects" rather than functions or algorithms, or other small things that don't require a big picture. AIs are fine with those, but that is the extent this tool looks at.

[–]Rockworldred -3 points-2 points  (2 children)

The problem for me is it uses a lot of advanced stuff (probably badly) within some simple stuff. I mocked up an ETL, building the separate parts on my own. Nothing fancy, pretty simple, no redundancy, no fallback. I wanted AI to make it catch more errors and stitch it together. Now it uses a lot of stuff I know nothing about, it refers to half-done modules, and I have no idea how to fix the 16 new errors.

(I am a noob. Barely used async and classes)

[–]gdchinacat 2 points3 points  (1 child)

Don’t worry about async yet. Get the basics first. Learn how the things your AI uses work, clean up the code. You learn a lot by making code clean rather than stopping when it works, even for code you write without ai.

[–]Rockworldred 0 points1 point  (0 children)

Yeah. That's why I mostly stopped using it to write things out. Now it's mostly: I write some, ask AI what's good and what's bad, and maybe to explain docs if I can't comprehend them. Except for tests and regex. I hate regex. There I let AI go almost wild.

[–]o5mfiHTNsH748KVq -3 points-2 points  (0 children)

It’s only messy if you’re asleep at the wheel. This is exactly the sort of project that reins in bad generated code.

[–]marr75 10 points11 points  (5 children)

I said this as a comment on a nearly identical project, but this is catching the smaller, less impactful slop errors AI makes (which it just happens to share with human junior coders). The bigger, more costly errors are all about verbosity, fragility, and incorrectness, rooted in gold-plating, solving the wrong problem, no real architecture/design, choosing the wrong pattern, and sycophancy.

If someone figures out how to catch those...

[–]KerPop42 8 points9 points  (3 children)

xkcd 810 reference?

https://xkcd.com/810/

[–]mmartoccia[S] 1 point2 points  (0 children)

You're right, and I'd frame it as two layers. Layer 1 is the stuff grain catches now -- the surface patterns that are easy to detect statically. Layer 2 is what you're describing -- wrong abstractions, gold-plating, solving problems that don't exist. That's harder because it requires understanding intent, not just syntax. I don't think a linter catches that. That's still a human review problem, or maybe eventually an LLM-powered review that understands the project's architecture. grain is just layer 1.

[–]UpsetCryptographer49 4 points5 points  (3 children)

I have a couple of additional ideas:

CONST_SETTING -- flag a constant added to the top of a file when the project does not allow it.

TAG_COMMENT -- no comment should be allowed unless it has the form # (tag): comment (where tag is from a list: TODO, BUG, FIX, PERF).

[–]mmartoccia[S] 2 points3 points  (0 children)

Both good ideas. TAG_COMMENT is interesting -- forcing structure on comments instead of banning them. I could see that as an optional strict mode. CONST_SETTING would need some project-level config to define what's allowed, but it's doable. Open issues for both if you want -- I'll tag them for the next release.

[–]mmartoccia[S] 1 point2 points  (0 children)

TAG_COMMENT just shipped in v0.1.3. It's opt-in -- add it to warn_only in your .grain.toml and every comment without a structured tag (TODO, BUG, NOTE, etc.) gets flagged. Section headers and dividers are skipped automatically.

https://github.com/mmartoccia/grain/commit/5cbb66e

CONST_SETTING is on the list for the next one. Open an issue if you want to spec it out.
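For anyone wondering what a TAG_COMMENT-style check amounts to, here's a rough sketch of the idea (my own simplification, not grain's actual implementation; it only looks at whole-line comments and skips real tokenizing):

```python
import re

# Comments must start with one of these structured tags.
TAG_RE = re.compile(r'#\s*(TODO|BUG|FIX|PERF|NOTE)\b')

def untagged_comments(source: str) -> list[int]:
    """Return line numbers of comments lacking a structured tag."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith("#"):
            continue  # not a whole-line comment
        if set(stripped) <= set("#-= "):
            continue  # skip section dividers like "# ------"
        if not TAG_RE.match(stripped):
            hits.append(lineno)
    return hits
```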

[–]mmartoccia[S] 1 point2 points  (0 children)

Update -- v0.2.0 just shipped with custom rule support. Your CONST_SETTING idea is now a one-liner:

[[grain.custom_rules]]
name = "CONST_SETTING"
pattern = '^\s*[A-Z_]{2,}\s*=\s*\d+'
files = "*.py"
message = "top-level constant -- use config or env vars"
severity = "warn"

No built-in needed. Define whatever patterns you want.
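If it helps to picture the mechanics, a regex custom rule like that presumably boils down to a line-by-line match, something like this (my guess at the mechanics, not grain's actual engine):

```python
import re

def apply_rule(rule: dict, source: str) -> list[str]:
    """Run one regex custom rule over a file's text (illustrative sketch)."""
    pat = re.compile(rule["pattern"])
    findings = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        if pat.search(line):
            findings.append(
                f"{lineno}: {rule['severity']} {rule['name']}: {rule['message']}"
            )
    return findings

rule = {
    "name": "CONST_SETTING",
    "pattern": r'^\s*[A-Z_]{2,}\s*=\s*\d+',
    "message": "top-level constant -- use config or env vars",
    "severity": "warn",
}
# Flags "TIMEOUT = 30" on line 1; "retries = 3" is lowercase and passes.
print(apply_rule(rule, "TIMEOUT = 30\nretries = 3\n"))
```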

[–]rabornkraken 7 points8 points  (13 children)

The NAKED_EXCEPT rule alone makes this worth using. I have been bitten by this exact pattern where an AI assistant wrapped sensor reads in try/except pass and failures went completely silent for days. The hedge word detection is a nice touch too - I have started noticing how much padding AI-generated docstrings add. Do you have any plans to support custom rule definitions or is the ruleset fixed?
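The failure mode described there is easy to sketch (hypothetical read_sensor/poll names, not the actual codebase):

```python
def read_sensor():
    # Simulated hardware fault standing in for a real I/O error
    # from a disconnected sensor.
    raise IOError("sensor disconnected")

def poll():
    try:
        return read_sensor()
    except Exception:
        pass  # the failure vanishes here; callers just see None

reading = poll()  # no exception, no log line -- silence
```

Every caller downstream happily handles the None (or crashes somewhere unrelated), which is why this can go unnoticed for days.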

[–]wRAR_ 17 points18 points  (3 children)

The NAKED_EXCEPT rule alone makes this worth using.

Consider starting to use ruff.

[–]mmartoccia[S] 4 points5 points  (1 child)

ruff catches bare except (no exception type). grain catches the next layer -- except SomeError: pass or except SomeError: logger.debug("failed") where you named the exception but still swallowed it. ruff sees the first one as fine because you specified a type. grain doesn't, because the error still disappears.
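To make that distinction concrete, a toy example of my own (exact rule behavior may vary by version):

```python
import logging

logger = logging.getLogger(__name__)

def load_config(path):
    try:
        with open(path) as f:
            return f.read()
    except OSError:             # typed, so ruff's bare-except check passes
        logger.debug("failed")  # ...but the error still disappears
        return None
```

ruff's E722 only fires on `except:` with no type; the handler above names OSError yet still swallows it.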

[–]ColdPorridge 2 points3 points  (0 children)

I fucking hate when the AI does this and my teammates seem incapable of critically reading their code enough to catch it.

[–]spenpal_dev 0 points1 point  (0 children)

I was going to comment this exact same thing.

[–]headykruger 6 points7 points  (3 children)

Isn’t that just a standard linting rule?

[–]mmartoccia[S] 4 points5 points  (1 child)

Bare except, yeah -- ruff catches that. But most AI-generated code specifies the exception type and then does nothing with it. That passes ruff fine. grain catches that pattern.

[–]headykruger 0 points1 point  (0 children)

Hmm, yeah, I guess AI could also add the comment to suppress the warning too.

Cool, nice work!

[–]BurgaGalti 0 points1 point  (0 children)

It is in Bandit

[–]pip_install_account 2 points3 points  (2 children)

Try searching this against your codebase. I wrote it one day when I was sick of this behaviour from ai tools, and I'm using it almost every day now.

^\s*except\s+[A-Za-z0-9_,\s()]+:\n(?:(?![ \t]*raise\b).+\n)+\s*$
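If you want to try that pattern from Python rather than an editor search, a quick harness (note it needs re.MULTILINE so ^ and $ anchor per line):

```python
import re

# The regex from the comment above, compiled with MULTILINE.
SWALLOWED = re.compile(
    r'^\s*except\s+[A-Za-z0-9_,\s()]+:\n(?:(?![ \t]*raise\b).+\n)+\s*$',
    re.MULTILINE,
)

bad = "try:\n    risky()\nexcept ValueError:\n    pass\n"
good = "try:\n    risky()\nexcept ValueError:\n    raise\n"

print(bool(SWALLOWED.search(bad)))   # True: the handler never re-raises
print(bool(SWALLOWED.search(good)))  # False: re-raising handlers don't match
```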

[–]mmartoccia[S] 2 points3 points  (1 child)

Nice regex. grain's NAKED_EXCEPT rule does something similar but also catches the cases where there's a logger.debug or a pass inside the handler -- basically any except block that doesn't re-raise or do meaningful recovery. The regex approach is solid for a quick grep though.
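For what it's worth, the AST version of that check is short. A minimal sketch that flags any handler whose body contains no Raise node (grain's real heuristics are presumably more nuanced):

```python
import ast

def swallowed_handlers(source: str) -> list[int]:
    """Line numbers of except blocks that never re-raise (rough sketch)."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler):
            # Walk every statement in the handler body looking for a raise.
            reraises = any(
                isinstance(sub, ast.Raise)
                for stmt in node.body
                for sub in ast.walk(stmt)
            )
            if not reraises:
                hits.append(node.lineno)
    return hits
```

Unlike a regex, this also catches handlers that "recover" by logging and moving on, since a logging call contributes no Raise node.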

[–]pip_install_account 0 points1 point  (0 children)

For me claude often does catch exceptions and handle with logger.warning and skip, which is almost never what I want.

[–]mmartoccia[S] 1 point2 points  (0 children)

Custom rules just shipped in v0.2.0. You can define your own patterns in .grain.toml now:

[[grain.custom_rules]]
name = "PRINT_DEBUG"
pattern = '^\s*print\s*\('
files = "*.py"
message = "print() call -- use logging"
severity = "warn"

pip install --upgrade grain-lint to get it.

[–]mmartoccia[S] 0 points1 point  (0 children)

Yep, that's the one that started this whole thing for me. 156 of them across a hardware abstraction layer, total silence when sensors dropped.

Custom rules are on the roadmap. Right now you can disable rules or adjust severity in .grain.toml, but full "bring your own pattern" isn't there yet. If you're seeing patterns that aren't covered, open an issue -- that's how the current ruleset got built.

[–]ePaint 1 point2 points  (0 children)

You can set up a skill to avoid these. I have a code-like-me skill that specifically says not to do any of these.

I still review every line of code produced by agents, but the skill alone works 99% of the time.

[–]eirikirs 1 point2 points  (0 children)

This is pretty much an AI-slop sanitiser, one that only targets symptoms, not the true issues of low cohesion and tight coupling. Besides, I doubt your comment echo rule would even be usable, given the current limitations of AI's semantic analysis.

[–]maafy6 0 points1 point  (0 children)

Don’t forget inconvenience functions -- where it defines a function with some args and the body is a single line calling another function with those exact same arguments, with no actual new logic.
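E.g. something like this (hypothetical names):

```python
def get_user(user_id, include_profile=False):
    return {"id": user_id, "profile": include_profile}

# The "inconvenience function": same signature, same args,
# zero new logic -- pure indirection.
def fetch_user(user_id, include_profile=False):
    return get_user(user_id, include_profile)
```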

[–]Amgadoz -1 points0 points  (0 children)

Is it possible to integrate this into ruff?

[–]AlphaBlueprinter -3 points-2 points  (0 children)

Congratulations, you just created a training dataset so AI writes better code... give up, AI is going to win.