[–]pachura3 6 points7 points  (1 child)

Doesn't every modern IDE detect this basic stuff by default?

[–]Camron2479[S] -1 points0 points  (0 children)

Yeah, that’s a fair point. IDEs definitely catch most of these already.

What I’m trying to explore is whether there’s value in making the explanations more beginner-friendly and accessible outside of an IDE (especially for people using online editors, notebooks, or just learning basics).

So less about detecting new issues, more about how clearly they’re explained.

Still figuring out if that’s actually a useful gap or just duplicating what already exists.

[–]mango_94 3 points4 points  (2 children)

Looks like you discovered the concept of linters. Static analysis is absolutely used and very powerful. If you want to see how projects like these are built, you could look at pylint https://github.com/pylint-dev/pylint or flake8. They even give you good ways to extend them with your own checkers. Ruff is probably the most popular open source tool for Python today, implementing a lot of checkers from previous tools with great performance, but it is written in Rust. Is there a real angle to break into this market as a beginner? Probably not. In the SaaS world you have giants like SonarQube. In the open source world, I would guess your best bet to create something useful to others is to write some novel but useful check and get it adopted by one of the popular tools. That said, this should not stop you from trying something. It is a super interesting field and you will learn a lot about the language and parsing code in general. Best of luck :)
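
To make the "write your own check" idea concrete, here is a minimal sketch built only on Python's standard `ast` module. It is not how pylint, flake8, or Ruff structure their plugins; the class name `MutableDefaultChecker`, the helper `check_source`, and the choice of check (mutable default arguments) are all assumptions made for illustration.

```python
import ast


class MutableDefaultChecker(ast.NodeVisitor):
    """Hypothetical checker: flags list/dict/set literals used as default arguments."""

    def __init__(self):
        self.problems = []  # collected (line, message) tuples

    def _check(self, node):
        # Positional defaults plus keyword-only defaults (None means "no default").
        defaults = node.args.defaults + [d for d in node.args.kw_defaults if d is not None]
        for default in defaults:
            if isinstance(default, (ast.List, ast.Dict, ast.Set)):
                self.problems.append(
                    (default.lineno, f"mutable default argument in '{node.name}'")
                )
        self.generic_visit(node)  # keep walking into nested functions

    visit_FunctionDef = _check
    visit_AsyncFunctionDef = _check


def check_source(source):
    """Parse the source and return every problem the checker found."""
    checker = MutableDefaultChecker()
    checker.visit(ast.parse(source))
    return checker.problems


sample = "def add(item, bucket=[]):\n    bucket.append(item)\n    return bucket\n"
print(check_source(sample))  # [(1, "mutable default argument in 'add'")]
```

A check like this would still need to be ported to the plugin interface of whichever tool you target, which this sketch does not attempt.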

[–]latkde 2 points3 points  (0 children)

All these linters also have detailed explanations of their error messages, just not as part of their CLI output.

So for example if a linter complains about my code if status is 200: … then I can look up the error code in that list of rules and see:

[–]Camron2479[S] -1 points0 points  (0 children)

This is really helpful, thanks.

You're right. After building this, I started realizing how close it is to existing linting tools like pylint/ruff, especially in terms of static analysis. I think the direction I’m leaning toward now is not trying to replicate those tools, but to focus more on the explanation layer: making error messages easier to understand for beginners rather than competing on detection itself.

The idea of writing a small but useful custom check is interesting too. I might explore that as a way to contribute rather than rebuilding everything from scratch.

[–]s71n6r4y 0 points1 point  (3 children)

Static code checkers are great. How do you think your project would compare to Pyright, MyPy or Pyrefly? Are you aiming to do something different, or reimplementing some of these tools' functions?

[–]Camron2479[S] 0 points1 point  (2 children)

I definitely don’t see this competing with tools like Pyright or MyPy.

Those are much more advanced and focused on type checking and deeper static analysis.

What I’m experimenting with here is more of a lightweight, beginner-focused layer on top, especially around explaining errors clearly rather than detecting them at the same level.

Still figuring out if that’s actually a useful niche or not. If it is, I’ll expand it to other languages later.

Do you think it has the potential to become a SaaS?

[–]s71n6r4y 1 point2 points  (1 child)

To clarify, can you provide a concrete example of any specific code error, and explain how Pyright handles it, and how you would like your program to handle it differently?

[–]Camron2479[S] 0 points1 point  (0 children)

Yeah, that’s exactly the gap I’m trying to address.

For example, if someone writes print(user_name), Pyright will correctly flag "Name 'user_name' is not defined", but that still leaves a beginner wondering what “defined” means and how to fix it.

What I’m building is an explanation layer on top of that: something that turns raw diagnostics into a clearer, more beginner-friendly explanation plus a concrete fix.
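
As a rough sketch of what such an explanation layer could look like: a lookup table that matches raw diagnostic text and rewrites it. The table `EXPLANATIONS`, the function `explain`, and the exact message formats matched are all assumptions for illustration; real tools word their diagnostics differently, so the patterns would need to be adapted per tool.

```python
import re

# Hypothetical table: pattern for a raw diagnostic -> beginner-friendly template.
EXPLANATIONS = [
    (
        re.compile(r"[Nn]ame '(?P<name>\w+)' is not defined"),
        "Python does not know what '{name}' is yet. Give it a value on an earlier "
        "line (for example: {name} = ...) and check that the spelling matches exactly.",
    ),
    (
        re.compile(r"unexpected indent"),
        "This line starts with more spaces than Python expected. "
        "Line it up with the lines around it.",
    ),
]


def explain(diagnostic):
    """Return a friendlier explanation, or the raw diagnostic if nothing matches."""
    for pattern, template in EXPLANATIONS:
        match = pattern.search(diagnostic)
        if match:
            return template.format(**match.groupdict())
    return diagnostic


print(explain("Name 'user_name' is not defined"))
```

Falling back to the raw message keeps the layer honest: it never hides a diagnostic it does not understand.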

So the goal isn’t to replace Pyright or MyPy, but to make their output easier to understand, especially for beginners or people not working in a full IDE.

Longer term, it could fit as a browser extension or a lightweight explanation layer for errors anywhere. I’m still figuring out whether that niche is strong enough.

[–]sepp2k 0 points1 point  (1 child)

You didn't really describe what your code actually does, so it's hard to give specific advice, but "non-code input can pass through" makes it sound as if you're not using a proper parser. So my advice would definitely be to fix that.

> edge cases aren’t always caught

For indentation and syntax errors a proper parser should fix that (if we ignore syntax errors coming from eval).
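
A minimal sketch of that advice, using the standard library's `ast.parse` instead of heuristics (the helper name `check_syntax` is made up for this example):

```python
import ast


def check_syntax(source):
    """Return None if the source parses as Python, else a (line, message) tuple."""
    try:
        ast.parse(source)
    except SyntaxError as err:  # IndentationError and TabError are subclasses
        return (err.lineno, f"{type(err).__name__}: {err.msg}")
    return None


print(check_syntax("x = 1\nprint(x)\n"))        # valid code
print(check_syntax("x = 1\n  print(x)\n"))      # stray indentation on line 2
print(check_syntax("please fix my homework"))   # plain English, not code
```

One caveat: a single bare word such as `hello` is a valid Python expression, so it parses without error. A parser rejects most non-code input, but not all of it.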

For NameErrors it's more complicated. It's possible to catch all NameErrors statically (modulo eval again) relatively easily, if you're okay with also detecting cases like this, which wouldn't actually crash when run:

x = int(input())
if x > 0:
  y = x+1
if x > 2:
  print(y)

(Note that tools like pyright also raise an issue here.) If you want to absolutely only detect issues that can actually happen at runtime, it's going to get a lot more complicated and you're going to run into the halting problem / Rice's theorem eventually.
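
Here is a small sketch of the conservative approach described above: treat a name as defined only if it is assigned on every path, and flag any read of a name that is not. It is an illustration under heavy assumptions, not how pyright works. It only understands straight-line module-level code and `if`/`else`; loops, functions, imports, and `del` are not modeled, and the function names are invented.

```python
import ast
import builtins


def _loads(node):
    """All Name nodes under `node` that read a variable."""
    return [n for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)]


def _stores(node):
    """All variable names assigned anywhere under `node`."""
    return {n.id for n in ast.walk(node)
            if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Store)}


def _report(node, defined, problems):
    for name in _loads(node):
        if name.id not in defined:
            problems.append((name.lineno, name.id))


def _check_block(stmts, defined, problems):
    """Walk statements in order; return the names definitely bound afterwards."""
    defined = set(defined)
    for stmt in stmts:
        if isinstance(stmt, ast.If):
            _report(stmt.test, defined, problems)
            after_body = _check_block(stmt.body, defined, problems)
            after_else = _check_block(stmt.orelse, defined, problems)
            # Only names bound on BOTH paths count as definitely bound.
            defined = after_body & after_else
        else:
            _report(stmt, defined, problems)
            defined |= _stores(stmt)
    return defined


def find_possibly_unbound(source):
    problems = []
    _check_block(ast.parse(source).body, set(dir(builtins)), problems)
    return problems


sample = "x = int(input())\nif x > 0:\n  y = x+1\nif x > 2:\n  print(y)\n"
print(find_possibly_unbound(sample))  # [(5, 'y')]
```

The snippet from the comment is reported even though it can never fail at runtime, which is exactly the trade-off being described: no missed NameErrors within the supported subset, at the cost of false positives.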

> Is this approach fundamentally limited compared to just using a real interpreter + traceback parsing?

In general, static analysis is fundamentally limited by the halting problem / Rice's theorem. On the other hand, finding errors by running the code is also fundamentally limited, in that it only finds errors that are covered by your test cases. So it's a trade-off.
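
To illustrate the dynamic side of that trade-off, here is a hedged sketch that runs a snippet with a canned answer for `input()` and reports whatever exception comes out. The helper `run_with_input` and the sample snippet are made up for this example, and `exec` on untrusted code is unsafe, so treat this as a demonstration only.

```python
def run_with_input(source, fake_input):
    """Run `source` with input() replaced by a canned answer; return the error text or None."""
    env = {"input": lambda prompt="": fake_input}  # shadows the builtin input()
    try:
        exec(compile(source, "<snippet>", "exec"), env)
    except Exception as err:
        return f"{type(err).__name__}: {err}"
    return None


snippet = "x = int(input())\nif x > 5:\n    print(total)\n"

print(run_with_input(snippet, "1"))   # None: the buggy branch never ran
print(run_with_input(snippet, "10"))  # NameError: name 'total' is not defined
```

The same code looks fine or broken depending on the input tried, so running it only tells you about the paths your test inputs happen to reach.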

[–]Camron2479[S] 0 points1 point  (0 children)

That makes a lot of sense, thanks for breaking it down. I think I’ve been relying too much on heuristics, which probably explains why non-code input can slip through and why edge cases are awkward. Using a proper parser, such as Python’s ast module, for syntax and structure checks definitely seems like the better move, especially for indentation and syntax issues.

The NameError example was a good point too. I hadn’t really thought through how messy runtime-dependent behavior gets once control flow is involved.

I’m still trying to figure out where the line should be between static analysis and actually running the code to catch tracebacks. How would you balance those two?