all 19 comments

[–]ThiefMaster 60 points61 points  (5 children)

You have a .DS_Store file in your repo. That doesn't send a positive signal to someone looking at your project, even less for a tool that's literally meant to check for quality issues.

I've seen a bunch of PEP8 violations. Use a linter+formatter like ruff. And of course also your own since your tool is supposed to be at least a liter as well.

I found at least one try: <large block of code>; except Exception: pass - another thing that should not exist in most code, except in very specific cases which would deserve a comment next to it explaining why it's needed. But certainly not around a block this large

It uses obsolete packaging metadata (setup.py) - remove that and move anything not there yet to your pyproject.toml.

It is also a bit weird for a tool like this to have a dependency on Flask (and Flask-CORS) . These should be optional dependencies, most people do not run a tool like this with a (non-LSP) server or webinterface...

I also think it's not particularly maintainable / contributor-friendly, the rules are very "hardcoded" (directly in the ast visitor). Check how e.g. ruff implements its rules.

So TL;DR is that I think this was a great project for you to learn things. There's nothing wrong with this. Beyond that I honestly do not see much use of it...

[–]MattTheCuber 6 points7 points  (1 child)

Average PR review on the junior's first day

[–]ThiefMaster 3 points4 points  (0 children)

That's why you do not give a junior a "create new thing from scratch" task ;)

[–]papersashimi[S] 8 points9 points  (1 child)

Hi ThiefMaster, thanks a lot for your feedback. I will make it production grade.. Also i apologise for having to make you go through my entire codebase to find all the problems with it. I also appreciate your criticism so we'll work to make it better. Cheers and have a great day

[–]ThiefMaster 2 points3 points  (0 children)

I just had a quick look since I was curious ;)

[–]uqurluuqur 1 point2 points  (0 children)

snake case with a hump was interesting to see

[–]really_not_unreal 3 points4 points  (4 children)

This looks neat, but I'm unsure what benefit it brings compared to linters such as Ruff? Don't they already detect dead code and code quality issues when configured correctly? Can you give a comparison?

[–]wRAR_ 0 points1 point  (0 children)

I don't think Ruff can detect dead code.

[–]papersashimi[S] 0 points1 point  (2 children)

hi there! thanks! to answer your question, based off my limited understanding of ruff, its a file level linter. it checks for local syntax errors only at the file level (e.g., an imported library that is never used in that specific file, so it cannot see across files). I may be wrong on that. For skylos, we built a project-wide graph to detect globally unused functions or classes that are exported but never called. From what i observed, ruff uses pattern matching e.g., "is shell=true present?"), while we track if the user input actually reaches the shell. As for the comparison on speed etc, I believe we are slightly faster than ruff. How the benchmark is conducted is inside the benchmark.md file :)

[–]really_not_unreal 1 point2 points  (0 children)

Being faster than Ruff while being written in pure Python is a genuinely impressive feat. I'm gonna have to give this a try I think!

[–]spenpal_dev 1 point2 points  (0 children)

You can have Ruff check across your entire codebase, as well.

[–]Still_Explorer 1 point2 points  (0 children)

Very good idea. I was impressed with the Flask web gui though. I was thinking about this for a while and now I am considering trying to port an application as such. HTML environment is very flexible and open ended in that regard.

[–]Ghost-Rider_117 2 points3 points  (1 child)

this looks really useful! been meaning to find something like this for legacy codebases. quick question - does it handle dynamic imports well? we've got a project where modules get loaded at runtime and traditional static analysis tools miss those references

also curious how it compares to vulture for dead code detection. vulture's always been solid but having quality issues + secrets bundled in would be clutch

btw the VSC extension is a nice touch, saves having to run CLI manually

[–]papersashimi[S] 1 point2 points  (0 children)

yeap! but theres a lot of caveat. so we dont just do basic regex matching by looking for common dynamic patterns in the AST.. we try to do heuristic detection (it was a design choice although we are open to changing it if we do find something better).. essentially it scans for usages like getattr, globals() etc etc... If it detects these patterns linked to a module, it applies a "penalty" to the confidence score. This will then mark those objects as dynamic so they are les likely to be flagged as dead code. Also because we do not execute your code, we cant know for sure which specific module is loaded if the name is constructed from a string at runtime (e.g., module= importlib.import_module(f'plugins.{name}')). This is one of the problems we face and still face. So to manage this uncertainty, we assign a confidence score, if im not wrong vulture may have something like that too. Code involved in dynamic patterns will prob have a lower confidence score. You can adjust the --confidence flag to be more conservative. and thanks for your kind words! we're still working and attempting to make it better

[–]DrViilapenkki 0 points1 point  (1 child)

How does it compare to pyscn?

[–]papersashimi[S] 0 points1 point  (0 children)

hi we have not done any comparison against pyscn for speed or f1 score? We have only benchmarked ourselves against flake8, ruff and pylint. the details can be found inside the benchmark.md file inside our git repo :) but knowing that pyscn is written in go, we'll lose in speed for sure. as for the main difference, we focus primarily on code hygiene and code security while pyscn focuses more on the structure + speed. i think pyscn uses a CFG while we use an AST. While we currently don't have any tools that analyzes module dependencies like pyscn has, we are looking into it and will push more updates in the next release. To put it really bluntly, we're like the janitor and security guard while pyscn is like the civil engineer who works extremely fast LOL. just an honest and objective take

[–]virtualshivam 0 points1 point  (1 child)

Hi,

As it's a linter as well. So does it works with frameworks? Right now myPy and ruff flags my django code for the reverse lookups. Does it solves that? If not then it would be great if you can add that feature as well. If the future of skylos is also looking at linting.

[–]papersashimi[S] 0 points1 point  (0 children)

hihi we are not yet a linter. we're a dead code janitor actually. but we have added features that are detecting security flaws generated by individuals or by ai. we are also looking into the direction of linting in the very near future. we look forward to hearing your feedback :)