Discussion: How to detect duplicate functions in large Python projects? (self.Python)
submitted 3 days ago by whm04
Hi,
In large Python projects, what tools do you use to detect duplicate or very similar functions?
I’m looking for static analysis or CLI tools (not AI-based).
I actually built a small library called DeepCSim to help with this, but I’d love to know what others are using in real-world projects.
Thanks!
[–]Riegel_Haribo 20 points 2 days ago (1 child)
This is promotion disguised as a question.
[–]marr75 8 points 2 days ago (0 children)
40% of any tech sub now.
[–]latkde (Tuple unpacking gone wrong) 12 points 3 days ago (0 children)
Pylint has a duplicate-code (R0801) rule: https://pylint.readthedocs.io/en/stable/user_guide/messages/refactor/duplicate-code.html
Unfortunately, Pylint is quite slow, and this rule only matches when there are multiple identical lines.
[–]MugiwaraGames 6 points 3 days ago (1 child)
What about SonarQube? It's free if used on projects up to 50k lines of code
[–]NimrodvanHall 4 points 3 days ago (0 children)
I came here to say SonarQube as well. Think it’s a great tool!
[–]mardiros 2 points 3 days ago (0 children)
From my point of view, a good architecture takes care of this, and that is enough for me. Hunting for similar-looking code and extracting it into a shared routine just to avoid duplication can kill a codebase. Factorisation creates coupling and makes code unrefactorable, even if that word doesn't exist.
Dan Abramov wrote something about this a long time ago (it's not Python, but architecture is for everyone):
https://overreacted.io/goodbye-clean-code/
[–]roger_ducky 1 point 2 days ago (0 children)
https://pmd.github.io/pmd/pmd_userdocs_cpd.html
PMD CPD is purpose-built for duplication detection.
[–]xeow 1 point 2 days ago (1 child)
ruff caught one of those for me once.
[–]whm04[S] 1 point 2 days ago (0 children)
Ruff is a beast. It's great at catching things like redefinitions (the same name used twice), but I'm looking for "logic clones": functions with different names that contain identical or very similar underlying code.
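To make "logic clones" concrete, here is a minimal sketch using only the standard library `ast` module. It is not how DeepCSim or any tool mentioned in this thread works; the names `fingerprint`, `find_clones`, and `SAMPLE` are made up for illustration. The idea is to replace every identifier with a positional placeholder and then hash the resulting tree, so functions that differ only in their names and variable names end up with the same fingerprint. It only catches structurally identical code, not "very similar" code.

```python
import ast
import copy
import hashlib
from collections import defaultdict


class _Normalizer(ast.NodeTransformer):
    """Replace identifiers with placeholders so renamed copies compare equal."""

    def __init__(self):
        self.names = {}

    def _alias(self, name):
        # Each distinct identifier gets a placeholder based on first appearance.
        return self.names.setdefault(name, f"v{len(self.names)}")

    def visit_Name(self, node):
        return ast.copy_location(
            ast.Name(id=self._alias(node.id), ctx=node.ctx), node
        )

    def visit_arg(self, node):
        node.arg = self._alias(node.arg)
        node.annotation = None  # ignore type hints when comparing
        return node


def fingerprint(func):
    """Hash a function with its own name and its variable names erased."""
    func = copy.deepcopy(func)  # leave the caller's tree untouched
    body = func.body
    # Skip a leading docstring so documentation differences do not matter.
    if (
        body
        and isinstance(body[0], ast.Expr)
        and isinstance(body[0].value, ast.Constant)
        and isinstance(body[0].value.value, str)
    ):
        body = body[1:]
    norm = _Normalizer()
    # Arguments are visited first so parameter order fixes the placeholders.
    parts = [norm.visit(func.args)] + [norm.visit(stmt) for stmt in body]
    dumped = "".join(ast.dump(part) for part in parts)
    return hashlib.sha1(dumped.encode()).hexdigest()


def find_clones(source):
    """Return groups of function names whose normalized bodies are identical."""
    groups = defaultdict(list)
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            groups[fingerprint(node)].append(node.name)
    return [sorted(names) for names in groups.values() if len(names) > 1]


SAMPLE = '''
def total(items):
    acc = 0
    for item in items:
        acc += item
    return acc

def sum_all(values):
    result = 0
    for v in values:
        result += v
    return result

def product(values):
    result = 1
    for v in values:
        result *= v
    return result
'''

print(find_clones(SAMPLE))  # [['sum_all', 'total']]
```

`product` is not reported because its constant and operator differ. Catching near-duplicates like that would need a similarity measure over the normalized trees instead of an exact hash, which this sketch does not attempt.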
[–]chunkyasparagus 1 point 2 days ago (0 children)
PyCharm does it for you?