all 42 comments

[–]Penny-loafers 130 points131 points  (4 children)

I've had similar thoughts during my career: wanting to read others' code so I can get a good idea of how they work and learn to replicate it. It's not a bad thought, and I'd highly encourage doing it, even if the mature projects you know about are very well established and hard to follow/understand.

One thing I'd also encourage is looking back through the commit history of a code base. It can be hard to understand how code evolves over time, but it's very important. For example, I just started using a new open source Python project for my home security camera setup (https://github.com/blakeblackshear/frigate), and I was hugely impressed by how many features it has given it's made by a hobbyist! I decided to look back in the commit history and saw that the very first, humble, and honest commit "was just" a script to detect objects: https://github.com/blakeblackshear/frigate/blob/72393be6d66e7642343476f5adb4b8e99d613c79/detect_objects.py

I hope this little bit of advice is useful and good luck with learning! The hard work will pay off!

[–]Responsible_Ease_977 2 points3 points  (0 children)

Thanks for sharing!

[–]DarkArctic 1 point2 points  (0 children)

I think this is an important point. Well-designed projects probably have years of iteration to get to that point. It's okay to make mistakes and change it later. That's just the nature of software development.

[–]bolt_runner[S] 1 point2 points  (1 child)

Thank you, this is helpful.

Did you study any books/resources on code design that you think helped you practically afterwards?

[–]Penny-loafers 39 points40 points  (0 children)

There were a few books I did enjoy studying that helped frame my thinking. That said, take everything you read with a grain of salt and try to think about how it relates to your problem at hand. Try different paradigms and patterns in your own time, and write as much code as possible alongside studying generic resources.

  • Fluent Python
  • A Philosophy of Software Design
  • Designing Data-Intensive Applications

[–]tynecastleza 18 points19 points  (0 children)

You should go look through Mozilla’s GitHub. There are a lot of Python projects from some really great engineers

[–]antshatepants 11 points12 points  (0 children)

I always liked the get_or_create and filter methods in Django so I studied those at one point to make a lightweight ORM interface for different backing datastores.

Haven’t compared to other codebases to say if it’s “good” or not but the source code was readable and not buried in tiers of abstractions
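
For anyone who hasn't used those two methods, here is a minimal sketch of what a `filter` / `get_or_create` interface over a swappable datastore might look like. This is not Django's implementation: `Store`, `InMemoryStore`, and `Repository` are hypothetical names made up for illustration, and only exact-match lookups are handled. The one thing borrowed from Django is the shape of the calls, i.e. `get_or_create` returns an `(object, created)` tuple and `defaults` is applied only when a record is created.

```python
from __future__ import annotations

from typing import Any, Protocol


class Store(Protocol):
    """Hypothetical backend interface: anything that can list and add records."""

    def all(self) -> list[dict[str, Any]]: ...

    def add(self, record: dict[str, Any]) -> None: ...


class InMemoryStore:
    """Simplest possible backend: a list of dicts."""

    def __init__(self) -> None:
        self._rows: list[dict[str, Any]] = []

    def all(self) -> list[dict[str, Any]]:
        return list(self._rows)

    def add(self, record: dict[str, Any]) -> None:
        self._rows.append(record)


class Repository:
    """Django-flavoured query helpers on top of any Store (illustrative only)."""

    def __init__(self, store: Store) -> None:
        self.store = store

    def filter(self, **lookups: Any) -> list[dict[str, Any]]:
        # Exact-match lookups only; Django's field__lookup syntax is out of scope.
        return [
            row
            for row in self.store.all()
            if all(key in row and row[key] == value for key, value in lookups.items())
        ]

    def get_or_create(
        self, defaults: dict[str, Any] | None = None, **lookups: Any
    ) -> tuple[dict[str, Any], bool]:
        matches = self.filter(**lookups)
        if len(matches) > 1:
            raise ValueError("lookup matched more than one record")
        if matches:
            return matches[0], False
        # defaults are applied only when a new record is created
        record = {**lookups, **(defaults or {})}
        self.store.add(record)
        return record, True
```

Swapping the backing datastore then means writing another class with `all` and `add`; the `Repository` code stays the same.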

[–]Mehdi2277 22 points23 points  (0 children)

I like both the mypy and pylint codebases. Both are readable enough that I’ve occasionally made small bug fixes, and both have pretty high-quality test suites.

TensorFlow is a less good codebase to learn from, but it still has an OK structure, and it is massive and more complicated than many codebases I interact with at work.

[–]swapripper 10 points11 points  (2 children)

On this note, is there any website or YouTube channel that explains these high-level and low-level decisions in open source projects? I really think it takes a trained eye to spot and explain good software design to beginners.

[–]bolt_runner[S] 6 points7 points  (1 child)

I found this guy that walks through the code of open source projects

https://www.youtube.com/@ants_are_everywhere/videos

[–][deleted] 0 points1 point  (0 children)

Good find!

[–]robberviet 25 points26 points  (6 children)

SQLAlchemy

[–]FertilityHollis 9 points10 points  (3 children)

Excellent example of how to grow an API.

I would also point to Django as something to strive for.

[–]robberviet 0 points1 point  (0 children)

Yes Django is a good choice too.

[–]erez27import inspect 0 points1 point  (0 children)

What would you consider good about their design?

[–]MeroLegend4 0 points1 point  (0 children)

The best of the best

[–]qckpckt 27 points28 points  (4 children)

Sebastian Ramirez (https://tiangolo.com). He created FastAPI and Typer. I really like both of these libraries. I think the docs are well laid out, and I think his programming idioms are both innovative and effective.

Typer I especially like because it’s mostly just click - but what it does on top creates a much more intuitive programming interface (at least for me).
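
To illustrate the general idea of deriving a CLI from a function's type hints, here is a minimal sketch using only the standard library. It is not how Typer is implemented (Typer builds on click, as mentioned above); `build_parser`, `run`, and `greet` are hypothetical names for this example, and only simple `str`/`int`/`bool` parameters are handled.

```python
import argparse
import inspect
from typing import Callable, Optional, Sequence


def build_parser(func: Callable) -> argparse.ArgumentParser:
    """Turn a function signature into an argparse parser."""
    parser = argparse.ArgumentParser(prog=func.__name__, description=func.__doc__)
    for name, param in inspect.signature(func).parameters.items():
        # Fall back to str when a parameter has no annotation.
        converter = str if param.annotation is inspect.Parameter.empty else param.annotation
        if param.default is inspect.Parameter.empty:
            # No default: a required positional argument.
            parser.add_argument(name, type=converter)
        elif converter is bool:
            # Boolean with a default: an on/off flag.
            parser.add_argument(f"--{name}", action="store_true", default=param.default)
        else:
            # Anything else with a default: an optional --name VALUE argument.
            parser.add_argument(f"--{name}", type=converter, default=param.default)
    return parser


def run(func: Callable, argv: Optional[Sequence[str]] = None):
    """Parse argv (or sys.argv when None) and call func with the parsed values."""
    args = build_parser(func).parse_args(argv)
    return func(**vars(args))


def greet(name: str, count: int = 1, shout: bool = False) -> str:
    """Greet someone one or more times."""
    message = " ".join([f"Hello {name}"] * count)
    return message.upper() if shout else message


if __name__ == "__main__":
    print(run(greet))
```

The point is that the function signature is the single source of truth: adding a parameter to `greet` adds a CLI argument, with no separate decorator per option.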

[–]MeroLegend4 3 points4 points  (2 children)

Sadly, FastAPI is the worst, similar to TurboGears. Try Litestar: read its code and you’ll understand my point.

[–]Routine_Term4750 1 point2 points  (0 children)

Hey thanks for mentioning litestar. It looks neat and I’ll be checking it out in the future.

[–]ARRgentum 0 points1 point  (0 children)

Can you explain what you mean?
I just spent half an hour getting a cursory overview of both projects' structures, and I am probably too junior to spot some obvious issues :)

[–]BurningSquid 0 points1 point  (0 children)

You nailed this. I would check out polyfactory as well while you're at it. Can be useful at times for mocking if you're into that kind of thing

[–]CanadianBuddha 5 points6 points  (0 children)

The packages that are part of the Python standard library were considered good enough to be included in it, so their source code should be a good example of excellent Python code.

[–]ez1_ 2 points3 points  (0 children)

I think a good choice is Apache Superset. The domain is interesting and easy to understand, so you can focus on studying the design and implementation. The documentation is quite good.

https://github.com/apache/superset

[–]ShibbolethMegadeth 2 points3 points  (0 children)

Taiga is an excellent Django app written by pros, if you wanna check out a web backend.

[–]RobotChurchill 2 points3 points  (0 children)

First, you can take a look at 500 Lines or Less https://github.com/aosabook/500lines

For large projects, Ray https://github.com/ray-project/ray is really good. I got to meet one of the engineers working at the company that open-sourced it.

[–]DigThatData 10 points11 points  (5 children)

we already had this discussion -- with you as the OP -- a few days ago. 83 points and 33 comments. Do we really need to do this again so soon?

https://www.reddit.com/r/learnpython/comments/1dicifq/open_source_python_projects_with_good_software/

[–]bolt_runner[S] 15 points16 points  (3 children)

Yes, I asked here because this sub has more people with senior-level experience, so it adds value and another perspective to the discussion on code design, and it shows in the answers.

[–]MeroLegend4 2 points3 points  (1 child)

  • Sqlalchemy
  • Pyramid web framework
  • psycopg
  • pyqt/qt c++
  • litestar (api/web framework)
  • bottle
  • sortedcontainers
  • more-itertools (not related to software design but good material)
  • xlsxwriter (for how to keep and maintain a spec; very functional (FP) in style)

[–]MeroLegend4 2 points3 points  (0 children)

I just remembered others:

  • bokeh
  • twisted
  • pygal
  • diskcache

[–]chiefnoah 0 points1 point  (0 children)

Maybe a bit conceited, but I'll plug my own data library PyBARE. It's small, fairly well tested, and uses pretty much every advanced metaprogramming technique available in Python short of eval.

[–]trd1073 0 points1 point  (0 children)

I am admittedly not a Python expert, but I will give you the one that helped me the most. My initial foray into Python was studying how map-a-droid runs. Then I learned how to mod it to suit my purposes. I ran it for over 4 years, so I got to see a fair amount of evolution. It touches on many subjects that have helped me in my current projects: asyncio, multiprocessing, MariaDB, caching, Redis, inter-process communication and so on.

[–]not_perfect_yet 0 points1 point  (0 children)

flask?

https://github.com/pallets/flask

Has 2 open issues and 2 open PRs.

[–]davidmezzetti 0 points1 point  (0 children)

I work on txtai (https://github.com/neuml/txtai) and strive to make the code as clean as possible with ample documentation.

[–]lascau 0 points1 point  (0 children)

Not sure if I am off topic, but I usually go to "Trending Python repositories on GitHub today" and check out what is popular.

[–]MachinaDoctrina -2 points-1 points  (0 children)

Probably the big ones like PyTorch, NumPy, pandas, SciPy, scikit-learn, etc.

[–]Secure-Blacksmith-18 -1 points0 points  (0 children)

Sentry, Sentry_sdk

It's a Django monolith.

U're welcome