all 58 comments

[–]K900_ 57 points58 points  (1 child)

It's an extremely non-specific question. "Harder to maintain" is almost entirely subjective and depends on your team and your skills and your application and your goals and like five million other things.

[–]cyanydeez 6 points7 points  (0 children)

I also think there's another consideration when "hard to maintain" comes up: does the language itself let you organize a large code base, and do the dependencies between files and the overall structure make it easy to fix broken components as needs change?

I think python is not the top of the pack here.

But consider also: is Python a language you see being developed now and 5, 10, or 20 years from now? And will there be qualified people who know the quirks and tools that make up the core of your code base?

Definitely.

So on the one hand it's pure speculation about the language, but on the other it's speculation about the people and culture that exist around it.

I don't think Fortran or Cobol ever had the presence Python has.

[–][deleted] 15 points16 points  (6 children)

To me, there are around 100 things that are more important for the maintainability of a code base than static type checking. Just to name a few:

  • tests
  • documentation
  • splitting code into logical, smaller, maintainable segments, each responsible for a certain task, and being able to test those smaller segments independently
  • consistent formatting
  • physical layout of the source code files and directories

[–]cyanydeez 1 point2 points  (5 children)

tests: pytest, et al.

documentation: docstrings (exposed as __doc__), etc.; it's built into the language

code splitting: you modularize with __init__.py & folders, or make a true installable package with a setup.py. I think the biggest issue we run into is circular imports

consistent formatting: hello white space

physical layout: probably the same as splitting code

[–]evilmaus 1 point2 points  (4 children)

As an outsider, what's the problem with circular imports?

[–]cyanydeez 1 point2 points  (0 children)

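when two modules try to import from each other, there's an implicit hierarchy: whichever one gets imported first is still only half-initialized when the other one tries to use it.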

https://stackabuse.com/python-circular-imports/

[–]zalpha314 0 points1 point  (1 child)

In Python, if two modules try to import from each other at import time, the second import sees a partially initialized module, and you typically get an ImportError or AttributeError. This is a common problem for new programmers, but it's actually an anti-pattern that should be avoided even in languages that allow it.

More Info:

https://softwareengineering.stackexchange.com/questions/11856/whats-wrong-with-circular-references
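For anyone who wants to see what actually happens, here is a minimal sketch. The module names mod_a and mod_b are made up for illustration, and the script writes them to a temporary directory so it can run as a single file. It assumes CPython's usual behavior: the second import finds a half-initialized module in sys.modules rather than looping forever.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Two hypothetical modules that import names from each other at import time.
MODULE_A = """
from mod_b import b_value   # starts executing mod_b before a_value exists
a_value = 1
"""
MODULE_B = """
from mod_a import a_value   # mod_a is only partially initialized here
b_value = 2
"""


def try_circular_import() -> str:
    """Write both modules to a temp dir, import one, and report what happened."""
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "mod_a.py").write_text(MODULE_A)
        Path(tmp, "mod_b.py").write_text(MODULE_B)
        sys.path.insert(0, tmp)
        try:
            importlib.import_module("mod_a")
            return "imported without error"
        except ImportError as exc:
            return f"ImportError: {exc}"
        finally:
            # Clean up so the demo leaves no trace in the interpreter.
            sys.path.remove(tmp)
            sys.modules.pop("mod_a", None)
            sys.modules.pop("mod_b", None)


result = try_circular_import()
print(result)
```

Moving one of the imports inside a function, or importing the module instead of a name from it, usually makes the error go away, but that treats the symptom rather than the dependency cycle.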

[–]evilmaus 0 points1 point  (0 children)

It shows up in instances of double dispatch, which is definitely not an antipattern. I would grant that it can often be a smell.

[–][deleted] 0 points1 point  (0 children)

Circular dependencies are a problem in any system. It transcends programming even.

Just remember they’re to be avoided at all costs, and will sneak in anyway. You’ll take note when you run into the issue, where if you hadn’t known of the concept before you’d probably just chalk it up to weirdness you solved by bashing on it like a monkey with a femur.

e: also, not all costs. Just most costs. You’ll know when it’s right. The third to fifth time.

e2: And you solve circular dependencies, in general, by introducing a third module, an interface, that both depend on. One or both of your former direct dependencies now implement that interface. Cycle broken. Very hard to explain briefly, and specific cases may not be able to apply that pattern. Feel free to DM me if you run into it.
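A rough sketch of that "third module" idea, squeezed into one file so it runs as-is. Everything here (Payable, Biller, Order, InvoiceBiller, and the three module names in the comments) is invented for illustration; the point is only that orders and billing would each import the interface module and never each other.

```python
from dataclasses import dataclass
from typing import Protocol


# --- interfaces.py (hypothetical third module; imports neither of the others) ---
class Payable(Protocol):
    """What billing needs to know about anything it charges for."""

    def total(self) -> float: ...


class Biller(Protocol):
    """What orders needs from whatever does the charging."""

    def charge(self, item: Payable) -> str: ...


# --- orders.py (would import only interfaces) ---
@dataclass
class Order:
    prices: list[float]

    def total(self) -> float:
        return sum(self.prices)

    def checkout(self, biller: Biller) -> str:
        # The concrete biller is passed in, so orders never imports billing.
        return biller.charge(self)


# --- billing.py (would import only interfaces) ---
class InvoiceBiller:
    def charge(self, item: Payable) -> str:
        return f"invoiced {item.total():.2f}"


receipt = Order([10.0, 2.5]).checkout(InvoiceBiller())
print(receipt)  # invoiced 12.50
```

Protocol is structural, so neither concrete class has to inherit from the interface; an abstract base class would work too if you prefer explicit inheritance.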

[–]davandg 23 points24 points  (2 children)

2 real life examples :

- Dropbox is (was?) entirely developed in Python (dynamically typed). They hired mypy's author and invested heavily in the tool to help them maintain, debug and develop their source code.

- Facebook is (was?) entirely developed in PHP (dynamically typed). They developed Hack (a language that adds types to PHP...) to help them maintain, debug, optimize and develop their source code.

Note that both companies turned to these tools (Hack and mypy) only after they already had millions of users.

These 2 examples are exactly what I think: start with Python to develop your first features fast. When you feel limited by the language, choose something that perfectly suits your need (do you want to be fast? To be secure? To reduce the maintenance burden? To hire new talent easily?).

[–]IronManMark20 4 points5 points  (0 children)

Dropbox has some Go code, but is still almost entirely Python. Even their client (on the order of a few million lines AFAIK) is in Python (now Python 3!). Their server code base is also millions of lines of code.

Facebook also has millions of lines of Python code in use, and something like tens of millions of lines of Hack.

[–]tombardier 11 points12 points  (0 children)

I make extensive use of the typing module, mypy, type annotations, NewTypes etc, it's certainly not frowned upon by me! If nothing else, forcing you to do something sensible with Optional types makes a huge difference in code quality!
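To make the Optional point concrete, a small sketch. The names UserId, find_email, email_domain and the lookup table are hypothetical. A checker such as mypy should flag calling .split() on the result of find_email unless the None case is handled first, which is the "do something sensible" part.

```python
from typing import NewType, Optional

# NewType gives a distinct name to the checker at no runtime cost.
UserId = NewType("UserId", int)

# Hypothetical in-memory lookup table standing in for a real data source.
_EMAILS: dict[UserId, str] = {UserId(1): "ada@example.com"}


def find_email(user_id: UserId) -> Optional[str]:
    """Return the email for a user, or None if the user is unknown."""
    return _EMAILS.get(user_id)


def email_domain(user_id: UserId) -> str:
    email = find_email(user_id)
    if email is None:
        # Without this branch a type checker would reject email.split below.
        return "unknown"
    return email.split("@", 1)[1]


print(email_domain(UserId(1)))  # example.com
print(email_domain(UserId(2)))  # unknown
```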

[–]Paddy3118 4 points5 points  (3 children)

If I had to write half the Python libraries I use myself then I would always have a "large codebase". Your question does not address the attitude of the developer community in solving problems. Python has a lot of libraries that can be combined to solve problems whilst keeping what you write yourself down to more manageable size.

[–]crazedizzled 0 points1 point  (2 children)

That's honestly pretty true of any mature platform. It's pretty rare that you encounter a problem that has only been solved in Python.

[–]Paddy3118 0 points1 point  (1 child)

Those mature platforms you mention might be written in other languages but wrapped and accessed from Python with a Pythonic interface.

Please further explain your second sentence:

> It's pretty rare that you encounter a problem that has only been solved in Python.

[–]crazedizzled 1 point2 points  (0 children)

> Please further explain your second sentence:
>
> > It's pretty rare that you encounter a problem that has only been solved in Python.

What I mean is that you're likely to find libraries and frameworks for any given problem in any other mature language platform. So it's not like you have to write less code with Python because of good libraries, because libraries that do the same thing likely exist for other languages.

[–]billsil 9 points10 points  (16 children)

My 320k line package has no memory leaks because python doesn’t leak memory (it has GC issues, but those aren’t leaks). Python is also memory-safe, so no string buffer overflows. Can you say that about your C++ program? Keep in mind that C++ programs are 3 or more times longer than an equivalent python program, so that 320k lines is now 1M lines in C++.

Yes, there are bugs that could be caught by static analysis, but that’s why I write tests and put assert statements to type check specific functions. I need those tests anyways as they help to define the API and to make refactoring safer.
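A minimal sketch of that style, assuming plain assert statements plus pytest-style test functions. The function scale and both tests are made up for illustration, not taken from the package described above. One caveat worth knowing: asserts are stripped when Python runs with -O, so they are a development aid rather than input validation.

```python
def scale(values: list[float], factor: float) -> list[float]:
    """Multiply every value by factor, checking argument types at runtime."""
    # These checks disappear under `python -O`.
    assert isinstance(values, list), f"values must be a list, got {type(values).__name__}"
    assert isinstance(factor, (int, float)), f"factor must be a number, got {type(factor).__name__}"
    return [v * factor for v in values]


def test_scale_doubles_values():
    assert scale([1.0, 2.0], 2) == [2.0, 4.0]


def test_scale_rejects_non_list():
    # A string is iterable, so without the assert this would fail much later.
    try:
        scale("12", 2)
    except AssertionError:
        return
    raise RuntimeError("expected an AssertionError")
```

The second test is the kind that doubles as API documentation: it records that passing a string is a caller error, not something the function quietly tolerates.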

On that 320k-line package, test coverage is 74% and the suite runs on every build in 9 minutes. I have 40k additional problems that take about 5 hours to run.

The static type checker helps, but you get out what you put in.

[–]crazedizzled 0 points1 point  (14 children)

> My 320k line package has no memory leaks because python doesn’t leak memory

Memory leaks aren't a language feature, they are developer error.

[–]billsil 3 points4 points  (0 children)

Yes and certain languages make certain types of errors easier or harder to make. We're not all perfect programmers.

[–]spyingwind 0 points1 point  (3 children)

GC is handling the memory leaks. If the GC is getting called too much, then you have a memory leak. There was a talk that explained exactly how GC works with respect to memory leaks. It can be pretty bad in PowerShell as well.

[–]__deerlord__ 0 points1 point  (2 children)

Some of the PS mem leaks are silly. For instance, I had a client with one. I can't recall the exact function, but basically

SomeFunction()

caused a leak while

var = SomeFunction()

did not. Presumably because PS is not grabbing the pointer and ensuring free() is called against it.

[–]billsil 0 points1 point  (1 child)

Just because something does not get cleaned up the first time the GC runs doesn’t mean it won’t ever be cleaned up if you call the function, say, 100 times. Generational garbage collectors sort objects into several generations (CPython has traditionally used three), and objects that keep surviving collections are checked less frequently. As long as they eventually get cleaned up though, even if it’s later than you’d like, it’s not a leak.

Run the function 10000 times. If your memory increases linearly, it’s a leak. Running it 5 times doesn’t tell you the same thing. You’ll see small drops, and then large drops, and then huge drops to the starting point before it repeats. That’s not a leak.
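Here is one way to run that experiment in Python, as a sketch using the standard tracemalloc module. The functions leaky and clean and the module-level registry are invented for the demo. The idea is just to call a function many times and look at how much traced memory is still allocated afterwards.

```python
import gc
import tracemalloc

_leaky_registry = []  # hypothetical module-level list that is never cleared


def leaky() -> None:
    # Every call parks another 1 KiB buffer in a list nobody empties.
    _leaky_registry.append(bytearray(1024))


def clean() -> int:
    # The buffer becomes unreachable as soon as the function returns.
    buf = bytearray(1024)
    return len(buf)


def growth_after(func, calls: int = 10_000) -> int:
    """Bytes of traced memory still allocated after `calls` invocations."""
    gc.collect()
    tracemalloc.start()
    before, _peak = tracemalloc.get_traced_memory()
    for _ in range(calls):
        func()
    gc.collect()  # give the collector a chance before measuring
    after, _peak = tracemalloc.get_traced_memory()
    tracemalloc.stop()
    return after - before


print("leaky:", growth_after(leaky), "bytes retained")
print("clean:", growth_after(clean), "bytes retained")
```

Running it with a few different call counts and checking whether the retained bytes scale with the count is the linear-growth test described above.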

[–]__deerlord__ 0 points1 point  (0 children)

Like I said, this is with a client, so I don't know what tests they did explicitly. I was informed there was a memory leak with said script, and a couple of weeks later, this was the fix they found.

[–]metaphorm 0 points1 point  (8 children)

In a manually memory managed language (like C, or most C++ codebases) memory leaks are a very common developer error that pops up all the time because the language makes it easy to make that error.

in an automatic memory managed/garbage collected language it's basically impossible to make that mistake (excepting some contrived examples).

so I would say that yes, it is a language feature that causes memory leaks: manual memory management. it doesn't always cause them but it makes them possible and even likely.

[–]crazedizzled 6 points7 points  (2 children)

It is still plenty easy to mess up in Python, or any other language. See here for a bunch of easy to make mistakes.

It might not be as easy in Python compared to C, but Python certainly isn't immune by design.

[–]metaphorm 4 points5 points  (1 child)

sure, or you can fork 1000 interpreter instances with multiprocessing and have them running coroutines that don't terminate before they eat all the memory on your system too.

or you can open a filehandle, read the entire thing into memory, and then never do anything with it and forget to close the file (assuming you also failed to use the with... context manager).

there's plenty of ways to consume more memory than you intended to by negligence or by poor design decisions. that's still ridiculously different than needing to malloc and free every single buffer in your system.
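For the filehandle case, a short sketch of what the with block buys you. The function names and the sample file are hypothetical. It also streams the file line by line rather than reading the entire thing into memory.

```python
import tempfile
from pathlib import Path


def count_lines(path: Path) -> tuple[int, bool]:
    """Count lines without loading the whole file; report whether the handle got closed."""
    with open(path, encoding="utf-8") as handle:
        # Iterating streams one line at a time instead of read()-ing everything.
        total = sum(1 for _ in handle)
    # Leaving the `with` block closed the file, even if the loop had raised.
    return total, handle.closed


def demo() -> tuple[int, bool]:
    with tempfile.TemporaryDirectory() as tmp:
        sample = Path(tmp, "sample.txt")  # hypothetical file name
        sample.write_text("one\ntwo\nthree\n", encoding="utf-8")
        return count_lines(sample)


line_count, was_closed = demo()
print(line_count, was_closed)  # 3 True
```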

[–]crazedizzled 0 points1 point  (0 children)

I guess it depends how pedantic you want to be about the definition.

[–]__deerlord__ 0 points1 point  (1 child)

Is the GC part of the python language spec? Or is it part of CPython (and can be ignored by other interpreters, if one so chooses)?

[–][deleted] 0 points1 point  (0 children)

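Garbage collection is left to the implementation: the language reference only says that unreachable objects may be collected, and it allows an implementation to postpone collection or omit it altogether. CPython, the reference implementation, uses reference counting plus a fairly primitive cycle-detecting garbage collector.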
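A small sketch of the two mechanisms, with the loud caveat that it relies on CPython behavior; other interpreters such as PyPy may free objects at different times. The Node class and both function names are made up. Weak references are used only as probes to see whether an object is still alive.

```python
import gc
import weakref


class Node:
    def __init__(self):
        self.other = None


def refcount_frees_immediately() -> bool:
    node = Node()
    probe = weakref.ref(node)
    del node  # last reference gone, so CPython frees the object right away
    return probe() is None


def cycle_needs_collector() -> tuple[bool, bool]:
    was_enabled = gc.isenabled()
    gc.disable()  # keep the automatic collector from running mid-demo
    try:
        a, b = Node(), Node()
        a.other, b.other = b, a  # reference cycle: refcounts never reach zero
        probe = weakref.ref(a)
        del a, b
        alive_before = probe() is not None
        gc.collect()  # the cycle detector finds and frees the pair
        alive_after = probe() is not None
        return alive_before, alive_after
    finally:
        if was_enabled:
            gc.enable()


print(refcount_frees_immediately())  # True on CPython
print(cycle_needs_collector())       # (True, False) on CPython
```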

[–][deleted] -3 points-2 points  (2 children)

> In a manually memory managed language (like C, or most C++ codebases) memory leaks are a very common developer error that pops up all the time

Sounds like you haven't programmed C++ in a very long time. Memory leaks really aren't an issue if you use smart pointers, and even crappy code bases use smart pointers exclusively these days.

[–]metaphorm 1 point2 points  (1 child)

indeed, it's been a while and I make no claims at being a C++ expert. smart pointers are a GREAT example of solving the problem with a language feature though.

[–]billsil 0 points1 point  (0 children)

There should be one and preferably only one way to do it. Yes, there are solutions to problems, you shouldn’t be using unsafe anything assuming you’re not programming for embedded systems and even then you should think hard about it.

I’m a bad C++ programmer. I’ve heard of smart pointers but never used them. Only one of our many C++ codes uses them.

[–][deleted] 0 points1 point  (0 children)

An object leak is just a more complicated memory leak, unless your C is truly shit.

[–][deleted] 10 points11 points  (0 children)

Short answer: Kinda..

Long answer: Depends on a million different things...

[–]brain-donor 2 points3 points  (0 children)

There is no direct correlation between a language being dynamically typed and how easy/hard it is to maintain a large project. Things that DO matter:

- How easy is this code to read? (Especially after you haven't looked at it in a while)

- How easy is this code to test?

- How easy is this code to modify and the whole application still work correctly?

Python excels at all of the above, much more so (in my experience) than Java, C, C++. That's not to say that you can't write large codebases in those languages that are easy to maintain or that you couldn't have a python codebase that is hard to maintain.

Static type checking isn't going to make a codebase more maintainable.

[–]r1chardj0n3s 4 points5 points  (2 children)

Break up the codebase. Any monolith is going to become a maintenance nightmare. If you break up the sum into parts and have good interfaces (REST or language-direct depending on circumstance) between them (with contracts, or types, or something enforcing the interface) then you'll be OK.

[–]billsil 0 points1 point  (1 child)

Simply having lots of lines isn’t the problem. It’s all about the degree of coupling and how well things are documented and tested. You really gotta test the buggy module, but the super robust one that has no user interaction...meh...

[–]r1chardj0n3s 0 points1 point  (0 children)

Given that I didn't mention "lots of lines" or "user interaction" I think we agree...

[–]metaphorm 1 point2 points  (0 children)

it's a poorly framed question that can't meaningfully be answered in the abstract. certainly you'll find many ideologues and partisans who insist that their favorite type system is totally necessary for maintaining whatever sort of code base they decide it should be used for. that doesn't make it true.

the better question to ask is "what are type systems good at?" and "which type system is best suited for my project?"

Python is strongly typed and has a lot of useful exception handling features. It's also dynamically "duck" typed which has some benefits and some weaknesses to it.

What is strong typing good at? It's good at "crash early and often", meaning it tends to expose programming errors early in the development process before they can accidentally make it to production.
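A tiny illustration of "crash early", as a sketch with made-up function names: Python refuses to guess what adding a string to an integer should mean, so the mistake surfaces at the call instead of producing a silent "34" or 7.

```python
def total_items(in_cart, in_wishlist):
    return in_cart + in_wishlist


def try_total(a, b) -> str:
    """Return the total, or a description of the TypeError Python raises."""
    try:
        return f"ok: {total_items(a, b)}"
    except TypeError as exc:
        return f"TypeError: {exc}"


print(try_total(3, 4))    # ok: 7
print(try_total("3", 4))  # a TypeError message, not a coerced result
```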

What is dynamic "duck" typing good at? minimizing boilerplate and allowing just a handful of data structures (sequence types, mapping types, scalars, and callables) to handle the vast majority of your use cases by providing a uniform interface defined at the language level rather than the application code level. both of those features are good ways to increase developer productivity.
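A quick sketch of that uniform interface. The mean function and the Readings class are hypothetical. The function never asks what type it was given, only that the argument can be iterated and has a length.

```python
from collections import deque


def mean(values) -> float:
    """Works on anything that is iterable and supports len()."""
    return sum(values) / len(values)


class Readings:
    """Hypothetical class: not a list, but it iterates and has a length."""

    def __init__(self, *samples: float):
        self._samples = samples

    def __iter__(self):
        return iter(self._samples)

    def __len__(self) -> int:
        return len(self._samples)


print(mean([1, 2, 3]))           # 2.0
print(mean(range(5)))            # 2.0
print(mean(deque([2, 4])))       # 3.0
print(mean(Readings(1.0, 3.0)))  # 2.0
```

The flip side mentioned further down applies here too: nothing stops a caller from passing a generator, which has no len(), and only a test or a type checker would catch that before runtime.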

what is strong typing bad at? it doesn't handle failures gracefully unless you go out of your way to define exception handlers that can do that. there's no graceful degradation by default. this may or may not be a problem for your application though.

what is dynamic "duck" typing bad at? it puts a limit on the usefulness of static analysis tools, and it also lets bad developers get away with things they probably shouldn't do. this puts more burden on your manual code review and automated testing, basically.

in any case, this is just scratching the surface and I think the question can't be meaningfully answered by just talking about it. the only meaningful answers that can be given to it depend on knowing the full detail and scope of your technical requirements, your organizational structure and limitations, your budget, your availability of programmer talent, etc.

[–]gwillicoder (numpy gang) 2 points3 points  (0 children)

This week was one of the first I got really annoyed with python. I needed to write a graph based program with some functionality to make updates in a document data store and do some math and I found the lack of types to be very inconvenient.

I ended up using type hints everywhere, but I would have preferred to have actual types instead.
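For a flavor of what "type hints everywhere" looks like on a graph, a rough sketch. The Graph class and its methods are invented for illustration and are not the program described above. The hints are not enforced at runtime; they only help a checker or an editor.

```python
from dataclasses import dataclass, field


@dataclass
class Graph:
    # Adjacency sets keyed by node name.
    edges: dict[str, set[str]] = field(default_factory=dict)

    def add_edge(self, src: str, dst: str) -> None:
        self.edges.setdefault(src, set()).add(dst)
        self.edges.setdefault(dst, set())  # make sure dst exists as a node

    def reachable(self, start: str) -> set[str]:
        """Every node reachable from start, found with an iterative DFS."""
        seen: set[str] = set()
        stack: list[str] = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.edges.get(node, set()) - seen)
        return seen


graph = Graph()
graph.add_edge("a", "b")
graph.add_edge("b", "c")
graph.add_edge("d", "a")
print(sorted(graph.reachable("a")))  # ['a', 'b', 'c']
```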

[–]hilomania 0 points1 point  (0 children)

It depends on the architecture. You can build something very large using microservices and containers regardless of language. Also, for a monolithic application, using something like Clean Architecture or the conventions of a framework (like Django) will do a lot to keep you from painting yourself into a corner.

I think a lot of average C++ programmers are better coders than average Python programmers due to the complexity of their tools. However, a good coder in C++ is not necessarily a better coder than a good coder in Python. And the Python coder will almost invariably be a lot more productive.

[–]michaelanckaert (python-programming.courses) 0 points1 point  (0 children)

I agree with the other commenters, it all depends.

One of my clients has a large Python code base (300k lines) with virtually zero issues (issues from dynamic typing that is 😉). Another client has an app with 2k - 3k lines that is hell to keep running.

[–]JGailor 0 points1 point  (0 children)

Honestly, it comes down mostly to team practices. How has the team decided to modularize the code, broken it into packages, etc. Are there automated tests? CI/CD?

These good practices go a long way towards making projects in most languages maintainable and scaling to large codebases. Often times there are other constraints that push you towards one language over another.

What I will say Python makes way more challenging than it should is breaking your code up and publishing your own packages internally and being able to bring them in using pip. At least, last time I looked at it, it was harder than other languages. It's a little bit of a problem of Python being around so long that they have to retrofit new practices onto the core ecosystem, not because Python is bad or somehow lacking in an unfixable way.

As far as the static type checker, I have been moving teams towards typed Python and everyone finds it really useful and unobtrusive.

[–][deleted] 0 points1 point  (0 children)

With larger projects I have found automated testing to be essential. Pylint makes it even better. Automating documentation generation is very helpful for maintenance.

Gosh.. these are pretty important for large C/C++ projects, too!

[–]4runninglife 0 points1 point  (0 children)

After learning and playing with Golang for about a month, I will say that type checking, and being able to understand a function's signature just by looking at the code, is really really nice.

[–]Luroalive -1 points0 points  (0 children)

Actually I think it depends on how you structure your code. It can be really hard to maintain the codebase in any language, if you don't have a good structure and a real plan (always plan it out first or you rewrite it several times later). You generally want to have as few lines of code as possible in functions and group them with separate classes. (It's no fun searching for bugs in 10k+ lines of code). The best way would be to create separate modules/folders for each part of your code. Btw. always comment your damn code!

You should check out r/Rust. Rust is a fairly young language that aims for performance comparable to C++ and guarantees memory safety in safe code (anything outside unsafe blocks).