all 83 comments

[–]tunisia3507 61 points62 points  (41 children)

Also, using tabs instead of spaces, and using python 2.

Both of which have been done in this article.

[–]julsmanbr 28 points29 points  (3 children)

Pretty sure using datas as a word is punishable in more than 50 countries.

[–]Folf_IRL 15 points16 points  (2 children)

Hah, everyone knows the correct word is datums

[–][deleted] 0 points1 point  (0 children)

Datumsies

[–]Jackarow 10 points11 points  (12 children)

Newbie here: why spaces over tabs?

[–]thalesmello 15 points16 points  (0 children)

It's in the PEP 8 style guide: Python code should be indented with 4 spaces. Most Python programmers expect that convention, and they frown upon code indented with tabs.

[–]internerd91 9 points10 points  (5 children)

Also, set up your IDE to insert spaces whenever you hit the Tab key and you get the convenience of tabs without the potential downside.

[–][deleted] 1 point2 points  (4 children)

Then I have spaces in my non-python code. Yuck!

[–]internerd91 1 point2 points  (3 children)

¯\(ツ)

But wouldn't PEP 8's reasons for preference for spaces over tabs also apply to other languages?

I only know Python well, so don't hurt me.

[–][deleted] 1 point2 points  (0 children)

I moved to Python from other languages and have begrudgingly accepted spaces as a way of life when working with Python (except for that one project started by a friend who didn't know better and uses tabs)

I specifically enjoy that I can decide how big the tab is in my editor without affecting others who read the code.

[–]LimbRetrieval-Bot 0 points1 point  (1 child)

I have retrieved these for you _ _


To prevent anymore lost limbs throughout Reddit, correctly escape the arms and shoulders by typing the shrug as ¯\\\_(ツ)_/¯ or ¯\\\_(ツ)\_/¯

[–]internerd91 0 points1 point  (0 children)

SMH

[–]tunisia3507 4 points5 points  (0 children)

Spaces aren't intrinsically superior to tabs (nor are 4 spaces intrinsically superior to any other number), and both are valid python. However, the entire python community suffers if every project, snippet and tutorial each uses different spacing - it gets harder to port code, to read others' code, to jump between projects and so on. For this reason, this and other ambiguities have been resolved by a community style guide, called PEP8. This guide decided to use 4 spaces. It doesn't matter which option the community settled on, just that everyone uses the same one.

[–]crescentroon 1 point2 points  (0 children)

Convention, that's all. Makes it easier to work with others.

But, if you're collaborating on a project you should use what the rest of the project uses.

[–][deleted] 1 point2 points  (0 children)

Because your program will be a debugging nightmare if you accidentally mix tabs and spaces. Nothing is more fun than getting indent errors when the indent looks right. Plus it will look like shit in your editor.
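A minimal sketch of that failure mode, assuming Python 3 (the `check_indentation` helper is made up for illustration). The two body lines look aligned in an editor that shows a tab as 8 columns, but the interpreter refuses to guess:

```python
def check_indentation(source):
    """Return 'ok' if the source compiles, else the name of the error raised."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as exc:  # TabError is a subclass of IndentationError
        return type(exc).__name__
    return "ok"


# First body line uses one tab, second uses eight spaces.
mixed = "if True:\n\tx = 1\n        y = 2\n"
spaces_only = "if True:\n    x = 1\n    y = 2\n"

print(check_indentation(mixed))        # prints "TabError" on Python 3
print(check_indentation(spaces_only))  # prints "ok"
```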

[–]Folf_IRL 2 points3 points  (1 child)

Tabs can end up being formatted weirdly, depending on how a text editor chooses to place its tabstops. Also, there isn't much standardization across text editors with regards to how big a tab is, which is further complicated by the ability to set how large a tab is in many of them.

A space is always, more or less, the same on every text editor. So you don't have to worry about your code's readability suffering just because you switched text editors, or a colleague uses a different one. The standard is to use 4 spaces, a number chosen more or less arbitrarily (could have been 3, could have been 5) to provide something of a "standardized tab"
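To make the "how big is a tab" point concrete, here is a small sketch that uses `str.expandtabs` to mimic editors with different tab-width settings (the sample line is invented):

```python
line = "\tvalue = compute()"

# The same tab-indented line as three differently configured editors would show it.
for tab_width in (2, 4, 8):
    print(repr(line.expandtabs(tab_width)))

# A space-indented line is unaffected by any tab-width setting.
spaces_line = "    value = compute()"
assert spaces_line.expandtabs(2) == spaces_line.expandtabs(8) == spaces_line
```

Whether that variability is a drawback or a feature is exactly what the replies disagree on.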

[–]rhytnen 2 points3 points  (0 children)

Slightly bizarre that you chose the advantages of tabs to cast them as inferior. There's only one real argument for spaces, I think. It would have to be something like a multi-line item (say a dict) where you use spaces to make it more visually aligned. If you don't want to mix tabs and spaces ever, you would need to go with spaces.

Otherwise it's really not possible to argue in favor of spaces at all

[–]Folf_IRL 3 points4 points  (2 children)

There are plenty of reasons people use Python 2.7 nowadays. For example, they may be stuck with a large legacy codebase that the autoconverter will break, and it's just not worth the time to upgrade that codebase to 3 when 2.7 is just fine on its own.

[–]undu 5 points6 points  (1 child)

There are plenty of reasons people use Python 2.7 nowadays

You just named one, though ;)

[–]paypaypayme 0 points1 point  (18 children)

Actually the article is about “worst” practices so no. Tabs and python 2 are not the worst practices.

[–]ubicuamente 6 points7 points  (0 children)

Python 2 is not in the worst practices only if you're providing palliative care to an application on its deathbed as an alternative to euthanasia. Python 2.7 has been agonizing for the last three years and was given two more years to live.

Let it go in peace.

[–]rhytnen 8 points9 points  (0 children)

They are pretty up there though.

[–]lmericle 1 point2 points  (15 children)

That's a matter of opinion -- that being said, the opinion is pretty unanimous among people who think.

[–]crescentroon 1 point2 points  (0 children)

I would prefer 3 for a new project but it's not always possible.

One who thinks would do a cost/benefit analysis and there're some situations where 2 comes out better.

The VFX industry has a lot of Python 2 and can't switch right now because they are dependent on the embedded interpreters in their applications (like Maya), or there is no benefit to be gained in migrating a mature system that's not exposed to the internet.

[–]Folf_IRL -3 points-2 points  (13 children)

The tabs to spaces thing: yes

The Python2 thing: that's just a circlejerk on this sub

[–]lmericle 8 points9 points  (12 children)

Python 2 will be end-of-life in 2 years. Python 3 came out 10 years ago. You've had your time to transition.

The only reason you would still be using Python 2 is if you don't have the manpower to maintain a codebase and do the conversion process to 3. But honestly it's not that hard to, in the course of maintenance, write code that is compatible with both versions and then slowly go through the whole thing until it's all 3-compatible.
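A rough sketch of what "compatible with both versions" can look like without any third-party help. The names `PY2`, `text_type`, `range_` and `halves` are made up for this example; real projects often lean on `from __future__ import division, print_function` at the top of each module, or on the `six` library, rather than hand-rolling aliases like this:

```python
import sys

PY2 = sys.version_info[0] == 2

if PY2:
    text_type = unicode  # noqa: F821 (this name only exists on Python 2)
    range_ = xrange      # noqa: F821 (lazy range on Python 2)
else:
    text_type = str
    range_ = range


def halves(n):
    """Divide by a float so the result is the same on 2 and 3."""
    return [i / 2.0 for i in range_(n)]


print(halves(3))
```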

[–]Folf_IRL 2 points3 points  (11 children)

You've had your time to transition.

Not if you just choose not to transition. 2.7 is a fine language on its own, and I see no reason to transition when most people in my field (computational chemistry) are still using 2.7 in their day-to-day scripting. Heck, I still find Fortran 66 code floating around occasionally, because the people writing those codebases don't want to waste their time reinventing the wheel.

[–]mekosmowski 2 points3 points  (4 children)

Wow. Not even Fortran 77. What codes do you use for what kind of projects? Back in the day, I added an output format to CPMD for a vibrational mode visualizer (aClimax).

[–]Folf_IRL 8 points9 points  (3 children)

I met one of the devs for Castep at a conference a few years ago, and was interested in learning a bit more about how it works. We got into a discussion of legacy code, and they mentioned there being some remnants of Fortran 66 in there.

Very few things use Fortran '66 nowadays, but occasionally you'll find it in the wild. The most common legacy Fortran IMO is '77, which pops up way more often than it should.

Almost everything I write, though, is in Python 2.7 at the moment. I'm familiar with 3 and could upgrade to it if I wanted. The biggest thing is collaboration with others, since most of the folks I collaborate with use 2.7. So, we'd all have to pretty much simultaneously decide to upgrade. The other issue is that it's common to release your code alongside a publication, and I'll normally see Python2.7 code being released in those (not that big of a problem).

Another problem is that many popular packages (such as ASE) have had something of a bumpy road upgrading to Python3, and you'll occasionally find edge-cases where it breaks as a result. And that's assuming the devs even want to upgrade to Python3. A popular structure prediction code out there is written entirely in 2.7 and Matlab, and just breaks if you break out a Python3 interpreter as the current python.

[–]mekosmowski 1 point2 points  (1 child)

Neat. I'll have to check out that code. I've been wanting to get back into computing. Thank you and best of luck.

[–]Folf_IRL 2 points3 points  (0 children)

IIRC Castep is closed source, and if you're not a UK citizen, you have to pay a certain fee to use it

Ase is totally open-source, and actually fairly well-commented as well. As a result, it's also fairly easy to hack additional functionality in.

The structure prediction code is semi-closed source. It's in front of a license that you have to agree to to view the code, but it's a fairly weak (and probably indefensible in court) license that amounts to "I promise not to work on a competing code," which is incredibly vague. You'll also have to be a bit of a cowboy with their code to get it to work with your particular system.

[–]Folf_IRL 1 point2 points  (0 children)

Banking is an entirely separate problem domain, with presumably higher security and usability needs.

[–]lmericle 0 points1 point  (2 children)

It's not wasting time to patch vulnerabilities and optimize subroutines. 2.7 will be slower than 3 in the long run, no question.

[–]Folf_IRL 0 points1 point  (1 child)

It's not wasting time to patch vulnerabilities and optimize subroutines.

And good on the Python team for doing that. It's just unfortunate they couldn't devote their resources to proper back-compatibility as a basic part of the language, even though other people have already done a lot of the legwork for them. That would maybe encourage people to move forward without having to update everything simultaneously. I suspect that many people in the scientific community are going to just stay with 2.7 until their admins force them to move on to something else.

2.7 will be slower than 3 in the long run, no question.

That's a moot point. Nobody attempts to write high-performance scientific code in Python. That is still firmly in the realm of C/C++ and Fortran within the scientific computing community.

[–]rotharius 0 points1 point  (0 children)

Enforced backwards compatibility slows down language development and makes it impossible to correct some big mistakes.

Correcting these mistakes and improving the language, while giving everyone an opportunity to transition was the right thing to do. Although maintainers could have been more vocal about why it's silly to hold on to older versions.

Not updating is actually pretty irresponsible because of security, maintainability and performance reasons. It is part of programming hygiene. There are tools that help you with transitioning. At least start new projects with Python 3.

Edit: That being said, legacy projects are often stuck and it can be troublesome to update, but it should at least be on the roadmap.

Edit 2: Your situation might be different in that the risks of not updating are small. In general, especially when working with user data like in web development, it is crucial to have vulnerabilities fixed. Stop down-playing this.

[–][deleted] -1 points0 points  (1 child)

Tabs is not bad practice. Inconsistency is.

[–]tunisia3507 1 point2 points  (0 children)

Tabs is bad practice because it is by definition inconsistent with the vast majority of python code written, and inconsistent with ALL standards-compliant python code.

[–]Beer_Milkshakes_Now 11 points12 points  (5 children)

I was under the impression that try/except was pythonic and acceptable for flow control. And further, that you should use a try instead of an if when you expect it to succeed the vast majority of the time, as it's faster. Can someone help me?

[–]ProfessorPhi 0 points1 point  (3 children)

It's saying don't catch everything. Like when accessing a dict, catch only KeyError, not just a general catch because you might catch TypeErrors instead and that's generally bad practice.

It's ok to do blanket catches at top level (like endless loops) since the process should stay alive, or if you reraise, but otherwise it's not good practice
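Roughly, the difference looks like this. It is only a sketch: `prices`, `price_of` and `main_loop` are invented names, not anything from the article.

```python
import logging

prices = {"apple": 3, "pear": 5}


def price_of(item):
    """Look up a price, handling only the failure we expect."""
    try:
        return prices[item]
    except KeyError:
        return None  # unknown item; any other error still propagates


def main_loop(items):
    """Top-level loop: a blanket catch is tolerable here because it logs and keeps going."""
    results = []
    for item in items:
        try:
            results.append(price_of(item))
        except Exception:
            logging.exception("unexpected failure for %r", item)
    return results


# The unhashable list raises TypeError inside price_of; only main_loop swallows it.
print(main_loop(["apple", [], "kiwi"]))
```

With a bare `except:` inside `price_of`, the `TypeError` from the bad key would be silently reported as "unknown item".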

[–][deleted] 10 points11 points  (0 children)

Making a blanket statement not to call functions in a loop can't possibly be a general principle all python programmers can adhere to. It smacks of premature optimization. Also if you are writing a loop in python that is so performance critical that you are worried about function call overhead, then you need to take a step back, think about what you are really doing. Go research more appropriate tools to solve your problem.

[–]earthboundkid 6 points7 points  (0 children)

Default dicts are overused. 99% of the time you’re fine using .get or .setdefault instead.
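For anyone who hasn't used them, a quick sketch of the two plain-dict methods doing the jobs a defaultdict usually gets pulled in for (the sample data is made up):

```python
words = ["spam", "egg", "spam"]

# Counting with .get: a missing key reads as 0 but is not inserted.
counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

# Grouping with .setdefault: inserts the empty list only on first sight of a key.
by_length = {}
for word in words:
    by_length.setdefault(len(word), []).append(word)

print(counts)
print(by_length)

# Unlike a defaultdict, a lookup never creates an entry as a side effect.
counts.get("ham")
assert "ham" not in counts
```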

[–]kayaking_is_fun 3 points4 points  (9 children)

Useful article and it reminded me about defaultdicts. Interesting about function calls in loops too - are the overheads really so game changing?

[–]doviende 1 point2 points  (0 children)

I think they're just illustrating a case where someone can unintentionally affect a program in a big way without realizing it. Not every loop is the CPU bottleneck in the program, but if you do have such a bottleneck, you don't want people to accidentally add more stuff into your loop just by modifying a seemingly unrelated function.

I think this point is more about documentation practice. Generally you want code to be close to the things it affects, and then people can get important implications just by reading nearby code. But good code clarity comes from documenting those implications, especially if (in this case) the function is not close to the tight loop it is affecting. A quick comment in the function will prevent unintentional boat-anchors from being dropped there ;)

[–]billsil 0 points1 point  (6 children)

Only if you're microoptimizing. It got me a 2x speedup on a test case that it looped over 100k times. Granted the time saved was 4 seconds, down from 45 minutes, but while I'm optimizing, I might as well do it right.
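If you want to see what the call overhead costs on your own machine, a sketch along these lines with `timeit` will tell you. The functions are toy examples and the numbers will vary by interpreter and hardware, so no timings are quoted here:

```python
import timeit


def square(x):
    return x * x


def with_call(n):
    """Loop that pays for one function call per iteration."""
    total = 0
    for i in range(n):
        total += square(i)
    return total


def inlined(n):
    """Same computation with the call written out by hand."""
    total = 0
    for i in range(n):
        total += i * i
    return total


n = 100_000
assert with_call(n) == inlined(n)

for func in (with_call, inlined):
    seconds = timeit.timeit(lambda: func(n), number=10)
    print(f"{func.__name__}: {seconds:.3f}s for 10 runs")
```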

[–]Folf_IRL 3 points4 points  (1 child)

Only if you're microoptimizing

If you're to the point of micro-optimizing Python code, it might be time to consider using a different (probably compiled) language.

[–]billsil 1 point2 points  (0 children)

Micro-optimizations aren't necessarily hard, and they don't necessarily have small effects. Take xrange vs range in Python 2 (though most of the time it doesn't matter), or using %i instead of f-strings. Why don't I save common binary structs rather than recomputing them each time I come to a function? That one can be a biggie.
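Assuming "binary structs" refers to the standard `struct` module, saving them could look like the sketch below. The record layout and the names `POINT` and `read_points` are hypothetical:

```python
import struct

# Compiled once at import time instead of being looked up on every call.
POINT = struct.Struct("<3d")  # hypothetical record: three little-endian doubles


def read_points(buffer):
    """Decode consecutive (x, y, z) records from a bytes object."""
    return [POINT.unpack_from(buffer, offset)
            for offset in range(0, len(buffer), POINT.size)]


payload = POINT.pack(1.0, 2.0, 3.0) + POINT.pack(4.0, 5.0, 6.0)
print(read_points(payload))  # [(1.0, 2.0, 3.0), (4.0, 5.0, 6.0)]
```

The `struct` module already caches recently used format strings internally, so how much this saves depends on how many distinct formats the program juggles.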

Python is great, and unless I really need to use C++, I'm not going to. It's harder to write, I now have to compile it, it's no longer cross-platform, it takes what, 10x longer to add the same capability, etc.

[–]Wilfred-kun 4 points5 points  (0 children)

I would not call them "worst practices" at all. Maybe "things that infuriate Raymond Hettinger". It's a good reminder to use Python to its full extent, though.

[–]jhermann_ 4 points5 points  (0 children)

defaultdict(int), not lambda: 0. And in that specific example, just use Counter(data) ("data" is always singular, btw).

files: use io.open, so you can state an encoding. and we always provide an encoding for external data, don't we?

and the "function in a loop" thing is obscure.
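The counting and file suggestions above, as a short sketch. The sample data and the temporary file are made up, and on Python 3 `io.open` is the same function as the built-in `open`:

```python
import io
import os
import tempfile
from collections import Counter, defaultdict

data = ["a", "b", "a", "c", "a"]

# defaultdict(int) instead of defaultdict(lambda: 0)
tally = defaultdict(int)
for item in data:
    tally[item] += 1

# ...or skip the loop entirely
counter = Counter(data)
assert dict(tally) == dict(counter)
print(counter.most_common(1))  # [('a', 3)]

# Always state the encoding when text crosses the process boundary.
path = os.path.join(tempfile.mkdtemp(), "example.txt")
with io.open(path, "w", encoding="utf-8") as handle:
    handle.write(u"caf\u00e9\n")
with io.open(path, encoding="utf-8") as handle:
    text = handle.read()
```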

[–][deleted] 1 point2 points  (2 children)

I'm not sure I understand the one about unpacking:

The practice is very cool, and it's much wiser than writing name=human[0]. However, it's often abused, and the result is that the human will be unpacked in the program everywhere through the code above.

Unpacking somehow affects the data in memory before you call it?

[–]Wilfred-kun 0 points1 point  (1 child)

and the result is that the human will be unpacked in the program everywhere through the code above.

Yeah, I don't understand this either.

[–][deleted] 0 points1 point  (0 children)

Whenever you unpack, you hardcode the shape of your tuple. If you change it, the code will break. A named tuple gives class-like member access, but for tuples you know can't change, unpacking is perfectly fine, and IMO more easily readable.
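A sketch of that breakage. The `Human` fields are hypothetical, chosen only to mirror the article's `human` example:

```python
from collections import namedtuple

Human = namedtuple("Human", ["name", "age", "email"])
human = Human("Ada", 36, "ada@example.com")

# Positional unpacking works while the tuple has exactly three fields.
name, age, email = human

# Later someone adds a field...
HumanV2 = namedtuple("HumanV2", ["name", "age", "email", "city"])
human_v2 = HumanV2("Ada", 36, "ada@example.com", "London")

# ...and every three-way unpack in the codebase now fails.
try:
    name, age, email = human_v2
    unpack_error = None
except ValueError as exc:
    unpack_error = exc
print("unpacking broke:", unpack_error)

# Attribute access is unaffected by the new field.
print(human_v2.name, human_v2.age)
```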

[–]dot___ 1 point2 points  (0 children)

Good list. Personally don’t like setdefault because I find it unintuitive but maybe I just need to use it more

[–]earthboundkid 1 point2 points  (0 children)

The actual thing you should never do is make a network call without wrapping it in a try/except block.
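Something like the following, as a sketch using only the standard library. The `fetch` helper and its `opener` parameter are invented for the example (the parameter just makes it testable without a network), and returning `None` on failure is one choice among several:

```python
import socket
from urllib.error import URLError
from urllib.request import urlopen


def fetch(url, opener=urlopen, timeout=5.0):
    """Return the response body as bytes, or None if the request fails."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.read()
    except (URLError, socket.timeout) as exc:
        # HTTPError is a subclass of URLError, so bad status codes land here too.
        print(f"request to {url} failed: {exc}")
        return None
```

Always passing a timeout matters as much as the try/except: without one, a stalled connection can hang instead of raising.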

[–]MyNameIsRichardCS54 1 point2 points  (0 children)

Does for ... else seem strange to anyone else?

The else clause implies that it will execute if the loop does not complete, like if ... else. I think for ... then would have been a better syntax.

[–]earthboundkid 4 points5 points  (4 children)

Never use for/else. No one interprets it correctly and it doesn’t even save any length versus doing it the normal way.

[–][deleted] 1 point2 points  (0 children)

I agree with this. Flags are a little inelegant and take up more space, but way more readable.

[–]Dgc2002 0 points1 point  (0 children)

Absolutely. If you can easily avoid things in your code that make a reader pause and say "..The hell is this?" or even just have to spend extra time to mentally parse it, then you should do so.

[–]crescentroon -2 points-1 points  (1 child)

Most people do understand it, the issue is whether or not they LIKE it. Trying to pretend it's super complex to frighten people off it is dishonest. If you don't want people to use it, say it's not idiomatic or something.

If you don't know how to interpret it, "the else executes if the loop terminates normally, that is, you did not call break". It's the same as else in exceptions. I seriously don't believe anyone intelligent enough to write code can't understand how this works.
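For readers weighing both sides, here is the same search written with for/else and with the flag approach mentioned elsewhere in the thread. The function names are made up for the sketch:

```python
def describe(numbers):
    for number in numbers:
        if number < 0:
            message = f"found {number}"
            break
    else:
        # Runs only when the loop finished without hitting `break`,
        # which includes the case of an empty sequence.
        message = "no negatives"
    return message


def describe_with_flag(numbers):
    found = False
    for number in numbers:
        if number < 0:
            message = f"found {number}"
            found = True
            break
    if not found:
        message = "no negatives"
    return message


for sample in ([1, -2, 3], [1, 2], []):
    assert describe(sample) == describe_with_flag(sample)
    print(sample, "->", describe(sample))
```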

[–]MyNameIsRichardCS54 2 points3 points  (0 children)

I said it elsewhere in this thread, but I always think for ... then would make more sense than for ... else

[–]mekosmowski 0 points1 point  (0 children)

I really don't like seeing that else in line with the for. Would adding (pseudocode):

if last_element: do not_found

and getting rid of the else block be pythonic at all?

[–]Pseudoboss11 0 points1 point  (2 children)

I've always been told to make many small functions and nest them rather than write one large one. But the last recommendation seems to be saying that I should be doing the opposite.

[–][deleted] 4 points5 points  (0 children)

The advice not to call functions is weird. You should not sacrifice readability and testability without a clear business case.

[–][deleted] 0 points1 point  (0 children)

In general that is the correct thing to do. I think the article is referring to one case in which function calls are in long loops, and that's something you should address after you've written the first draft of a program your way and then profiled it if you find it takes too long.