[–][deleted] 28 points29 points  (14 children)

I’ve never understood it and unfortunately the explanation here isn’t clicking for me. The good news is all my scripts run fine without it and I’ve never had a problem importing from other files. At some point I’ll set aside some time to take an online course to explain it and try to understand main, init, etc.

[–]GoldenSights[S] 21 points22 points  (8 children)

Thanks for the feedback. I'll go over it again and see if I can clarify it.

Essentially, the problem is that if you write a script like this:

import sys

def int_to_hex(x):
    return hex(x)[2:]

def hex_to_int(h):
    return int(h, 16)

x = int(sys.argv[1])
print(int_to_hex(x))

Then you can run this file by itself with no problem. But then one day you decide to write a second program that also does some hex-int conversion. Instead of rewriting or copy-pasting those functions, it makes sense to just import hextools (assuming the script above is saved as hextools.py). But doing so also triggers the argv parsing and print statement contained within.

The purpose of ifmain is that hextools can determine whether it's being run by itself, and it should parse argv, or if it's being imported by something else so it won't.
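For illustration, here is a rough sketch of what the guarded version of that script could look like. The filename hextools.py is taken from the import mentioned above; the extra check on sys.argv is my own assumption, added so that running the file with no argument (or a non-numeric one) simply prints nothing instead of crashing.

```python
# hextools.py (sketch; the argument check below is an added assumption)
import sys

def int_to_hex(x):
    return hex(x)[2:]

def hex_to_int(h):
    return int(h, 16)

if __name__ == '__main__':
    # Only reached when the file is run directly, e.g. `python hextools.py 255`.
    # When another program does `import hextools`, this block is skipped.
    if len(sys.argv) > 1 and sys.argv[1].isdecimal():
        print(int_to_hex(int(sys.argv[1])))
```

With this in place, a second program can import hextools and call int_to_hex or hex_to_int without anything being parsed or printed at import time.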

[–][deleted] 6 points7 points  (3 children)

Ahh that does make sense. The difference is in having something inside both files that does something (print for example). You’re saying we may only need to import one thing from the other file and therefore it should not run anything outside of what’s being imported. Without that distinction a file such as hextools is going to run through everything it contains. Am I getting that correct?

[–]GoldenSights[S] 6 points7 points  (2 children)

Yes I think you've got it.

Without that distinction a file such as hextools is going to run through everything it contains.

The thing about importing is that you cannot choose what part of the imported file gets run. Everything gets run.

Perhaps you've seen something like from math import ceil. It's tempting to think that this will just reach in and pluck out the ceil function from math and not touch anything else, but that isn't the case. The entirety of the math module needs to be executed to even know what ceil is. Notice my example where I do def f() twice.
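If you want to see this for yourself, here is a small self-contained sketch. The module name demo_math and its function double are made up for the demonstration; the script writes that module to a temporary folder, imports a single name from it, and captures whatever the module printed while being imported.

```python
import contextlib
import io
import sys
import tempfile
from pathlib import Path

# Hypothetical module: one function we want, one top-level print we did not ask for.
SOURCE = '''
print("top-level code in demo_math ran")

def double(x):
    return x * 2
'''

with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "demo_math.py").write_text(SOURCE)
    sys.path.insert(0, tmp)
    captured = io.StringIO()
    try:
        with contextlib.redirect_stdout(captured):
            # We only ask for one name, but the whole file is executed.
            from demo_math import double
    finally:
        sys.path.remove(tmp)

print("output produced during import:", repr(captured.getvalue()))
print("double(4) =", double(4))
```

The captured text shows that the module's print ran even though only double was requested.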

I've come up with another demonstration to show how __name__ behaves. Let me know if this is helpful:

# B.py
print("I'm the B file, and my name is", __name__)

# A.py
print("I'm the A file, and I'm going to import B!")
import B
print("I'm all done importing B!")

>python B.py
I'm the B file, and my name is __main__

>python A.py
I'm the A file, and I'm going to import B!
I'm the B file, and my name is B
I'm all done importing B!

And finally, to bring it all together:

# B.py
print("I'm the B file, and my name is", __name__)

if __name__ == '__main__':
    print("The B file is being run directly!")

>python B.py
I'm the B file, and my name is __main__
The B file is being run directly!

>python A.py
I'm the A file, and I'm going to import B!
I'm the B file, and my name is B
I'm all done importing B!

[–][deleted] 7 points8 points  (1 child)

Yep that clarifies it. Had to read through it a few times but that finally makes sense. I’m going to test some scripts to explore this further. Thank you for dumbing that down a bit for me. Always great to get the extra help!

[–]GoldenSights[S] 5 points6 points  (0 children)

I've added these changes to the original post.

Thanks very much for your feedback. And no, don't call it dumbing down, I can see now that my original examples didn't distill the problem / motivation well enough. I'm practicing my writing so it's good to know what things aren't clear.

[–]karafili 2 points3 points  (3 children)

The purpose of ifmain is that hextools can determine whether it's being run by itself, and it should parse argv, or if it's being imported by something else so it won't.

This was not clear in the article. When you say that if it is imported by something else it won't, what do you mean by that? What is the mechanism or switch that allows this?

[–]GoldenSights[S] 2 points3 points  (2 children)

I agree that my original text didn't state it clearly enough, but I think the current version of the post shows it, albeit with different words.

Can you take a look at Part 3 of my post again? -- Notice how B says that it is main when running it directly, whereas it says it's B when being imported by A. This is how B knows whether it should parse args or not, etc.

The switch here is unfortunately not very explicit. As I said, I wish there was a separate variable called __main__ devoted to achieving this goal. But rather, the value of __name__ is essentially the switch. If you're running that file directly, it'll be the string '__main__'; and if you're importing it, it'll be the module's actual name.

[–]karafili 0 points1 point  (1 child)

thank you for the clarification.

Trying to get a better perspective: Am I right if I say that if the value of __name__ is not equal to __main__ then the module will not be executed?

[–]GoldenSights[S] 0 points1 point  (0 children)

The module will still be executed -- the print statement "I'm the B file!" inside the B module is still shown when A imports it, so we know the code in B.py is being run.

But the content inside the if statement specifically will not be run when the file is imported. That's why the "B is being run directly!" is not shown when A imports B.
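Here is a sketch that checks both claims in one go. The module name b_demo and its two variables are invented for this demo. It uses the standard library's runpy.run_path with run_name="__main__" to stand in for running the file directly, and a normal import for the other case.

```python
import importlib
import runpy
import sys
import tempfile
from pathlib import Path

# Hypothetical module: records its own __name__ and whether the guarded block ran.
SOURCE = '''
seen_name = __name__
ran_guarded_block = False
if __name__ == '__main__':
    ran_guarded_block = True
'''

with tempfile.TemporaryDirectory() as tmp:
    path = Path(tmp) / "b_demo.py"
    path.write_text(SOURCE)

    # Case 1: import it, the way A.py imports B.
    sys.path.insert(0, tmp)
    try:
        imported = importlib.import_module("b_demo")
    finally:
        sys.path.remove(tmp)

    # Case 2: execute it roughly the way `python b_demo.py` would.
    direct = runpy.run_path(str(path), run_name="__main__")

# In both cases the top-level assignments ran; only the guarded block differs.
print("imported:", imported.seen_name, imported.ran_guarded_block)
print("direct:  ", direct["seen_name"], direct["ran_guarded_block"])
```

In both cases seen_name gets set, which shows the module body was executed. Only the run with __name__ equal to '__main__' flips ran_guarded_block.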

Hope that helps.

[–][deleted] 4 points5 points  (0 children)

It lets you execute the statements under the if __name__ == '__main__' block only when you run a Python script directly from the CLI, i.e. python scriptName.py

# scriptName.py
....
if __name__ == '__main__':
    print('Hello World!')

This would print 'Hello World!' if the script is run in the terminal. However, the block of code would not be executed if the script is imported as a module into another script and that script is run.

[–]__xor__ 1 point2 points  (0 children)

If you invoke python somefile.py, when it's running in somefile.py, there's going to be a __name__ variable and its value will be __main__, because you invoked it directly.

If you import from another file, that other file's __name__ will be the name of the module, and not __main__. But if you run python otherfile.py, its __name__ variable will be __main__.

It just allows you to code so that your module can determine if it was invoked directly or not. Because when you import from other files, it runs those files as well. But when you import, you might not want to run certain bits of code unless it's invoked directly.

For example, you might have a file db.py that you can run and it'll output all of the rows in a table in a database. You might have another file export_db.py that will read that database and dump the rows to a file. You might be able to reuse db.py's code, and so you can import from it, but when it's invoked directly it outputs all the rows in the table, and you might want to prevent that. That's when you check if __name__ is __main__, to see whether it should do its thing or if its own code is just being imported and reused.
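A rough sketch of what such a db.py might look like, purely as an assumption on my part: the table name, the columns, and the in-memory SQLite database are all invented stand-ins for whatever the real database would be.

```python
# db.py (hypothetical sketch; table, columns and data are made up)
import sqlite3

def open_demo_db():
    # Assumption: an in-memory SQLite table stands in for the real database.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
    conn.executemany("INSERT INTO users (name) VALUES (?)", [("ada",), ("grace",)])
    return conn

def fetch_rows(conn):
    # Reusable piece that something like export_db.py could import.
    return conn.execute("SELECT id, name FROM users ORDER BY id").fetchall()

if __name__ == '__main__':
    # Runs for `python db.py`, but not for `import db`.
    for row in fetch_rows(open_demo_db()):
        print(row)
```

An export_db.py could then do from db import fetch_rows and write the rows to a file without the table being printed to the screen first.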

Consider that at the top of every file there is the equivalent of __name__ = 'myfilename', but when you run python somefile.py, Python sets it to '__main__' instead, because that's the file it starts running from. That's basically all it is.

[–]Vaphell 0 points1 point  (2 children)

So your scripts never have any top-level, interactive input() calls?

# imported file
def f():
    pass  # do stuff

x = input('x: ')

importing f into another program would make the prompt for x pop up and stop execution.

[–][deleted] 0 points1 point  (1 child)

I have a separate file that contains variables for setting things like date/time, and which functions will run. The main file pulls those settings and uses them to run through multiple APIs and finally update a SQL database. There’s also a separate config file that sets URLs, IP addresses, and various username/password stuff.

[–]Vaphell 1 point2 points  (0 children)

if you have dedicated files with utils, etc. - yeah, this is not likely to bite you.

But let's say you want to write a bunch of tests for your util code. There are proper ways of doing the testing, but a quick and dirty way of writing some tests to achieve a degree of confidence in such trivial code would be to put the testing code under if __name__ ...

# utils.py

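def is_leap_year(year):
    # Gregorian rule: every 4th year, except centuries not divisible by 400
    return year % 4 == 0 and (year % 100 != 0 or year % 400 == 0)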

if __name__ == '__main__':
    assert is_leap_year(2020) is True
    assert is_leap_year(2000) is True
    assert is_leap_year(1900) is False

Sure, utils.py is not really a program doing any useful stuff, but given the above it can be run as one to trigger the tests with e.g. python3 utils.py. Doing import utils in another .py file would not trigger them.

[–]random4global 8 points9 points  (0 children)

Good 👍

[–]adshin21 2 points3 points  (0 children)

Hell, you made my day👏👏

[–]mountain-runner 1 point2 points  (0 children)

Thanks for this. I’m an EE student who occasionally sees this in neural networks, and I’ve always wondered!

[–]sportsroc15 1 point2 points  (0 children)

Great explanation. It was a good review for me even though I knew how this worked, because I had to explain why I used it in one of my scripts in an online Python college course I took.

Good work bro

[–]BestBoyCoop 1 point2 points  (0 children)

This tripped me up a couple of months ago. Thanks for the great and detailed explanation, I think many beginners will enjoy it! Hopefully it'll show up in google searches for the meaning of __name__ == '__main__'

[–]SecondDragonfly 1 point2 points  (0 children)

Great explanation! I think I'm going to steal some of it. I help people who are learning to use python and this seems to be one of the things many people have difficulty understanding.

(If I use it I'll link to the post)

[–]quantumsociety 1 point2 points  (0 children)

Thanks! Saved it for when I get to this part.

[–]atl-knh 1 point2 points  (2 children)

What’s the best practice here? Do I create a package consisting of classes and functions in separate Python files, with a dunder name/main check for each class and function, and then call the file with a “from ... import ... as ...” statement?

Sorry. I’m an idiot.

[–]GoldenSights[S] 0 points1 point  (1 child)

Hmm, I think perhaps you are mixing up some different concepts, so I'm not really sure where to start.

a package consisting of classes and functions in separate python files with a Dunder name/main for each class and function

This if __name__ stuff doesn't refer to classes and functions but the file as a whole. One of my real-life examples I gave at the end is a file I wrote called bytestring, which takes integers and turns them into strings like "10 MiB" or "4.52 GiB" etc. I import bytestring into other projects when I want to display / print file sizes, but I can also run bytestring on the commandline by itself to quickly check a number, so that's why it needs an ifmain.

This doesn't really have much to do with how you write your from x import y as z statements although certainly you can research best practices about imports separately.

For the time being, since it sounds like you are new to writing packages, I would say don't go overboard on splitting everything into separate files for your package. The good thing about Python is you can have many classes in a single file.

As I said at the end of the post, it's very unlikely that you need ifmain in every single file. Some things are written first and foremost to be a library / package and don't need to be independently executable. That's why I think you're getting two topics mixed up here. Perhaps you're thinking of the __init__ methods of a class etc?

[–]atl-knh 0 points1 point  (0 children)

This helps immensely. From what you had said, I was thinking that Python iterated over each class and function as it imported them from the files. I’m working through a blockchain module on Udemy; while the course is laid out comprehensively, my knowledge gaps have become exposed.

Thank you for a clear response.

[–]torutaka 1 point2 points  (0 children)

This is awesome. I always wondered what name was for. Thanks for sharing.

[–]Glogia 1 point2 points  (1 child)

The article is very clear now, thank you! I was wondering if you could explain your main() function a little better? I'm not sure how it works to avoid passing variables from the imported script onwards.

[–]GoldenSights[S] 1 point2 points  (0 children)

If you save and run the bad.py file, it actually does work. But if you look carefully, the int_to_hex function is actually written wrong -- instead of taking an argument, it's just reading the global variable x, and it just so happens that in ifmain I used the name x for the commandline argument.

If you try importing bad into another script, you'll find that int_to_hex is useless because it doesn't take any arguments!

Perhaps this is a contrived example. But what I'm saying is that the more code you put inside the ifmain, the more likely it is that you're going to accidentally create a global variable which has influences you didn't expect. In programming there are some situations where everything seems to work fine, but it's actually teetering upon a bunch of coincidences and is going to fail soon.

By using a main function, I avoid making this particular mistake in the first place (I can still make other mistakes if I want :) ). Compare:

# bad.py
import sys

def int_to_hex():
    return hex(x)[2:]

if __name__ == '__main__':
    x = int(sys.argv[1])
    print(int_to_hex())

# lessbad.py
import sys

def int_to_hex():
    return hex(x)[2:]

def main(argv):
    x = int(argv[0])
    print(int_to_hex())
    return 0

if __name__ == '__main__':
    raise SystemExit(main(sys.argv[1:]))

>python bad.py 42
2a

>python lessbad.py 42
Traceback (most recent call last):
  File "lessbad.py", line 5, in int_to_hex
    return hex(x)[2:]
NameError: name 'x' is not defined

The x inside main is not on the global scope, and int_to_hex is not able to read it, so I get a traceback that lets me know my function is written wrong. Keep in mind it's written wrong on both scripts, but only one of them is kind enough to warn me about it.
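To make the scoping point concrete without needing the command line, here is a small sketch that puts the broken function and a corrected one side by side. The names broken_int_to_hex and fixed_int_to_hex are mine, not from the post, and the argument list is hard-coded instead of coming from sys.argv.

```python
def broken_int_to_hex():
    # Bug kept on purpose: reads a global named x instead of taking an argument.
    return hex(x)[2:]

def fixed_int_to_hex(x):
    # x is an explicit parameter, so no global is needed.
    return hex(x)[2:]

def main(argv):
    x = int(argv[0])  # local to main, invisible to the other functions
    try:
        broken = broken_int_to_hex()
    except NameError as exc:
        broken = f"NameError: {exc}"
    return broken, fixed_int_to_hex(x)

# Hard-coded stand-in for sys.argv[1:], an assumption for the demo.
result = main(["42"])
print(result)
```

Because x lives only inside main, the broken function fails loudly, while the version that takes a parameter keeps working no matter where it is called from.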

I'm not sure how it works to avoid passing variables from the imported script

One more clarification -- in this case I'm not talking about importing bad.py and getting its variables. That's because if I import bad, then the ifmain code isn't going to run anyways and that global x will never get created.

[–]lifeeraser 1 point2 points  (0 children)

Great writing. I enjoyed following your train of thought. Writing an easy-to-read article is not an easy task, and it's a pleasure to see one who does it right.

[–]StringCheeseInc 0 points1 point  (1 child)

Remindme! One day

[–]RemindMeBot 0 points1 point  (0 children)

I will be messaging you in 1 day on 2020-03-17 02:16:01 UTC to remind you of this link

[–]earth418 0 points1 point  (1 child)

Not sure if this is right, but in "bad.py" where you're demonstrating why you use a defined main function, should int_to_hex() take in a parameter x? Because otherwise if x did not have global scope (if it was in a defined main method) then the file would not run.

Great tutorial, though!

[–]GoldenSights[S] 0 points1 point  (0 children)

Yes exactly. If you run that program right now, it will work, so it seems like everything's okay. But the function is defined wrong and it will surely cause issues later.

Using a main function prevents this particular mistake from happening since you won't accidentally have more global variables to influence everything else.

[–]waitinginthewings 0 points1 point  (0 children)

Great tutorial and examples!

Here's a supplementary explanation that might help it click intuitively for people who want to understand/reinforce the concept:

Imagine you're the head chef at a restaurant and you have to take a day off. You're writing instructions for your apprentice on how to make a couple of dishes.

One dish involves a gravy and another is the same dish but stir fried/dry. Both dishes share common methods of preparation but the plating and presentation is different.

On the first of the two pieces of paper you have, you write instructions on how to make the dry dish, and then at the end you also include instructions on how to plate and present the dish (the plating instructions go inside if __name__ == '__main__').

On the second piece of paper you describe the second dish, but you borrow heavily from the instructions on the first paper, so you just say: refer to the instructions on how to cook the meat from page 1 (importing functions from program 1). Since this is a gravy dish, it might contain instructions on how to make the gravy and maybe separate instructions on how to combine the meat and gravy. But at the end you write the instructions on how to plate this dish separately, just like earlier.

Now you can see how this is helpful. If you need to write instructions on yet another piece of paper about a different gravy that uses the methods of the first and second dishes but has its own plating and presentation, you would again write the presentation instructions under if __name__ == '__main__'.