Running Modules

Mon 19 March 2018 by Moshe Zadka

(Thanks to Paul Ganssle for his suggestions and improvements. All mistakes that remain are mine.)

When exposing a Python program as a command-line application, there are several ways to get the Python code to run. The oldest way, and the one people usually learn in tutorials, is to run python some_file.py.

If the file is intended to be usable both as a module and as a command-line script, it will often have

if __name__ == '__main__':
    actually_run_main()

or similar code.

This sometimes has surprising corner-case behavior, but even worse, some_file.py is not looked up on either $PATH or sys.path: its path must be given explicitly. It also changes the default Python path from including the current directory to including the directory that contains some_file.py.
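
The path change can be observed directly. The following is a minimal sketch, not a definitive test: it writes a throwaway some_file.py into a temporary directory, runs it from a different working directory, and reports what the script saw as sys.path[0]. The helper name and the temporary layout are made up for illustration, and it assumes the interpreter is not running with the safe-path option, which suppresses this entry.

```python
import os
import subprocess
import sys
import tempfile


def first_path_entry_when_run_as_script():
    """Run a script by file path and report the sys.path[0] it observed."""
    with tempfile.TemporaryDirectory() as script_dir, \
            tempfile.TemporaryDirectory() as working_dir:
        script = os.path.join(script_dir, "some_file.py")
        with open(script, "w") as handle:
            handle.write("import sys\nprint(sys.path[0])\n")
        # Launch from a *different* directory than the one holding the script.
        result = subprocess.run(
            [sys.executable, script],
            cwd=working_dir,
            capture_output=True,
            text=True,
            check=True,
        )
        # Resolve symlinks so the comparison is stable across platforms.
        return (
            os.path.realpath(result.stdout.strip()),
            os.path.realpath(script_dir),
            os.path.realpath(working_dir),
        )


reported, script_dir, working_dir = first_path_entry_when_run_as_script()
print(reported == script_dir, reported == working_dir)
```

The first comparison should hold and the second should not: the script's own directory, not the directory we launched from, ends up at the front of the path.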

The modern recommended way, of course, is to set entry_points in the setup.py file. When the distribution is installed, a console script is auto-generated and placed in the same directory as the Python interpreter. This means that we need to think carefully about the other things that might have the same name on our $PATH to avoid collisions.
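
To make the mechanism a little more concrete, here is a rough sketch of what an entry_points declaration looks like and how a "module:function" target can be resolved. The package name some_package, the command name some-command and the helper resolve_console_script are all hypothetical; the real generated script is produced by the packaging tools and differs in detail, but it likewise imports the target and calls it.

```python
import importlib

# Hypothetical argument that would be passed to setup() in setup.py.
# "some-command" is the name that lands on $PATH; the right-hand side is
# "<module>:<function>".
ENTRY_POINTS = {
    "console_scripts": [
        "some-command = some_package.cli:main",
    ],
}


def resolve_console_script(target):
    """Import the module named before ':' and return the attribute after it."""
    module_name, _, attribute = target.partition(":")
    module = importlib.import_module(module_name)
    return getattr(module, attribute)


# Demonstrated on a standard-library target, since some_package does not exist.
basename = resolve_console_script("os.path:basename")
print(basename("/tmp/example.txt"))
```

Note that the command name on the left is independent of the package name, which is exactly why collisions on $PATH are possible.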

There is a third way, which is subtle. When Python sees the -m <some name> option, it will look for a module or a package by that name. If it finds a module, it will run it with __name__ being "__main__" in order to trigger the path that actually does something -- again leading to some, if not all, of the issues discussed earlier.

However, if it finds a package, it will run its __main__.py module (still setting __name__ to "__main__") -- not its __init__.py.
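
A small experiment shows which file sees which name. This is a sketch under assumptions: demo_pkg is a throwaway package created in a temporary directory, and that directory is put on PYTHONPATH so that the child interpreter can find it. The package's __init__.py is still imported first, under the package's own name; only __main__.py runs as "__main__".

```python
import os
import subprocess
import sys
import tempfile


def run_package_with_dash_m():
    """Create a tiny package, run it with -m, and return the printed lines."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "demo_pkg")
        os.mkdir(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
            handle.write("print('__init__ sees', __name__)\n")
        with open(os.path.join(package_dir, "__main__.py"), "w") as handle:
            handle.write("print('__main__ sees', __name__)\n")
        # Make the temporary directory importable in the child process.
        env = dict(os.environ, PYTHONPATH=root)
        result = subprocess.run(
            [sys.executable, "-m", "demo_pkg"],
            env=env,
            capture_output=True,
            text=True,
            check=True,
        )
        return result.stdout.splitlines()


for line in run_package_with_dash_m():
    print(line)
```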

This means that at the top of __main__.py we can invert the usual logic:

if __name__ != '__main__':
    raise ImportError("Module intended to be main, not imported",
                      __name__)

from . import main_function
main_function()

This allows running python -m <some package>, but anyone who accidentally tries to import <some package>.__main__ will get an error -- as well they should!
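
The guard can be exercised without installing anything. In this sketch, guarded_pkg is a made-up package written to a temporary directory with the same inverted check in its __main__.py; importing that module in-process should raise the ImportError, and the helper returns the exception's arguments.

```python
import importlib
import os
import sys
import tempfile

GUARDED_MAIN = '''\
if __name__ != "__main__":
    raise ImportError("Module intended to be main, not imported", __name__)

from . import main_function
main_function()
'''


def import_guarded_main():
    """Try to import guarded_pkg.__main__ and return the ImportError args."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "guarded_pkg")
        os.mkdir(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
            handle.write("def main_function():\n    print('running')\n")
        with open(os.path.join(package_dir, "__main__.py"), "w") as handle:
            handle.write(GUARDED_MAIN)
        sys.path.insert(0, root)
        importlib.invalidate_caches()
        try:
            importlib.import_module("guarded_pkg.__main__")
        except ImportError as error:
            return error.args
        finally:
            # Leave the interpreter as we found it.
            sys.path.remove(root)
            sys.modules.pop("guarded_pkg", None)
    return None


print(import_guarded_main())
```

When imported, the module's __name__ is its dotted name rather than "__main__", so the second element of the returned tuple names the offending import.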

Among other things, this means we only care about our sys.path, not our $PATH. For example, this will work the same whether the package is installed to the global Python or --user installed.

Finally, if an entrypoint is desired, one can easily be made to run __main__:

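import functools
import runpy

import toolz

# Run the package's __main__, then discard the returned globals dict
# so the entry point returns None.
entrypoint = toolz.compose(lambda _dummy: None,
    functools.partial(runpy.run_module,
                      "<some package>",
                      run_name='__main__'))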

This uses the standard-library modules runpy and functools, and the third-party module toolz.
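
If the toolz dependency is unwanted, roughly the same entry point can be written as a plain function. This is only a sketch of one alternative, not the approach shown above: make_entrypoint and entry_demo_pkg are hypothetical names, and the demo creates a temporary package purely to have something to run.

```python
import os
import runpy
import sys
import tempfile


def make_entrypoint(package_name):
    """Return a zero-argument callable that runs the package's __main__."""
    def entrypoint():
        # run_module returns the module globals; we deliberately drop them
        # so that the entry point itself returns None.
        runpy.run_module(package_name, run_name="__main__")
    return entrypoint


def demo():
    """Run the entry point against a throwaway package; return its result."""
    with tempfile.TemporaryDirectory() as root:
        package_dir = os.path.join(root, "entry_demo_pkg")
        os.mkdir(package_dir)
        with open(os.path.join(package_dir, "__init__.py"), "w") as handle:
            handle.write("")
        with open(os.path.join(package_dir, "__main__.py"), "w") as handle:
            handle.write("print('ran as', __name__)\n")
        sys.path.insert(0, root)
        try:
            return make_entrypoint("entry_demo_pkg")()
        finally:
            sys.path.remove(root)
            sys.modules.pop("entry_demo_pkg", None)


print(demo())
```

Returning None matters because generated console scripts typically pass the function's return value to sys.exit, and None means a successful exit.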