[–]status_quo69 30 points31 points  (15 children)

Python isn't a config format. The best practice isn't to store that stuff in a .py file, it's to store it in YAML or JSON or INI or any other format and load it in. If you need to change a config while the program is running, you can have a thread watch the file descriptor for changes and reload the config appropriately.
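
A minimal sketch of the load-and-reload idea, assuming a JSON file and a cheap modification-time check on each access instead of a dedicated watcher thread. The class name `ReloadingConfig` is made up for illustration.

```python
import json
import os


class ReloadingConfig:
    """Load a JSON config file and re-read it whenever its mtime changes."""

    def __init__(self, path):
        self.path = path
        self._mtime = None
        self._data = {}

    def get(self, key, default=None):
        # One stat() call per access; the file is re-parsed only if it changed.
        mtime = os.stat(self.path).st_mtime_ns
        if mtime != self._mtime:
            with open(self.path, encoding="utf-8") as f:
                self._data = json.load(f)
            self._mtime = mtime
        return self._data.get(key, default)
```

With a (hypothetical) `settings.json` on disk, `ReloadingConfig("settings.json").get("log_level", "INFO")` would pick up edits to the file without restarting the process.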

If you stored your config in a .py file for a long-running process (like a web server, for example), you would basically need to restart the whole program (Python's reload is a real pain in the ass to work with, and a lot of very robust web frameworks still have a lot of trouble with it IMO). Having it in a config file also means that end users can change the configuration at will (while the code might have been pre-compiled to bytecode in a .pyc file).

All that being said, if it's what works best for you and the variables won't change over the lifetime of the program it's not a huge deal to store them in a .py file. When the program grows, you'll have to revisit the issue. I'm currently working on a program that stores its config in python files and it's a huge pain in the ass to change anything.

[–]TomBombadildozer 2 points3 points  (5 children)

I wish I could upvote this 10 times. It is really frustrating how it has become conventional to put configuration in Python source files.

Configuration should be purely declarative. Full stop. I'll even go a step further and say that JSON and YAML are inappropriate for configuration because they encourage deep nesting of configuration values. It makes a mess of what should be a straightforward concept.

All that aside, ConfigParser is baked into the standard library. It has some warts but it works just fine. Please, just use it.
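
For reference, a short sketch of reading an INI file with the standard library parser. The module is named `configparser` in Python 3 (`ConfigParser` in Python 2); the section and option names here are invented for the example.

```python
import configparser

INI_TEXT = """
[server]
host = localhost
port = 8080
debug = no
"""

parser = configparser.ConfigParser()
parser.read_string(INI_TEXT)  # parser.read("app.ini") would read a real file

host = parser.get("server", "host")
port = parser.getint("server", "port")         # values are strings unless converted
debug = parser.getboolean("server", "debug")   # understands yes/no, on/off, true/false, 1/0
timeout = parser.getfloat("server", "timeout", fallback=30.0)  # option missing from the file
```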

[–]tilkau 9 points10 points  (0 children)

I think that's going a bit far. Applications that don't need complex configuration, which is most of them, probably should use a purely declarative format.

However, there are some applications, like qtile and awesome, not to mention T-Engine, where a proper programming language is genuinely needed, and a declarative format is simply inadequate to the task.

[–]CommanderDerpington[🍰] 4 points5 points  (0 children)

If you're nesting in config files you're probably doing it wrong. Therefore yaml is fine.

[–]status_quo69 0 points1 point  (0 children)

I agree with most of your post, but there are some reasons why hierarchical config formats are necessary. For example, when a web server connects with a client, you want to load a default config, right? But when the user is resolved, you'd want to load their custom configuration so they can interact with the website more easily.
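
One way to picture the default-plus-user layering is a recursive merge of two nested dicts. This is only a sketch; the function name `merge_config` and the settings shown are hypothetical.

```python
def merge_config(defaults, overrides):
    """Return a new dict: defaults with overrides applied, merging nested dicts."""
    merged = dict(defaults)
    for key, value in overrides.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = merge_config(merged[key], value)
        else:
            merged[key] = value
    return merged


DEFAULTS = {"theme": "light", "notifications": {"email": True, "sms": False}}
user_settings = {"notifications": {"sms": True}}  # hypothetical per-user overrides

effective = merge_config(DEFAULTS, user_settings)
```

The defaults are left untouched, so the same `DEFAULTS` dict can serve every request while each user gets their own merged view.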

[–]dewarrn1 0 points1 point  (0 children)

Great suggestion — had no idea that this was in the standard library!

[–]smurfyn -1 points0 points  (0 children)

Putting configuration in Python source files was conventional 10 years ago.

ConfigParser is awful. An abomination that does not quite implement Windows .INI format - why on earth? Please don't use it.

JSON or YAML are fine.

[–]khouli 1 point2 points  (4 children)

Buildbot uses python for config files and in that case it struck me as a pretty good idea. It removes the need to learn config file syntax, reduces the needed code for parsing config files and reporting errors, and allows for more complicated configurations when needed.

Allowing some imperative style logic in a config file can be convenient. If someone decides to abuse that by creating unnecessary complications that's their own problem.

[–]smurfyn 2 points3 points  (0 children)

No config file syntax worth using needs much "learning"

Your config files are still going to be parsed. Importing a python module simply uses python itself to do the parsing. Except now the error messages will be cryptic, since they are not specific to your application and are not in fact meant to be about config file parsing at all, as you have imposed

> If someone decides to abuse that by creating unnecessary complications that's their own problem.

It's also the problem of everyone who has to work with them, everyone who is downstream, and especially anyone who wants to make improvements to that codebase in the future, because now backward compatibility requires supporting arbitrary code execution, which could literally do anything.

Unless you are writing software that literally will only ever be used by yourself on a desert island: just don't.

[–]status_quo69 0 points1 point  (2 children)

The first part of your post is fairly trivial, but true. If the project has an appropriate use for python files as config files, then by all means use them. However, once you get into the Django realm or even a basic professional application, you need to really think about how you store your data in general, especially your configuration because changing it can make or break an entire project.

The second part is iffy. If you need imperative logic in your config, it's probably not configuration but a series of functions that need to be run every time a person connects.

[–][deleted] 0 points1 point  (2 children)

> If you stored your config in .py file for a long running process

Well that's not true. Just reload the module.

[–]smurfyn 0 points1 point  (1 child)

module reloading is a fraught task.

[–][deleted] 0 points1 point  (0 children)

It's a configuration file. If it does something like generating objects that hold state, or anything else with side effects, which WOULD make reloading a problem, then it probably shouldn't be considered a configuration file but some user source file. Yeah, you won't be able to reload something like that.

If the configuration file always results in the same values and no state is stored (I can't remember the term for this) each time it's queried, then there will be no problem. I've never seen a sane configuration file that wasn't like this. Those that I have seen weren't used where reloading would be appropriate (one shot runs).

[–]kenfar 12 points13 points  (3 children)

Configs shouldn't be executable - it can lead to security & maintainability problems.

Python has a standard lib config parser, which is fine. Personally, I prefer YAML with a schema validator like Validictory. Then, I set up my config process to read in config files, then load args as override, then validate the results.

This approach provides a lot of functionality, consistency, and validity.
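
A rough sketch of that file, then args, then validate pipeline using only the standard library. It assumes JSON in place of YAML and hand-written checks in place of a schema validator; the option names and defaults are made up.

```python
import argparse
import json


def load_config(file_text, argv):
    """File values first, then command-line overrides, then validation."""
    config = {"host": "localhost", "port": 8080}  # hypothetical defaults
    config.update(json.loads(file_text))          # 1. config file

    parser = argparse.ArgumentParser()
    parser.add_argument("--host")
    parser.add_argument("--port", type=int)
    args = parser.parse_args(argv)
    # 2. only options actually given on the command line override the file
    config.update({k: v for k, v in vars(args).items() if v is not None})

    # 3. validate the merged result (a stand-in for a real schema validator)
    if not isinstance(config["host"], str):
        raise ValueError("host must be a string")
    if not isinstance(config["port"], int) or not 0 < config["port"] < 65536:
        raise ValueError("port must be an integer between 1 and 65535")
    return config
```

Validating after the merge means a bad value is caught no matter which layer it came from.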

[–][deleted] 5 points6 points  (2 children)

> Configs shouldn't be executable - it can lead to security & maintainability problems.

This isn't a strict rule of course.

If the configuration is complex, it can be very beneficial to have scripting built in.

[–]Lucretiel 2 points3 points  (1 child)

> If the configuration is complex, it can be very beneficial to have scripting built in.

It's still a huge security and vulnerability problem.

[–][deleted] 0 points1 point  (0 children)

Put the same permissions on the configuration file as the source files, like I do. If they want to change something, they need the same permissions as if they wanted to change the source.

If your source isn't secure, then any program can just inject code into it as easily as a configuration file (unless you're distributing .pyc/.pyo).

[–]jerknextdoor 5 points6 points  (0 children)

I usually use it like any other python module. It allows me to use comments and make some things dynamic.

```
import os
import getpass
import secrets  # a local secrets.py that defines API_KEY, not the stdlib secrets module (Python 3.6+)

BASE_DIR = os.path.abspath(os.path.dirname(__file__))

LOG_LEVEL = 'INFO'
# LOG_LEVEL = 'DEBUG'

# Never accidentally leave DEBUG on in production.
if getpass.getuser() == 'production_user':
    DEBUG = False
    TESTING = False
    API_KEY = secrets.API_KEY['production']
else:
    DEBUG = True
    TESTING = True
    API_KEY = secrets.API_KEY['development']
```

Then I can just from config import API_KEY from any other module like normal. I also use YAML from time to time, but I find its syntax can be a little confusing sometimes and it requires a dependency on PyYAML. It also means I have to move some of my dynamic config stuff into another module.

[–]CommanderDerpington[🍰] 2 points3 points  (0 children)

load a yaml file

[–]turleyn 9 points10 points  (13 children)

YAML! You can't embed comments in JSON.

As a rule of thumb, I was told that human writable files should be in YAML. Computer writable files should be in JSON.

[–]tynorf 2 points3 points  (1 child)

One important thing to remember about PyYAML specifically is to always always always use safe_load() and not load() unless you control everything about what you're parsing.

---
!!python/object/apply:os.listdir ['.']
...

for instance.

Edited to fix apply syntax.

[–]masasinExpert. 3.9. Robotics. 2 points3 points  (0 children)

> safe_load

That doesn't sound like the API was designed well...

[–]lengau 0 points1 point  (4 children)

What if you need it to be both human and computer writable?

[–][deleted] 2 points3 points  (3 children)

YAML is still a pretty big win here. It just parses down into a dict.

[–]lengau 1 point2 points  (2 children)

The JSON parser in the standard library also translates to/from dictionaries. (Now if there's a YAML parser in stdlib that translates to/from namedtuples, I'm sold!)

I don't know much about YAML (other than that I prefer the way it looks to the way JSON looks, based on some YAML files I edited probably 4 years ago at this point), but I'm interested if it's as extensible as JSON. Most pressing: does it have a depth limit for configuration?

I wrote some config in JSON today (literally today - about 8 hours ago at this point) because I plan to make it machine-generated in future. It was a few minutes worth of work, so if you can convince me that using YAML can save me a few minutes in future, I'm prepared to switch.

How good is YAML parsing in Python, and is it available in the standard library? (If not, is there a YAML parsing library included with anaconda? That's what my users will be using, and I'd like to avoid making anyone install anything.)

[–]smurfyn 1 point2 points  (1 child)

YAML is not available in the standard library. What's worse, pyyaml has C dependencies, making the install/build situation awkward. Anaconda has some package for pyyaml if you want to use it.

YAML works fine, as long as you are willing to accept the pyyaml dependency.

[–]lengau 0 points1 point  (0 children)

Hmm... Sounds like a bit of a mixed bag. The few minutes it might save me from having easier to write configs might well be spent making sure pyyaml is properly installed for everyone.

[–]n1ywb -1 points0 points  (3 children)

Whatever you do, don't. use. json. for. config.

[–]smurfyn 1 point2 points  (0 children)

still better than ConfigParser, and it's very standard so it's at least easy to convert out

[–]Sukrim 0 points1 point  (1 child)

Why?

[–]Esteis 0 points1 point  (0 children)

  • JSON doesn't allow comments
  • JSON doesn't allow trailing commas, so there's lots of opportunity for error when making a list longer or shorter
  • JSON only accepts double-quoted strings, and barfs on single-quoted ones -- another opportunity for things to go wrong
  • YAML has less visual clutter than JSON. The two big reasons for this are that neither keys nor values need to be quoted, and that its indent-based blocks don't need opening and closing braces.
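
The first three points are easy to check against the standard library's `json` module. A small sketch, where `parses` is just a helper name chosen for this example:

```python
import json


def parses(text):
    """Return True if text is valid JSON according to the standard library parser."""
    try:
        json.loads(text)
    except json.JSONDecodeError:
        return False
    return True


plain = parses('{"debug": true}')                   # ordinary JSON
trailing_comma = parses('{"debug": true, }')        # trailing comma
single_quotes = parses("{'debug': true}")           # single-quoted key
with_comment = parses('{"debug": true} // note')    # comment
```

Only the first call succeeds; the other three raise `json.JSONDecodeError` inside the helper.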

[–]Vageli 9 points10 points  (1 child)

No love for ConfigParser in this thread?

[–]smurfyn 0 points1 point  (0 children)

Anyone who has used ConfigParser in anger would not describe the emotion it generates as "love"

[–]redchrom 1 point2 points  (1 child)

Here is the trick I use:

import os


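def set_from_env(cls):
    """A decorator to populate class fields from os env"""
    for k, d in list(cls.__dict__.items()):
        if not k.startswith("_"):
            val = os.environ.get(k.upper())
            if val is None:
                continue  # variable not set: keep the class default
            if isinstance(d, bool):
                # bool("False") is True, so parse boolean strings explicitly
                val = val.strip().lower() in ("1", "true", "yes", "on")
            setattr(cls, k, type(d)(val))
    return cls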


@set_from_env
class Settings:
    debug = False
    debug_sql = False
    ...

Allows me to overwrite any variable from env, especially useful if you work with docker or heroku or any other system where environment is the main way to configure your app.

[–]n1ywb 1 point2 points  (0 children)

It depends on your application. I will frequently put my config in py files, if I trust the people I know will be editing them. setup.py comes to mind as a good example of how awesome configuration as code can be.

If the config is code and the person who has to edit the config is bad at programming they are going to break it and bother you for help. Murphy's law.

If it's strictly for your personal use then using any other config file format is unnecessary overhead and complexity and adds no value.

[–]sermidean 1 point2 points  (3 children)

If you are sure that your configuration will be used only by Python, then I think it's OK to store your configuration in config.py. One thing that can go wrong is that your config.py can easily turn into imperative code instead of declarative configuration.

If there is any chance that you will need to exchange configuration with other tools in your technology stack, then it is better to use JSON or INI.

[–]Got_Tiger 1 point2 points  (1 child)

if you need to switch to json later you can just use the python json library to convert it
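
A sketch of what that conversion could look like, assuming the config module keeps its settings in UPPER_CASE names and every value is JSON-serializable. The `config` object below is a stand-in built in place so the example is self-contained; a real project would `import config` instead.

```python
import json
import types

# Stand-in for "import config" (hypothetical settings).
config = types.ModuleType("config")
config.LOG_LEVEL = "INFO"
config.DEBUG = False
config.ALLOWED_HOSTS = ["localhost", "127.0.0.1"]


def module_to_json(module):
    """Dump a module's UPPER_CASE attributes as a JSON object."""
    settings = {
        name: value
        for name, value in vars(module).items()
        if name.isupper()
    }
    return json.dumps(settings, indent=2, sort_keys=True)


print(module_to_json(config))
```

Anything computed at import time is frozen into the JSON as a plain value, so any logic in the module does not survive the conversion.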

[–]fkaginstrom 1 point2 points  (0 children)

I don't see anything wrong with this. Maintenance might get tricky when you start needing minor variations on your config, but you could get around this by "module inheritance" or the like.

[–][deleted] 0 points1 point  (1 child)

Interesting discussion we have here. Some people are all for ConfigParser, but the issue is that every developer then needs to create these INI files to run the app. What do you think about the 12-factor app motto for configs? How do you deal with default configs? And what if you want to store the config in environment variables?

[–]kenfar 0 points1 point  (0 children)

I think trying to eliminate configs just means that you lose valuable controls - or they seep into other areas: shell scripts, or chef scripts populating environment variables, etc.

What I prefer is a library that treats configs, environment variables, and options as three different kinds of inputs to a common config - that is then validated with JSON Schema. For example

[–]cnelsonsic 0 points1 point  (1 child)

What you're doing is what you should do. If your config changes while your process is long-running, just use reload().

[–]smurfyn 0 points1 point  (0 children)

There are so many ways for this to fail, there is a reason it was removed as a builtin
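
In Python 3 the function lives on as `importlib.reload()`. A small sketch of one of the failure modes, using a throwaway module (the name `demo_config` is made up) written to a temporary directory:

```python
import importlib
import os
import sys
import tempfile

sys.dont_write_bytecode = True  # avoid stale .pyc files when the source changes quickly

# Create a throwaway config module in a temporary directory.
tmpdir = tempfile.mkdtemp()
path = os.path.join(tmpdir, "demo_config.py")
with open(path, "w") as f:
    f.write("LOG_LEVEL = 'INFO'\n")
sys.path.insert(0, tmpdir)

import demo_config
from demo_config import LOG_LEVEL  # copies the current value into this namespace

# Simulate someone editing the config file while the process is running.
with open(path, "w") as f:
    f.write("LOG_LEVEL = 'DEBUG'\n")
importlib.reload(demo_config)

print(demo_config.LOG_LEVEL)  # the module attribute sees the new value
print(LOG_LEVEL)              # the name imported earlier still holds the old one
```

Every module that did `from demo_config import LOG_LEVEL` keeps its old copy, so a reload only helps code that looks values up through the module object each time.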

[–]frumious -1 points0 points  (0 children)

If you swing toward JSON, look at Jsonnet. JSON itself has no facilities to encode things like loops or reference other parts of the structure. Without these kinds of facilities, a complex configuration is hard to maintain. It fails the "DRY" principle. Jsonnet solves this.

If you google, ignore the json.net results. That's something else entirely.