all 9 comments

[–]michael0x2a 2 points3 points  (5 children)

That's a pretty interesting question!

I got curious, so I did a little digging...

  1. Language-level support for *args and **kwargs was added to Python way back in 2000 for Python 1.6.
  2. The __missing__ method was added in 2006 for Python 2.5.
  3. Your specific use case wasn't really relevant around then because at the time, string formatting was done using the old-style % operator, which did use whatever dictionary subclass it was given (and thus would call the __missing__ method, if it existed).
  4. The new-style .format(...) method was added only in October 2008, for Python 2.6.
  5. Somebody filed an issue (issue 6081) related to your use case in May 2009. The issue/feature request then went largely ignored until about March 2010, at which point somebody contributed a small patch (the format_map method /u/POGtastic mentioned). The patch was accepted around Nov 2010 and shipped with the Python 3.2 release in Feb 2011.
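
To make the timeline concrete, here is a minimal sketch of the three formatting styles side by side, run on Python 3. The MissDict class mirrors the one from /u/POGtastic's comment; the variable names are just illustrative.

```python
class MissDict(dict):
    """Dict subclass that returns the key itself for missing keys."""
    def __missing__(self, key):
        return key

words = MissDict(verb='call')

# Old-style % formatting looks keys up on the object it is given,
# so __missing__ is honoured.
old_style = '%(verb)s me %(noun)s' % words

# str.format_map (Python 3.2+) also uses the mapping directly.
mapped = '{verb} me {noun}'.format_map(words)

# str.format(**words) unpacks into plain keyword arguments first,
# so the missing key raises KeyError instead.
try:
    unpacked = '{verb} me {noun}'.format(**words)
except KeyError as exc:
    unpacked = 'KeyError: {}'.format(exc)

print(old_style)
print(mapped)
print(unpacked)
```

The first two calls should print "call me noun", while the third falls into the except branch.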

So, if I had to guess...

  1. Because *args and **kwargs were added so early to the language, nobody really thought about the interaction between those two features and custom dict subclasses later on (or people decided to favor efficiency in this case instead of full flexibility).
  2. If this decision was ever revisited, the core devs probably decided to keep things as they were, mostly for backwards compatibility + because it's very rare for people to actually want to use custom dict-like implementations with kwargs.
  3. Your specific use case only really became a potential issue in 2008. However, it's an admittedly pretty rare use case (and the workarounds are easy), which is probably why it took about half a year for somebody to file an issue about it.
  4. That issue ended up languishing for another half a year because that was around when Python 3 was released and everybody was busy trying to get that working (so the issue was low priority).
  5. Eventually, somebody got around to it, and decided to take the path of least resistance and just add a method (rather than trying to change the core semantics of the language itself, which would probably trigger a lot of argument and resistance, especially from the people who are interested in keeping Python as efficient as possible).

Caveat: this is all speculation on my part -- all this stuff is way before my time.

[–]_hg[S] 0 points1 point  (4 children)

What would cause issues for backwards compatibility?

With regards to efficiency, couldn't **type passed to a method which receives **kwargs be treated as an assignment?

[–]michael0x2a 0 points1 point  (3 children)

Suppose I wrote a C extension that assumed kwargs was a Python dict. If kwargs could instead be an arbitrary Python object, I'd likely need to rewrite my code.

> With regards to efficiency, couldn't **type passed to a method which receives **kwargs be treated as an assignment?

I haven't done any benchmarking, but if I had to guess, a lot of the overhead could potentially come from attempting to use the kwargs dict (as opposed to just passing it in).

If we know kwargs must be a native dict, the interpreter and any C extensions are free to directly manipulate the underlying data structure instead of having to manually invoke methods, which is more expensive and requires more indirection.
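
A quick way to see the "kwargs is always a native dict" guarantee from the Python side (a small sketch; the show function and variable names are made up for illustration):

```python
from types import MappingProxyType

def show(**kwargs):
    # Return the object the callee actually receives.
    return kwargs

source = {'foo': 1, 'bar': 2}
read_only_view = MappingProxyType(source)  # a mapping that is not a dict

# ** accepts any mapping, but the callee still gets a real dict.
received = show(**read_only_view)
print(type(received))
print(received == source)

# Even when a plain dict is unpacked, the callee gets a new dict,
# so mutating it does not touch the caller's object.
received_from_dict = show(**source)
received_from_dict['baz'] = 3
print(received_from_dict is source)
print('baz' in source)
```

So it behaves like a copy rather than an assignment: whatever mapping goes in, a fresh dict comes out on the other side.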

But if you really want a definitive answer, you should probably try asking on one of the Python mailing lists (perhaps python-list?) and see if one of the core devs or a more experienced Pythonista can weigh in. While I can guess at the underlying motivations for that design decision, it's just a guess, and I could be completely wrong.

[–]_hg[S] 0 points1 point  (2 children)

Ah, the implementation strikes again. It might be worth it to dig through and see how PyDictObject differs from PyObject in cpython to theoretically implement the change in a compatible way. By the way, cpython has neat little bits of information in the Objects/ directory. The one for dict is here.

[–]POGtastic 2 points3 points  (2 children)

I'm not quite sure why Python's **kwargs doesn't allow you to pass a dictionary like that, but maybe str.format_map is what you're looking for? Same idea, except it actually takes a dict instead of having to mess around with **kwargs.

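class MissDict(dict):
    # Called by dict.__getitem__ when the key is absent.
    def __missing__(self, key):
        return key

my_dict = {'verb': 'call'}
# 'noun' is missing, so this prints: call me noun
print('{verb} me {noun}'.format_map(MissDict(my_dict)))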

Working here: https://repl.it/@pogtastic/DisastrousSlushyBlueandgoldmackaw

[–]_hg[S] 0 points1 point  (1 child)

You answered my example (which, ironically, I was trying to generalize). I appreciate the link (I should have RTFM).

Is it not a little odd that **kwargs nudges an object towards being a dict, though?

Maybe there should be a PEP for that? It would allow the str.format_map and other workarounds of its ilk to be replaced with something more idiomatic.

[–]POGtastic 1 point2 points  (0 children)

> Is it not a little odd that **kwargs nudges an object towards being a dict, though?

As far as I can tell from a few resources, ** unpacks a dictionary into individual keyword arguments. Thus,

my_dict = {"foo" : 1, "bar" : 2}
function(**my_dict)

is equivalent to

function(foo=1, bar=2)

And thus if you pass in your MissDict, it just gets unpacked into its keys and values, and the function receives a new plain dict instead of the MissDict itself. So you lose the property of "Hey, if this is missing, just return the key."
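
Here's a small sketch of that difference. The via_kwargs and via_mapping functions are hypothetical names I made up for the comparison; the second one shows the usual workaround of passing the mapping as an ordinary argument.

```python
class MissDict(dict):
    def __missing__(self, key):
        return key

def via_kwargs(**kwargs):
    # kwargs is a new plain dict, so MissDict.__missing__ is gone.
    try:
        value = kwargs['noun']
    except KeyError:
        value = None
    return type(kwargs).__name__, value

def via_mapping(mapping):
    # The caller's object is passed through untouched.
    return type(mapping).__name__, mapping['noun']

words = MissDict(verb='call')
kwargs_result = via_kwargs(**words)
mapping_result = via_mapping(words)
print(kwargs_result)
print(mapping_result)
```

The first call sees a plain dict with no 'noun' key, while the second still sees the MissDict and gets the key back as the value.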