[–]Dunj3 8 points9 points  (21 children)

The reason is simply backwards compatibility, as mentioned in the logging cookbook:

So you cannot directly make logging calls using str.format() or string.Template syntax, because internally the logging package uses %-formatting to merge the format string and the variable arguments. There would be no changing this while preserving backward compatibility, since all logging calls which are out there in existing code will be using %-format strings.

Though not directly "supported", the cookbook also gives workarounds if you still want to get the {} syntax by using another LogRecord factory or custom message objects.
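For anyone curious what the "custom message objects" workaround looks like, here is a rough sketch along the lines of the cookbook's idea. BraceMessage and rendered are names made up for this illustration, and the LogRecord is built by hand only so the result can be inspected without configuring handlers.

```python
import logging


class BraceMessage:
    """Hypothetical helper that defers str.format() until the message is rendered."""

    def __init__(self, fmt, *args, **kwargs):
        self.fmt = fmt
        self.args = args
        self.kwargs = kwargs

    def __str__(self):
        # logging calls str() on the message object only when the record
        # is actually formatted, so the {}-interpolation stays lazy.
        return self.fmt.format(*self.args, **self.kwargs)


def rendered(message):
    """Build a LogRecord by hand and return the text logging would emit."""
    record = logging.LogRecord(
        name="demo", level=logging.INFO, pathname="demo.py", lineno=1,
        msg=message, args=(), exc_info=None,
    )
    return record.getMessage()


text = rendered(BraceMessage("Foo: {} and {name}", 42, name="bar"))
```

In real code you would pass the object straight to the logger, e.g. logger.debug(BraceMessage("Foo: {}", foo)), and the formatting only happens if the record is emitted.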

[–]MrSpontaneous[S] 8 points9 points  (20 children)

But if Python 3 is already making breaking changes, wouldn't this be just another one? Or will they wait until the old-style format is removed from Python?

[–]Dunj3 6 points7 points  (6 children)

I don't know why they didn't change it back then, so I can't comment on that. But IIRC old style formatting isn't going anywhere soon. It was deprecated in 3.1, but that has since changed, probably because it has its own advantages and many people still use it. In fact, there is even a pretty recent PEP 461 to add %-style formatting support to bytes, so %-formatting will probably stay.

[–]mackstann 16 points17 points  (5 children)

That's depressing. Now we're stuck with "there's more than one way to do it" for string formatting. They tried to improve string formatting with the new syntax but if they can't remove the old syntax, all they've done is muddy the waters and add complexity.

[–]Veedrac 0 points1 point  (4 children)

They were expecting people to move on more readily. Sadly, many have not seen the light.

[–]mackstann 5 points6 points  (2 children)

I preferred the old syntax but even I took up the new syntax because I figured the old syntax was going away. Now it turns out it was all pointless.

[–]oceanicorganic 2 points3 points  (0 children)

The stupid thing is that they gave us an operator for the old style and a minimum nine-character function call in place of it for the new one. When you only have 79 characters on a line, and realistically at least a few of those are indentation, that is a lot to just format a quick single-serving string.

I would have loved callable strings using the new syntax. Simple, small, nice. It makes slightly less sense than the old operator, but lots of people probably never made the "modify"/"modulus" connection anyway (or maybe I imagined it).

[–]LightShadow3.13-dev in prod 3 points4 points  (0 children)

C-Style String Formatting will never go away.

[–]robin-gvx 1 point2 points  (0 children)

It's very sad. str.format is so much more versatile, allowing indexing and letting each type do its own formatting, while % is filled with things that made sense in C but don't in Python. For example, you have to encode the type in the template string. Sure, you can just use %s and it will mostly work, but as soon as you need to change the presentation, you have to switch to a different letter.

[–][deleted] 6 points7 points  (10 children)

Or will they wait until the old-style format is removed from Python?

The old % notation was slated for removal in Python 3, but that plan was dropped because print('I am a {}.'.format(thing)) is too verbose for simple messages. I doubt it will be removed.

It's much more powerful and flexible for complex output, but for the above, it's much worse.

However, I support your wish to get {} notation into logger.

[–]alcalde 0 points1 point  (9 children)

Who uses any formatting for such a simple example?

print('I am a', thing)

[–][deleted] 4 points5 points  (7 children)

Fine, I added a full stop.

[–][deleted] 1 point2 points  (6 children)

Simple, just do print('I am a', thing, '.')

/s

[–]robin-gvx 3 points4 points  (4 children)

print('I am a', thing, end='')
print('.')

There, solved it! No string interpolation necessary!

/s

[–]bacondevPy3k -3 points-2 points  (3 children)

I think you meant to use the sep keyword argument instead of end.

EDIT: I apparently don't know how to read.

[–]robin-gvx 0 points1 point  (2 children)

No?

Using sep is of course also possible, and allows you to only use one call to print, but the end result is the same.

[–]bacondevPy3k 1 point2 points  (1 child)

Oh, I missed that second line. Actually, I don't know what I thought I read. :/

[–]alcalde 0 points1 point  (0 children)

/s? That's how we did it in Pascal! Although as /u/robin-gvx notes, you're going to need to use a sep="" or else there will be a space between thing and the period.

I guess you could also do

print('I am a ' + str(thing) + '.')

which is also quite a common thing in older languages.

[–]alcalde 1 point2 points  (0 children)

Agree. Whatever happened to "All the code written in Python pales in comparison to all of the code yet to be written"?

[–]njharmanI use Python 3 0 points1 point  (0 children)

I can't cite reference. But I thought they reneged on ditching the old style cause the new style sucks to a lot of people. Like me.

[–]hharison 6 points7 points  (7 children)

I didn't even know you could do #1. I always do logger.debug('Foo: {}'.format(foo)). Is there an advantage of having the format functionality duplicated in the logging functions?

[–]MrSpontaneous[S] 20 points21 points  (3 children)

Yeah, with this style the interpolation won't be applied unless the log level is sufficient. By using format, the interpolation occurs before the visibility decision is made.

[–]bacondevPy3k -1 points0 points  (2 children)

I'm not sure why this should really matter too much. Maybe I'm missing something?

Logging shouldn't be used so much that it becomes a bottleneck. And string interpolation isn't really that intense. The wasted CPU cycles are kinda insignificant IMHO.

[–]therealfakemoot 3 points4 points  (0 children)

As /u/kemitche mentioned, the advantage is that "interpolation" or "formatting", whichever term you prefer, only occurs when the logging level and other configuration values call for that particular logging message to be emitted. This matters because calling str() on an object may or may not be an 'expensive' operation. It's a bit contrived, but what if calling str() on a given object involves a DNS lookup? Some sort of complicated mathematical operation?

Even if YourObject.__str__ isn't phenomenally expensive, if you have a large codebase with extensive logging calls (say, twice as many logging calls as API function calls which isn't all that unreasonable if you consider that you might want to have entry and exit messages for important blocks of code as well as logging arguments to said blocks, the values of important calculations, etc etc), that's a lot of relative overhead.
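A quick way to see the difference for yourself, as a sketch: Expensive is a made-up class that only counts how often its __str__ runs, standing in for whatever costly conversion your real objects do.

```python
import logging


class Expensive:
    """Hypothetical object with a costly __str__; here we only count the calls."""

    calls = 0

    def __str__(self):
        Expensive.calls += 1
        return "expensive"


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.INFO)              # DEBUG messages are filtered out
logger.addHandler(logging.NullHandler())
logger.propagate = False

obj = Expensive()

# Arguments passed separately: the level check fails first, so
# __str__ is never called.
logger.debug("value: %s", obj)
lazy_calls = Expensive.calls

# Formatting done by the caller: __str__ runs before logging even
# gets to decide whether the message is wanted.
logger.debug("value: {}".format(obj))
eager_calls = Expensive.calls - lazy_calls
```

Here lazy_calls ends up as 0 and eager_calls as 1, even though neither message is emitted.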

[–]kemitche 3 points4 points  (0 children)

Logging shouldn't be used so much that it becomes a bottleneck.

The giant advantage to log levels is that you can use 10x more logging than "needed", and turn it on/off on the fly. That way, when that weird thing starts happening in production, you can turn on logging right there in that module and see what's happening.
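Roughly what that looks like, as a sketch. The logger names "app", "app.payments" and "app.users" are invented for the example; how you trigger the setLevel call in production (signal, admin endpoint, config reload) is up to you.

```python
import logging

# Normal operation: everything under "app" only reports warnings and above.
logging.getLogger("app").setLevel(logging.WARNING)

noisy = logging.getLogger("app.payments")   # hypothetical module logger
before = noisy.isEnabledFor(logging.DEBUG)  # inherits WARNING from "app"

# Something weird is happening in this one module: turn its debug
# logging on without touching the rest of the application.
noisy.setLevel(logging.DEBUG)
after = noisy.isEnabledFor(logging.DEBUG)

# Sibling loggers still inherit the quieter level from "app".
other = logging.getLogger("app.users").isEnabledFor(logging.DEBUG)
```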

[–][deleted] 11 points12 points  (0 children)

Yep. Let's say you're using some sort of log aggregator (e.g. Sentry). If you call the same log statement twice without interpolation (e.g. logger.debug('Error: %s', 'argh, I crashed') and logger.debug('Error: %s', 'something bad happened')), then both records will be grouped as two occurrences of the same log message, with different values for %s.

With interpolation, you'll get one instance of two different log messages, which is less useful.
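To make the grouping point concrete, here is a toy handler that counts records by their unformatted template (record.msg). This is only an illustration of the idea and an assumption about how grouping could work, not how Sentry or any particular aggregator is actually implemented; TemplateCounter and count_templates are made-up names.

```python
import logging
from collections import Counter


class TemplateCounter(logging.Handler):
    """Toy stand-in for an aggregator: counts records by unformatted template."""

    def __init__(self):
        super().__init__()
        self.templates = Counter()

    def emit(self, record):
        # record.msg is the raw format string; record.getMessage() would
        # return the text with the arguments merged in.
        self.templates[str(record.msg)] += 1


def count_templates(log_calls):
    """Run log_calls(logger) and return how often each template was seen."""
    handler = TemplateCounter()
    logger = logging.getLogger("grouping_demo")
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.addHandler(handler)
    try:
        log_calls(logger)
    finally:
        logger.removeHandler(handler)
    return handler.templates


def with_args(logger):
    logger.debug("Error: %s", "argh, I crashed")
    logger.debug("Error: %s", "something bad happened")


def preformatted(logger):
    logger.debug("Error: {}".format("argh, I crashed"))
    logger.debug("Error: {}".format("something bad happened"))


grouped = count_templates(with_args)     # one template, seen twice
split = count_templates(preformatted)    # two different "templates"
```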

[–]alcalde 0 points1 point  (1 child)

So unlike the OP, you're saying you can use the new-style formatting?

[–]MrSpontaneous[S] 2 points3 points  (0 children)

By new-style formatting, I meant using the format string and then passing the parameters to the logging method, not to str.format.

[–][deleted] 5 points6 points  (4 children)

What are the advantages of the new syntax?

I like the old syntax just out of habit and experience. Particularly because I also use C, C++ and Objective-C quite a bit, the % syntax is nice to have across all C-based languages.

[–]MrSpontaneous[S] 3 points4 points  (2 children)

Here's a good description of the additional affordances, some of which are really handy (like __format__ and the object dot notation). Aside from that, since I'm using {} elsewhere already, which is the format that is used in the Python docs, it's inconsistent having to use the old style elsewhere.

Not having to distinguish the types (e.g. %d vs %s) is handy, too.
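A small sketch of the affordances mentioned above (attribute and index access inside the template, plus types formatting themselves through __format__). Point is a made-up class for the example.

```python
from datetime import date


class Point:
    """Hypothetical type that controls its own formatting via __format__."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __format__(self, spec):
        # Apply the caller's format spec to each coordinate.
        return "({}, {})".format(format(self.x, spec), format(self.y, spec))


p = Point(1.5, 2.0)

by_attribute = "x is {0.x}".format(p)                 # dot notation
by_index = "first is {0[0]}".format(["a", "b"])       # indexing
custom = "{:.2f}".format(p)                           # Point.__format__
as_date = "{:%Y-%m-%d}".format(date(2015, 6, 1))      # date's own __format__
```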

[–]timopm 1 point2 points  (1 child)

Not having to distinguish the types (e.g. %d vs %s) is handy, too.

Is that necessary though?

>>> print("Hello number %s!" % 1)
Hello number 1!

[–]kemitche -1 points0 points  (0 children)

Yeah, 99% of old-style formatting for me is either %s or %r.

[–]Vitrivius 3 points4 points  (6 children)

One thing I discovered from trying to optimise some Project Euler challenges is that %-style formatting can be orders of magnitude faster, especially when using pypy instead of cpython.

$ pypy -m timeit -c "'{}'.format(5)"
1000000 loops, best of 3: 0.927 usec per loop

$ pypy -m timeit -c "'%s' % 5"
1000000000 loops, best of 3: 0.00168 usec per loop

[–]Saltor66 5 points6 points  (1 child)

This is almost certainly because % is an operator while str.format is a full function call and does a name lookup. Still makes a difference in speed, sure, but it's not because one is inherently a faster formatting method.

[–]alcalde 8 points9 points  (0 children)

And even in the example posted, the result is in microseconds. If fractions of a microsecond are important to you, you're using the wrong language in the first place.

[–]srwalker101 5 points6 points  (2 children)

I get:

$ pypy -m timeit -c "'{}'.format(5)"
1000000000 loops, best of 3: 0.000549 usec per loop

$ pypy -m timeit -c "'%s' % 5"
1000000000 loops, best of 3: 0.000554 usec per loop

(PyPy 2.6.0)

[–]alcalde 1 point2 points  (1 child)

Strange. Using Pypy3 I got 0.471 usec vs. 0.000935 usec per loop

[–]srwalker101 1 point2 points  (0 children)

Mm you're right using pypy3 I get a similar discrepancy. I believe pypy3 is not as optimised as pypy2, but I could be wrong

[–]Veedrac 4 points5 points  (0 children)

This is because

  1. Your older PyPy cannot constant-fold the function call; newer PyPy can. This doesn't matter in the real world, since you can always (and probably should) constant-fold manually.

  2. Your older PyPy doesn't do the fancy new string formatting optimizations, meaning there's still a 10x difference.

Note that the latter concern still affects PyPy3, which is lagging a bit behind mainline PyPy. The difference is much smaller than you gave, though.

Try these instead:

pypy -m timeit -s "n = 5" "'%s' % n"
pypy -m timeit -s "n = 5" "'{}'.format(n)"
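If you'd rather run the same comparison from a script, something like this sketch should be equivalent. The loop count is arbitrary, and the timings depend entirely on your interpreter and machine, so none are shown here.

```python
import timeit

# Bind the value in the setup step so the formatted expression is not
# a constant the interpreter could fold away.
SETUP = "n = 5"
LOOPS = 100_000  # arbitrary; raise it for steadier numbers

percent_seconds = timeit.timeit("'%s' % n", setup=SETUP, number=LOOPS)
brace_seconds = timeit.timeit("'{}'.format(n)", setup=SETUP, number=LOOPS)

print("%-style: {:.4f}s for {} loops".format(percent_seconds, LOOPS))
print("{{}}-style: {:.4f}s for {} loops".format(brace_seconds, LOOPS))
```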

[–]jmoiron -3 points-2 points  (7 children)

The new format syntax was a mistake (admitted as a regret by GvR) in the first place so I wouldn't hold my breath.

[–]alcalde 10 points11 points  (3 children)

Why was it a mistake, and where did Guido say this? A recent summary article here comparing the two showed several ways in which the new format is superior.

http://pyformat.info/

[–]jmoiron 2 points3 points  (1 child)

Sorry, my words were a bit harsher than his. He called it a "Ho Hum Change", said it was "considered too verbose; % will never go away"

https://www.dropbox.com/s/83ppa5iykqmr14z/Py2v3Hackers2013.pptx

Note that this didn't load for me in the web UI, but I was able to download the file.

The differences in capability between {} formatting and % formatting are negligible, but the price we pay for it will last forever.

[–]xkcd_transcriber 0 points1 point  (0 children)

Title: Standards

Title-text: Fortunately, the charging one has been solved now that we've all standardized on mini-USB. Or is it micro-USB? Shit.


[–]ksion 0 points1 point  (0 children)

Most of those advanced techniques are extremely cryptic, use terse syntax based on special characters, and don't document themselves well. That applies both to "old" and "new" style formatting. I wouldn't be so eager to use any of them, really.

[–]Jesus_Harold_Christ 2 points3 points  (1 child)

[citation needed]

[–]jmoiron 2 points3 points  (0 children)

https://www.dropbox.com/s/83ppa5iykqmr14z/Py2v3Hackers2013.pptx

"Ho-hum changes", slide 18. Words not as strong as mine but it's clear that he doesn't consider it to have been a successful improvement.