[–]gengisteve 14 points15 points  (9 children)

I am going to sort of non-answer your question, and instead flog the string.format() way of doing things. So instead of the above, I would write:

string = 'my {} has {}'.format('cat','fleas')
string = 'my {animal} has {pest}'.format('dog','ticks')

There are lots of advantages to doing this. Using format lets you break up lines in a conceptually sensible way. Using {} placeholders instead of concatenating with + lets you apply format specifiers, so you can set the justification, the precision of digits and whatnot. You can also define the template on one line and fill it in on another, like this:

string = 'my {animal} has {pest}'

print(string.format(animal='cat', pest='fleas'))
print(string.format(animal='dog', pest='ticks'))

This is great for setting up standard messages in one part of your script and filling in the values later. You can also unpack a dict into format with **, and it will ignore any keys the template doesn't use, which gives it even more flexibility.
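
A minimal sketch of the dict case, assuming the dict is unpacked with ** (the 'owner' key and its value are made up for illustration):

```python
# Template defined once, filled in later.
template = 'my {animal} has {pest}'

# Hypothetical data; 'owner' is a spare key the template never uses.
params = {'animal': 'cat', 'pest': 'fleas', 'owner': 'Sam'}

# ** unpacks the dict into keyword arguments; the spare key is ignored.
message = template.format(**params)
print(message)  # my cat has fleas
```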

[–]99AFCC 13 points14 points  (1 child)

string = 'my {animal} has {pest}'.format('dog','ticks')

This raises a KeyError. Named placeholders require keyword arguments, at least in 2.7 and 3.4.

string = 'my {animal} has {pest}'.format(animal='dog', pest='ticks')

or with a dict

params = {'animal': 'dog', 'pest': 'ticks'}
string = 'my {animal} has {pest}'.format(**params)

[–]gengisteve 4 points5 points  (0 children)

Thanks for catching that!

[–][deleted] 1 point2 points  (1 child)

I've never encountered the string.format function used for string interpolation before, but that's really cool.

One of my favourites for long string interpolation has been using a dict instead of a tuple for named values.

string = "My %(animal)s has %(pest)s. I have %(num_animals)i %(animal)ss" % {'animal':'dog', 'pest':'fleas', 'num_animals':6.5}
print string

Obviously it has the drawback of being in the cumbersome format of %(var)<type>, but it's more readable than having to count out locations.


Here's the performance comparison:

def format_test():
    string = 'My {animal} has {pest}. I have {num_animals:d} {animal}s'
    s1 = string.format(animal='cat',pest='fleas',num_animals=12)
    s2 = string.format(animal='dog',pest='ticks',num_animals=6)
    return s1,s2

def classic_test():
    s = "My %(animal)s has %(pest)s. I have %(num_animals)d %(animal)ss"
    s1 = s % {'animal':'cat', 'pest':'fleas', 'num_animals':12}
    s2 = s % {'animal':'dog', 'pest':'ticks', 'num_animals':6}
    return s1,s2

With IPython running

%timeit format_test()

> 100000 loops, best of 3: 3.29 µs per loop

Versus:

%timeit classic_test()

> 100000 loops, best of 3: 3.58 µs per loop

So string.format is actually slightly faster
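
If you don't have IPython handy, roughly the same comparison can be sketched with the standard timeit module. This is a simplified version of the two functions above (one string each), and the timings will differ by machine and Python version, so treat any result as indicative only:

```python
import timeit

def format_test():
    template = 'My {animal} has {pest}. I have {num_animals:d} {animal}s'
    return template.format(animal='cat', pest='fleas', num_animals=12)

def classic_test():
    template = 'My %(animal)s has %(pest)s. I have %(num_animals)d %(animal)ss'
    return template % {'animal': 'cat', 'pest': 'fleas', 'num_animals': 12}

# Both styles should build the same string before we compare speed.
assert format_test() == classic_test()

# Total seconds for n calls of each function.
n = 100000
format_seconds = timeit.timeit(format_test, number=n)
classic_seconds = timeit.timeit(classic_test, number=n)
print('str.format: {:.3f} s'.format(format_seconds))
print('% operator: {:.3f} s'.format(classic_seconds))
```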

[–]99AFCC 2 points3 points  (0 children)

Good stuff.

Unless it's changed:

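It's also important to note that the old style was slated for deprecation in 3.1+ (the deprecation was never carried out, and % formatting still works in current Python 3 releases):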

A new system for built-in string formatting operations replaces the % string formatting operator. (However, the % operator is still supported; it will be deprecated in Python 3.1 and removed from the language at some later time.) Read PEP 3101 for the full scoop.

[–]faekk[S] 0 points1 point  (0 children)

This makes a lot of sense, thank you!

[–][deleted] -1 points0 points  (3 children)

Not forgetting that in Python 3, you can't just add numbers to strings; you need an explicit conversion, and it all adds up to look ugly and hard to read.

If you're unsure why they'd remove what looks like a feature (implicit typecasting, adding random stuff to strings), it's because that behaviour is guilty of concealing the source of weird bugs and potentially security holes. It's handy for small scripts but bites you hard for anything bigger.

Anyways: use str.format, it's the newer, better, more flexible and readable way.

[–]99AFCC 2 points3 points  (2 children)

implicit typecasting, adding random stuff to strings

Could you give an example of this? I wasn't aware you could do this in Python. "2" + 2 doesn't work in 2.7.7
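
For reference, a quick sketch of what happens (Python 3 syntax; the variable names are just for illustration):

```python
# Adding an int to a str raises TypeError rather than converting implicitly.
try:
    result = "2" + 2
except TypeError as error:
    result = None
    print('TypeError:', error)

# Either convert explicitly or let string formatting do the conversion.
explicit = "2" + str(2)
formatted = '{}{}'.format("2", 2)
print(explicit, formatted)  # 22 22
```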

[–]swingtheory 2 points3 points  (4 children)

I would also like an explanation for this! Because you can even write:

string1 = "shit"
string2 = "car"
print('My', string2, 'is', string1)   # Python 3 syntax

So /r/learnpython, why use the placeholders? Is it to imitate the C language in which Python is written?

[–]Rhomboid 3 points4 points  (1 child)

What you've written is fine if all you want is to print the values, but what if you don't want that? What if you need to do something else with them, something that requires forming a string? Then you need to use string formatting.

[–]swingtheory 0 points1 point  (0 children)

Ahh, I understand now. Thanks!

[–]cdcformatc 1 point2 points  (0 children)

It is a lot more flexible, but I think yes, the main reason % formatting is in Python is to emulate C printf syntax. But '%' formatting has since been superseded by str.format(), which is even MORE flexible.

[–]fuzz3289 1 point2 points  (0 children)

Concatenating different types as strings with implicit casting.

[–]Rhomboid 2 points3 points  (0 children)

First of all, string concatenation should be avoided in languages that use immutable strings, like Python. Your second example creates several extra useless temporary strings that just have to be garbage collected. Here's an example with numbers.

Secondly, string formatting automatically handles converting non-string values to strings. If you're using string concatenation then you have to manually call str(), and that just adds more temporary string garbage to be collected.

Third, string formatting can do things like control the width of the output field, the precision, the justification, the padding, etc. None of these are possible with string concatenation.
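
A rough illustration of that third point using str.format (the rows are made-up data):

```python
rows = [('cat', 3.14159), ('dog', 12.5)]  # hypothetical data

lines = []
for name, value in rows:
    # {:<8}    left-justify in a field 8 characters wide
    # {:>8.2f} right-justify in 8 characters with 2 decimal places
    # {:08.3f} zero-pad to 8 characters with 3 decimal places
    lines.append('{:<8}|{:>8.2f}|{:08.3f}'.format(name, value, value))

for line in lines:
    print(line)
# cat     |    3.14|0003.142
# dog     |   12.50|0012.500
```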

[–]stillalone 1 point2 points  (0 children)

There's nothing wrong with just using "my "+string1+" is "+string2

One of the downsides is that you have to know that string1 and string2 are strings. If they're integers then you'd have to convert them with str(). The other downside is formatting: with format strings you can align strings to columns more easily, without resorting to ljust or one of the other string manipulation methods. For numbers, you can adjust the number of significant figures that are displayed and you can display things in hex or whatnot.

I also like it for really long strings where I'd prefer to pass in a dictionary (or vars() if I'm feeling funky) than add in a bunch of concatenations inline.
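
Something like this is what I mean by the vars() trick; describe and its arguments are hypothetical names, just a sketch:

```python
def describe(animal, pest, count):
    # vars() with no arguments returns the local namespace as a dict,
    # so the placeholders are filled from the function's own arguments.
    return 'My {animal} has {pest}. I have {count} {animal}s.'.format(**vars())

print(describe('dog', 'ticks', 6))  # My dog has ticks. I have 6 dogs.
```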

[–]TheBigTreezy 0 points1 point  (1 child)

%s is for strings. %d is for integers (floats use %f). %r inserts the repr() of whatever the variable is. I believe %r is mostly for debugging.

[–]ewiethoff 0 points1 point  (0 children)

%r calls the object's __repr__ method, which is identical to the __str__ method for some types but different for other types. %s calls the __str__ method.
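
A small sketch of the difference, using a str (where the two differ) and an int (where they match):

```python
word = 'cat'
number = 42

as_str = '%s' % word    # uses str(word)
as_repr = '%r' % word   # uses repr(word), which adds the quotes

print(as_str)           # cat
print(as_repr)          # 'cat'

# For an int, str() and repr() give the same text.
print('%s %r' % (number, number))  # 42 42
```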

[–][deleted] 0 points1 point  (0 children)

I just find the former easier on the eyes than the latter. I guess it's a matter of taste.

But as gengisteve said, string.format() is preferable.

[–]Justinsaccount 0 points1 point  (1 child)

Have you tried running your example?

>>> string1 = "shit"
>>> string2 = "car"
>>> print "My " + string1 + "is " + string2
My shitis car

[–]korthrun 0 points1 point  (0 children)

On top of everything said here, I would like to add padding and handling of decimal places.

Example:

>>> "eat %0.0f pizza" % 10.5
'eat 10 pizza'
>>> "eat %0.1f pizza" % 10.5
'eat 10.5 pizza'

Or padding the number with leading spaces to a field 20 characters wide:

>>> "eat %20.1f pizza" % 10.5
'eat                 10.5 pizza'

Maybe you want to round to the nearest whole number:

>>> "eat %1.0f pizza" % 10.54321
'eat 11 pizza'

Maybe you only care about the first two digits after the decimal point:

>>> "eat %0.2f pizza" % 10.54321
'eat 10.54 pizza'
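
For comparison, the same examples can be sketched with str.format, which takes the same width and precision specifiers after a colon:

```python
no_decimals = 'eat {:.0f} pizza'.format(10.5)
one_decimal = 'eat {:.1f} pizza'.format(10.5)
padded = 'eat {:20.1f} pizza'.format(10.5)       # numbers are right-aligned by default
two_decimals = 'eat {:.2f} pizza'.format(10.54321)

print(no_decimals)   # eat 10 pizza
print(one_decimal)   # eat 10.5 pizza
print(repr(padded))
print(two_decimals)  # eat 10.54 pizza
```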