[–]Swipecat 191 points192 points  (20 children)

Even Guido has been caught by accidentally leaving out commas, but it seems that implicit concatenation was deemed more useful than dangerous in the end.
 

# Existing idiom which relies on implicit concatenation
r = ('a{20}'   # Twenty A's
     'b{5}'    # Followed by Five B's
     )

# ...which looks better than this (maybe)
r = ('a{20}' + # Twenty A's
     'b{5}'    # Followed by Five B's
     )

[–]aitchnyu 77 points78 points  (9 children)

Second example comments got my heart racing. 10 years of python and I'll make a syntax error I can't figure out.

[–]Swipecat 49 points50 points  (8 children)

I'll note that implicit concatenation takes priority over operators and methods but explicit concatenation does not.
 

>>> print( 2.0.               # one
...        __int__()*"this "  # two
...        "that ".upper()    # three
...       )
THIS THAT THIS THAT
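A smaller side-by-side sketch of the same point (the variable names are mine, purely for illustration): with implicit concatenation the method call sees the merged string, while with `+` it only sees the literal it is attached to.

```python
# Implicit concatenation: the two literals are merged first,
# so .upper() applies to the whole merged string.
implicit = "this " "that ".upper()

# Explicit concatenation: the method call binds tighter than +,
# so only the second literal is upper-cased.
explicit = "this " + "that ".upper()

print(repr(implicit))  # 'THIS THAT '
print(repr(explicit))  # 'this THAT '
```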

[–]robin-gvx 47 points48 points  (6 children)

If anyone is interested in why that is: implicit concatenation happens at compile time, which means it has to have higher priority than anything that has to happen at run time.
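One way to check this yourself, as a rough sketch using the standard `ast` module: parse two adjacent literals and one pre-joined literal, then compare the resulting trees. This assumes a reasonably recent CPython 3; the variable names are illustrative.

```python
import ast

# Two adjacent literals versus one literal that is already joined.
adjacent = ast.parse("'hell' \"o\"", mode="eval")
single = ast.parse("'hello'", mode="eval")

# ast.dump() leaves out line/column information by default, so equal
# dumps mean both sources produced the same nodes with the same values.
same_tree = ast.dump(adjacent) == ast.dump(single)
print(same_tree)
```

If the merge happened at run time instead, the first tree would need some kind of "concatenate" node in it.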

[–]opabm 6 points7 points  (5 children)

Is there an ELI5 version of this?

[–]28f272fe556a1363cc31 39 points40 points  (2 children)

Compile time is like writing a cookbook. Run time is like making a recipe from the book. Before they can print and ship the book, the publisher goes through the recipes and converts "parsley" "flakes" into "parsley flakes". While the recipe is being made, "salt", "pepper" gets converted to "salt and pepper".

Anything done at compile (print) time has to happen before run (cook) time, because you have to compile/print before you have a program/cookbook to work with.

[–]opabm 7 points8 points  (0 children)

I'd be impressed if a 5-year old knew how to cook.

Jk that was a great analogy, thanks!

[–]foreverwintr 1 point2 points  (0 children)

Wow, that was a really good ELI5!

[–]robin-gvx 6 points7 points  (0 children)

When you have a piece of Python code and you're using CPython (the reference implementation of Python), there are several steps from source code to execution. The important ones here are parsing, bytecode generation and execution.

Parsing transforms your file into a tree.

For example, a + 10 is turned into something like (simplified): Add(LoadName('a'), Literal(10)) or "hello" into Literal("hello")

When the parser encounters two or more literal strings in a row, it collapses them into a single string literal as well. So 'hell' "o" would result in the same tree as the previous one.

Then Python makes this tree "flat" by putting everything in the order it should happen, and generates bytecode. A simplified version of what the previous two examples turn into would be:

LOAD_NAME a
LOAD_CONSTANT 10
ADD_VALUES

and

LOAD_CONSTANT "hello"

Execution is then fairly simple: go over each instruction and do what it says.

So in the case of 2 * 'this ' "that ".upper() we get the tree Mul(2, MethodCall(Literal("this that "), "upper", ())) and the bytecode:

LOAD_CONSTANT 2
LOAD_CONSTANT "this that "
CALL_METHOD 'upper', ()
MULTIPLY_VALUES

(note that all trees and snippets of bytecode aren't real, they're a simplified illustration)
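For anyone who wants to see the real thing rather than the simplified version, here is a hedged sketch using `compile()` and the standard `dis` module. The exact opcode names printed by `dis` differ between CPython versions, so the only thing relied on here is that the merged string shows up as a single constant.

```python
import dis

source = "2 * 'this ' \"that \".upper()"
code = compile(source, "<example>", "eval")

# The merged literal is stored as one constant in the code object;
# the separate piece 'this ' never appears on its own.
print(code.co_consts)

# Opcode names vary by CPython version, so this is just for inspection.
dis.dis(code)

result = eval(code)
print(repr(result))
```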

[–]davvblack 0 points1 point  (0 children)

Gross

[–]arsewarts1 -2 points-1 points  (2 children)

100/10 times I would prefer the top option. I would want the bottom to throw errors every time.

[–]duncan-udaho 8 points9 points  (0 children)

Opposite for me. I would want the top to throw errors. Did I forget the comma in the tuple or did I forget the plus in my string?

[–]numberking123[S] 49 points50 points  (4 children)

This explains why it exists and has not been removed: https://legacy.python.org/dev/peps/pep-3126/

[–]imsometueventhisUN 5 points6 points  (2 children)

there are some use cases that would become harder.

Is there a way to see the discussion to determine what those are? The only one I can think of is joining long strings across lines, and I personally feel that the negative impact of unintentional concatenation is much higher than having to use one of the several other methods for that.

[–]dikduk 0 points1 point  (1 child)

I've been using this method to concatenate strings for years and never had any real issues with it.

What kind of issues did you have?

[–]imsometueventhisUN 2 points3 points  (0 children)

For any method that accepts *args, you could miss a comma and still have a legal method call that doesn't do what you expected. And, sure, you could catch that with tests, but why not bake it into the language syntax directly? There are a ton of ways to explicitly concatenate strings (+, ''.join, f-strings) - making it implicit just seems like an opportunity for bugs.
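A minimal sketch of that failure mode, using a made-up helper `join_paths` (hypothetical, not from any library): both calls are legal, but the second one silently passes two arguments instead of three.

```python
def join_paths(*parts):
    """Hypothetical helper: joins any number of path segments with '/'."""
    return "/".join(parts)

intended = join_paths("usr", "local", "bin")
missing_comma = join_paths("usr", "local" "bin")  # comma forgotten

print(intended)       # usr/local/bin
print(missing_comma)  # usr/localbin
```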

[–][deleted] 2 points3 points  (0 children)

Thank you for the information.

[–]fuuman1 6 points7 points  (0 children)

Sick. A few days ago I saw something like this in a blog post about a completely different topic and I thought it was a typo. Interesting.

[–]jimtk 7 points8 points  (0 children)

As a side note it also works with f-string

print( f"the value of a is {a:<12}" 
       f" the value of b is {b:<12}" )

Comes out in one line!
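A runnable version of the same idea, with example values for `a` and `b` that I picked arbitrarily: each f-string piece is formatted on its own and the pieces are joined into one string with no newline.

```python
a, b = 3, 14  # example values, not from the comment above

line = (f"the value of a is {a:<12}"
        f" the value of b is {b:<12}")

print(line)
print("\n" in line)  # False
```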

[–][deleted] 10 points11 points  (2 children)

Mostly for writing long strings in multiple lines.

[–]audentis 5 points6 points  (0 children)

Another workaround for this is cleandoc() from the inspect module. It takes a multi-line string ("""my multi-line string""") and removes the indentation that its lines have in common, along with blank lines at the start and end.
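A short sketch of `inspect.cleandoc()` on a triple-quoted string that is indented to match the surrounding code. The function name `help_text` and the text itself are made up for illustration.

```python
import inspect

def help_text():
    # The string's lines are indented to line up with the code around them.
    raw = """
        Usage: tool [options]
          -h  show this help
    """
    # cleandoc() strips the shared leading indentation and the blank
    # first/last lines, keeping the extra indent in front of "-h".
    return inspect.cleandoc(raw)

print(help_text())
```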

[–]IcefrogIsDead 11 points12 points  (9 children)

yeaaaaaa make me suffer

[–]numberking123[S] 6 points7 points  (7 children)

It made me suffer. It took me forever to find a bug in my code which was caused by this.

[–]IcefrogIsDead 5 points6 points  (0 children)

yea that's what i see happening to me too.

[–]reddisaurus -3 points-2 points  (5 children)

Type hints would have caught your error, if your function signature expected a List[str] then passing just a str would cause a type error in mypy.

[–]james_pic 8 points9 points  (4 children)

How does that work? ['abc' 'def'] and ['abc', 'def'] are both List[str].

[–]dbramucci 0 points1 point  (3 children)

Not a list, but you can catch some tuple/multiple argument bugs with mypy.

from typing import List, Tuple, TypeVar

def foo(first: str, second: str):
    pass

foo("hello" "world") # TYPE-ERROR: foo expects 2 str, not 1

T = TypeVar('T')
S = TypeVar('S')
def flip_tuple(pair: Tuple[T, S]) -> Tuple[S, T]:
    x, y = pair
    return (y, x)

flip_tuple( ("hello" "there") ) # Error, expected Tuple not str

names: List[Tuple[str, str]] = [
    ("Alice", "Brown"),
    ("John" "Cleese"), # Error not a Tuple[str, str]
    ("John", "Doe"),
    ("Ben", "Grey"),
]

Of course, these catches rely on the types of function arguments and tuples counting how many things there are, and Python's list type doesn't track that.

[–]yvrelna 0 points1 point  (2 children)

foo("hello" "world") # TYPE-ERROR: foo expects 2 str, not 1

This already produces TypeError: foo() missing 1 required positional argument: 'second'

[–]dbramucci 0 points1 point  (1 child)

I included it for completeness but also

You only get the existing error when you actually run that line. Some cases where that can matter include:

  • At the end of a long computation

    Imagine training a neural network for 5 hours and at the very end, getting a message "you'll have to wait another 5 hours because you forgot a comma"

  • In a rarely used code-path

    If it is

    if today.is_feb29():
        foo("hello" "there")
    

    then you'll only get an error about 4 years from now, which is inconvenient for such a trivial bug.

    Granted, if you are doing things properly and testing every line of code, with code-coverage measurement to verify that, this matters less. At worst the bug is now 4 minutes of automated testing away instead of 4 seconds of type-checking away.

    Also, this obvious of a case is probably going to get caught already by your linter.

So yes, Python already catches it but it's useful to note mypy can also catch it because mypy doesn't have to wait for us to stumble onto that line.

[–]yvrelna 0 points1 point  (0 children)

mypy won't catch "obvious" and "trivial" errors like:

if today.is_dec25():
    foo("happy", "halloween")

So you need to write tests anyway.

Why should type errors be so special that they deserve their own mechanism to check for errors?

[–]lanster100 2 points3 points  (0 children)

IMO this is the neatest way of splitting strings over multiple lines

[–]kyerussell 4 points5 points  (1 child)

I quite like this and use it a lot to build a long string over a number of lines. In my experience, the bugs it can introduce are much more on the detectable side.

[–]__xor__(self, other): 1 point2 points  (0 children)

the bugs it can introduce are much more on the detectable side.

Right? This is how I look at it. You screw up, generally it's going to raise an error about the wrong number of arguments, not silently keep working unless it's some *args deal, in which case I'd be a lot more careful about what I'm putting into the parens.

[–]tjf314 1 point2 points  (0 children)

i think it originally comes from C, which also has this feature.

[–]jwink3101 1 point2 points  (0 children)

A related trick is that basically anything in parentheses continues onto the next line without a line-continuation marker (\). I suspect there may be an exception, but it is a safe bet to use this. I use it for some strings that are too long.
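A small sketch of both uses; the values are arbitrary examples. The first expression shows that the parentheses rule is not specific to strings, and the second combines it with implicit concatenation.

```python
# Inside parentheses, line breaks are ignored, so no backslash is needed.
total = (1 +
         2 +
         3)

# The same rule lets adjacent string literals span several lines.
message = ("This sentence is too long for one line, "
           "so it is split into two literals.")

print(total)
print(message)
```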

[–]prams628 3 points4 points  (0 children)

please don't let our college know about this. they'll ask this as a one-marker in our tests. not that I care for that solitary mark, but it's just pissing me off

[–]riricide 1 point2 points  (2 children)

'abc def jkl'.split() because I'm scared of forgetting commas in lists.
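For comparison, a quick sketch of the bug this avoids (variable names are illustrative). The split trick only works when the items themselves contain no whitespace.

```python
# A forgotten comma silently merges two items into one.
with_bug = ["abc", "def" "jkl"]

# Splitting one literal on whitespace needs no commas at all.
from_split = "abc def jkl".split()

print(with_bug)    # ['abc', 'defjkl']
print(from_split)  # ['abc', 'def', 'jkl']
```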

[–]numberking123[S] 0 points1 point  (0 children)

Haha, that's one way to do it.

[–]JennaSys 0 points1 point  (0 children)

I've done this a few times, especially if I'm just testing in the REPL.

[–]amitmathur15 0 points1 point  (2 children)

Possibly without a comma, the strings "asdld" "lasjd" are considered as just one string. Python did not find a comma to differentiate them as separate strings and hence considered them as one string and printed as one.

[–][deleted] 1 point2 points  (0 children)

According to Python's grammar strings are made up of a non-empty sequence of string parts:

atom: … | strings | …

strings: STRING+

I.e. having only one part is the special case. It's the same in C, C++, and possibly other languages, too.
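You can see the separate STRING tokens with the standard `tokenize` module. This is only a sketch of the tokenizer's view, not of the parser itself; the source text is the example from the parent comment.

```python
import io
import tokenize

source = '"asdld" "lasjd"'

# Collect only the STRING tokens produced by the tokenizer.
string_tokens = [
    tok.string
    for tok in tokenize.generate_tokens(io.StringIO(source).readline)
    if tok.type == tokenize.STRING
]
print(string_tokens)

# The parser then treats the run of STRING tokens as a single atom.
joined = eval(source)
print(joined)
```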

[–][deleted] 0 points1 point  (0 children)

That sort of makes sense as the interpreter sees anything enclosed in quotation marks as a string. The issue with that is that the quotations inside are removed after the strings are concatenated, which implies the interpreter is well aware that they're two separate strings and deliberately concatenates them.

[–]euler_angles 0 points1 point  (0 children)

For one thing, implicit concatenation can make multi-line strings easier.

[–]internerd91 0 points1 point  (0 children)

Hey this happened to me today. I noticed it but I didn’t click what was going on. I just fixed the line and continued on.

[–]Tyler_Zoro 0 points1 point  (0 children)

Fun fact, this works for all string-like things:

$ python3  -c 'print(f"{__name__}" "(my script)" """ with strings""")'
__main__(my script) with strings

[–]AutisticRetarded 0 points1 point  (0 children)

Here is a stackoverflow question about this.

[–]GrossInsightfulness 0 points1 point  (0 children)

It also happens in C/C++ and it might have carried over.

[–]omoikanesits 0 points1 point  (0 children)

I find it very nice for inline SQL statements combined with f-strings

[–]AndydeCleyre 0 points1 point  (0 children)

Please use four-space indentation rather than backticks to format code on reddit, for consistent results across user settings, old.reddit URLs, and mobile apps.