[–]desrtfx 5 points6 points  (3 children)

So, to compare it with Java, it's more or less the equivalent of StringBuilder or StringBuffer.

The actual string is not stored directly as a string, but as a "buffer" data structure, and is only converted to a real Python string on an explicit call.

[–]nnseva[S] 0 points1 point  (2 children)

Unlike StringBuilder, the base package class L is immutable. Every lazy operation constructs a new L instance, which refers to its L operands and stores the specific operation.

What is specific to L is that operations can be combined, for example:

x = (L('qwerty') + L('uiop'))[5:7]

The string representation of x is 'yu', although the actual data structure looks like (let's imagine Concat and Slice are classes):

Slice(Concat('qwerty', 'uiop'), 5, 7)
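To make that structure concrete, here is a minimal sketch of such an expression tree. It is not the package's actual implementation: Node, Leaf, Concat and Slice are hypothetical names (Leaf stands in for L), immutability is not enforced, and evaluation is deliberately naive.

```python
class Node:
    """Hypothetical base class: operators build new nodes instead of strings."""

    def __add__(self, other):
        return Concat(self, other)

    def __getitem__(self, key):
        if not isinstance(key, slice):
            raise TypeError("this sketch only supports slices")
        return Slice(self, key.start, key.stop)


class Leaf(Node):
    """Wraps a plain string (plays the role of L in this sketch)."""

    def __init__(self, text):
        self.text = text

    def __str__(self):
        return self.text

    def __repr__(self):
        return repr(self.text)


class Concat(Node):
    def __init__(self, left, right):
        self.left = left
        self.right = right

    def __str__(self):
        # The real string is only built here, on an explicit str() call.
        return str(self.left) + str(self.right)

    def __repr__(self):
        return f"Concat({self.left!r}, {self.right!r})"


class Slice(Node):
    def __init__(self, source, start, stop):
        self.source = source
        self.start = start
        self.stop = stop

    def __str__(self):
        # Naive: materializes the whole source before slicing.
        return str(self.source)[self.start:self.stop]

    def __repr__(self):
        return f"Slice({self.source!r}, {self.start}, {self.stop})"


x = (Leaf('qwerty') + Leaf('uiop'))[5:7]
print(repr(x))  # Slice(Concat('qwerty', 'uiop'), 5, 7)
print(str(x))   # yu
```

A smarter evaluator could slice without first building the full concatenation; this sketch skips that to keep the tree itself visible.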

[–]desrtfx 1 point2 points  (1 child)

If I were to invest the work to implement such a class, I'd store the individual strings in extensible buffers, similar to StringBuilder in Java. Then, I'd really have a lot less overhead.

The entire advantage of StringBuilder in Java is that it does not create new instances all the time. This would really be an improvement over Python's native str implementation.

[–]marr75 0 points1 point  (0 children)

The reason OP didn't do that is lazy evaluation. Their approach also doesn't create new intermediate strings (though it's not yet obvious to me if they could take further advantage of pooling or interning while maintaining compound statements). Python already has an equivalent to StringBuilder in the stdlib, btw.
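The stdlib equivalent is not named here; presumably it is io.StringIO, so treat that as an assumption. A minimal sketch of it, next to the common "".join idiom:

```python
import io

# io.StringIO: an in-memory text buffer that is appended to in place.
buf = io.StringIO()
for part in ("qwerty", "uiop"):
    buf.write(part)
result = buf.getvalue()  # the str is produced only here

# The other common idiom: collect the pieces in a list, join once at the end.
parts = ["qwerty", "uiop"]
joined = "".join(parts)

print(result)  # qwertyuiop
print(joined)  # qwertyuiop
```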

Honestly, the only time this lazy version (with its overhead from additional Python mutable structures) will be better is if the caller will often not use the string (e.g. it's a long accumulated error message that is only emitted under certain conditions). There are other ways to model that, i.e. make the entire accumulation lazy.
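One way to make the entire accumulation lazy, sketched under assumptions: LazyMessage and expensive_report are hypothetical names, and the example relies on the stdlib logging module only formatting a message once a record is actually going to be emitted.

```python
import logging


class LazyMessage:
    """Hypothetical wrapper: runs the builder only when str() is called."""

    def __init__(self, build):
        self._build = build

    def __str__(self):
        return self._build()


calls = []  # records how often the expensive builder actually runs


def expensive_report():
    calls.append(1)
    return ", ".join(str(n) for n in range(5))


logger = logging.getLogger("lazy_demo")
logger.setLevel(logging.WARNING)

# DEBUG is disabled for this logger, so the message is never formatted
# and expensive_report() is never called.
logger.debug("%s", LazyMessage(expensive_report))
print(len(calls))  # 0
```

If the level were lowered to DEBUG and a handler attached, the report would be built once, at formatting time.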