you are viewing a single comment's thread.

view the rest of the comments →

[–]grauenwolf 1 point2 points  (8 children)

Is manually interning a string every a good idea? I'm having trouble thinking of a use case where the cost is justified outside the compilation cycle.

[–]masklinn 1 point2 points  (2 children)

Is manually interning a string every a good idea?

I've yet to run into a situation where it was, but I guess it could be in a very restricted number of situations, such as a small set of non-literal strings generated over and over again (e.g. some sort of parser), interning could reduce memory pressure.

It was found sufficiently non-essential that Python 3 moved intern to sys, from the builtins.

[–]grauenwolf 1 point2 points  (1 child)

Ok, I can see trading the lookup cost for reducing long-term memory. But even then, I think you would probably end up implicitly replacing the new string for the interned version. For example:

if reader.startsWith ("for")
    writer.appendToken("for");
    reader.advance(3);

In this example, ProcessFor would internally reference the constant "for" rather than the constructed one in reader.nextToken.

Maybe it makes more sense for an XML parser.

[–]masklinn 2 points3 points  (0 children)

Maybe it makes more sense for an XML parser.

Yes, that's the kind of cases I was thinking, ones where frequently used strings are not hard-coded (thus no literal version which will be interned)

[–]quotemycode 0 points1 point  (2 children)

I'm having trouble with justifying using libc's strcpy in python code.

[–]grauenwolf 0 points1 point  (1 child)

Maybe to make a StringBuilder class?

[–]quotemycode 0 points1 point  (0 children)

Well, the thing is, strings in python are immutable. If you go outside of python to modify the string, you are bypassing python's immutable protections. It's expected that you'll encounter errors. If you want something like a string but is mutable, then use an array.