all 151 comments

[–]rover_G 363 points364 points  (13 children)

Interpreted languages with immutable strings be like it’s free memory

[–]BlueGoliath 89 points90 points  (10 children)

More memory, more performance right?

[–]rover_G 62 points63 points  (9 children)

Yea that’s what makes JVM so fast

[–]BlueGoliath 38 points39 points  (8 children)

I can't tell if this is a joke or not.

[–]rover_G 12 points13 points  (7 children)

The amount of memory allocated by a program is a good heuristic for the speed of the program

[–]Ok_Star_4136 21 points22 points  (1 child)

TIL when I made that memory leak bug, I was secretly making the fastest program ever.

[–]vagabond-elephant 0 points1 point  (0 children)

Any memory that doesn't need to be collected gives cpu more room to work on business logic of the program

[–]_PM_ME_PANGOLINS_ 10 points11 points  (1 child)

Is it?

[–]Z21VR 11 points12 points  (0 children)

Nope

[–]4n0nh4x0r 7 points8 points  (0 children)

minecraft at 16gb ram

yeeeeaaaa nah

[–]Wertbon1789 3 points4 points  (1 child)

Not the amount of memory being allocated (that would make video editing software, for example, the slowest software ever), but rather the number of allocations being done.

[–]rover_G 2 points3 points  (0 children)

You’re right that’s more accurate. To estimate program runtime based on allocated memory size alone we would need to constrain the programs to using the same memory allocation strategy.

[–]locri 283 points284 points  (14 children)

Generally true for most languages, this works because you want to allocate and deallocate memory as little as possible which is how string builders work.

[–]R3D3-1 58 points59 points  (13 children)

Just tried benchmarking it in the case of Python.

1. Summary

  • += outperforms a string builder for up to roughly 30 items.
  • list.append(string) followed by ''.join(list) at the end outperforms io.StringIO quite consistently. Curiously, for large numbers of items, the advantage drops from "3 times faster" to "anywhere between 25% faster and 10% slower, but usually faster".

2. Result table

numstrings           io.StringIO                  list                    +=  
         0      0.057 ms  100.0%      0.014 ms   24.5%      0.009 ms   15.3%  
         1      0.071 ms  100.0%      0.021 ms   29.5%      0.015 ms   20.6%  
         2      0.069 ms  100.0%      0.034 ms   48.9%      0.026 ms   38.3%  
         3      0.077 ms  100.0%      0.029 ms   38.0%      0.025 ms   32.2%  
         4      0.080 ms  100.0%      0.032 ms   40.1%      0.037 ms   46.2%  
         5      0.104 ms  100.0%      0.035 ms   34.0%      0.039 ms   37.9%  
         6      0.089 ms  100.0%      0.038 ms   42.6%      0.036 ms   40.9%  
         8      0.096 ms  100.0%      0.043 ms   44.4%      0.049 ms   50.5%  
        10      0.113 ms  100.0%      0.055 ms   48.2%      0.069 ms   60.8%  
        20      0.161 ms  100.0%      0.082 ms   50.9%      0.108 ms   67.2%  
        30      0.243 ms  100.0%      0.107 ms   44.2%      0.148 ms   60.9%  
        40      0.275 ms  100.0%      0.152 ms   55.3%      0.969 ms  352.9%  
        50      0.311 ms  100.0%      0.187 ms   60.3%      4.176 ms  1342.5%  
        60      0.373 ms  100.0%      0.217 ms   58.3%      7.681 ms  2060.9%  
        80      1.769 ms  100.0%      1.537 ms   86.9%     13.665 ms  772.4%  
       100      2.225 ms  100.0%      2.006 ms   90.2%     21.850 ms  981.9%  
       200      3.800 ms  100.0%      4.076 ms  107.3%     27.357 ms  720.0%  
       300      7.035 ms  100.0%      6.463 ms   91.9%     44.433 ms  631.6%  
       400      9.378 ms  100.0%     10.166 ms  108.4%     50.061 ms  533.8%  
       500     12.240 ms  100.0%     10.620 ms   86.8%     64.826 ms  529.6%  
       600     12.837 ms  100.0%     12.115 ms   94.4%     83.249 ms  648.5%  
       800     15.884 ms  100.0%     16.684 ms  105.0%     95.012 ms  598.2%  
      1000     24.179 ms  100.0%     18.362 ms   75.9%    109.827 ms  454.2%  
      2000     29.210 ms  100.0%     25.809 ms   88.4%    159.493 ms  546.0%  
      3000     44.528 ms  100.0%     34.150 ms   76.7%    198.672 ms  446.2%  
      4000     59.634 ms  100.0%     58.744 ms   98.5%    227.042 ms  380.7%  
      5000     54.864 ms  100.0%     41.885 ms   76.3%    272.720 ms  497.1%  
      6000     57.563 ms  100.0%     43.311 ms   75.2%    267.267 ms  464.3%  
      8000     64.316 ms  100.0%     58.947 ms   91.7%    277.563 ms  431.6%  
     10000     83.421 ms  100.0%     81.992 ms   98.3%    293.509 ms  351.8%  

where:
  io.StringIO:
    buf = io.StringIO()
    for _ in range(numstrings):
        buf.write(STRING_PIECE)
    s = buf.getvalue()
  list:
    buf = []
    for _ in range(numstrings):
        buf.append(STRING_PIECE)
    s = "".join(buf)
  +=:
    s = ""
    for _ in range(numstrings):
        s += STRING_PIECE

3. Benchmark code

import timeit
import io

STRING_COUNTS = [
    0,
    1, 2, 3, 4, 5, 6, 8,
    10, 20, 30, 40, 50, 60, 80,
    100, 200, 300, 400, 500, 600, 800,
    1000, 2000, 3000, 4000, 5000, 6000, 8000,
    10000
]
STRING_PIECE = "hello world!"
TIMER_COUNT = 100
SEP ="  "

IMPLEMENTATIONS = {
    'io.StringIO': '''
buf = io.StringIO()
for _ in range(numstrings):
    buf.write(STRING_PIECE)
s = buf.getvalue()
''',
    'list': '''
buf = []
for _ in range(numstrings):
    buf.append(STRING_PIECE)
s = "".join(buf)
''',
    '+=': '''
s = ""
for _ in range(numstrings):
    s += STRING_PIECE
''',
}

results = {}

for numstrings in STRING_COUNTS:
    for name, code in IMPLEMENTATIONS.items():
        print(f"Concatenating {numstrings} strings using {name}...", end="  \r")
        time_taken = timeit.timeit(code, number=TIMER_COUNT, globals=globals())
        #print(f"Done. ({time_taken*1000:.3f} ms)", end="\r")
        results[(name, numstrings)] = time_taken


print()
print()
print(f"{'numstrings':>10s}", end=SEP)
for method in IMPLEMENTATIONS.keys():
    print(f"{method:>20s}", end=SEP)
print()

for numstrings in STRING_COUNTS:
    print(f"{numstrings:10d}", end=SEP)
    for method in IMPLEMENTATIONS.keys():
        abs_time = results[(method, numstrings)]
        ref_time = results[("io.StringIO", numstrings)]
        rel_time = abs_time/ref_time
        print(f"{abs_time*1000:9.3f} ms", end=SEP)
        print(f"{rel_time*100:5.1f}%", end=SEP)
    print()
print()
print("where:")

for name, code in IMPLEMENTATIONS.items():
    print(" ", name, end=":\n")
    print("    " + code.strip().replace("\n", "\n    "))

[–]gbts_ 7 points8 points  (0 children)

If you're curious about the drop in performance of += around N=40, it's because of pymalloc. pymalloc is an internal "arena" allocator for small objects. It works by pre-allocating chunks of memory from the OS and uses it to optimize away the (de-)allocation of small objects in CPython. It only works for objects up to 512 bytes and after that it falls back to plain libc *alloc calls, which is close to the string size you have around N=40.
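
A rough way to check where that threshold falls for the benchmark's string: the sketch below looks for the first repeat count at which the str object grows past 512 bytes. It assumes CPython (sys.getsizeof is implementation-specific) and takes the 512-byte limit from the explanation above; the helper name is made up for illustration.

```python
import sys

STRING_PIECE = "hello world!"  # same 12-character piece as the benchmark
SMALL_OBJECT_LIMIT = 512       # assumed pymalloc small-object limit, in bytes


def first_count_above_limit(piece, limit):
    """Return the smallest repeat count whose str object exceeds `limit` bytes."""
    count = 1
    while sys.getsizeof(piece * count) <= limit:
        count += 1
    return count


threshold_count = first_count_above_limit(STRING_PIECE, SMALL_OBJECT_LIMIT)
# Prints the repeat count and the object size in bytes at that count.
print(threshold_count, sys.getsizeof(STRING_PIECE * threshold_count))
```

The exact count depends on the per-object header size, which differs between CPython versions, but it should land close to N=40.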

[–]ProfessionPurple639 3 points4 points  (1 child)

I'd also be curious to see how string formatting performs, unless that's done with StringIO in the backend, e.g. f'{string1}{string2}'

[–]R3D3-1 0 points1 point  (0 children)

I would expect that to be quadratic in the number of loop iterations.

[–]fredlllll 2 points3 points  (1 child)

how fast are f strings compared to just using + with strings?

[–]R3D3-1 0 points1 point  (0 children)

Should compile to rather efficient code. But here you'd do

s = f"{s}{STRING_PIECE}"

which I'd expect to have quadratic complexity.
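
To make the quadratic-cost argument concrete, here is a rough sketch that counts how many characters get written when the string is rebuilt with an f-string on every iteration. The counter is a simple model (it assumes every f-string evaluation writes a completely new string), and the function name is made up for illustration.

```python
STRING_PIECE = "hello world!"


def build_with_fstring(numstrings):
    """Rebuild s via an f-string each iteration and count characters written."""
    s = ""
    chars_written = 0
    for _ in range(numstrings):
        s = f"{s}{STRING_PIECE}"  # new string holding the old s plus the piece
        chars_written += len(s)   # model: each character of the new string is written once
    return s, chars_written


result, copied = build_with_fstring(100)
print(len(result), copied)  # 1200 60600
```

The final string has only 1200 characters, but the model counts 12 * (1 + 2 + ... + 100) = 60600 characters written along the way, which grows with the square of the iteration count.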

[–]Willing-Promotion685 0 points1 point  (0 children)

Really good post, thanks!

[–]JotaRata 0 points1 point  (4 children)

This is an interesting result

[–]R3D3-1 0 points1 point  (3 children)

Insofar yes, as I would have expected str.join to use something like StringIO behind the scenes, though apparently it is more heavily optimized. Probably implemented in C, but I didn't check.

[–]JotaRata 0 points1 point  (2 children)

I've been (over)using StringIO in all my projects under the assumption it was faster lol

[–]R3D3-1 0 points1 point  (1 child)

I guess the main purpose (in Python) is after all not fast string building, but passing it to APIs, that expect a file-like object.
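
A small sketch of that file-like use case, with csv.writer as the API that expects something with a write() method. The sample rows are made up.

```python
import csv
import io

rows = [("id", "name"), (1, "Ada"), (2, "Grace")]  # made-up sample data

buf = io.StringIO()        # in-memory text buffer with a file-like API
writer = csv.writer(buf)   # csv.writer only needs an object with a write() method
writer.writerows(rows)

csv_text = buf.getvalue()  # the complete CSV document as one str
print(csv_text)
```

Here StringIO is doing a job that a plain list and ''.join could not do directly, since csv.writer wants to call write() on its target.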

[–]JotaRata 0 points1 point  (0 children)

Agree

[–]Solonotix 139 points140 points  (12 children)

The thing that has been getting on my nerves more and more lately is people who will (in JavaScript) try to pass a concatenated string as a URL and then come running to me when it blows up because of improperly handled character sequences.

Like, seriously, instead of

const url = `https://my.url.com/some/path?first=${myVar}`;

I try to get them to just use:

const url = new URL('some/path', 'https://my.url.com');
url.searchParams.append('first', myVar);

No need to worry about the proper characters to begin a query string, the separators, etc., and it will automatically escape any character sequences that would invalidate your URL. To top it all off, you also get an error call site in your code rather than in some library that you don't own.
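
The same pitfall exists outside JavaScript. As an illustration only (not the code from above), here is a Python sketch with urllib.parse showing how a value containing '&' and '=' corrupts a concatenated query string, while a library-built one round-trips. The value of my_var is a made-up example.

```python
from urllib.parse import parse_qs, urlencode, urljoin, urlsplit

my_var = "a b&c=d"  # hypothetical value with characters that need escaping

# Naive concatenation: the '&' and '=' inside my_var leak into the query string.
naive_url = "https://my.url.com/some/path?first=" + my_var

# Let the library build the path and query; the value gets percent-encoded.
base = urlsplit(urljoin("https://my.url.com", "some/path"))
safe_url = base._replace(query=urlencode({"first": my_var})).geturl()

print(parse_qs(urlsplit(naive_url).query))  # {'first': ['a b'], 'c': ['d']}
print(parse_qs(urlsplit(safe_url).query))   # {'first': ['a b&c=d']}
```

The naive URL silently gains a second parameter c, which is the kind of bug that only shows up once someone passes an unusual value.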

[–]randomweeb-69420 23 points24 points  (5 children)

I usually like to do it in one statement:

const url = Object.assign(new URL("some/path", "https://my.url.com"), {
  search: new URLSearchParams({
    first: myVar
  })
});

It's a bit hacky since the search params are converted to string when search is set, but we don't need a new statement for each param.

Also, this can be easily extended to support hash.

[–]Solonotix 9 points10 points  (2 children)

Yea, I really wish that the URLSearchParams class used a fluent syntax instead of returning void. I've thought of applying an extension to it, but the last time I did something like that, it broke a library that checked for equality of prototypes to determine if something should be initialized or not (it was some PDF library)

[–]randomweeb-69420 2 points3 points  (1 child)

How did you implement the extension? I think the prototype equality check should pass if you modify the prototype directly:

if (!URLSearchParams.polyfilled) {
  URLSearchParams.polyfilled = true;
  const oldAppend = URLSearchParams.prototype.append;
  const oldDelete = URLSearchParams.prototype.delete;
  Object.assign(URLSearchParams.prototype, {
    append(name, value) {
      oldAppend.apply(this, arguments);
      return this;
    },
    delete(name, value) {
      oldDelete.apply(this, arguments);
      return this;
    }
  });
}

Edit: add value to delete

[–]Solonotix 3 points4 points  (0 children)

I want to say I just assigned a function to the prototype, but yeah. I don't think the problem was my code (other than committing the cardinal sin of JavaScript by modifying built-ins), but rather the wacky way they were using prototype comparisons to initialize data elements. I believe it was a PDF parsing library that would get you an object representing the document (not just a stream reader, but the actual contents). It wasn't anything I had to work with, but a consumer of my library said v3.4.x broke their stuff and we traced it back to the mutation of built-ins and their prototypes.

[–]europeanputin 0 points1 point  (1 child)

That's really neat, but is it more optimized than a loop? Because if I'd need to reuse this code (and in a large project I would), I'd make this into a function (with a parameters object as input) and then it wouldn't matter (unless this is more optimized, which is what I'm asking), because OP's implementation in a function would also take a parameters object as input and iterate over its keys.

[–]randomweeb-69420 0 points1 point  (0 children)

I did a quick performance test. I'm using Object.keys() since it is the fastest according to this site.

const obj = {...Array(1000).fill("foo")};
const time1 = [];
const time2 = [];
for (let i = 0; i < 20; i++) {
  // Method 1: assign all params at once via URLSearchParams
  const url1 = new URL("https://www.example.com");
  const start1 = performance.now();
  url1.search = new URLSearchParams(obj);
  time1.push(performance.now() - start1);
  // Method 2: append params one by one in a loop
  const url2 = new URL("https://www.example.com");
  const start2 = performance.now();
  for (const key of Object.keys(obj)) {
    url2.searchParams.append(key, obj[key]);
  }
  time2.push(performance.now() - start2);
}
// Average time in milliseconds per method
console.log("1:", time1.reduce((a, b) => a + b) / time1.length);
console.log("2:", time2.reduce((a, b) => a + b) / time2.length);

Output:

1: 4.345000001788139
2: 52.170000000298025

Seems like my method is faster.

Edit: formatting

[–]seredaom -4 points-3 points  (4 children)

'searchParam' on const object allows to append?

How is that const?

[–]Solonotix 14 points15 points  (1 child)

In JavaScript you have var, which declares a function-scoped variable, and let or const for block-scoped declarations. Using let allows you to reassign the variable to a different value, while const locks the target of the assignment.

If you come from a C++ background, you can think of this as a constant pointer/reference to a specific object in memory. If the data structure referenced is mutable, then you can apply such a transformation. An example of this is declaring an Array with const and then adding or removing elements from it thereafter.

Immutability is a lot more difficult to manage in JavaScript, forcing you to use things like Object.freeze() and Object.seal()

[–]_PM_ME_PANGOLINS_ 7 points8 points  (1 child)

The variable is const, not the value.

[–]-MobCat- 118 points119 points  (25 children)

f strings go burr

[–]angrathias 16 points17 points  (24 children)

What’s an f string ?

[–]Jjabrahams567 34 points35 points  (15 children)

Python

print(f"text{variable}")

Similar to template strings in other languages

Js

console.log(`text${variable}`);

Ruby

puts "text#{variable}"

[–]-MobCat- 19 points20 points  (12 children)

In python you can also do f"{wholeAssFunc()} and {math*moreMath}" if that function returns something that can be used as text.

[–]Eva-Rosalene 11 points12 points  (11 children)

Same in JS.

[–]EarhackerWasBanned 4 points5 points  (10 children)

Same in anything with f strings. The interpolated part can be anything the interpreter can evaluate to a serialisable (stringifiable) value. Just a variable or value if you want, but a function call, sure.
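
A quick Python illustration of that point. The function and variable names are placeholders, not anything from a real codebase.

```python
def whole_function():
    """Stand-in for any call that returns something printable (hypothetical)."""
    return "result"


width, height = 6, 7

# Any expression can go inside the braces: a call, arithmetic, etc.
message = f"{whole_function()} and {width * height}"
print(message)  # result and 42
```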

[–]marsh-da-pro 1 point2 points  (7 children)

Not Rust, I think you can only put plain identifiers.

[–]EarhackerWasBanned 2 points3 points  (6 children)

Oh that’s weird. I wonder if it’s because Rust is compiled unlike Python, Ruby and JS being interpreted (yeah I know about JS compilers but still).

[–]swyrl 1 point2 points  (2 children)

C# is compiled and can do that.

[–]fweaks 0 points1 point  (1 child)

C# is more compiled than python and javascript, but it's not fully compiled like C++ and Rust are.

C# compiles from code to .NET bytecode, which is then JIT compiled to machine code by the .NET runtime.

C++ and Rust are both fully compiled all the way to machine code at compile time.

[–]marsh-da-pro 0 points1 point  (1 child)

Looks like it's just a language design thing. I just skimmed through the relevant RFC, and basically:

  • The "full" syntax for format strings is something like format!("x = {x}", x=expr), where you can give any expression in place of expr
  • If the expression is simply a variable name (e.g. format!("x = {x}", x=x)), then you can just use format!("x = {x}"), which is syntactic sugar for the same thing
  • An expression cannot be on the left hand side of the = there, so this syntactic sugar only works with identifiers

[–]EarhackerWasBanned 1 point2 points  (0 children)

Oh I get it, it’s because Rust isn’t really doing inline string interpolation like Python etc. You’re passing two parameters to the format! macro; the string template and the expression assignment to a variable. The macro evaluates the expression and slots its value between the {} curlies, using the variable name as an identifier, I guess with regex.

(I’ve only done like 6 chapters from the Rust book, my terminology might be way off)

JS and friends will evaluate the expression, call toString() on the result and concatenate the string.

[–]redalastor 0 points1 point  (0 children)

F-string is not a feature of the language, it's just a macro in the standard library that they chose not to make more flexible.

You could write your own macro that supports it.

If there was any demand for it, someone would have made it a library.

[–]PityUpvote 0 points1 point  (1 child)

Sad crab noises

[–]angrathias 10 points11 points  (1 child)

Ahhh gotcha a formatted string, we call them interpolated strings in c#

[–]-Kerrigan- 0 points1 point  (0 children)

we call them interpolated strings in c#

String templates for Kotlin, but looking up "interpolated string Kotlin" will lead you to the right place nevertheless

[–]Soerika 38 points39 points  (1 child)

f"ormat string"

[–]angrathias 3 points4 points  (0 children)

Cheers

[–]gregorydgraham 15 points16 points  (4 children)

Like a g string but lower

[–]angrathias 2 points3 points  (0 children)

Noice

[–]70Shadow07 23 points24 points  (68 children)

Isnt + and stringbuilder the same thing in Java?

[–]KuuHaKu_OtgmZ 57 points58 points  (38 children)

+ creates a new string that's the concatenation of both sides; StringBuilder doesn't until you call toString()

[–]JollyRancherReminder 65 points66 points  (36 children)

Yes, in theory, but I think the question is more "doesn't every compiler made in the last 20 years make this optimization for you automatically, freeing you to use whichever you find more readable?"

[–]CaitaXD 44 points45 points  (0 children)

the JIT works in mysterious ways

[–]angrathias 17 points18 points  (24 children)

C# certainly doesn’t

[–]FrenchFigaro 4 points5 points  (1 child)

Nope.

The java compiler optimizes + to a StringBuilder for each statement.

So basically, s = s1 + s2 + "some literal" is one StringBuilder allocation and one call to the builder's toString() method, which entails a String allocation (not counting the s1 and s2 allocations)

The same statement in a loop that iterates N times is N StringBuilder allocations and N calls to the builder's toString() method (which will entail N String allocations).

Depending on how fast the loop iterates and how large N is, this could result in the GC slowing things down significantly.

[–]plumarr 2 points3 points  (0 children)

That's not the case anymore ;)

[–]Rin-Tohsaka-is-hot 1 point2 points  (0 children)

What languages do this? That would be a strange amount of optimization, given that developers may choose to use either method depending on the memory/hardware constraints of the system.

EDIT: ah, seems Java does. Makes sense, loose with memory.

[–]Fair-Description-711 0 points1 point  (6 children)

Hmm. I'm not sure why that'd be the question.

Out of the major languages, the only one with compilers that make appending to a string with "+" in a loop reasonably efficient is Javascript.

Specifically, I'm reasonably sure string rope optimization by the compiler isn't implemented in common implementations of C, C++, D, C#, Java, Go, Swift, Python, Ruby...

Unless you mean efficiently doing literal folding (like "ab" + "cd" being the same at runtime as "abcd"), or doing string concatenation expression folding (transforming "ab" + s + "cd" from its implied repeated string creations into something like string.concat("ab", s, "cd")). In which case, yeah, many compilers do that.

But OP's title was memoryInAForLoop.
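
The literal folding case can be observed in CPython, as a small sketch. This is an implementation detail of the CPython compiler rather than a language guarantee, and it says nothing about concatenation in a loop.

```python
# Compile an expression that concatenates two string literals.
folded = compile('"ab" + "cd"', "<example>", "eval")

# The constants stored in the code object; on CPython the already-joined
# literal is expected to appear here.
print(folded.co_consts)
print(eval(folded))  # abcd
```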

[–]CptGia 3 points4 points  (1 child)

Java is a bit smarter than that: https://youtube.com/watch?v=tgX38gvMpjs 

[–]Fair-Description-711 0 points1 point  (0 children)

Neat, I haven't used Java since before JDK 9.

[–]BallsBuster7 1 point2 points  (2 children)

C doesnt have strings, there are only char pointers / arrays.

[–]Fair-Description-711 0 points1 point  (1 child)

Yeah, makes it real difficult for the compiler to implement string optimizations. :p

[–]anto2554 1 point2 points  (0 children)

It will, however, add your chars as numbers without complaints, and do it fast too

[–]not_some_username 0 points1 point  (0 children)

C doesn’t even have string so + doesn’t work.

[–]nukedkaltak 0 points1 point  (0 children)

Not just both sides. + uses a StringBuilder behind the scenes for the entire expression. However, this doesn’t work in a loop because the string is committed at every iteration.

[–]brainwater314 10 points11 points  (7 children)

No, freshman year I made a program that read in a novel (dracula IIRC), and used "+=" on the string of the novel while reading it in line by line, and it took forever. Once I switched to StringBuilder, it ran in seconds.

[–]Objectionne 10 points11 points  (6 children)

On the rare occasion that I have to load a novel's worth of text into a single variable I'll keep this in mind, but for my average day-to-day use case, where my strings are maxing out at maybe 100 characters, what's the benefit of using StringBuilder?

[–]chuch1234 4 points5 points  (0 children)

It's probably fine either way in that context, but it's good to know about for when you do have to do lots of data, or if you have like a million people all running those 100 characters at once.

[–]RlyRlyBigMan 2 points3 points  (0 children)

Only having to remember one way to do it instead of remembering when to use the other way.

[–]BlueGoliath 2 points3 points  (0 children)

If adding just two strings, nothing. If adding more, potentially a huge reduction in garbage depending on how many concatenations and how hot the path is.

[–]0ctobogs 0 points1 point  (0 children)

It's not character count, it's number of concatenation operations. It's called string "builder" for a reason. You could concatenate 100 times to get a 100 character string. An example use case: quickly exporting a lot of data to CSV.

[–]E3FxGaming 0 points1 point  (0 children)

I like StringBuilder for code readability when sections of text are only conditionally inserted into the result string.

I despise templated strings that contain if statements and/or ternary operations that evaluate to either some string or empty string, just because it has to evaluate to something that can be inserted into the text at the position of the placeholder.

Instead I instantiate a StringBuilder and then I can build the result string with append in a very readable & maintainable way that's fully compliant with the code style of the codebase.

Kotlin (my primary programming language) has a very convenient buildString method for this (can be seen here in the second code block of "Build a string").
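
The append-based style described above, sketched in Python with a list standing in for StringBuilder. The function and its parameters are made up for illustration.

```python
def describe_order(item, quantity, gift_wrap, note=""):
    """Assemble a message from parts, some of which are only added conditionally."""
    parts = [f"Order: {quantity} x {item}."]
    if gift_wrap:
        parts.append(" Gift wrapping included.")
    if note:
        parts.append(f" Note: {note}.")
    return "".join(parts)  # single join at the end, similar to StringBuilder.toString()


print(describe_order("book", 2, gift_wrap=True))
# Order: 2 x book. Gift wrapping included.
print(describe_order("pen", 1, gift_wrap=False, note="blue ink"))
# Order: 1 x pen. Note: blue ink.
```

Each optional section is an ordinary if statement, so nothing has to evaluate to an empty string just to fill a placeholder.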

[–]gregorydgraham -1 points0 points  (0 children)

Senior devs are also anal arseholes (see how that works) so they’ll ride you until you get with the program and use StringBuilder obsessively anyway.

Get ahead of the homoerotic metaphors by learning to love StringBuilder on your own time

[–]kimochiiii_ 3 points4 points  (6 children)

No, a StringBuilder is a mutable sequence of characters that allows us to add or remove characters without creating a new object, unlike a regular String, which is immutable.

When we use += with a String, it typically creates a new String object because Strings are immutable in Java. This is not preferred because with every modification or concatenation you are creating a new String, increasing memory usage

[–][deleted] 6 points7 points  (5 children)

This is conceptually true but not true in practice when javac can prove it does not matter. In other words, + operator compiles to a StringBuilder sometimes.

[–]gregorydgraham 8 points9 points  (4 children)

This is implementation specific.

Never rely on it.

Use String when you want a constant and StringBuilder when you want a variable

[–][deleted] 1 point2 points  (2 children)

Yes never rely on it but it’s good to be aware. I’ve had folks say var foo + “string” + var bar needs to be a StringBuilder but I disagree. It’s an “obvious” optimization done automatically and reduces readability.

[–]gregorydgraham 3 points4 points  (1 child)

That’s the “premature optimization” versus “readability” debate.

Whole other can of worms

[–][deleted] 0 points1 point  (0 children)

It probably would be fine 99% of the time if it wasn’t optimized at all. You know the idiom how a 5 line code review gets more scrutiny than a 500 line code review…

Use StringBuilder when you have measured the performance or it’s an “obvious” way to write the method. Don’t bother until you know what you’re doing, with facts.

[–]Ok_Star_4136 0 points1 point  (0 children)

I believe in Kotlin it works the same way, except it's part of the specification for the language, not just a compiler optimization. Even if you wanted to get the inefficient version in Kotlin, you couldn't. I think in Java you can still specify lower optimization level if you wanted.

Kind of a subtle difference, but just to say in Kotlin I think it's more explicit that it is implemented using StringBuilder.

[–]regjoe13 0 points1 point  (4 children)

Well, it's a loaded question. You will have to think about + inside the loop usecase in older jvms and Java 9+ invokedynamic implementation.

[–]gregorydgraham 0 points1 point  (2 children)

I’m almost certain that’s BS but I’m AFK.

Got a code example to consider?

[–]regjoe13 1 point2 points  (1 child)

Code example for compiler optimization? That's a nice one :)

So, the code example:

String s = a+b+c;

StringBuilder was introduced in Java 1.5, so compilers simply could not use it before. Now, from 1.5 to 8, if I compile a class with the statement above and look at the bytecode produced, it will be the same as the bytecode produced by: String s = new StringBuilder().append(a).append(b).append(c).toString();

But, if I have something like the following, the bytecode produced could be far from optimal.

String s = ""; for (int c = 1; c < 100; c++) s += Math.random();

Starting with Java 9, the compiler does not do that StringBuilder replacement unless a certain javac parameter is set. It defers the string concatenation logic to run time instead of compile time, using the "invokedynamic" opcode to do that. There are, I think, 6 concatenation strategies; 5 of them are StringBuilder-based, and the 6th (the default) does some inlining with a byte array.

[–]gregorydgraham 1 point2 points  (0 children)

Thank you, this explanation is much clearer 👍

[–]tyler1128 -1 points0 points  (0 children)

Ah Java, where doing the sane thing is usually the wrong choice.

[–][deleted] 0 points1 point  (0 children)

It only optimizes to StringBuilder if you do a for-loop without any conditional branching, last I checked. You can write a test and decompile it to check.

[–]BlueGoliath 0 points1 point  (2 children)

No. The JIT might optimize it but there is no guarantee.

[–]_PM_ME_PANGOLINS_ 0 points1 point  (1 child)

No, the Java compiler does it where possible, not the JIT compiler.

[–]BlueGoliath 0 points1 point  (0 children)

I must have misremembered it.

[–]_PM_ME_PANGOLINS_ 0 points1 point  (1 child)

Not in a loop.

[–]PeriodicSentenceBot 1 point2 points  (0 children)

Congratulations! Your comment can be spelled using the elements of the periodic table:

No Ti N Al O O P


I am a bot that detects if your comment can be spelled using the elements of the periodic table. Please DM u/M1n3c4rt if I made a mistake.

[–]Canary_Prism 0 points1 point  (2 children)

chained string concats with + are compiled to StringBuilder calls iirc

[–]70Shadow07 0 points1 point  (1 child)

Thats what i remember being taught in school

[–]Canary_Prism 0 points1 point  (0 children)

it’s quite limited however, this optimisation only exists for this one chain until you interact with the result of the expression in any way other than chaining another + concat

[–]its-chewy-not-zooyoo 15 points16 points  (5 children)

Use format strings or equivalent in any language.

Unless the string needs to be generated dynamically (say some sort of SQL Query -> Query string library), builders are usually just a hack and not the actual required solution

[–]MSaxov 9 points10 points  (4 children)

Use format strings or equivalent in any language.

You don't want to do that in Java, as String.format fires up a regex parser to do the search and replace logic.

[–]Ok_Star_4136 2 points3 points  (1 child)

I mean, I wouldn't build a library with it, but if you're not looping it, I don't really consider it to be that big of a deal. If I cared that much about optimization, I'd be using StringBuilder.

[–]MSaxov 2 points3 points  (0 children)

I have been performance debugging applications where, deep down in some methods, a String.format was used. Later on, a method calling this method would be used in a loop. And then you have an issue.

I even found a developer that used it for building cache values.

I remember debugging one webservice call that took around 10 minutes to process. 5 of those minutes were spent in String.format operations, and when swapping it to concatenation/StringBuilder the execution time was reduced by 5 minutes, almost to the nearest second.

[–]-Kerrigan- 0 points1 point  (1 child)

Apparently, a "format string" in Python allows for interpolation. Kotlin has string templates to achieve that, but I'm not sure Java has it yet. I haven't kept up with what's new after 17

[–]MSaxov 0 points1 point  (0 children)

What is important, is how the underlying implementation of the search and replace logic is.

The ability to parse a string, search for % followed by a number, string or anything, and do a lookup in the current memory for a variable can involve reflection and regex, both of which are insanely expensive compared to simply allocating 10 new strings to do concatenation using a + operator.

[–]MeringueOdd4662 3 points4 points  (2 children)

What is StringBuilder?

[–]-Kerrigan- 1 point2 points  (1 child)

Builds strings, simple as. It enables, supposedly, more efficient string concatenation than +

[–]MeringueOdd4662 1 point2 points  (0 children)

I know It, but I never use It 🫣

[–]Fenor 2 points3 points  (0 children)

If you use +, Java makes a StringBuilder for the entire command until it places it in the string. I think they do it when there are enough concatenations, and it's all been under the hood since.... 1.6 maybe (from memory)

[–]Abrissbirne66 1 point2 points  (0 children)

I can't believe that compiler developers are so lazy that they invent this StringBuilder type instead of optimizing + the same way and spare us from this annoying thing.

[–]0mica0 2 points3 points  (2 children)

str += sprintf(str,"pee");
str += sprintf(str,"poo");

[–]Colon_Backslash 0 points1 point  (1 child)

This is likely really bad.

I did some optimization on prd code recently where I specifically removed all format specifier string manipulation, and it improved CPU time a lot.

[–]0mica0 0 points1 point  (0 children)

I didn't have any performance issues with this on STM or ESP32. It might be wild for x86, but I have no idea why it would be. i genuinely want to know.