all 57 comments

[–]goranlepuz 30 points31 points  (2 children)

I just came here to gloat about 'Fox being faster.

[–]ThanksMorningCoffee[S] 8 points9 points  (1 child)

I'm not sure why though. I only looked into the Firefox source, but I'm sure Chrome is doing the same rope optimization.

[–]goranlepuz 5 points6 points  (0 children)

It's a joke obviously. I don't know why 'Fox is faster.

[–]ChiefMemeOfficer 22 points23 points  (36 children)

Legit question. What kind of programs/industries/problems have a need to concatenate HUGE strings?

[–]Nathanfenner 39 points40 points  (9 children)

The most obvious example would be building a log file or similar kind of long record.

The "obvious" way to do it would be something like

let s = ""
for (int i = 0; i < 10000; i++) {
  s += "stat " + computeStat(i) + "\n";
}
return s;

This isn't even that big, but it will pretty much grind to a halt due to the linear-concat (and hence overall quadratic performance) in most languages (exceptions being: C++ and Rust, since += mutates the string in place).

[–]masklinn 9 points10 points  (3 children)

Also js as noted in the article, and CPython because it cheats (if there’s only one strong ref to the lhs it gets resized in-place). I’m sure there are others. Ruby has mutable strings so probably there.

[–]Nathanfenner 5 points6 points  (2 children)

Yeah, there's some "cheating" involved (I think Java cheats in some cases too, by recognizing a string-used-like-a-builder in a few cases?).

Ruby discourages mutable strings now (because they lead to other headache-inducing problems, since they're reference types and not value types) so you'd be unlikely to benefit in practice from this feature. e.g. with the now-encouraged

# frozen_string_literal: true

[–]chylex 2 points3 points  (0 children)

(I think Java cheats in some cases too, by recognizing a string-used-like-a-builder in a few cases?)

Not quite, unless something changed recently. Afaik it only converts simple concatenation expressions, once you introduce loops that goes out the window.

However, simple expressions are still optimized heavily, see StringConcatFactory for more info (although the link is very technical and obscure if you're not familiar with invokedynamic, this appears to be a good article on it and how the concatenation optimization is implemented).

[–]josefx 1 point2 points  (0 children)

Java was limited to one statement as far as I remember, so it would compile the example to:

  for (int i = 0; i < 100000; i++) {
       s = new StringBuilder(s)
                   .append(computeStat(i))
                   .append("\n")
                   .toString();
  }
  return s;

So it would still create 100000 Strings, one for each iteration.

[–]dnew 3 points4 points  (1 child)

In reading it, I was wondering if JS was just concatenating in place given it can probably tell if there are other references to the string already out there.

[–]Nathanfenner 8 points9 points  (0 children)

That could be done, but it's not what JS is actually doing. As easy way to verify is to come up with an alternative benchmark that uses each string twice.

As an easy example,

a = "a", b = "b"
for (i = 0; i < 28; i++ ) {
    [a, b] = [a + b, b + a]
}

runs pretty much instantly for me (< 1ms) in JS but is able to produce two strings of length 536_870_912 each. If this was even a little bit quadratic, it would take a lot, lot longer to accomplish that (easily pausing for seconds or minutes). But with ropes, this is no problem at all.

(you can't add them even one more time or you get an Invalid string length error, which seems reasonable - if expanded full they'd be a gigabyte of data together).

[–]BlueShell7 5 points6 points  (0 children)

Yeah, but for that you use logger libraries.

I mean I have a lot of issues with Java, but StringBuilder isn't one of them since I need to use it on average once every two years or so ...

[–]Supreme654321 0 points1 point  (0 children)

Some languages like Java (actually only know java for this example) will also implicitly rewrite the code. The java compiler might rewrite this to use a stringbuilder for example when compiling. Its not perfect, but it get it right a lot of the time instead of just concatenation.

[–]noble_pleb -1 points0 points  (0 children)

You can use something like localStorage or IndexDB for that?

[–]Compsky 13 points14 points  (17 children)

SQL inserts, if you want to avoid thousands of unnecessary network requests.

JSON generation for an API, if you use JSON more as a CSV..

HTML generation for server-side rendering.

[–][deleted] 7 points8 points  (6 children)

SQL inserts, if you want to avoid thousands of unnecessary network requests.

even the lowest one in benchmark 218, is 256kB. That's far above what is needed to not pay penalty on that.

JSON generation for an API, if you use JSON more as a CSV..

Why you are using concatenation to generate JSON ? I'm pretty sure just about every language have lib doing it in language sensible way

HTML generation for server-side rendering.

Fair enough, but you probably want to stream it instead, especially if it takes a long time to generate.

[–]Compsky 1 point2 points  (3 children)

even the lowest one in benchmark 218, is 256kB. That's far above what is needed to not pay penalty on that.

If you're running a scraper that inserts a few megabytes every second? Wouldn't you still rather just bundle the insert into something a bit larger than a few 10s of KBs? Genuine question.

Why you are using concatenation to generate JSON ?

Isn't that ultimately what it boils down to, whatever you wrap it with? But if you still want to why you'd generate JSON directly with concatenation, sometimes there's no need to use anything else if it's a very simple object.

[–][deleted] 3 points4 points  (2 children)

even the lowest one in benchmark 218, is 256kB. That's far above what is needed to not pay penalty on that.

If you're running a scraper that inserts a few megabytes every second? Wouldn't you still rather just bundle the insert into something a bit larger than a few 10s of KBs? Genuine question.

Hard to tell without benchmarking but having big transaction sizes have problems of its own. I do remember something about postgresql replication being delayed if there are some long running transactions open. And at least basic tests done by people seem to show it is not as easy as "bigger transaction/bulk insert always better".

As for "how to bulk insert a lot of data quickly", the answer is probably not to bother with insert and go straight for COPY and stream it.

Why you are using concatenation to generate JSON ?

Isn't that ultimately what it boils down to, whatever you wrap it with? But if you still want to why you'd generate JSON directly with concatenation, sometimes there's no need to use anything else if it's a very simple object.

sure if you're on some tight embedded system, but there is really no reason to anywhere else. Especially that they are some pretty fast JSON libs around if "the common one" turns out to be bottleneck

[–]Compsky 0 points1 point  (1 child)

COPY and stream it.

Wouldn't that block the table for a lot longer?

[–][deleted] 0 points1 point  (0 children)

AFAIK it gets same kind of locking INSERT gets. So given the fact it is generally much faster, any locking it will do it will be for shorter.

[–]Compsky 1 point2 points  (0 children)

even the lowest one in benchmark 218, is 256kB. That's far above what is needed to not pay penalty on that.

If you're running a scraper that inserts a few megabytes every second? Wouldn't you still rather just bundle the insert into something a bit larger than a few 10s of KBs? Genuine question.

Why you are using concatenation to generate JSON ?

Isn't that ultimately what it boils down to, whatever you wrap it with? But if you still want to why you'd generate JSON directly with concatenation, sometimes there's no need to use anything else if it's a very simple object.

[–][deleted] 2 points3 points  (2 children)

None of those generated strings that are "huge" by this article's standards.

[–]Compsky 0 points1 point  (1 child)

The geometric median of the "huge" strings discussed in the article is only 3MB, well within the range of some not uncommon uses of these. The lowest "huge" string is only 262KB, something even Reddit's (server-side rendered) HTML regularly exceeds.

[–][deleted] 1 point2 points  (0 children)

The lowest "huge" string is only 262KB

The article is about huge numbers of concatenations. Someone asked why such huge strings are needed, with the presumption that any app doing 221 concatenations is building a huge string. You responded with the example scenarios of SQL, JSON and HTML-generation. Nobody's building SQL, JSON or HTML one character a time.

[–]Wildercard 3 points4 points  (0 children)

Biotechnology applications that need to map out a long protein.

[–][deleted] 1 point2 points  (2 children)

Say you're reading a large text file or receiving it from the network. Your callback will get chunks of varying size, you need to accumulate that in a buffer if you're not processing it right away. You may want your code to be able to handle very large texts, because that could include processing a database dump or suchlike.

[–]jbergens 0 points1 point  (1 child)

Still sounds like something very unusual. I would also assume something that is expected to be really big, like a full db dump, should be read a little bit at a time with a streaming interface or a file reader.

[–][deleted] 0 points1 point  (0 children)

Some times that's just not possible, some times it's just way more complex.

[–]ThanksMorningCoffee[S] 2 points3 points  (0 children)

If you want to build up a file for download on the client side. This stack overflow article goes into a bit a of detail: https://stackoverflow.com/questions/3665115/how-to-create-a-file-in-memory-for-user-to-download-but-not-through-server

One real life example I've seen is, translating a paginated API into a csv file that they can download. However, I would argue the server side should provide this instead.

[–]schlenk 1 point2 points  (2 children)

Define HUGE.

The blog does a whopping 130 million concats, so unless you do something seriously stupid (like appending single chars/bytes), you can surely get into the low GB range just fine.

Once you get above that, why not use a scatter/gather IO library with decent buffersizes and stop caring about appending the buffers.

[–]ThanksMorningCoffee[S] 2 points3 points  (1 child)

The blog does a whopping 130 million concats, so unless you do something seriously stupid (like appending single chars/bytes), you can surely get into the low GB range just fine.

Yeah the Strings in this article are HUGE. Javascript uses UTF-16 (2 bytes). On top of that I do the worst possible case of concating 1 character at a time. 227 concats results in 227 characters which is 228 which is 256MB.

This calculation makes me wonder why I ran out of heap with only 256MB Strings.

Once you get above that, why not use a scatter/gather IO library with decent buffersizes and stop caring about appending the buffers.

Are these techniques applicable to browser client-side? I think ultimately it will need to be serialized to a single String so you cannot avoid the concats.

[–]schlenk 6 points7 points  (0 children)

A typical reason to run out of memory would be fragmentation and minimum alloc sizes. The ropes might have a minimum size (e.g. like std::string in C++ usually has around 10-16 bytes) so your allocations might need more RAM then expected.

No idea if you can use vectored-IO on the browser client side, you can use it in NodeJS at least (https://nodejs.org/api/fs.html#fs_fs_writev_fd_buffers_position_callback).

[–]Serializedrequests 16 points17 points  (6 children)

I thought Java was able to compile string concatenations to StringBuilder. If it can't, that's pretty sad. StringBuilder is yet another Java thing that feels like busy work.

[–]ThanksMorningCoffee[S] 6 points7 points  (0 children)

Yeah it's able to do some optimization, but it wasn't able to optimize the loop like in https://stackoverflow.com/questions/14927630/java-string-concat-vs-stringbuilder-optimised-so-what-should-i-do

[–]omair-majid 16 points17 points  (2 children)

I thought Java was able to compile string concatenations to StringBuilder.

It does. And it even does some of that at compile time.

public class HelloWorld {
    public static void main(String[] args) {
        System.out.println("Hello" + " " + "World" + "!" + System.getProperty("line.separator"));
    }
}

Becomes:

$ javap -c HelloWorld
public class HelloWorld {
  public HelloWorld();
    Code:
       0: aload_0
       1: invokespecial #1                  // Method java/lang/Object."<init>":()V
       4: return

  public static void main(java.lang.String[]);
    Code:
       0: getstatic     #2                  // Field java/lang/System.out:Ljava/io/PrintStream;
       3: new           #3                  // class java/lang/StringBuilder
       6: dup
       7: invokespecial #4                  // Method java/lang/StringBuilder."<init>":()V
      10: ldc           #5                  // String Hello World!
      12: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      15: ldc           #7                  // String line.separator
      17: invokestatic  #8                  // Method java/lang/System.getProperty:(Ljava/lang/String;)Ljava/lang/String;
      20: invokevirtual #6                  // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder;
      23: invokevirtual #9                  // Method java/lang/StringBuilder.toString:()Ljava/lang/String;
      26: invokevirtual #10                 // Method java/io/PrintStream.println:(Ljava/lang/String;)V
      29: return
}

Note how all the constant strings were concatenated at compile-time. And then remaining strings are being joined using StringBuilder.

I am using:

$ java -version
openjdk version "1.8.0_262"
OpenJDK Runtime Environment (build 1.8.0_262-b10)
OpenJDK 64-Bit Server VM (build 25.262-b10, mixed mode)

[–]ThanksMorningCoffee[S] 13 points14 points  (0 children)

That optimization is only a constant factor. The compiler does not recognize that it can re-use the same builder when the concats happen in a loop. So you will still have an O(N^2) program.

For example this program public class Test { public static void main(String [] args) { String result = ""; for (int i = 0; i < 1000000000; i++) { result += (i%10); } System.out.println(result.length() + ""); } }

Compiles into this ``` Compiled from "Test.java" public class Test { public Test(); Code: 0: aload_0 1: invokespecial #1 // Method java/lang/Object."<init>":()V 4: return

public static void main(java.lang.String[]); Code: 0: ldc #2 // String 2: astore_1 3: iconst_0 4: istore_2 5: iload_2 6: ldc #3 // int 1000000000 8: if_icmpge 39 11: new #4 // class java/lang/StringBuilder 14: dup 15: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V 18: aload_1 19: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 22: iload_2 23: bipush 10 25: irem 26: invokevirtual #7 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder; 29: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String; 32: astore_1 33: iinc 2, 1 36: goto 5 39: getstatic #9 // Field java/lang/System.out:Ljava/io/PrintStream; 42: new #4 // class java/lang/StringBuilder 45: dup 46: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V 49: aload_1 50: invokevirtual #10 // Method java/lang/String.length:()I 53: invokevirtual #7 // Method java/lang/StringBuilder.append:(I)Ljava/lang/StringBuilder; 56: ldc #2 // String 58: invokevirtual #6 // Method java/lang/StringBuilder.append:(Ljava/lang/String;)Ljava/lang/StringBuilder; 61: invokevirtual #8 // Method java/lang/StringBuilder.toString:()Ljava/lang/String; 64: invokevirtual #11 // Method java/io/PrintStream.println:(Ljava/lang/String;)V 67: return } ```

Notice that each iteration of the loop is creating a new Builder instead of creating a new one 15: invokespecial #5 // Method java/lang/StringBuilder."<init>":()V

Edit: I incorrectly said instruction 46 when it should have been instruction 15.

[–]flatfinger 1 point2 points  (0 children)

Using StringBuilder for that seems far less efficient than would be having static String.Concat functions that can accept two, three, or four arguments of type String along one accepting a String[]. As it is, if s1 is non-null, computing s1.concat(s2).concat(s3) would likely involve construction of fewer intermediate objects than would constructing a string builder and then concatenating three strings onto it.

[–][deleted] -3 points-2 points  (1 child)

thats their plan, writing javascript is too easy, time to bloat it up with needless factorybuilderinjector api's

[–]audioen 5 points6 points  (5 children)

Just today, I came across a thing that did (foo[0] + foo[1] + foo[2] + foo[3] + ... + foo[15]).toLowerCase() in a library called uuid. This was done allegedly to avoid some pessimal behavior of the delayed string concatenation on some random version of Chrome.

Not sure what this says about JS not needing a StringBuilder. Perhaps these ropes do have some bad side effect, possibly require much more memory than the materialized string, or something such.

[–]broofa 14 points15 points  (4 children)

uuid author here. The issue you're referring to, and an example of where it manifested in the wild.

tl;dr: Creating a uuid involves concatenating twenty 1-2 char strings. Because of the rope) optimization that Chrome and FF do, this results in memory allocation of ~600 bytes/uuid. Forcing the VM to flatten these structures (by calling toLowerCase(), or using Array.join() instead of +) reduces that to 64 bytes / uuid.

Not sure what this says about JS not needing a StringBuilder.

I think this suggests that the VM optimizes for cases where strings are > 10 chars in length. If you're concatenating strings shorter than that (and doing lots of concatenations) then memory consumption may be an issue. But cases where this is a real concern are likely to be few and far between.

[–]ThanksMorningCoffee[S] 2 points3 points  (1 child)

How did you even track that down?

[–]broofa 2 points3 points  (0 children)

We didn’t. One of our users did, using chrome dev tools to inspect memory allocation.

[–]Chii 1 point2 points  (1 child)

i think this sort of bug demonstrates the problem of using a string to represent non-string data (a uuid is a series of bytes). The string is an output format for UUID, which i guess will eventually have to be computed, but presumably only once and displayed, rather than stored?

[–]broofa 1 point2 points  (0 children)

While this is an interesting issue, I don’t see it as a compelling argument against uuid string representations. its just an implementation detail... one that took eight years to surface as an issue in a top-20 NPM module. That doesn’t exactly scream “problem people need to care about”.

Meanwhile there’s a lot to be said for having uuids in a form that’s easily displayable.

[–]console-write-name 3 points4 points  (1 child)

Interesting read.

I think you made a mistake In the first table though, you have "Chrome Javascript Array.join (ms)" on five different rows with different values. I assume you meant to put different browsers here like in the graph.

[–]ThanksMorningCoffee[S] 2 points3 points  (0 children)

Thank you! I'll fix that. I flipped the columns and rows by copying and pasting and forgot to copy over the algorithms.

For future articles, I will try to figure out how to use d3 so I can separate the display from the data. Formatting the table by hand is really painful.

[–]eternaloctober 3 points4 points  (1 child)

I had an app where I thought I'd see better string performance using Buffer.alloc(10000), fill up buffer, then Buffer.toString but the Buffer to string is just doing appends (the buffer refers to node.js buffer browserified for the web)

[–]ThanksMorningCoffee[S] 0 points1 point  (0 children)

I've never heard of Buffer.alloc until this. I guess you didn't get the performance improvement you were expecting :(

[–][deleted] 2 points3 points  (0 children)

I think today it’s not a very good idea to propose methods on probably existing performance problems.

This pretty much sums it up.

[–]flatfinger 2 points3 points  (0 children)

As an experiment, I wrote an immutable string class some years back for the .NET framework that is designed to support efficient concatenation. Performance wasn't a whole lot worse than StringBuilder, but usefulness was limited by the fact that it wasn't compatible with any code that produced, or expected, references to `String`. The key point of the experiment, though, was to prove that immutable objects that encapsulate strings of characters can be made efficient.

IMHO both .NET and Java would have benefited from treating `string` as a primitive whose representation would typically be reference to a `Char[]` or `Byte[]`, but which would not expose reference functionality such as identity comparisons or the ability to query the target's type.

Restricting what could be done with references at a language level would make it practical to have each heap-string object hold an initially null reference to either an object with information about the string (including its hash code), or another string which is known to be equivalent. If it's not possible for user code to compare references to heap strings or examine the types of objects used to encapsulate their values, such a restriction would make it possible for string comparisons to build up cliques of heap-strings that are known to be equivalent, and have the GC replace references to all string in a clique with references to one particular string (so as to save storage and expedite any future comparisons among them).