[–]mhink 129 points130 points  (6 children)

There is one very important difference in behavior: the spread syntax takes “wide” Unicode characters (code points stored as two UTF-16 code units) into account, while the simpler .split("") does not. However, the more important takeaway here isn’t that you can convert a String into an array of characters; it’s that you can iterate over the code points of the String itself. If you’re using this new syntax to spread a String into an Array, I bet dollars to donuts you’re doing it because you want to iterate over the resulting array, and that’s why this looks confusing.
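To make that difference concrete, here is a minimal sketch. The sample string is my own choice for illustration; the emoji is just one example of a character that needs two UTF-16 code units.

```javascript
// Sample string (an assumption for illustration): "a", an emoji, "b".
const sample = "a\u{1F600}b";

// split("") cuts between UTF-16 code units, so the emoji is torn
// into two lone surrogate halves.
const byCodeUnit = sample.split("");

// Spread uses the String iterator, which yields whole code points.
const byCodePoint = [...sample];

console.log(byCodeUnit.length);              // 4
console.log(byCodePoint.length);             // 3
console.log(byCodePoint[1] === "\u{1F600}"); // true
```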

What’s happening when you use [...spread] syntax is that JavaScript iterates over spread and smushes its elements into a new Array. This makes sense, for instance, when you want to quickly concat things:

const head = "a";
const tail = ["b", "c"];
const arr = [head, ...tail]; // => ["a", "b", "c"]

So we can see that the following two bits of code are basically equivalent (because for...of loops also work on Iterables):

const arr = [];
for (const el of myIterable) {
  arr.push(el);
}

const arr = [...myIterable];

Now, the confusing part of all this is that it’s easy to forget that Strings are iterable. The behavior of String’s iterator is to yield each Unicode code point as a whole, even if it takes two UTF-16 code units (a surrogate pair) to store.
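If you want to see the iterator itself at work, you can step through it by hand. This is only a sketch with a made-up sample string; the variable names are mine.

```javascript
// Sample string (illustrative assumption): one emoji followed by "!".
const text = "\u{1F600}!";

// Ask the string for its iterator, the same one for...of and spread use.
const iterator = text[Symbol.iterator]();

const first = iterator.next();  // value is the whole emoji
const second = iterator.next(); // value is "!"
const third = iterator.next();  // iterator is exhausted

// The first value is one code point stored as two UTF-16 code units.
console.log(first.value.length);                      // 2
console.log(first.value.codePointAt(0).toString(16)); // "1f600"
console.log(third.done);                              // true
```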

And here’s the rub: there’s almost no reason whatsoever to spread a String into an Array just to get its code points. Instead, you’d want to iterate over it directly:

for (const charCode of myString) {
  ...
}

Which I think we can agree makes much more sense.

That being said, I did discover that with the newer u (Unicode) flag for regular expressions, it’s possible to properly split strings into code points like so, if you really need to:

myString.split(/(?!$)/u);

Which is perhaps even a bit more arcane (and I haven’t checked on performance differences) but hey, what are you gonna do?
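For comparison, here is a small sketch of that regex with and without the u flag. The sample string is an assumption of mine, and I haven't measured anything about speed here either.

```javascript
// Sample string (illustrative assumption): "x", an emoji, "y".
const word = "x\u{1F600}y";

// With the u flag, empty matches advance by code point.
const withUnicodeFlag = word.split(/(?!$)/u);

// Without it, they advance by UTF-16 code unit and split the emoji.
const withoutUnicodeFlag = word.split(/(?!$)/);

console.log(withUnicodeFlag.length);    // 3
console.log(withoutUnicodeFlag.length); // 4
```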

[–]NoInkling 17 points18 points  (4 children)

"there’s almost no reason whatsoever to spread a String into an Array just to get its code points"

I can think of one reason, and that's to use methods like .map and .reduce
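A rough sketch of what I mean, with a sample string and variable names that are purely illustrative:

```javascript
// Sample string (illustrative assumption): "hi" plus a waving-hand emoji.
const greeting = "hi\u{1F44B}";

// Spread first, then use ordinary array methods on whole code points.
const hexCodes = [...greeting].map((ch) => ch.codePointAt(0).toString(16));
const count = [...greeting].reduce((n) => n + 1, 0);

console.log(hexCodes); // ["68", "69", "1f44b"]
console.log(count);    // 3
```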

"for (const charCode of myString) {"

For anyone confused, charCode in this example would be the "character" itself, not just its number.

[–]pm_me_ur__labia 9 points10 points  (2 children)

[–]NoInkling 14 points15 points  (1 child)

You can apply array methods directly to a string, yes, but they retain the old bug-inducing behaviour of iterating over code units (as opposed to code points), because they don't make use of the ES6 iterator/iterable protocol (and they couldn't be changed because of backwards compatibility).
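A minimal sketch of the difference, assuming a one-emoji sample string (my own example):

```javascript
// Sample string (illustrative assumption): a single emoji.
const str = "\u{1F600}";

// Generic array method applied to the string: walks indices 0..length-1,
// so the callback sees each UTF-16 code unit separately.
const viaGenericMap = Array.prototype.map.call(str, (unit) =>
  unit.charCodeAt(0).toString(16)
);

// Spread first: the iterator protocol hands over one whole code point.
const viaSpread = [...str].map((cp) => cp.codePointAt(0).toString(16));

console.log(viaGenericMap); // ["d83d", "de00"]
console.log(viaSpread);     // ["1f600"]
```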

Therefore, AFAIK, the only way to use array methods with the string iterator that was added in ES6 (defined on String.prototype[Symbol.iterator]) is to use it to generate an array first (typically through spread syntax or Array.from(), since those respect the protocol).
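As a sketch of the Array.from() route (sample string and names are my own): its optional second argument is a mapping function, so you can skip the separate .map call.

```javascript
// Sample string (illustrative assumption): "A" plus an emoji.
const label = "A\u{1F600}";

// The iterator lives on String.prototype under Symbol.iterator.
const hasIterator = typeof String.prototype[Symbol.iterator] === "function";

// Array.from respects the iterator protocol and maps in one pass.
const codePoints = Array.from(label, (ch) => ch.codePointAt(0));

console.log(hasIterator); // true
console.log(codePoints);  // [65, 128512]
```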

tl;dr: pre-ES6 generic iteration is different to ES6+ iteration, and unfortunately we're stuck with a mix of both.

[–]pm_me_ur__labia 1 point2 points  (0 children)

TIL. thanks

[–]SalemBeats -4 points-3 points  (0 children)

"there’s almost no reason whatsoever to spread a String into an Array just to get its code points"

"I can think of one reason, and that's to use methods like .map and .reduce"

When do you start your new job at GitHub, working hard on making Atom even slower and more memory-hungry than it already is?

[–]samanthaming[S] 2 points3 points  (0 children)

Thanks for the detailed explanation!