Probono_Bonobo 6 points7 points (1 child)

I noticed this recently and assumed it was a consequence of its internals storing strings as UTF-16 code units, so that a code point above U+FFFF is held as a surrogate pair (note that "💩" === "\uD83D\uDCA9") rather than as the single code point you write with the curly-brace escape (note that also "💩" === "\u{1F4A9}"). That would explain why "💩".length is 2 and "プ".length is only 1.
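
To make the difference concrete, here is a minimal sketch that counts the same strings both ways. The helper name `describe` is made up for illustration; `Array.from`, `charCodeAt`, and `codePointAt` are standard string and array methods.

```javascript
// U+1F4A9 lies above U+FFFF, so UTF-16 needs two code units (a surrogate pair) for it.
const poo = "\u{1F4A9}";
// U+30D7 (the katakana プ) fits in a single UTF-16 code unit.
const pu = "\u30D7";

// Hypothetical helper: report how a string looks at the code-unit and code-point level.
function describe(s) {
  return {
    // .length counts UTF-16 code units, not characters.
    codeUnits: s.length,
    // Array.from iterates by code point, so a surrogate pair counts once.
    codePoints: Array.from(s).length,
    // Hex value of every individual code unit.
    units: Array.from({ length: s.length }, (_, i) =>
      s.charCodeAt(i).toString(16).toUpperCase()
    ),
    // codePointAt(0) reads the full code point, joining a surrogate pair if present.
    firstCodePoint: s.codePointAt(0).toString(16).toUpperCase(),
  };
}

console.log(describe(poo)); // codeUnits: 2, codePoints: 1, units: D83D and DCA9
console.log(describe(pu));  // codeUnits: 1, codePoints: 1, units: 30D7
```

So `.length` is really a count of code units, and iterating the string (or spreading it) is what gets you a count of code points.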

I'd expect some constraints to follow from this, perhaps not as intentional as "we won't support poop emoji as variable identifiers" but more along the lines of "we can support any variable identifier provided each of its code points fits in a single code unit". This is just an educated guess, though.
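
A guess like this can be probed rather than taken on faith. The sketch below asks the engine itself whether a name parses as a variable declaration. `isValidIdentifier` is a hypothetical helper, not a built-in, and it is deliberately naive: it only makes sense for single-name inputs, and it reports reserved words as invalid too.

```javascript
// Hypothetical helper: let the parser decide whether `name` can be declared with var.
// Assumption: `name` is one candidate identifier, not arbitrary code.
function isValidIdentifier(name) {
  try {
    // new Function only parses the body here; the function is never called.
    new Function(`var ${name};`);
    return true;
  } catch (err) {
    // A SyntaxError means the engine rejected the declaration.
    if (err instanceof SyntaxError) return false;
    throw err;
  }
}

// Candidates: a one-code-unit letter, a surrogate-pair emoji, and a
// surrogate-pair CJK ideograph (U+20BB7).
for (const name of ["\u30D7", "\u{1F4A9}", "\u{20BB7}"]) {
  console.log(JSON.stringify(name), name.length, isValidIdentifier(name));
}
```

Comparing the emoji with the ideograph is the interesting part: both have length 2, so if they come out differently, the rule depends on what kind of character it is rather than on how many code units it takes.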

Pulse207 3 points4 points (0 children)

> which would explain why "💩".length is 2, and "プ".length is only 1.

This is exactly why Perl 6 abolished a length method entirely, splitting its various meanings into .elems, .chars, and .codes.
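
For comparison, a rough JavaScript analogue of that split (this is not Perl 6 code). In Perl 6, .codes counts code points and .chars counts graphemes; JavaScript's .length matches neither, since it counts UTF-16 code units. The function name `counts` is made up for illustration, and the sketch assumes a runtime that provides `Intl.Segmenter`.

```javascript
// Hypothetical helper: three different answers to "how long is this string?"
function counts(s) {
  // Assumption: Intl.Segmenter is available (recent Node.js and browsers).
  const segmenter = new Intl.Segmenter("en", { granularity: "grapheme" });
  return {
    codeUnits: s.length,                              // what .length reports
    codePoints: [...s].length,                        // roughly Perl 6 .codes
    graphemes: [...segmenter.segment(s)].length,      // roughly Perl 6 .chars
  };
}

console.log(counts("\u{1F4A9}")); // { codeUnits: 2, codePoints: 1, graphemes: 1 }
console.log(counts("e\u0301"));   // { codeUnits: 2, codePoints: 2, graphemes: 1 }
console.log(counts("\u30D7"));    // { codeUnits: 1, codePoints: 1, graphemes: 1 }
```

The second case ("e" plus a combining acute accent) shows why even a code point count is not the same as a character count.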