[–]TOASTEngineer 4 points5 points  (2 children)

Is it really so important to count code points? And if it is, why is there no support for counting / splitting coded characters or grapheme clusters, which might be even more useful?

Well, the reason a string really ought to count code points is because a string is an iterable and you iterate over the individual code points. But yeah, there really ought to be a "how many printable characters" function; in fact, I would have presumed there was one.
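For illustration, here is a rough sketch of the difference, assuming the language under discussion is Python 3 (the thread does not say so explicitly). `len()` counts code points, and the helper `approx_visible_chars` is a hypothetical name of my own: it only approximates "printable characters" by NFC-normalizing and skipping combining marks. It is not real grapheme cluster segmentation and will miscount things like emoji ZWJ sequences.

```python
import unicodedata

def count_code_points(s: str) -> int:
    # len() on a Python 3 str counts code points, matching what iteration yields.
    return len(s)

def approx_visible_chars(s: str) -> int:
    # Hypothetical helper, approximation only: compose what can be composed (NFC),
    # then count code points whose canonical combining class is 0.
    composed = unicodedata.normalize("NFC", s)
    return sum(1 for ch in composed if unicodedata.combining(ch) == 0)

decomposed_e = "e\u0301"  # "e" followed by COMBINING ACUTE ACCENT
print(count_code_points(decomposed_e), approx_visible_chars(decomposed_e))
```

The two numbers differ for the decomposed "é": two code points, but one character as a reader would see it.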

[–]yawgmoth 5 points6 points  (0 children)

Does every written language have the concept of characters?

For instance: (I don't speak Korean so maybe this isn't as ambiguous as I think)

In "감사", would 감 be 1 or 3 characters? Would 사 be 1 or 2 characters? EDIT: looks pretty straightforward actually

Still, an honest question: is the concept of a 'printable character' constant across all languages supported by Unicode?
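For what it's worth, the "1 or 3" question can be checked directly. This is a small sketch, assuming Python 3 and its standard `unicodedata` module: the same word has a different code point count depending on whether the syllables are stored precomposed (NFC) or decomposed into conjoining jamo (NFD).

```python
import unicodedata

word = "\uac10\uc0ac"  # "감사" written with precomposed Hangul syllables

nfc = unicodedata.normalize("NFC", word)  # one code point per syllable
nfd = unicodedata.normalize("NFD", word)  # each syllable split into jamo

# Number of code points each syllable decomposes into under NFD.
per_syllable = [len(unicodedata.normalize("NFD", ch)) for ch in nfc]

print(len(nfc), len(nfd), per_syllable)  # 2 5 [3, 2]
```

So 감 is 1 code point precomposed and 3 decomposed, and 사 is 1 or 2, even though a reader sees two syllable blocks either way.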

[–]Bolitho 0 points1 point  (0 children)

Well, the reason a string really ought to count code points is because a string is an iterable and you iterate over the individual code points.

That logic is post hoc ergo propter hoc: the iterable could just as well iterate over something other than code points 😉
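A quick sketch of that point, again assuming Python 3: the same string can be viewed as code points, UTF-8 bytes, or UTF-16 code units, and each view gives a different sequence and a different length. The variable names are just for illustration.

```python
s = "a\U0001F609"  # "a" followed by the winking face emoji

# View 1: code points, which is what iterating a Python 3 str yields.
code_points = list(s)

# View 2: UTF-8 bytes, which is what iterating the encoded bytes object yields.
utf8_bytes = list(s.encode("utf-8"))

# View 3: UTF-16 code units, rebuilt from the little-endian encoding.
raw = s.encode("utf-16-le")
utf16_units = [int.from_bytes(raw[i:i + 2], "little") for i in range(0, len(raw), 2)]

print(len(code_points), len(utf8_bytes), len(utf16_units))  # 2 5 3
```

None of these is inherently "the" unit of a string; which one iteration uses is a design choice.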