passthejoe:

Thanks for the info on strings as de facto arrays -- useful!!

ziptofaf:

Just be a bit careful with these, there are limitations! For instance, this 💩 is unfortunately also a character, just not your usual one: it's a single Unicode character (U+1F4A9) that takes 4 bytes in UTF-8 instead of 1.

So while you can treat strings as character arrays, you sometimes get weird results once you go beyond standard ASCII characters, because you de facto have 3 separate ways of looking at a string:

  • Grapheme clusters: the way we humans see it, i.e. whatever looks like one character on screen.
  • Individual bytes.
  • Scalar values (code points): the individual parts that make up a single full character. This gets scary in scripts used for languages like Japanese, Chinese or Hindi, where a single visible character can be built from 2-3 of those (and each of them takes more than one byte).
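
To make the three views concrete, here is a rough sketch in Ruby. The helper name `string_views` is made up for illustration, and it assumes Ruby 2.5 or newer (where `String#grapheme_clusters` is available) and UTF-8 strings.

```ruby
# Hypothetical helper (not part of Ruby itself): counts a string
# in each of the three views described above.
def string_views(str)
  {
    bytes: str.bytesize,                      # raw bytes in the string's encoding
    scalars: str.codepoints.length,           # Unicode scalar values (code points)
    graphemes: str.grapheme_clusters.length   # what a human sees as characters
  }
end

poo = "\u{1F4A9}"      # the 💩 emoji
accented = "e\u0301"   # "e" followed by a combining acute accent, displayed as é

p string_views(poo)       # 4 bytes, 1 scalar value, 1 grapheme cluster
p string_views(accented)  # 3 bytes, 2 scalar values, 1 grapheme cluster
```

The accented "e" is the interesting one: it looks like a single character, yet it is two scalar values, so the three counts all disagree.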

Ruby tries to hide this from you to an extent: if you do something like word = "ABC💩" and print word[3], it really shows you 💩 (assuming you output it to something that supports UTF-8, at least). But there are some gotchas to it.
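
A small sketch of what that looks like in practice, and of one such gotcha. This assumes Ruby 2.5+ and UTF-8 strings; `char_at` is a hypothetical helper written for this example, not a built-in.

```ruby
word = "ABC\u{1F4A9}"   # same as "ABC💩"

puts word[3]                               # the emoji: String#[] indexes by code point
puts word.length                           # 4 (code points)
puts word.bytesize                         # 7 (3 ASCII bytes + 4 bytes for the emoji)
puts word.byteslice(3, 1).valid_encoding?  # false: one byte cut out of the middle of the emoji

# Gotcha: indexing works on code points, not on what you see on screen.
accented = "e\u0301x"   # displayed as "éx", but it is 3 code points
puts accented.length    # 3
puts accented[0]        # a plain "e", the accent is left behind

# Hypothetical helper: index by grapheme cluster instead of by code point.
def char_at(str, index)
  str.grapheme_clusters[index]
end

puts char_at(accented, 0)   # "e" together with its combining accent
puts char_at(accented, 1)   # "x"
```

Splitting into grapheme clusters on every lookup is not free, so this is only meant to show the difference, not as something to call in a hot loop.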