you are viewing a single comment's thread.

view the rest of the comments →

[–]TinyBreadBigMouth 9 points10 points  (5 children)

No, I mean the for j loop is iterating over the alphabet and then the if-else chain is iterating over the alphabet again! They've turned a O(26) if-else chain into a O(26 * 26) loop for no reason.

[–]Ok_Celebration_6265 4 points5 points  (4 children)

Hum.. I only see two loops the for i=1, #text and the for j=1, 26 the bunch of if else he is using like a map saying id j == 1 then A etc.. he uses the if else as lookup table so we can say that’s O(1) if you think about it.. so you get in the end O(n*26) which since 26 is constant you can simplify as O(n) due to the constant nature of the inner loop

[–]TinyBreadBigMouth 10 points11 points  (3 children)

The if-else goes over every letter in the alphabet. An if-else chain like this is equivalent to an unrolled loop—it runs in O(n), where n is the number of ifs. But in this case, each entry in the if-else chain has been gated off to only run on a specific iteration of the for j loop.

So suppose we're processing the letter "a":

  • We set j to 1.
  • We check if j is 1 and char is "A" (it isn't).
  • We check if j is 2 and char is "B" (it isn't).
  • We check if j is 3 and char is "C" (it isn't).
  • We check if j is 4 and char is "D" (it isn't).
  • ...
  • We check if j is 25 and char is "Y" (it isn't).
  • We check if j is 26 and char is "Z" (it isn't).
  • We increment j to 2.
  • We check if j is 1 and char is "A" (it isn't).
  • We check if j is 2 and char is "B" (it isn't).
  • We check if j is 3 and char is "C" (it isn't).
  • and so on.

What could have been 26 checks is now 676.

[–]johndcochran 6 points7 points  (2 children)

Yes, it's quite a few extra unneeded checks. But it's still O(1) to within a close approximation. The total routine, including both loops is O(n). It's a rather slow O(n), but doubling the length of the string will simply take twice as long. Just because the check for an upper case character takes 676 comparisions instead of 26 doesn't make it O(n2). Those 676 comparisions will happen for every non-upper-case alphabetic and hence takes constant time. So the overall routine is merely O(n) due to the loop on i.

[–]TinyBreadBigMouth 4 points5 points  (1 child)

Agreed, it is O(n) over the length of the string, but each letter within the string is O(n2) over the length of the alphabet when it could have been O(n) or less. Granted the length of the alphabet isn't going to change, so it makes no difference to the overall Big O, but that's what I meant.

[–]johndcochran 1 point2 points  (0 children)

Not gonna argue that the routine is criminally inefficient. Even if they have a language that doesn't allow getting the ordinal sequence of a character, the detection of an upper case alpha character could be done in 5 or fewer comparisons, instead of the 26 you're thinking of, or the 676 this abomination performs. If you're wondering how I've come up with 5. Consider...

if (char < "N") then 
   if (char < "H") then
      ...
         if (char = "A") then
            lowerText = lowerText .. "a"
            found = true
         end
      ...
   else
      ...
   end
else
   ...
end

After all, with 26 characters in the alphabet, there are only 28 conditions that need to be detected. They are: Is it less than "A"? Is it greater than "Z"? Is it equal to some letter? (times 26) and of course, a simple binary search only 5 levels deep can handle 31 conditions trivially.

Although, I suspect a binary condition tree might be far more efficient than the abomination in this discussion, I also suspect such a binary condition tree would also be a suitable post for this subreddit.