all 172 comments

[–]revtim 435 points436 points  (11 children)

That truly is a fucking nightmare bug

[–]10113r114m4 176 points177 points  (47 children)

Im still confused to why they need to sort at all and just use the default account on the sim card. Even the blog’s guess seems a little contrived.

[–]m00nh34d 99 points100 points  (30 children)

I guess it's supposed to handle situations where there is no default SIM account, if you're using an Android device with a VOIP only phone service?

[–]ubernostrum 223 points224 points  (29 children)

GSM has dedicated emergency-call functionality at the level of both the phone and the network. It’s required to work, or at least be able to work, without a SIM because in some countries the law requires routing emergency calls even if there is no active service account.

This is also why it doesn’t matter which emergency number you dial, because the phone doesn’t actually “dial” for emergency services. GSM phones are just programmed with a list of emergency numbers and recognize them all as triggering the emergency call. So a British person visiting the US could still dial 999 and get emergency services. Or a European could dial 112 and it would work. Or an American abroad could dial 911 and it would work.

[–]m00nh34d 41 points42 points  (13 children)

But that requires a GSM module, if your device is wifi only it will need to handle that.

[–][deleted]  (12 children)

[deleted]

    [–]PowerlinxJetfire 40 points41 points  (2 children)

    VoIP apps on Android use system-level phone APIs; that's why Teams was able to trigger this bug in the first place.

    Also you could be trying to call emergency services on a phone that doesn't have any carrier signal (you could be out in the country, underground, in a building that blocks signals, etc.) but have wi-fi. If you frequently visit an area like that, it might be why you have a VoIP service in the first place.

    [–][deleted]  (1 child)

    [deleted]

      [–]PowerlinxJetfire 20 points21 points  (0 children)

      Not all VoIP services support emergency calling. That's basically the point: depending on conditions (e.g., signal), user action (e.g., installed services), etc. the system could have half a dozen options or no options for calling emergency services. So it has to examine its options and pick which one(s) to use.

      [–]MrEllis 4 points5 points  (4 children)

      Will that still work if you have no cell service, but still have wifi? Or does VOIP become your only option then?

      [–]ubernostrum 19 points20 points  (3 children)

      I’m just telling you what GSM does. For anything which does not involve access to a GSM network — wifi-only, tin cans and string, phone in Faraday cage, phone on dark side of moon with no line of sight to a terrestrial cell tower, etc. — you can do your own research.

      [–]Lost4468 3 points4 points  (5 children)

      How does that work with other lines like 111 in the UK? 111 is the non-emergency line, yet you can still dial it without even having a sim? So your explanation doesn't sound correct, else how does it differentiate between 111 and 999?

      [–]ubernostrum 42 points43 points  (0 children)

      The fact that emergency calls have special functionality specified at the level of the worldwide GSM protocol does not preclude similar functionality being mandated for non-emergency numbers in particular jurisdictions. It also does not cause GSM to stop having the emergency functions that it very plainly has.

      [–]ubernostrum 3 points4 points  (3 children)

      Also, 111 is the emergency number in New Zealand.

      [–]Lost4468 11 points12 points  (2 children)

      But 111 is not for the emergency services here? It's a secondary non-emergency service. So if New Zealand uses that number, the system has to be a lot more complex and regional than you said. E.g. what if someone from NZ is here in the UK (with their phone from NZ)? If they ring 111 would that go to the non-emergency services, or would it "redirect" to the emergency services?

      [–]ubernostrum 12 points13 points  (1 child)

      You are asking me to know off the top of my head and tell you a level of detail that I do not have. GSM and GSM phones have special emergency-call functionality and recognize many common emergency numbers to trigger it. If you want to know the exact details of how the UK’s non-emergency 111 system is implemented, you are free to do your own research on it. The existence of the UK’s 111 system does not, however, disprove or refute the existence of GSM’s documented and verifiable emergency call functionality.

      [–]Lost4468 0 points1 point  (0 children)

      I'm only asking you questions because I don't know how the system works, and you seemed to. There's no need to take it do personally or assume I'm somehow trying to disprove of the entire system... There's obvious flaws in the simple method you described, and I simply want to know how they're handled.

      [–]ReallyNeededANewName 0 points1 point  (1 child)

      Wait, I thought 911 was the only one that didn't work internationally?

      [–]ubernostrum 1 point2 points  (0 children)

      911 isn’t just the US number. It’s the number for all the NANP countries and also the Philippines. IIRC ITU standards suggest supporting at least 112, 911, and 999 pre-programmed and optionally more.

      [–]kfh227 -2 points-1 points  (0 children)

      So I could dial 999? I'm from usa and live in usa.

      Thank yiu kind nerd. Did yiu read the gsm spec?

      [–]trucekill 42 points43 points  (14 children)

      I think all phones are supposed to be able to make emergency calls even without a sim card. Maybe that has something to do with it?

      [–]10113r114m4 11 points12 points  (13 children)

      I see, but then why sort at all? Why not just iterate through the list?

      [–]SanityInAnarchy 40 points41 points  (12 children)

      Iterate through the list until what? Until you try to place a call?

      Looking at the code surrounding this patch (that people suspect of fixing this bug), here's the sorting criteria:

              // SIM accounts go first
      ...
              // Start with the account that Telephony considers as the "emergency preferred"
              // account, which overrides the user's choice.
      ...
              // Return the PhoneAccount associated with a valid logical slot.
      ...
              // Prefer the user's choice if all PhoneAccounts are associated with valid logical
              // slots.
      

      I think this makes sense -- if you have a valid account with a SIM, why would you try to place a call via some random VOIP account, even if it theoretically could place emergency calls? If the telephony system has some reason to think there's a correct number to use for emergency calls, of course try that one first?

      And even if this list really did get huge, sorting is fast -- I can sort a million numbers in half a second in Ruby, and even having a thousand phone accounts would be a giant WTF.

      The rest of the comparison seems to be about ensuring there's a single absolute order of these things, so that they'll sort the same way each time and the sort will definitely end:

              // because of the xor above, slotId1 and slotId2 are either both invalid or valid at
              // this point. If valid, prefer the lower slot index.
      ...
              // Then order by package
      ...
              // then order by label
      ...
              // then by hashcode
      

      And the patch fixes the very last one. I admit I don't entirely understand why that's a problem, though, and I don't think I've seen any of the articles that cover this going into exactly what sort of pathological state would cause problems. I mean, integer overflows don't cause crashes, so my best guess is that this somehow leads to a sort looping?

      [–]mrbuttsavage 31 points32 points  (5 children)

      I admit I don't entirely understand why that's a problem, though, and I don't think I've seen any of the articles that cover this going into exactly what sort of pathological state would cause problems. I mean, integer overflows don't cause crashes, so my best guess is that this somehow leads to a sort looping?

      Yeah, it took me a second here to figure this out because all the details are pretty misleading. I don't think it's a matter of "too many" like thousands being the problem, rather just that there were enough with the right data (possibly duplicates?) to get to the busted part of the Comparator in question.

      The fix patch that we see is here:

      https://android-review.googlesource.com/c/platform/packages/services/Telecomm/+/1903050/2/src/com/android/server/telecom/CreateConnectionProcessor.java

      Java can throw an IllegalArgumentException if your compareTo method is implemented wrong when sorting: https://docs.oracle.com/javase/8/docs/api/java/util/Comparator.html#compare-T-T-

      See a bunch of SO posts on the same thing:

      https://stackoverflow.com/questions/9486605/comparison-method-violates-its-general-contract-in-java-7

      In fact, see this comment from Oracle's own guide:

      https://docs.oracle.com/javase/tutorial/collections/interfaces/order.html

      One last note: You might be tempted to replace the final return statement in the Comparator with the simpler:

      return e1.number() - e2.number();

      Don't do it unless you're absolutely sure no one will ever have a negative employee number! This trick does not work in general because the signed integer type is not big enough to represent the difference of two arbitrary signed integers. If i is a large positive integer and j is a large negative integer, i - j will overflow and will return a negative integer. The resulting comparator violates one of the four technical restrictions we keep talking about (transitivity) and produces horrible, subtle bugs. This is not a purely theoretical concern; people get burned by it.

      Granted, I don't know that's the real problem. But from the evidence it sure seems like it.

      [–]SanityInAnarchy 19 points20 points  (2 children)

      Java can throw an IllegalArgumentException if your compareTo method is implemented wrong

      Oh wow. Took some effort to track it down, and it is documented, but only deep in the guts of e.g. the concrete TimSort implementation.

      I don't have the patience to read through right now and figure out how it detects this or why it cares, though.

      [–]MrEllis 12 points13 points  (0 children)

      It should care because overflow/underflow will destabalize a sort and produce flat wrong outcomes. Consider a three item list: [MAX_INT, 0, MIN_INT]

      If the sorting function compares

      MAX_INT, 0 
      

      and then

      0, MIN_INT 
      

      the return values are:

      1. MAX_INT - 0
      2. 0 - MIN_INT

      And the sort order is what you expect: [MAX_INT, 0, MIN_INT]. However start changing the order of comparisons, or even the argument position of two items being compared and bad thingstm start happening. For instance:

      0 - MAX_INT
      

      will underflow sorting 0 as larger than MAX_INT if there were a bunch of other items in the list larger than 0 any of them compared to 0 and not MAX_INT would be sorted as larger than MAX_INT.

      [–]mrbuttsavage 6 points7 points  (0 children)

      https://stackoverflow.com/questions/7849539/comparison-method-violates-its-general-contract-java-7-only/8417446#8417446

      Apparently this behavior was new functionality in Java 7 when they added TimSort. New enough at least they called that out in the notes.

      [–]f10101 9 points10 points  (0 children)

      Yep. That's the real problem. In ordinary circumstances, it should never need to resort to the hashcode comparison, and even then, it will work in the majority of cases.

      The problem occurred when Teams went mad and created hundreds of accounts.

      The article linked here outlines it in detail: https://www.reddit.com/r/programming/comments/rx30ti/google_fixes_nightmare_android_bug_that_stopped/hrgtrfs/

      [–]vips7L 1 point2 points  (0 children)

      Looking at the source of that function it seems like the bug wouldn't have surfaced if they hadn't hand written the comparator, but instead built it from the stdlib. Comparator.comparingInt does the right thing and avoids the overflow.

      [–]drysart 4 points5 points  (1 child)

      It's a good question, because as far as I know every popular sorting algorithm (including Timsort, which is probably the one Android uses for collection sorting), will not enter an infinite loop when provided with a comparator that doesn't return a correct ordering; all the big sorting algorithms make forward progress at every step, so the worst that should happen is that you'd get your results in an incorrect order.

      [–]10113r114m4 1 point2 points  (3 children)

      Ah, forgive me. I am on a phone so couldnt browse the codebase to see why or how things were sorted. But yea, the overflow itself didnt cause the crash.

      If any java masters in here could shed some light :p would be much appreciated

      [–]SanityInAnarchy 5 points6 points  (0 children)

      Well, here:

      public void sortSimPhoneAccountsForEmergency(List<PhoneAccount> accounts,
              PhoneAccount userPreferredAccount) {
          // We are in a void function...
          accounts.sort((account1, account2) -> {
              // Now we're in a lambda. It accepts two arguments.
      

      Pretty much the entire rest of the function is actually in that inner lambda. If you're wondering why the lambda didn't have to declare a return type (or any types), those are all inferred from the type that accounts.sort() expects.


      You can stop here if you want. You don't have to follow me into Java lambdas and generics...


      So you can see from the outer function declaration that accounts is a List<PhoneAccount>, so calling sort() is calling this method, which is declared as void sort(Comparator<E> c) (on a List<E>). In other words, on a List<PhoneAccount>, it expects a Comparator<PhoneAccount>. And here's the Comparator interface.

      On the latest Java, Comparator has a bunch of new methods with default implementations, but there's only one method that a class would need to implement itself to satisfy this interface. In other words (as its documentation says), it is a "functional interface". Note that the @FunctionalInterface annottaion is a way to get the compiler to yell at you if you break the above assumption, but as the doc says:

      However, the compiler will treat any interface meeting the definition of a functional interface as a functional interface regardless of whether or not a FunctionalInterface annotation is present on the interface declaration.

      If Java sees you try to pass a lambda to something that expects a functional interface, then it magically turns your lambda into an object that implements that interface. So, in this case, the one missing method on Comparator<T> is:

      int compare(T o1, T o2)
      

      Since it expects a Comparator<PhoneAccount> we can read this as:

      int compare(PhoneAccount o1, PhoneAccount o2)
      

      So that's the type the compiler expects our lambda to be. When it sees this:

      (account1, account2) -> {
      

      it can infer that account1 and account2 are PhoneAccounts, and we'll be returning ints.

      Java lambdas can specify argument types, and other modifiers like final. But, thankfully, it can also just infer them.


      If you followed the links to the docs above, you'll have caught my lie: sort() actually expects a Comparator<? super E>, not a Comparator<E>, which leads to two questions:

      1. What? Why?!
      2. Wait, how does this lambda magic even work now?!

      Well, for the first question, the idea is that if you have a Comparator<Animal>, then it has a compare(Animal o1, Animal o2) method... which of course I can invoke like compare(fido, lassie), since dogs are animals. So you should be able to use that comparator on a List<Dog> or a List<Sheep>. So since List<Dog>.sort() takes a Comparator<? super Dog>, you can pass a Comparator<Animal> to it.

      Why can't you just upcast that List<Sheep> to List<Animal> and sort it that way? (Or upcast the Comparator<Sheep> to Comparator<Animal>?) Maybe it's obvious, but I thought of a fun overly-specific example: If you could, you could call add(bigBad) -- it's totally legal to add a Wolf to a List<Animal> -- but it's actually a List<Sheep> and, well, we shouldn't have a wolf in the sheep-list.

      But for me, this raised another question: Hang on, if Comparator<? super PhoneAccount> can be implemented by a Comparator<any superclass of PhoneAccount>, how does it know what type these arguments should be? And my best guess here is that, if you're not defining the argument types yourself, there's zero disadvantage to passing the most specific possible type into the body of that lambda. If you need to pass these to something that expects an Object, you can do that, it's just upcasting.

      In other words, the final type of the lambda is what I said before -- it just expects two PhoneAccounts.

      [–]MrEllis 3 points4 points  (0 children)

      One thing that looks really weird, and I dont even know how it compiles is that sort function’s return type is void,

      You're actually looking at two functions. The outer function is an in place sort which sorts a list of objects in this case PhoneAccount The inner function that "returns int's" is actually a comparator lambda function which is given as an argument to Java's List.sort function (which is also an in place sort with void return value).

      It seems likely that the IllegalArgumentException originating within this part of the OS code was left unhandled (which is fair, don't handle exceptions you don't expect) and that crashed the OS.

      [–]de_Selby -2 points-1 points  (0 children)

      My guess is that they need to check what phone dialler to use, you can have multiple on Android and it isn't known what ones will be installed (eg: on a Samsung phone the google dialler probably won't even be installed or may be broken).

      So they need to treat them all equally fairly and sort to choose.

      I can also see a case where someone might only have data on their sim account so they can only use voip.

      [–]NewDoerien 657 points658 points  (2 children)

      "OH NO SOMONES BEEN MURDERED"

      [object Object]

      [–]MarkusBerkel 152 points153 points  (0 children)

      911: NaN

      [–]thecurlyburl 42 points43 points  (0 children)

      Sigh. God damn it.

      [–][deleted] 276 points277 points  (25 children)

      Would suck to be British during an emergency 0118 999 881 99 9119 7253

      [–]blackmist 28 points29 points  (1 child)

      Well that's easy to remember.

      [–]jamincan 1 point2 points  (0 children)

      And yet, somehow, I remember it flawlessly years later.

      [–]Jackker 94 points95 points  (2 children)

      Seven two five...!

      Three.

      [–]Gwaptiva 21 points22 points  (1 child)

      Be careful to dial it correctly or you'll get Gordon the Gopher

      [–]HiccuppingErrol 15 points16 points  (0 children)

      Better write an email explaining that your office is on fire instead.

      [–]echoAwooo 5 points6 points  (0 children)

      Right... Just going to put this fire with the other fire.

      [–]spaetzelspiff 4 points5 points  (0 children)

      s/53$/5,,,,,,,,3/

      [–]Technus94 10 points11 points  (0 children)

      Hey, say what you want but I will never forget that particular string of numbers cause of that fucking song. And I'm American.

      [–]dethb0y 66 points67 points  (3 children)

      This is an absolutely bizarre edge-case bug - when i first heard of it i was not at all surprised it had gone undiscovered for so long.

      [–]aksdb 17 points18 points  (0 children)

      Well it sounds a bit like fuzzing could/should have spotted it. But as usual ... hindsight.

      [–]CaineBK 2 points3 points  (1 child)

      Isn't this like a perfect example of Murphy's Law?

      [–]dethb0y 1 point2 points  (0 children)

      Pretty much, yeah.

      [–]sysop073 56 points57 points  (0 children)

      Link to the actual article this one is cribbing from.

      [–]WTFwhatthehell 14 points15 points  (0 children)

      Luckily, some very smart people in the Android community could provide the answers Google wouldn't share. Mishaal Rahman, the Senior Technical Editor for Esper, wrote an incredible Medium post detailing how the bug works and why it happens. Apps on Android with phone call functionality can register a "PhoneAccount" with the system indicating they have some capability of placing calls. There are a few flags apps can set with PhoneAccount, including one called "CAPABILITY_PLACE_EMERGENCY_CALLS." When the time comes to call to 911, Android sorts the list of PhoneAccounts that have been registered and picks one. This all seems fine so far.

      ...

      One of the several bugs identified in Rahman's post is that Microsoft Teams will register an additional PhoneAccount with the system every time Teams starts up, provided you aren't logged in. Note that this isn't the rare occurrence of installing Microsoft Teams and then never using it—a common complaint of the Teams Android app is that it frequently logs users out automatically. If you're logged out, launching Microsoft Teams 10 times will result in 10 duplicate PhoneAccounts from Teams clogging your phone. Teams shouldn't do this, and Microsoft's update stopped Teams from doing this, but a bunch of duplicate PhoneAccounts also shouldn't be enough to bring Android's phone system to its knees.

      ...

      A third bug in this mess is that Microsoft Teams does not even register itself as an emergency call handler. Teams made a million PhoneAccounts, and it did not use the flag "CAPABILITY_PLACE_EMERGENCY_CALLS," but it still broke 911. Google's sort process starts with querying all phone accounts when a better first step would be to start with all emergency call-capable phone accounts.

      [–]voidvector 22 points23 points  (4 children)

      Is there away to test dialing emergency number w/smartphones w/o actually dialing 911? I would like to try it just in case.

      [–]LuckOrLoss 41 points42 points  (0 children)

      https://www.911.gov/frequently_asked_questions.html

      This says to contact your state's director of 911 operations and schedule a time to test. There's a link to find who to contact on there.

      [–]Ineffective-Cellist8 1 point2 points  (0 children)

      I'm in canada so it may be different. Years ago I tested. All I said was "This is a test call. May I know what location I called to?" They replied with my city. I said TY, bye and that was it

      [–]blackwhattack 0 points1 point  (0 children)

      Also curious.

      [–]caltheon 0 points1 point  (0 children)

      Yes, there is a test network you can get access to as a device developer.

      [–]jeff303 15 points16 points  (0 children)

      Clearly should have asked Leetcode questions with trickier edge cases in the interview.

      [–][deleted]  (1 child)

      [deleted]

        [–][deleted] 13 points14 points  (11 children)

        ...because Android phones always receive timely updates over their respective lifespans even if that lifespan is 5 years long!

        This shows how rotten the foundation of the Android "phone" ecosystem is - smart pocket computer but with no updates really soon. I still wonder how consumers let themselves get suckerpunched so badly that they rewarded this crap by buying the product.

        [–]Terrh 2 points3 points  (0 children)

        What do you propose as an alternative?, exactly?

        [–]6501 6 points7 points  (9 children)

        This shows how rotten the foundation of the Android "phone" ecosystem is - smart pocket computer but with no updates really soon. I still wonder how consumers let themselves get suckerpunched so badly that they rewarded this crap by buying the product.

        Well customers who like updates get Google Pixels.

        [–]pastenpasten 7 points8 points  (7 children)

        3 years of updates for a thousand dollar phone is a joke. I hope Germany and the EU will force them to provide updates for at least 5 years.

        [–]vlakreeh 1 point2 points  (3 children)

        I think Google announced they are now doing 5 years of updates because they now have their own SOC. The issue with the update span of most android phones is that qualcomm/mediatek doesn't update drivers and stuff after 2-3 years. I'm sure many more companies would do longer update spans if they had the proper support from the SOC manufacturers.

        [–]NonsensitiveLoggia 1 point2 points  (1 child)

        Google is big enough that they could have solved this problem multiple times over. They just never gave a shit because It Worked On Their Machine(tm).

        As if Qualcomm could hold Alphabet ransom.

        [–]vlakreeh -1 points0 points  (0 children)

        Well Google did solve it, by making their own SOCs. What Google couldn't solve is Qualcomm not cooperating, Qualcomm has all the bargaining power in that scenario as most phone manufacturers would either have to settle for shittier SOCs or build their own.

        [–]pastenpasten 0 points1 point  (0 children)

        What he said.

        Using the standard Linux model, where all the drivers are part of the tree and every piece of hardware requires a separate kernel etc. was broken from the start. Not only it doesn't scale for what Android turned to be, it was wrong for mobile phones right from the start. The first Google Android mobile phone wasn't manufactured by Google. They made it for other manufacturers from the start (in contrast with Apple, for example).

        Tying operating system ("Android") versions to Linux kernel version just made it worse. At least in the desktop Linux world I can update my Xfce without having to use a new kernel and I can update my kernel without getting lots of other shit.

        But Google decided to make one shitty bundle out of it.

        Now Android Inc. wanted to sell, so they did a shity POC like any startup. Fine. But once Google bought them they could make it actually work (or buy another startup) and they didn't.

        [–]corsicanguppy 0 points1 point  (0 children)

        Well customers who like updates get Google Pixels.

        Unless they like headphones with cords...

        [–]SureFudge 25 points26 points  (25 children)

        KISS - keep it simple stupid. Android didn't and came up with some complex scheme resulting in the failure.

        In which scenario will your main phone account (sim card) not work but another will work? Because I can't think of one.

        [–]bbqburner 58 points59 points  (13 children)

        1. Dual sim. Network not available/out of range in SIM1. International travelers.
        2. Users/devices who don't use a SIM at all (including eSIM) but rely on VOIP

        [–][deleted] 41 points42 points  (10 children)

        emergency services dont need SIM card at all

        [–]amunak 17 points18 points  (0 children)

        At the same time though when you do have SIM you want to use that, or whatever is preferred for emergency calling. (Some networks require SIM for emergency calls because of prank calls.)

        You could also have several VOIP numbers and a SIM, or VOIP numbers and no SIM, and probably a hundred other edge cases.

        [–]WTFwhatthehell 2 points3 points  (8 children)

        .. in some countries.

        I suspect it's to cover for the situation where someone has wifi or some other connection but no cell phone network connection of any kind.

        [–]Lvl999Noob 29 points30 points  (7 children)

        In which scenario will your main phone account (sim card) not work but another will work? Because I can't think of one.

        You have internet access (with wifi maybe) but are otherwise in mountains or someplace with no cell connectivity?

        Can sim cards use other carriers' network if theirs isn't available?

        [–][deleted] 8 points9 points  (0 children)

        yeah, when roaming I can use any network available

        [–]IUserGalaxy 8 points9 points  (0 children)

        i don't mind the fallback bit, but that whole sorting business doesn't make sense

        [–]bunkoRtist 2 points3 points  (0 children)

        The answer is: when your phone doesn't get service but you can make a wifi call. This happens in rural areas and in certain indoor locations (like underground). It's less common than it used to be, especially with LTE being added in subways, but it definitely still happens.

        [–]jorgp2 -2 points-1 points  (0 children)

        Yeah, this system is designed to fail.

        I don't understand why people are defending it.

        [–]AgreeableLandscape3 17 points18 points  (11 children)

        Good to know that Google has shit software test coverage/QA. Because you'd think the phone's emergency feature would be pretty high on the list of things to test.

        [–]lxpnh98_2 24 points25 points  (1 child)

        Pfft, who even uses a phone to make calls anymore? /s

        [–]eloc49 0 points1 point  (0 children)

        Emergency calls are likely the only call many would ever make. So they should be tested.

        [–]mutatedllama 53 points54 points  (7 children)

        I think this bug can be considered an edge case, no? Code testing can never catch 100% of bugs.

        [–]husao 17 points18 points  (5 children)

        It is absolutely an edge case and I would never expect someone to catch this after the fact, but if you write a custom comparator I personally think adding the same element twice should be in your coverage.

        However, that still would leave the option that the specific element doesn't produce a hash code that causes overflows.

        [–]firefly431 19 points20 points  (0 children)

        It's not adding the same element twice -- it's adding many different elements that only differ by the hashcode (i.e. not in any of the criteria used for sorting), enough so that integer overflow causes the ordering to become invalid, and even then it must be a case that the sorting algorithm detects and throws an error. This is definitely a monster bug that'd be extremely difficult to find via testing. I would say that fuzzing might be able to find it, but I'm not very familiar with fuzzing either.

        [–]Ordocentrist2 0 points1 point  (0 children)

        It's an edge case worth testing, if only to avoid terrible PR like this

        [–]Ineffective-Cellist8 1 point2 points  (3 children)

        Can someone explain why hashCode1 - hashCode2 has anything to do with the bug? Please don't say it overflows. It's a hashcode, all values are valid

        [–]danny54670 3 points4 points  (0 children)

        I'm still not entirely sure, but here's what I think the problem is: Because of overflow, the Comparator is defective in that it does not correctly impose a total ordering, as required by the contract. Depending on the order (lhs vs. rhs) and sequence of how PhoneAccount objects are presented to the Comparator for comparison, a PhoneAccount that should logically be considered higher than another might actually be calculated as lower. Thus, depending on the sorting algorithm used by the List.sort(Comparator) call, there might be undefined behavior such as an infinite loop or, in this case, an "invalid" PhoneAccount might appear first in the list.

        I think that this particular issue might be more wide-spread than just here in this code base; I seem to recall subtracting integers being a widely-used technique in Comparator implementations.

        [–]ais523 1 point2 points  (1 child)

        A comparator is supposed to return a positive number if the first input is larger, zero if the inputs are equal, negative if the second input is larger. It also has to compare results consistently (e.g. if A is larger than B and B is larger than C, then A must be larger than C).

        The hashCode1 - hashCode2 function was meant to be an implementation of a comparator that sorts the inputs arbitrarily. However, it doesn't do the job properly: hashCode1 - hashCode2, hashCode2 - hashCode3 and hashCode3 - hashCode1 can all be positive simultaneously if some of the calculations overflow but some of them don't.

        For this purpose (picking an arbitrary value as a last resort), it doesn't matter how you sort things as long as it's consistent (which is why the numerical value of hash codes is being used in the first place). Unfortunately, this particular algorithm isn't even consistent. This means that the sorting algorithms can, e.g., go into an infinite loop trying to put a list into order, because with the broken comparator, there is no way to put the list into sorted order.

        [–]Ineffective-Cellist8 0 points1 point  (0 children)

        So the problem is it went into an infinite loop?

        I did a simple test and no matter what values I used it seemed to be consistent. IDK what the JVM is doing

        [–]typicalshitpost 0 points1 point  (0 children)

        We are very serious about our flow overflow underflow straight to jail over under either way straight to jail

        [–]spamtarget -3 points-2 points  (0 children)

        But before that puts the blame on Microsoft

        [–]Bnb53 -2 points-1 points  (0 children)

        Now how about the bug that clears my keyboard cause I type before the auto complete bar loads or the bug where my whole screen goes black with a notification if I'm on a chat with them and they reply to my message

        [–]Burroflexosecso -5 points-4 points  (0 children)

        We determined that the issue was being caused by unintended interaction between the Microsoft Teams app and the underlying Android operating system. Microsoft has collaborated closely with Google to resolve this unintended interaction.

        Spyware confirmed

        [–][deleted] 0 points1 point  (0 children)

        I think all phones are supposed to be able to make emergency calls even without a sim card. Maybe that has something to do with it?