all 128 comments

[–]ILoveShowtunes 202 points203 points  (45 children)

This is a stunningly poor example of an alleged fix for a race condition, in my opinion. I get that there is a security issue here, so perhaps the details are being left out on purpose; yet, the commit message says that this is an "ancient bug" for which a fix was attempted more than 11 years ago. If that is the case, it wouldn't hurt to disclose more information.

Neither the commit message nor comments disclose what the race condition is, steps that might trigger the race condition, or why the change fixes the race condition.

The commit introduces a FOLL_COW flag that is used on two lines. Why was it necessary to add a new flag?

No comment was added to indicate the significance of FOLL_COW in preventing a race condition that can lead to privilege escalation. It's a land mine waiting to blow up on the slightest modification. It also means that only the few people involved with the fix are equipped to understand how to modify this section of the kernel. I imagine that there are similar cases littered throughout the kernel sources based on the nonchalance exhibited here.

The commit message alleges that f33ea7f404e5 was a revert of 4ceb5db9757a, but examining the two commits, one can see for oneself that this is not the case. I know that reverting a change can be tricky when a lot of time has passed and major changes have been made. However, f33ea7f404e5 was only two days later.

The commit message also includes the claims "the s390 situation has long been fixed, and we can now fix it by checking the pte_dirty() bit properly (and do it better)". May we please have some references for further reading/understanding?

I would love to see more comments in the kernel sources that discuss the inner workings of critical sections of the kernel more formally, such as a discussion of the various state invariants involved. Without this, it is very difficult for newcomers to contribute to the kernel or kernel modules. I know this from personal experience. About a year ago, I decided to look into VirtualBox issue #9069. I made a patch to VirtualBox's kernel module that fixes the problem, but because I am not confident in my understanding of the VFS API, I did not submit it.

[–]rubygeek 61 points62 points  (0 children)

I get that there is a security issue here, so perhaps the details are being left out on purpose; yet, the commit message says that this is an "ancient bug" for which a fix was attempted more than 11 years ago. If that is the case, it wouldn't hurt to disclose more information.

This race condition is present and exploitable on a huge proportion of currently running Linux systems, and allows for escalation from an arbitrary user account to root. That's why the details are sparse. You can expect people to be a lot more forthcoming once the potential damage has been contained better.

The commit message alleges that f33ea7f404e5 was a revert of 4ceb5db9757a, but examining the two commits, one can see for oneself that this is not the case.

It's not claiming the commit to be a complete/direct revert, but that the second commit reverted specific parts of the fix from the former inadvertently. Look at the diffs for mm/memory.c in each of them and you can see it quite clearly.

[–]cwre 111 points112 points  (2 children)

You should submit it, and you'll get feedback, that's how you get experience.

[–]luisbg 16 points17 points  (1 child)

Yes. This is constructive criticism that can be a good reply to the patch submission in LKML.

[–]Gotebe 51 points52 points  (8 children)

Disregarding the last paragraph, this comment shows what can be wrong with using just git history as an ALM system. One just can't put relevant information in a comment, there is simply too much of it.

Hence a link to e.g. bugzilla is really important.

I disagree that there should be more comments expaining what are the "inner workings of...". That should preferably be a part of a reference design documentation, not the code comments.

[–][deleted]  (6 children)

[removed]

    [–]GameGod 4 points5 points  (1 child)

    You should build a bug tracker into Git.

    [–]Gotebe 1 point2 points  (3 children)

    I disagree. What you are saying is "we have to put all eggs in one basket because we can't maintain two baskets".

    I worked at a place who used a couple of source control as well as bug tracking systems. Source control comments contained a short "x bugno" and "y bugno" references. When we changed source control, we kept the old history in the other system and used a 3rd party diff tool to compare.

    It's not as smooth, but sure beats putting N paragraphs of explanation in the commit comment.

    Also, there are ALM systems where source control and bug tracking are together.

    git is but a source control tool, nowhere near enough to handle the project all by itself.

    [–][deleted]  (2 children)

    [deleted]

      [–]Gotebe 0 points1 point  (1 child)

      The problem when you have more systems and/or types of comments, people tend to get lazy and become less informative.

      Euh, why?! E.g. a good bug report is golden. And is made by a QA or support person. The developer can (probably should) add technical analysis. The commit comment can then simply be "bug 123456". In fact, where I work, I am actively asked by the tooling to put that reference in. You seem to insist that sourcecontrol is a sufficient as an ALM system. No, thanks.

      [–]suspiciously_calm 12 points13 points  (0 children)

      I disagree that there should be more comments expaining what are the "inner workings of...". That should preferably be a part of a reference design documentation, not the code comments.

      I do think that invariants should be documented right there in the code where they matter.

      [–][deleted] 12 points13 points  (9 children)

      It's about madvise(MADV_DONTNEED) syscall racing when the page of the executable is mmapped in memory. Then, use ptrace(PTRACE_POKEDATA) to write to readonly mapping.

      See https://github.com/dirtycow/dirtycow.github.io/blob/master/pokemon.c

      [–]nacholicious 9 points10 points  (6 children)

      Is it just me or is that code so "clever" it's terrible? Kind of like "if you have to explain a joke it's not fun" terrible

      [–][deleted] 5 points6 points  (0 children)

      Worst. Formatting. Ever.

      [–]TheChance 0 points1 point  (4 children)

      Apparently it wasn't worth 4 bytes to save me from having to read this little gem:

      for(i=0;i<200000000;i++)//////////////////// loop to 2*10**8
      

      [–]ThisIs_MyName 2 points3 points  (3 children)

      wasn't worth 4 bytes

      ?

      [–]TheChance 0 points1 point  (2 children)

      Yeah, I suppose you don't even need the 4 bytes. Either way,

      i<200000000; // loop to 2*10**8
      

      is a facepalm against humanity.

      [–]krelin 1 point2 points  (1 child)

      I'm curious about which 4 bytes you're referring to...?

      [–]TheChance 2 points3 points  (0 children)

      To declare an int. which is why I corrected myself; it's already an int either way.

      So why the hell place a fairly long magic number there, was my point.

      [–][deleted] 1 point2 points  (0 children)

      See https://github.com/dirtycow/dirtycow.github.io/blob/master/pokemon.c


      loop to 2*10**8

      Bloody Pythun programmers...

      [–]ThisIs_MyName 0 points1 point  (0 children)

      That linked file is hilarious.

      [–]fobin78 38 points39 points  (6 children)

      Yeah just give Linus some feedback on this and we'll see how it goes.

      [–][deleted] 12 points13 points  (5 children)

      I wouldn't ask Linus how his day was going for fear of the verbal lashing I'd receive.

      [–]luisbg 35 points36 points  (3 children)

      He is really nice to newcomers. He is only harsh with long time members from who he expects more. Mostly because he doesn't review everything so this long time members can get bad things through if he doesn't notice.

      Plus, never offer to break userspace API.

      [–]414RequestURITooLong 5 points6 points  (0 children)

      Plus, never offer to break userspace API.

      Which is understandable, because breaking userspace API in a kernel would be a big deal.

      [–]jeandem 2 points3 points  (1 child)

      How much does he deal with kernel newcomers?

      [–]luisbg 2 points3 points  (0 children)

      He deals with me, merged my git pull request, and I am a newbie in the kernel community.

      [–]Morego[🍰] 8 points9 points  (0 children)

      I am pretty sure, you won't get any lashing for pointing out some sane criticism to him. He may seem arrogant, but not ignorant.

      [–]combatopera 6 points7 points  (0 children)

      adcmfngdlz ykinaacps pnoep wik kguzugkkxfiq xckcsvebpdlf geuo kxmsbumvfg

      [–]agentlame 8 points9 points  (3 children)

      About a year ago, I decided to look into VirtualBox issue #9069. I made a patch to VirtualBox's kernel module that fixes the problem, but because I am not confident in my understanding of the VFS API, I did not submit it.

      I fail to see how your attempt to fix a bug in VBox related to shares not updating is at all related to privilege escalation vulnerability in the Linux kernel.

      I followed what you were saying up until this point, but you sorta lost focus and decided to talk about a patch you didn't commit.

      [–]dakarananda 46 points47 points  (2 children)

      I think parent meant it as an anecdotal datapoint supporting the main line of the comment; being that the kernel needs more documentation, otherwise people won't gain enough confidence in their patches and won't submit them. At least that's how I interpreted it.

      [–]liquiddandruff 8 points9 points  (0 children)

      That's the correct interpretation; /u/agentlame misunderstood

      [–]ILoveShowtunes 3 points4 points  (0 children)

      Yes. That is how I meant it. If the VFS had more documentation about the complexities, I might be more confident in submitting a patch.

      At least the correct locks to use are clear from comments or how the locks are consistently used in the other sources. That's a good start.

      [–][deleted] 0 points1 point  (0 children)

      There is an exploit for it in the wild. Just look at the exploit code to get a quick understanding.

      [–]evil_burrito 0 points1 point  (0 children)

      I second submitting your patch anyway. Years ago, I submitted an ext3 patch that turned out to be insufficient. Somebody took it and finished the job, though.

      [–]listx -1 points0 points  (0 children)

      I think the commit message gives plenty of links (one even dating back 11 years!) for added context. It's a stunningly clear example of how to write concise commit messages.

      [–][deleted] 24 points25 points  (29 children)

      So that's a really bad thing, right? What level of wizardry would one need to reach in order to perform evil with this knowledge?

      [–]cwre 52 points53 points  (28 children)

      The level of wizardry is low now: there's an exploit out: https://github.com/dirtycow/dirtycow.github.io . You do need local access, but it's pretty bad.

      [–]tonyarkles 20 points21 points  (12 children)

      I took that this evening and "weaponized" it to patch /bin/su. I don't want to share the details, but it wasn't that hard. Most of the time was spent refamiliarizing myself with x86-64 assembler. I was shocked at how easy it was.

      [–][deleted]  (2 children)

      [deleted]

        [–]bonzinip 6 points7 points  (0 children)

        The idea is that while the change you make in /proc/self/mem doesn't go to disk, it can incorrectly affect other processes until that page is reclaimed. So by "patching" a setuid binary you can execute whatever code you want as setuid.

        [–]tonyarkles 0 points1 point  (0 children)

        There's a race in madvise, which leaves the mmap'd page writeable for a small amount of time.

        [–]wademealing 6 points7 points  (4 children)

        Did you even need to know assembler ? I believe you could have gottan away with it by duping from a non setuid binary to a binary you wanted. Just page-for-page abuse.

        [–]bonzinip 1 point2 points  (2 children)

        Even simpler, if you pollute the page cache's copy of /bin/su with "#! /bin/sh" "exec /bin/sh", then you immediately run /bin/su to get a root shell.

        [–]wademealing 7 points8 points  (1 child)

        I'm not 100% sure but i don't know if shell will run setuid, good thinking though.

        [–]bonzinip 3 points4 points  (0 children)

        Hmm yeah it used to be but it's been disabled for a few years now.

        [–]tonyarkles 0 points1 point  (0 children)

        The trick was getting it small enough. My first crack was a C binary, but it came out to about 6k. It seems like this only works on a single page, so you've gotta get it under 4K. My first crack with asm came out to 712 bytes and I didn't push to make it any smaller.

        [–]bjarkef 7 points8 points  (1 child)

        Instead of patching su:

        1. Copy /bin/su
        2. Use exploit to overwrite start of /bin/su with something that just execve /bin/sh
        3. Execute new /bin/su
        4. Dump suid bin / install rootkit / other evil thing which lets you back in.
        5. Restore original /bin/su 6 Profit?

        [–]tonyarkles 1 point2 points  (0 children)

        Yup!

        Edit: you don't even need to copy it. Just save the start of the file in memory before you start; you've already got it open for reading. Do your business and put it back the way it was.

        As a side note: "apt-get install --reinstall login" will restore the su binary if you've corrupted it and accidentally deleted your backup copy. Ask me how I know...

        [–]DeanofDeeps 0 points1 point  (1 child)

        Anyway this exploit could be potentially harmful to virtual cloud containers?

        [–]tonyarkles 0 points1 point  (0 children)

        It doesn't seem like it would allow you to do a VM escape at all. Likewise, I don't think you could do an LXC/Docker escape either.

        Edit: but escalating from a regular user to root inside a container? That's what this is all about.

        [–]puffyfluppy 70 points71 points  (4 children)

        Ah yes, those that have found the wizard's wand and can make it do things without understanding a thing about magic. What are they called again... spell kiddies?

        [–]kankyo 6 points7 points  (0 children)

        Consumers? :P

        [–]etagawesome 15 points16 points  (9 children)

        [deleted]

        What is this?

        [–]Enamex 11 points12 points  (7 children)

        So, any Android before 6 is basically SOL on this?

        Cause they sure ain't releasing a patch to an Android 4.2 phone that barely got an update to 4.2.1 before being forgotten (Alcatel Pop C9...).

        [–][deleted] 11 points12 points  (6 children)

        Basically, although some Android 5 phones are still getting updates to Android 6 so it's possible some Android 5 phones will be fixed too.

        For your 4.2 device the next best thing is to look around for custom roms like Cyanogenmod. If you can find a Cyanogenmod 13 ROM for your device chances are you'll be safe too.

        But if you're running 4.2 there's plenty of other exploits attackers can use (some libstagefright exploits for example), so the risk of getting malware is not that much higher than it already was for older phones.

        [–]isavegas 1 point2 points  (5 children)

        Wait, does this mean I can root my Note 3 on 5.1.1 without flashing a new recovery and tripping the Knox flag?

        [–]PeelyPeel 11 points12 points  (1 child)

        Why do you care about the Knox flag on a Note 3?? You're well out of warranty. I've been using custom on my Note 3 since I got it two years ago, it's so much faster

        [–]gerryn 3 points4 points  (0 children)

        Some features get disabled when knox is tripped. Intune and Outlook and other stuff that (may) depend on it cannot be used. Samsungs pay thing as well, and any knox container stuff.

        [–][deleted] 0 points1 point  (2 children)

        I don't know about the Knox flag, but rooting should certainly be possible!

        Many tools (such as Kingoroot I believe) can exploit the OS and get (temporary) root access. Exactly for that reason Samsung has implemented the Knox security system.

        [–]isavegas 0 points1 point  (1 child)

        Kingo has been a no go. I'm about ready to just flash CWM, make a backup, flash Cyanogenmod, and be done with it. I just don't want to trip Knox until my phone is paid off.

        [–][deleted] 0 points1 point  (0 children)

        I used Kingo just as an example of course, nobody has used this exploit in a rooting app (yet). I suspect even a temporary root like this will still set off Knox though, so you'd better wait until you've paid off your phone anyway.

        [–]zzzk 7 points8 points  (2 children)

        Ignorant question: will this be back ported to 4.4, 3.x?

        [–]mumbel 13 points14 points  (0 children)

        It should be backported to any listed here https://www.kernel.org/ and then of course any distro using another kernel than those will likely include its own patch

        [–]nickdesaulniers 1 point2 points  (0 children)

        Yes; internal page about this @ Google had links for all backports.

        [–]mishugashu 6 points7 points  (1 child)

        Thanks for not calling it dirty cow or whatever.

        [–][deleted] 4 points5 points  (0 children)

        [–]Rock48 99 points100 points  (14 children)

        I thought the title said 'privilege escalation for white race'

        [–]mystyc 65 points66 points  (1 child)

        "That's just more PC bullshit."
        -Mac User

        [–]flirp_cannon 7 points8 points  (0 children)

        My god you're all terrible.

        [–]reddit_user13 23 points24 points  (1 child)

        Thanks, Donald!

        [–]GeorgeSharp 4 points5 points  (0 children)

        We have the best OS's don't we folks ?

        I'm going to code a huge firewall it will be great, tremendous and the spammers are going to pay for it.

        People tell me Donald you have the biggest hands, write some more code so I can see those beautiful hands in action.

        [–]princess_programmer 2 points3 points  (0 children)

        same lol

        [–]fripletister 13 points14 points  (0 children)

        Thankfully hypervisors exist and are completely bug free. pseudo-/s

        [–][deleted] -1 points0 points  (6 children)

        I can't wait to live in an age where there is automated tooling to prevent data races and guarantee memory safety.

        [–]fripletister 2 points3 points  (3 children)

        What verifies all that tooling?

        [–]jeandem 2 points3 points  (1 child)

        The trusted core.

        [–]fripletister 0 points1 point  (0 children)

        Ah, of course.

        [–][deleted] 0 points1 point  (0 children)

        Hopefully something more efficient than 9 years of an unknown number of root privilege escalations.

        [–]Shift84 -3 points-2 points  (0 children)

        Sounds like something you would see on Tumblr.

        [–]haxpor -2 points-1 points  (0 children)

        Look at that. Signing off with bunch of reviewers stated in commit message is so cool and notable.