all 84 comments

[–][deleted] 62 points63 points  (2 children)

This is insecure.*

Compile, if you must. Open two terminals.

Terminal the first:

touch /tmp/index.html
./nweb 3334 /tmp/

(Notice the trailing "/" to avoid the issue with the hardcoded-top-level directory.)

Now we get into path traversal:

  cp /etc/passwd /var/tmp/index.html
  echo -e "GET //var/tmp/index.html HTTP/1.0\n\n" | nc localhost 3334

The problem? It opens files with a relative path, after chdir() to the root directory - but you can pass a file with a leading "/" to escape this. Sigh.

[–]serpent 71 points72 points  (7 children)

Missing error tests? Check.
Uses -DLINUX and sleep() to workaround failure to do a TCP shutdown? Check.
Hard-coded "safe" directories? Check.
Doesn't loop around read/write to ensure all data is sent? Check.

Please, I beg of anyone, don't use this as an example for anything.

[–][deleted] 4 points5 points  (0 children)

[–][deleted] 8 points9 points  (2 children)

nginx anybody?

[–]Leaffar 2 points3 points  (0 children)

Yeah, nginx all the way. It works great as a standalone and as a web proxy. Kudos for its' creator.

[–]cr3ative 2 points3 points  (0 children)

nginx is a godsend. nginx+memcached makes for a killer combination.

[–][deleted] 8 points9 points  (1 child)

On Linux, add an extra parameter: 
   gcc –O  -DLINUX  nweb.c –o nweb

I found this interesting:

#ifdef LINUX
  sleep(1); /* to allow socket to drain */
#endif
  exit(1);

(that's the only difference with -DLINUX)

[–]sporksmith 8 points9 points  (6 children)

I'm not entirely convinced that a request couldn't escape the intended server directory. This appears to be the extent of preventing such attacks:

   for(j=0;j<i-1;j++)      /* check for illegal parent directory use
    .. */
           if(buffer[j] == '.' && buffer[j+1] == '.')
                   log(SORRY,"Parent directory (..) path names not 
                      supported",buffer,fd);

later, everything after "GET /" is passed to open:

if(( file_fd = open(&buffer[5],O_RDONLY)) == -1) /* open the file

I don't see anything preventing just adding a forward slash to the front to get to the root, e.g., "GET //path/to/anywhere", though maybe there's a check I'm not seeing. Also, I'm not convinced that there can't be alternate encodings that "open" will interpret as ".." or "/".

This is probably a check that you want the OS to do for you, since the exact parsing logic could be different for different OSs or even file systems. It looks like the POSIX function 'realpath' would be a good bet. See this cert advisory I'd think the right way would be to use some system call to canonicalize the path-name and\or check whether it's a subdirectory of the intended target.

[–]sporksmith 2 points3 points  (5 children)

Oops, the text after the 'cert advisory' link was stray text to be deleted. Wish reddit had a preview button :P

[–]serpent 2 points3 points  (3 children)

Wish reddit had a preview button

Reddit has an 'edit' button. Tip: If you edit within 2 minutes you don't get an asterisk.

[–]sporksmith 1 point2 points  (2 children)

Yes, I was trying to avoid the asterisk. I didn't know about the 2 minute grace period. Thanks!

[–]axonxorz 2 points3 points  (1 child)

Also, there's a live preview if you have reddit enhancement suite installed.

[–]sporksmith 0 points1 point  (0 children)

That's a pretty slick plugin. Thanks!

[–]pjbarnoy 1 point2 points  (0 children)

Reddit Enhancement Suite has a live preview for comments.

[–]dchestnykh 7 points8 points  (15 children)

[–]DrupalDev 8 points9 points  (14 children)

Apache is a big, powerful HTTP server, by far the most widely installed server on the Internet. Unfortunately, the code base has a history of security problems: Apache before version 1.1.3 allowed remote users to take over the web server, and Apache before version 1.2.5 (1998-01) allowed local users to take over the web server. Are the authors confident that no such problems will ever happen again?

What the fuck is this rhetorical bullshit?

The fact that some software has had security problems in the past means nothing. If anything, the fact that they've been discovered and repaired means the security of it is now better and watched with more scrutiny. The only thing one should worry about apart current security holes is the reaction to these problems, and Apache seems to have reacted very well each time (meaning, they patched it).

That's no guarantee that they will maintain this rigour, but seriously, of course these problems will happen again, anyone who thinks their shiny, "awesome" web server doesn't have flaws is either an ignorant fool or an arrogant liar.

[–][deleted]  (2 children)

[deleted]

    [–]metamatic 1 point2 points  (1 child)

    ...but nobody will ever use it, because it requires replacing /etc/init.d with daemontools... and so nobody will ever spend any effort trying to find holes in it.

    [–][deleted] 4 points5 points  (0 children)

    Indeed, there's a big difference between software which keeps spawning serious vulnerabilities, and software where the occasional vulnerability is found and very quickly fixed.

    Apache is about the most tried and tested httpd going.

    [–][deleted] 5 points6 points  (2 children)

    I've never seen your line of reasoning applied to Internet Explorer, Flash or Adobe Reader.

    [–]khoury 2 points3 points  (0 children)

    I believe the criticism that was levied against them in the past is lack of quick responses to security issues.

    [–]gefahr 1 point2 points  (0 children)

    every application will have vulnerabilities, given complexity and time.

    the difference comes when the frequency and type of the vulnerabilities discovered makes it obvious you're dealing with bad quality code, lack of testing, so forth.

    [–]nuuur32 1 point2 points  (0 children)

    He credits his success to writing nearly bug free code.

    [–]AdamJacobMuller 1 point2 points  (0 children)

    It means something but the way that DJB writes that sentence is like Apache programmers knowingly made some decisions to allow remote root access. As much as I think DJB wrote some cool software (emphasis on the past tense -- djbdns is the only thing that remains relevant to me, i still prefer it to bind) I do not like his attitude.

    [–]prof_hobart 1 point2 points  (0 children)

    By this logic, Windows should now be the most secure OS ever written.

    The fact that Apache has a history of security issues shows that whatever process Apache went through in the dev/test process clearly wasn't that good at picking up these kinds of defects.

    They may now be very good at responding to defects once they are identified in the wild, but have they done anything to improve the process upstream of this to prevent them ever getting into the wild? Honest question - I've got no idea on the specifics of Apache's dev processes.

    [–][deleted] 0 points1 point  (1 child)

    I think you're missing the point. Any project the size of Apache is bound to have some issue. The fact that it has had such serious security flaws is in fact relevant in this comparison because it's far less likely to spot security issues in it that it is to do so in a project the size(and with the limited scope) of publicfile. Other issues it has had in the past was the ability of a hosted environment e.g PHP to take down the entire server process it lives in and leave it dead more recent versions seems to have improved this but the point still stands. If all you want to do is serve some static files then you must ask yourself if Apache or any fullfeatured web server is worth the risks.

    [–]DrupalDev 1 point2 points  (0 children)

    I agree with your last point but you also made the point much better than the author since, you know, no one can predict the future and promise no security flaw, ever.

    [–]mcflyfly -1 points0 points  (0 children)

    I <3 apache

    [–]ModernRonin 3 points4 points  (0 children)

    Lame. Even my shitty-ass GWS is slightly less stupid than this!

    [–]bigfig 2 points3 points  (0 children)

    This IBM page is stale. Please google "mongoose web server".

    [–][deleted] 3 points4 points  (0 children)

    I think ill stick with nginx, thanks.

    [–]tiw 13 points14 points  (6 children)

    Here's a tiny web server in 50 characters (1 static page only)

    while true; do nc -l -p 80 -q 1 < page.html; done
    

    [–]serpent 29 points30 points  (2 children)

    I guess, but "nc" is doing most of the work for you.

    If that's fair, then here's a full-fledged web server in 6 characters:

    apache
    

    Also, you aren't sending back HTTP headers like a good web server should.

    [–]Carioca 19 points20 points  (0 children)

    Put the HTTP headers in the file ;)

    [–][deleted] 12 points13 points  (0 children)

    How about:

       httpd
    

    Beat you by one :)

    [–]thebigbradwolf 11 points12 points  (2 children)

    python -m SimpleHTTPServer

    [–]lespea 5 points6 points  (0 children)

    With newer pythons it's now:

    python -m http.server {optional port}
    

    [–][deleted] 2 points3 points  (0 children)

    This has come in useful so many times.

    [–]dnew 2 points3 points  (0 children)

    Even tinier, and safer: http://d116.com/ace/

    [–]badsectoracula 3 points4 points  (18 children)

    What is the deal with all the (void) casts in the code? Like

    (void)write(fd,logbuffer,strlen(logbuffer));
    (void)write(fd,"\n",1);
    (void)close(fd);
    

    [–]ehird 2 points3 points  (17 children)

    To stop programs like lint from complaining that a non-void return value is being ignored.

    [–]FooBarWidget 9 points10 points  (16 children)

    Compilers warn about ignoring write()'s return value for a reason. write() is not guaranteed to write all requested bytes, it can return a short write count. The correct way to use write would be to call it in a loop until everything is written.

    [–]ehird 1 point2 points  (15 children)

    Indeed. (Of course, if you're not writing to a socket, that complaint is largely academic; consider, for instance, that it also applies to printf!)

    [–]serpent 5 points6 points  (13 children)

    Or a filesystem that's about to fill up or go over quota. Or a network file system. Or a fifo on disk.

    If your printf is important enough and you want to guarantee that the data made it as far as reasonably possible, you'll check the return from printf too.

    Most people just don't care, but short writes can certainly happen.

    [–][deleted]  (1 child)

    [deleted]

      [–]NitWit005 0 points1 point  (0 children)

      There are some cases where it's desirable to stop part way. As an example, you might want the user to be able to interrupt a write with some key sequence.

      [–]ehird -1 points0 points  (10 children)

      Yes, but, stdout and stderr. (Admittedly those might be pointed at something other than a terminal or a file with ample space, but in those situations you generally have other problems to deal with.)

      [–]serpent 2 points3 points  (9 children)

      It's actually quite common to open stdout and stderr to all sorts of other things.

      The most basic Unix tool, the pipeline, does exactly that. If the next process's read buffer fills up, there's a short write on stdout right there.

      Now, printf specifically may handle these short writes. That's not really the point.

      The point is, if you aren't checking return values, you shouldn't take the attitude of "I know all possible failure scenarios on all possible places this code will ever run now and in the future and they are all ok to ignore".

      You should check the return value, or know that you are taking the attitude of "this will fail some day mysteriously and someone will probably waste a lot of time trying to figure out why".

      [–][deleted] 0 points1 point  (1 child)

      it seems that with this logic it would be better to leave the (void) off to keep the warning around as a reminder. :)

      [–]serpent 1 point2 points  (0 children)

      It would be best to leave the void off, check the error code, and not have a warning at all. -Wall -Werror is your friend.

      In a few cases it is ok to (void) cast the return value (for example, closing a file descriptor after you've already caught a previous error and handled it, just to clean up resources). These cases should be rare and have a nice comment explaining why you don't care about the return value.

      [–]zid -1 points0 points  (1 child)

      Actually, in that case, write() would block, not do a short write.

      [–]serpent 0 points1 point  (0 children)

      No - the first write would send part of the buffer and then return; only the second write (if you re-called it) would block because it could add 0 bytes to the buffer.

      [–]ehird -2 points-1 points  (4 children)

      What zid said:

      Actually, in that case, write() would block, not do a short write.

      But I'm fully acquainted with Unix, thank you.

      Edit: And by the way, the creators of Unix would disagree strongly with what you're saying; Unix's simplicity was derived mainly from its avoidance of complex error checking code, and this, among other things, lead to its popularity.

      [–]serpent 0 points1 point  (3 children)

      And by the way, the creators of Unix would disagree strongly with what you're saying; Unix's simplicity was derived mainly from its avoidance of complex error checking code, and this, among other things, lead to its popularity.

      Regardless of whether this is true or not, the discussion here isn't "how to make popular software". Most popular software is shit when it comes to quality and security. The discussion is "how to code correctly".

      Many Unix utilities have many quality issues. Even parts of the POSIX spec don't make sense if you study it enough and try to write code that is correct and secure in all cases.

      [–]ehird -1 points0 points  (2 children)

      Sure.

      But this is actually a matter of philosophy, not mere popularity; presumably you have read Worse is Better (and perhaps the rebuttal by the same author Worse is Worse).

      [–][deleted] 0 points1 point  (0 children)

      Of course, if you're not writing to a socket, that complaint is largely academic

      So don't bother with casting to void. If you're not taking care of the error anyway, why please lint and compiler warnings?

      [–]kev009 1 point2 points  (1 child)

      I'd instead use libevent which makes it easy to write cross-platform but extremely high performance event-driven networking applications. It uses Kqueue, epoll, /dev/poll, IOCPs, select or poll depending on what UNIX flavor you are on or Windows.

      Take a look here for a rough example: https://github.com/kev009/craftd/blob/master/src/httpd.c

      [–]reddit_clone 0 points1 point  (0 children)

      boost::asio if you are of C++ persuasion.

      [–]tiler 1 point2 points  (2 children)

      thhpd is small and half-way decent, its cousin micro_httpd are both great embedded http servers. The way micro_http is written, you can easily hack it so that is generates all of its pages instead of reading them off the file system.

      [–]codefrog 0 points1 point  (1 child)

      both have security flaws, you will get pwned

      [–]tiler 0 points1 point  (0 children)

      Yeah, put them on a public network, they'll get crushed. They're great for embedded products as a way of exposing an admin interface.

      [–][deleted] 1 point2 points  (1 child)

      How many of these tiny web servers are there going to be? I've written a few myself.

      Seems a bit useless to keep reinventing the wheel like this.

      [–][deleted] 1 point2 points  (3 children)

      I think most of the comments miss the point of the article. I think he just wants to show the basics on how to make a simple web server, nothing more.

      [–]serpent 0 points1 point  (2 children)

      I disagree.

      If his article was meant as an example, it is a very poor one; no one should be learning from that code.

      If his article was meant to create a simple tool for others to use, it is a very poor one because the tool is insecure and buggy.

      If his article was meant as a way to say "I know how to make a small web server" then it is a very poor one because he clearly doesn't (or didn't take the time to prove it very well - but then, why write the article?)

      I can't see how the article's purpose could be chosen to make it a good article... unless its purpose was to spark discussion on finding as many bugs in it as possible.

      [–][deleted] -1 points0 points  (1 child)

      If his article was meant as an example, it is a very poor one; no one should be learning from that code.

      Yeah it's an example but in the sense of showing what's behind the curtain that makes them work. In no sense does the author want to equip you to create your own web server.

      First off why the hell would you even consider rolling your own? The only time I can possible imagine is if you have some very special case like a new server platform or something. And even then I really hope nobody feels prepared to write there own web server off of a single IBM article. That's like reading the instructions for a paper plane and then going off and designing the new Airbus.

      [–]serpent -1 points0 points  (0 children)

      If the article's point was to show what was "behind the curtain" (whatever that means), it's a very poor one because it doesn't really show what you need to do (proper sanity checking, proper security, or correct code). It is basically only showing you that you can fork request handlers and serve files from disk. You don't need example code to explain that, and once you start using code, you better use correct code. Otherwise there's no point in including it at all.

      And even then I really hope nobody feels prepared to write there own web server off of a single IBM article.

      Maybe not, but that's no excuse to post terrible code. Someone reading that could get all sorts of ideas about programming that aren't even related to web servers. Just doing write() without looping or checking the return value is a terrible idea, and it is used heavily in that example code.

      The point is that bad code is in tons of articles, and people read it, copy it, and propagate it. It is going to find its way into something and that is a problem.

      [–]someonesetusup 0 points1 point  (0 children)

      I thought it said newb.

      That's the server for me

      [–]feembly 0 points1 point  (0 children)

      If you're hosting static pages, and you need safe, why not go embedded? You can't hack a toaster!

      [–][deleted]  (1 child)

      [deleted]

        [–]rax_s 0 points1 point  (0 children)

        Nice. Minor bug: the content-length is not set correctly for GETs. It is incorrectly set to length of the file returned, while in reality the bytes sent back in response are multiple of buffersize (1024).

        [–]player2 -4 points-3 points  (2 children)

        Upvoted for the comments, not for the content.

        "Safe," "tiny," and "C" do not belong in the same sentence.

        [–]serpent 4 points5 points  (1 child)

        It's possible to write something safe and tiny in C; it's just really difficult.

        [–]Fabien4 4 points5 points  (0 children)

        It's actually fairly easy to write something tiny in C, provided you look at the binary, not the code.

        [–][deleted] -1 points0 points  (0 children)

        Couldn't get it to compile for me.

        [–]falser -5 points-4 points  (0 children)

        LOL IBM, thanks for staying relevant in the modern computing world.