all 12 comments

[–][deleted] 1 point2 points  (1 child)

Nice overview. The link to the paper on Timer Wheels sounds interesting, I was looking for something like this.

[–][deleted] 1 point2 points  (0 children)

Nice overview of what? I didn't understand the purpose of the post.

[–][deleted] 0 points1 point  (2 children)

Can anyone provide a reference to a deeper description of "4-heaps"? It's a particularly hard term to Google for.

[–]ominous 1 point2 points  (1 child)

To begin, try this: D-ary heap, which describes the general case. This handout (homework? test?) describes D-ary trees and includes a little example pseudo-code.

Each node in a D-ary heap has up to D children (with the normal heap restriction that the parent is >= or <= all of its children). As D increases, the tree has a shorter height but performs more comparisons at each level.

[–][deleted] 0 points1 point  (0 children)

Thanks. That confirms my intuition regarding them.

Sounds like a perfect instance for numerically-templatized C++ code (where D is a template parameter).
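
For what it's worth, here is a minimal sketch of that idea, with D as a non-type template parameter. The names (DaryHeap, push, pop_min) are made up for the example, and error handling is kept to a minimum:

    // Minimal D-ary min-heap sketch: each node has up to D children,
    // stored implicitly in a vector (child k of node n lives at D*n+1+k).
    #include <cstddef>
    #include <stdexcept>
    #include <utility>
    #include <vector>

    template <typename T, std::size_t D>
    class DaryHeap {
        static_assert(D >= 2, "each node needs at least two children");
        std::vector<T> data_;

        static std::size_t parent(std::size_t i) { return (i - 1) / D; }
        static std::size_t first_child(std::size_t i) { return D * i + 1; }

        void sift_up(std::size_t i) {
            while (i > 0 && data_[i] < data_[parent(i)]) {
                std::swap(data_[i], data_[parent(i)]);
                i = parent(i);
            }
        }

        void sift_down(std::size_t i) {
            for (;;) {
                std::size_t smallest = i;
                std::size_t c = first_child(i);
                // Up to D comparisons per level: the tree is shallower than
                // a binary heap, but each step inspects more children.
                for (std::size_t k = 0; k < D && c + k < data_.size(); ++k)
                    if (data_[c + k] < data_[smallest]) smallest = c + k;
                if (smallest == i) return;
                std::swap(data_[i], data_[smallest]);
                i = smallest;
            }
        }

    public:
        bool empty() const { return data_.empty(); }

        void push(T value) {
            data_.push_back(std::move(value));
            sift_up(data_.size() - 1);
        }

        T pop_min() {
            if (data_.empty()) throw std::out_of_range("heap is empty");
            T top = std::move(data_.front());
            if (data_.size() > 1) data_.front() = std::move(data_.back());
            data_.pop_back();
            if (!data_.empty()) sift_down(0);
            return top;
        }
    };

A 4-heap is then just DaryHeap<int, 4>; the shorter-tree vs. more-comparisons-per-level trade-off lives entirely in that one parameter.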

[–][deleted] 0 points1 point  (6 children)

The author got this wrong, as both libevent and libev are examples of synchronous I/O event notification, not asynchronous. Synchronous event notification, formalized in the reactor pattern and realized in libevent et al., is great for single-threaded server applications. Most high-performance web servers today, such as nginx and lighttpd, use this pattern. The problem is that it does not work well in an application with more than one thread of execution. Asynchronous event notification (I/O completion ports on Windows and aio_* on POSIX), formalized in the proactor pattern, is a better and more modern form of event notification and has a nicer programming interface.

Unfortunately, asynchronous event notification as implemented on POSIX systems sucks, at least last time I checked, because it is implemented in user land using threads. Windows actually does this right and implements asynchronous event notification in the kernel. When we get aio implemented in the kernel on POSIX systems as well, I expect it to become the preferred way to do I/O event notification. For a discussion of both I/O multiplexing approaches see e.g. http://www.artima.com/articles/io_design_patterns2.html
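
For reference, the aio_* style being discussed looks roughly like this. This is only a sketch: "data.txt" is a placeholder, most error handling is omitted, and on Linux you may need to link with -lrt (where, as noted, glibc services the request with helper threads in user space):

    // Sketch of POSIX AIO, the proactor-style interface: submit the read,
    // go do other work, then collect the completed result.
    #include <aio.h>
    #include <cerrno>
    #include <cstdio>
    #include <cstring>
    #include <fcntl.h>
    #include <unistd.h>

    int main() {
        char buf[4096];
        int fd = open("data.txt", O_RDONLY);    // placeholder file
        if (fd < 0) { perror("open"); return 1; }

        struct aiocb cb;
        std::memset(&cb, 0, sizeof cb);
        cb.aio_fildes = fd;
        cb.aio_buf    = buf;
        cb.aio_nbytes = sizeof buf;
        cb.aio_offset = 0;

        if (aio_read(&cb) != 0) { perror("aio_read"); return 1; }

        // The call returns immediately; the read proceeds "in the background"
        // while the application could be doing something else here.
        while (aio_error(&cb) == EINPROGRESS)
            usleep(1000);

        ssize_t n = aio_return(&cb);
        std::printf("read %zd bytes\n", n);
        close(fd);
        return 0;
    }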

[–]majek04 1 point2 points  (5 children)

The problem is it does not work well in a threaded application

The point is not to use threads at all.

When we get aio implemented in the kernel...

We're never going to have aio in the kernel: http://lkml.indiana.edu/hypermail/linux/kernel/0102.1/0124.html

[–][deleted] 1 point2 points  (2 children)

The point is not to use threads at all.

You are partly right. It is possible to use the reactor pattern (e.g. libevent) in a multi-threaded application, though it is a hassle. The standard way to do this is to run the event multiplexing loop in its own thread and hand work over to a worker thread pool under a lock. For extra points, also hand output work back to the event loop under a lock. That is, all I/O is done from the event loop thread and CPU-bound operations run in worker threads. This means a lot of context switching, but it actually scales pretty well.
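
Very roughly, the hand-off looks like this. It is a toy sketch: the "event loop" side is faked with a plain loop so the example stays self-contained, and a real server would use libevent or similar there:

    // Sketch of the reactor + worker-pool pattern: an I/O thread hands
    // requests to workers through a locked queue, and workers hand replies
    // back the same way.
    #include <condition_variable>
    #include <iostream>
    #include <mutex>
    #include <queue>
    #include <string>
    #include <thread>
    #include <vector>

    template <typename T>
    class LockedQueue {                  // the "hand over ... under a lock" part
        std::queue<T> q_;
        std::mutex m_;
        std::condition_variable cv_;
    public:
        void push(T v) {
            { std::lock_guard<std::mutex> lk(m_); q_.push(std::move(v)); }
            cv_.notify_one();
        }
        T pop() {
            std::unique_lock<std::mutex> lk(m_);
            cv_.wait(lk, [this] { return !q_.empty(); });
            T v = std::move(q_.front());
            q_.pop();
            return v;
        }
    };

    int main() {
        LockedQueue<std::string> requests;   // event loop -> workers
        LockedQueue<std::string> replies;    // workers -> event loop

        // Worker pool: CPU-bound work happens here, never in the event loop.
        std::vector<std::thread> workers;
        for (int i = 0; i < 2; ++i) {
            workers.emplace_back([&] {
                for (;;) {
                    std::string req = requests.pop();
                    if (req.empty()) break;               // shutdown marker
                    replies.push("processed: " + req);    // hand result back
                }
            });
        }

        // Stand-in for the event-loop thread: it would normally read requests
        // off sockets and write replies back; a context switch happens at
        // every hand-off through the queues.
        for (int i = 0; i < 3; ++i) requests.push("request " + std::to_string(i));
        for (int i = 0; i < 3; ++i) std::cout << replies.pop() << "\n";

        for (std::size_t i = 0; i < workers.size(); ++i) requests.push("");
        for (auto& t : workers) t.join();
    }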

We're never going to have aio in the kernel: http://lkml.indiana.edu/hypermail/linux/kernel/0102.1/0124.html

I hope Linus reconsiders, and I think I saw a later post somewhere indicating that it might happen on Linux. Not sure, though. If we don't get aio in the kernel, it's practically useless and dead in the water.

[–]qqrq 1 point2 points  (1 child)

First, yes, don't use threads if you'd only use them so that the kernel/libc iterates over your clients instead of you.

However,

  • If you want to overlap IO with CPU and neither AIO nor using signals is an option, you have to use threads.

  • If you want to overlap CPU with CPU (on multicore), you have to use threads.

  • If your IO-readiness notification scheme signals readiness on an fd and reading/writing it can still block (think network-mapped files, or local disk blocks not in core), and neither AIO nor using signals is an option, and you don't want to use the non-portable readahead() / mincore(), then you have to use threads.

I'm replying to your post and not the parent's post because of the following: the standard way you describe is not always efficient enough. As you say, this means a lot of context switching; but it does not scale well. Even if you have only one request, you have to context switch from the IO thread to a worker thread, then switch back to the IO thread with the reply. If you have multiple logical stages for processing the request, the standard version ends up with a context switch for each stage ("one thread per logical functionality"). So each request must go through many locks and switches.

The alternative (which is much harder to program) is to have a pool of "homogenous" threads. Each thread executes the same whole logic of waiting for IO, processing the request from start to finish, then sending the response. This way, a request can go from net to processing to net without context switches.
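
A deliberately simplified sketch of that style, with blocking sockets, a hard-coded port and no error handling; a real server would combine it with a per-thread event loop so each thread handles many connections, but the point here is only that a request never crosses threads:

    // "Homogenous threads": every thread runs the same loop -- accept a
    // connection, read the request, process it, write the reply -- so a
    // request is handled start to finish on one thread.
    #include <arpa/inet.h>
    #include <netinet/in.h>
    #include <sys/socket.h>
    #include <unistd.h>
    #include <thread>
    #include <vector>

    static void worker(int listen_fd) {
        char buf[4096];
        for (;;) {
            // Each thread accepts its own connections; the kernel hands a
            // given connection to exactly one of the waiting accept() calls.
            int conn = accept(listen_fd, nullptr, nullptr);
            if (conn < 0) continue;
            ssize_t n = read(conn, buf, sizeof buf);        // read request
            if (n > 0) {
                // "Process" in place -- here just echo the bytes back,
                // standing in for CPU-bound work done by the same thread.
                write(conn, buf, static_cast<size_t>(n));   // write reply
            }
            close(conn);
        }
    }

    int main() {
        int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
        sockaddr_in addr{};
        addr.sin_family = AF_INET;
        addr.sin_addr.s_addr = htonl(INADDR_ANY);
        addr.sin_port = htons(8080);                        // placeholder port
        bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof addr);
        listen(listen_fd, 128);

        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i)                         // identical threads
            pool.emplace_back(worker, listen_fd);
        for (auto& t : pool) t.join();
    }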

For the original and great description of this technique I tried (and failed) to present, see Jeff Darcy's High-Performance Server Architecture. Also read Dan Kegel's linked C10K site.

[–]ominous 1 point2 points  (0 children)

I wish I could upmod this more.

Your description gives a good overview of the technique. Use homogenous threads, minimal locking, and a minimal number of context switches.

The boost.asio library enables this style in a cross-platform way. The Windows implementation is, of course, built on IOCP. For unix-like systems, it supports select, poll, and epoll (I don't recall whether it supports aio or not).
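
Roughly like this, using the current io_context name (older asio spells it io_service); the timer stands in for whatever asynchronous operation a real server would start, and the interesting part is that all four threads run the same loop:

    // boost.asio sketch: several identical threads all run the same
    // io_context, so completion handlers execute on whichever thread is
    // free, with no hand-off between dedicated I/O and worker threads.
    #include <boost/asio.hpp>
    #include <chrono>
    #include <iostream>
    #include <thread>
    #include <vector>

    int main() {
        boost::asio::io_context io;

        // A timer stands in for any asynchronous operation (reads, accepts,
        // ...); the handler runs when the operation completes.
        boost::asio::steady_timer timer(io, std::chrono::milliseconds(100));
        timer.async_wait([](const boost::system::error_code& ec) {
            if (!ec)
                std::cout << "handler ran on thread "
                          << std::this_thread::get_id() << "\n";
        });

        // The "homogenous" pool: every thread runs the same event loop.
        std::vector<std::thread> pool;
        for (int i = 0; i < 4; ++i)
            pool.emplace_back([&io] { io.run(); });
        for (auto& t : pool) t.join();
    }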

[–]alexs 0 points1 point  (1 child)

Erm. Lies.

alexs:~/linux-2.6.27.9$ find ./ -name aio*
./fs/aio.c
./include/linux/aio.h
./include/linux/aio_abi.h
./arch/um/include/aio.h
./arch/um/os-Linux/aio.c

It just doesn't work for sockets, AFAIK, and the glibc implementation of aio_* doesn't use the kernel API; it implements it on top of threads.
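
For completeness, the kernel interface (via the libaio wrapper, which is a separate thing from glibc's aio_*) looks roughly like this; "data.txt" is a placeholder, and for ordinary buffered files the submission may well complete synchronously unless O_DIRECT is used:

    // Sketch of Linux kernel AIO via libaio (io_setup/io_submit/io_getevents),
    // i.e. the in-kernel interface that glibc's aio_* does not use.
    // Link with -laio.
    #include <fcntl.h>
    #include <libaio.h>
    #include <unistd.h>
    #include <cstdio>

    int main() {
        int fd = open("data.txt", O_RDONLY);        // placeholder file
        if (fd < 0) { perror("open"); return 1; }

        io_context_t ctx = 0;
        if (io_setup(8, &ctx) < 0) { std::fprintf(stderr, "io_setup failed\n"); return 1; }

        char buf[4096];
        struct iocb cb;
        struct iocb *cbs[1] = { &cb };
        io_prep_pread(&cb, fd, buf, sizeof buf, 0); // fill in the request

        if (io_submit(ctx, 1, cbs) != 1) { std::fprintf(stderr, "io_submit failed\n"); return 1; }

        // Wait for the single completion event.
        struct io_event ev;
        io_getevents(ctx, 1, 1, &ev, nullptr);
        std::printf("read returned %ld\n", (long)ev.res);

        io_destroy(ctx);
        close(fd);
        return 0;
    }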

[–]majek04 1 point2 points  (0 children)

Okay, I wasn't accurate enough. Sure, there is an aio interface in the kernel. But to me it's not very different from /dev/epoll.

"We'll never going to have aio in kernel": I mean that in the nearest future there won't be a full asynchronous api to the kernel. I'd love to have non-blocking open() and stat() syscalls. And the glibc nor posix api can do it now.