
new epoll I/O manager

Duncan Coutts requested to merge dcoutts/ghc:dcoutts/io-manager-epoll into master

Extending MR !9677, this adds a new I/O manager using the new scheme for in-RTS I/O managers. It is intended to be a high-performance, platform-specific one, based on the Linux epoll API.

It is intended to become the default I/O manager for Linux for the non-threaded RTS.

Ignore all but the final patch. The others come from !9676 (closed) and !9677.

This one uses the infrastructure introduced in the previous MRs, so relatively little is added in this MR specifically for this I/O manager. As such, it can serve as a template for other platform-specific I/O managers, such as ones for kqueue, IOCP, io_uring etc.

I also have a benchmark (not included in this MR) that demonstrates this I/O manager getting better performance than the threaded RTS MIO I/O manager. It is a simple benchmark where one has N O/S pipes (e.g. 1000+) and N-1 threads copying data from one pipe to the next. One then injects a byte into the first pipe, waits for it to come out the other end, and repeats that M times.
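For concreteness, the pipe-ring benchmark described above could be sketched roughly as follows. This is a minimal illustrative sketch, not the actual benchmark code from the MR; the sizes and structure are assumptions based on the description.

```haskell
import Control.Concurrent (forkIO)
import Control.Monad (forM_, forever, replicateM)
import System.IO (BufferMode (NoBuffering), hGetChar, hPutChar, hSetBuffering)
import System.Process (createPipe)

main :: IO ()
main = do
  let n = 1000   -- number of O/S pipes in the ring (illustrative)
      m = 100    -- number of round trips (illustrative)
  -- Create N pipes; createPipe returns (readEnd, writeEnd) Handles.
  pipes <- replicateM n createPipe
  forM_ pipes $ \(r, w) -> do
    hSetBuffering r NoBuffering
    hSetBuffering w NoBuffering
  -- N-1 threads, each copying bytes from pipe i to pipe i+1.
  forM_ (zip pipes (tail pipes)) $ \((r, _), (_, w)) ->
    forkIO $ forever (hGetChar r >>= hPutChar w)
  -- Inject a byte into the first pipe and wait for it to emerge
  -- from the last; each round trip makes every thread block on
  -- I/O once, exercising the I/O manager at each hop.
  let (_, firstWrite) = head pipes
      (lastRead, _)   = last pipes
  forM_ [1 .. m :: Int] $ \_ -> do
    hPutChar firstWrite 'x'
    _ <- hGetChar lastRead
    return ()
```

Run with the non-threaded RTS and select the I/O manager via `+RTS --io-manager=...` to compare implementations, as in the timings below.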

Sample output from my machine:

$ time ./IOManagerBench +RTS --io-manager=select
I/O manager benchmark
IOManagerBench: file descriptor 2005 out of range for select (0--1024).
Recompile with -threaded to work around this.

$ time ./IOManagerBench +RTS --io-manager=poll
I/O manager benchmark

real	0m12.730s
user	0m2.457s
sys	0m10.215s

$ time ./IOManagerBench +RTS --io-manager=epoll
I/O manager benchmark

real	0m2.126s
user	0m1.519s
sys	0m0.554s

$ time ./IOManagerBench +RTS --io-manager=mio
I/O manager benchmark

real	0m2.815s
user	0m2.300s
sys	0m0.464s

Of course select fails, because 1000 pipes is 2000 FDs, well over select's limit of 1024. Satisfyingly, poll works but is very slow. Then epoll and mio give similarly good performance, with the non-threaded RTS epoll beating the threaded RTS mio on user time, but mio beating epoll on kernel time.

The reason for the latter is that mio uses an optimisation that avoids doing one system call per I/O wait. We should be able to do the same for epoll later, if we require clients to use a new deregister-on-close API, as is used in MIO (closeFd).
