A coroutine-based HTTP server modeled after archibate/co_async.
Task management:
- Create an entrypoint task.
- This task may spawn other tasks, leaving them in different schedulers.
- A task is either running, staying in schedulers to be resumed, or cancelled.
- Schedulers collaborate and the entrypoint task finally finishes.
Schedulers:
TimedScheduler
- Stores coroutines waiting for time.
- e.g. sleep_for and sleep_until.
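As a rough illustration of how a scheduler can store waiters ordered by wake-up time, here is a minimal sketch. It is not the project's TimedScheduler: the names DemoTimedScheduler, TimedEntry and run_expired are hypothetical, a min-heap is an assumed data structure, and a std::function callback stands in for resuming a coroutine handle.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <functional>
#include <queue>
#include <utility>
#include <vector>

using Clock = std::chrono::steady_clock;

// One sleeping waiter: when to wake it and how to resume it.
// (Assumption: a callback stands in for coroutine_handle::resume().)
struct TimedEntry {
    Clock::time_point deadline;
    std::function<void()> resume;
    bool operator>(const TimedEntry& other) const { return deadline > other.deadline; }
};

// Hypothetical scheduler: a min-heap keyed by deadline.
class DemoTimedScheduler {
public:
    void sleep_until(Clock::time_point deadline, std::function<void()> resume) {
        heap_.push(TimedEntry{deadline, std::move(resume)});
    }

    // Resumes every waiter whose deadline is <= now; returns how many ran.
    std::size_t run_expired(Clock::time_point now) {
        std::size_t ran = 0;
        while (!heap_.empty() && heap_.top().deadline <= now) {
            std::function<void()> resume = heap_.top().resume;  // copy before pop
            heap_.pop();
            resume();
            ++ran;
        }
        return ran;
    }

    bool empty() const { return heap_.empty(); }

private:
    std::priority_queue<TimedEntry, std::vector<TimedEntry>, std::greater<TimedEntry>> heap_;
};
```

A main loop would call run_expired(Clock::now()) on each iteration, so waiters are resumed in deadline order regardless of the order in which they went to sleep.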
EpollScheduler (Blocking)
- Stores coroutines waiting for files to become ready.
- e.g. AsyncFile, wait_file_event, etc.
- NOTE: Instead of registering function pointers to epoll_event, every waiter registers an EpollFilePromise*. When epoll signals an event, it provides us with a coroutine handle to resume. Unlike a standard function pointer, when a coroutine returns from the resume() call, it does not necessarily reach its conclusion. It can be launched, but not finished.
A blocking scheduler should be put at the end of the loop, so that it blocks only when no other tasks are ready to run. Blocking schedulers normally take a timeout so the loop can periodically check for new tasks.
File operations:
- AsyncFile is to a file descriptor what std::unique_ptr is to a raw pointer.
- AsyncFileStream serves as a wrapper for AsyncFile, analogous to how FILE* operates. Its operations encapsulate getc and putc around a FILE* that is associated with a non-blocking file descriptor.
- AsyncFileBuffer maintains its own buffer and interacts directly with read()/write(), so it can get more accurate feedback on errors and potentially perform better.
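To make the std::unique_ptr analogy concrete, here is a minimal sketch of a move-only file descriptor owner. UniqueFd and is_open are hypothetical names chosen for this example; the project's AsyncFile will differ in detail (for instance, it presumably also deals with non-blocking mode).

```cpp
#include <fcntl.h>
#include <unistd.h>
#include <cassert>
#include <utility>

// Hypothetical move-only owner of a file descriptor.
class UniqueFd {
public:
    UniqueFd() = default;
    explicit UniqueFd(int fd) noexcept : fd_(fd) {}

    UniqueFd(const UniqueFd&) = delete;             // no copies: exactly one owner
    UniqueFd& operator=(const UniqueFd&) = delete;

    UniqueFd(UniqueFd&& other) noexcept : fd_(other.release()) {}
    UniqueFd& operator=(UniqueFd&& other) noexcept {
        if (this != &other) reset(other.release());
        return *this;
    }

    ~UniqueFd() { reset(); }                        // close on scope exit

    int get() const noexcept { return fd_; }
    int release() noexcept { return std::exchange(fd_, -1); }  // give up ownership
    void reset(int fd = -1) noexcept {
        if (fd_ >= 0) ::close(fd_);
        fd_ = fd;
    }
    explicit operator bool() const noexcept { return fd_ >= 0; }

private:
    int fd_ = -1;  // -1 means "owns nothing"
};

// Returns true if fd currently refers to an open descriptor.
bool is_open(int fd) { return fcntl(fd, F_GETFD) != -1; }
```

As with std::unique_ptr, moving transfers ownership and leaves the source empty, and the descriptor is closed exactly once when the last owner goes out of scope.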
Utilities:
when_all and when_any
- They both assume the tasks passed as arguments are not in the scheduler.
- When the last task of the when_all group finishes, it awakes the previously suspended task (which is waiting for the when_all coroutine to finish).
- When the first task finishes, when_any destroys the other tasks by returning from the coroutine body and letting the temporary tasks' destructors destroy the coroutine handles and remove them from the scheduler.
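The "last task wakes the waiter" rule for when_all boils down to a shared countdown. The sketch below shows only that counting logic, using a plain callback in place of the suspended coroutine's handle. WhenAllCounter and child_finished are hypothetical names, and this is a callback-based analogy, not the project's coroutine implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <utility>

// Hypothetical shared state for one when_all group.
struct WhenAllCounter {
    std::size_t remaining;              // children that have not finished yet
    std::function<void()> wake_waiter;  // stands in for waiter_handle.resume()

    WhenAllCounter(std::size_t count, std::function<void()> wake)
        : remaining(count), wake_waiter(std::move(wake)) {}

    // Called by each child task when it finishes.
    void child_finished() {
        if (--remaining == 0) {
            wake_waiter();  // only the last finisher wakes the waiting task
        }
    }
};
```

when_any would differ in that the first finisher wakes the waiter, after which the remaining children are destroyed rather than awaited.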
First, download googletest:
git submodule add https://github.com/google/googletest.git extern/googletest
git submodule update --init --recursive
Caution: You should run git submodule add at the root of the project's workspace.
Second, download this dependency (this library translates http status codes to strings):
git submodule add https://github.com/j-ulrich/http-status-codes-cpp.git extern/http_status_code
git submodule update --init --recursive
Then run cmake commands to build the project. (You can run unit tests in the build directory by running ctest.)
Development setup:
- Compiler: GCC 13.3.0
- System: Ubuntu 24.04.1 LTS (WSL2)
- CPU: Intel® Core™ Ultra 9 Processor 185H
Compile flags: -O2 -g.
When the program is almost always I/O-ready:
wrk -t12 -c1000 -d20s http://localhost:9000/repeat\?count\=10000
- Blocking version (example/server_blocking.cpp): Requests/sec: 42020.76
- Coroutine version (example/server_epoll_coro.cpp): Requests/sec: 33691.99
In this case, coroutines and non-blocking I/O make the performance worse!
When the program needs to wait for other time-consuming operations:
wrk -t12 -c1000 -d20s http://localhost:9000/sleep\?ms\=<ms>
Requests/sec by sleep duration:

| ms | 0 | 1e-6 | 1e-5 | 1e-4 | 1e-3 | 1e-2 | 0.1 | 1 | 10 |
|---|---|---|---|---|---|---|---|---|---|
| blocking | 54294.19 | 47433.17 | 45850.88 | 11849.45 | 11283.70 | 10186.02 | 5107.66 | 865.31 | 96.58 |
| coro | 45168.30 | 42747.37 | 42881.30 | 47369.20 | 45861.48 | 45144.80 | 56382.97 | 61340.93 | 42636.99 |
It's interesting that when coroutines sleep for a while, they work better 😂. I suspect that when they are not scheduled immediately, epoll_wait() + accept() can accept more incoming connections. It's like calling more people into a restaurant. They just end up waiting a long time for their food to arrive and having a bad experience, but the restaurant earns more.
For reference, archibate/co_async/example/server.cpp achieves 81044.05 requests/s in my test. Probably due to io_uring?
To further improve the performance, I can still:
- Utilize a thread pool.
- Currently the project only gives a demo in a single thread.
- Switch to liburing-based I/O.
- Imagine io_uring as a way of issuing async syscalls to the Linux kernel without doing it directly in your program (epoll_wait() + read()/write()). Not only is the number of syscalls greatly reduced, but you also don't have to wait for read()/write() to finish. Moreover, io_uring supports zero-copy.