hachettp
A small, single-thread-per-core HTTP/1.1 server library in C. One static
library (libhachettp.a), no runtime dependencies — talks to the kernel
directly (raw io_uring syscalls, not liburing).
#include <ht/ht.h>

static void on_request(const ht_config_t* cfg, const ht_request_t* req,
                       ht_response_t* res, ht_alloc_t* a) {
    (void) cfg; (void) req;
    ht_response__text(res, 200, "hello from hachettp\n", a);
}

int main(void) {
    ht_config_t cfg = {0};
    ht_server_t server = {.config = &cfg};
    if (ht_server__listen(&server, nullptr, 8080, 64) != 0) return 1;

    ht_alloc_t* a = ht_alloc__init();
    ht_loop_t* loop = nullptr;
    ht_loop__init(&loop, a);
    ht_signal__install_default(&server);

    ht_acceptor_t acc = {0};
    ht_server__serve(&acc, &server, loop, a, on_request);
    ht_loop__run(loop);

    ht_server__serve_finalize(&acc);
    ht_loop__free(loop, a);
    ht_server__close(&server);
    ht_alloc__finalize(a);
}
curl http://localhost:8080/ → hello from hachettp.
What it is
- HTTP/1.1 request parsing, response writing, connection lifecycle.
- Async proactor: one event loop per thread, callback-driven I/O.
- Zero hidden allocations on the hot path. Caller-allocated acceptor, per-conn freelist, pluggable allocator.
- Zero-copy file serving: sendfile(2) on Linux/BSD, TransmitFile on Windows.
- Cross-platform: epoll / io_uring (Linux), kqueue (macOS/BSD), IOCP (Windows). Backend is a compile-time choice.
- C23, no third-party deps.
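The "caller-allocated, per-conn freelist" idea above can be sketched in a few lines. This is an illustration only: `conn_t`, `conn_pool_t`, `POOL_CAP`, and the `conn_pool__*` functions are hypothetical names, not hachettp's types or API. It shows a fixed array of slots threaded onto a free list, so taking and returning a connection slot is O(1) and never touches the heap.

```c
#include <assert.h>
#include <stddef.h>

#define POOL_CAP 4 /* illustrative capacity (assumption) */

/* Hypothetical per-connection state; not hachettp's real layout. */
typedef struct conn {
    struct conn* next_free; /* link used only while the slot is free */
    int fd;
} conn_t;

/* Caller-allocated pool: all storage lives inside this struct. */
typedef struct {
    conn_t slots[POOL_CAP];
    conn_t* free_head;
} conn_pool_t;

/* Thread every slot onto the free list. No heap allocation. */
static void conn_pool__init(conn_pool_t* p) {
    p->free_head = NULL;
    for (size_t i = 0; i < POOL_CAP; i++) {
        p->slots[i].fd = -1;
        p->slots[i].next_free = p->free_head;
        p->free_head = &p->slots[i];
    }
}

/* Pop a slot in O(1); returns NULL when the pool is exhausted. */
static conn_t* conn_pool__get(conn_pool_t* p) {
    conn_t* c = p->free_head;
    if (c != NULL) p->free_head = c->next_free;
    return c;
}

/* Push a slot back in O(1) so the next accept can reuse it. */
static void conn_pool__put(conn_pool_t* p, conn_t* c) {
    assert(c != NULL);
    c->fd = -1;
    c->next_free = p->free_head;
    p->free_head = c;
}
```

Because the pool is owned by one loop on one thread, the sketch needs no locking; an exhausted pool is reported to the caller rather than silently growing.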
What it isn't
- Not HTTP/2, not HTTP/3.
- No TLS — terminate at a reverse proxy.
- Not a general async runtime; the loop only deals in HTTP-shaped I/O.
- Pre-1.0; the public API is stabilizing but not frozen.
Build & install
just build # Debug + io_uring (Linux default)
just build --target=Release
just run --target=Release # serves CWD on :3000
just install --target=Release --prefix ~/.local
The just recipes wrap CMake; raw CMake works too:
cmake -S . -B build -DCMAKE_BUILD_TYPE=Release -DHT_IO_URING=ON
cmake --build build
cmake --install build --prefix ~/.local
HT_IO_URING=OFF falls back to epoll on Linux. HT_DEBUG_LOG=ON
turns on HT_DLOG() traces, which are otherwise no-ops in release builds.
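For readers unfamiliar with compile-time logging switches, here is a minimal sketch of how a macro like HT_DLOG() can vanish when its flag is off. `EX_DEBUG_LOG`, `EX_DLOG`, and `ex_handle` are hypothetical stand-ins; the real macro may be defined differently.

```c
#include <assert.h>
#include <stdio.h>

/* EX_DEBUG_LOG stands in for a build flag such as -DEX_DEBUG_LOG=1.
   Hypothetical name; hachettp's own switch is HT_DEBUG_LOG. */
#ifndef EX_DEBUG_LOG
#define EX_DEBUG_LOG 0
#endif

#if EX_DEBUG_LOG
#define EX_DLOG(...)                                          \
    do {                                                      \
        fprintf(stderr, "[dbg] %s:%d: ", __FILE__, __LINE__); \
        fprintf(stderr, __VA_ARGS__);                         \
        fputc('\n', stderr);                                  \
    } while (0)
#else
/* Expands to an empty statement: the arguments are never evaluated. */
#define EX_DLOG(...) do { } while (0)
#endif

/* Returns how many times the log argument was evaluated:
   1 when logging is compiled in, 0 when it is compiled out. */
int ex_handle(void) {
    int evaluations = 0;
    EX_DLOG("evaluations so far: %d", ++evaluations);
    assert(evaluations == (EX_DEBUG_LOG ? 1 : 0));
    return evaluations;
}
```

The practical consequence is that log arguments must not carry side effects the program depends on, since a disabled macro drops them entirely.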
Use it from CMake
After install, downstream projects pick it up via find_package:
cmake_minimum_required(VERSION 3.21)  # CMAKE_C_STANDARD 23 needs CMake 3.21+
project(my_app C)
set(CMAKE_C_STANDARD 23)
find_package(hachettp 0.1 REQUIRED)
add_executable(my_app main.c)
target_link_libraries(my_app PRIVATE hachettp::libhachettp)
If hachettp lives under a non-standard prefix:
cmake -B build -Dhachettp_DIR=$HOME/.local/lib64/cmake/hachettp
For non-CMake builds, a pkg-config file ships too:
cc main.c $(pkg-config --cflags --libs hachettp) -o my_app
Thread-per-core
The hello-world above is single-threaded: one accept loop, one event loop, one core. To scale, run N independent workers — each with its own allocator, event loop, and acceptor — sharing the listening socket.
static void worker_main(ht_workers_t* w, uint32_t id, ht_alloc_t* a, void* ud) {
    const ht_server_t* server = ud;
    ht_loop_t* loop = nullptr;
    ht_loop__init(&loop, a);

    ht_acceptor_t acc = {0};
    ht_server__serve(&acc, server, loop, a, on_request);
    ht_workers__ready(w, id, (ht_worker_stop_fn) ht_loop__wake, loop);
    ht_loop__run(loop);

    ht_server__serve_finalize(&acc);
    ht_loop__free(loop, a);
}

int main(void) {
    ht_config_t cfg = {0};
    ht_server_t server = {.config = &cfg, .signal_handler = on_signal};
    ht_server__listen(&server, nullptr, 8080, 1024);

    ht_alloc_t* a = ht_alloc__init();
    ht_workers_t* workers;
    ht_workers__init(&workers, 4, a); // 4 worker threads
    ht_workers__start(workers, worker_main, &server);
    ht_signal__install_default(&server);

    ht_workers__join(workers);
    ht_workers__free(workers);
    ht_alloc__finalize(a);
    ht_server__close(&server);
}
Why this works:
- One ht_loop_t is single-threaded. All async submissions and completion callbacks for that loop fire on its owning thread; no atomics needed inside callbacks.
- The listening socket is shared across workers via SO_REUSEPORT (POSIX) or duplicated SOCKET handles (Windows IOCP). The kernel load-balances new connections across the workers' accept queues, with no user-space lock on the hot path.
- Per-conn state is private to its loop. Buffers, parser state, and the response builder all live inside the per-acceptor freelist, never crossing thread boundaries.
- Threads are pinned to distinct cores by ht_workers_t so allocations first-touch on the right NUMA node and the scheduler doesn't migrate workers around.
ht_workers_t is a convenience wrapper around platform threads — if you
already have your own thread pool, the same primitives (ht_loop_t,
ht_acceptor_t, ht_alloc_t) compose into a hand-rolled version. See
src/bin/ht.c for the canonical embedder.
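If you bring your own threads, you would also take over CPU pinning. A minimal Linux-only sketch follows, using sched_setaffinity(2) on the calling thread; the `ex_*` helpers are hypothetical and say nothing about how ht_workers_t or ht/thread.h do it internally.

```c
#define _GNU_SOURCE /* for cpu_set_t, CPU_* macros, sched_setaffinity */
#include <assert.h>
#include <sched.h>

/* Return the n-th CPU (0-based, wrapping) that the caller may run on,
   or -1 on error. Containers often restrict the allowed set. */
int ex_nth_allowed_cpu(int n) {
    assert(n >= 0);
    cpu_set_t allowed;
    CPU_ZERO(&allowed);
    if (sched_getaffinity(0, sizeof allowed, &allowed) != 0) return -1;
    int count = CPU_COUNT(&allowed);
    if (count <= 0) return -1;
    int want = n % count;
    for (int cpu = 0; cpu < CPU_SETSIZE; cpu++) {
        if (CPU_ISSET(cpu, &allowed) && want-- == 0) return cpu;
    }
    return -1;
}

/* Pin the calling thread to one CPU (pid 0 means "the caller").
   Returns 0 on success, -1 on error. */
int ex_pin_self(int cpu) {
    assert(cpu >= 0 && cpu < CPU_SETSIZE);
    cpu_set_t set;
    CPU_ZERO(&set);
    CPU_SET(cpu, &set);
    return sched_setaffinity(0, sizeof set, &set);
}

/* Nonzero when the caller's affinity mask is exactly { cpu }. */
int ex_is_pinned_to(int cpu) {
    cpu_set_t now;
    CPU_ZERO(&now);
    if (sched_getaffinity(0, sizeof now, &now) != 0) return 0;
    return CPU_COUNT(&now) == 1 && CPU_ISSET(cpu, &now);
}
```

A worker's entry function would call `ex_pin_self(ex_nth_allowed_cpu(worker_id))` before creating its allocator and loop, so that first-touch memory lands near the core it will keep running on.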
Examples
Two ship in src/bin/:
- ht.c: multi-worker static file server with MIME detection, gzip-sidecar serving, content-negotiation, directory-index lookup, auto-generated directory listings, and graceful shutdown. The thread-per-core pattern in production form.
- routing.c: single-handler routing example. Method+path dispatch, JSON/HTML/text builders, query iteration, path parameters, body echo, prefix-mounted static tree.
After just install --prefix ~/.local, the ht binary is on your PATH:
ht # 4 workers, serve . on :3000
ht -p 9090 -d ./public -w 8 # 8 workers, ./public on :9090
ht -i index.htm --no-list # custom index file, no directory listings
ht --no-index --no-list # serve files only; dirs return 404
Flags: -p PORT, -d DIR, -w WORKERS, -i FILE (index filename, default
index.html), --no-index (disable index lookup), --no-list (disable
auto-generated directory listings; on by default). ht -h for the full list.
Backend matrix
| Platform | Default backend | Switch |
|---|---|---|
| Linux | epoll | -DHT_IO_URING=ON for io_uring |
| macOS | kqueue | — |
| BSD | kqueue | — |
| Windows | IOCP | — |
Backends are compile-time. Each lives in its own translation unit
(src/loop/loop_{epoll,iouring,kqueue,iocp}.c); CMake picks
exactly one.
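As a rough illustration of "exactly one backend at compile time", the sketch below selects a backend with the preprocessor inside a single file. This is an assumption-laden stand-in: in hachettp the selection is made by CMake choosing which loop_*.c to compile, and `EX_IO_URING`, `EX_BACKEND_NAME`, and the `ex_backend_*` functions are hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Exactly one branch survives preprocessing. EX_IO_URING is a
   hypothetical opt-in flag, e.g. passed as -DEX_IO_URING. */
#if defined(_WIN32)
#define EX_BACKEND_NAME "iocp"
#elif defined(__APPLE__) || defined(__FreeBSD__) || \
      defined(__NetBSD__) || defined(__OpenBSD__)
#define EX_BACKEND_NAME "kqueue"
#elif defined(__linux__) && defined(EX_IO_URING)
#define EX_BACKEND_NAME "io_uring"
#elif defined(__linux__)
#define EX_BACKEND_NAME "epoll"
#else
#error "no event-loop backend for this platform"
#endif

static_assert(sizeof(EX_BACKEND_NAME) > 1, "backend name must not be empty");

/* Name of the backend this binary was built with. */
const char* ex_backend_name(void) {
    return EX_BACKEND_NAME;
}

/* Nonzero when `name` matches the compiled-in backend. */
int ex_backend_is(const char* name) {
    return strcmp(name, EX_BACKEND_NAME) == 0;
}
```

Since the choice is fixed at build time, there is no runtime dispatch cost, but switching backends means rebuilding the library.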
Performance
Single-machine numbers (loopback, 16-core x86_64, 4 workers each):
| Server | Small req (Req/s) | 10 MiB stream (GB/s) | Peak mem |
|---|---|---|---|
| hachettp (io_uring) | 163,890 | 12.19 | 1.3 MiB |
| hachettp (epoll) | 129,992 | 11.90 | 1.2 MiB |
| nginx | 157,341 | 11.13 | 10.7 MiB |
| lighttpd | 229,307 | 10.66 | 2.0 MiB |
Roughly nginx-class throughput at about 8× lower peak memory (1.3 MiB
vs 10.7 MiB). Reproduce with tests/bench.sh. Full numbers and
methodology are in BENCHMARKS.md.
Public API tour
Include <ht/ht.h> for the umbrella, or cherry-pick <ht/*.h>:
| Header | What |
|---|---|
| ht/ht.h | re-exports everything below |
| ht/server.h | ht_server_t, ht_acceptor_t, listen / serve |
| ht/request.h | ht_request_t, method / header / body / query accessors |
| ht/response.h | ht_response_t, body / file builders, serve_file |
| ht/config.h | ht_config_t, MIME table, tuning knobs |
| ht/loop.h | ht_loop_t proactor lifecycle |
| ht/workers.h | ht_workers_t thread-per-core pool |
| ht/signal.h | shutdown signal wiring |
| ht/alloc.h | pluggable allocator vtable |
| ht/slice.h | ht_slice_t + comparators |
| ht/bytes.h | ht_bytes_t (dynamic buffer) + push / format / parse |
| ht/uri.h | URI parse + query iteration |
| ht/types.h | ht_io_t, ht_result_t, platform detection macros |
| ht/fs.h, ht/net.h | platform-typed ht_file_t / ht_socket_t |
| ht/thread.h | CPU pinning + cpu count |
Naming: ht_<module>__<verb> (double underscore). The public surface
uses standard C integer types (uint16_t, size_t, …).
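To make the "pluggable allocator vtable" and the module__verb naming concrete, here is a small sketch. It deliberately uses an `ex_` prefix: the struct layout, field names, and functions are assumptions for illustration and do not describe ht_alloc_t's real definition.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical allocator interface: function pointers plus state. */
typedef struct ex_alloc {
    void* (*alloc_fn)(struct ex_alloc* self, size_t size);
    void (*free_fn)(struct ex_alloc* self, void* ptr);
    size_t live; /* outstanding allocations, handy for leak checks */
} ex_alloc_t;

/* Default implementation backed by libc malloc/free. */
static void* ex_alloc__libc_alloc(ex_alloc_t* self, size_t size) {
    void* p = malloc(size);
    if (p != NULL) self->live++;
    return p;
}

static void ex_alloc__libc_free(ex_alloc_t* self, void* ptr) {
    if (ptr == NULL) return;
    free(ptr);
    self->live--;
}

/* module = alloc, verb = init. */
void ex_alloc__init(ex_alloc_t* a) {
    assert(a != NULL);
    a->alloc_fn = ex_alloc__libc_alloc;
    a->free_fn = ex_alloc__libc_free;
    a->live = 0;
}

/* module = bytes, verb = dup. Library-style code calls through the
   table and never calls malloc directly. Returns NULL on failure. */
char* ex_bytes__dup(ex_alloc_t* a, const char* src, size_t len) {
    char* dst = a->alloc_fn(a, len + 1);
    if (dst == NULL) return NULL;
    memcpy(dst, src, len);
    dst[len] = '\0';
    return dst;
}
```

Swapping in an arena or a counting allocator then only requires filling the two function pointers differently; callers such as `ex_bytes__dup` stay unchanged.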
Status
Pre-1.0. ABI is not frozen and the API may shift. See ROADMAP.md for current priorities — HTTP/1.1 conformance gaps, performance, infrastructure.