This project implements a production-style HTTP/1.1 web server from scratch in C++98, drawing design inspiration from nginx. It features a non-blocking, event-driven I/O model built on the Reactor pattern, nginx-style hierarchical configuration, full CGI/1.1 support, chunked transfer encoding, and dynamic template rendering. The architecture is deliberately minimal — no threads, no third-party libraries — to demonstrate mastery of POSIX sockets, epoll, process management, and the HTTP/1.1 specification.
- Supported Features
- Quick Start
- Configuration System
- Architecture
- HTTP Request Parsing
- HTTP Handlers
- CGI Implementation
- Error Handling
- Build System
| Category | Feature | Description | Status |
|---|---|---|---|
| Server | Non-blocking I/O | All sockets are non-blocking; event-driven via epoll | ✅ |
| | Keep-Alive connections | Configurable idle timeout per server block | ✅ |
| | Virtual hosts | Multiple server blocks on different listen addresses | ✅ |
| | Signal handling | Graceful shutdown on SIGINT | ✅ |
| Config | nginx-style config file | Global → HTTP → server → location hierarchy | ✅ |
| | Location routing | Longest-prefix matching across route directives | ✅ |
| | `limit_except` | Per-location HTTP method allowlist | ✅ |
| | `client_max_body_size` | Enforces request body size limit (returns 413) | ✅ |
| | `root` / `alias` | File system path mapping per location | ✅ |
| | `redirect` | HTTP redirect with configurable target URL | ✅ |
| | `autoindex` | Directory listing with HTML template | ✅ |
| | `error_page` | Custom error page HTML template | ✅ |
| HTTP | GET | Static files with MIME detection | ✅ |
| | POST | File upload to configurable directory | ✅ |
| | DELETE | Remove files within upload-enabled directories | ✅ |
| | Chunked Transfer-Encoding | Decode chunked request bodies; chunk response for large files | ✅ |
| | Cookie parsing | Parses Cookie request header into key-value map | ✅ |
| | Percent-encoding | RFC 3986 URI decoding | ✅ |
| CGI | Script execution via fork/execve | Isolated child process per request | ✅ |
| | POST body piping | Passes request body to CGI stdin via temp file | ✅ |
| | Response header parsing | Strips CGI headers from output before forwarding to client | ✅ |
| | Timeout | Kills CGI process after 3 seconds via monitor process | ✅ |
| Templates | Error page rendering | Substitutes `{{status_code}}` and `{{message}}` in HTML template | ✅ |
| | Autoindex rendering | Renders directory listing with clickable file links | ✅ |
| Not Implemented | HTTPS / TLS | No SSL support | ✗ |
| | HTTP/2 | HTTP/1.1 only | ✗ |
| | Multi-worker processes | Single reactor process only | ✗ |
# Build
make
# Run with a config file
./webserv configs/default.conf
# Fetch a page
curl http://127.0.0.1:8080/
# Upload a file
curl -X POST http://127.0.0.1:8080/upload \
--data-binary @myfile.txt \
-H "Content-Type: text/plain"
# Delete a file
curl -X DELETE http://127.0.0.1:8080/upload/myfile.txt

The configuration file follows an nginx-style hierarchical structure: directives cascade from outer blocks into inner ones, with inner blocks taking precedence.
worker_processes 1;
worker_connections 1024;
http {
server {
location { ... }
}
}
| Directive | Example | Description |
|---|---|---|
| `worker_processes` | `worker_processes 1;` | Number of worker processes (only 1 active) |
| `worker_connections` | `worker_connections 1024;` | Max simultaneous connections |
| Directive | Example | Description |
|---|---|---|
| `client_max_body_size` | `client_max_body_size 1M;` | Maximum allowed request body size |
| `default_type` | `default_type text/html;` | Fallback MIME type |
| `error_page` | `error_page ./www/html/error_page.html;` | HTML template for error responses |
| `autoindex_page` | `autoindex_page ./www/html/autoindex.html;` | HTML template for directory listing |
| Directive | Example | Description |
|---|---|---|
| `listen` | `listen 127.0.0.1:8080;` | IP address and port to bind |
| `server_name` | `server_name localhost;` | Hostname for virtual host matching |
| `keep_alive_timeout` | `keep_alive_timeout 65;` | Idle connection timeout in seconds |
| `error_log` | `error_log log/error.log error;` | Log file path and log level (debug/info/warning/error) |
| Directive | Example | Description |
|---|---|---|
| `route` | `route /upload;` | URL prefix this block matches (longest prefix wins) |
| `limit_except` | `limit_except GET POST;` | Allowed HTTP methods; others get 405 |
| `root` | `root ./www/html;` | Base filesystem path for request URI |
| `alias` | `alias ./www/images;` | Replace full route with this path (vs. append for root) |
| `index` | `index index.html;` | Default file served for directory requests |
| `redirect` | `redirect /new-page;` | HTTP redirect target |
| `autoindex` | `autoindex on;` | Enable directory listing |
| `enable_upload` | `enable_upload on;` | Allow POST file uploads to this location |
| `cgi_path` | `cgi_path ./cgi-bin;` | Directory containing CGI scripts |
| `cgi_extension` | `cgi_extension .sh;` | File extension triggering CGI execution |
worker_processes 1;
worker_connections 1024;
http {
client_max_body_size 1M;
default_type text/html;
error_page ./www/html/error_page.html;
autoindex_page ./www/html/autoindex.html;
server {
listen 127.0.0.1:8080;
server_name localhost;
keep_alive_timeout 65;
error_log log/error.log error;
location {
route /;
limit_except GET;
root ./www/html;
index index.html;
}
location {
route /upload;
limit_except GET POST DELETE;
autoindex on;
enable_upload on;
}
location {
route /cgi;
limit_except GET POST;
cgi_path ./cgi-bin;
cgi_extension .sh;
}
}
}

Before deciding on the design, we analysed three common concurrency models and their trade-offs (see doc/non-blocking IO.md):
| Model | Mechanism | Problem |
|---|---|---|
| Fork per connection | `fork()` after `accept()` | High cost: page-table copy, scheduler overhead, memory per process |
| Thread pool | Queue + mutex + thread pool | Lock contention on the shared queue; bounded by pool size (~10k connections impractical) |
| I/O Multiplexing | `select` / `poll` / `epoll` | Single process handles N sockets; cost paid only when data is ready |
Among the multiplexing syscalls, epoll was chosen over select and poll:
- `select` uses a fixed-size bitmap: a hard limit of 1024 fds (`FD_SETSIZE`), an O(n) kernel scan plus an O(n) userspace scan, and two copies of `fd_set` per call.
- `poll` lifts the 1024 limit but retains the O(n) scan and the double copy.
- `epoll` stores fds in a kernel-side red-black tree (O(log n) registration) and keeps ready fds on a separate ready list, so `epoll_wait` returns only the ready events without scanning or re-copying the full interest set.
This project uses epoll in level-triggered (LT) mode — the default and the only mode permitted by the 42 subject constraints.
That said, we researched edge-triggered (ET) mode and document the trade-off here, because understanding why ET is more performant motivates much of the overall non-blocking design:
| | Level-Triggered (LT) | Edge-Triggered (ET) |
|---|---|---|
| When notified | On every `epoll_wait` call, as long as data remains unread | Once per readiness change, at the moment new data arrives |
| Missed reads | Cannot miss data: the kernel keeps re-notifying | Must drain the socket fully on each wakeup (loop until `EAGAIN`); otherwise leftover data goes unreported until more data arrives, and the connection can stall |
| Syscall overhead | Higher: the fd is reported again on each `epoll_wait` call while unread data remains | Lower: one wakeup per burst of data, regardless of size |
| Typical use | Easier to implement correctly | Nginx, high-performance servers |
With LT, if a handler reads only part of the available data, epoll_wait wakes up again on the next call — correct, but at the cost of redundant wakeups under high load. ET removes those redundant wakeups entirely: the kernel fires once per state change (new data arriving), so a single wakeup covers an entire burst. The handler must read until EAGAIN to avoid stalling the connection — a discipline this server's ConnectionHandler already enforces via its drain loop, meaning the code would require minimal changes to switch to ET if the constraint were lifted.
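The drain discipline described above can be sketched as follows. This is a minimal illustration, not the project's actual `ConnectionHandler` code: the helper names `set_nonblocking` and `drain_fd` are hypothetical, and the 4096-byte buffer simply mirrors the read-buffer size this README mentions.

```cpp
#include <cassert>
#include <cerrno>
#include <cstddef>
#include <fcntl.h>
#include <string>
#include <unistd.h>

// Hypothetical helper: switch an fd to non-blocking mode.
bool set_nonblocking(int fd) {
    int flags = fcntl(fd, F_GETFL, 0);
    return flags != -1 && fcntl(fd, F_SETFL, flags | O_NONBLOCK) != -1;
}

// Hypothetical drain loop: read until the kernel buffer is empty.
// Returns true if the fd is still open and fully drained (EAGAIN),
// false if the peer closed the connection or a real error occurred.
bool drain_fd(int fd, std::string &out) {
    char buf[4096];
    for (;;) {
        ssize_t n = read(fd, buf, sizeof(buf));
        if (n > 0) {
            out.append(buf, static_cast<std::size_t>(n));
            continue;
        }
        if (n == 0)
            return false;   // EOF: peer closed
        if (errno == EINTR)
            continue;       // interrupted by a signal, retry
        return errno == EAGAIN || errno == EWOULDBLOCK;
    }
}
```

Under LT this loop only saves wakeups; under ET it would be required for correctness, since nothing re-notifies the handler about bytes it left behind.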
The Reactor pattern comes in four combinations, each with different scalability and complexity trade-offs (see doc/web-kernal-design.md):
| Variant | How it works | Bottleneck / Trade-off | Real-world example |
|---|---|---|---|
| Single Reactor Single Worker (SRSW) | One event loop handles accept + read/write + business logic in the same process | Handler stalls block the acceptor; single CPU only | Redis ≤ 6.0 |
| Single Reactor Multiple Workers (SRMW) | One event loop dispatches I/O events; worker threads handle business logic | The single reactor becomes the bottleneck under massive connection rates; shared queue needs locking | - |
| Multiple Reactor Multiple Workers (MRMW, processes) | Main reactor accepts and hands fds to sub-reactors; each sub-reactor owns its own epoll loop and worker | No shared state between workers → no locks; main reactor stays lightweight | Nginx |
| Multiple Reactor Multiple Workers (MRMW, threads) | Same topology but with threads instead of processes | Shared memory simplifies fd handoff; needs careful synchronisation | Netty, Memcached |
Nginx's multi-process design is the reference, although it differs from the textbook main-reactor handoff in one detail: the master process opens the listening sockets and forks the worker processes, which inherit those sockets, and each worker then calls accept() itself. Each worker runs its own independent epoll loop and shares no mutable connection state with the others, so request handling needs no locking.
This project implements the simpler Single Reactor Single Worker variant — the same three-role structural separation (Reactor / Acceptor / Handler) as nginx, but collapsed into a single process. This is sufficient for the project scope and keeps the implementation auditable without multi-process synchronisation complexity.
This server (SRSW):
Reactor ──new connection──▶ Acceptor ──▶ creates ConnectionHandler
──read/write──────▶ ConnectionHandler ──▶ RequestProcessor ──▶ HTTP Handler
──CGI pipe ready──▶ CgiHandler
All three handler types implement the same IHandler interface, so the Reactor dispatches events without knowing what kind of fd it is serving.
┌─────────────────────────────────────────────┐
│ Reactor │
│ while (!stop_flag) { │
│ epoll_wait(events, MAX_EVENTS, timeout) │
│ for each event → fd_map[fd]->handle_event │
│ } │
└────────────┬────────────────────────────────┘
│ dispatches by fd
┌──────────┴──────────────────────────────┐
│ IHandler (interface) │
├─────────────────────────────────────────┤
│ Acceptor — new TCP connections │
│ ConnectionHandler — client read / write │
│ CgiHandler — CGI stdout pipe │
└─────────────────────────────────────────┘
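A minimal sketch of this dispatch-by-fd idea follows. The `IHandler` signature shown here is an assumption (the real interface may take different arguments), and `CountingHandler` and `DispatcherSketch` are hypothetical stand-ins that leave out `epoll_wait` so the example stays portable.

```cpp
#include <cassert>
#include <cstddef>
#include <map>

// Assumed shape of the handler interface; the real IHandler may differ.
class IHandler {
public:
    virtual ~IHandler() {}
    virtual void handle_event(int fd, unsigned int events) = 0;
};

// Toy handler that only counts how often it was invoked.
class CountingHandler : public IHandler {
public:
    CountingHandler() : calls(0), last_events(0) {}
    virtual void handle_event(int, unsigned int events) {
        ++calls;
        last_events = events;
    }
    int calls;
    unsigned int last_events;
};

// The part of a reactor that maps a ready fd to its handler.
class DispatcherSketch {
public:
    void add(int fd, IHandler *handler) {
        assert(handler != NULL);
        _fd_map[fd] = handler;
    }
    void remove(int fd) { _fd_map.erase(fd); }

    // Returns false when no handler is registered for fd.
    bool dispatch(int fd, unsigned int events) {
        std::map<int, IHandler *>::iterator it = _fd_map.find(fd);
        if (it == _fd_map.end())
            return false;
        it->second->handle_event(fd, events);  // no type check needed
        return true;
    }

private:
    std::map<int, IHandler *> _fd_map;  // does not own the handlers
};
```

Because the map stores `IHandler*`, a new fd type (for example a timer) could be added without touching the dispatch loop.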
Key components:
| Class | File | Responsibility |
|---|---|---|
| `Reactor` | `include/kernel/Reactor.hpp` | epoll event loop, fd → IHandler* map, SIGINT shutdown |
| `Acceptor` | `include/kernel/Acceptor.hpp` | Accepts TCP connections, registers new ConnectionHandler with epoll |
| `ConnectionHandler` | `include/kernel/ConnectionHandler.hpp` | 4096-byte read buffer, write buffer flushing, keep-alive timer |
| `RequestProcessor` | `include/kernel/RequestProcessor.hpp` | Per-connection parse state, routes completed requests to HTTP handlers |
| `CgiHandler` | `include/kernel/CgiHandler.hpp` | Reads CGI process stdout from pipe, relays data to client write buffer |
Client connects
→ Acceptor::handle_event()
accept() → new fd
Create ConnectionHandler, register EPOLLIN with epoll (LT mode)
→ EPOLLIN fires (LT: re-fires until buffer is empty)
ConnectionHandler reads up to 4096 bytes per iteration until EAGAIN
RequestProcessor::feed() → RequestAnalyzer state machine
On COMPLETE → dispatch to GetHandler / PostHandler / DeleteHandler / CgiExecutor
→ Handler builds response into ConnectionHandler write buffer
Register EPOLLOUT
→ EPOLLOUT fires
ConnectionHandler flushes write buffer to socket until EAGAIN
→ Keep-alive: reset RequestAnalyzer state, re-arm EPOLLIN
→ Idle timeout: close fd, remove from epoll, destroy handler
RequestProcessor tracks per-connection state using a bitmask so that multiple orthogonal conditions (e.g. chunked body + CGI in progress) can be represented simultaneously:
| State | Bitmask | Meaning |
|---|---|---|
| `INITIAL` | 0 | Fresh request, nothing parsed yet |
| `WAITING_SESSION` | 1 | Waiting for virtual host resolution |
| `PROCESSING` | 2 | Handler is building the response |
| `WAITING_CGI` | 4 | Blocked on CGI child process output |
| `HANDLE_OTHERS_CHUNKED` | 8 | Subsequent chunk of a chunked response body; send body only (no status/headers) |
| `HANDLE_FIRST_CHUNKED` | 16 | First chunk of a chunked response body; include status line and headers |
| `HANDLE_CHUNKED` | 32 | Chunked response is in progress (stream file in CHUNKED_SIZE pieces) |
| `COMPLETED` | 64 | Response fully written; evaluate keep-alive |
| `CONSUME_BODY` | 128 | Draining leftover request body before responding |
| `ERROR` | 256 | Unrecoverable error; tear down connection |
| `UNKNOWN` | 512 | Unknown state |
Implementation references: GetHandler chunked flow and ResponseBuilder body-only mode.
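As an illustration of how such a bitmask can be combined and queried, here is a small sketch. The enumerator names and values come from the state table; the helper functions `set_flag`, `clear_flag`, `has_flag` and `is_initial` are hypothetical and not taken from `RequestProcessor`.

```cpp
#include <cassert>

// Values as listed in the state table; each non-zero state owns one bit.
enum ProcessorState {
    INITIAL = 0,
    WAITING_SESSION = 1,
    PROCESSING = 2,
    WAITING_CGI = 4,
    HANDLE_OTHERS_CHUNKED = 8,
    HANDLE_FIRST_CHUNKED = 16,
    HANDLE_CHUNKED = 32,
    COMPLETED = 64,
    CONSUME_BODY = 128,
    ERROR = 256,
    UNKNOWN = 512
};

// Hypothetical helpers operating on the combined mask.
inline unsigned int set_flag(unsigned int mask, ProcessorState f) {
    return mask | static_cast<unsigned int>(f);
}
inline unsigned int clear_flag(unsigned int mask, ProcessorState f) {
    return mask & ~static_cast<unsigned int>(f);
}
inline bool has_flag(unsigned int mask, ProcessorState f) {
    return (mask & static_cast<unsigned int>(f)) != 0;
}
// INITIAL is the absence of every flag, so it needs its own test.
inline bool is_initial(unsigned int mask) {
    return mask == 0;
}
```

Note that `INITIAL` is zero, so it cannot be tested with a bitwise AND; it has to be compared against the whole mask.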
Request parsing is handled by a chain of state machines that process the byte stream incrementally. This design handles partial TCP segments and pipelined requests naturally.
shell/RequestLineAnalyzer.hpp parses METHOD SP URI SP HTTP/1.1 CRLF.
States: METHOD → SPACE_BEFORE_URI → URI → SPACE_BEFORE_VERSION → VERSION → CRLF
Recognized request-line methods: GET, POST, DELETE, OPTIONS, CONNECT.
Currently accepted/implemented methods are GET, POST, and DELETE; OPTIONS and CONNECT are parsed but not implemented and may be rejected later with 501.
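The state sequence above could be driven one byte at a time, roughly as in the following sketch. The class name `RequestLineSketch` and its members are hypothetical, and the validation is deliberately simplified (uppercase method letters only, a single space as separator, and no URI syntax checks).

```cpp
#include <cassert>
#include <cstddef>
#include <string>

class RequestLineSketch {
public:
    enum State { METHOD, SPACE_BEFORE_URI, URI, SPACE_BEFORE_VERSION,
                 VERSION, CRLF, DONE, FAILED };

    RequestLineSketch() : state(METHOD) {}

    // Consumes bytes until the line is complete or invalid.
    // Returns how many bytes were used, so leftover bytes (headers,
    // or a pipelined request) can be handed to the next parser.
    std::size_t feed(const char *data, std::size_t len) {
        std::size_t i = 0;
        while (i < len && state != DONE && state != FAILED)
            step(data[i++]);
        return i;
    }

    State state;
    std::string method, uri, version;

private:
    void step(char c) {
        switch (state) {
        case METHOD:
            if (c >= 'A' && c <= 'Z') method += c;
            else if (c == ' ' && !method.empty()) state = SPACE_BEFORE_URI;
            else state = FAILED;
            break;
        case SPACE_BEFORE_URI:   // just saw SP, expect first URI byte
        case URI:
            if (c == ' ') state = (state == URI) ? SPACE_BEFORE_VERSION : FAILED;
            else if (c == '\r' || c == '\n') state = FAILED;
            else { uri += c; state = URI; }
            break;
        case SPACE_BEFORE_VERSION:   // just saw SP, expect "HTTP/1.1"
        case VERSION:
            if (c == '\r') state = (version == "HTTP/1.1") ? CRLF : FAILED;
            else if (version.size() < 8) { version += c; state = VERSION; }
            else state = FAILED;
            break;
        case CRLF:               // saw CR, expect LF
            state = (c == '\n') ? DONE : FAILED;
            break;
        default:
            break;
        }
    }
};
```

Because all progress lives in `state` and the three strings, a request line split across two TCP segments is handled by calling `feed` twice.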
shell/UriAnalyzer.hpp implements RFC 3986 URI parsing.
Supported URI forms:
| Form | Example | Usage |
|---|---|---|
| origin-form | `/path?query` | Normal HTTP requests |
| absolute-form | `http://host/path` | Proxy requests |
| authority-form | `host:port` | CONNECT method |
| asterisk-form | `*` | OPTIONS |
Validation: IPv6 address syntax, percent-decoded characters, path normalization (resolves .. and . segments to prevent directory traversal).
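The dot-segment resolution mentioned above can be sketched with a simple segment stack. This is an illustrative version, not the code in `UriAnalyzer`: the function name `normalize_path` is hypothetical, it assumes percent-decoding has already happened, and it drops a trailing slash for brevity.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Resolves "." and ".." segments of an absolute path.
// Returns false if ".." would climb above the root, which is the
// case a server must reject to prevent directory traversal.
bool normalize_path(const std::string &path, std::string &out) {
    std::vector<std::string> stack;
    std::string::size_type i = 0;
    while (i <= path.size()) {
        std::string::size_type j = path.find('/', i);
        if (j == std::string::npos)
            j = path.size();
        std::string seg = path.substr(i, j - i);
        if (seg == "..") {
            if (stack.empty())
                return false;          // would escape the root
            stack.pop_back();
        } else if (!seg.empty() && seg != ".") {
            stack.push_back(seg);      // ordinary segment
        }
        i = j + 1;
    }
    out = "/";
    for (std::size_t k = 0; k < stack.size(); ++k) {
        if (k > 0)
            out += "/";
        out += stack[k];
    }
    return true;
}
```

Normalizing only after percent-decoding matters: an encoded `%2e%2e` segment would otherwise pass through the check untouched.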
shell/HeaderAnalyzer.hpp parses HTTP header fields per RFC 7230.
- Validates `field-name: field-value CRLF` format
- Handles obsolete line folding (multi-line headers)
- Extracts cookies from the `Cookie` header into a key-value map
- Signals completion on the empty `CRLF` line
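The cookie extraction step might look roughly like the sketch below. The function names are hypothetical and the parsing is lenient: pairs are split on `;`, names and values are trimmed, and entries without `=` are skipped rather than treated as errors.

```cpp
#include <cassert>
#include <map>
#include <string>

// Strips leading and trailing spaces and tabs.
static std::string trim_ws(const std::string &s) {
    std::string::size_type b = s.find_first_not_of(" \t");
    if (b == std::string::npos)
        return "";
    std::string::size_type e = s.find_last_not_of(" \t");
    return s.substr(b, e - b + 1);
}

// Parses the value of a Cookie header, e.g. "a=1; b=2".
std::map<std::string, std::string>
parse_cookie_header(const std::string &value) {
    std::map<std::string, std::string> cookies;
    std::string::size_type pos = 0;
    while (pos < value.size()) {
        std::string::size_type end = value.find(';', pos);
        if (end == std::string::npos)
            end = value.size();
        std::string pair = value.substr(pos, end - pos);
        std::string::size_type eq = pair.find('=');
        if (eq != std::string::npos) {
            std::string name = trim_ws(pair.substr(0, eq));
            if (!name.empty())
                cookies[name] = trim_ws(pair.substr(eq + 1));
        }
        pos = end + 1;
    }
    return cookies;
}
```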
ChunkedCodec (kernel/ChunkedCodec.hpp) implements RFC 7230 §4.1 chunked encoding.
State machine:
CHUNKSIZE → (hex digits) → SIZE_CRLF → CHUNKBODY → BODY_CRLF → (loop / COMPLETE)
- Decodes request bodies for uploads
- Encodes large response files to avoid loading them fully into memory
- `GetHandler` tracks per-fd file offsets in `_chunked_file_records` for streaming large files
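The decoding half of that state machine can be sketched as follows. The state names follow the diagram above; everything else (`ChunkedDecoderSketch`, its members, the 8-hex-digit size cap) is an assumption made for the example, and chunk extensions and trailer fields are not supported.

```cpp
#include <cassert>
#include <string>

class ChunkedDecoderSketch {
public:
    enum State { CHUNKSIZE, SIZE_CRLF, CHUNKBODY, BODY_CRLF, COMPLETE, FAILED };

    ChunkedDecoderSketch()
        : state(CHUNKSIZE), _size(0), _digits(0), _saw_cr(false), _last(false) {}

    // May be called repeatedly with arbitrary fragments of the stream.
    void feed(const std::string &data) {
        for (std::string::size_type i = 0; i < data.size(); ++i) {
            if (state == COMPLETE || state == FAILED)
                return;
            step(data[i]);
        }
    }

    State state;
    std::string body;   // decoded payload

private:
    static int hex_value(char c) {
        if (c >= '0' && c <= '9') return c - '0';
        if (c >= 'a' && c <= 'f') return c - 'a' + 10;
        if (c >= 'A' && c <= 'F') return c - 'A' + 10;
        return -1;
    }

    void step(char c) {
        switch (state) {
        case CHUNKSIZE:
            if (hex_value(c) >= 0 && _digits < 8) {
                _size = _size * 16 + static_cast<unsigned long>(hex_value(c));
                ++_digits;
            } else if (c == '\r' && _digits > 0) {
                state = SIZE_CRLF;
            } else {
                state = FAILED;
            }
            break;
        case SIZE_CRLF:              // CR seen, expect LF
            if (c != '\n') { state = FAILED; break; }
            _last = (_size == 0);    // "0" marks the final chunk
            state = _last ? BODY_CRLF : CHUNKBODY;
            break;
        case CHUNKBODY:
            body += c;
            if (--_size == 0)
                state = BODY_CRLF;
            break;
        case BODY_CRLF:              // CRLF that closes a chunk
            if (!_saw_cr && c == '\r') {
                _saw_cr = true;
            } else if (_saw_cr && c == '\n') {
                _saw_cr = false;
                _digits = 0;
                state = _last ? COMPLETE : CHUNKSIZE;
            } else {
                state = FAILED;
            }
            break;
        default:
            break;
        }
    }

    unsigned long _size;   // bytes left in the current chunk
    int _digits;           // hex digits read for the current size
    bool _saw_cr;
    bool _last;
};
```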
All handlers extend ARequestHandler (kernel/ARequestHandler.hpp), which provides:
- `HttpException`-based error propagation with automatic error response generation
- `ResponseBuilder` factory for 2xx/3xx/4xx/5xx responses
- Template rendering via `TemplateEngine`
File resolution pipeline:
URI path
→ longest-prefix match on location routes
→ apply root / alias mapping
→ normalize path (resolve .., .)
→ check existence and read permission
→ detect MIME type from extension
→ serve file (chunked if large) or render directory listing
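The first two steps of the pipeline, longest-prefix matching and the root/alias mapping, can be illustrated like this. `LocationSketch` and the three functions are hypothetical simplifications: the match is a plain string prefix comparison, and routes are assumed to have no trailing slash other than `/` itself.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical, trimmed-down view of a location block.
struct LocationSketch {
    std::string route;
    std::string root;    // used when alias is empty
    std::string alias;
};

LocationSketch make_location(const std::string &route,
                             const std::string &root,
                             const std::string &alias) {
    LocationSketch loc;
    loc.route = route;
    loc.root = root;
    loc.alias = alias;
    return loc;
}

// Returns the location whose route is the longest prefix of path,
// or NULL when nothing matches.
const LocationSketch *match_location(const std::vector<LocationSketch> &locs,
                                     const std::string &path) {
    const LocationSketch *best = NULL;
    for (std::size_t i = 0; i < locs.size(); ++i) {
        const std::string &r = locs[i].route;
        if (path.compare(0, r.size(), r) != 0)
            continue;                              // not a prefix
        if (best == NULL || r.size() > best->route.size())
            best = &locs[i];
    }
    return best;
}

// root appends the whole URI path; alias replaces the matched route.
std::string map_path(const LocationSketch &loc, const std::string &path) {
    if (!loc.alias.empty())
        return loc.alias + path.substr(loc.route.size());
    return loc.root + path;
}
```

With `root ./www/static` on route `/img/icons`, the request `/img/icons/a.png` maps to `./www/static/img/icons/a.png`, whereas `alias ./www/images` on route `/img` maps `/img/logo.png` to `./www/images/logo.png`.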
Directory listing: When `autoindex` is `on` and the request targets a directory, GetHandler loads the template from `autoindex_page`, enumerates the directory entries, and substitutes a link for each file. When `index` is configured, GetHandler appends the index filename to the path before it attempts a directory listing.
MIME detection: File extension → Content-Type map covers HTML, CSS, JavaScript, images (JPEG, PNG, GIF, SVG, ICO), fonts, JSON, XML, PDF, and binary fallback (application/octet-stream).
Large file streaming: Files above the chunked threshold are streamed using Transfer-Encoding: chunked. The HANDLE_RES_CHUNKED state persists a file offset across multiple EPOLLOUT events until the file is fully sent.
Upload flow:
- Validate `enable_upload on` for the matched location; otherwise 403.
- Enforce `client_max_body_size`; exceed → 413.
- Require `Content-Length` or `Transfer-Encoding: chunked`; missing → 411.
- Stream body into a temp file in `/tmp/` (random name).
- Determine final filename from `Content-Disposition` header or request URI.
- Normalize final path to prevent directory traversal.
- Rename temp file to upload directory → 201 Created (or 200 OK if overwrite).
- Temp file is cleaned up on any error.

Delete flow:
- Resolves target path using the same root/alias mapping as GET.
- Only files inside upload-enabled directories are deletable (403 otherwise).
- Directory deletion is rejected (403).
- Returns 204 No Content on success.
CGI scripts are triggered when the request URI matches a location with cgi_path and cgi_extension configured.
Request matched as CGI
→ CgiExecutor::cgi_exec(...)
1. Extract script name from URI
2. Build CGI environment variables
3. pipe(stdout_pipe)
4. fork()
Child:
dup2(stdout_pipe[write], STDOUT_FILENO)
Read POST body from temp file → pass as stdin via pipe
execve(script_path, argv, envp)
Parent:
Close write end of pipe
Register CgiHandler(stdout_pipe[read]) with epoll
5. Fork monitor process → kills CGI after 3 s timeout
→ CgiHandler::handle_event() reads output, writes to client buffer
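The pipe, fork, dup2 and execve steps above can be sketched in isolation. This is not `CgiExecutor::cgi_exec`: the function name `run_cgi_sketch` is hypothetical, stdin piping and the timeout monitor are omitted, and the parent reads the pipe with a blocking loop purely to keep the example short, whereas the server registers the read end with epoll.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

// Runs path with the given argv/envp and captures its stdout.
// Returns true when the child exited normally with status 0.
bool run_cgi_sketch(const char *path, char *const argv[],
                    char *const envp[], std::string &output) {
    int out_pipe[2];
    if (pipe(out_pipe) != 0)
        return false;

    pid_t pid = fork();
    if (pid < 0) {
        close(out_pipe[0]);
        close(out_pipe[1]);
        return false;
    }
    if (pid == 0) {
        // Child: stdout becomes the write end of the pipe.
        close(out_pipe[0]);
        dup2(out_pipe[1], STDOUT_FILENO);
        close(out_pipe[1]);
        execve(path, argv, envp);
        _exit(127);   // only reached if execve failed
    }

    // Parent: close the write end so read() sees EOF when the child exits.
    close(out_pipe[1]);
    char buf[4096];
    ssize_t n;
    while ((n = read(out_pipe[0], buf, sizeof(buf))) > 0)
        output.append(buf, static_cast<std::size_t>(n));
    close(out_pipe[0]);

    int status = 0;
    if (waitpid(pid, &status, 0) < 0)
        return false;
    return WIFEXITED(status) && WEXITSTATUS(status) == 0;
}
```

Closing the write end in the parent is the step that is easiest to forget: while any copy of it stays open, the read end never reports EOF.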
Standard CGI/1.1 environment passed to the script:
| Variable | Source |
|---|---|
| `REQUEST_METHOD` | Parsed request method |
| `SCRIPT_NAME` | URI path of the script |
| `PATH_INFO` | URI path after script name |
| `QUERY_STRING` | URI query component |
| `SERVER_NAME` | `server_name` directive |
| `SERVER_PORT` | Bound port |
| `CONTENT_TYPE` | Content-Type request header |
| `CONTENT_LENGTH` | Content-Length request header |
| `HTTP_*` | All other request headers (`-` → `_`, uppercased) |
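The `HTTP_*` naming rule in the last row is small enough to show directly. The function name `cgi_env_name` is hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Converts a request header name to its CGI meta-variable name:
// prefix with HTTP_, replace '-' with '_', uppercase everything else.
std::string cgi_env_name(const std::string &header) {
    std::string name = "HTTP_";
    for (std::string::size_type i = 0; i < header.size(); ++i) {
        unsigned char c = static_cast<unsigned char>(header[i]);
        if (c == '-')
            name += '_';
        else
            name += static_cast<char>(std::toupper(c));
    }
    return name;
}
```

For example, `User-Agent` becomes `HTTP_USER_AGENT`.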
A second fork() creates a monitor process that calls sleep(3) and then sends SIGKILL to the CGI PID. The monitor exits immediately if the CGI finishes first (detected via waitpid with WNOHANG). This prevents hung CGI scripts from blocking the event loop.
HttpException (utils/HttpException.hpp) carries an HTTP status code and detail message. Handlers throw HttpException(STATUS_CODE, "detail") and ARequestHandler::_handle_exception() catches it, builds an error response, and renders the configured error_page template.
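The throw-and-catch flow described above might look like the following sketch. None of these names are the project's: `HttpExceptionSketch`, `serve_sketch` and `handle_sketch` are hypothetical, the exception is a plain class rather than a `std::exception` subclass, and the error response is a bare status line instead of a rendered template.

```cpp
#include <cassert>
#include <sstream>
#include <string>

// Carries a status code and a detail message up the call stack.
class HttpExceptionSketch {
public:
    HttpExceptionSketch(int code, const std::string &detail)
        : _code(code), _detail(detail) {}
    int status_code() const { return _code; }
    const std::string &message() const { return _detail; }

private:
    int _code;
    std::string _detail;
};

// Stand-in for a handler: throws instead of returning error codes.
std::string serve_sketch(const std::string &path) {
    if (path.find("..") != std::string::npos)
        throw HttpExceptionSketch(403, "Forbidden");
    if (path != "/index.html")
        throw HttpExceptionSketch(404, "Not Found");
    return "HTTP/1.1 200 OK\r\n\r\n";
}

// Stand-in for the base-class wrapper: one catch site builds
// every error response.
std::string handle_sketch(const std::string &path) {
    try {
        return serve_sketch(path);
    } catch (const HttpExceptionSketch &e) {
        std::ostringstream os;
        os << "HTTP/1.1 " << e.status_code() << " " << e.message()
           << "\r\n\r\n";
        return os.str();
    }
}
```

Keeping the catch in one place means individual handlers never format error responses themselves.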
| Range | Codes |
|---|---|
| 2xx | 200 OK, 201 Created, 202 Accepted, 204 No Content |
| 3xx | 301 Moved Permanently, 302 Found, 303 See Other, 304 Not Modified |
| 4xx | 400 Bad Request, 401 Unauthorized, 403 Forbidden, 404 Not Found, 405 Method Not Allowed, 408 Request Timeout, 411 Length Required, 413 Payload Too Large, 417 Expectation Failed |
| 5xx | 500 Internal Server Error, 501 Not Implemented, 502 Bad Gateway, 503 Service Unavailable |
The error template at www/html/error_page.html uses two placeholders:
<h1>{{status_code}}</h1>
<p>{{message}}</p>

TemplateEngine (kernel/TemplateEngine.hpp) performs simple string substitution before serializing the response.
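A minimal sketch of that kind of `{{name}}` substitution follows. The function `render_template` is hypothetical and makes two choices the real `TemplateEngine` may not share: unknown placeholders are left in the output unchanged, and values are inserted without HTML escaping.

```cpp
#include <cassert>
#include <map>
#include <string>

// Replaces each {{key}} in tpl with vars[key].
std::string render_template(const std::string &tpl,
                            const std::map<std::string, std::string> &vars) {
    std::string out;
    std::string::size_type pos = 0;
    for (;;) {
        std::string::size_type open = tpl.find("{{", pos);
        std::string::size_type end =
            (open == std::string::npos) ? std::string::npos
                                        : tpl.find("}}", open + 2);
        if (end == std::string::npos) {
            out.append(tpl, pos, std::string::npos);  // no more placeholders
            break;
        }
        out.append(tpl, pos, open - pos);             // text before {{
        std::string key = tpl.substr(open + 2, end - open - 2);
        std::map<std::string, std::string>::const_iterator it = vars.find(key);
        if (it != vars.end())
            out += it->second;
        else
            out.append(tpl, open, end + 2 - open);    // keep unknown as-is
        pos = end + 2;
    }
    return out;
}
```

If any substituted value can contain user-controlled text, such as a request path in the message, it should be HTML-escaped before rendering.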
The project uses CMake wrapped by a convenience Makefile.
make # configure + build → ./webserv
make clean # remove build/ directory
make fclean # clean + remove webserv binary, tmp/, www/upload/
make re # fclean then rebuild

Compiler requirements: c++ with -std=c++98 -Wall -Wextra -Werror
The Makefile fingerprints source files (CMakeLists.txt, source/, include/) against the binary timestamp — if nothing changed, the build is skipped without invoking CMake.
Runtime directories (tmp/ and www/upload/) are created automatically on first make.
Hope you liked this project, don't forget to give it a star ⭐.