High-throughput HTML parser + CSS selector engine for Zig.
Performance numbers are not conformance claims. The parser is intentionally permissive and currently does not fully match browser-grade tree-construction behavior.
- Conformance details: Documentation#conformance-status
- Benchmark methodology: Documentation#performance-and-benchmarks
- Raw outputs: `bench/results/latest.md`, `bench/results/latest.json`
See the latest benchmark snapshot for more details.
Source: bench/results/latest.json (stable profile).
ours │████████████████████│ 1647.32 MB/s (100.00%)
lol-html │█████████████░░░░░░░│ 1062.59 MB/s (64.50%)
lexbor │███░░░░░░░░░░░░░░░░░│ 234.51 MB/s (14.24%)
| Profile | nwmatcher | qwery_contextual | html5lib subset | WHATWG HTML parsing |
|---|---|---|---|---|
| `strictest`/`fastest` | 20/20 (0 failed) | 54/54 (0 failed) | 524/600 (76 failed) | 440/500 (60 failed) |
Source: bench/results/external_suite_report.json
- 🔎 CSS selector queries: comptime, runtime, and cached runtime selectors.
- 🧭 DOM navigation: parent, siblings, first/last child, and children iteration.
- 💤 Lazy decode/normalize path: attribute/entity decoding and text normalization are deferred until query-time APIs are called.
- 🧪 Debug tooling: selector mismatch diagnostics and instrumentation wrappers.
- 🧰 Parse profiles: `strictest` and `fastest` option bundles for benchmarks/workloads.
- 🧵 Destructive parsing by default for throughput, with an opt-in non-destructive read-only mode.
```zig
const std = @import("std");
const html = @import("html");

const options: html.ParseOptions = .{};

test "basic parse + query" {
    // Destructive parsing is the default, so the input is a mutable byte array.
    var input = "<div id='app'><a class='nav' href='/docs'>Docs</a></div>".*;

    var doc = try options.parse(std.testing.allocator, &input);
    defer doc.deinit();

    const a = doc.queryOne("div#app > a.nav") orelse return error.TestUnexpectedResult;
    try std.testing.expectEqualStrings("/docs", a.getAttributeValue("href").?);
}
```

Parsing goes through `options.parse(...)`. Use `const options: html.ParseOptions = .{ .non_destructive = true };` when the caller's bytes must remain unchanged, including file-backed memory maps. This mode reads the original source directly and does not make a full-source copy.
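
A minimal sketch of the non-destructive mode described above. It reuses only the calls shown in the basic example and assumes `options.parse` accepts the same buffer argument when `.non_destructive = true` is set. The byte-for-byte comparison is an illustration added here, not part of the library; `examples/non_destructive_parse.zig` is the authoritative version.

```zig
const std = @import("std");
const html = @import("html");

// Same options type as the basic example, with the opt-in read-only mode enabled.
const options: html.ParseOptions = .{ .non_destructive = true };

test "non-destructive parse leaves the input untouched" {
    var input = "<div id='app'><a class='nav' href='/docs'>Docs</a></div>".*;
    // Arrays are copied by value, so this is an independent snapshot of the bytes.
    const snapshot = input;

    // Assumption: the call shape matches the destructive example.
    var doc = try options.parse(std.testing.allocator, &input);
    defer doc.deinit();

    try std.testing.expect(doc.queryOne("div#app > a.nav") != null);
    // The caller's bytes should be identical after parsing and querying.
    try std.testing.expectEqualSlices(u8, &snapshot, &input);
}
```

If the comparison holds, the same buffer can be parsed again or handed to other readers without making a defensive copy first.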
- `-Dintlen=u16|u32|u64|usize` selects the integer width used for document spans and node indexes.
- Smaller widths reduce memory use but also reduce the maximum parseable input size.
- `u32` is the default. Use `u64` for multi-gigabyte inputs.
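
To make the width trade-off concrete, here is a small standalone sketch of the arithmetic. `Span` and `maxOffset` are hypothetical names used only for illustration and are not the library's types. The real input limit may be lower than the bound shown if the library reserves sentinel values.

```zig
const std = @import("std");

/// Hypothetical span layout: a start and end offset stored at the chosen width.
fn Span(comptime Int: type) type {
    return struct { start: Int, end: Int };
}

/// Largest byte offset an index of the given width can represent.
fn maxOffset(comptime Int: type) u64 {
    return std.math.maxInt(Int);
}

test "narrower indexes shrink spans and the addressable input" {
    // Each span costs two integers of the selected width.
    try std.testing.expectEqual(@as(usize, 4), @sizeOf(Span(u16)));
    try std.testing.expectEqual(@as(usize, 8), @sizeOf(Span(u32)));
    try std.testing.expectEqual(@as(usize, 16), @sizeOf(Span(u64)));

    // u16 tops out just under 64 KiB, u32 just under 4 GiB.
    try std.testing.expectEqual(@as(u64, 65_535), maxOffset(u16));
    try std.testing.expectEqual(@as(u64, 4_294_967_295), maxOffset(u32));

    // Only the 64-bit width can address an 8 GiB input.
    const eight_gib: u64 = 8 * 1024 * 1024 * 1024;
    try std.testing.expect(maxOffset(u32) < eight_gib);
    try std.testing.expect(maxOffset(u64) > eight_gib);
}
```

Under these assumptions, the default `u32` covers inputs up to roughly 4 GiB. Beyond that, the width would be raised at build time, for example with `zig build -Dintlen=u64`.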
- Full manual: Documentation
- API details: Documentation#core-api
- Selector grammar: Documentation#selector-support
- Parse mode guidance: Documentation#mode-guidance
- Non-destructive parsing: Documentation#non-destructive-parsing
- Conformance: Documentation#conformance-status
- Architecture: Documentation#architecture
- Troubleshooting: Documentation#troubleshooting
```bash
zig build test
zig build docs-check
zig build examples-check
zig build ship-check
```

- `examples/basic_parse_query.zig`
- `examples/runtime_selector.zig`
- `examples/cached_selector.zig`
- `examples/query_time_decode.zig`
- `examples/inner_text_options.zig`
- `examples/non_destructive_parse.zig`
MIT. See LICENSE.