Mathilda is an AI Agent-generated computer algebra system (CAS) heavily inspired by the core architecture and evaluation semantics of Mathematica (the Wolfram Language). Written entirely in C99, it implements a recursive expression model, structural pattern matching with backtracking, rewriting rules, and an extensive library of built-in mathematical functions.
Today Mathilda spans roughly 159,000 lines of C99 across 176 source modules, exposing ~365 built-in functions organized into 19 functional categories — from arbitrary-precision arithmetic and symbolic calculus to polynomial factorization, dense linear algebra, and integer factorization.
Evaluation engine
- Infinite evaluation semantics: expressions are repeatedly evaluated top-down until a fixed point is reached.
- Attribute-driven evaluator: a small generic core consults per-symbol bitflags (
HoldFirst/HoldAll,Flat,Orderless,Listable,OneIdentity,Protected, …) to decide how to process each call.
Pattern matching & rules
- First-class pattern matching:
Blank(_),BlankSequence(__),BlankNullSequence(___), named bindings (x_,x_h),Condition(/;),PatternTest(?),Optional, andRepeated— with full sequence backtracking. - Rule engine: transformation rules (
->,:>) and replacement operators (/.,//.,Replace).
Numbers
- Arbitrary-precision integers via the GNU Multiple Precision Arithmetic Library (GMP), with automatic promotion/demotion from machine integers.
- Exact rationals and complex numbers, plus MPFR-backed arbitrary-precision reals with precision/accuracy tracking (
N[expr, prec], precision literals such as3.98`50).
Symbolic mathematics
- Calculus: symbolic differentiation (
D,Dt,Derivative); multi-method integration (Integrate) cascading rational-function, Risch–Norman, radical-substitution, derivative-divides, and CRC integral-table methods;Series,Limit, and symbolic summation (Sumvia polynomial, geometric, and Gosper algorithms). - Polynomials: univariate and multivariate arithmetic, factorization (square-free decomposition, Hensel lifting, irreducibility testing), algebraic-number factoring over ℚ(α), Gröbner bases, GCD/LCM, and partial fractions.
- Linear algebra:
Det,Inverse,Dot,Cross; LU / QR / Cholesky / SVD / Schur decompositions; eigenvalues and eigenvectors via multiple algorithm kernels; norms, rank, and condition numbers — with optional LAPACK acceleration for machine-precision work. - Simplification:
Simplifywith a complexity-driven search, trigonometric identities, radical denesting, and an assumptions framework ($Assumptions,Element).
Number theory & factorization
-
Number theory:
GCD,LCM,ExtendedGCD,PowerMod,Divisors,EulerPhi,MoebiusMu,PrimitiveRoot, continued fractions, and more. -
Integer factorization: a unified, automatic pipeline alongside explicit algorithms — Pollard's Rho, Pollard's
$P-1$ , Williams'$P+1$ , Fermat, CFRAC, Dixon's Method, and the Elliptic Curve Method (ECM).
Programming & utilities
- Functional programming:
Map,Apply,Fold,Nest,Through,Composition, and pure functions (&/#). - Scoping & control flow:
Module,Block,With;If,Which,Switch,Do,For,While,Piecewise. - Standard library: lists and iteration, strings, statistics, random numbers, date/time, and file I/O.
The complete reference (~365 functions) lives in Mathilda_spec.md, which indexes the per-category pages under docs/spec/builtins/:
- Arithmetic and Algebra
- Calculus
- Elementary Functions
- Linear Algebra
- Structural Manipulation
- Expression Information
- Functional Programming
- Lists and Iteration
- Pattern Matching
- Control Flow
- Assignment and Rules
- Scoping Constructs
- Simplification
- Power Series
- String Operations
- File I/O
- Statistics
- Random Number Generation
- Time and Date
Weekly change summaries are recorded under docs/spec/changelog/.
To build and run Mathilda you need:
- A C99-compliant compiler (
gccorclang) - GMP (
libgmp/gmp-dev) — arbitrary-precision integers (required) - GNU Readline (
libreadline/readline-dev) — interactive line editing (required) - MPFR (
libmpfr/mpfr-dev) — arbitrary-precision reals (enabled by default) - LAPACK / BLAS — fast machine-precision linear algebra (optional, auto-detected)
- CMake — only required to build the test suite
The optional backends are controlled by build-time flags and degrade gracefully when disabled or absent:
| Flag | Default | Effect when on |
|---|---|---|
USE_MPFR |
1 |
Arbitrary-precision reals: N[expr, prec], Precision/Accuracy, precision literals. Build without it via make USE_MPFR=0. |
USE_LAPACK |
1 |
Fast machine-precision linear algebra. Auto-detected: Apple Accelerate on macOS, lapacke/lapack/blas on Linux. Falls back to the pure-C path (USE_LAPACK=0) if none is found. |
USE_ECM |
1 |
Elliptic Curve Method factorization via the vendored GMP-ECM (src/external/ecm/), built automatically. |
The makefile auto-discovers src/*.c, configures and compiles internal dependencies, then links the main executable (-std=c99 -O3).
- Clone the repository (including the bundled GMP-ECM submodule):
If you already cloned without
git clone --recurse-submodules https://github.com/stblake/Mathilda.git cd Mathilda--recurse-submodules, rungit submodule update --init --recursivefrom inside the repo before building. (The build script will also attempt this automatically if it detects an uninitialized submodule.) - Build the project:
To build a leaner binary, disable optional backends, e.g.
make -j$(nproc)make -j$(nproc) USE_LAPACK=0 USE_MPFR=0. - Start the interactive REPL:
./Mathilda
Mathilda ships a comprehensive C-based unit suite — 216 test_*.c files
covering evaluation, parsing, pattern matching, arithmetic, polynomials,
calculus, linear algebra, and more. CMake auto-detects the same optional
backends (MPFR, LAPACK, ECM) during configuration.
cd tests
mkdir -p build && cd build
cmake ..
make -j$(nproc)
# Run all test binaries
for t in *_tests; do ./$t; doneEverything in Mathilda is represented by an immutable-by-convention Expr AST
node — a tagged union over Integer, Real, BigInt, Symbol, String,
Function, and (when built with MPFR) arbitrary-precision MPFR reals. Compound
values are Function nodes: Rational[n, d], Complex[a, b], and lists
(List[…]) are all expressions, so the same generic tools (Part, Map,
ReplaceAll, …) operate uniformly on everything.
The system is modularized into several independent subsystems:
- Parser (
parse.c) — a Pratt parser mirroring Mathematica's operator precedences (inline lexing, no separate tokenizer). - Evaluator (
eval.c) — the fixed-point evaluation loop; appliesHold*,Flat,Orderless, andListablebefore recursively evaluating arguments. - Symbol table (
symtab.c) —OwnValues,DownValues(user rules), attributes, docstrings, and native C built-in function pointers. - Pattern matcher & rule engine (
match.c,replace.c) — structural tree unification with sequence segmenting and backtracking.
Larger mathematical domains live in dedicated subdirectories of src/:
| Subsystem | Responsibility |
|---|---|
poly/ |
Univariate/multivariate polynomial arithmetic, factorization, algebraic-number fields, Gröbner bases |
linalg/ |
Dense linear algebra; eigen kernels split by algorithm |
calculus/ |
D/Dt/Derivative, Series, Limit, Integrate (incl. Risch–Norman) |
simp/ |
Simplify, trigonometric simplification, radical denesting, assumptions |
sum/ |
Symbolic summation (polynomial, geometric, Gosper) |
internal/ |
Mathematica-syntax bootstrap .m files (init, integral tables) loaded at startup |
external/ecm/ |
Vendored GMP-ECM (do not modify) |
A recurring design pattern is C for performance, rules for mathematics: hot
paths (parser, evaluator, matcher, arithmetic) are C, while higher-level
identities (integral tables, etc.) are expressed as DownValues in Mathilda's
own language. The full architecture is documented in SPEC.md, and
extension recipes in docs/extending.md.
Adding new functionality to Mathilda is straightforward:
-
Write the C implementation. Create your evaluation logic in the appropriate
.cmodule (e.g.,core.c). Your function signature must beExpr* builtin_myfunc(Expr* res).- Memory rule (ownership contract): the builtin takes ownership of
res. On success, return a newExpr*— the evaluator freesresfor you, so you must not callexpr_free(res)yourself (doing so causes a double-free). If you cannot evaluate the input (e.g. symbolic arguments to a purely numeric function), returnNULLwithout freeingres, and the evaluator retains ownership, leaving the expression unevaluated. When you reuse parts ofresin your result, NULL them out first so the evaluator's cleanup doesn't free them twice.
Expr* builtin_myfunc(Expr* res) { if (res->data.function.arg_count != 1) return NULL; /* leave unevaluated */ // ... mathematical logic ... return expr_new_integer(42); /* evaluator frees res — do NOT free it here */ }
- Memory rule (ownership contract): the builtin takes ownership of
-
Register the function. In the module's initialization routine (e.g.,
core_init()), register the function and assign a documentation string so it is available via?MyFunc:symtab_add_builtin("MyFunc", builtin_myfunc); symtab_set_docstring("MyFunc", "MyFunc[x]\n\tComputes the ultimate answer.");
-
Assign attributes. If your function threads over lists, operates symmetrically, or holds its arguments unevaluated, set the corresponding attributes during initialization:
symtab_get_def("MyFunc")->attributes |= ATTR_LISTABLE | ATTR_PROTECTED;
-
Test and document.
- Add test cases to the appropriate suite in
tests/using theTEST(...)macro. - Document the function in the matching file under
docs/spec/builtins/and add a note to the current week'sdocs/spec/changelog/entry.
- Add test cases to the appropriate suite in
See docs/extending.md for the full recipes (modules,
patterns, internal .m rules, operators) and CLAUDE.md for the
contributor workflow.
Mathilda is open-source software licensed under the GNU General Public License v3.0 (GPLv3).
You are heavily encouraged to explore the codebase, submit pull requests, report issues, and expand the CAS with new mathematical algorithms! Please see the LICENSE file for more details.