Specifics to implement testing of individual grammars
It seems to me that the Makefile here ought to have a test target, on which the compile target depends. That target might call something like:
pegex parsetest filename.pgx
This would load and parse the .pgx file, then iterate over a set of input files in an input directory (e.g. test-input). For each input it would produce an output file in an output directory (e.g. test-output) and compare it byte-for-byte with the corresponding file in an expected directory (e.g. test-expected). It would return success only if every output matched its expected file.
The contents of the output and expected files would effectively be ASTs: a data structure in JSON format, resembling this:
{
  "data": { "parse_tree": "contents" },
  "errors": [ "array of parse error hashes maybe non existent" ]
}
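The comparison step described above could be sketched roughly as follows. This is only an illustration in Python, not part of Pegex: the function names compare_outputs and main are hypothetical, the directory names are the examples used above, and the sketch assumes the output files have already been written by whatever the proposed parsetest command ends up being.

```python
from pathlib import Path


def compare_outputs(output_dir, expected_dir):
    """Return the sorted names of expected files whose output is missing or differs."""
    output_dir, expected_dir = Path(output_dir), Path(expected_dir)
    failures = []
    for expected in sorted(expected_dir.iterdir()):
        if not expected.is_file():
            continue
        actual = output_dir / expected.name
        # Byte-for-byte comparison; a missing output file counts as a failure.
        if not actual.is_file() or actual.read_bytes() != expected.read_bytes():
            failures.append(expected.name)
    return failures


def main(output_dir="test-output", expected_dir="test-expected"):
    """Print one line per failing file and return a shell-style exit status."""
    failures = compare_outputs(output_dir, expected_dir)
    for name in failures:
        print(f"FAIL {name}")
    return 1 if failures else 0
```

A Makefile test target could use the return value of main() as its exit status, so that a single mismatch fails the target and anything that depends on it.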
A separate testdiff target would go through the output and expected directories and show the data-structure differences in a human-readable way, possibly by putting the "canonical" (pretty) JSON outputs through diff -u.
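One possible shape for the testdiff comparison is sketched below, again in Python and purely as an illustration. It assumes "canonical" means sorted keys with fixed indentation, and it uses the standard difflib module in place of an external diff -u. The names canonical_json and diff_asts are hypothetical.

```python
import difflib
import json


def canonical_json(text):
    """Re-serialize JSON text with sorted keys and fixed indentation."""
    return json.dumps(json.loads(text), indent=2, sort_keys=True, ensure_ascii=False) + "\n"


def diff_asts(expected_text, actual_text, name="ast.json"):
    """Return a unified diff (as diff -u would) of two JSON documents in canonical form."""
    expected_lines = canonical_json(expected_text).splitlines(keepends=True)
    actual_lines = canonical_json(actual_text).splitlines(keepends=True)
    return "".join(
        difflib.unified_diff(
            expected_lines,
            actual_lines,
            fromfile=f"test-expected/{name}",
            tofile=f"test-output/{name}",
        )
    )
```

Because both sides are canonicalized first, differences in key order or whitespace produce an empty diff here, even though the byte-for-byte test would report them. That seems appropriate for a tool whose job is to show data-structure differences rather than to decide pass or fail.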
Implications for Pegex in furthering Acmeist ideas
This implies that grammars and input files together produce a language-independent AST, which seems like a good idea to me. That separately implies language-independent grammar-to-AST mappers, one per target language. It would then be possible to automatically generate, e.g., a C library that parses GraphQL or JSON.