feat: On SortVerbose by jodavies · Pull Request #805 · form-dev/form

jodavies · 2026-03-03T12:20:16Z

The first commit just cleans up WriteStats by using snprintf to sort out the field alignment, rather than the deeply nested if statements and MesPrint.

The second adds an On SortVerbose; mode for the sort statistics printing, which looks like:

Time =       0.06 sec    Generated terms =   33554432
           test1         Terms in output =          0
                         Bytes used      =          4
                         Comparisons     =  195165224
                         Small Buffer    =    0,   29
                         Large Buffer    =    0,    0

The numbers are, across all threads:

- the total number of comparisons made
- the number of times the small buffer was sorted due to TermsInSmall
- the number of times the small buffer was sorted due to SmallSize
- the number of times the large buffer was sorted due to LargePatches
- the number of times the large buffer was sorted due to LargeSize
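
For readers who want a concrete picture of these five counters, here is a minimal sketch in C. The struct, its field names, and both helper functions are hypothetical and are not taken from the FORM sources. The order of the two limit checks is also an assumption made for illustration.

```c
#include <stddef.h>

/* Hypothetical per-thread counters; names are illustrative only. */
typedef struct {
    unsigned long long comparisons;  /* term comparisons made while sorting */
    unsigned long small_by_terms;    /* small buffer sorted: TermsInSmall reached */
    unsigned long small_by_size;     /* small buffer sorted: SmallSize reached */
    unsigned long large_by_patches;  /* large buffer sorted: LargePatches reached */
    unsigned long large_by_size;     /* large buffer sorted: LargeSize reached */
} SortCounters;

/* Record why the small buffer must be sorted. Returns 1 if a limit was hit
 * (and counts it), 0 otherwise. Checking the term limit first is an
 * assumption of this sketch. */
int note_small_buffer_sort(SortCounters *c, long terms, long max_terms,
                           long bytes, long max_bytes)
{
    if (terms >= max_terms) { c->small_by_terms++; return 1; }
    if (bytes >= max_bytes) { c->small_by_size++;  return 1; }
    return 0;
}

/* Sum the counters of n threads into a single total. */
SortCounters sum_counters(const SortCounters *per_thread, size_t n)
{
    SortCounters total = {0, 0, 0, 0, 0};
    for (size_t i = 0; i < n; i++) {
        total.comparisons      += per_thread[i].comparisons;
        total.small_by_terms   += per_thread[i].small_by_terms;
        total.small_by_size    += per_thread[i].small_by_size;
        total.large_by_patches += per_thread[i].large_by_patches;
        total.large_by_size    += per_thread[i].large_by_size;
    }
    return total;
}
```

Keeping one counter per trigger, rather than one per buffer, is what lets the printed pair distinguish a term-count limit from a byte-size limit.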

The motivation is to make it easier for the user to configure the sorting buffer sizes, since they can now see what the bottleneck is. For example, increasing SmallSize is not useful if TermsInSmall was the reason for your small buffer sorts, and vice versa. Printing the comparison count is more useful from a development perspective, for anyone working on the sorting system.
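
As a rough illustration of how the two small buffer numbers could be read, the sketch below picks the setup parameter that triggered most of the sorts. The function name is hypothetical, and reporting ties as TermsInSmall is an arbitrary choice for this example.

```c
#include <stddef.h>

/* Given the two "Small Buffer" counts (sorts due to TermsInSmall, sorts due
 * to SmallSize), return the name of the parameter that triggered most sorts,
 * or NULL if the small buffer was never sorted because of a limit.
 * Ties are reported as TermsInSmall (arbitrary choice in this sketch). */
const char *small_buffer_bottleneck(unsigned long by_terms,
                                    unsigned long by_size)
{
    if (by_terms == 0 && by_size == 0) {
        return NULL;
    }
    return (by_terms >= by_size) ? "TermsInSmall" : "SmallSize";
}
```

With the sample output above (`Small Buffer = 0, 29`), this returns "SmallSize": all 29 sorts were caused by the byte limit, so raising TermsInSmall would not be expected to help there.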

I don't see any performance degradation due to collecting the extra information.

Any comments? Any extra useful information we could add here?

coveralls · 2026-03-03T12:42:02Z

coverage: 58.52% (+0.05%) from 58.466% — jodavies:sortverbose into form-dev:master

tueda · 2026-03-27T05:11:52Z

Here are the benchmark results with tform -w8 (Intel Core i9-12900, Ubuntu 20.04, x86_64).
I don't see any significant performance regressions.

Benchmark	Speedup	95% bootstrap CI
chromatic	1.00	[1.00, 1.00]
color	1.00	[0.99, 1.00]
fmft	1.00	[1.00, 1.00]
forcer	1.00	[1.00, 1.01]
forcer-exp	0.99	[0.99, 1.00]
mbox1l	1.00	[0.99, 1.01]
minceex	1.00	[0.99, 1.00]
mincer	1.01	[1.00, 1.01]
sort-disk	1.00	[0.99, 1.00]
sort-large	0.99	[0.97, 1.00]
sort-small	1.00	[0.99, 1.01]
trace	0.99	[0.98, 1.00]

Details

Speedup of B over A (mean) = (mean time of A) / (mean time of B)
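
The speedup definition above can be sketched in C as follows. The thread does not say how the 95% bootstrap interval was computed, so the percentile bootstrap over run pairs shown here is an assumption, and all function names are hypothetical.

```c
#include <stdlib.h>

/* Speedup of B over A = mean(A) / mean(B); above 1 means B is faster. */
double mean_speedup(const double *a, const double *b, size_t n)
{
    double sum_a = 0.0, sum_b = 0.0;
    for (size_t i = 0; i < n; i++) {
        sum_a += a[i];
        sum_b += b[i];
    }
    return sum_a / sum_b;  /* the 1/n factors cancel */
}

static int cmp_double(const void *x, const void *y)
{
    double dx = *(const double *)x, dy = *(const double *)y;
    return (dx > dy) - (dx < dy);
}

/* Percentile bootstrap (assumed method): resample run pairs with
 * replacement, recompute the speedup each time, and take the 2.5% and
 * 97.5% quantiles. Requires n > 0 and resamples > 0.
 * Returns 0 on success, -1 if memory allocation fails. */
int bootstrap_ci(const double *a, const double *b, size_t n,
                 size_t resamples, unsigned seed, double *lo, double *hi)
{
    double *stats = malloc(resamples * sizeof *stats);
    if (stats == NULL) {
        return -1;
    }
    srand(seed);
    for (size_t r = 0; r < resamples; r++) {
        double sum_a = 0.0, sum_b = 0.0;
        for (size_t i = 0; i < n; i++) {
            size_t k = (size_t)rand() % n;  /* slight modulo bias, fine for a sketch */
            sum_a += a[k];
            sum_b += b[k];
        }
        stats[r] = sum_a / sum_b;
    }
    qsort(stats, resamples, sizeof *stats, cmp_double);
    *lo = stats[(size_t)(0.025 * (double)(resamples - 1))];
    *hi = stats[(size_t)(0.975 * (double)(resamples - 1))];
    free(stats);
    return 0;
}
```

Resampling pairs rather than the two samples independently keeps run-to-run noise that affects both binaries correlated, which suits the paired design described in this comment.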

A:

TFORM 5.0.0 (Jan 27 2026, v5.0.0)
-backtrace  +flint=3.4.0  +gmp=6.3.0   -mpi    +pthreads  +zlib=1.2.11
-debugging  +float        +mpfr=4.2.2  +posix  -windows   +zstd=1.4.4
Compiler: GCC 11.5.0
Architecture: x86_64

B:

TFORM 5.0.0 (Mar  3 2026, v5.0.0-2-g69dad23)
-backtrace  +flint=3.4.0  +gmp=6.3.0   -mpi    +pthreads  +zlib=1.2.11
-debugging  +float        +mpfr=4.2.2  +posix  -windows   +zstd=1.4.4
Compiler: GCC 11.5.0
Architecture: x86_64

Paired runs with n = 30 per benchmark with /tmp instead of /dev/shm. Used the scripts from this snapshot. The binaries were built for the x86-64-v1 baseline.

Environment:


OS	Ubuntu 20.04.6 LTS
Kernel	Linux 5.15.0-84-generic
Architecture	x86_64
CPU	Intel Core i9-12900
CPU configuration	16 cores / 24 threads (8 P-cores + 8 E-cores)
Memory	62.6 GiB
Storage	WD_BLACK SN770 1TB NVMe SSD

tueda · 2026-03-27T05:14:13Z

Just in case, I also tried the first commit (refactoring only), but I got a compile error:

/home/tueda/work/form/sources/sort.c: In function 'WriteStats':
/home/tueda/work/form/sources/sort.c:178:28: warning: unused variable 'oldLogHandle' [-Wunused-variable]
  178 |                 const WORD oldLogHandle = AC.LogHandle;
      |                            ^~~~~~~~~~~~
/home/tueda/work/form/sources/sort.c:327:32: error: 'oldLogHandle' undeclared (first use in this function)
  327 |                 AC.LogHandle = oldLogHandle;
      |                                ^~~~~~~~~~~~
/home/tueda/work/form/sources/sort.c:327:32: note: each undeclared identifier is reported only once for each function it appears in
/home/tueda/work/form/sources/sort.c: At top level:
/home/tueda/work/form/sources/sort.c:331:1: error: expected identifier or '(' before '}' token
  331 | }
      | ^
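
These diagnostics (declaration reported as unused, later use reported as undeclared, stray `}` at top level) are consistent with the block holding the declaration being closed before the restore, although the thread does not show the offending code. A minimal sketch of the save/restore pattern with both steps in one scope, using a hypothetical `log_handle` variable in place of `AC.LogHandle`:

```c
/* Hypothetical stand-in for AC.LogHandle; not the FORM data structure. */
static int log_handle = 3;

/* Example callback: reports the handle value it sees. */
static int read_handle(void)
{
    return log_handle;
}

/* Temporarily disable the log handle while fn runs, then restore it.
 * The saved value and the restore live in the same block, so the
 * identifier is still in scope where it is used. */
int with_log_disabled(int (*fn)(void))
{
    const int old_log_handle = log_handle;  /* save */
    log_handle = -1;                        /* disable (assumed sentinel) */
    int result = fn();
    log_handle = old_log_handle;            /* restore, same scope as the save */
    return result;
}
```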

The deeply nested conditions on the lengths of the numbers exist because MesPrint truncates numbers that are too long for positioned fields, rather than loosening the positioning. snprintf doesn't truncate in that case, so the output is much easier to align.
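
The formatting point can be illustrated with a short C sketch. The helper name and the exact widths are assumptions chosen to mimic the sample output in this thread, not the code in the commit. In `snprintf`, a field width is a minimum, so an over-long number widens the line instead of being cut.

```c
#include <stdio.h>

/* Format one statistics line as "<label>= <value>": the label is
 * left-justified in 16 columns, the value right-justified in at least
 * 10 columns. Returns snprintf's result, i.e. the length the full line
 * needs (the output is cut only if that is >= size). */
int format_stat_line(char *buf, size_t size, const char *label,
                     long long value)
{
    return snprintf(buf, size, "%-16s= %10lld", label, value);
}
```

For example, `format_stat_line(buf, sizeof buf, "Generated terms", 33554432LL)` yields `Generated terms =   33554432`, while a 12-digit value is printed in full and simply pushes the line wider. The destination buffer still has to be large enough, which the return value lets the caller check.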

jodavies · 2026-03-28T22:54:45Z

I fixed the non-compilation of the refactor commit.

I also added a report of the total size of the generated terms, before sorting and compression. Pre-compression means that this number might be larger than Bytes used, even in a module where no terms merged during a sort:

Time =       0.00 sec    Generated terms =         64
           test1         Terms in output =         64
                         Bytes used      =       4888
                         Unsorted bytes  =       6144
                         Small Buffer    =    0,    0
                         Large Buffer    =    0,    0
                         Comparisons     =         63

Add information to the final sort summaries (per thread and master) including the number of comparisons made, the number of times the small and large buffers were sorted due to their capacities, and the total size of the unsorted, uncompressed generated terms.

jodavies · 2026-05-05T09:53:07Z

I added information on the additional stats to the manual.

refactor: use snprintf to clean up WriteStats

7e0ddbf

jodavies force-pushed the sortverbose branch from 69dad23 to 80f7139 on March 28, 2026 22:52

jodavies force-pushed the sortverbose branch from 80f7139 to 7ddf04d on March 28, 2026 22:57

jodavies force-pushed the sortverbose branch from 7ddf04d to c04fadf on May 5, 2026 09:38

jodavies changed the title from "WIP On SortVerbose" to "feat: On SortVerbose" on May 5, 2026

jodavies wants to merge 2 commits into form-dev:master from jodavies:sortverbose
