Skip to content

Remove groupBy' for specialized internal splitBothBoth helper#26

Open
ninioArtillero wants to merge 1 commit into
seereason:masterfrom
tweag:ninioArtillero/specialize-groupBy
Open

Remove groupBy' for specialized internal splitBothBoth helper#26
ninioArtillero wants to merge 1 commit into
seereason:masterfrom
tweag:ninioArtillero/specialize-groupBy

Conversation

@ninioArtillero
Copy link
Copy Markdown
Contributor

@ninioArtillero ninioArtillero commented May 20, 2026

The groupBy' helper is defined within the Data.Algorithm.DiffContext module with the purpose of creating separate hunks within the getContextDiffNumbered function.

In this PR, this helper is specialized to Hunk c instead of arbitrary lists with one variation: the equality predicate argument is removed by (essentially) inlining the predicate used at the call site: \a b -> not (isBoth a && isBoth b). As a consequence, the required split is done throughout the recursive accumulation and consing, and the isBoth local helper is removed from getContextDiffNumbered.

This change merges two helpers and reduces the amount of indirection, making the operation in question easier to reason about and the body of getContextDiffNumbered more transparent and concise:

    splitBothBoth $ doPrefix $ getGroupedDiff a0 b0

NOTE: This change passes all cabal tests and results in no perceptible performance variation on cabal bench (from #25).

NOTE: The groupBy' export is now removed and not replaced by the new function (reason being that the new helper is an implementation detail), thus probably requiring a version bump. The script at the bottom of this description reveals that no module directly importing DiffContext calls groupBy'. It might still be the case for a module re-export to hide a call from this script, but this seems very unlikely. A Github search for groupBy' language:Haskell shows bespoke definitions of groupBy' are very common. Besides, this is far from being a loss given that focused alternatives to Prelude.groupBy with equivalent behavior to DiffContext.groupBy' (but perhaps, better performance) exist in the ecosystem:

#!/usr/bin/env bash
# Check whether any Diff library direct dependency calls Data.Algorithm.DiffContext.groupBy'
# which would be broken by removing its export.
# Requires: curl, cabal, grep.

set -euo pipefail

WORKDIR=$(mktemp -d)
trap 'rm -rf "$WORKDIR"' EXIT

echo "Fetching reverse dependencies of Diff from Hackage..."
curl -sf 'https://hackage.haskell.org/package/Diff/reverse' \
 | grep -oP 'href="/package/\K[^/"]+' \
 | sed 's/-[0-9][0-9.]*$//' \
 | sort -u \
 > "$WORKDIR/revdeps.txt"

echo "Updating Hackage index..."
cabal update -v0

echo "Downloading sources..."
mkdir -p "$WORKDIR/sources"
while read -r pkg; do
 cabal get "$pkg" --destdir="$WORKDIR/sources" -v0 2>/dev/null \
   || echo "  Warning: could not fetch $pkg"
done < "$WORKDIR/revdeps.txt"

echo "Searching for groupBy' references..."

# Pass 1: find all files that import Data.Algorithm.DiffContext in any form.
IMPORTING_FILES=$(grep -rl "import.*Data\.Algorithm\.DiffContext" \
 "$WORKDIR/sources" --include="*.hs")

if [[ -z "$IMPORTING_FILES" ]]; then
 echo "No files import Data.Algorithm.DiffContext — safe to remove the export."
 exit 0
fi

echo "Files importing Data.Algorithm.DiffContext:"
echo "$IMPORTING_FILES" | sed "s|$WORKDIR/sources/||"
echo

# Pass 2: within those files, search for any use of groupBy'.
echo "$IMPORTING_FILES" | xargs grep -n "groupBy'" \
 && echo "Matches found above." \
 || echo "No matches — safe to remove the export."

@ninioArtillero ninioArtillero force-pushed the ninioArtillero/specialize-groupBy branch from a725ea7 to e3f1241 Compare May 20, 2026 15:19
@ninioArtillero ninioArtillero marked this pull request as ready for review May 20, 2026 15:30
@ddssff
Copy link
Copy Markdown
Member

ddssff commented May 20, 2026

This looks good to me.

@ninioArtillero ninioArtillero force-pushed the ninioArtillero/specialize-groupBy branch from e3f1241 to 12fc5af Compare May 20, 2026 17:20
@ninioArtillero
Copy link
Copy Markdown
Contributor Author

@ddssff Just gave it a last sanity check and just changed two type aliases to better reflect where we expect the Hunk invariants to hold on splithBothBoth. I consider this ready; I'll be on the spot if any further change is required. Thanks for looking at this!

@ninioArtillero ninioArtillero force-pushed the ninioArtillero/specialize-groupBy branch from 12fc5af to c45b66d Compare May 21, 2026 16:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants