Skip to content

[improve][broker] Part 2: Pack PendingAcksMap values into long#26030

Open
void-ptr974 wants to merge 6 commits into
apache:masterfrom
void-ptr974:pending-acks-pack-value-part2
Open

[improve][broker] Part 2: Pack PendingAcksMap values into long#26030
void-ptr974 wants to merge 6 commits into
apache:masterfrom
void-ptr974:pending-acks-pack-value-part2

Conversation

@void-ptr974

@void-ptr974 void-ptr974 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor

Motivation

Part of #26027.

PendingAcksMap tracks entries that have been dispatched to Shared and Key_Shared consumers but are not fully acknowledged yet. Before this PR, each pending entry stored its value as an IntIntPair:

  • remaining unacked message count
  • sticky key hash

The consumer hot paths usually only need the remaining unacked count after a lookup or removal. Keeping a value object per pending entry increases retained heap, and returning IntIntPair from those paths can create short-lived objects that are avoidable.

Modifications

  • Pack the pending-ack value into one long while keeping the same outer/inner sorted-map layout and ledger/entry ordering.
    • high 32 bits: remaining unacked count
    • low 32 bits: sticky key hash
  • Add primitive accessors for the consumer paths that only need the remaining unacked count:
    • getRemainingUnacked
    • removeAndGetRemainingUnacked
  • Update Consumer to use the primitive accessors for ack owner lookup, full ack completion, and redelivery removal.
  • Keep the existing get and removeAndGet methods that return IntIntPair for compatibility, and mark the legacy pair-returning path as deprecated.
  • Add PendingAcksMapPackingBenchmark under microbench so this value-layout change can be measured independently from later inner-map changes.

This PR only changes the value representation and the hot-path accessors. It does not replace the inner ordered map and does not include the later primitive hash-map or BitSet cleanup-index work.

Correctness coverage

PendingAcksMapTest covers the updated behavior:

  • packed value round trip through public accessors
  • zero remaining count
  • signed sticky key hash values, including negative hashes
  • integer boundary values
  • value-based removal
  • updateRemainingUnacked
  • removeAllUpTo callback value preservation
  • missing-entry lookups

Benchmark

This PR adds PendingAcksMapPackingBenchmark, which compares the value layout before this PR with the value layout introduced by this PR, using the same ledger/entry ordering behavior in both cases. The table below reports the before/after results relevant to this PR.

Run command:

./gradlew :microbench:shadowJar

java -jar microbench/build/libs/microbench-5.0.0-M1-SNAPSHOT-benchmarks.jar \
  'PendingAcksMapPackingBenchmark\..*' \
  -p entries=50000 \
  -p ledgers=1 \
  -wi 1 -i 3 -w 1s -r 1s -f 1 \
  -bm avgt -tu ns -prof gc

Local OpenJDK 25.0.2 results:

Operation Before this PR This PR Change
addOrReplace 183.9 ns/op, 24.7 B/op 177.9 ns/op, 0.5 B/op -3.3% time, -98.0% alloc
containsHit 79.3 ns/op, 0.3 B/op 70.6 ns/op, 0.2 B/op -11.0% time
getRemainingUnackedHit 81.1 ns/op, 0.3 B/op 70.4 ns/op, 0.2 B/op -13.3% time
updateRemainingUnacked 181.4 ns/op, 24.7 B/op 173.0 ns/op, 0.5 B/op -4.6% time, -98.0% alloc
removeAndGetRemainingAndAdd 338.3 ns/op, 65.3 B/op 294.9 ns/op, 40.8 B/op -12.8% time, -37.5% alloc
removeWithValueAndAdd 327.3 ns/op, 89.3 B/op 294.5 ns/op, 40.8 B/op -10.0% time, -54.3% alloc
forEachScan 370.8 us/op, 1477.0 B/op 277.6 us/op, 815.4 B/op -25.1% time, -44.8% alloc
removeAllUpToSmallPrefixAndRefill 19.7 us/op, 6622.3 B/op 18.5 us/op, 4235.4 B/op -6.2% time, -36.0% alloc

The main effect is lower retained heap from removing one value object per pending entry, plus lower allocation on update/remove paths that no longer need to create replacement pair values. The legacy pair-returning methods remain available for compatibility.

JOL GraphLayout.totalSize() on the value-layout change alone:

Entries / ledgers Before this PR This PR Change
1,000 / 1 64,872 B / 2,007 objects 40,880 B / 1,007 objects -37.0% size, -1,000 objects
1,000 / 20 73,536 B / 2,083 objects 49,696 B / 1,083 objects -32.4% size, -1,000 objects
50,000 / 1 3,200,872 B / 100,007 objects 2,000,880 B / 50,007 objects -37.5% size, -50,000 objects
50,000 / 20 3,209,536 B / 100,083 objects 2,009,696 B / 50,083 objects -37.4% size, -50,000 objects

Verifying this change

  • ./gradlew :pulsar-broker:test --tests org.apache.pulsar.broker.service.PendingAcksMapTest --max-workers=1 -PtestRetryCount=0
  • ./gradlew :pulsar-broker:checkstyleMain :pulsar-broker:checkstyleTest --max-workers=1
  • ./gradlew :microbench:compileJava --max-workers=1
  • ./gradlew :microbench:checkstyleMain --max-workers=1
  • ./gradlew :microbench:shadowJar --max-workers=1
  • JMH command listed above

@void-ptr974

void-ptr974 commented Jun 14, 2026

Copy link
Copy Markdown
Contributor Author

After this PR is merged, the next PR will switch the per-ledger entry map in PendingAcksMap to the primitive Long2LongOpenHashMap added in #26028.

That follow-up will keep the packed-value shape from this PR and focus only on removing the boxed inner-map key/value overhead. The BitSet prefix index for same-ledger removeAllUpTo cleanup will be a separate PR after that.

…alue-part2

# Conflicts:
#	pulsar-broker/src/main/java/org/apache/pulsar/broker/service/PendingAcksMap.java
#	pulsar-broker/src/test/java/org/apache/pulsar/broker/service/PendingAcksMapTest.java

@lhotari lhotari left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Related to #26028 (review). I have the fastutil minification working now and this change can be implemented on top of fastutil once the restoration of fastutil has been merged.

@void-ptr974 void-ptr974 requested a review from lhotari June 18, 2026 02:59
@void-ptr974

Copy link
Copy Markdown
Contributor Author

Thanks @lhotari. I updated this on top of #26032.

The latest version uses fastutil and keeps the scope to the value-layout change. I also updated the benchmark/JOL numbers in the PR description.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants