Avoid per-write virtual dispatch in DictionaryValuesWriter.shouldFallBack() by caching the size-exceeded check #3501

@arouel

Describe the enhancement requested

DictionaryValuesWriter.shouldFallBack() is called by FallbackValuesWriter.checkFallback() after every single value write. The current implementation dispatches a virtual call to getDictionarySize() on every invocation:

```java
public boolean shouldFallBack() {
  return dictionaryByteSize > maxDictionaryByteSize || getDictionarySize() > MAX_DICTIONARY_ENTRIES;
}
```

getDictionarySize() is an abstract method overridden in each typed subclass (Binary, Long, Double, Integer, Float) to return the backing map's .size(). Since shouldFallBack() is polled after every write, including writes of duplicate values that do not grow the dictionary, the virtual dispatch and map-size query are redundant work for the common case where most values are already in the dictionary.

Both dictionaryByteSize and the dictionary entry count can only increase when a new entry is added (inside the if (id == -1) branch of each subclass's write method). Therefore the size-exceeded condition can only transition from false to true at that exact point.
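
To make the redundant work concrete, here is a minimal, self-contained sketch of the polling pattern described above. The class names (PollingWriter, PollingLongWriter), the tiny entry limit, and the sizeQueries counter are hypothetical stand-ins for illustration, not parquet-java code; only the shape of shouldFallBack() follows the snippet in this issue.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for DictionaryValuesWriter; not the real class.
abstract class PollingWriter {
  static final int MAX_DICTIONARY_ENTRIES = 4; // illustrative limit only
  protected long dictionaryByteSize;
  protected final long maxDictionaryByteSize;
  int sizeQueries; // instrumentation for this sketch: counts getDictionarySize() calls

  PollingWriter(long maxDictionaryByteSize) {
    this.maxDictionaryByteSize = maxDictionaryByteSize;
  }

  protected abstract int getDictionarySize();

  // Same shape as the current check: re-evaluated on every poll.
  public boolean shouldFallBack() {
    return dictionaryByteSize > maxDictionaryByteSize || getDictionarySize() > MAX_DICTIONARY_ENTRIES;
  }
}

// Hypothetical typed subclass backed by a plain HashMap.
final class PollingLongWriter extends PollingWriter {
  private final Map<Long, Integer> dict = new HashMap<>();

  PollingLongWriter(long maxDictionaryByteSize) {
    super(maxDictionaryByteSize);
  }

  @Override
  protected int getDictionarySize() {
    sizeQueries++;
    return dict.size();
  }

  void writeLong(long v) {
    Integer id = dict.get(v);
    if (id == null) { // plays the role of the id == -1 branch
      dict.put(v, dict.size());
      dictionaryByteSize += Long.BYTES;
    }
  }
}
```

In this sketch, writing the same value repeatedly and polling shouldFallBack() after each write queries the map size once per write, even though the dictionary stops growing after the first one.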

Proposal

Replace the per-write check with a cached boolean dictionarySizeExceeded flag. Introduce a checkDictionarySizeLimit(int newDictionarySize) method that subclass write methods call only when a new dictionary entry is actually added. shouldFallBack() then returns the cached flag directly, a simple field read with no virtual dispatch.
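
A rough sketch of how the proposal could look, under the same simplifying assumptions: CachedLimitWriter and CachedLimitLongWriter are hypothetical stand-ins, the entry limit is artificially small, and limitChecks exists only to count how often the limit is evaluated. The names dictionarySizeExceeded and checkDictionarySizeLimit(int) come from the proposal; everything else is illustrative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified stand-in for DictionaryValuesWriter; not the real class.
abstract class CachedLimitWriter {
  static final int MAX_DICTIONARY_ENTRIES = 4; // illustrative limit only
  protected long dictionaryByteSize;
  protected final long maxDictionaryByteSize;
  private boolean dictionarySizeExceeded; // latches to true; never reset in this sketch

  CachedLimitWriter(long maxDictionaryByteSize) {
    this.maxDictionaryByteSize = maxDictionaryByteSize;
  }

  // Subclasses call this only right after adding a new dictionary entry.
  protected final void checkDictionarySizeLimit(int newDictionarySize) {
    if (dictionaryByteSize > maxDictionaryByteSize || newDictionarySize > MAX_DICTIONARY_ENTRIES) {
      dictionarySizeExceeded = true;
    }
  }

  // Plain field read: no abstract call, no map-size query.
  public boolean shouldFallBack() {
    return dictionarySizeExceeded;
  }
}

// Hypothetical typed subclass backed by a plain HashMap.
final class CachedLimitLongWriter extends CachedLimitWriter {
  private final Map<Long, Integer> dict = new HashMap<>();
  int limitChecks; // instrumentation for this sketch

  CachedLimitLongWriter(long maxDictionaryByteSize) {
    super(maxDictionaryByteSize);
  }

  void writeLong(long v) {
    Integer id = dict.get(v);
    if (id == null) { // plays the role of the id == -1 branch
      dict.put(v, dict.size());
      dictionaryByteSize += Long.BYTES;
      limitChecks++;
      checkDictionarySizeLimit(dict.size());
    }
  }
}
```

With this shape, a run of duplicate values evaluates the limit once, when the first value enters the dictionary. One caveat the sketch leaves out: any code path that clears the dictionary would presumably also need to reset the cached flag.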

Component(s)

Core
