Add clean_name utility and fallback_func support to lookup #95
Merged
Add a `clean_name` helper that strips incoming text down to a bare state name, and a pluggable `fallback_func` mechanism for `lookup`.

- `clean_name(text)`: removes punctuation/underscores and the filler words "the", "state", and "of", returning a lowercased, space-separated string. Standalone helper; `lookup` does not call it.
- `lookup(..., fallback_func=None)`: when no match is found, calls `fallback_func(val)` with the original, unmodified value. The fallback decides what to match against.
- `startswith_fallback(val)`: a ready-made fallback matching `val` against the start of each state's name, case-insensitively.
- `lookup` now returns immediately on a cache hit instead of scanning the full state list afterward.
- Fallback hits are cached under a fallback-specific key, so a cached fallback result never leaks into a no-fallback or different-fallback lookup of the same value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
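To make the normalization concrete, here is a minimal sketch of what a helper like `clean_name` might do. It is not the code from this PR: the name `clean_name_sketch`, the regular expression, and the `FILLER_WORDS` set are assumptions made for illustration. The commit message names the filler words "the", "state", and "of"; the PR description also lists "commonwealth", so the sketch includes it.

```python
import re

# Assumed filler-word list. The commit message names the first three;
# the PR description additionally lists "commonwealth".
FILLER_WORDS = {"the", "state", "of", "commonwealth"}


def clean_name_sketch(text: str) -> str:
    """Illustrative stand-in for clean_name, not the PR's implementation."""
    # Lowercase, then turn punctuation and underscores into spaces.
    # \w matches underscores, so they need their own alternative.
    spaced = re.sub(r"[^\w\s]|_", " ", text.lower())
    # Drop filler words and rejoin the rest with single spaces.
    tokens = [tok for tok in spaced.split() if tok not in FILLER_WORDS]
    return " ".join(tokens)


print(clean_name_sketch(" The state OF idaho "))  # idaho
print(clean_name_sketch("new_york!"))             # new york
```

Because the helper is standalone, a caller would opt in explicitly, for example with `lookup(clean_name(raw))`.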
Summary
Adds a `clean_name` text-normalization helper and a pluggable `fallback_func` mechanism to `us.states.lookup`, plus a cache-hit short-circuit.

Closes #59
What changed
- `clean_name(text)`: strips an incoming string down to a bare state name. It lowercases, replaces punctuation and underscores with spaces, drops the filler words `the`, `commonwealth`, `state`, and `of`, and recombines the remaining tokens into a single space-separated string. For example, `" The state OF idaho "` becomes `"idaho"`. It is a standalone helper; `lookup` does not call it automatically, so callers opt in with `lookup(clean_name(raw))`.
- `fallback_func` argument on `lookup`: `lookup` accepts an optional `fallback_func`, defaulting to `None`. When the normal FIPS/abbreviation/metaphone matching finds nothing, `fallback_func(val)` is called with the original, unmodified lookup value and may return a `State` or `None`. The fallback decides for itself what to match against. Existing behavior is unchanged when no fallback is passed.
- `startswith_fallback(val)`: a ready-made `fallback_func` that matches `val` against the start of each state's or territory's name, case-insensitively, returning the first match (or `None`, including for empty input).

Approach notes
- `lookup` overwrites its local `val` while matching (upper-casing abbreviations, metaphone-encoding names). The original value is captured up front and passed to `fallback_func`, so fallbacks receive the human-readable string rather than a metaphone code.
- `lookup` now returns immediately on a cache hit instead of scanning the full state list afterward; the extra scan was a pre-existing inefficiency.

Testing
Adds tests for `clean_name`, `startswith_fallback`, fallback resolution and caching semantics, and the cache-hit short-circuit. `uv run pytest .` passes (30 passed, 1 skipped; the skip is a pre-existing network test); `ruff check` and `ruff format --check` are clean.
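For readers who want to see the fallback and caching semantics in one place, the following is a rough sketch, not the library's code. The `State` dataclass, the three-entry `STATES` table, the `_cache` layout, and every `*_sketch` name are hypothetical. The real `lookup` also does FIPS and metaphone matching, which is replaced here by a plain exact match on name or abbreviation.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional, Tuple


@dataclass(frozen=True)
class State:
    """Hypothetical stand-in for the library's State class."""
    name: str
    abbr: str


# Tiny illustrative table; the real library covers all states and territories.
STATES = [State("Idaho", "ID"), State("Illinois", "IL"), State("Iowa", "IA")]

Fallback = Callable[[str], Optional[State]]
# Assumed cache layout: keyed by (original value, fallback or None).
_cache: Dict[Tuple[str, Optional[Fallback]], State] = {}


def startswith_fallback_sketch(val: str) -> Optional[State]:
    """Return the first state whose name starts with val, ignoring case."""
    if not val:
        return None  # empty input never matches
    prefix = val.lower()
    for state in STATES:
        if state.name.lower().startswith(prefix):
            return state
    return None


def lookup_sketch(val: str, fallback_func: Optional[Fallback] = None) -> Optional[State]:
    original = val  # captured before val is normalized below
    # Return immediately on a cache hit: plain key first, then the
    # fallback-specific key, so fallback results cannot leak elsewhere.
    for key in ((original, None), (original, fallback_func)):
        if key in _cache:
            return _cache[key]
    val = val.strip().upper()  # simplified stand-in for the real matching
    for state in STATES:
        if val in (state.abbr, state.name.upper()):
            _cache[(original, None)] = state
            return state
    if fallback_func is None:
        return None
    result = fallback_func(original)  # the fallback sees the original value
    if result is not None:
        _cache[(original, fallback_func)] = result
    return result


print(lookup_sketch("Ida"))  # None: no fallback supplied
print(lookup_sketch("Ida", fallback_func=startswith_fallback_sketch).name)  # Idaho
print(lookup_sketch("Ida"))  # None again: the cached fallback hit does not leak
```

Keying the cache on the fallback function as well as the value is one simple way to get the isolation the commit message describes; the PR may build its key differently.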