Add clean_name utility and fallback_func support to lookup#95

Merged
jcarbaugh merged 2 commits into main from APR-27-add-clean-name-and-fallback-func on May 19, 2026

jcarbaugh commented May 17, 2026

Summary

Adds a clean_name text-normalization helper and a pluggable fallback_func mechanism to us.states.lookup, plus a cache-hit short-circuit.

Closes #59

What changed

clean_name(text) — strips an incoming string down to a bare state name: lowercases, replaces punctuation and underscores with spaces, drops the filler words the, commonwealth, state, and of, and recombines the remaining tokens into a single space-separated string. For example, " The state OF idaho " becomes "idaho". It is a standalone helper — lookup does not call it automatically, so callers opt in with lookup(clean_name(raw)).
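The normalization steps described above can be sketched roughly as follows. This is an illustration built only from the description, not the merged implementation; the name `clean_name_sketch`, the `FILLER_WORDS` constant, and the regular expression are assumptions.

```python
import re

# Filler words as listed in the description above (assumed to be the full set).
FILLER_WORDS = {"the", "commonwealth", "state", "of"}


def clean_name_sketch(text: str) -> str:
    """Reduce free-form text to a bare, lowercased state name."""
    lowered = text.lower()
    # Replace punctuation and underscores with spaces. \w matches "_",
    # so the underscore needs its own alternative.
    spaced = re.sub(r"[^\w\s]|_", " ", lowered)
    # split() with no argument also collapses runs of whitespace.
    tokens = [tok for tok in spaced.split() if tok not in FILLER_WORDS]
    return " ".join(tokens)


print(clean_name_sketch(" The state OF idaho "))  # idaho
```

Because the helper is opt-in, a caller would presumably pass its result on explicitly, as in lookup(clean_name(raw)).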

fallback_func argument on lookup: lookup accepts an optional fallback_func, defaulting to None. When the normal FIPS/abbreviation/metaphone matching finds nothing, fallback_func(val) is called with the original, unmodified lookup value and may return a State or None. The fallback decides for itself what to match against. Existing behavior is unchanged when no fallback is passed.
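A toy model of that control flow, assuming a simplified matcher in place of the real FIPS/abbreviation/metaphone logic. The `State` dataclass, `STATES` list, and `lookup_sketch` are hypothetical stand-ins for illustration, not the library's code.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass(frozen=True)
class State:
    name: str
    abbr: str


# Hypothetical two-entry table standing in for the full state list.
STATES = [State("Idaho", "ID"), State("New York", "NY")]


def lookup_sketch(
    val: str,
    fallback_func: Optional[Callable[[str], Optional[State]]] = None,
) -> Optional[State]:
    original = val  # captured before any normalization
    val = val.strip().upper()
    for state in STATES:
        if val in (state.abbr, state.name.upper()):
            return state
    # Nothing matched: hand the untouched input to the fallback, if any.
    if fallback_func is not None:
        return fallback_func(original)
    return None


def always_new_york(raw: str) -> Optional[State]:
    # Hypothetical fallback used only to show what the fallback receives.
    print("fallback received:", repr(raw))
    return STATES[1]


print(lookup_sketch("id"))                               # normal match, no fallback call
print(lookup_sketch("new york state"))                   # None: no fallback passed
print(lookup_sketch("new york state", always_new_york))  # resolved by the fallback
```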

startswith_fallback(val) — a ready-made fallback_func that matches val against the start of each state's or territory's name, case-insensitively, returning the first match (or None, including for empty input).
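One plausible reading of this fallback, sketched under the assumption that a state matches when its name begins with val (the description does not spell out the direction of the comparison). The `State` dataclass and the three-entry list are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class State:
    name: str


# Hypothetical sample; the real list would cover every state and territory.
STATES_AND_TERRITORIES = [State("New Mexico"), State("New York"), State("Guam")]


def startswith_fallback_sketch(val: str) -> Optional[State]:
    # Every string starts with "", so empty input must be rejected explicitly.
    if not val:
        return None
    prefix = val.lower()
    for state in STATES_AND_TERRITORIES:
        if state.name.lower().startswith(prefix):
            return state  # first match wins, so list order matters
    return None


print(startswith_fallback_sketch("new y"))  # State(name='New York')
print(startswith_fallback_sketch("NEW"))    # State(name='New Mexico')
print(startswith_fallback_sketch(""))       # None
```

Since the first match is returned, an ambiguous prefix such as "new" resolves to whichever candidate comes first in the list.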

Approach notes

  • lookup overwrites its local val as it normalizes the input (upper-casing abbreviations, metaphone-encoding names). The original value is captured up front and passed to fallback_func, so fallbacks receive the human-readable string rather than a metaphone code.
  • lookup now returns immediately on a cache hit instead of scanning the full state list afterward — a pre-existing inefficiency.
  • Fallback hits are cached under a fallback-specific key, so a cached fallback result is never returned to a lookup that passes a different fallback or none at all.
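The caching behavior in the notes above might look something like this. It is a sketch under assumptions: the key shapes, the `_cache` dict, the `stats` counter, and the toy `NAMES` table are all invented for illustration, and the real cache keys may differ.

```python
from typing import Callable, Dict, Hashable, Optional

NAMES = {"ID": "Idaho", "NY": "New York"}  # toy abbreviation table

_cache: Dict[Hashable, str] = {}
stats = {"scans": 0}  # counts full scans so the short-circuit is observable


def cached_lookup(
    val: str,
    fallback_func: Optional[Callable[[str], Optional[str]]] = None,
) -> Optional[str]:
    plain_key = ("plain", val)
    # Including the function object keeps each fallback's hits separate.
    fallback_key = ("fallback", fallback_func, val)

    # Return immediately on a cache hit instead of scanning afterward.
    if plain_key in _cache:
        return _cache[plain_key]
    if fallback_func is not None and fallback_key in _cache:
        return _cache[fallback_key]

    stats["scans"] += 1
    result = NAMES.get(val.upper())
    if result is not None:
        _cache[plain_key] = result
        return result

    if fallback_func is not None:
        result = fallback_func(val)
        if result is not None:
            _cache[fallback_key] = result
    return result


def guess_new_york(raw: str) -> Optional[str]:
    return "New York"  # hypothetical fallback


print(cached_lookup("empire", guess_new_york))  # New York (via fallback)
print(cached_lookup("empire"))                  # None: the fallback hit does not leak
```

Keying on the function object means two distinct callables never share entries, at the cost that a lambda created fresh on each call will not benefit from the cache.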

Testing

Adds tests for clean_name, startswith_fallback, fallback resolution and caching semantics, and the cache-hit short-circuit. uv run pytest . passes (30 passed, 1 skipped — a pre-existing network test); ruff check and ruff format --check are clean.

Add a `clean_name` helper that strips incoming text down to a bare
state name, and a pluggable `fallback_func` mechanism for `lookup`.

- `clean_name(text)`: removes punctuation/underscores and the filler
  words "the", "state", and "of", returning a lowercased,
  space-separated string. Standalone helper; `lookup` does not call it.
- `lookup(..., fallback_func=None)`: when no match is found, calls
  `fallback_func(val)` with the original, unmodified value. The
  fallback decides what to match against.
- `startswith_fallback(val)`: a ready-made fallback matching `val`
  against the start of each state's name, case-insensitively.
- `lookup` now returns immediately on a cache hit instead of scanning
  the full state list afterward.
- Fallback hits are cached under a fallback-specific key, so a cached
  fallback result never leaks into a no-fallback or different-fallback
  lookup of the same value.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
jcarbaugh closed this May 17, 2026
jcarbaugh deleted the APR-27-add-clean-name-and-fallback-func branch May 17, 2026 16:55
jcarbaugh restored the APR-27-add-clean-name-and-fallback-func branch May 19, 2026 21:50
jcarbaugh reopened this May 19, 2026
jcarbaugh self-assigned this May 19, 2026
jcarbaugh merged commit 9bedf0f into main May 19, 2026
9 checks passed
jcarbaugh deleted the APR-27-add-clean-name-and-fallback-func branch May 19, 2026 22:21
Development

Successfully merging this pull request may close these issues.

us.states.lookup('New York State') returns None
