diff --git a/README.md b/README.md
index 79c60fe0..66dd0810 100644
--- a/README.md
+++ b/README.md
@@ -8,7 +8,15 @@ Multiple results may be returned representing possible conceptual matches, but a
Note that the results returned by this service have been conflated using both GeneProtein and DrugChemical conflation; you can read more about this at the [Conflation documentation](https://github.com/NCATSTranslator/Babel/blob/master/docs/Conflation.md).
-* See this [Jupyter Notebook](documentation/NameResolution.ipynb) for examples of use.
-* See the [API documentation](documentation/API.md) for information about the NameRes API.
-* See [Scoring](documentation/Scoring.md) for information about the scoring algorithm used by NameRes.
-* See [Deployment](documentation/Deployment.md) for instructions on deploying NameRes.
+## Getting started
+
+The best place to start is the Jupyter Notebook, which walks through the most common use cases with live examples:
+
+* [Jupyter Notebook](documentation/NameResolution.ipynb) ([open in Colab](https://colab.research.google.com/github/NCATSTranslator/NameResolution/blob/master/documentation/NameResolution.ipynb)): interactive examples covering lookup, filtering, autocomplete, bulk lookup, and synonyms
+
+## Documentation
+
+* [Translator Guide](documentation/TranslatorGuide.md) — what to do when results are unexpected, when to use `/synonyms` vs. NodeNorm, and performance tips
+* [API documentation](documentation/API.md) — full reference for all NameRes endpoints
+* [Scoring](documentation/Scoring.md) — how NameRes scores and ranks results
+* [Deployment](documentation/Deployment.md) — instructions for deploying NameRes
diff --git a/api/resources/openapi.yml b/api/resources/openapi.yml
index 07313170..6a9ac0dc 100644
--- a/api/resources/openapi.yml
+++ b/api/resources/openapi.yml
@@ -13,8 +13,10 @@ info:
have been correctly normalized using the
Node Normalization service.
You can read more
about this API on the NameResolution GitHub repository.
- Note that the returned by this service have been conflated using both GeneProtein and DrugChemical conflation;
- you can read more about this at the Conflation documentation.'
+ Note that the results returned by this service have been conflated using both GeneProtein and DrugChemical
+ conflation; you can read more about this at the
+ Conflation documentation.
+ The active conflations for any deployment can be discovered via the /status endpoint.
'
license:
name: MIT
url: https://opensource.org/licenses/MIT
diff --git a/api/server.py b/api/server.py
index 2a2277b6..e6e3cb5e 100755
--- a/api/server.py
+++ b/api/server.py
@@ -71,6 +71,11 @@ async def status() -> Dict:
babel_version = os.environ.get("BABEL_VERSION", "unknown")
babel_version_url = os.environ.get("BABEL_VERSION_URL", "")
+ # Which conflations are active in this deployment? Baked in at data-loading time.
+ conflations_raw = os.environ.get("CONFLATIONS", "GeneProtein,DrugChemical")
+ conflations = [c.strip() for c in conflations_raw.split(",") if c.strip()]
+ conflation_url = "https://github.com/NCATSTranslator/Babel/blob/main/docs/Conflation.md"
+
# Look up the BIOLINK_MODEL_TAG.
# Note: this should be a tag from the Biolink Model repo, e.g. "master" or "v4.3.6".
biolink_model_tag = os.environ.get("BIOLINK_MODEL_TAG", "master")
@@ -101,6 +106,8 @@ async def status() -> Dict:
'url': biolink_model_url,
'download_url': biolink_model_download_url,
},
+ 'conflations': conflations,
+ 'conflation_url': conflation_url,
'nameres_version': nameres_version,
'startTime': core['startTime'],
'numDocs': index.get('numDocs', ''),
@@ -122,6 +129,8 @@ async def status() -> Dict:
'url': biolink_model_url,
'download_url': biolink_model_download_url,
},
+ 'conflations': conflations,
+ 'conflation_url': conflation_url,
'nameres_version': nameres_version,
}
diff --git a/documentation/API.md b/documentation/API.md
index 57bcbdea..8db89f68 100644
--- a/documentation/API.md
+++ b/documentation/API.md
@@ -91,13 +91,17 @@ The Name Resolver largely consists of two [search endpoints](#search-endpoints):
## Conflation
Unlike the Node Normalizer, the Name Resolution Service does not currently support on-the-fly conflation. Instead,
-all the [Babel conflations](https://github.com/NCATSTranslator/Babel/blob/master/docs/Conflation.md) are turned on when Solr database is built. At the moment, this includes:
-* GeneProtein conflation: protein-encoding genes are conflated with the protein(s) they encode, and the gene identifier
- is used to identify this concept. Therefore, if you search for ""
-* DrugChemical conflation: drugs are conflated with their active ingredient, and the identifier for the active ingredient
- is used to identify this concept.
-This means that -- for example -- protein-encoding genes will include the synonyms found
-for the protein they encode, and that no separate entry will be available for those proteins.
+all the [Babel conflations](https://github.com/NCATSTranslator/Babel/blob/main/docs/Conflation.md) are baked in when the Solr database is built. At the moment, this includes:
+* **GeneProtein conflation:** protein-encoding genes are conflated with the protein(s) they encode, and the gene identifier
+ is used to identify this concept. Therefore, if you search for a protein name, you will typically receive the gene
+ identifier (e.g., searching for "dystrophin" returns `NCBIGene:1756` rather than a UniProtKB identifier).
+* **DrugChemical conflation:** drugs are conflated with their active ingredient, and the identifier for the active
+ ingredient is used to identify this concept.
+
+This means that protein-encoding genes include the synonyms found for the protein they encode, and no separate
+entry is available for those proteins in NameRes.
+
+The active conflations for any NameRes deployment can be queried programmatically via the [`/status` endpoint](#status).
Once you have an identifier from Name Resolver, you can use the [Node Normalizer](https://nodenormalization-sri.renci.org/)
to look up the equivalent identifiers for that CURIE with and without conflation. Please use the Node Normalizer
@@ -325,6 +329,8 @@ Solr database.
"url": "https://github.com/biolink/biolink-model/tree/v4.2.6-rc5",
"download_url": "https://raw.githubusercontent.com/biolink/biolink-model/v4.2.6-rc5/biolink-model.yaml"
},
+ "conflations": ["GeneProtein", "DrugChemical"],
+ "conflation_url": "https://github.com/NCATSTranslator/Babel/blob/main/docs/Conflation.md",
"nameres_version": "v1.5.1",
"startTime": "2025-12-19T11:53:09.638Z",
"numDocs": 425583391,
diff --git a/documentation/TranslatorGuide.md b/documentation/TranslatorGuide.md
new file mode 100644
index 00000000..763412cf
--- /dev/null
+++ b/documentation/TranslatorGuide.md
@@ -0,0 +1,185 @@
+# NameRes Translator Guide
+
+This guide is aimed at Translator developers and users who are integrating NameRes into their workflows.
+It covers what to do when results are unexpected, how `/synonyms` (reverse-lookup) relates to NodeNorm,
+and tips for improving performance.
+
+## What to do when a name lookup returns unexpected results
+
+NameRes ranks results by a [Solr TF*IDF score](./Scoring.md) — the top result is the best *textual* match,
+not necessarily the biologically intended concept. If the results don't look right, try these steps.
+
+### 1. Use `highlighting` to understand what matched
+
+Set `highlighting=true` on a `/lookup` call to see which label or synonym drove the match:
+
+```
+GET /lookup?string=cold&highlighting=true&limit=5
+```
+
+The matched text usually explains why an unexpected concept ranked highly (for example, a synonym shared with an unrelated concept).
+
+### 2. Filter by Biolink type
+
+Use `biolink_type` to restrict results to the category you expect. Multiple types are combined with OR logic:
+
+```
+GET /lookup?string=cold&biolink_type=Disease&biolink_type=PhenotypicFeature
+```
+
+Common types: `Disease`, `Gene`, `ChemicalEntity`, `PhenotypicFeature`, `BiologicalProcess`, `AnatomicalEntity`.
+Types can be specified with or without the `biolink:` prefix.
+
+### 3. Restrict to trusted prefixes
+
+Use `only_prefixes` to limit results to a specific ontology, or `exclude_prefixes` to drop a noisy one.
+Prefixes are pipe-separated and case-sensitive:
+
+```
+# Only MONDO disease identifiers
+GET /lookup?string=diabetes&biolink_type=Disease&only_prefixes=MONDO
+
+# Exclude UMLS (often produces many ambiguous matches)
+GET /lookup?string=NIH&exclude_prefixes=UMLS
+```
+
+Common trusted prefixes by category:
+
+| Category | Recommended prefixes |
+|---|---|
+| Disease | `MONDO`, `OMIM`, `orphanet` |
+| Gene | `NCBIGene`, `HGNC` |
+| Chemical/Drug | `CHEBI`, `DRUGBANK` |
+| Phenotype | `HP`, `MP` |
+| Anatomy | `UBERON`, `CL` |
+
+### 4. Filter by taxon for gene/protein queries
+
+When searching for a gene or protein, results may include entries from multiple species. Use `only_taxa`
+to restrict to a specific organism. The value is a pipe-separated list of NCBI Taxon CURIEs:
+
+```
+# Human genes only
+GET /lookup?string=APOE&biolink_type=Gene&only_taxa=NCBITaxon:9606
+
+# Human and mouse
+GET /lookup?string=APOE&only_taxa=NCBITaxon:9606|NCBITaxon:10090
+```
+
+Common taxa: human `NCBITaxon:9606`, mouse `NCBITaxon:10090`, rat `NCBITaxon:10116`, zebrafish `NCBITaxon:7955`.
+
+### 5. Try autocomplete mode for partial strings
+
+If your search string is a fragment of a name (e.g., typed by a user mid-word), set `autocomplete=true`.
+This expands the final word with a wildcard so that `"diab"` matches `"diabetes"`, `"diabetic"`, etc.:
+
+```
+GET /lookup?string=diab&autocomplete=true&limit=5
+```
+
+Without `autocomplete`, `"diab"` will only match documents that literally contain the token `"diab"`.
+
+### 6. If the correct concept is consistently missing
+
+If your filtering is correct but the expected result never appears, the concept may be missing from the
+Babel data that NameRes is built from. Consider filing an issue on:
+- [NameRes GitHub](https://github.com/NCATSTranslator/NameResolution/issues) — for search/ranking problems
+- [Babel GitHub](https://github.com/NCATSTranslator/Babel/issues) — for missing synonyms or identifiers
+
+---
+
+## Using `/synonyms` (reverse-lookup) vs. NodeNorm
+
+These two services answer different questions.
+
+### Use `/synonyms` when you want to inspect synonyms for a known CURIE
+
+The `/synonyms` endpoint returns all names and synonyms that NameRes knows for a given concept, along with
+its Biolink types, taxa, and clique identifier count. This is useful for verifying synonym coverage or
+debugging why a particular name did or did not match.
+
+```
+GET /synonyms?preferred_curies=NCBIGene:1756
+```
+
+**Important:** `/synonyms` requires the *preferred* (normalized) CURIE. If you pass a non-preferred
+identifier (e.g., a UniProtKB accession for a gene), you will get an empty result. Before calling
+`/synonyms`, normalize your CURIE with NodeNorm (see below).
+
+You can look up multiple CURIEs in one request:
+
+```
+GET /synonyms?preferred_curies=MONDO:0005148&preferred_curies=NCBIGene:1756
+```
+
+### Use NodeNorm when you need identifier normalization or equivalent identifiers
+
+The [Node Normalization service](https://nodenormalization-sri.renci.org/) is the right tool when you need to:
+
+- Convert a non-preferred identifier to its preferred CURIE
+- Find all equivalent identifiers for a concept across ontologies
+- Check which Biolink types a CURIE maps to
+- Determine whether two CURIEs refer to the same concept
+
+To normalize a CURIE before passing it to `/synonyms`, call NodeNorm with GeneProtein and DrugChemical
+conflation enabled (to match the conflation used by NameRes):
+
+```
+GET https://nodenormalization-sri.renci.org/get_normalized_nodes?curie=UniProtKB:A0A0S2Z3B5&conflate=true&drug_chemical_conflate=true
+```
+
+In the response, which is keyed by input CURIE, the `id.identifier` field is the preferred CURIE to pass to `/synonyms`.
+
+### Quick decision guide
+
+| Question | Tool |
+|---|---|
+| What synonyms does NameRes know for this CURIE? | `/synonyms` |
+| What is the preferred identifier for this concept? | NodeNorm |
+| Are these two CURIEs equivalent? | NodeNorm |
+| What Biolink types does this CURIE have? | NodeNorm |
+| Why didn't a particular name match in `/lookup`? | `/synonyms` + `highlighting` |
+| Which conflations are active in this NameRes deployment? | `/status` (`conflations` field) |
+
+---
+
+## Performance tips
+
+### Batch multiple queries with `/bulk-lookup`
+
+Instead of making N separate `/lookup` calls, send them all in one POST request to `/bulk-lookup`.
+It returns a dictionary keyed by input string:
+
+```
+POST /bulk-lookup
+{
+ "strings": ["diabetes", "hypertension", "asthma"],
+ "limit": 5,
+ "biolink_types": ["Disease"]
+}
+```
+
+A single request avoids the per-request HTTP overhead of N sequential calls.
+
+### Add filters before processing results
+
+Apply `biolink_type`, `only_prefixes`, and `only_taxa` at query time rather than filtering the response
+yourself. Server-side filtering reduces the result set before it is serialized and transmitted.
+
+### Set `limit` to what you actually need
+
+The default `limit` is 10 and the maximum is 1000. If you only need the top result, set `limit=1`.
+If you need to page through a large result set, use `offset` for server-side pagination rather than
+requesting a large `limit` and slicing client-side.
+
+### Cache results between Babel data releases
+
+NameRes results are stable between Babel data releases (which happen a few times per year). If your
+application calls NameRes repeatedly for the same input strings, cache the results locally. Check the
+`/status` endpoint to detect when the Babel version changes and invalidate your cache accordingly:
+
+```
+GET /status
+```
+
+The `babel_version` field in the response changes with each data release.
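
The cache-invalidation advice above can be sketched in Python as follows. This is a minimal illustration, not part of NameRes: the class name `BabelVersionedCache` and the `fetch_status` and `fetch_lookup` callables are hypothetical stand-ins for whatever HTTP client code wraps `GET /status` and `GET /lookup`, and the version strings in the demonstration are placeholders.

```python
from typing import Any, Callable, Dict, Optional


class BabelVersionedCache:
    """Cache lookup results and discard them when `babel_version` changes.

    `fetch_status` and `fetch_lookup` are hypothetical callables supplied by
    the caller, e.g. thin wrappers around GET /status and GET /lookup.
    """

    def __init__(
        self,
        fetch_status: Callable[[], Dict[str, Any]],
        fetch_lookup: Callable[[str], Any],
    ) -> None:
        self._fetch_status = fetch_status
        self._fetch_lookup = fetch_lookup
        self._babel_version: Optional[str] = None
        self._results: Dict[str, Any] = {}

    def refresh_version(self) -> bool:
        """Re-read the status; clear the cache if the Babel version changed."""
        current = self._fetch_status().get("babel_version", "unknown")
        changed = self._babel_version is not None and current != self._babel_version
        if changed:
            self._results.clear()
        self._babel_version = current
        return changed

    def lookup(self, string: str) -> Any:
        """Return a cached result, calling the service only on a cache miss."""
        if string not in self._results:
            self._results[string] = self._fetch_lookup(string)
        return self._results[string]


# Demonstration with in-memory stand-ins instead of real HTTP calls.
status = {"babel_version": "placeholder-release-1"}
calls = []


def fake_lookup(string: str) -> Any:
    calls.append(string)  # record every call that reaches the "service"
    return [{"curie": "EXAMPLE:1", "label": string}]


cache = BabelVersionedCache(lambda: status, fake_lookup)
cache.refresh_version()
cache.lookup("diabetes")
cache.lookup("diabetes")  # served from the cache; fake_lookup is not called
status["babel_version"] = "placeholder-release-2"
cache.refresh_version()   # version changed, so the cache is cleared
cache.lookup("diabetes")  # fetched again after invalidation
```

How often to call `refresh_version` is a design choice; once per batch or on a timer is likely enough given how rarely the data changes. If the deployment reports `babel_version` as `unknown`, this check cannot detect a new release, so a time-based expiry would be needed instead.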