Implement topic resolver and optimize cache loading #1908

Open
keyurva wants to merge 7 commits into datacommonsorg:master from keyurva:resolve-topics

@keyurva keyurva commented May 22, 2026

  • Adds a new resolver = "topic" option to the /v2/resolve API, plus an expand_topics parameter that requests recursive expansion of topic hierarchies.
  • Integrates dynamic, incremental SV metadata caching into the topic cache manager.
  • Excludes topics and SVPGs whose list properties are not populated in the database.
  • Recursively prunes dangling parent-child tree references when a child target is empty or was skipped.
  • Works around Spanner's edge pagination truncation limit globally by splitting multi-node requests into chunks and fetching the chunks in parallel.
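
For readers unfamiliar with the pruning rule, here is a minimal sketch of how recursive pruning of empty children can work. It is an illustration under stated assumptions, not the code in this PR: the `topicNode` type, its fields, and the `prune` function are hypothetical names, and the real topic cache may represent topics and SVPGs differently.

```go
package main

// topicNode is a hypothetical, simplified stand-in for a topic or SVPG entry.
type topicNode struct {
	dcid     string
	vars     []string     // direct statistical variable members
	children []*topicNode // child topics or SVPGs
}

// prune works bottom-up: it drops every child that ends up with neither
// variables nor surviving children, then reports whether n itself still
// has any content. A parent whose children were all dropped is therefore
// dropped by its own parent in turn.
func prune(n *topicNode) bool {
	var kept []*topicNode
	for _, c := range n.children {
		if prune(c) {
			kept = append(kept, c)
		}
	}
	n.children = kept
	return len(n.vars) > 0 || len(n.children) > 0
}
```

Because the check runs after the recursive call, a chain of otherwise empty intermediate nodes collapses in a single pass.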

Verification / Testing Reference

1. Topic Resolver (expand_topics = false)

Returns immediate sub-topic and variable candidates:

curl -X POST -H "Content-Type: application/json" -H "X-Skip-Cache: true" -d '{
  "nodes": ["dc/topic/Root"],
  "resolver": "topic",
  "expand_topics": false
}' http://localhost:8081/v2/resolve | jq

2. Topic Resolver (expand_topics = true)

Recursively expands the entire nested hierarchy cache:

curl -X POST -H "Content-Type: application/json" -H "X-Skip-Cache: true" -d '{
  "nodes": ["dc/topic/Root"],
  "resolver": "topic",
  "expand_topics": true
}' http://localhost:8081/v2/resolve | jq

3. Embeddings/Indicator Resolver (expand_topics = false)

Performs semantic embeddings search and returns immediate candidates:

curl -X POST -H "Content-Type: application/json" -H "X-Skip-Cache: true" -d '{
  "nodes": ["health"],
  "resolver": "indicator",
  "expand_topics": false
}' http://localhost:8081/v2/resolve | jq

4. Embeddings/Indicator Resolver (expand_topics = true)

Performs semantic search and recursively expands matched topic hierarchies:

curl -X POST -H "Content-Type: application/json" -H "X-Skip-Cache: true" -d '{
  "nodes": ["health"],
  "resolver": "indicator",
  "expand_topics": true
}' http://localhost:8081/v2/resolve | jq

@keyurva keyurva changed the title from "Implement topic resolver and optimize Spanner cache loading" to "Implement topic resolver and optimize cache loading" May 22, 2026

@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces a new topic resolver for explicit topic tree navigation and enhances the indicator resolver to support recursive topic expansion. Key changes include the implementation of a TopicExpander interface, a read-through cache for Statistical Variable metadata in TopicCacheManager, and parallelized node fetching in the datasource layer to improve performance and avoid database limits. Feedback focuses on optimizing the cache implementation to prevent stampedes, improving recursive slice allocations, and ensuring robust error handling and logging across the new topic expansion logic.

Comment thread internal/server/topic/topic_cache.go
Comment thread internal/server/topic/topic_cache.go Outdated
Comment thread internal/server/topic/expansion.go Outdated
Comment thread internal/server/topic/expansion.go Outdated
Comment thread internal/server/topic/expansion.go Outdated
Comment thread internal/server/v2/resolve/embeddings.go Outdated
@keyurva keyurva requested a review from clincoln8 May 22, 2026 17:56

@clincoln8 clincoln8 left a comment

Thanks Keyur! just a few nits, feel free to ignore

)

// StatVarInfo stores property metadata for a Statistical Variable.
type StatVarInfo struct {

do we want description in this?

// WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
// See the License for the specific language governing permissions and
// limitations under the License.

can you add a file level comment for this one to explain what this adapter is for. if this is intended to only ever be an adapter for the topic fetch, then consider renaming the file to be more specific.

@@ -46,7 +46,7 @@
) (*pbv2.ResolveResponse, error) {
// TODO: Remove this once embeddings search (resolver == "indicator") are

maybe update the TODO to also mention topic resolution being supported in spanner

ExpandRoots(ctx context.Context, expandTopics bool) ([]*pbv2.ResolveResponse_Entity_Candidate, error)
ExpandTopic(ctx context.Context, topicDcid string, expandTopics bool) ([]*pbv2.ResolveResponse_Entity_Candidate, error)
GetTopicDisplayName(ctx context.Context, topicDcid string) string
GetSVPropertyInfos(ctx context.Context, svDcids []string) (map[string]SVPropertyInfo, error)

nit: thoughts on plural vs singular?

Suggested change
- GetSVPropertyInfos(ctx context.Context, svDcids []string) (map[string]SVPropertyInfo, error)
+ GetSVPropertyInfo(ctx context.Context, svDcids []string) (map[string]SVPropertyInfo, error)

Node: node,
}

candidates, err := topicExpander.ExpandTopic(ctx, node, expandTopics)

just to confirm, if one topic fails, then we return error for entire response and no partial response?


candidates, err := topicExpander.ExpandTopic(ctx, node, expandTopics)
if err != nil {
return nil, status.Errorf(codes.Internal, "Failed to expand topic %s: %v", node, err)

should we log here? or higher up in handler_v2 on error?

}

chunks := chunkNodes(nodes, fetchAllChunkSize)
responses, err := fetchChunksParallel(ctx, ds, req, chunks, pageSize)

should we add some sort of debug log for how often we're chunking requests?
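
One way such a debug log could sit alongside chunked parallel fetching is sketched below. This is an assumption-laden illustration, not the PR's `chunkNodes` or `fetchChunksParallel`: the `chunk` and `fetchParallel` functions, their signatures, and the use of the standard `log` package are all hypothetical.

```go
package main

import (
	"log"
	"sync"
)

// chunk splits nodes into consecutive slices of at most size elements.
func chunk(nodes []string, size int) [][]string {
	if size <= 0 {
		size = 1
	}
	var out [][]string
	for start := 0; start < len(nodes); start += size {
		end := start + size
		if end > len(nodes) {
			end = len(nodes)
		}
		out = append(out, nodes[start:end])
	}
	return out
}

// fetchParallel runs fetch once per chunk concurrently and returns the
// results in chunk order. If any chunk fails, the first error in chunk
// order is returned and the results are discarded.
func fetchParallel(chunks [][]string, fetch func([]string) ([]string, error)) ([][]string, error) {
	if len(chunks) > 1 {
		// Logged only when a request was actually split.
		log.Printf("debug: request split into %d chunks", len(chunks))
	}
	results := make([][]string, len(chunks))
	errs := make([]error, len(chunks))
	var wg sync.WaitGroup
	for i, c := range chunks {
		wg.Add(1)
		go func(i int, c []string) {
			defer wg.Done()
			results[i], errs[i] = fetch(c) // each goroutine writes its own index
		}(i, c)
	}
	wg.Wait()
	for _, err := range errs {
		if err != nil {
			return nil, err
		}
	}
	return results, nil
}
```

Logging the chunk count only when it exceeds one keeps the log quiet for small requests while still showing how often splitting happens.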

return n.GetName()
}
}
return ""

nit: log warning for a missing topic name?

localResp, remoteResp := <-localRespChan, <-remoteRespChan

// Note: merger.MergeResolve handles nil inputs (e.g. error handling or empty) gracefully
v2Resp := merger.MergeResolve(localResp, remoteResp)

@clincoln8 clincoln8 May 26, 2026

did you double check that internal/merger/merger.go:MergeResolve correctly merges the resolver=topic case?

particularly recursive merge on children
