Reproducible benchmark for LLM authority-reference reliability — deterministic metrics, error taxonomy, CSV outputs
benchmark linked-data reproducible-research semantic-web knowledge-graph digital-libraries llm-evaluation citation-hallucination authority-data
Updated May 29, 2026 - Python