Research notes and analysis on AI benchmarks, datasets, and evaluation methodologies.
- OfficeQA Benchmark Research — Deep dive into Databricks' grounded reasoning benchmark for enterprise document QA