Public-safe AgentV result artifacts for the financial-research-agent demo
project.
Source eval definitions live in EntityProcess/financial-research-agent. This repo
stores Dashboard-ready artifacts under .agentv/results/runs/ only. Before
pushing artifacts, run the public artifact preflight from agentv-deploy:

    python3 ../agentv-deploy/scripts/check-public-result-artifacts.py .

Writer credentials should come from RESULT_SYNC_GITHUB_TOKEN or local git/gh
auth and should be scoped only to this result repository where possible. Reader
mode is anonymous HTTPS clone/pull.
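
The preflight-then-push flow described above could be wrapped in a small guard. The following is a minimal sketch, not part of agentv-deploy: the helper names (`resolve_writer_token`, `preflight_passes`, `publish_guard`) are hypothetical, and it assumes the preflight script exits with status 0 when artifacts are safe to publish and non-zero otherwise. It also assumes it is run from the root of this repository, with agentv-deploy checked out as a sibling directory, as the relative path in the command implies.

```python
import subprocess
from typing import Mapping, Optional, Sequence

# Command copied from the preflight instruction above.
PREFLIGHT_CMD: Sequence[str] = (
    "python3",
    "../agentv-deploy/scripts/check-public-result-artifacts.py",
    ".",
)


def resolve_writer_token(env: Mapping[str, str]) -> Optional[str]:
    """Return RESULT_SYNC_GITHUB_TOKEN if set and non-empty, else None.

    None means "fall back to local git/gh auth"; this helper does not
    inspect that local auth itself.
    """
    token = env.get("RESULT_SYNC_GITHUB_TOKEN", "").strip()
    return token or None


def preflight_passes(cmd: Sequence[str] = PREFLIGHT_CMD, cwd: str = ".") -> bool:
    """Run the preflight command and report whether it succeeded.

    Assumption: exit status 0 means the artifacts are public-safe.
    """
    return subprocess.run(list(cmd), cwd=cwd).returncode == 0


def publish_guard(env: Mapping[str, str], cmd: Sequence[str] = PREFLIGHT_CMD) -> str:
    """Return a short message naming the credential source, or raise if
    the preflight fails so that nothing gets pushed."""
    if not preflight_passes(cmd):
        raise RuntimeError("Preflight failed; not pushing artifacts.")
    if resolve_writer_token(env):
        return "push with RESULT_SYNC_GITHUB_TOKEN"
    return "push with local git/gh auth"
```

Calling `publish_guard(os.environ)` before `git push` would stop the publish step whenever the preflight reports a problem. Readers need none of this, since reader mode is an anonymous clone or pull.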
- 50-case Codex financial-research baseline — aggregate public baseline over 50 Dexter-adapted financial research questions.
- One-test Codex web-search baseline — early live plumbing check for the native Dexter
llm-grader rubric shape.
The source/eval repository also has a public narrative report: EntityProcess/financial-research-agent BASELINE_RESULTS.md.
- 50-case Codex financial-research baseline report — generated with
agentv results report .agentv/results/runs/age-14-task-bundle-dogfood/2026-06-10T08-35-26Z-age-14-codex --out docs/index.html. Published at https://entityprocess.github.io/financial-research-evals/.
- One-case Dexter Codex web baseline report — generated with
agentv results report .agentv/results/runs/av-zk0.3-dexter-codex-web-baseline/2026-06-10T04-04-57-866Z --out docs/dexter-baseline.html. Published at https://entityprocess.github.io/financial-research-evals/dexter-baseline.html.
GitHub Pages serves the docs/ directory as the project homepage plus secondary baseline pages. Both files are self-contained, read-only AgentV reports and do not require a Dashboard server.
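
If both reports need regenerating together, the two `agentv results report` commands listed above can be driven from one script. This is a minimal sketch under stated assumptions: `agentv` is on `PATH`, the script runs from the repository root, and the helper names (`report_command`, `regenerate_reports`) are illustrative rather than part of AgentV. The run directories and output paths are copied from the list above.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple

# (run directory, output file) pairs copied from the report list above.
REPORTS: Sequence[Tuple[str, str]] = (
    (
        ".agentv/results/runs/age-14-task-bundle-dogfood/"
        "2026-06-10T08-35-26Z-age-14-codex",
        "docs/index.html",
    ),
    (
        ".agentv/results/runs/av-zk0.3-dexter-codex-web-baseline/"
        "2026-06-10T04-04-57-866Z",
        "docs/dexter-baseline.html",
    ),
)


def report_command(run_dir: str, out_file: str) -> List[str]:
    """Build the argv for one `agentv results report` invocation."""
    return ["agentv", "results", "report", run_dir, "--out", out_file]


def regenerate_reports(
    reports: Sequence[Tuple[str, str]] = REPORTS,
    run: Callable[..., object] = subprocess.run,
) -> None:
    """Regenerate each report in order.

    With the default runner, check=True raises CalledProcessError on the
    first failing command, so later reports are not attempted.
    """
    for run_dir, out_file in reports:
        run(report_command(run_dir, out_file), check=True)
```

The `run` parameter exists only so the command list can be inspected without invoking the CLI; calling `regenerate_reports()` with no arguments runs the real commands.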