YANG YUBANG Alfonsobang

Hi, I'm Alfonsobang

I work on AI training data and financial agent evaluation, with a focus on LLM data quality, trajectory-aware evaluation, annotation systems, preference data, synthetic data, data governance, and financial-domain AI evaluation.

My public work is intentionally centered on resources that can be reviewed, reused, and improved without relying on private company data or proprietary workflows.

Current Focus

- Financial agent evaluation: search, exact data lookup, filing QA, toy backtesting, forecasting cutoffs, tool-use traces, and compliance-boundary tasks
- 2026 agent evaluation: trajectory-aware grading, repeated-trial metrics, verifier evidence, and process-safety analysis
- Training data quality engineering for LLM systems
- Dataset cleaning, deduplication, inspection, and documentation
- Annotation quality, agreement, adjudication, and reviewer calibration
- Human preference data, RLHF / DPO data, and synthetic data evaluation
- Financial-domain LLM benchmarks, risk-aware evaluation, and data governance

Public Projects

awesome-llm-training-data - A curated bilingual hub for LLM training data quality and financial agent evaluation, including Harbor workflows, Claw-style trajectory grading, public-data finance task specs, and deterministic verifier templates.

Current Public Work

- Maintaining Awesome LLM Training Data & Agent Evaluation.
- Tracking upstream documentation proposals for LLM data and agent evaluation workflows:
  - huggingface/datatrove#485 - dataset-audit example using filters, rejected-sample capture, metadata, and summary stats.
  - argilla-io/argilla#5861 - annotation QA workflow using guidelines, suggestions, filters, and adjudication.
  - harbor-framework/harbor#1700 - Claw-style trajectory-aware evaluation pattern with repeated attempts and safety evidence.

Open-source Principles

- Prefer primary sources, reproducible resources, and practical engineering value.
- Avoid private company data, real user data, and proprietary workflows.
- Treat financial-domain AI evaluation as a governance problem, not a leaderboard exercise.
- Make data quality work visible through documentation, checklists, issues, and small useful contributions.

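Chinese Introduction

I focus on AI training data and financial agent evaluation engineering. Key areas include LLM data quality, trajectory-aware evaluation, annotation systems, preference data, synthetic data, and data governance, as well as evaluation of financial search, data lookup, filing QA, backtesting, forecasting, and compliance boundaries.

My public projects use, wherever possible, public material that can be reviewed, reused, and continuously improved. They contain no private company data, real user data, or proprietary workflows.

I currently maintain Awesome LLM Training Data & Agent Evaluation as my main project, and I am gradually building up a financial agent evaluation topic framework, a roadmap, public-data task specs, Harbor-style task templates, deterministic verifiers, Claw-style trajectory evaluation notes, and repeated-run metric examples.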
