Pareta example datasets

Ready-to-run example eval sets for the task families in the Pareta model marketplace. Each folder contains an `items.jsonl` file (plus a `documents/` folder of real source documents for document tasks) that you can browse, download, or load in-app via "Try the example set".
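
For working with a downloaded folder locally, here is a minimal sketch of reading its `items.jsonl`. It assumes only that the file follows the usual JSON Lines convention (one JSON object per line). The helper name `load_items` is hypothetical, and the fields inside each item are not documented here, so the sketch does not assume any.

```python
import json
from pathlib import Path


def load_items(task_dir):
    """Read <task_dir>/items.jsonl and return a list of parsed JSON values.

    Hypothetical helper: assumes one JSON object per line and makes no
    assumption about which keys each item contains.
    """
    path = Path(task_dir) / "items.jsonl"
    items = []
    with path.open(encoding="utf-8") as fh:
        for line in fh:
            line = line.strip()
            if line:  # tolerate blank lines between records
                items.append(json.loads(line))
    return items
```

For example, `load_items("intent-classification")` would return the parsed items from that folder if it has been downloaded into the current directory.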

These are the same bundled examples that ship with the product. They are built from public benchmarks (synthetic, CC0, or licensed eval corpora), not customer data.

| Task | Metric | Source | Items | Docs |
| --- | --- | --- | --- | --- |
| `agent-airline` | Successful task | τ-bench airline | 10 | - |
| `agent-retail` | Successful task | τ-bench retail | 10 | - |
| `code-generation` | pass@1 | MBPP+ | 10 | - |
| `contract-canonical-fields` | F1 | Kleister-NDA | 10 | - |
| `contract-clause-enumeration` | F1 | CUAD | 10 | - |
| `contract-key-fields` | F1 | CUAD | 10 | - |
| `contract-long-doc-fact` | F1 | Kleister-Charity | 10 | - |
| `contract-ma-deal-points` | F1 | MAUD | 10 | - |
| `doc-qa-abstractive` | ANLS | DUDE | 10 | 10 |
| `doc-qa-extractive` | ANLS | DUDE + MP-DocVQA | 10 | 10 |
| `doc-qa-list` | ANLS | DUDE | 10 | 10 |
| `doc-qa-refusal` | NA-acc | DUDE | 10 | 10 |
| `emotion-classification` | F1 | GoEmotions | 10 | - |
| `form-receipt-extraction` | F1 | CORD-v2 + FUNSD + SROIE | 10 | 10 |
| `function-completion` | pass@1 | HumanEval+ | 10 | - |
| `hate-offensive` | F1 | Davidson | 10 | - |
| `intent-classification` | F1 | Banking77 | 10 | - |
| `intent-in-scope` | F1 | CLINC150 | 10 | - |
| `intent-multilingual` | F1 | MASSIVE | 10 | - |
| `invoice-extraction` | F1 | katanaml + FATURA2 | 10 | 10 |
| `phi-redaction` | F1 | MTSamples | 10 | - |
| `pii-detection` | F1 | ai4privacy | 10 | - |
| `text-to-api` | Syntax Match Accuracy | BFCL v3 | 10 | - |
| `text-to-sql` | Execution Accuracy | BIRD-SQL | 10 | - |
| `toxic-binary` | F1 | toxic-chat | 10 | - |
| `toxic-content-multilabel` | F1 | Jigsaw | 10 | - |
| `unknown-intent` | AUROC | CLINC150 OOS | 10 | - |
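
The table lists ANLS as the metric for most `doc-qa-*` tasks without defining it. As a rough illustration, the sketch below follows the commonly used definition of Average Normalized Levenshtein Similarity: per question, take the best similarity against any reference answer, zero it out when the normalized edit distance reaches a threshold of 0.5, then average over questions. This is an assumption about the standard metric, not a description of Pareta's scorer; the function names, the lowercase-and-strip normalization, and the default threshold are all illustrative choices.

```python
def levenshtein(a, b):
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the DP row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def anls(predictions, references, threshold=0.5):
    """Average Normalized Levenshtein Similarity (illustrative sketch).

    predictions: list of predicted answer strings.
    references: list of lists of acceptable answers, one list per question.
    """
    scores = []
    for pred, golds in zip(predictions, references):
        p = pred.strip().lower()  # assumed normalization
        best = 0.0
        for gold in golds:
            g = gold.strip().lower()
            denom = max(len(p), len(g))
            nl = levenshtein(p, g) / denom if denom else 0.0
            # Similarity counts only when the normalized distance is below the threshold.
            best = max(best, 1.0 - nl if nl < threshold else 0.0)
        scores.append(best)
    return sum(scores) / len(scores) if scores else 0.0
```

Under this definition an exact match (ignoring case and surrounding whitespace) scores 1.0, a near miss earns partial credit, and an answer that differs in half or more of its characters scores 0.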

Generated by scripts/build-example-datasets.py in the Pareta repo.
