Pareta-AI/example-datasets

Pareta example datasets

Ready-to-run example eval sets for the task families in the Pareta model marketplace. Each folder has an items.jsonl (and, for document tasks, a documents/ folder of real source documents) that you can browse, download, or load in-app via "Try the example set".

These are the same bundled examples the product ships — built from public benchmarks (synthetic / CC0 / licensed eval corpora), not customer data.
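For working with a set outside the app, the sketch below reads one task folder from a local clone. It assumes only what is described above: an items.jsonl file with one JSON object per line, plus an optional documents/ folder. The helper names `load_items` and `list_documents` are hypothetical, and the per-item fields are not documented here, so the code makes no assumption about them.

```python
import json
from pathlib import Path


def load_items(task_dir):
    """Parse each non-empty line of <task_dir>/items.jsonl as one JSON object."""
    items_path = Path(task_dir) / "items.jsonl"
    items = []
    with items_path.open(encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # skip blank lines rather than failing on them
                items.append(json.loads(line))
    return items


def list_documents(task_dir):
    """Return the files in <task_dir>/documents/, or [] if the folder is absent."""
    docs_dir = Path(task_dir) / "documents"
    if not docs_dir.is_dir():
        return []
    return sorted(p for p in docs_dir.iterdir() if p.is_file())
```

For example, `load_items("example-datasets/doc-qa-extractive")` would return the items for that task, assuming the repository is cloned into `example-datasets/` and each folder is named after its task. Printing the keys of the first item is a quick way to see a set's schema.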

| Task | Metric | Source | Items | Docs |
| --- | --- | --- | --- | --- |
| agent-airline | Successful task | τ-bench airline | 10 | |
| agent-retail | Successful task | τ-bench retail | 10 | |
| code-generation | pass@1 | MBPP+ | 10 | |
| contract-canonical-fields | F1 | Kleister-NDA | 10 | |
| contract-clause-enumeration | F1 | CUAD | 10 | |
| contract-key-fields | F1 | CUAD | 10 | |
| contract-long-doc-fact | F1 | Kleister-Charity | 10 | |
| contract-ma-deal-points | F1 | MAUD | 10 | |
| doc-qa-abstractive | ANLS | DUDE | 10 | 10 |
| doc-qa-extractive | ANLS | DUDE + MP-DocVQA | 10 | 10 |
| doc-qa-list | ANLS | DUDE | 10 | 10 |
| doc-qa-refusal | NA-acc | DUDE | 10 | 10 |
| emotion-classification | F1 | GoEmotions | 10 | |
| form-receipt-extraction | F1 | CORD-v2 + FUNSD + SROIE | 10 | 10 |
| function-completion | pass@1 | HumanEval+ | 10 | |
| hate-offensive | F1 | Davidson | 10 | |
| intent-classification | F1 | Banking77 | 10 | |
| intent-in-scope | F1 | CLINC150 | 10 | |
| intent-multilingual | F1 | MASSIVE | 10 | |
| invoice-extraction | F1 | katanaml + FATURA2 | 10 | 10 |
| phi-redaction | F1 | MTSamples | 10 | |
| pii-detection | F1 | ai4privacy | 10 | |
| text-to-api | Syntax Match Accuracy | BFCL v3 | 10 | |
| text-to-sql | Execution Accuracy | BIRD-SQL | 10 | |
| toxic-binary | F1 | toxic-chat | 10 | |
| toxic-content-multilabel | F1 | Jigsaw | 10 | |
| unknown-intent | AUROC | CLINC150 OOS | 10 | |

Generated by scripts/build-example-datasets.py in the Pareta repo.
