
Pass@k

/'pas.at.'kay/ (noun) A metric for tasks where the model is allowed k attempts per problem; a problem counts as solved if at least one of the k attempts is correct. Pass@k is common in coding and tool-using tasks.

Pass@k improved after we added a retry strategy.
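
As a rough illustration of the definition above, the sketch below computes pass@k in two ways. The function names and the input layout (one list of booleans per problem) are assumptions made for this example, not part of any particular eval library. The first function applies the definition directly. The second uses the combinatorial estimator 1 - C(n-c, k) / C(n, k), which is commonly used when n >= k samples are drawn per problem and c of them are correct; it is included here as one widely used option, not as the method this entry prescribes.

```python
from math import comb


def pass_at_k_any(attempts_per_problem):
    """Direct definition: share of problems with at least one correct attempt.

    attempts_per_problem is a hypothetical layout: one list per problem,
    holding k booleans (True means that attempt was correct).
    """
    if not attempts_per_problem:
        raise ValueError("need at least one problem")
    solved = sum(1 for attempts in attempts_per_problem if any(attempts))
    return solved / len(attempts_per_problem)


def pass_at_k_estimate(n, c, k):
    """Estimate pass@k for one problem from n samples, c of them correct.

    Computes 1 - C(n - c, k) / C(n, k): the probability that a random
    subset of k samples (out of the n drawn) contains a correct one.
    """
    if not (0 <= c <= n) or not (1 <= k <= n):
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # math.comb returns 0 when k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)


# Two problems, three attempts each: only the first problem is solved.
results = [[False, True, False], [False, False, False]]
print(pass_at_k_any(results))  # 0.5
```

Averaging pass_at_k_estimate over all problems gives a dataset-level score. A retry strategy that raises the chance of any single attempt succeeding would be expected to raise both numbers.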
