

Human review is a critical part of evaluating AI applications. While Braintrust helps you automatically evaluate AI software with scorers, human feedback provides essential ground truth and quality assessment. Braintrust integrates human feedback from end users, subject matter experts, and product teams in one place. Use human review to:
  • Evaluate and compare experiments
  • Assess the efficacy of automated scoring methods
  • Curate production logs into evaluation datasets
  • Label categorical data and provide corrections
  • Track quality trends over time

Configure review scores

Review scores let you collect structured feedback on spans and label dataset rows.
Human review scores are available only on Pro and Enterprise plans.
Configure scores in Settings > Project > Human review. See Configure human review for details on score types and options.

Assign rows for review

You can assign rows in logs, experiments, and datasets to team members for review, analysis, or follow-up action. Assignments are particularly useful for human review workflows, where you can assign specific rows that need human evaluation and distribute review work across multiple team members.

To assign a row to a team member from any table view (logs, experiments, or datasets):
  1. Select the row.
  2. Select Assign.
  3. Choose a member to assign.
Team members receive email notifications when rows are assigned to them.

Score traces and datasets

Go to the Review page and select the type of data to review:
  • Log spans: production traces and debugging sessions
  • Experiment spans: evaluation results and test runs
  • Dataset rows: test cases and examples
Then select a row and set scores. You can also add comments and tags while reviewing. When finished reviewing, click Complete review and continue to move to the next item in the queue, or use the Next row and Previous row buttons.
Not all score types appear on dataset rows. Only categorical/slider scores configured to “write to expected” and free-form scores are available for dataset reviews, since datasets store test data (input/expected pairs) rather than subjective quality assessments.

Filter review data

The Review page shows any spans that have been flagged for review within a given time range. Each project provides default table views with common filters, including:
  • Default view: Shows all records
  • Awaiting review: Shows only records flagged for review but not yet started
  • Assigned to me: Shows only records assigned to you for review
  • Completed: Shows only records that have finished review
Use the View menu to switch between views. You can also use the Filter menu to focus on specific subsets for review. Use the Basic tab for point-and-click filtering, or switch to SQL to write precise queries. For example, filter by scores (e.g., scores.Preference > 0.75) to find highly rated examples.
Built-in views (such as “All logs view”) cannot be modified, but you can create custom table views based on custom filters and display settings.
Use tags to mark items for “Triage”, then review them all at once.
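For example, a SQL-tab filter can combine several score conditions to surface spans that deserve a closer look. The score names below (Preference, Correctness) are illustrative; substitute whatever review scores your project defines:

```sql
-- Illustrative SQL-tab filter: spans rated highly on Preference but
-- poorly on Correctness, so a human should take a look.
scores.Preference > 0.75 AND scores.Correctness < 0.5
```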

Change the trace layout

While reviewing log and experiment traces, you see detailed information about the flagged span by default. To switch between hierarchy, timeline, thread, and other layouts, see Examine traces.

Create and edit scores inline

While reviewing, create new score types or edit existing configurations without navigating to settings:
  • To create a new score, click + Human review score.
  • To edit an existing score, select the edit icon next to the score name.
Changes apply immediately across your project.
Editing a score configuration affects how that score works going forward. Existing score values on traces remain unchanged.

Annotate in playgrounds

For a lighter-weight alternative to the full review workflow, you can annotate outputs directly in playgrounds and then get prompt improvement suggestions based on your annotations. Playground annotations help with rapid iteration during prompt development, while the Review page is better for systematic evaluation of production logs and experiments.

Capture production feedback

In addition to internal reviews, capture feedback directly from production users. Production feedback helps you understand real-world performance and build datasets from actual user interactions. See Capture user feedback for implementation details and Build datasets from user feedback to learn how to turn feedback into evaluation datasets. You can also use dashboards to monitor user satisfaction trends and correlate automated scores with user feedback.
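As a rough sketch, end-user signals such as a thumbs-up button can be translated into feedback on the logged span. The helper below is hypothetical, and the commented-out call assumes the Braintrust Python SDK's log_feedback API; check the SDK reference for the exact signature on your version:

```python
# Sketch: turn a thumbs-up/down UI signal into a feedback payload for a
# logged span. The helper is illustrative, not part of the Braintrust SDK.

def build_feedback(span_id, thumbs_up, comment=None):
    """Build a feedback payload for the span the user is rating."""
    payload = {
        "id": span_id,                                   # id of the logged span
        "scores": {"user_rating": 1 if thumbs_up else 0},  # hypothetical score name
        "source": "external",                            # marks end-user feedback
    }
    if comment:
        payload["comment"] = comment
    return payload

# In a request handler, after the user clicks thumbs up or down
# (assumes the SDK's init_logger/log_feedback; verify against the docs):
# import braintrust
# logger = braintrust.init_logger(project="my-project")
# logger.log_feedback(**build_feedback(request_id, thumbs_up=True))
```

Feedback recorded this way lands on the same spans you triage on the Review page, so user ratings and internal review scores can be compared side by side.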

Customize the review table

Show and hide columns

Select Display > Columns and then:
  • Show or hide columns to focus on relevant data
  • Reorder columns by dragging them
  • Pin important columns to the left
All column settings are automatically saved when you save a view.

Use kanban layout

The kanban layout organizes flagged spans into three columns based on their review status:
  • Backlog: Spans flagged for review but not yet started
  • Pending: Spans currently being reviewed
  • Complete: Spans that have finished review
To use the kanban layout:
  1. On the Review page, select Display > Layout > Kanban.
  2. Drag cards between columns to update review status. Changes save automatically.
  3. Click any card to open the full trace for detailed review.
Each card displays the span name, creation date, assignees, and a preview of the input and output.

Create custom table views

To create or update a custom table view:
  1. Apply the filters and display settings you want.
  2. Open the menu and select Save view… or Save view as….
Custom table views are visible to all project members. Creating or editing a table view requires the Update project permission.

Set default table views

You can set default views at two levels:
  • Organization default: Visible to all members when they open the page. This applies per page — for example, you can set separate organization defaults for Logs, Experiments, and Review. To set an organization default, you need the Manage settings organization permission (included by default in the Owner role). See Access control for details.
  • Personal default: Overrides the organization default for you only. Personal defaults are stored in your browser, so they do not carry over across devices or browsers.
To set a default view:
  1. Switch to the view you want by selecting it from the menu.
  2. Open the menu again and hover over the currently selected view to reveal its submenu.
  3. Choose Set as personal default view or Set as organization default view.
To clear a default view:
  1. Open the menu and hover over the currently selected view to reveal its submenu.
  2. Choose Clear personal default view or Clear organization default view.
When a user opens a page, Braintrust loads the first match in this order: personal default, organization default, then the standard “All …” view (e.g., “All logs view”).
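The precedence above amounts to a simple fallback chain, sketched here as a toy function (purely illustrative, not Braintrust code):

```python
# Toy sketch of the default-view resolution order described above:
# personal default, then organization default, then the standard view.

def resolve_default_view(personal_default, org_default, standard="All logs view"):
    """Return the first configured view in precedence order."""
    return personal_default or org_default or standard
```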

Next steps