Add human feedback - Braintrust

Human review is a critical part of evaluating AI applications. While Braintrust helps you automatically evaluate AI software with scorers, human feedback provides essential ground truth and quality assessment. Braintrust integrates human feedback from end users, subject matter experts, and product teams in one place. Use human review to:

Evaluate and compare experiments
Assess the efficacy of automated scoring methods
Curate production logs into evaluation datasets
Label categorical data and provide corrections
Track quality trends over time

Configure review scores

Review scores appear in all logs and experiments in a project. Use them for quality control, data labeling, or feedback collection.

only available on Pro and Enterprise plans.

Go to Settings > Human review.
Click + Human review score.
Enter a name and description for your score. Descriptions support Markdown.
Select a score type:
- Categorical score: Predefined options with assigned scores. Each option gets a unique percentage value between 0% and 100% (stored as 0 to 1). Use for classification tasks like sentiment or correctness categories. Also supports writing to the expected field instead of creating a score.
- Continuous score: Numeric values between 0% and 100% with a slider input control. Use for subjective quality assessments like helpfulness or tone.
- Free-form input: String values written to the metadata field at a specified path. Use for explanations, corrections, or structured feedback.
(Optional) Expand Score visibility to restrict who sees this score during review. Select specific members or permission groups. If you don’t select anyone, the score is visible to everyone.
Click Save.

Score visibility controls which reviewers see a score in the review modal. It declutters the review experience for large teams. It is not an access control or security boundary: any reviewer with hidden scores can reveal them with the Show all scores toggle.

You can also create human review scores as you review traces. In the trace view, click + Human review score and define the score as described above.

Restrict score visibility

By default, every reviewer sees every configured score. Restrict a score to specific members or permission groups so only relevant reviewers see it in the review modal, which keeps the review experience focused for large teams. To set visibility on a new score, expand Score visibility while configuring it (see the steps above) and select the members or permission groups that should see it. To change visibility on an existing score:

Go to Settings > Human review, or open the review panel while reviewing.
Select the edit icon next to the score name.
Expand Score visibility and select the members or permission groups that should see the score. To make it visible to everyone again, deselect all.
Click Save.

If a row has configured scores but none are visible to the current reviewer, the review panel shows No scores are available to you for this row. Score visibility is a display filter, not an access control rule. Any reviewer who has hidden scores can reveal them with the Show all scores toggle in the review panel. To enforce who can read scores, use project permissions instead.

Assign rows for review

You can assign rows in logs, experiments, and datasets to team members for review, analysis, or follow-up action. Assignments are particularly useful for human review workflows, where you can assign specific rows that need human evaluation and distribute review work across multiple team members. To assign a row to a team member from any table view (logs, experiments, or datasets):

Select the row.
Select Assign.
Choose a member to assign.

Team members receive email notifications when rows are assigned to them.

Score traces and datasets

Go to Review and select the type of data to review:

Log spans: production traces and debugging sessions
Experiment spans: Evaluation results and test runs
Dataset rows: Test cases and examples

Then select a row and set scores. You can also add comments and tags while reviewing. When finished reviewing, click Mark complete to record your review. By default, marking a review complete keeps you on the current row. To move to the next row automatically after marking a review complete, enable the Auto-advance toggle. The button then reads Mark complete and continue. You can also navigate with the Next row and Previous row buttons.

Not all score types appear on dataset rows. Only categorical/slider scores configured to “write to expected” and free-form scores are available for dataset reviews, since datasets store test data (input/expected pairs) rather than subjective quality assessments.

You can also enter review mode directly from the Logs table or an individual experiment table in Experiments using the Review button or the r shortcut. The review will be scoped to the set of traces in the current filter. See Review specific rows in Logs and Experiments.

Review with multiple reviewers

Multiple reviewers can score the same span without overwriting each other. Braintrust stores each reviewer’s scores independently as dedicated review spans attached to the parent span.

How multiple reviewers’ scores combine

How Braintrust resolves multiple reviewers’ scores depends on the score type:

Scores that can be averaged: Slider scores and categorical scores resolve to a numeric value, so Braintrust averages them automatically on the parent span. In the review panel, each score shows its average as a percentage. Hover over the percentage to see detailed breakdowns: Your reviews in this trace, all reviewer scores for this span, and scores across all spans in the trace.
Scores that can’t be averaged: Free-form text scores and categorical scores that write to expected output don’t resolve to a numeric value, so there’s no average to compute. To choose which reviewer’s value becomes the parent span’s value, open the score’s options menu and select Set as base span value. On dataset rows, the same action reads Set as root value, and it’s available on dataset reviews as well. This option is available wherever an individual reviewer’s score is shown: in the review panel, or in the trace when you select that reviewer’s human review span.

Each score also shows a N total reviews indicator. Hover over it to see how many reviews match the base span’s value and how many diverge, so you can gauge consensus and spot reviewers who disagreed with the source-of-truth value.

Reviewers can always see each other’s aggregated scores. Braintrust doesn’t support blind (confidential) review.

Switch between reviewers

A row can have scores from multiple reviewers. To move between them without hunting through the trace tree:

Go to Review and open a row in review mode (see Score traces and datasets for entry points). The scores panel appears to the left of the trace.
In the scores panel header, open the Review spans dropdown.
Select Review by [reviewer] to load that reviewer’s scores, or select Base span to return to the parent span’s own scores.

You can also select a reviewer’s human review span directly in the trace tree. Either way, the right panel keeps showing the parent span’s input and output while the scores panel updates to your selection. When you view another reviewer with Review by [reviewer], the panel shows only the scores that reviewer set a value for, regardless of your personal score-visibility filter. If that reviewer set no values for scores that are hidden from you, the No scores are available to you for this row empty state appears. As elsewhere, the Show all scores toggle reveals the rest.

Filter review data

The Review page shows any spans that have been flagged for review within a given time range. Each project provides default table views with common filters, including:

Default view: Shows all records
Awaiting review: Shows only records flagged for review but not yet started
Assigned to me: Shows only records assigned to you for review
Completed: Shows only records that have finished review

Use the View menu to switch between views. You can also use the Filter menu to focus on specific subsets for review. Use the Basic tab for point-and-click filtering, or switch to SQL to write precise queries. For example, filter by scores (e.g., scores.Preference > 0.75) to find highly-rated examples.

Built-in views (such as “All logs view”) cannot be modified, but you can create custom table views based on custom filters and display settings.

Use tags to mark items for “Triage”, then review them all at once.

Change the trace layout

While reviewing log and experiment traces, you see detailed information about the flagged span by default. To switch between hierarchy, timeline, thread, and other layouts, see Examine traces.

Create and edit scores inline

While reviewing, create new score types or edit existing configurations without navigating to settings:

To create a new score, click + Human review score.
To edit an existing score, select the edit icon next to the score name.

Changes apply immediately across your project.

Editing a score configuration affects how that score works going forward. Existing score values on traces remain unchanged.

Annotate in playgrounds

For a lighter-weight alternative to the full review workflow, you can annotate outputs directly in playgrounds and then get prompt improvement suggestions based on your annotations. Playground annotations help with rapid iteration during prompt development, while the Review page is better for systematic evaluation of production logs and experiments.

Capture production feedback

In addition to internal reviews, capture feedback directly from production users. Production feedback helps you understand real-world performance and build datasets from actual user interactions. See Capture user feedback for implementation details and Build datasets from user feedback to learn how to turn feedback into evaluation datasets. You can also use dashboards to monitor user satisfaction trends and correlate automated scores with user feedback.

Customize the review table

Show and hide columns

Select Display > Columns and then:

Show or hide columns to focus on relevant data
Reorder columns by dragging them
Pin important columns to the left

All column settings are automatically saved when you save a view.

Free-form human review scores and scores that write to expected output appear automatically as columns in the dataset, review, logs, and experiment tables, marked with the review icon. You don’t need to add them as custom columns.

Use kanban layout

The kanban layout organizes flagged spans into three columns based on their review status:

Backlog: Spans flagged for review but not yet started
Pending: Spans currently being reviewed
Complete: Spans that have finished review

To use the kanban layout:

On the Review page, select Display > Layout > Kanban.
Drag cards between columns to update review status. Changes save automatically.
Click any card to open the full trace for detailed review.

Each card displays the span name, creation date, assignees, and a preview of the input and output.

Create custom table views

To create or update a custom table view:

Apply the filters and display settings you want.
Open the menu and select Save view… or Save view as….

Custom table views are visible to all project members. Creating or editing a table view requires the Update project permission.

Set default table views

You can set default views at three levels:

Organization default: Visible to all members when they open the page. This applies per page. For example, you can set separate organization defaults for Logs, Experiments, and Review. To set an organization default, you need the Manage settings organization permission (included by default in the Owner role). See Access control for details.
Project default: Overrides the organization default for everyone viewing this project. To set a project default, you need the project-level Update permission. Project admins can set project defaults even without organization-level permissions. See Access control for details.
Personal default: Overrides the project and organization defaults for you only. Personal defaults are stored in your browser, so they do not carry over across devices or browsers.

To set a default view:

Switch to the view you want by selecting it from the menu.
Open the menu again and hover over the currently selected view to reveal its submenu.
Choose Set as personal default view, Set as project default view, or Set as organization default view.

To clear a default view:

Open the menu and hover over the currently selected view to reveal its submenu.
Choose Clear personal default view, Clear project default view, or Clear organization default view.

Default view settings are mutually exclusive on a given view. Setting one type of default on a view automatically clears any other default that was previously set on the same view. When a user opens a page, Braintrust loads the first match in this order: personal default, project default, organization default, then the standard “All …” view (for example, “All logs view”).

Next steps

Add labels and corrections to categorize and tag traces
Build datasets from reviewed logs
Capture user feedback from production
Run evaluations with human-reviewed datasets

​Configure review scores

​Restrict score visibility

​Assign rows for review

​Score traces and datasets

​Review with multiple reviewers

​How multiple reviewers’ scores combine

​Switch between reviewers

​Filter review data

​Change the trace layout

​Create and edit scores inline

​Annotate in playgrounds

​Capture production feedback

​Customize the review table

​Show and hide columns

​Use kanban layout

​Create custom table views

​Set default table views

​Next steps

Configure review scores

Restrict score visibility

Assign rows for review

Score traces and datasets

Review with multiple reviewers

How multiple reviewers’ scores combine

Switch between reviewers

Filter review data

Change the trace layout

Create and edit scores inline

Annotate in playgrounds

Capture production feedback

Customize the review table

Show and hide columns

Use kanban layout

Create custom table views

Set default table views

Next steps