Documentation Index
Fetch the complete documentation index at: https://braintrust.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
Summary
When using COUNT(DISTINCT field) in a BTQL query with GROUP BY, the result returns the total row count for each group instead of returning 1 per distinct group, causing apparent duplicate user IDs in logs aggregation queries. This is the current expected behavior of how BTQL processes COUNT(DISTINCT) aggregations within GROUP BY clauses. Use count_distinct(field) function instead of COUNT(DISTINCT field), or restructure the query to avoid COUNT(DISTINCT) in GROUP BY contexts.
Resolution Steps
If you need distinct counts per group
Step 1: Replace COUNT(DISTINCT) with count_distinct()
Use the count_distinct() function in your query instead of COUNT(DISTINCT field).
SELECT
metadata.user,
count_distinct(metadata.user) as distinct_users,
COUNT(*) as total_rows
FROM project_logs('PROJECT_ID', shape => 'spans')
WHERE created > NOW() - INTERVAL 20 day
AND span_attributes.name = 'streamChat'
AND metadata.user IS NOT NULL
GROUP BY metadata.user
Step 2: Verify results in SQL sandbox
Test the query in the Braintrust SQL sandbox to confirm it returns the expected distinct counts.
If you need time-series distinct counts for monitoring dashboards
Step 1: Group by time period and user
Add date grouping to get distinct user counts distributed by day.
SELECT
DATE(created) as date,
metadata.user,
COUNT(*) as event_count
FROM project_logs('PROJECT_ID', shape => 'spans')
WHERE created > NOW() - INTERVAL 3 day
AND span_attributes.name = 'streamChat'
AND metadata.user IS NOT NULL
GROUP BY DATE(created), metadata.user
ORDER BY date DESC, event_count DESC
Step 2: Use count distinct in Monitor charts
Select count distinct as the aggregator when configuring a measure in a Monitor chart. This gives you a time-series view of distinct values without a custom SQL query.