Braintrust offers a self-hosted deployment option that separates data storage from platform management. You deploy and control the infrastructure that stores your sensitive AI data, while Braintrust provides the managed UI, authentication, and platform updates. This gives you full control over your data without the operational overhead of running the entire platform.

Documentation Index
Fetch the complete documentation index at: https://braintrust.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
Use cases
Self-hosting is designed for organizations with specific requirements:

- Data residency and compliance: Meet regulatory or contractual obligations by keeping all customer data (experiment logs, traces, datasets, and prompts) within your own cloud account and region.
- Security posture and isolation: Deploy the data plane behind your firewall or VPN, using your own IAM policies, KMS encryption keys, and audit trails. This ensures sensitive data never traverses external networks.
- Access to private resources: Connect to internal LLM models, proprietary tools, or private APIs that are not accessible from the public internet. The data plane runs within your network and can access resources in your VPC or private network.
How it works
Braintrust’s architecture has two main components:

- The data plane stores all sensitive data, including experiment records, logs, traces, spans, datasets, and prompt completions. It consists of the Braintrust API, a PostgreSQL database, Redis cache, object storage, and Brainstore (a high-performance query engine for real-time trace ingestion).
- The control plane provides the web UI, authentication, user management, and metadata storage (project names, experiment names, organization settings). The control plane does not store or process your sensitive data.
Breakdown of where data is stored
| Data | Location |
|---|---|
| Experiment records (input, output, expected, scores, metadata, traces, spans) | Data plane |
| Log records (input, output, expected, scores, metadata, traces, spans) | Data plane |
| Dataset records (input, output, metadata) | Data plane |
| Prompt playground prompts | Data plane |
| Prompt playground completions | Data plane |
| Human review scores | Data plane |
| Project-level LLM provider secrets (encrypted) | Data plane |
| Org-level LLM provider secrets (encrypted) | Control plane |
| API keys (hashed) | Control plane |
| Experiment and dataset names | Control plane |
| Project names | Control plane |
| Project settings | Control plane |
| Git metadata about experiments | Control plane |
| Organization info (name, settings) | Control plane |
| Login info (name, email, avatar URL) | Control plane |
| Auth credentials | Clerk |
Deployment options
Braintrust provides official Terraform modules for self-hosting on AWS, Google Cloud Platform (GCP), and Azure:

- AWS: Terraform with Lambda and EC2
- GCP: Terraform with Kubernetes and Helm
- Azure: Terraform with Kubernetes and Helm
Legacy customers: If you previously deployed using AWS CloudFormation, the CloudFormation guide remains available. This deployment method is not supported for new customers.
Shared responsibility
When you self-host, uptime becomes a shared responsibility between your team and Braintrust:

- Braintrust is responsible for responding quickly when you have issues, resolving them collaboratively with you, and fixing bugs to improve quality.
- Your team is responsible for following the documentation, dedicating infrastructure resources on your side, and ensuring that in the event of an incident you have staff who are familiar with Braintrust and can work with the Braintrust team to share context and resolve issues.
Monitoring
Braintrust monitors your self-hosted deployment through automatic telemetry and an in-app infra dashboard.

Telemetry
By default, your self-hosted data plane automatically sends the following telemetry back to the Braintrust-managed control plane:

- Health check information
- System metrics (CPU/memory) and Braintrust-specific metrics like indexing lag
- Billing usage telemetry for aggregate usage metrics
Infra dashboard
Only organization owners and members with the Manage settings permission can access this dashboard. The dashboard reports:
- Processing throughput (bytes processed, compaction)
- CPU and memory usage by reader and writer nodes
- Object storage latency and operations
- Realtime lag
- Status checks
Upgrades
Braintrust ships new data plane versions 1-2 times per month. You can find the details of each release on the Self-hosting releases page. Braintrust recommends upgrading each time a new version is published. New features often depend on data plane changes, and when they do, Braintrust will automatically gate those features until you upgrade.

| Data plane age | Status |
|---|---|
| Up to date | Fully supported |
| 1-3 months out of date | Supported with caveats — you may encounter functionality issues or bugs. Given the pace of the AI space, Braintrust prioritizes shipping new features while doing its best to maintain compatibility. If you hit a bug, contact support and Braintrust will prioritize a fix or workaround. |
| More than 3 months out of date | Unsupported — upgrade immediately. If you contact support, the first thing Braintrust will ask you to do is upgrade. |
Remote access
Occasionally, issues arise that require ad-hoc debugging or running manual commands against containers, the Postgres database, or storage buckets to repair the state of the system. Customers who grant Braintrust remote access (as needed) have seen much faster resolutions when such issues occur, because the Braintrust team can connect directly and resolve them. If this is not possible, factor it into your uptime calculations; if Braintrust uptime is a key metric for you, strongly consider making remote access available to the Braintrust team as needed. If you cannot set up remote access, ensure that your own team can swiftly access:

- Containers directly (to update them, view logs, restart them, and view host metrics like CPU, network, memory, and disk utilization)
- Postgres to run SQL queries
- Redis to run commands
- Storage buckets to run read, write, and list commands
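As a readiness check for the access paths above, it can help to verify that the client tooling for each one is installed on the hosts your operators will use. The sketch below assumes common tool names (kubectl, psql, redis-cli, aws); substitute whatever your deployment actually uses:

```shell
# missing_tools: print each tool from the argument list that is not on PATH.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Example (tool names are assumptions; swap in your own tooling):
# missing_tools kubectl psql redis-cli aws
```

An empty result means every listed tool is available; anything printed is a gap to close before an incident, not during one.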
Hardware requirements
When deploying Braintrust in production, consider these hardware requirements for reliable performance and uptime. These requirements assume typical production usage patterns. For high-utilization deployments, you may need to scale these resources up significantly. Monitor your resource utilization and adjust accordingly.

API service
The API service handles all SDK and browser requests to the data plane.

This section applies to GCP and Azure with Kubernetes. AWS deployments use Lambda functions, which are managed automatically and do not require manual resource configuration.
| Resource | Testing/Staging | Production |
|---|---|---|
| CPU | 1 vCPU | 2+ vCPUs per instance |
| Memory | 2GB RAM | 8GB+ RAM |
| Instance count | 1 | 4+ |
- NODE_MEMORY_PERCENT: Set to 80-90 if the API is running on a dedicated instance or a container orchestrator with cgroup memory limits (e.g. Kubernetes, ECS).
- TS_API_KEEP_ALIVE_TIMEOUT_SECONDS: Configure the HTTP keep-alive timeout when running behind a load balancer. See Configure HTTP keep-alive timeout for details.
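A minimal sketch of these settings as environment variables; the specific values here (85 percent, 65 seconds) are illustrative assumptions, not recommendations from Braintrust:

```shell
# Illustrative values only; tune for your environment.

# Let the API process use most of the container's memory when cgroup
# limits are in place (e.g. Kubernetes, ECS), per the 80-90 guidance.
export NODE_MEMORY_PERCENT=85

# HTTP keep-alive timeout. A common pattern is to set this slightly
# higher than the load balancer's idle timeout (60s is a typical default).
export TS_API_KEEP_ALIVE_TIMEOUT_SECONDS=65
```

In a Kubernetes deployment these would typically be set in the container's env spec rather than exported in a shell.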
PostgreSQL
PostgreSQL stores metadata required to operate the platform, including pointers to raw data in object storage and aggregate statistics about the data. It is not the primary store for your AI data — traces, spans, and logs live in Brainstore and object storage.

| Resource | Testing/Staging | Production |
|---|---|---|
| CPU | 2 vCPUs | 8+ vCPUs |
| Memory | 8GB RAM | 64GB+ RAM |
| Storage size | 100GB | 1000GB+ (monitor for growth) |
| Storage IOPS | 3,000 | 15,000+ |
| Version | 15+ | 17+ |
Redis cache
Redis provides caching and coordination for session management, rate limiting, and Brainstore write ordering.

| Resource | Testing/Staging | Production |
|---|---|---|
| CPU | 1 vCPU | 2 vCPUs |
| Memory | 1GB RAM | 4GB+ RAM |
| Version | 7+ | 7+ |
Brainstore
Brainstore is Braintrust’s high-performance database for ingesting and querying AI data. It uses object storage and a streaming Rust engine to load spans in real time, cutting down on latency and enabling deep search capabilities. Brainstore runs as separate reader and writer node types, each with distinct resource requirements.

Important
- Brainstore requires high-performance storage with at least 150,000 IOPS for both reads and writes. Use NVMe-based ephemeral storage (the storage does not need to be persistent). Do not use EBS volumes or other slower storage options like Azure’s standard local disks, as these will significantly degrade performance.
- For Kubernetes deployments (GCP and Azure), each Brainstore pod must run on its own dedicated node to ensure optimal performance and resource isolation.
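Before provisioning Brainstore nodes, it can be worth benchmarking the local disk against the 150,000 IOPS requirement. The sketch below assumes the fio benchmarking tool and an example mount path, both of which you would adjust for your environment:

```shell
# REQUIRED_IOPS mirrors the Brainstore storage requirement above.
REQUIRED_IOPS=150000

# check_iops: succeeds if a measured IOPS value meets the requirement.
check_iops() {
  [ "$1" -ge "$REQUIRED_IOPS" ]
}

# To measure, run fio against the NVMe mount (path and job parameters
# are illustrative; adjust for your disk layout):
#   fio --name=iops-check --filename=/mnt/brainstore/fio.test --size=1G \
#       --rw=randrw --bs=4k --iodepth=64 --numjobs=4 --direct=1 \
#       --runtime=30 --time_based --ioengine=libaio --group_reporting
```

Feed the read and write IOPS numbers from fio's output into check_iops; both should pass before the node goes into service.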
Readers
Readers serve ad-hoc queries, including those from the API and user-defined BTQL queries. Plan for a minimum of 2 reader nodes in production to ensure high availability.

A specialized reader variant, fast readers, serves predictable UI queries (paginated viewers, span and trace lookups) in isolation from standard reader nodes, keeping the UI responsive while resource-intensive queries run on readers. On GCP and Azure, fast readers are enabled by default with 2 replicas starting in Helm chart v5.0.0. On AWS, fast readers are disabled by default; set brainstore_fast_reader_instance_count to enable them. When planning cluster capacity, account for these additional nodes. See Configure Brainstore fast readers for configuration details.
| Resource | Testing/Staging | Prod: readers | Prod: fast readers |
|---|---|---|---|
| CPU | 4 vCPUs | 16 vCPUs | 16 vCPUs |
| Memory | 8GB RAM | 32GB RAM | 32GB RAM |
| Storage size | 128GB | 1024GB+ | 1024GB+ |
| Storage type | SSD | NVMe (ephemeral) | NVMe (ephemeral) |
| Storage IOPS | — | 150,000+ read/write | 150,000+ read/write |
| Instance count | 1 | 2+ | 2+ |
Writers
Writers ingest incoming spans and traces and write them to object storage. Writers don’t serve interactive requests, so a single writer node is sufficient for production.

| Resource | Testing/Staging | Production |
|---|---|---|
| CPU | 4 vCPUs | 32 vCPUs |
| Memory | 8GB RAM | 64GB RAM |
| Storage size | 128GB | 1024GB+ |
| Storage type | SSD | NVMe (ephemeral) |
| Storage IOPS | — | 150,000+ read/write |
| Instance count | 1 | 1+ |