Integrating Databrain with Elastic APM
This guide explains how to send OpenTelemetry traces, metrics, and logs from your self-hosted Databrain instance to Elastic APM (Application Performance Monitoring).

Why Elastic APM?
Elastic APM is ideal if you’re already using the Elastic Stack (ELK):
- Unified Platform: APM data alongside logs and metrics in Kibana
- OpenTelemetry Support: Native OTLP ingestion
- Powerful Querying: Kibana’s full search capabilities
- Flexible Deployment: Cloud or self-hosted options
Prerequisites
- Databrain self-hosted version with OpenTelemetry support
- Elastic Stack 8.0+ (Elasticsearch, Kibana, APM Server)
- APM Server configured for OpenTelemetry
Option 1: Elastic Cloud (Easiest)
1. Set Up Elastic Cloud
- Sign up for Elastic Cloud
- Create a deployment (includes Elasticsearch, Kibana, and APM)
- Note your:
- Cloud ID
- APM endpoint: https://<deployment-id>.apm.<region>.cloud.es.io
- Secret token or API key
2. Configure Databrain
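The original configuration snippet is not reproduced in this copy. A minimal sketch of the exporter settings, assuming Databrain's OpenTelemetry instrumentation honors the standard OTel SDK environment variables (use `Authorization=ApiKey <api-key>` instead of the Bearer header if you authenticate with an API key):

```bash
# Standard OpenTelemetry SDK environment variables (assumption: Databrain
# reads these). Fill in the values noted from your Elastic Cloud deployment.
export OTEL_SERVICE_NAME=databrain-api
export OTEL_EXPORTER_OTLP_ENDPOINT=https://<deployment-id>.apm.<region>.cloud.es.io:443
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <secret-token>"
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc   # http/protobuf also works
```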
3. Docker Compose Configuration
.env:
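The file contents did not survive in this copy; a sketch using the same standard OTel variables (these names follow the OTel SDK conventions, not a confirmed Databrain schema):

```env
OTEL_SERVICE_NAME=databrain-api
OTEL_EXPORTER_OTLP_ENDPOINT=https://<deployment-id>.apm.<region>.cloud.es.io:443
OTEL_EXPORTER_OTLP_HEADERS=Authorization=Bearer <secret-token>
```

Then pass the file through to the API container (the service name is illustrative):

```yaml
# docker-compose.yml (fragment)
services:
  databrain-api:
    env_file: .env
```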
Option 2: Self-Hosted Elastic Stack
1. Deploy Elastic Stack with Docker Compose
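The compose file is missing from this copy; a minimal single-node sketch (image tags are placeholders, and security is disabled only for brevity):

```yaml
# docker-compose.yml — single-node Elastic Stack (sketch, not production-ready)
services:
  elasticsearch:
    image: docker.elastic.co/elasticsearch/elasticsearch:8.14.0
    environment:
      - discovery.type=single-node
      - xpack.security.enabled=false   # demo only; enable security in production
    ports:
      - "9200:9200"

  kibana:
    image: docker.elastic.co/kibana/kibana:8.14.0
    environment:
      - ELASTICSEARCH_HOSTS=http://elasticsearch:9200
    ports:
      - "5601:5601"
    depends_on:
      - elasticsearch

  apm-server:
    image: docker.elastic.co/apm/apm-server:8.14.0
    volumes:
      - ./apm-server.yml:/usr/share/apm-server/apm-server.yml:ro
    ports:
      - "8200:8200"
    depends_on:
      - elasticsearch
```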
2. Configure APM Server for OpenTelemetry
Create apm-server.yml:
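The file body is not reproduced here; a sketch. APM Server 8.x accepts OTLP natively on its main port, so no OpenTelemetry-specific settings are required:

```yaml
# apm-server.yml (sketch)
apm-server:
  host: "0.0.0.0:8200"
  auth:
    secret_token: "${APM_SECRET_TOKEN}"   # optional; must match the token Databrain sends

output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]
```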
3. Configure Filebeat for Log Ingestion
Create filebeat.yml:
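A sketch that ships container logs to Elasticsearch (Filebeat also needs the Docker containers directory mounted into its own container):

```yaml
# filebeat.yml (sketch)
filebeat.inputs:
  - type: container
    paths:
      - /var/lib/docker/containers/*/*.log

processors:
  - add_docker_metadata: ~   # attach container metadata for log/trace correlation

output.elasticsearch:
  hosts: ["http://elasticsearch:9200"]
```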
4. Start Elastic Stack
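Bring everything up and confirm the containers are healthy:

```bash
docker compose up -d
docker compose ps   # every service should report a running/healthy state
```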
5. Configure Databrain for Self-Hosted APM
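The same standard OTel variables apply; only the endpoint changes to the local APM Server (again assuming Databrain honors the OTel SDK environment configuration):

```bash
export OTEL_SERVICE_NAME=databrain-api
export OTEL_EXPORTER_OTLP_ENDPOINT=http://apm-server:8200
export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer <secret-token>"  # only if auth.secret_token is set
```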
Option 3: Using Elastic APM Agent (Alternative)
If you prefer the native Elastic APM agent over OpenTelemetry:
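A sketch using the agent's documented environment variables; whether the agent package itself ships with Databrain is not confirmed by this guide:

```bash
export ELASTIC_APM_SERVICE_NAME=databrain-api
export ELASTIC_APM_SERVER_URL=http://apm-server:8200
export ELASTIC_APM_SECRET_TOKEN=<secret-token>
export ELASTIC_APM_ENVIRONMENT=production
```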
Kubernetes Deployment
Deploy Elastic Cloud on Kubernetes (ECK)
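The install commands and manifests are missing here; a sketch of the usual ECK setup (the ECK version in the URLs is a placeholder — use the current release):

```bash
kubectl create -f https://download.elastic.co/downloads/eck/2.14.0/crds.yaml
kubectl apply -f https://download.elastic.co/downloads/eck/2.14.0/operator.yaml
```

Then declare an APM Server next to your ECK-managed Elasticsearch (assumed here to be named elasticsearch):

```yaml
apiVersion: apm.k8s.elastic.co/v1
kind: ApmServer
metadata:
  name: apm-server
spec:
  version: 8.14.0
  count: 1
  elasticsearchRef:
    name: elasticsearch
```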
Configure Databrain on Kubernetes
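A sketch of the relevant Deployment; the image name is a placeholder, and the endpoint follows ECK's <name>-apm-http service naming convention:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: databrain-api
spec:
  replicas: 1
  selector:
    matchLabels:
      app: databrain-api
  template:
    metadata:
      labels:
        app: databrain-api
    spec:
      containers:
        - name: databrain-api
          image: databrain/api:latest   # placeholder image
          env:
            - name: OTEL_SERVICE_NAME
              value: databrain-api
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://apm-server-apm-http:8200   # ECK service for the ApmServer above
```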
What Gets Sent to Elastic APM
| Telemetry Type | Elastic Product | Description |
|---|---|---|
| Traces | APM | Distributed traces with transaction and span details |
| Metrics | APM | Service metrics, system metrics, custom metrics |
| Logs | Elasticsearch | Structured logs with APM correlation |
| Errors | APM | Error tracking with stack traces |
Verification
1. Check APM Server Status
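APM Server answers on its main port with build and version info:

```bash
curl http://localhost:8200/
```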
2. Generate Test Traffic
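Any instrumented Databrain endpoint will do; this loop uses the example endpoint referenced later in this guide, and the host/port are assumptions to replace with your deployment's address:

```bash
for i in $(seq 1 20); do
  curl -s -o /dev/null http://localhost:3000/api/v2/metric/execute
done
```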
3. View in Kibana
- Open Kibana: http://localhost:5601
- Navigate to Observability → APM → Services
- You should see databrain-api in the services list
- Click on the service to see:
- Transaction duration: Latency distribution
- Throughput: Requests per minute
- Error rate: Percentage of failed requests
- Transactions: Individual endpoints
- Dependencies: External services (DB, Redis, etc.)
4. View Traces
- Click on Transactions tab
- Select an endpoint (e.g., /api/v2/metric/execute)
- Click on a specific transaction to see the trace waterfall
5. View Logs with APM Correlation
- Go to Observability → Logs → Stream
- Filter: service.name: databrain-api
- Click on any log entry
- Click View in APM to see the related trace
Creating Kibana Dashboards
Service Overview Dashboard
- Go to Dashboard → Create dashboard
- Add visualizations:
- Index pattern: apm-*

Transaction count:
- Metric: Count of processor.event: transaction
- Time range: Last 15 minutes
- Breakdown: service.name

Average latency:
- Metric: Average of transaction.duration.us
- Filter: service.name: databrain-api

Error rate:
- Formula: (count where transaction.result = error) / count * 100
- Filter: service.name: databrain-api

Slowest transactions:
- Metric: P95 of transaction.duration.us
- Group by: transaction.name
- Top 10
Database Performance Dashboard
Slow Queries:
- Filter: span.type: db AND span.subtype: postgresql
- Metric: P95 of span.duration.us
- Group by: span.db.statement
Setting Up Alerts
Create Alert Rules in Kibana
- Go to Observability → Alerts → Create rule
High Error Rate Alert
- Rule type: APM
- Service: databrain-api
- Alert when: Transaction error rate is above threshold
- Threshold: > 5% for 5 minutes
- Actions: Send to Slack/Email/PagerDuty
High Latency Alert
- Rule type: APM
- Service: databrain-api
- Transaction type: request
- Alert when: Latency threshold is exceeded
- Threshold: P95 > 2000ms for 5 minutes
Service Down Alert
- Rule type: Uptime
- Monitor: databrain-api
- Alert when: Service is down
- Check frequency: Every 1 minute
Machine Learning (X-Pack)
If you have an appropriate Elastic license (anomaly detection requires Platinum or a trial), enable ML anomaly detection:
- Go to Machine Learning → Anomaly Detection
- Create job for APM:
- Job type: APM
- Service: databrain-api
- Analyze: Response times and throughput
- ML will automatically detect:
- Unusual latency spikes
- Traffic anomalies
- Error rate changes
Troubleshooting
| Issue | Solution |
|---|---|
| No data in Kibana | Check APM Server logs: docker logs apm-server |
| 401 Unauthorized | Verify secret token matches between Databrain and APM Server |
| Connection refused | Check network connectivity and firewall rules |
| Missing logs | Ensure Filebeat is running and configured correctly |
| High disk usage | Configure ILM (Index Lifecycle Management) policies |
Debug APM Server
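Two quick checks:

```bash
# Tail APM Server logs for ingest or authentication errors
docker logs -f apm-server

# Confirm the server is reachable and reports its version
curl http://localhost:8200/
```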
Check Elasticsearch Indices
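Verify that APM data is actually landing in Elasticsearch:

```bash
# List APM-related indices/data streams, largest first
curl 'http://localhost:9200/_cat/indices/*apm*?v&s=store.size:desc'
```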
Index Lifecycle Management (ILM)
Configure ILM to manage disk usage (an API sketch follows this list):
- Go to Stack Management → Index Lifecycle Policies
- Edit the APM policies, for example:
- Hot phase: 1 day, warm phase: 7 days, delete phase: 30 days
- Hot phase: 3 days, warm phase: 14 days, delete phase: 90 days
- Hot phase: 7 days, warm phase: 30 days, delete phase: 90 days
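The same policies can be managed through the API. A sketch from Kibana Dev Tools — the policy name is illustrative, and in 8.x you would usually edit the built-in APM policies rather than create new ones:

```json
PUT _ilm/policy/databrain-apm-example
{
  "policy": {
    "phases": {
      "hot":    { "actions": { "rollover": { "max_age": "1d", "max_primary_shard_size": "50gb" } } },
      "warm":   { "min_age": "1d",  "actions": { "set_priority": { "priority": 50 } } },
      "delete": { "min_age": "30d", "actions": { "delete": {} } }
    }
  }
}
```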
Best Practices
1. Use Index Templates
Customize APM index templates for better performance:
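In 8.x the supported customization hook is a *@custom component template; a sketch that raises the shard count for trace indices (the setting value is illustrative):

```json
PUT _component_template/traces-apm@custom
{
  "template": {
    "settings": {
      "index.number_of_shards": 2
    }
  }
}
```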
2. Enable Cross-Cluster Search
For multi-region deployments, use cross-cluster search to aggregate data.
3. Use APM Correlations
Let Elastic automatically find correlations between slow transactions and attributes:
- Go to APM → databrain-api → Latency correlations
- View attributes that correlate with slow requests
Cost Optimization
For Elastic Cloud:
- Optimize data retention: Use ILM to delete old data
- Reduce sampling: Not all traces need to be kept (a sampler sketch follows this list)
- Use smaller instance sizes: Start small, scale as needed
- Use hot-warm architecture: Move old data to cheaper storage
- Enable compression: Reduce storage by 50-70%
- Optimize shard size: 20-50GB per shard is ideal
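For the sampling tip above, a sketch using the standard OpenTelemetry sampler variables (assuming, as elsewhere in this guide, that Databrain honors the OTel SDK environment configuration):

```bash
export OTEL_TRACES_SAMPLER=parentbased_traceidratio
export OTEL_TRACES_SAMPLER_ARG=0.2   # keep roughly 20% of traces
```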
Support
- Elastic Documentation: https://www.elastic.co/guide/en/apm/
- Elastic Forum: https://discuss.elastic.co/
- Elastic Support: Available for paid subscriptions

