Integrating Databrain with Grafana Cloud
This guide explains how to send OpenTelemetry traces, metrics, and logs from your self-hosted Databrain instance to Grafana Cloud.
Prerequisites
- Databrain self-hosted version with OpenTelemetry support
- Grafana Cloud account (free tier available)
- Grafana Cloud API token
What You’ll Get
Grafana Cloud provides three integrated observability products:
| Product | Purpose | What It Shows |
|---|---|---|
| Grafana Tempo | Distributed tracing | Request traces with spans and timing |
| Grafana Loki | Log aggregation | Structured logs with trace correlation |
| Prometheus | Metrics | Request rates, latency histograms, error rates |
Configuration
1. Get Your Grafana Cloud Credentials
- Log into Grafana Cloud
- Navigate to your stack (e.g., yourstack.grafana.net)
- Go to Connections → Add new connection → OpenTelemetry
- Note the following:
  - Tempo endpoint: `tempo-<region>.grafana.net:443`
  - Prometheus endpoint: `prometheus-<region>.grafana.net:443`
  - Loki endpoint: `logs-<region>.grafana.net`
  - Instance ID: your unique instance identifier
  - API token: generate one under Access Policies
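It helps to park these values where the configuration below can read them. One option, assuming you use the Docker Compose setup that follows, is a `.env` file next to `docker-compose.yml` (the `glc_...` value is a placeholder):

```bash
# Compose reads .env automatically and substitutes ${GRAFANA_CLOUD_API_KEY}
cat > .env <<'EOF'
GRAFANA_CLOUD_API_KEY=glc_your-api-key-here
EOF
chmod 600 .env  # keep the credential out of world-readable files
```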
2. Using Grafana Alloy (Recommended)
Grafana Alloy is the easiest way to send telemetry to Grafana Cloud. It is Grafana's distribution of the OpenTelemetry Collector, so it stands in for a generic Collector deployment here.
Docker Compose Configuration
```yaml
services:
  grafana-alloy:
    image: grafana/alloy:latest
    restart: always
    volumes:
      - ./alloy-config.alloy:/etc/alloy/config.alloy
    ports:
      - "4317:4317"   # OTLP gRPC
      - "4318:4318"   # OTLP HTTP
      - "12345:12345" # Alloy UI
    environment:
      - GRAFANA_CLOUD_TEMPO_ENDPOINT=tempo-us-central1.grafana.net:443
      - GRAFANA_CLOUD_PROMETHEUS_ENDPOINT=prometheus-us-central1.grafana.net/api/prom/push
      - GRAFANA_CLOUD_LOKI_ENDPOINT=logs-prod-us-central1.grafana.net/loki/api/v1/push
      - GRAFANA_CLOUD_INSTANCE_ID=123456
      - GRAFANA_CLOUD_API_KEY=${GRAFANA_CLOUD_API_KEY}
    networks:
      - databrain

  databrainbackend:
    environment:
      OTEL_ENABLED: "true"
      OTEL_EXPORTER_OTLP_ENDPOINT: "http://grafana-alloy:4318"
      OTEL_SERVICE_NAME: "databrain-api"
      LOG_LEVEL: "info"
    depends_on:
      - grafana-alloy
```
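Compose substitutes `${GRAFANA_CLOUD_API_KEY}` from your shell or the `.env` file; you can check the rendered values before starting anything:

```bash
# Render the effective compose file with variables substituted
docker compose config | grep -A6 "environment:"
```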
Alloy Configuration
Create alloy-config.alloy:
```alloy
// OTLP receiver: accepts traces, metrics, and logs from Databrain
otelcol.receiver.otlp "default" {
  grpc {
    endpoint = "0.0.0.0:4317"
  }
  http {
    endpoint = "0.0.0.0:4318"
  }
  output {
    traces  = [otelcol.processor.batch.default.input]
    metrics = [otelcol.processor.batch.default.input]
    logs    = [otelcol.processor.batch.default.input]
  }
}

// Batch processor: groups telemetry before export
otelcol.processor.batch "default" {
  timeout         = "1s"
  send_batch_size = 1024
  output {
    traces  = [otelcol.exporter.otlp.tempo.input]
    metrics = [otelcol.exporter.prometheus.default.input]
    logs    = [otelcol.exporter.loki.default.input]
  }
}

// Tempo (traces) exporter
otelcol.exporter.otlp "tempo" {
  client {
    endpoint = env("GRAFANA_CLOUD_TEMPO_ENDPOINT")
    auth     = otelcol.auth.basic.grafana_cloud.handler
  }
}

// Prometheus (metrics) exporter
otelcol.exporter.prometheus "default" {
  forward_to = [prometheus.remote_write.grafana_cloud.receiver]
}

prometheus.remote_write "grafana_cloud" {
  endpoint {
    // Alloy strings are not interpolated, so build the URL by concatenation
    url = "https://" + env("GRAFANA_CLOUD_PROMETHEUS_ENDPOINT")
    basic_auth {
      username = env("GRAFANA_CLOUD_INSTANCE_ID")
      password = env("GRAFANA_CLOUD_API_KEY")
    }
  }
}

// Loki (logs) exporter
otelcol.exporter.loki "default" {
  forward_to = [loki.write.grafana_cloud.receiver]
}

loki.write "grafana_cloud" {
  endpoint {
    url = "https://" + env("GRAFANA_CLOUD_LOKI_ENDPOINT")
    basic_auth {
      username = env("GRAFANA_CLOUD_INSTANCE_ID")
      password = env("GRAFANA_CLOUD_API_KEY")
    }
  }
}

// Basic auth used by the OTLP (Tempo) exporter
otelcol.auth.basic "grafana_cloud" {
  username = env("GRAFANA_CLOUD_INSTANCE_ID")
  password = env("GRAFANA_CLOUD_API_KEY")
}
```
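Before restarting the stack, you can have the `alloy` CLI parse the file and flag syntax errors, then bring Alloy up and hit its readiness endpoint (this assumes the official grafana/alloy image, whose entrypoint is the alloy binary):

```bash
# Syntax-check the config; a parse error makes this exit non-zero
docker run --rm -v "$PWD/alloy-config.alloy:/config.alloy:ro" \
  grafana/alloy:latest fmt /config.alloy

# Start Alloy and confirm readiness on the UI port
docker compose up -d grafana-alloy
curl -s http://localhost:12345/-/ready
```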
3. Alternative: Direct OTLP to Grafana Cloud
You can also send data directly without Alloy (less flexible):
```yaml
services:
  databrainbackend:
    environment:
      OTEL_ENABLED: "true"
      OTEL_EXPORTER_OTLP_ENDPOINT: "https://tempo-us-central1.grafana.net:443"
      OTEL_SERVICE_NAME: "databrain-api"
      # Compose does not run shell commands inside environment values, so the
      # Basic token must be computed ahead of time (see below)
      OTEL_EXPORTER_OTLP_HEADERS: "authorization=Basic ${GRAFANA_CLOUD_BASIC_TOKEN}"
```
Note: This sends traces only. For metrics and logs, use Alloy.
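A minimal sketch for building that token (the `GRAFANA_CLOUD_BASIC_TOKEN` name is just the variable the example above expects, and `GRAFANA_INSTANCE_ID` / `GRAFANA_API_KEY` are placeholders for your credentials):

```bash
# Base64-encode "<instance-id>:<api-key>" for the OTLP Basic auth header;
# tr strips the newline so the token stays on a single line
export GRAFANA_CLOUD_BASIC_TOKEN="$(printf '%s:%s' "$GRAFANA_INSTANCE_ID" "$GRAFANA_API_KEY" | base64 | tr -d '\n')"
```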
4. Kubernetes with Grafana Agent
For Kubernetes, use the Grafana Agent Operator:
```yaml
apiVersion: v1
kind: Secret
metadata:
  name: grafana-cloud-credentials
  namespace: monitoring
type: Opaque
stringData:
  instance-id: "123456"
  api-key: "your-grafana-cloud-api-key"
---
apiVersion: monitoring.grafana.com/v1alpha1
kind: GrafanaAgent
metadata:
  name: grafana-agent
  namespace: monitoring
spec:
  image: grafana/agent:latest
  integrations:
    selector:
      matchLabels:
        agent: grafana-agent
  logs:
    instanceSelector:
      matchLabels:
        agent: grafana-agent
  metrics:
    instanceSelector:
      matchLabels:
        agent: grafana-agent
---
apiVersion: monitoring.grafana.com/v1alpha1
kind: Integration
metadata:
  name: opentelemetry
  namespace: monitoring
spec:
  name: otel
  type: otel
  config:
    receivers:
      otlp:
        protocols:
          http:
            endpoint: "0.0.0.0:4318"
          grpc:
            endpoint: "0.0.0.0:4317"
    # The otlp/tempo exporter below references this authenticator, so it
    # must be defined here and enabled under service.extensions
    extensions:
      basicauth/grafana_cloud:
        client_auth:
          username: "123456"
          password: "${GRAFANA_API_KEY}"
    exporters:
      otlp/tempo:
        endpoint: "tempo-us-central1.grafana.net:443"
        auth:
          authenticator: basicauth/grafana_cloud
      prometheusremotewrite:
        endpoint: "https://prometheus-us-central1.grafana.net/api/prom/push"
        basic_auth:
          username: "123456"
          password: "${GRAFANA_API_KEY}"
    service:
      extensions: [basicauth/grafana_cloud]
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [otlp/tempo]
        metrics:
          receivers: [otlp]
          exporters: [prometheusremotewrite]
```
Verification
1. Check Alloy Status
Visit the Alloy UI: http://localhost:12345
You should see:
- Active OTLP receivers
- Successful exports to Grafana Cloud
2. Generate Test Traffic
```bash
curl -X GET "https://your-databrain-instance.com/api/health"
```
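A single request can be easy to miss; a short loop (reusing the placeholder health endpoint above) gives the dashboards something to show:

```bash
# Send 20 requests, one per second, to produce a visible trace/metric stream
for i in $(seq 1 20); do
  curl -s -o /dev/null "https://your-databrain-instance.com/api/health"
  sleep 1
done
```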
3. View in Grafana Cloud
Tempo (Traces)
- Go to Explore in Grafana
- Select Tempo as the data source
- Query: `{resource.service.name="databrain-api"}`
- You should see traces within 1-2 minutes
Prometheus (Metrics)
- Go to Explore
- Select Prometheus as the data source
- Query: `rate(http_server_duration_milliseconds_count{service_name="databrain-api"}[5m])`
Loki (Logs)
- Go to Explore
- Select Loki as the data source
- Query: `{service_name="databrain-api"}`
- Click on any log line to see correlated traces
Creating Dashboards
Pre-built Dashboard
Import the OpenTelemetry APM dashboard:
- Go to Dashboards → New → Import
- Use dashboard ID: `19419` (OpenTelemetry APM)
- Select your Tempo and Prometheus data sources
- Filter by `service_name = databrain-api`
Custom Dashboard Example
Create a new dashboard with these panels:
Request Rate:
```promql
rate(http_server_duration_milliseconds_count{service_name="databrain-api"}[5m])
```
Average Latency:
```promql
rate(http_server_duration_milliseconds_sum{service_name="databrain-api"}[5m])
/
rate(http_server_duration_milliseconds_count{service_name="databrain-api"}[5m])
```
Error Rate:
```promql
rate(http_server_duration_milliseconds_count{service_name="databrain-api",http_status_code=~"5.."}[5m])
```
P95 Latency:
```promql
histogram_quantile(0.95,
  rate(http_server_duration_milliseconds_bucket{service_name="databrain-api"}[5m])
)
```
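If you want to sanity-check a panel query before building the dashboard, the hosted Prometheus-compatible API can be queried directly. This sketch assumes the same region and credentials as the remote-write setup above, with the query API living under the same /api/prom prefix as the push endpoint:

```bash
# POST a PromQL query to the Grafana Cloud Prometheus-compatible API
curl -s -u "${GRAFANA_CLOUD_INSTANCE_ID}:${GRAFANA_CLOUD_API_KEY}" \
  "https://prometheus-us-central1.grafana.net/api/prom/api/v1/query" \
  --data-urlencode 'query=rate(http_server_duration_milliseconds_count{service_name="databrain-api"}[5m])'
```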
Setting Up Alerts
Grafana Alerting
Create alert rules in Grafana:
High Error Rate:
- Go to Alerting → Alert rules → New alert rule
- Query:
```promql
(
  rate(http_server_duration_milliseconds_count{service_name="databrain-api",http_status_code=~"5.."}[5m])
  /
  rate(http_server_duration_milliseconds_count{service_name="databrain-api"}[5m])
) > 0.05
```
- Threshold: alert when the value exceeds 0.05 (5% errors)
- Add a notification channel (email, Slack, PagerDuty)
High Latency:
```promql
histogram_quantile(0.95,
  rate(http_server_duration_milliseconds_bucket{service_name="databrain-api"}[5m])
) > 2000
```
Trace Exemplars
Link metrics to traces using exemplars:
```alloy
// In alloy-config.alloy, extend the Prometheus remote-write endpoint
prometheus.remote_write "grafana_cloud" {
  endpoint {
    url = "https://" + env("GRAFANA_CLOUD_PROMETHEUS_ENDPOINT")
    basic_auth {
      username = env("GRAFANA_CLOUD_INSTANCE_ID")
      password = env("GRAFANA_CLOUD_API_KEY")
    }
    // Enable exemplars so metric samples link back to example traces
    send_exemplars = true
  }
}
```
Now when viewing metrics, you can click on data points to see example traces.
LogQL Queries
Use LogQL to query structured logs in Loki:
All errors:
```logql
{service_name="databrain-api"} | json | level="error"
```
Slow requests (>1s):
```logql
{service_name="databrain-api"} | json | duration > 1000
```
Logs for a specific user:
```logql
{service_name="databrain-api"} | json | userId="123"
```
Logs with trace context:
```logql
{service_name="databrain-api"} | json | trace_id!=""
```
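The same queries also work against the Loki HTTP API, which is handy for scripting; a sketch assuming the same credentials used by the Loki writer above:

```bash
# GET a LogQL query from Grafana Cloud Loki (-G turns the data into query params)
curl -s -G -u "${GRAFANA_CLOUD_INSTANCE_ID}:${GRAFANA_CLOUD_API_KEY}" \
  "https://logs-prod-us-central1.grafana.net/loki/api/v1/query_range" \
  --data-urlencode 'query={service_name="databrain-api"} | json | level="error"' \
  --data-urlencode 'limit=20'
```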
Troubleshooting
| Issue | Solution |
|---|---|
| No data in Grafana Cloud | Check Alloy logs: docker logs grafana-alloy |
| 401 Unauthorized | Verify Instance ID and API Key are correct |
| Connection timeout | Check firewall allows outbound HTTPS (443) |
| Missing traces in Tempo | Wait 2-3 minutes for ingestion delay |
| High costs | Implement sampling in Alloy configuration |
Debug Alloy
Check Alloy logs:
```bash
docker logs grafana-alloy | grep -i error
```
View Alloy's own pipeline metrics in the UI at http://localhost:12345.
Test Connectivity
```bash
# Test Tempo endpoint
curl -v https://tempo-us-central1.grafana.net:443

# Test with auth
curl -u "${INSTANCE_ID}:${API_KEY}" \
  https://tempo-us-central1.grafana.net:443/status
```
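It is also worth confirming that the local OTLP receiver accepts data at all before debugging the Grafana Cloud side; an empty OTLP/HTTP payload to the standard /v1/traces route should return HTTP 200:

```bash
# POST an empty OTLP payload to Alloy's HTTP receiver; expect 200
curl -s -o /dev/null -w "%{http_code}\n" \
  -X POST http://localhost:4318/v1/traces \
  -H "Content-Type: application/json" \
  -d '{"resourceSpans":[]}'
```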
Cost Optimization
Grafana Cloud free tier includes:
- Tempo: 50 GB traces/month
- Loki: 50 GB logs/month
- Prometheus: 10k active series
Tips to stay within limits:
- Sampling: sample traces in Alloy before export:
```alloy
otelcol.processor.probabilistic_sampler "default" {
  sampling_percentage = 10 // keep roughly 10% of traces
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}
```
- Filter low-value telemetry: drop health-check spans before they are exported:
```alloy
otelcol.processor.filter "exclude_healthchecks" {
  traces {
    span = ["attributes[\"http.target\"] == \"/health\""]
  }
  // Alloy processors require an output block; route the kept spans onward
  output {
    traces = [otelcol.exporter.otlp.tempo.input]
  }
}
```
- Set retention: Adjust in Grafana Cloud settings
- Tempo: Default 7 days
- Loki: Default 30 days
- Prometheus: Default 13 months
Best Practices
1. Use Service Graph
Enable service graph in Tempo to visualize service dependencies:
- Go to Tempo → Service Graph
- View request flow between services
- Identify bottlenecks and failures
2. Correlate Logs with Traces
Databrain's backend logs through Winston, which automatically includes the active trace context in each entry:
```json
{
  "level": "info",
  "message": "Processing order",
  "orderId": "123",
  "trace_id": "abc123",
  "span_id": "def456"
}
```
In Loki, click the trace ID to jump to the full trace in Tempo.
3. Use Grafana OnCall
Integrate with Grafana OnCall for advanced alert management:
- On-call rotations
- Escalation policies
- Alert grouping and deduplication
Support
For Databrain configuration issues, contact your Databrain support team.