Executive summary
Coding tools like Claude Code promise to strap jetpacks onto developers, and the data backs it up. With 79% of Claude Code conversations being automation tasks, teams are seeing productivity gains that go far beyond simple code completion, and as models get better they automate more than they augment. During the June 12th cloud outage, I realized how dependent I'd become on Claude Code - even for fixing simple apostrophe errors in convoluted cURL commands.
Your team probably already understands the need for developer productivity tools, and with Claude Code calling the Anthropic API directly (or through your cloud provider), the key security concerns have been addressed. What you need now is telemetry to answer the important questions: Are developers actually using it? Which teams are getting the most value? What's our real cost per feature or bug fix?
This guide walks you through setting up observability with Claude Code using OpenTelemetry metrics. You'll get the complete setup - from Prometheus configuration to automated reporting - so your engineering leadership can make data-driven decisions about your AI tooling investment.
Setting up your measurement infrastructure
Quick Verification
Before diving into Prometheus, let's verify telemetry is working:
export CLAUDE_CODE_ENABLE_TELEMETRY=1
export OTEL_METRICS_EXPORTER=console
export OTEL_METRIC_EXPORT_INTERVAL=1000
claude -p "hello world"
You should see output like this:
=== Resource Attributes ===
{ 'service.name': 'claude-code', 'service.version': '1.0.17' }
===========================
{
  descriptor: {
    name: 'claude_code.cost.usage',
    type: 'COUNTER',
    description: 'Cost of the Claude Code session',
    unit: 'USD',
    valueType: 1
  },
  dataPointType: 3,
  dataPoints: [
    {
      attributes: {
        'user.id': '7b673f5715f2da49af2cdd341e0cad17fb3274f32d4ceed00d57d53da4e76fb2',
        'session.id': '092d99fc-ac17-4ee7-b310-57d387777c91',
        'model': 'claude-4-sonnet-20250514'
      },
      value: 0.000297
    }
  ]
}
Prometheus Setup
For actual ROI measurement, you need Prometheus. Console output is fine for debugging, but Prometheus gives you dashboards, historical data, and executive-ready visualizations.
First, grab the configuration files:
# Clone the configuration repo
git clone https://github.com/katchu11/claude-code-guide
cd claude-code-guide
# Start the OpenTelemetry Collector and Prometheus stack
docker-compose up -d
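Under the hood, the docker-compose stack wires an OpenTelemetry Collector to Prometheus. If you'd rather assemble it by hand, a minimal collector config looks roughly like this (a sketch of the standard OTLP-to-Prometheus pipeline, not necessarily the repo's exact file):
# otel-collector-config.yaml (sketch): receive OTLP from Claude Code,
# expose the metrics on :8889 for Prometheus to scrape
cat > otel-collector-config.yaml <<'EOF'
receivers:
  otlp:
    protocols:
      grpc:
        endpoint: 0.0.0.0:4317
exporters:
  prometheus:
    endpoint: 0.0.0.0:8889
service:
  pipelines:
    metrics:
      receivers: [otlp]
      exporters: [prometheus]
EOF
Point a Prometheus scrape job at the collector's :8889 endpoint and the claude_code_* series will show up on the next scrape.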
Configure Claude Code to send metrics to Prometheus:
# Enable telemetry
export CLAUDE_CODE_ENABLE_TELEMETRY=1
# Configure OTLP exporter
export OTEL_METRICS_EXPORTER=otlp
export OTEL_EXPORTER_OTLP_PROTOCOL=grpc
export OTEL_EXPORTER_OTLP_ENDPOINT=http://localhost:4317
# Optional: Set authentication if required
# export OTEL_EXPORTER_OTLP_HEADERS="Authorization=Bearer your-token"
# Run Claude Code
claude
Common infrastructure endpoints:
- Local development: http://localhost:4317
- Kubernetes: http://otel-collector.monitoring.svc.cluster.local:4317
- AWS: Your Application Load Balancer endpoint
- Datadog: https://otlp.datadoghq.com
- Honeycomb: https://api.honeycomb.io:443
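Once a session or two has run, confirm metrics are actually landing before you build dashboards. A quick check against the Prometheus HTTP API (assuming the stack's default port of 9090):
# List any cost series Prometheus has ingested so far
curl -s "http://localhost:9090/api/v1/query?query=claude_code_cost_usage_USD_total" | jq '.data.result'
# An empty result means nothing has arrived yet - check the collector's logs
# (the service name may differ in your docker-compose file)
docker-compose logs otel-collector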
Key Prometheus Queries
Once data is flowing, these queries give you the insights you need:
# Total cost across all sessions
sum(claude_code_cost_usage_USD_total)
{} 0.43
# Token usage by type (input/output)
sum(claude_code_token_usage_tokens_total) by (type)
{type="input"} 751
{type="output"} 1863
{type="cacheCreation"} 2136
{type="cacheRead"} 78465
# Token usage by user email
sum(claude_code_token_usage_tokens_total) by (user_email)
{user_email="user@company.com"} 83215
# Cost by model type
sum(claude_code_cost_usage_USD_total) by (model)
{model="claude-opus-4-20250514"} 0.429
{model="claude-3-5-haiku-20241022"} 0.0006
# Rate of token consumption over time (requires multiple data points)
rate(claude_code_token_usage_tokens_total[5m])
{type="input"} 2.5
{type="output"} 6.2
{type="cacheRead"} 261.5


Understanding Usage and Cost
Let's start with what Claude Code actually costs. Unlike fixed subscription tools, Claude Code has variable costs based on usage patterns.
Token Usage Patterns
The claude_code.token.usage metric breaks down by:
- Input tokens: prompts and code context you send to the model
- Output tokens: Claude's responses, including generated code
- Cache tokens: prompt-cache writes (cacheCreation) and reads (cacheRead)
- Model type: different models have different per-token costs
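Cache reads are billed at a fraction of the regular input-token rate, so a high read-to-creation ratio usually means your sessions are reusing context cheaply. A quick way to compute it from the Prometheus API (assuming the default local port):
# Cache read-to-creation ratio - higher generally means cheaper sessions
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(claude_code_token_usage_tokens_total{type="cacheRead"}) / sum(claude_code_token_usage_tokens_total{type="cacheCreation"})' \
  | jq -r '.data.result[0].value[1]'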
Usage Patterns by Plan Type:
Subscription Users (Pro/Max):
- Claude Pro: predictable $20/month
- Claude Max 5x: $100/month for heavy development
- Claude Max 20x: $200/month for intensive workflows
*Annual discounts may be available for subscription plans.
API Users (Pay-per-token) - Real Usage Patterns:
Pro tip: "ultrathink" triggers Claude's maximum thinking budget for complex analysis. The hierarchy is: "think" < "think hard" < "think harder" < "ultrathink" - each level increases the amount of tokens set to the reasoning budget. See Claude Code's best practices doc for more info!
Simple Development Tasks (Haiku):
- Quick questions, simple functions, code snippets
- 200-500 input tokens, 10-50 output tokens
- $0.0001-0.0003 per simple query
Real examples: Complex architectural analysis cost $0.34, simple "hello world" cost $0.0003
API Pricing by Model:
- Haiku 3.5: $0.80/$4.00 per million input/output tokens (fastest, cheapest)
- Sonnet 4: $3.00/$15.00 per million tokens (balanced performance)
- Opus 4: $15.00/$75.00 per million tokens (most capable)
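To make the pricing concrete, here's the back-of-the-envelope arithmetic for the sample session shown earlier (751 input tokens, 1,863 output tokens), assuming it ran entirely on Sonnet 4:
# cost = input_tokens/1M * input_rate + output_tokens/1M * output_rate
echo "scale=6; (751 / 1000000) * 3.00 + (1863 / 1000000) * 15.00" | bc
# => .030198 - about three cents, before any prompt-caching discounts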
Pro tip: For regular Claude Code usage, Max subscriptions are often more predictable than API costs. Individual developers can't control API optimizations like prompt caching, making subscriptions simpler for daily development work.
Note: Telemetry may require a clean Claude Code installation if you experience hanging. See troubleshooting.md for solutions.

Session Duration Analysis
Understanding how long developers stay in Claude Code sessions helps identify engagement patterns:
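Claude Code doesn't export a duration metric directly, but if your deployment emits the active-time and session counters, you can approximate average active time per session. The metric names below are assumptions based on the usual OTLP-to-Prometheus name mapping - verify them against your own exporter output before relying on the numbers:
# Approximate average active seconds per session (assumed exported names)
curl -sG 'http://localhost:9090/api/v1/query' \
  --data-urlencode 'query=sum(claude_code_active_time_total_seconds_total) / sum(claude_code_session_count_total)' \
  | jq -r '.data.result[0].value[1]'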

Understanding Developer Productivity Impact
Measuring developer productivity isn't straightforward, but Claude Code provides several useful metrics for tracking impact on your development workflows.
Core Productivity Metrics
The core counters map directly to development output (metric names as documented; check your exporter's output for the exact sanitized forms):
- claude_code.session.count: CLI sessions started
- claude_code.lines_of_code.count: lines of code added or removed
- claude_code.commit.count: commits created through Claude Code
- claude_code.pull_request.count: pull requests created through Claude Code
Additional Available Metrics
Claude Code provides these additional telemetry metrics:
- claude_code.cost.usage: estimated session cost in USD (the counter queried throughout this guide)
- claude_code.token.usage: tokens consumed, labeled by type (input, output, cacheRead, cacheCreation) and model
- claude_code.code_edit_tool.decision: accept/reject decisions on proposed code edits - the basis of the acceptance rates used later in this guide
Combining Metrics for Insights
While Claude Code provides specific telemetry, you can combine this with your existing development metrics:
- Session count + PR count: Measure conversion from Claude sessions to completed work
- Token usage patterns + commit frequency: Understand if high token usage correlates with code quality
- Code edit tool decisions: Track developer confidence in Claude's suggestions
- Cost per PR: Calculate by dividing total costs by PR count over time periods
Key insight: The built-in metrics work best when combined with your existing Git and project management data; the sketch below shows one way to join them.
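As one example, a rough cost-per-PR number can come from joining Prometheus with the GitHub CLI. This is a sketch, not a finished tool - it assumes gh is authenticated for your repo and Prometheus is on the default local port:
#!/bin/bash
# cost-per-pr.sh (illustrative): Claude Code spend per merged PR, last 30 days
COST=$(curl -sG "http://localhost:9090/api/v1/query" \
  --data-urlencode 'query=sum(increase(claude_code_cost_usage_USD_total[30d]))' \
  | jq -r '.data.result[0].value[1] // "0"')
PRS=$(gh pr list --state merged --limit 200 --json mergedAt \
  | jq '[.[] | select(.mergedAt > (now - 2592000 | todate))] | length')
echo "Cost per merged PR: \$$(echo "$COST / $PRS" | bc -l)"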
Before/After Analysis Framework
The most compelling productivity story comes from systematic before/after comparisons. Here's a practical framework:
Baseline Establishment (30 days minimum)
Phase 1: Pre-Claude Code Measurement
- Track existing development metrics from your Git/PM tools
- Establish statistical baselines with confidence intervals
- Document current development workflows and pain points
Phase 2: Implementation Period
- Deploy Claude Code with telemetry enabled
- Maintain existing workflow documentation
- Track adoption rates and initial usage patterns
Phase 3: Post-Implementation Analysis (30+ days)
- Compare same metrics with same statistical rigor
- Account for external factors (holidays, releases, team changes)
- Focus on trends over absolute numbers

Keep in mind that it's often best to keep it simple!
Based on Anthropic's research on software development impact:
- 79% of Claude Code conversations are automation tasks (vs. 49% on Claude.ai)
- That automation share splits into feedback loop interactions (35.8% - iterative development and debugging) and directive conversations (43.8% - direct task completion)
- JavaScript/TypeScript dominates usage at 31% of queries, followed by HTML/CSS (28%) and Python (14%)
Important Note: The metrics and examples in this guide are generated from limited sample data and hypothetical scenarios. Development workflows vary significantly across:
- Team practices: Some teams prefer frequent small commits, others batch larger changes
- Project types: Frontend development patterns differ from backend API work
- Developer experience: Senior developers may use Claude for architecture review, while junior developers focus on implementation assistance
- Organizational culture: Code review practices, testing requirements, and deployment processes all influence usage patterns
Your organization should define what productivity means in your context. Claude Code provides the measurement tools - the interpretation depends on your team's goals and workflows.
Key Telemetry Combinations for Analysis
# Total cost for analysis period (real data from 6 sessions)
sum(claude_code_cost_usage_USD_total)
{} 0.43
# Total tokens consumed by type - shows cache efficiency
sum(claude_code_token_usage_tokens_total) by (type)
{type="input"} 751
{type="output"} 1863
{type="cacheCreation"} 2136
{type="cacheRead"} 78465
# Cost per user for budget allocation
sum(claude_code_cost_usage_USD_total) by (user_email)
# Cost breakdown by model complexity
sum(claude_code_cost_usage_USD_total) by (model)
{model="claude-opus-4-20250514"} 0.429
{model="claude-3-5-haiku-20241022"} 0.0006 # Simple queries
# Average cost per session: $0.43 / 6 sessions ≈ $0.072
sum(claude_code_cost_usage_USD_total) / count(claude_code_cost_usage_USD_total)
# Token consumption rate by type (graph this to see time-of-day patterns)
sum(rate(claude_code_token_usage_tokens_total[1h])) by (type)
# Cache efficiency: ~78k cache reads vs ~2k cache creations (roughly 37:1)
Your actual usage patterns will depend on:
- Workflow complexity: Simple bug fixes vs architectural reviews
- Developer preferences: Some use "ultrathink" for complex analysis, others prefer iterative conversations
- Project phase: Initial development vs maintenance vs refactoring work
- Team collaboration patterns: Individual work vs pair programming with Claude
Use this telemetry data as input for your organization's specific value assessment framework.
Brownie Points: Automated Reporting
Here's where it gets fun. You can ask Claude to generate reports combining your telemetry data with project management metrics.
Setting up Linear MCP Integration
Claude Code has built-in MCP support. Set up the Linear MCP server using the CLI:
# Add Linear MCP server to Claude Code
claude mcp add linear -s user -- npx -y mcp-remote https://mcp.linear.app/sse
# Verify it's working
claude mcp list
# Restart Claude Code to activate the integration
Alternatively, you can configure it directly in your .claude.json file:
{
  "mcpServers": {
    "linear": {
      "type": "stdio",
      "command": "npx",
      "args": ["-y", "mcp-remote", "https://mcp.linear.app/sse"]
    }
  }
}
Generating Automated Reports
We've created a comprehensive prompt template for generating reports. Here's a simplified example:
claude -p "Using the Linear MCP, analyze our team's velocity for the last sprint and combine with these Claude Code metrics:
{
  \"claude_code_sessions\": 42,
  \"total_cost\": 103.45,
  \"avg_session_duration\": \"28.5 minutes\",
  \"tool_acceptance_rates\": {\"Edit\": 0.81, \"MultiEdit\": 0.92}
}
Generate a comprehensive productivity report with visualizations."
Dynamic Metric Fetching Example:
#!/bin/bash
# fetch-claude-metrics.sh - Automate report generation with live data
# Fetch metrics from Prometheus
TOTAL_COST=$(curl -s "http://localhost:9090/api/v1/query?query=sum(claude_code_cost_usage_USD_total)" | jq -r '.data.result[0].value[1] // "0"')
INPUT_TOKENS=$(curl -s "http://localhost:9090/api/v1/query?query=sum(claude_code_token_usage_tokens_total{type=\"input\"})" | jq -r '.data.result[0].value[1] // "0"')
OUTPUT_TOKENS=$(curl -s "http://localhost:9090/api/v1/query?query=sum(claude_code_token_usage_tokens_total{type=\"output\"})" | jq -r '.data.result[0].value[1] // "0"')
# Generate report with live data
claude -p "Using the Linear MCP, analyze our team's velocity and combine with these live Claude Code metrics:
{
  \"total_cost\": $TOTAL_COST,
  \"input_tokens\": $INPUT_TOKENS,
  \"output_tokens\": $OUTPUT_TOKENS,
  \"cost_per_token\": $(echo "$TOTAL_COST / ($INPUT_TOKENS + $OUTPUT_TOKENS)" | bc -l)
}
Generate a comprehensive productivity report with visualizations."
For the complete prompt with all metrics and detailed instructions, see report-generation-prompt.md.
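To make the report recurring, a cron entry pointing at the script is usually enough (adjust the paths to taste; note that crontab requires escaping % signs):
# Run every Monday at 9am and save a dated report
0 9 * * 1 /path/to/fetch-claude-metrics.sh > /path/to/reports/week-$(date +\%F).md 2>&1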
Sample Report Output
Using the prompt template above, Claude generates comprehensive reports like this:
Executive Summary
This week's analysis shows a 17% improvement in commit velocity for teams using Claude Code effectively. Total cost was $103.45 across 15 active Linear issues.
Key Insights Generated
- Development velocity improved 17% for teams with high Claude Code adoption
- Average session duration of 28.5 minutes falls within optimal productivity range
- MultiEdit tool shows 92% acceptance rate but low usage - opportunity for training
- Bug fix tickets (KAS-10, KAS-14) would benefit from increased MultiEdit usage
What the Report Includes:
- Mermaid visualizations for tool usage, costs, and productivity metrics
- Real Linear ticket analysis (KAS-10 through KAS-15)
- Actionable recommendations based on actual usage patterns
- Cost breakdowns and ROI calculations
For the complete generated report with all visualizations and metrics, see sample-report-output.md.
Organization Deployment
Rolling out telemetry across your organization is straightforward with managed settings.
Managed Settings Configuration
Create a managed-settings.json file that forces the same environment variables you used locally:
{
  "env": {
    "CLAUDE_CODE_ENABLE_TELEMETRY": "1",
    "OTEL_METRICS_EXPORTER": "otlp",
    "OTEL_LOGS_EXPORTER": "otlp",
    "OTEL_EXPORTER_OTLP_PROTOCOL": "grpc",
    "OTEL_EXPORTER_OTLP_ENDPOINT": "https://your-otel-collector.company.com:4317",
    "OTEL_EXPORTER_OTLP_HEADERS": "Authorization=Bearer ${OTEL_TOKEN}"
  }
}
Deploy via your MDM solution or configuration management tool. Because managed settings take precedence over user settings, users can't disable telemetry, ensuring consistent data collection.
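As a concrete example, a configuration-management push on Linux typically lands the file in the system-wide location (macOS uses /Library/Application Support/ClaudeCode/ instead - check the deployment docs linked below for your platform):
# Install managed settings system-wide on Linux
sudo mkdir -p /etc/claude-code
sudo cp managed-settings.json /etc/claude-code/managed-settings.json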
For detailed deployment options, see the Claude Code deployment documentation.
Conclusion
Measuring Claude Code's ROI doesn't have to be complicated. The key steps are:
- Enable telemetry immediately - Even console output gives you basic insights
- Set up Prometheus for real measurement - Dashboards and queries unlock the full picture
- Track both cost and productivity metrics - You need both sides of the ROI equation
- Use automated reporting for regular reviews - Let Claude generate the insights from your data