Telephony monitoring and dashboards

You cannot improve what you do not measure. A production telephony system generates a constant stream of signals — call starts, call ends, failures, transfers, queue entries, agent handle times. This chapter shows you how to capture those signals as Call Detail Records, compute the metrics that matter, and build dashboards that give your operations team real-time visibility.

Call metrics, CDR, Dashboard

What you'll learn

How to generate Call Detail Records (CDR) from LiveKit webhook events
The key telephony metrics: answer rate, handle time, abandonment rate, and more
How to structure a monitoring dashboard for operations teams
How to set up alerts for anomalous conditions

Call Detail Records

A Call Detail Record captures everything that happened during a single call: who called, when, how long the call lasted, what the outcome was, and whether any transfers or errors occurred. CDRs are the foundation of all telephony analytics.

cdr.py (Python)

from dataclasses import dataclass, field
from datetime import datetime, timezone


def utc_now() -> datetime:
    # Timezone-aware UTC timestamp; datetime.utcnow() is deprecated since Python 3.12.
    return datetime.now(timezone.utc)


@dataclass
class CallDetailRecord:
    call_id: str
    direction: str                  # inbound or outbound
    caller_number: str
    callee_number: str
    trunk_id: str
    room_name: str
    start_time: datetime
    answer_time: datetime | None = None
    end_time: datetime | None = None
    duration_seconds: float = 0.0
    ring_duration_seconds: float = 0.0
    outcome: str = "in_progress"    # answered, no_answer, busy, failed, abandoned
    transfer_type: str | None = None
    transfer_target: str | None = None
    queue_wait_seconds: float = 0.0
    recording_url: str | None = None
    sip_error_code: int | None = None
    agent_id: str | None = None
    metadata: dict = field(default_factory=dict)


class CDRManager:
    def __init__(self, storage):
        self.storage = storage
        # Calls that have started but not yet ended, keyed by call ID.
        self.active_calls: dict[str, CallDetailRecord] = {}

    async def on_call_start(self, call_id: str, direction: str, caller: str, callee: str, trunk_id: str, room_name: str):
        cdr = CallDetailRecord(
            call_id=call_id,
            direction=direction,
            caller_number=caller,
            callee_number=callee,
            trunk_id=trunk_id,
            room_name=room_name,
            start_time=utc_now(),
        )
        self.active_calls[call_id] = cdr

    async def on_call_answered(self, call_id: str):
        cdr = self.active_calls.get(call_id)
        if cdr:
            cdr.answer_time = utc_now()
            cdr.ring_duration_seconds = (cdr.answer_time - cdr.start_time).total_seconds()
            cdr.outcome = "answered"

    async def on_call_end(self, call_id: str):
        # pop() makes this safe to call twice for the same call.
        cdr = self.active_calls.pop(call_id, None)
        if cdr:
            cdr.end_time = utc_now()
            if cdr.answer_time:
                cdr.duration_seconds = (cdr.end_time - cdr.answer_time).total_seconds()
            elif cdr.outcome == "in_progress":
                # Never answered and no other outcome was recorded, so do not
                # persist the record as still in progress.
                cdr.outcome = "no_answer"
            await self.storage.save(cdr)

What's happening

CDRs should be generated from LiveKit webhook events — participant_joined, participant_left, room_finished, and SIP-specific events. Each event updates the CDR in memory, and the complete record is persisted when the call ends. Store CDRs in a database that supports fast aggregation queries — PostgreSQL with time-based partitioning or a time-series database like TimescaleDB.
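
As a rough illustration, the sketch below routes already-parsed webhook payloads to the CDR lifecycle hooks. It rests on several assumptions that are not confirmed by this chapter: the payload has been verified and decoded into a dict with `event`, `room.name`, and `participant.kind` fields; SIP participants are reported with kind `"SIP"`; each call has its own room, so the room name can serve as the call ID; and a SIP participant joining marks the call as answered. `RecordingManager` is a hypothetical stand-in for `CDRManager` so the example runs on its own. Adjust the mapping to your own call flow.

```python
import asyncio


class RecordingManager:
    """Hypothetical stand-in for CDRManager that records which hooks fired."""

    def __init__(self):
        self.calls = []
        self._ended = set()

    async def on_call_answered(self, call_id: str):
        self.calls.append(("answered", call_id))

    async def on_call_end(self, call_id: str):
        # Idempotent, like CDRManager.on_call_end: a second end is ignored.
        if call_id not in self._ended:
            self._ended.add(call_id)
            self.calls.append(("end", call_id))


async def handle_webhook(manager, payload: dict) -> None:
    """Route one parsed webhook payload to the matching lifecycle hook."""
    event = payload.get("event")
    room_name = (payload.get("room") or {}).get("name")
    if room_name is None:
        return  # Not tied to a room, so nothing to record.

    participant = payload.get("participant") or {}
    is_sip = participant.get("kind") == "SIP"  # assumed JSON enum value

    if event == "participant_joined" and is_sip:
        await manager.on_call_answered(room_name)
    elif event == "participant_left" and is_sip:
        await manager.on_call_end(room_name)
    elif event == "room_finished":
        # Safety net in case participant_left was never delivered.
        await manager.on_call_end(room_name)
```

Because the end hook is idempotent, receiving both `participant_left` and `room_finished` for the same call persists only one record.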

Key telephony metrics

These are the metrics your operations team will check every day:

Metric	Formula	Target
Answer rate	Answered calls / Total inbound calls	Over 95%
Average handle time	Sum of call durations / Answered calls	Varies by use case
Abandonment rate	Callers who hung up in queue / Total queued calls	Under 5%
Average speed of answer	Sum of (ring time + queue time) / Answered calls	Under 30 seconds
Transfer rate	Transferred calls / Answered calls	Depends on agent capability
Error rate	Failed calls (SIP errors) / Total calls	Under 1%

metrics.py (Python)

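from datetime import datetime


class TelephonyMetrics:
    def __init__(self, cdr_storage):
        self.storage = cdr_storage

    async def compute_metrics(self, start: datetime, end: datetime) -> dict:
        cdrs = await self.storage.query(start=start, end=end)
        total = len(cdrs)
        if total == 0:
            return {}

        def ratio(part: list, whole: list) -> float:
            # Guard every rate against an empty denominator.
            return len(part) / len(whole) if whole else 0.0

        inbound = [c for c in cdrs if c.direction == "inbound"]
        answered = [c for c in cdrs if c.outcome == "answered"]
        abandoned = [c for c in cdrs if c.outcome == "abandoned"]
        failed = [c for c in cdrs if c.outcome == "failed"]
        transferred = [c for c in answered if c.transfer_type]
        # Assumption: a call counts as queued if it recorded queue wait time
        # or was abandoned while waiting.
        queued = [c for c in cdrs if c.queue_wait_seconds > 0 or c.outcome == "abandoned"]

        avg_handle_time = (
            sum(c.duration_seconds for c in answered) / len(answered)
            if answered else 0.0
        )
        avg_speed_of_answer = (
            sum(c.ring_duration_seconds + c.queue_wait_seconds for c in answered) / len(answered)
            if answered else 0.0
        )

        # Each rate uses the denominator from its formula: inbound calls for
        # answer rate, queued calls for abandonment, answered calls for transfers.
        return {
            "total_calls": total,
            "answer_rate": ratio([c for c in inbound if c.outcome == "answered"], inbound),
            "abandonment_rate": ratio(abandoned, queued),
            "transfer_rate": ratio(transferred, answered),
            "error_rate": ratio(failed, cdrs),
            "avg_handle_time_seconds": avg_handle_time,
            "avg_speed_of_answer_seconds": avg_speed_of_answer,
        }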
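
Both `CDRManager` and `TelephonyMetrics` rely on a storage object that exposes `save(cdr)` and `query(start=..., end=...)`. The chapter does not define that object, so here is a minimal in-memory sketch you could use for local testing. `InMemoryCDRStorage` is a hypothetical name, the half-open time window is a design assumption, and `SimpleNamespace` stands in for `CallDetailRecord` to keep the example self-contained.

```python
import asyncio
from datetime import datetime, timedelta, timezone
from types import SimpleNamespace


class InMemoryCDRStorage:
    """Hypothetical storage backend for local testing, not a production store."""

    def __init__(self):
        self._records = []

    async def save(self, cdr) -> None:
        self._records.append(cdr)

    async def query(self, start: datetime, end: datetime) -> list:
        # Half-open window [start, end) so adjacent windows never count a call twice.
        return [c for c in self._records if start <= c.start_time < end]


async def demo() -> int:
    storage = InMemoryCDRStorage()
    now = datetime.now(timezone.utc)
    # Only start_time matters for the query, so a bare namespace is enough here.
    await storage.save(SimpleNamespace(call_id="a", start_time=now - timedelta(hours=2)))
    await storage.save(SimpleNamespace(call_id="b", start_time=now - timedelta(minutes=5)))
    recent = await storage.query(start=now - timedelta(hours=1), end=now + timedelta(seconds=1))
    return len(recent)  # Number of calls that started within the last hour.
```

Keep timestamps consistently timezone-aware (or consistently naive) across storage and queries, because Python refuses to compare the two kinds.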

Dashboard patterns

A telephony dashboard should answer three questions at a glance: "Is the system healthy right now?", "How did we perform today?", and "Are there any trends I should worry about?"

Real-time panel

Show current active calls, agents available, callers in queue, and any active alerts. This panel updates every few seconds. Use WebSocket connections to push updates rather than polling.
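
One way to structure the push side is a small fan-out hub that hands the latest snapshot to every connected dashboard. The sketch below is transport-agnostic and only illustrates the idea: `DashboardHub` and its methods are hypothetical names, and the WebSocket handler itself (which would read from a subscriber queue and send each message to the browser) is left out.

```python
import asyncio
import json


class DashboardHub:
    """Hypothetical fan-out hub: one single-slot queue per connected dashboard."""

    def __init__(self):
        self._subscribers: set[asyncio.Queue] = set()

    def subscribe(self) -> asyncio.Queue:
        # maxsize=1 because a real-time panel only needs the newest snapshot.
        queue: asyncio.Queue = asyncio.Queue(maxsize=1)
        self._subscribers.add(queue)
        return queue

    def unsubscribe(self, queue: asyncio.Queue) -> None:
        self._subscribers.discard(queue)

    def publish(self, snapshot: dict) -> None:
        message = json.dumps(snapshot)
        for queue in self._subscribers:
            if queue.full():
                queue.get_nowait()  # Drop the stale snapshot for slow clients.
            queue.put_nowait(message)


async def demo() -> dict:
    hub = DashboardHub()
    queue = hub.subscribe()
    hub.publish({"active_calls": 3, "callers_in_queue": 1})
    hub.publish({"active_calls": 4, "callers_in_queue": 0})
    # The subscriber sees only the most recent snapshot.
    return json.loads(await queue.get())
```

Dropping stale snapshots keeps a slow browser tab from building up a backlog of outdated numbers.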

Today's summary

Display today's key metrics compared to the same day last week: answer rate, handle time, abandonment rate, and total call volume. Highlight in red any metric that falls outside its normal range.

Trend charts

Use line charts to show metrics over the past 7 and 30 days. Look for gradual degradation: a slowly rising abandonment rate often indicates a staffing problem before it becomes a crisis.

Alerting

Configure alerts for conditions that need immediate attention: error rate above 5%, abandonment rate above 10%, zero available agents, or a SIP trunk marked unhealthy. Route alerts to your on-call channel.
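
The conditions above can be expressed as a small rule check over a metrics snapshot. In this sketch the snapshot keys (`error_rate`, `abandonment_rate`, `available_agents`, `trunk_health`) and the function name are assumptions. The thresholds are the ones listed in the paragraph above, and routing the resulting messages to your on-call channel is left out.

```python
def evaluate_alerts(snapshot: dict) -> list[str]:
    """Return a human-readable message for each alert condition that is met."""
    alerts = []
    if snapshot.get("error_rate", 0.0) > 0.05:
        alerts.append("error rate above 5%")
    if snapshot.get("abandonment_rate", 0.0) > 0.10:
        alerts.append("abandonment rate above 10%")
    # A missing agent count is treated as "unknown" and does not raise an alert.
    if snapshot.get("available_agents", 1) == 0:
        alerts.append("no agents available")
    # trunk_health maps a trunk ID to True (healthy) or False (unhealthy).
    for trunk_id, healthy in snapshot.get("trunk_health", {}).items():
        if not healthy:
            alerts.append(f"SIP trunk {trunk_id} unhealthy")
    return alerts


active = evaluate_alerts({
    "error_rate": 0.07,
    "abandonment_rate": 0.02,
    "available_agents": 0,
    "trunk_health": {"trunk-a": True, "trunk-b": False},
})
```

In this example `active` contains three messages: the error rate, the missing agents, and the unhealthy trunk.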

Start with the basics

You do not need a custom dashboard on day one. Export CDRs to a database and use Grafana or a similar tool to build dashboards. The important thing is that the data is being captured correctly. Visualization can be refined over time.

Test your knowledge

Why should Call Detail Records (CDRs) be generated from webhook events rather than constructed after the call ends?

What you learned

Call Detail Records capture the full lifecycle of every call and are the foundation of telephony analytics.
The key metrics — answer rate, handle time, abandonment rate, speed of answer, and error rate — tell you whether your system is healthy.
Dashboards should show real-time status, daily summaries, and multi-day trends.
Alerting on key thresholds catches problems before they affect large numbers of callers.

Next up

In the final chapter, you will load test your telephony system to find its limits before your callers do.
