Chapter 320m

Citation in spoken conversation

A voice agent that answers questions from a knowledge base is only trustworthy if users can verify where the information came from. Citation is not just a nice-to-have — in regulated industries, you may be legally required to identify the source of advice. Even in general customer service, saying "According to our return policy document..." is vastly more credible than stating facts without attribution.

What you'll learn

  • How to track which sources informed each agent response
  • How to instruct the LLM to cite sources naturally in voice responses
  • How to store citation metadata for audit trails
  • Patterns for presenting citations in voice versus text channels
  • How reranked hybrid search results carry source metadata through to citations

Source tracking through the pipeline

Citation starts at ingestion time. Every chunk stored in your vector database must carry metadata identifying its source — the document name, section, page number, or URL. When retrieval returns results, this metadata flows through to the LLM prompt so the model knows where each piece of context came from.

source_metadata.py (Python)
from dataclasses import dataclass

@dataclass
class ChunkMetadata:
    source_document: str
    section: str
    page: int | None = None
    url: str | None = None
    last_updated: str | None = None

@dataclass
class SearchResult:
    text: str
    metadata: ChunkMetadata
    score: float

def format_context_with_sources(results: list[SearchResult]) -> str:
    """Format retrieved results with source labels for the LLM prompt."""
    sections = []
    for i, r in enumerate(results, 1):
        source_label = r.metadata.source_document
        if r.metadata.section:
            source_label += f", Section: {r.metadata.section}"
        if r.metadata.page:
            source_label += f", Page {r.metadata.page}"
        sections.append(f"[Source {i}: {source_label}]\n{r.text}")
    return "\n---\n".join(sections)
source_metadata.ts (TypeScript)
interface ChunkMetadata {
  sourceDocument: string;
  section: string;
  page?: number;
  url?: string;
  lastUpdated?: string;
}

interface SearchResult {
  text: string;
  metadata: ChunkMetadata;
  score: number;
}

function formatContextWithSources(results: SearchResult[]): string {
  return results
    .map((r, i) => {
      let label = r.metadata.sourceDocument;
      if (r.metadata.section) label += `, Section: ${r.metadata.section}`;
      if (r.metadata.page) label += `, Page ${r.metadata.page}`;
      return `[Source ${i + 1}: ${label}]\n${r.text}`;
    })
    .join("\n---\n");
}

Instructing the LLM to cite sources

The LLM will not cite sources unless you tell it to. Add explicit citation instructions to your agent's system prompt, and format the retrieved context with clear source labels:

citing_agent.py (Python)
from livekit.agents import Agent

CITATION_INSTRUCTIONS = """You are a knowledge assistant with access to company documents.

Rules for citations:
1. When answering from provided context, mention the source naturally.
 Good: "According to our return policy, you have 30 days to return items."
 Good: "Based on the employee handbook, section 4.2, vacation requests require..."
 Bad: "[Source 1] states that..." (too robotic for voice)
2. If multiple sources support your answer, cite the most specific one.
3. If the context does not contain the answer, say so honestly.
4. Never fabricate a source that was not provided in the context.
"""

class CitingRAGAgent(Agent):
    def __init__(self, vector_store):
        super().__init__(instructions=CITATION_INSTRUCTIONS)
        self.vector_store = vector_store
        self.last_sources: list[dict] = []

    async def on_user_turn_completed(self, turn_ctx, new_message):
        query = new_message.text_content
        results = await self.vector_store.search(query, top_k=3)
        relevant = [r for r in results if r.score >= 0.7]

        if relevant:
            context = format_context_with_sources(relevant)
            turn_ctx.add_message(
                role="system", content=f"Retrieved context:\n{context}"
            )

            # Keep source metadata on the agent for audit logging
            self.last_sources = [
                {
                    "document": r.metadata.source_document,
                    "section": r.metadata.section,
                    "score": r.score,
                }
                for r in relevant
            ]

        await super().on_user_turn_completed(turn_ctx, new_message)
What's happening

The citation instructions tell the LLM to weave source references into its response naturally. For voice agents, this means phrases like "According to our policy..." or "Based on the product documentation..." rather than bracketed citation numbers. The system message includes labeled sources so the model knows which document to reference.

Voice-friendly citation patterns

Voice is different from text. A user cannot click a footnote or hover over a citation number. Citations in voice must be woven into the natural flow of speech:

Pattern              | Example                                         | Best for
Document reference   | "According to our return policy..."             | Policy documents
Section reference    | "In section 3 of the employee handbook..."      | Long documents with sections
Temporal reference   | "Based on our pricing update from January..."   | Time-sensitive information
Confidence qualifier | "Our documentation indicates that..."           | Lower-confidence matches
Explicit disclaimer  | "I could not find a specific policy on that..." | No relevant results
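These patterns can also be chosen programmatically from the retrieval score. A minimal sketch; the thresholds and phrasing here are illustrative assumptions, not values mandated by this chapter:

```python
def citation_opener(source_document: str, score: float) -> str:
    """Pick a voice-friendly citation opener based on retrieval confidence."""
    if score >= 0.85:
        # Strong match: name the document directly.
        return f"According to our {source_document}, "
    if score >= 0.7:
        # Usable but weaker match: soften with a qualifier.
        return f"Our {source_document} indicates that "
    # Below the relevance cutoff: disclose the gap instead of guessing.
    return "I could not find a specific policy on that, but "

print(citation_opener("return policy", 0.92))
# prints "According to our return policy, "
```

The opener is prepended to the generated answer, so the spoken response leads with exactly one attribution.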

Do not over-cite

In voice, every citation adds words the user must listen to. Cite the source once at the beginning of your answer, not for every sentence. If the entire answer comes from one document, one citation is enough.

Handling no-source scenarios

When retrieval returns nothing relevant, the agent must be transparent. Do not let the model fall back to its training data and present it as if it came from your knowledge base:

no_source_handling.py (Python)
async def on_user_turn_completed(self, turn_ctx, new_message):
    query = new_message.text_content
    results = await self.vector_store.search(query, top_k=5)
    relevant = [r for r in results if r.score >= 0.7]

    if not relevant:
        turn_ctx.add_message(
            role="system",
            content=(
                "IMPORTANT: No relevant documents were found for this query. "
                "Do NOT answer from general knowledge as if it came from our documents. "
                "Instead, tell the user you do not have specific information on that topic "
                "and suggest they contact support or check our website."
            ),
        )
    else:
        context = format_context_with_sources(relevant)
        turn_ctx.add_message(role="system", content=f"Retrieved context:\n{context}")

    await super().on_user_turn_completed(turn_ctx, new_message)

Hallucination risk

The biggest risk with RAG citation is the model citing a source that does not actually support its claim. Mitigate this by keeping chunks focused, filtering by relevance score, and including explicit instructions not to fabricate citations. Test your agent by asking questions that are close to but not covered by your knowledge base.
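One cheap automated guard, sketched here as a hypothetical helper rather than part of any framework: scan the response for document names from your catalog and flag any that were never actually retrieved.

```python
def fabricated_citations(
    response: str, provided_docs: list[str], catalog: list[str]
) -> list[str]:
    """Return catalog document names cited in the response but never retrieved."""
    cited = [doc for doc in catalog if doc.lower() in response.lower()]
    provided = {doc.lower() for doc in provided_docs}
    return [doc for doc in cited if doc.lower() not in provided]

catalog = ["Return Policy", "Employee Handbook", "Pricing Guide"]
flags = fabricated_citations(
    "According to the pricing guide, prices rose in January.",
    provided_docs=["Return Policy"],
    catalog=catalog,
)
print(flags)
# ['Pricing Guide']
```

Substring matching is crude; a production check would work from structured citation output, but even this catches the worst failures in a test suite.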

Hybrid search and citation metadata

When you use hybrid search with reranking (covered in the architecture chapter), the reranked results carry source metadata through the pipeline. This means your citation system works identically whether results come from semantic search, keyword search, or a fused and reranked combination:

hybrid_citation.py (Python)
class HybridCitingAgent(Agent):
    def __init__(self, hybrid_store, reranker, min_score=0.7):
        super().__init__(instructions=CITATION_INSTRUCTIONS)
        self.hybrid_store = hybrid_store
        self.reranker = reranker
        self.min_score = min_score
        self.last_sources: list[dict] = []

    async def on_user_turn_completed(self, turn_ctx, new_message):
        query = new_message.text_content

        # Hybrid search returns candidates with source metadata intact
        candidates = await self.hybrid_store.search(query, top_k=10)
        candidate_dicts = [
            {"content": r.text, "source": r.metadata.source_document,
             "section": r.metadata.section, "page": r.metadata.page}
            for r in candidates
        ]

        # Reranker selects the top results; metadata passes through
        reranked = await self.reranker.rerank(query, candidate_dicts, top_k=3)
        # Drop weak matches using the configured threshold
        reranked = [r for r in reranked if r["rerank_score"] >= self.min_score]

        if reranked:
            context = "\n---\n".join(
                f"[Source: {r['source']}, Section: {r['section']}]\n{r['content']}"
                for r in reranked
            )
            turn_ctx.add_message(role="system", content=f"Retrieved context:\n{context}")

            # Keep source metadata on the agent for audit logging
            self.last_sources = [
                {"document": r["source"], "section": r["section"],
                 "rerank_score": r["rerank_score"]}
                for r in reranked
            ]

        await super().on_user_turn_completed(turn_ctx, new_message)
What's happening

The key insight is that source metadata flows through every stage: ingestion attaches it to chunks, hybrid search preserves it in results, RRF fusion carries it through merging, and the reranker passes it along with relevance scores. Your citation system does not need to know how the results were found — it only needs the metadata that arrives with them.
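To make the fusion step concrete, here is a sketch of reciprocal rank fusion over two ranked lists. The dict shape and the conventional k = 60 constant are illustrative assumptions; the point is that metadata rides along untouched:

```python
def rrf_fuse(semantic: list[dict], keyword: list[dict], k: int = 60) -> list[dict]:
    """Merge two ranked result lists with reciprocal rank fusion (RRF).

    Fusion only re-scores and reorders results, so whatever source
    metadata each result dict carries survives the merge untouched.
    """
    scores: dict[str, float] = {}
    by_id: dict[str, dict] = {}
    for ranking in (semantic, keyword):
        for rank, result in enumerate(ranking, 1):
            rid = result["id"]
            by_id[rid] = result
            scores[rid] = scores.get(rid, 0.0) + 1.0 / (k + rank)
    return [by_id[rid] for rid in sorted(scores, key=scores.get, reverse=True)]

semantic = [{"id": "a", "source": "Return Policy"}, {"id": "b", "source": "Handbook"}]
keyword = [{"id": "b", "source": "Handbook"}, {"id": "c", "source": "Pricing Guide"}]
fused = rrf_fuse(semantic, keyword)
print([r["id"] for r in fused])
# ['b', 'a', 'c']
```

The result that appears in both rankings ("b") wins, and its `source` field arrives in the fused output exactly as it was stored at ingestion.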

Audit trail and logging

For compliance and debugging, log which sources were used for each response. This creates an audit trail that can answer: "Why did the agent say X?"

citation_logger.py (Python)
import logging
from datetime import datetime, timezone

logger = logging.getLogger("citations")

class CitationLogger:
    def __init__(self, storage_backend):
        self.storage = storage_backend

    async def log_interaction(
        self,
        session_id: str,
        user_query: str,
        sources: list[dict],
        agent_response: str,
    ):
        record = {
            "session_id": session_id,
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "query": user_query,
            "sources_used": sources,
            "response": agent_response,
            "source_count": len(sources),
            "avg_relevance": (
                sum(s["score"] for s in sources) / len(sources) if sources else 0
            ),
        }
        await self.storage.insert("citation_audit", record)
        logger.info(f"Session {session_id}: {len(sources)} sources cited")
citation_logger.ts (TypeScript)
interface CitationRecord {
  sessionId: string;
  timestamp: string;
  query: string;
  sourcesUsed: { document: string; section: string; score: number }[];
  response: string;
}

interface AuditStorage {
  insert(table: string, record: CitationRecord): Promise<void>;
}

class CitationLogger {
  constructor(private storage: AuditStorage) {}

  async logInteraction(
    sessionId: string,
    userQuery: string,
    sources: { document: string; section: string; score: number }[],
    agentResponse: string
  ): Promise<void> {
    const record: CitationRecord = {
      sessionId,
      timestamp: new Date().toISOString(),
      query: userQuery,
      sourcesUsed: sources,
      response: agentResponse,
    };
    await this.storage.insert("citation_audit", record);
  }
}
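The `storage_backend` these loggers receive only needs an async `insert(table, record)` method; that interface is this chapter's assumption, not a specific database client. A throwaway in-memory version (Python, to match the rest of the chapter) is handy for local testing:

```python
import asyncio

class MemoryBackend:
    """In-memory stand-in for a real database: one list of records per table."""

    def __init__(self):
        self.tables: dict[str, list[dict]] = {}

    async def insert(self, table: str, record: dict) -> None:
        self.tables.setdefault(table, []).append(record)

async def demo() -> list[dict]:
    backend = MemoryBackend()
    await backend.insert(
        "citation_audit",
        {"session_id": "s1", "query": "return window?", "source_count": 1},
    )
    return backend.tables["citation_audit"]

records = asyncio.run(demo())
print(records[0]["session_id"])
# s1
```

Swap in a real database client with the same `insert` signature when you move past local testing.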

Test your knowledge

Why should a voice agent use natural language phrases like 'According to our return policy...' instead of bracketed citation numbers like '[Source 1]'?

What you learned

  • Source metadata must be attached to chunks at ingestion time and carried through retrieval to the LLM prompt.
  • Voice-friendly citations use natural language phrases rather than bracketed numbers or footnotes.
  • Hybrid search with reranking preserves source metadata through every stage — fusion, reranking, and citation all work on the same metadata.
  • Audit logging creates a traceable record of which sources informed each response, essential for compliance.
  • When no relevant sources are found, the agent must be transparent rather than hallucinating citations.

Next up

In the final chapter, you will learn production RAG patterns — caching, monitoring, and quality metrics to keep your retrieval system reliable at scale.

Concepts covered
  • Voice-friendly citations
  • Source tracking
  • Hallucination prevention
  • Audit trails