Chapter 2

MCP tools for structured data

MCP tool integration

RAG retrieves unstructured knowledge from documents. But voice agents also need access to structured data — customer records in a CRM, events from a calendar, rows from a database. The Model Context Protocol (MCP) provides a standardized way to expose these external systems as tools that the LLM can call during a conversation, without you having to write custom function-calling boilerplate for each integration.


What you'll learn

  • What MCP is and how it standardizes tool integration
  • How to set up an MCP tool server that exposes external APIs
  • How to connect MCP tools to a LiveKit voice agent
  • How to combine MCP tools with RAG for comprehensive data access

What is MCP

The Model Context Protocol is an open standard for connecting LLMs to external data sources and tools. Instead of writing custom function definitions for every API your agent needs, you run an MCP server that advertises its capabilities — tools, resources, and prompts — in a format any MCP-compatible client can consume.
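Concretely, the advertised capabilities are just structured metadata. The snippet below is a hand-written sketch of the shape of a tools/list result, modeled on the MCP spec's Tool definition (name, description, inputSchema); the tool and its values are illustrative, not from a real server:

```python
# Illustrative tools/list result, shaped like the MCP spec's Tool type
# (name, description, inputSchema). Values here are made up.
tools_list_result = {
    "tools": [
        {
            "name": "lookup_customer",
            "description": "Look up a customer by email address.",
            "inputSchema": {
                "type": "object",
                "properties": {"email": {"type": "string"}},
                "required": ["email"],
            },
        },
    ],
}

# Any MCP client can discover capabilities from this metadata alone,
# with no integration-specific code:
names = [t["name"] for t in tools_list_result["tools"]]
```

Because `inputSchema` is plain JSON Schema, a client can validate arguments and present the tool to an LLM without knowing anything about the API behind it.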

The architecture is straightforward:

1. MCP server exposes tools

An MCP server wraps your external APIs (database queries, CRM lookups, calendar operations) as named tools with typed parameters and descriptions. The server runs as a separate process.

2. Agent connects as MCP client

Your LiveKit agent connects to the MCP server using the MCP plugin. The plugin discovers available tools and registers them with the LLM.

3. LLM calls tools during conversation

When the user asks a question that requires external data, the LLM decides which tool to call, the MCP client forwards the call to the server, and the result is injected back into the conversation.
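On the wire, steps 2 and 3 travel as JSON-RPC 2.0 messages. The frames below are a hand-written sketch of one tools/call round trip; the tool name, arguments, and result payload are illustrative:

```python
import json

# Sketch of a JSON-RPC 2.0 tools/call exchange, per the MCP spec.
# The tool name and payload are illustrative, not from a real server.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "lookup_customer",
        "arguments": {"email": "dana@example.com"},
    },
}

response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [
            {"type": "text", "text": json.dumps({"name": "Dana", "plan": "pro"})}
        ]
    },
}

# The client matches the response to the request by id, then hands the
# text content back to the LLM as the tool-call result.
assert response["id"] == request["id"]
tool_result = json.loads(response["result"]["content"][0]["text"])
```

Nothing in this exchange is specific to one integration: the same client code works for any tool any MCP server advertises.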

Setting up an MCP tool server

Here is a minimal MCP server that exposes a customer lookup and an appointment scheduler:

mcp_server.py (Python)
from mcp.server import Server
from mcp.types import Tool, TextContent
import json

server = Server("customer-tools")

@server.tool("lookup_customer")
async def lookup_customer(email: str) -> list[TextContent]:
  """Look up a customer by email address. Returns name, plan, and account status."""
  # In production, query your actual CRM or database
  customer = await db.customers.find_one({"email": email})
  if not customer:
      return [TextContent(type="text", text=f"No customer found with email {email}")]
  return [TextContent(
      type="text",
      text=json.dumps({
          "name": customer["name"],
          "plan": customer["plan"],
          "status": customer["status"],
          "since": customer["created_at"].isoformat(),
      }),
  )]

@server.tool("schedule_appointment")
async def schedule_appointment(
  customer_email: str,
  date: str,
  time: str,
  reason: str,
) -> list[TextContent]:
  """Schedule an appointment for a customer. Date format: YYYY-MM-DD, time format: HH:MM."""
  appointment = await calendar.create_event(
      attendee=customer_email,
      date=date,
      time=time,
      description=reason,
  )
  return [TextContent(
      type="text",
      text=f"Appointment scheduled for {date} at {time}. Confirmation: {appointment.id}",
  )]

if __name__ == "__main__":
  server.run(transport="sse", port=3001)
mcp_server.ts (TypeScript)
import { Server } from "@modelcontextprotocol/sdk/server/index.js";
import { SSEServerTransport } from "@modelcontextprotocol/sdk/server/sse.js";
import {
  ListToolsRequestSchema,
  CallToolRequestSchema,
} from "@modelcontextprotocol/sdk/types.js";

const server = new Server(
  { name: "customer-tools", version: "1.0.0" },
  { capabilities: { tools: {} } }
);

server.setRequestHandler(ListToolsRequestSchema, async () => ({
  tools: [
    {
      name: "lookup_customer",
      description: "Look up a customer by email address",
      inputSchema: {
        type: "object",
        properties: { email: { type: "string" } },
        required: ["email"],
      },
    },
    {
      name: "schedule_appointment",
      description: "Schedule an appointment for a customer",
      inputSchema: {
        type: "object",
        properties: {
          customer_email: { type: "string" },
          date: { type: "string", description: "YYYY-MM-DD" },
          time: { type: "string", description: "HH:MM" },
          reason: { type: "string" },
        },
        required: ["customer_email", "date", "time", "reason"],
      },
    },
  ],
}));

server.setRequestHandler(CallToolRequestSchema, async (request) => {
  const { name, arguments: args } = request.params;
  if (name === "lookup_customer") {
    const customer = await db.customers.findOne({ email: args.email });
    return {
      content: [{ type: "text", text: JSON.stringify(customer) }],
    };
  }
  // Handle other tools...
  throw new Error(`Unknown tool: ${name}`);
});
What's happening

The MCP server defines tools with typed parameters and descriptions. When an LLM decides to call a tool, the MCP client serializes the arguments, sends them to the server, and returns the result. The tool descriptions are critical — the LLM uses them to decide when and how to call each tool.
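One way to picture the client side of this handshake: the plugin maps each advertised MCP tool onto whatever function-calling schema its LLM expects. The helper below is a rough sketch of that translation (the function name and the OpenAI-style target format are assumptions for illustration, not LiveKit's actual internals):

```python
def mcp_tool_to_llm_schema(tool: dict) -> dict:
    """Sketch: map an MCP tool definition onto an OpenAI-style
    function-calling schema.

    MCP's inputSchema is already JSON Schema, so it can be passed
    through as the function's parameters unchanged.
    """
    return {
        "type": "function",
        "function": {
            "name": tool["name"],
            "description": tool["description"],
            "parameters": tool["inputSchema"],
        },
    }

# Example: translate one advertised tool (illustrative values).
schema = mcp_tool_to_llm_schema({
    "name": "lookup_customer",
    "description": "Look up a customer by email address.",
    "inputSchema": {
        "type": "object",
        "properties": {"email": {"type": "string"}},
        "required": ["email"],
    },
})
```

Because the mapping is mechanical, adding a tool to the MCP server makes it available to the LLM with no agent-side code changes.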

Connecting MCP to your voice agent

LiveKit provides an MCP plugin that handles the client-side connection. You pass the MCP tool provider to your agent session, and the tools become available to the LLM alongside any other tools you have defined:

agent_with_mcp.py (Python)
from livekit.agents import Agent, AgentSession, RoomInputOptions
from livekit.plugins import deepgram, openai as oai
from livekit.plugins.mcp import MCPToolProvider

async def entrypoint(ctx):
  # Connect to the MCP server
  tools = MCPToolProvider(server_url="http://localhost:3001")

  agent = Agent(
      instructions=(
          "You are a customer service assistant. Use the available tools "
          "to look up customer information and schedule appointments. "
          "Always verify the customer's identity before sharing account details."
      ),
  )

  session = AgentSession(
      stt=deepgram.STT(),
      llm=oai.LLM(model="gpt-4o"),
      tts=oai.TTS(),
      tools=[tools],
  )

  await session.start(
      agent=agent,
      room=ctx.room,
      room_input_options=RoomInputOptions(),
  )
agent_with_mcp.ts (TypeScript)
import { Agent, AgentSession, JobContext, RoomInputOptions } from "@livekit/agents";
import { MCPToolProvider } from "@livekit/agents-plugin-mcp";
import { DeepgramSTT } from "@livekit/agents-plugin-deepgram";
import { OpenAILLM, OpenAITTS } from "@livekit/agents-plugin-openai";

async function entrypoint(ctx: JobContext) {
  const tools = new MCPToolProvider({
    serverUrl: "http://localhost:3001",
  });

  const agent = new Agent({
    instructions:
      "You are a customer service assistant. Use the available tools " +
      "to look up customer information and schedule appointments.",
  });

  const session = new AgentSession({
    stt: new DeepgramSTT(),
    llm: new OpenAILLM({ model: "gpt-4o" }),
    tts: new OpenAITTS(),
    tools: [tools],
  });

  await session.start({
    agent,
    room: ctx.room,
    roomInputOptions: new RoomInputOptions(),
  });
}

Combining MCP with RAG

The most powerful pattern combines RAG for unstructured knowledge with MCP for structured operations. A customer calls and asks: "What is your return policy for electronics, and can you check the status of my return?" The agent retrieves the return policy from the vector database and calls the MCP tool to check the order status — all in one turn.

rag_plus_mcp.py (Python)
from livekit.agents import Agent, AgentSession, RoomInputOptions
from livekit.plugins import deepgram, openai as oai
from livekit.plugins.mcp import MCPToolProvider
from openai import AsyncOpenAI

# PgVectorStore and db_connection come from the earlier RAG chapter

class RAGWithToolsAgent(Agent):
  def __init__(self, vector_store):
      super().__init__(
          instructions=(
              "You are a customer service assistant with access to both "
              "a knowledge base and customer management tools. Use the "
              "knowledge base context for policy questions. Use tools "
              "for account-specific operations."
          ),
      )
      self.vector_store = vector_store

  async def on_user_turn_completed(self, turn_ctx):
      query = turn_ctx.user_message
      results = await self.vector_store.search(query, top_k=3)
      relevant = [r for r in results if r.score >= 0.7]
      if relevant:
          context = "\n---\n".join([r.text for r in relevant])
          turn_ctx.add_system_message(f"Knowledge base context:\n{context}")
      await super().on_user_turn_completed(turn_ctx)

async def entrypoint(ctx):
  vector_store = PgVectorStore(conn=db_connection, openai_client=AsyncOpenAI())
  tools = MCPToolProvider(server_url="http://localhost:3001")
  agent = RAGWithToolsAgent(vector_store=vector_store)

  session = AgentSession(
      stt=deepgram.STT(),
      llm=oai.LLM(model="gpt-4o"),
      tts=oai.TTS(),
      tools=[tools],
  )
  await session.start(agent=agent, room=ctx.room, room_input_options=RoomInputOptions())

Tool descriptions drive behavior

The LLM decides whether to call a tool based entirely on the tool's name and description. Invest time in writing clear, specific descriptions. A vague description like "look up data" will cause the model to call the tool when it should not. A precise description like "Look up a customer by email address. Returns name, plan, and account status." helps the model make the right decision.
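You can even check for this mechanically in CI. The linter below is a hypothetical heuristic, not part of MCP or LiveKit, but it captures the idea: a description should be long enough to convey scope and should say what the tool returns:

```python
def lint_tool_description(tool: dict) -> list[str]:
    """Hypothetical checklist: flag tool descriptions that give the LLM
    too little information to decide when (and when not) to call."""
    desc = tool.get("description", "")
    problems = []
    if len(desc) < 40:
        problems.append("too short to convey scope")
    if "return" not in desc.lower():
        problems.append("does not say what the tool returns")
    return problems

# Two definitions of the same tool; only the description differs.
vague = {"name": "lookup_customer", "description": "Look up data."}
precise = {
    "name": "lookup_customer",
    "description": (
        "Look up a customer by email address. Returns name, plan, and "
        "account status. Do not call this for general policy questions."
    ),
}
```

Running the linter flags the vague definition on both counts and passes the precise one; the thresholds here are arbitrary, but any checklist your team agrees on beats reviewing descriptions by eye.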

Test your knowledge


What is the key architectural difference between RAG and MCP tools for data access?

What you learned

  • MCP standardizes how LLMs connect to external tools, eliminating custom function-calling boilerplate for each integration.
  • An MCP server exposes tools with typed parameters and descriptions; the LiveKit MCP plugin handles client-side integration.
  • Combining MCP tools with RAG gives your agent access to both unstructured knowledge and structured operations in a single conversation.

Next up

In the next chapter, you will add citations and source attribution so users know where the agent's answers come from.

Concepts covered

  • Model Context Protocol
  • Tool servers
  • RAG + MCP combination
  • Structured vs unstructured data