Build and Deploy MCP Server from Scratch

A hands-on guide to building, testing, and deploying Model Context Protocol servers with Python — tools, resources, prompts, transports, and production hosting

Published

June 15, 2025

Keywords: MCP server, Model Context Protocol, FastMCP, tool calling, resources, prompts, stdio transport, Streamable HTTP, JSON-RPC, MCP Inspector, Claude Desktop, VS Code, Mistral AI, deployment, Docker, production

Introduction

The Model Context Protocol (MCP) is an open standard — originally created by Anthropic and now governed by the Linux Foundation — that defines how AI applications connect to external data sources, tools, and workflows. Think of MCP as a USB-C port for AI: just as USB-C gives every device a single connector, MCP gives every AI application a single protocol for context exchange.

Before MCP, every integration between an AI assistant and an external system required a bespoke adapter — one for Slack, another for a database, another for GitHub. With N AI applications and M tools, that’s N × M integrations. MCP collapses this to N + M: every application speaks one protocol, every tool exposes one interface.

This article walks through building an MCP server from scratch in Python, covering the full journey from a single-tool prototype to a production-deployed server with multiple tools, resources, and prompts. We also explore how leading AI platforms — including Mistral AI, which recently hosted an MCP Hackathon to encourage community-built MCP servers — are adopting and extending the protocol.

MCP Architecture Overview

Participants

MCP follows a client-server architecture with three key participants:

| Participant | Role | Example |
| --- | --- | --- |
| MCP Host | AI application that coordinates one or more MCP clients | Claude Desktop, VS Code Copilot, Cursor, ChatGPT |
| MCP Client | Maintains a dedicated connection to one MCP server | Created by the host at runtime |
| MCP Server | Exposes tools, resources, and prompts to clients | Your custom Python/TypeScript server |

graph TD
    Host["MCP Host<br/>(AI Application)"]
    Host --> C1["MCP Client 1"]
    Host --> C2["MCP Client 2"]
    Host --> C3["MCP Client 3"]
    C1 -->|"Dedicated<br/>Connection"| S1["MCP Server A<br/>(Local - Filesystem)"]
    C2 -->|"Dedicated<br/>Connection"| S2["MCP Server B<br/>(Local - Database)"]
    C3 -->|"Dedicated<br/>Connection"| S3["MCP Server C<br/>(Remote - API)"]

    style Host fill:#4A90D9,color:#fff
    style C1 fill:#7B68EE,color:#fff
    style C2 fill:#7B68EE,color:#fff
    style C3 fill:#7B68EE,color:#fff
    style S1 fill:#2ECC71,color:#fff
    style S2 fill:#2ECC71,color:#fff
    style S3 fill:#E67E22,color:#fff

Protocol Layers

MCP is composed of two layers:

  1. Data Layer — Defines the JSON-RPC 2.0 based protocol for client-server communication, including lifecycle management and the three core primitives (tools, resources, prompts).
  2. Transport Layer — Manages communication channels (stdio or Streamable HTTP), connection establishment, message framing, and authentication.
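
At the data layer, every message is a JSON-RPC 2.0 envelope. As a sketch of the wire format (illustrative payload built as Python dicts — the tool name and arguments are just examples), a tools/call exchange looks roughly like this:

```python
import json

# A JSON-RPC 2.0 request the client sends to invoke a tool.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "search_documentation",  # which tool to invoke
        "arguments": {"query": "auth", "max_results": 3},
    },
}

# The server's response carries a content array and echoes the request id.
response = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "Search results here"}],
        "isError": False,
    },
}

print(json.dumps(request, indent=2))
```

The `id` field is what lets a client match responses to in-flight requests over a shared channel.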

Core Primitives

MCP servers expose three types of primitives:

| Primitive | Purpose | Discovery | Execution |
| --- | --- | --- | --- |
| Tools | Executable functions the LLM can invoke | tools/list | tools/call |
| Resources | Read-only data sources providing context | resources/list | resources/read |
| Prompts | Reusable interaction templates | prompts/list | prompts/get |

graph LR
    subgraph Server["MCP Server"]
        T["🔧 Tools<br/>Executable Functions"]
        R["📄 Resources<br/>Context Data"]
        P["📝 Prompts<br/>Interaction Templates"]
    end

    Client["MCP Client"] -->|"tools/list → tools/call"| T
    Client -->|"resources/list → resources/read"| R
    Client -->|"prompts/list → prompts/get"| P

    style Server fill:#f0f4ff,stroke:#4A90D9
    style Client fill:#4A90D9,color:#fff

Setting Up the Development Environment

Prerequisites

  • Python 3.10+
  • uv package manager (recommended) or pip
  • An MCP-compatible host for testing (Claude Desktop, VS Code, or the MCP Inspector)

Project Scaffolding

# Create project directory
mkdir my-mcp-server && cd my-mcp-server

# Initialize with uv
uv init
uv venv && source .venv/bin/activate

# Install dependencies
uv add "mcp[cli]" httpx

# Create server file
touch server.py

The mcp[cli] extra installs the official MCP Python SDK along with its CLI tooling. The SDK's high-level FastMCP API auto-generates tool definitions from type hints and docstrings.

Building Your First MCP Server

Step 1: Initialize FastMCP

from mcp.server.fastmcp import FastMCP

# Create the MCP server instance
mcp = FastMCP("my-knowledge-server")

FastMCP is the recommended entry point. It wraps the low-level JSON-RPC 2.0 protocol handling and provides decorator-based APIs for tools, resources, and prompts.

Step 2: Define Tools

Tools are functions the LLM can call. The key principle: tools should be model-controlled — the AI decides when and how to invoke them.

import httpx
from typing import Any


@mcp.tool()
async def search_documentation(query: str, max_results: int = 5) -> str:
    """Search the project documentation for relevant articles.

    Args:
        query: The search query string
        max_results: Maximum number of results to return (default: 5)
    """
    async with httpx.AsyncClient() as client:
        response = await client.get(
            "https://api.example.com/search",
            params={"q": query, "limit": max_results},
            timeout=30.0,
        )
        response.raise_for_status()
        results = response.json()

    if not results["items"]:
        return "No documentation found for this query."

    formatted = []
    for item in results["items"]:
        formatted.append(
            f"**{item['title']}**\n{item['snippet']}\nURL: {item['url']}"
        )
    return "\n---\n".join(formatted)


@mcp.tool()
async def run_sql_query(query: str) -> str:
    """Execute a read-only SQL query against the analytics database.

    Args:
        query: SQL SELECT query to execute (write operations are blocked)
    """
    if not query.strip().upper().startswith("SELECT"):
        return "Error: Only SELECT queries are allowed for safety."

    # Execute against your database; execute_readonly_query and
    # format_as_table are app-specific helpers (not shown here)
    results = await execute_readonly_query(query)
    return format_as_table(results)

The @mcp.tool() decorator registers the function as an MCP tool. FastMCP extracts the function name, docstring, and type hints to generate the JSON Schema that clients use for tool discovery.
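
To make the generated schema concrete, here is a hand-written approximation of the tool definition a client would receive from tools/list for search_documentation (the exact field layout may differ slightly by SDK version):

```python
# Approximation of what FastMCP derives from the function signature above.
tool_definition = {
    "name": "search_documentation",
    "description": "Search the project documentation for relevant articles.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "query": {"type": "string"},                       # from the str hint
            "max_results": {"type": "integer", "default": 5},  # from the int hint + default
        },
        "required": ["query"],  # max_results has a default, so it is optional
    },
}
```

Because the schema is derived mechanically, keeping type hints and docstrings accurate is what makes your tools discoverable and usable by the model.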

Step 3: Define Resources

Resources provide read-only context data to the client. Unlike tools, resources are typically user-controlled — the user or application decides which resources to attach to the conversation.

@mcp.resource("config://app-settings")
def get_app_settings() -> str:
    """Return the current application configuration."""
    import json

    settings = {
        "version": "2.1.0",
        "environment": "production",
        "features": ["search", "analytics", "notifications"],
        "rate_limits": {"requests_per_minute": 60},
    }
    return json.dumps(settings, indent=2)


@mcp.resource("schema://database/{table_name}")
def get_table_schema(table_name: str) -> str:
    """Return the schema definition for a database table.

    Args:
        table_name: Name of the database table
    """
    schemas = load_database_schemas()  # app-specific helper (not shown)
    if table_name not in schemas:
        return f"Table '{table_name}' not found."
    return schemas[table_name]

Resource URIs use a scheme-based format. The {table_name} placeholder creates a resource template — a dynamic resource whose content depends on the parameter.

Step 4: Define Prompts

Prompts are reusable interaction templates that structure how users interact with the LLM through your server’s capabilities.

@mcp.prompt()
def debug_error(error_message: str, stack_trace: str = "") -> str:
    """Create a debugging prompt for analyzing errors.

    Args:
        error_message: The error message to analyze
        stack_trace: Optional stack trace for context
    """
    context = f"Error: {error_message}"
    if stack_trace:
        context += f"\n\nStack Trace:\n{stack_trace}"

    return f"""You are a senior software engineer debugging an issue.

Analyze the following error and provide:
1. Root cause analysis
2. Step-by-step fix
3. Prevention recommendations

{context}"""

Step 5: Run the Server

if __name__ == "__main__":
    mcp.run(transport="stdio")

The complete server in a single file:

# server.py
from mcp.server.fastmcp import FastMCP
import httpx
import json

mcp = FastMCP("my-knowledge-server")


@mcp.tool()
async def search_documentation(query: str, max_results: int = 5) -> str:
    """Search the project documentation for relevant articles.

    Args:
        query: The search query string
        max_results: Maximum number of results to return
    """
    # ... implementation
    return "Search results here"


@mcp.resource("config://app-settings")
def get_app_settings() -> str:
    """Return the current application configuration."""
    return json.dumps({"version": "2.1.0", "environment": "production"}, indent=2)


@mcp.prompt()
def debug_error(error_message: str) -> str:
    """Create a debugging prompt for analyzing errors."""
    return f"Analyze this error and suggest a fix:\n{error_message}"


if __name__ == "__main__":
    mcp.run(transport="stdio")

Lifecycle and Communication Flow

Every MCP session follows a structured lifecycle governed by JSON-RPC 2.0 messages:

sequenceDiagram
    participant Client as MCP Client
    participant Server as MCP Server

    Note over Client,Server: 1. Initialization
    Client->>Server: initialize (protocolVersion, capabilities, clientInfo)
    Server->>Client: InitializeResult (capabilities, serverInfo)
    Client->>Server: notifications/initialized

    Note over Client,Server: 2. Discovery
    Client->>Server: tools/list
    Server->>Client: Tool definitions (name, schema, description)

    Note over Client,Server: 3. Execution
    Client->>Server: tools/call (name, arguments)
    Server->>Client: Result (content array)

    Note over Client,Server: 4. Notifications
    Server->>Client: notifications/tools/list_changed
    Client->>Server: tools/list (refresh)

    Note over Client,Server: 5. Shutdown
    Client->>Server: Close connection

Capability negotiation happens during initialization — both client and server declare which primitives they support. For example, a server might declare "tools": {"listChanged": true} to indicate it supports tools and can send change notifications.
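
Concretely, the messages exchanged during initialization look roughly like this (hand-written sketch of the wire format; client and server names are examples):

```python
initialize_request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",
        "capabilities": {},  # client declares what it supports
        "clientInfo": {"name": "example-client", "version": "1.0.0"},
    },
}

initialize_result = {
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "protocolVersion": "2025-06-18",
        "capabilities": {
            "tools": {"listChanged": True},  # has tools + sends change notifications
            "resources": {},
            "prompts": {},
        },
        "serverInfo": {"name": "my-knowledge-server", "version": "0.1.0"},
    },
}
```

A client should only send requests for primitives the server declared; a server that omits `"resources"` from its capabilities should never receive resources/list.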

Transport Mechanisms

MCP supports two standard transports. Choosing the right one depends on your deployment model:

Stdio Transport

The simplest transport: the client launches the server as a subprocess and communicates over stdin/stdout.

# Run with stdio (default for local servers)
mcp.run(transport="stdio")

Key constraints:

  • Never write to stdout from your server code (it corrupts JSON-RPC messages)
  • Use logging or print(..., file=sys.stderr) for debug output
  • Single client per server process

Best for: Local integrations with Claude Desktop, VS Code, Cursor.
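
A safe logging setup for stdio servers routes everything to stderr so stdout stays reserved for JSON-RPC frames (the logger name is an example):

```python
import logging
import sys

# Route all diagnostics to stderr; stdout must carry only JSON-RPC messages.
handler = logging.StreamHandler(sys.stderr)
handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s"))

logger = logging.getLogger("my-knowledge-server")
logger.addHandler(handler)
logger.setLevel(logging.INFO)
logger.info("server starting")  # appears on stderr, never on stdout
```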

Streamable HTTP Transport

For remote servers serving multiple clients over the network:

# Run with HTTP transport; in the official Python SDK, host and port
# are FastMCP settings set on the constructor, not run() arguments
mcp = FastMCP("my-knowledge-server", host="0.0.0.0", port=8080)
mcp.run(transport="streamable-http")

Streamable HTTP uses a single endpoint (e.g., https://example.com/mcp) supporting both POST and GET:

  • POST: Client sends JSON-RPC requests; server responds with JSON or opens an SSE stream
  • GET: Client opens an SSE stream for server-initiated messages

| Feature | Stdio | Streamable HTTP |
| --- | --- | --- |
| Deployment | Local subprocess | Remote server |
| Clients | Single | Multiple concurrent |
| Network | No network overhead | HTTP-based |
| Auth | Process-level | Bearer tokens, OAuth, API keys |
| Use case | Desktop integrations | Cloud-hosted services, multi-tenant |

Testing Your MCP Server

Using the MCP Inspector

The MCP Inspector is the official visual testing tool:

# Launch the inspector with your server
npx @modelcontextprotocol/inspector uv run server.py

The Inspector provides a web UI where you can:

  • Browse all tools, resources, and prompts
  • Execute tools with custom arguments
  • View raw JSON-RPC messages
  • Test error handling

Configuring Claude Desktop

Add your server to claude_desktop_config.json, found under ~/Library/Application Support/Claude/ on macOS or %APPDATA%\Claude\ on Windows:

{
  "mcpServers": {
    "my-knowledge-server": {
      "command": "uv",
      "args": [
        "--directory",
        "/absolute/path/to/my-mcp-server",
        "run",
        "server.py"
      ]
    }
  }
}

Configuring VS Code

Add to .vscode/mcp.json in your workspace:

{
  "servers": {
    "my-knowledge-server": {
      "command": "uv",
      "args": ["--directory", "${workspaceFolder}", "run", "server.py"]
    }
  }
}

Building a Real-World Example: Project Knowledge Server

Let’s build a complete MCP server that provides an AI assistant with deep context about a software project — combining tools, resources, and prompts:

# project_server.py
import json
import subprocess
from pathlib import Path
from mcp.server.fastmcp import FastMCP

mcp = FastMCP("project-knowledge")

PROJECT_ROOT = Path("/path/to/your/project")


# ─── Tools ──────────────────────────────────────────
@mcp.tool()
def search_codebase(pattern: str, file_glob: str = "*.py") -> str:
    """Search the codebase using grep for a pattern.

    Args:
        pattern: Regex pattern to search for
        file_glob: Basename glob passed to grep --include (default: all Python files)
    """
    """
    try:
        result = subprocess.run(
            ["grep", "-rn", "--include", file_glob, pattern, str(PROJECT_ROOT)],
            capture_output=True,
            text=True,
            timeout=30,
        )
        if not result.stdout.strip():
            return f"No matches found for pattern: {pattern}"
        # Limit output to avoid overwhelming the LLM
        lines = result.stdout.strip().split("\n")[:20]
        return "\n".join(lines)
    except subprocess.TimeoutExpired:
        return "Search timed out. Try a more specific pattern."


@mcp.tool()
def read_file(file_path: str, start_line: int = 1, end_line: int = 100) -> str:
    """Read a file from the project with optional line range.

    Args:
        file_path: Relative path from project root
        start_line: First line to read (1-indexed)
        end_line: Last line to read (inclusive)
    """
    target = PROJECT_ROOT / file_path
    # Security: prevent path traversal
    if not target.resolve().is_relative_to(PROJECT_ROOT.resolve()):
        return "Error: Access denied — path is outside the project."
    if not target.exists():
        return f"File not found: {file_path}"

    lines = target.read_text().splitlines()
    selected = lines[start_line - 1 : end_line]
    numbered = [f"{i}: {line}" for i, line in enumerate(selected, start=start_line)]
    return "\n".join(numbered)


@mcp.tool()
def list_directory(dir_path: str = ".") -> str:
    """List files and directories at the given path.

    Args:
        dir_path: Relative path from project root (default: root)
    """
    target = PROJECT_ROOT / dir_path
    if not target.resolve().is_relative_to(PROJECT_ROOT.resolve()):
        return "Error: Access denied — path is outside the project."
    if not target.is_dir():
        return f"Not a directory: {dir_path}"

    entries = sorted(target.iterdir())
    formatted = []
    for entry in entries[:50]:
        prefix = "📁" if entry.is_dir() else "📄"
        formatted.append(f"{prefix} {entry.name}")
    return "\n".join(formatted)


@mcp.tool()
def run_tests(test_path: str = "", verbose: bool = False) -> str:
    """Run project tests using pytest.

    Args:
        test_path: Specific test file or directory (default: all tests)
        verbose: Whether to show verbose output
    """
    cmd = ["python", "-m", "pytest", "--tb=short"]
    cmd.append("-v" if verbose else "-q")
    if test_path:
        cmd.append(str(PROJECT_ROOT / test_path))
    else:
        cmd.append(str(PROJECT_ROOT / "tests"))

    try:
        result = subprocess.run(
            cmd, capture_output=True, text=True, timeout=120, cwd=str(PROJECT_ROOT)
        )
        output = result.stdout + result.stderr
        # Truncate if too long
        if len(output) > 3000:
            output = output[:3000] + "\n... (truncated)"
        return output
    except subprocess.TimeoutExpired:
        return "Tests timed out after 120 seconds."


# ─── Resources ──────────────────────────────────────
@mcp.resource("project://readme")
def get_readme() -> str:
    """Return the project README."""
    readme = PROJECT_ROOT / "README.md"
    if readme.exists():
        return readme.read_text()
    return "No README.md found."


@mcp.resource("project://structure")
def get_project_structure() -> str:
    """Return a tree view of the project structure."""
    try:
        result = subprocess.run(
            ["find", str(PROJECT_ROOT), "-type", "f", "-not", "-path", "*/.git/*",
             "-not", "-path", "*/__pycache__/*", "-not", "-path", "*/.venv/*"],
            capture_output=True, text=True, timeout=10,
        )
        files = sorted(result.stdout.strip().split("\n"))[:100]
        return "\n".join(f.replace(str(PROJECT_ROOT) + "/", "") for f in files)
    except Exception:
        return "Unable to generate project structure."


@mcp.resource("project://dependencies")
def get_dependencies() -> str:
    """Return the project dependencies."""
    for dep_file in ["pyproject.toml", "requirements.txt", "package.json"]:
        path = PROJECT_ROOT / dep_file
        if path.exists():
            return f"# {dep_file}\n{path.read_text()}"
    return "No dependency file found."


# ─── Prompts ────────────────────────────────────────
@mcp.prompt()
def code_review(file_path: str) -> str:
    """Generate a prompt for reviewing a specific file.

    Args:
        file_path: Path to the file to review
    """
    return f"""You are a senior engineer conducting a code review.

Review the file at `{file_path}` and provide feedback on:
1. Code quality and readability
2. Potential bugs or edge cases
3. Performance considerations
4. Security concerns
5. Suggestions for improvement

Use the read_file tool to examine the code, and search_codebase to understand
how it integrates with the rest of the project."""


@mcp.prompt()
def investigate_bug(description: str) -> str:
    """Generate a prompt for investigating a bug.

    Args:
        description: Description of the bug
    """
    return f"""You are debugging a reported issue in the project.

Bug description: {description}

Approach:
1. Use search_codebase to find relevant code
2. Use read_file to examine suspicious areas
3. Use run_tests to verify behavior
4. Provide a root cause analysis and suggested fix"""


if __name__ == "__main__":
    mcp.run(transport="stdio")

Integrating with Mistral AI

Mistral AI has embraced MCP as a first-class integration for its Agents and Conversations API. Their Python SDK (mistralai) includes built-in MCP client support, and Mistral recently hosted an MCP Hackathon to encourage the community to build creative MCP servers — from code assistants to data pipeline connectors.

Here’s how to connect your MCP server to a Mistral agent:

import asyncio
import os
from pathlib import Path

from mistralai import Mistral
from mistralai.extra.run.context import RunContext
from mcp import StdioServerParameters
from mistralai.extra.mcp.stdio import MCPClientSTDIO

MODEL = "mistral-medium-latest"
cwd = Path(__file__).parent


async def main() -> None:
    client = Mistral(api_key=os.environ["MISTRAL_API_KEY"])

    # Point to your MCP server
    server_params = StdioServerParameters(
        command="uv",
        args=["--directory", str(cwd), "run", "server.py"],
        env=None,
    )

    # Create an agent
    agent = client.beta.agents.create(
        model=MODEL,
        name="project-assistant",
        instructions="You help developers understand and work with the project.",
    )

    # Set up the run context with MCP
    async with RunContext(agent_id=agent.id) as run_ctx:
        mcp_client = MCPClientSTDIO(stdio_params=server_params)
        await run_ctx.register_mcp_client(mcp_client=mcp_client)

        # Run the agent
        result = await client.beta.conversations.run_async(
            run_ctx=run_ctx,
            inputs="Search for all API endpoint definitions in the project.",
        )

        for entry in result.output_entries:
            print(entry)


if __name__ == "__main__":
    asyncio.run(main())

Mistral’s agent framework automatically discovers tools exposed by the MCP server and makes them available to the model. The agent can combine MCP tools with Mistral’s built-in connectors (web search, code interpreter) in a single conversation — or you can use handoffs to route between multiple specialized agents, each connected to different MCP servers.

Deploying to Production

Containerizing with Docker

FROM python:3.12-slim

WORKDIR /app

# Install uv
RUN pip install uv

# Copy project files
COPY pyproject.toml uv.lock ./
COPY server.py ./

# Install dependencies
RUN uv sync --frozen

# Expose port for Streamable HTTP transport
EXPOSE 8080

# Run the server
CMD ["uv", "run", "server.py"]

Update your server to use Streamable HTTP for remote access:

import os

if __name__ == "__main__":
    transport = os.environ.get("MCP_TRANSPORT", "stdio")
    if transport == "streamable-http":
        # host and port are FastMCP settings, not run() arguments
        mcp.settings.host = "0.0.0.0"
        mcp.settings.port = int(os.environ.get("PORT", 8080))
        mcp.run(transport="streamable-http")
    else:
        mcp.run(transport="stdio")

Security Best Practices

When deploying MCP servers — especially remote ones — follow these security guidelines:

| Concern | Recommendation |
| --- | --- |
| Path traversal | Validate all file paths against an allowed root directory |
| SQL injection | Use parameterized queries; restrict to read-only operations |
| DNS rebinding | Validate the Origin header on all HTTP connections |
| Network binding | Bind to 127.0.0.1 for local servers, not 0.0.0.0 |
| Authentication | Use OAuth or bearer tokens for remote servers |
| Input validation | Validate all tool arguments at the server boundary |
| Rate limiting | Implement per-client rate limits for resource-intensive tools |
| Logging | Never log sensitive data; use stderr for stdio servers |
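
To make the rate-limiting recommendation concrete, here is a minimal per-client token bucket (a hypothetical helper for illustration — the MCP SDK does not ship one; in production you would likely use middleware or an external store):

```python
import time


class TokenBucket:
    """Allow up to `rate` calls per `per` seconds, tracked per client id."""

    def __init__(self, rate: int = 60, per: float = 60.0) -> None:
        self.rate = rate
        self.per = per
        self._state: dict[str, tuple[float, float]] = {}  # client -> (tokens, last_ts)

    def allow(self, client_id: str) -> bool:
        now = time.monotonic()
        tokens, last = self._state.get(client_id, (float(self.rate), now))
        # Refill proportionally to elapsed time, capped at the bucket size.
        tokens = min(float(self.rate), tokens + (now - last) * (self.rate / self.per))
        if tokens < 1.0:
            self._state[client_id] = (tokens, now)
            return False
        self._state[client_id] = (tokens - 1.0, now)
        return True


bucket = TokenBucket(rate=2, per=60.0)
print(bucket.allow("client-a"), bucket.allow("client-a"), bucket.allow("client-a"))
# → True True False
```

A tool handler would call `bucket.allow(client_id)` at entry and return an error message when it is denied.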

Deployment Options

| Platform | Transport | Configuration |
| --- | --- | --- |
| Local (Claude Desktop / VS Code) | stdio | JSON config pointing to uv run server.py |
| Docker | Streamable HTTP | Container with exposed port |
| Cloud Run / Railway | Streamable HTTP | Autoscaling HTTP service |
| AWS Lambda | Streamable HTTP | Serverless function behind API Gateway |
| Kubernetes | Streamable HTTP | Deployment with service and ingress |

Comparison: MCP vs. Direct Function Calling

| Aspect | Direct Function Calling | MCP |
| --- | --- | --- |
| Protocol | Provider-specific (OpenAI, Mistral, etc.) | Open standard (JSON-RPC 2.0) |
| Discovery | Manual schema definition | Automatic via tools/list |
| Portability | Locked to one provider | Works across all MCP-compatible hosts |
| Execution | Client-side (developer handles calls) | Server-side (server manages tool logic) |
| State | Stateless per request | Stateful sessions with lifecycle |
| Real-time updates | Not supported | Notifications for capability changes |
| Ecosystem | Provider-specific | Growing open ecosystem of servers |

Function calling and MCP are complementary. Function calling defines how the model requests tool invocations; MCP defines how those tools are discovered, connected, and managed. In fact, Mistral’s MCP integration wraps function calling — the model uses its standard tool-calling capability, and the Mistral SDK automatically routes calls to the appropriate MCP server.

References

  1. Model Context Protocol, “Introduction — What is MCP?”, modelcontextprotocol.io, 2025. Available: https://modelcontextprotocol.io/introduction
  2. Model Context Protocol, “Architecture Overview”, modelcontextprotocol.io, 2025. Available: https://modelcontextprotocol.io/docs/concepts/architecture
  3. Model Context Protocol, “Build an MCP Server — Quickstart”, modelcontextprotocol.io, 2025. Available: https://modelcontextprotocol.io/quickstart/server
  4. Model Context Protocol, “Transports Specification”, modelcontextprotocol.io, 2025. Available: https://modelcontextprotocol.io/specification/2025-06-18/basic/transports
  5. Mistral AI, “MCP — Agents Tools Documentation”, docs.mistral.ai, 2025. Available: https://docs.mistral.ai/agents/tools/mcp/
  6. Mistral AI, “Function Calling”, docs.mistral.ai, 2025. Available: https://docs.mistral.ai/capabilities/function_calling/
  7. Mistral AI, “MCP Hackathon”, hackathon.mistral.ai, 2025. Available: https://hackathon.mistral.ai
  8. GitHub, “MCP Inspector — Visual Testing Tool”, github.com/modelcontextprotocol/inspector, 2025. Available: https://github.com/modelcontextprotocol/inspector
