Building Skills for AI Agents

Designing modular, reusable agent capabilities — from tool engineering to skill bundles and the SKILL.md standard


Table of Contents

  1. Setup
  2. ACI Design Principles
  3. Building a Retrieval Skill
  4. Building a Data Analysis Skill
  5. Composing Skills into Agents
  6. SKILL.md Pattern

1. Setup

!pip install -q langchain-openai langchain-core langgraph openai
import os
# os.environ["OPENAI_API_KEY"] = "your-key"

2. ACI Design Principles

The Agent-Computer Interface (ACI) is the set of tools an LLM calls to act on its environment. Design those tools so they are easy for an LLM to use correctly.

| Principle         | Bad                 | Good                         |
|:------------------|:--------------------|:-----------------------------|
| Clear naming      | process_data(input) | search_knowledge_base(query) |
| Minimal params    | 7 parameters        | 1-2 focused params           |
| Rich descriptions | "Search for stuff"  | Full docstring with examples |
| Poka-yoke         | Relative paths      | Absolute paths required      |
| Actionable errors | KeyError: 'user_id' | Guide on what to fix         |
from langchain_core.tools import tool
from langchain_openai import ChatOpenAI

llm = ChatOpenAI(model="gpt-4o-mini", temperature=0)


# Good: Clear name, detailed docstring, constrained inputs
@tool
def search_knowledge_base(query: str) -> str:
    """Search the documentation knowledge base for relevant information.

    Use for questions about product features, API usage, configuration.
    Returns up to 5 relevant passages with sources.

    Args:
        query: Specific natural language query.
               Good: 'how to configure rate limiting for REST API'
               Bad: 'rate limit' (too vague)
    """
    results = [
        {"content": "Rate limiting is configured via the API gateway...",
         "source": "docs/api/rate-limits.md", "score": 0.92},
        {"content": "Default rate limit is 60 requests per minute...",
         "source": "docs/api/defaults.md", "score": 0.87},
    ]
    formatted = []
    for i, r in enumerate(results, 1):
        formatted.append(f"{i}. **{r['source']}** ({r['score']:.0%})\n   {r['content'][:200]}")
    return "\n\n".join(formatted)


# Poka-yoke plus an actionable error: require absolute paths and say how to fix a bad one
@tool
def read_file(absolute_path: str) -> str:
    """Read a file. Path MUST be absolute (starting with /).
    Example: /home/user/project/src/main.py"""
    if not absolute_path.startswith("/"):
        return f"Error: path must be absolute. Got: {absolute_path}. Use full path like /home/user/..."
    return f"Contents of {absolute_path}"


print("Good tool:", search_knowledge_base.name)
print(read_file.invoke({"absolute_path": "relative/path.py"}))
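
The "Actionable errors" row of the table can also be illustrated without any framework. The sketch below is only an illustration: `get_user_profile` and its payload shape are hypothetical and not part of this notebook. Instead of letting a bare KeyError: 'user_id' reach the model, it returns a message that names the missing field, lists what was received, and shows a corrected call.

```python
def get_user_profile(payload: dict) -> str:
    """Hypothetical tool body: look up a user profile by ID."""
    required = ("user_id",)
    missing = [key for key in required if key not in payload]
    if missing:
        # Say what was wrong AND how to fix it, so the model can retry.
        return (
            f"Error: missing required field(s): {', '.join(missing)}. "
            f"Received fields: {sorted(payload) or 'none'}. "
            "Retry with e.g. {'user_id': 'u_123'}."
        )
    return f"Profile for user {payload['user_id']}"


print(get_user_profile({"name": "Ada"}))       # actionable error message
print(get_user_profile({"user_id": "u_123"}))  # normal result
```

Returning the error as a string, as read_file does above, keeps the failure inside the conversation so the model can correct its own call on the next step.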

3. Building a Retrieval Skill

A complete retrieval skill with query rewriting and relevance grading.

@tool
def rewrite_query(original_query: str, context: str = "") -> str:
    """Rewrite a search query for better retrieval results.

    Use BEFORE searching if the original query is vague or previous search failed.

    Args:
        original_query: The query that needs improvement.
        context: Optional context about what info is needed.
    """
    response = llm.invoke([{
        "role": "user",
        "content": f"Rewrite this search query to be more specific:\n"
                   f"Original: {original_query}\nContext: {context}\n"
                   f"Return ONLY the rewritten query."
    }])
    return response.content.strip()


@tool
def grade_relevance(query: str, document: str) -> str:
    """Check if a retrieved document is relevant to the query.
    Returns 'relevant' or 'not_relevant' with explanation.

    Args:
        query: The original user question.
        document: The retrieved document text to evaluate.
    """
    response = llm.invoke([{
        "role": "system",
        "content": "You are a relevance grader. Reply with 'relevant' or 'not_relevant' and brief explanation."
    }, {
        "role": "user",
        "content": f"Query: {query}\nDocument: {document}"
    }])
    return response.content.strip()


# Skill bundle
RETRIEVAL_SKILL = {
    "name": "knowledge-base-retrieval",
    "description": "Search the documentation, rewrite vague queries, and grade result relevance.",
    "tools": [search_knowledge_base, rewrite_query, grade_relevance],
}

# Test the skill
print(rewrite_query.invoke({"original_query": "rate limit", "context": "need API configuration details"}))
print("\n" + grade_relevance.invoke({"query": "rate limiting", "document": "Rate limits are 60 req/min."}))
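
One way the three retrieval tools might be chained outside an agent is a search, grade, rewrite loop. The sketch below is an assumption about how the pieces fit together, not the notebook's own method: `retrieve_with_retry` and the `fake_*` stand-ins are hypothetical names, used so the example runs without an API key.

```python
from typing import Callable, Optional, Tuple


def retrieve_with_retry(
    query: str,
    search: Callable[[str], str],
    grade: Callable[[str, str], str],
    rewrite: Callable[[str], str],
    max_attempts: int = 2,
) -> Tuple[str, Optional[str]]:
    """Search, grade the result, and rewrite the query if it is not relevant.

    Returns (final_query, document); document is None if nothing relevant was found.
    """
    current = query
    for attempt in range(max_attempts):
        document = search(current)
        verdict = grade(query, document).strip().lower()
        # Check the prefix: 'not_relevant' contains 'relevant' as a substring.
        if verdict.startswith("relevant"):
            return current, document
        if attempt < max_attempts - 1:
            current = rewrite(current)
    return current, None


# Hypothetical stand-ins for the real tools.
def fake_search(q: str) -> str:
    return "Rate limiting is configured via the API gateway." if "configure" in q else "Pricing tiers overview."

def fake_grade(q: str, doc: str) -> str:
    return "relevant: mentions rate limiting" if "Rate limiting" in doc else "not_relevant: off topic"

def fake_rewrite(q: str) -> str:
    return f"how to configure {q} for the REST API"


print(retrieve_with_retry("rate limit", fake_search, fake_grade, fake_rewrite))
```

With the tools defined above, the callables could be thin wrappers such as `lambda q: search_knowledge_base.invoke({"query": q})`. Matching on the start of the verdict matters because grade_relevance returns free text and a plain substring test for "relevant" would also accept "not_relevant".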

4. Building a Data Analysis Skill

A data analysis skill that pairs a natural-language metrics query tool with a calculator.

import math


@tool
def query_metrics_database(description: str) -> str:
    """Query the metrics database using natural language.

    Use for: user counts, growth rates, retention, feature usage, performance stats.

    Args:
        description: What data you need in plain English.
                     Good: 'monthly active users for the last 6 months'
    """
    return (f"Query: {description}\n"
            "| Month | Active Users | Growth |\n"
            "|:------|:-------------|:-------|\n"
            "| Jan   | 12,450       | +15%   |\n"
            "| Feb   | 14,320       | +12%   |\n"
            "| Mar   | 15,890       | +11%   |")


@tool
def calculate(expression: str) -> str:
    """Evaluate a mathematical expression.

    Supports: +, -, *, /, ** (power), sqrt(), abs(), round(), min(), max()

    Args:
        expression: e.g. 'round((15890 - 12450) / 12450 * 100, 1)'
    """
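    # Names the expression may reference; everything else is undefined.
    allowed = {"sqrt": math.sqrt, "abs": abs, "round": round, "min": min, "max": max}
    try:
        # Caution: clearing __builtins__ does not make eval() safe against crafted
        # input. Do not expose this to untrusted expressions.
        result = eval(expression, {"__builtins__": {}}, allowed)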
        return str(result)
    except Exception as e:
        return f"Error: {e}. Check syntax and try again."


ANALYSIS_SKILL = {
    "name": "data-analysis",
    "description": "Query product metrics and run calculations on the results.",
    "tools": [query_metrics_database, calculate],
}

print(query_metrics_database.invoke({"description": "monthly active users"}))
print("\nGrowth:", calculate.invoke({"expression": "round((15890 - 12450) / 12450 * 100, 1)"}))
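
Because the model writes the expression string, a stricter evaluator may be preferable to eval(). The sketch below is one possible alternative rather than the notebook's implementation: the hypothetical `safe_calculate` parses the expression with the standard `ast` module and evaluates only whitelisted node types, so attribute access, imports, and unknown names are rejected. It does not bound the size of exponents, so very large powers can still be slow.

```python
import ast
import math
import operator

_BINARY = {
    ast.Add: operator.add, ast.Sub: operator.sub, ast.Mult: operator.mul,
    ast.Div: operator.truediv, ast.Pow: operator.pow,
}
_UNARY = {ast.UAdd: operator.pos, ast.USub: operator.neg}
_FUNCS = {"sqrt": math.sqrt, "abs": abs, "round": round, "min": min, "max": max}


def safe_calculate(expression: str) -> str:
    """Evaluate arithmetic by walking the AST; anything not whitelisted is rejected."""

    def ev(node):
        # Numeric literals only (bool is excluded even though it subclasses int).
        if (isinstance(node, ast.Constant) and isinstance(node.value, (int, float))
                and not isinstance(node.value, bool)):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _BINARY:
            return _BINARY[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _UNARY:
            return _UNARY[type(node.op)](ev(node.operand))
        # Calls are allowed only to whitelisted plain names, positional args only.
        if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
                and node.func.id in _FUNCS and not node.keywords):
            return _FUNCS[node.func.id](*[ev(arg) for arg in node.args])
        raise ValueError(f"unsupported syntax: {type(node).__name__}")

    try:
        tree = ast.parse(expression, mode="eval")
        return str(ev(tree.body))
    except Exception as e:
        return (f"Error: {e}. Use only numbers, + - * / **, "
                f"and the functions: {', '.join(_FUNCS)}.")


print(safe_calculate("round((15890 - 12450) / 12450 * 100, 1)"))
print(safe_calculate("__import__('os').system('ls')"))
```

The error string keeps the same actionable style as the other tools: it tells the model which syntax is accepted so it can reformulate the expression.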

5. Composing Skills into Agents

Combine tools from multiple skills into a single agent.

from langgraph.prebuilt import create_react_agent

# Combine tools from multiple skills
all_tools = RETRIEVAL_SKILL["tools"] + ANALYSIS_SKILL["tools"]

agent = create_react_agent(
    model=llm,
    tools=all_tools,
    prompt="""You are a product analyst with two skill sets:

1. **Knowledge Base Retrieval**: search_knowledge_base, rewrite_query, grade_relevance
2. **Data Analysis**: query_metrics_database, calculate

Use retrieval for 'what/how' questions. Use analysis for 'how many/trend' questions.
Always verify retrieved info with grade_relevance before using it.""",
)

result = agent.invoke({
    "messages": [{"role": "user", "content": "What's our user growth rate this quarter?"}]
})

for msg in result["messages"]:
    print(f"{msg.type}: {msg.content[:200] if msg.content else '[tool_calls]'}")
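
When more than two skills are combined, concatenating tool lists by hand can silently introduce two tools with the same name. The sketch below is a hypothetical helper (`compose_skills` is not a LangChain or LangGraph API) that merges skill bundles of the shape used above, rejects duplicate tool names, and builds a prompt section from each bundle's name and description. It relies only on each tool exposing a `.name` attribute, so `SimpleNamespace` objects stand in for real tools here.

```python
from types import SimpleNamespace


def compose_skills(*skills: dict) -> tuple:
    """Merge skill bundles into (tools, prompt_section), rejecting duplicate tool names."""
    tools, seen, lines = [], {}, []
    for skill in skills:
        names = []
        for t in skill["tools"]:
            if t.name in seen:
                raise ValueError(
                    f"Tool '{t.name}' appears in both '{seen[t.name]}' and "
                    f"'{skill['name']}'. Rename one so the model can tell them apart."
                )
            seen[t.name] = skill["name"]
            tools.append(t)
            names.append(t.name)
        lines.append(f"- {skill['name']}: {skill['description']} Tools: {', '.join(names)}")
    return tools, "\n".join(lines)


# Stand-ins for @tool objects; only the .name attribute is used.
retrieval = {
    "name": "knowledge-base-retrieval",
    "description": "Search and grade documentation.",
    "tools": [SimpleNamespace(name="search_knowledge_base"),
              SimpleNamespace(name="grade_relevance")],
}
analysis = {
    "name": "data-analysis",
    "description": "Query metrics and calculate.",
    "tools": [SimpleNamespace(name="calculate")],
}

tools, prompt_section = compose_skills(retrieval, analysis)
print(prompt_section)
```

The returned list could be passed as `tools=` and the generated section embedded in the prompt string, which keeps the prompt's tool inventory in sync with the bundles. Rejecting duplicates is a design choice; sharing one tool across skills would need a different rule.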

6. SKILL.md Pattern

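SKILL.md is an emerging convention for packaging and sharing agent skills: a Markdown file with YAML front matter (name, description, version) followed by instructions and rules for the agent.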

SKILL_MD = """
---
name: csv-insights
description: >
  Analyze CSV files and produce summary reports with statistics,
  distributions, and anomaly detection.
version: 1
---

## Instructions

When analyzing a CSV file:
1. **Validate** — Read the CSV and check for expected columns
2. **Profile** — Compute column types, missing values, statistics
3. **Analyze** — Run analysis for distributions and anomalies
4. **Report** — Format results using the report template

## Rules
- Always show sample data (first 5 rows) before analysis
- Round numbers to 2 decimal places
- Flag any column with >10% missing values
- Never modify the original file
"""

print("SKILL.md template:")
print(SKILL_MD)
print("\nDirectory structure:")
print("csv-insights/")
print("├── SKILL.md")
print("├── analyze.py")
print("├── templates/report.md")
print("└── schemas/columns.json")
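
A loader usually needs to separate the front matter from the instructions before it can use either. The sketch below is a minimal, dependency-free illustration; `split_skill_md` and `top_level_keys` are hypothetical helpers, the sample text is a shortened version of the template above, and treating name and description as required is an assumption. A real loader would more likely hand the front matter to a YAML parser.

```python
import re


def split_skill_md(text: str):
    """Split SKILL.md text into (front_matter, body)."""
    match = re.match(r"\s*---\n(.*?)\n---\n?(.*)", text, flags=re.DOTALL)
    if match is None:
        raise ValueError("SKILL.md must start with a '---' delimited front matter block.")
    return match.group(1), match.group(2).strip()


def top_level_keys(front_matter: str):
    """Return unindented 'key:' names; indented lines (folded text) are skipped."""
    return re.findall(r"^([A-Za-z_][\w-]*):", front_matter, flags=re.MULTILINE)


# Shortened sample in the same shape as the template above.
sample = """
---
name: csv-insights
description: >
  Analyze CSV files and produce summary reports.
version: 1
---

## Instructions
Validate, profile, analyze, report.
"""

header, body = split_skill_md(sample)
keys = top_level_keys(header)
# Assumption: 'name' and 'description' are the fields a loader cannot do without.
missing = {"name", "description"} - set(keys)
print("keys:", keys)
print("missing:", sorted(missing))
print("body starts with:", body.splitlines()[0])
```

Keeping the header separate from the body lets an agent list many skills by name and description and read the full instructions only for the skill it decides to use.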