Programmatic Tool Calling with Claude Code: The Developer’s Guide to Agent-Scale Automation

When Claude Code executes a tool, it typically works like this: call a function, wait for the result, process it, call the next function. Repeat fifty times for fifty tasks. It's effective, but each step requires a full round-trip through the API—and those round-trips add up fast. Programmatic Tool Calling (PTC) inverts this pattern. Instead of orchestrating tools through conversation turns, Claude writes and executes code that calls multiple tools, processes their outputs, and returns only the final results to its context window. One inference pass. One execution block. The difference isn't incremental—it's architectural.

This guide explains how PTC works, why it matters for serious agent development, and how to implement it with Claude Code and custom MCP servers. We'll cover the mechanism, build a working example, and troubleshoot the gotchas that trip up most developers.

The Problem PTC Solves

Consider a practical scenario: you need to check the status of fifty servers. With standard tool use, Claude calls check_status(server_1), waits for the result, processes it, calls check_status(server_2), and so on. Fifty tool calls means fifty API round-trips, each consuming tokens for the request, the response, and Claude's reasoning about what to do next. The context window fills with intermediate results. Latency compounds.

Now consider the alternative: Claude writes a Python script that calls check_status for all fifty servers, processes the results into a summary, and returns a single JSON object. Fewer round-trips. Minimal context consumption. The orchestration logic lives in explicit code rather than Claude's natural-language reasoning. According to Anthropic's internal testing (not independently verified), PTC can reduce token consumption by up to 37% on complex research tasks, with improved accuracy on certain benchmarks. The gains reportedly come from two sources: reduced context pollution from intermediate results, and explicit programmatic control flow replacing conversational orchestration. Your mileage will vary depending on task characteristics and tool design.

How Programmatic Tool Calling Works

The mechanism involves three components working together:

1. Code Execution Tool: PTC requires the code execution sandbox to be enabled. This sandboxed environment lets Claude write and run code during a conversation.

2. Tool Opt-In: Your tools must explicitly declare they can be called from code execution by including "allowed_callers" in their definition with the appropriate version string. Without this flag, Claude will use standard sequential tool calling. The opt-in requirement is intentional: it keeps tools with side effects (payments, deletes, production deploys) from being callable inside generated code by default.

3. Script Generation: When Claude encounters a task that would benefit from batch execution or complex data processing, it writes a script that calls your tools. The script emits tool call requests during execution; the host application resolves those calls and re-enters execution with the results. Claude's context receives only the final processed output.

This last point deserves emphasis: Claude does not "pause" execution in the traditional async sense. The code execution environment emits tool call intents, then the host application is responsible for resolving them and resuming execution. There's no true coroutine suspension—this is a request-response pattern mediated by your application layer.

Here's what the API response structure looks like when PTC activates. (Schema is simplified for readability; field names vary across SDKs and debug traces.)

{
  "role": "assistant",
  "content": [
    {
      "type": "text",
      "text": "I'll query the server status and summarize the results."
    },
    {
      "type": "server_tool_use",
      "id": "srvtoolu_abc123",
      "name": "code_execution",
      "input": {
        "code": "import asyncio\n\nasync def check_all():\n    tasks = [check_status(f'server_{i}') for i in range(50)]\n    results = await asyncio.gather(*tasks)\n    healthy = sum(1 for r in results if r['status'] == 'ok')\n    return {'healthy': healthy, 'total': 50}\n\nprint(asyncio.run(check_all()))"
      }
    },
    {
      "type": "tool_use",
      "id": "toolu_def456",
      "name": "check_status",
      "input": {"server": "server_0"},
      "caller": {
        "type": "code_execution_20250825",
        "tool_id": "srvtoolu_abc123"
      }
    }
  ],
  "container": {
    "id": "container_xyz789",
    "expires_at": "2025-01-15T14:30:00Z"
  },
  "stop_reason": "tool_use"
}

Notice the caller field in the tool use block—it indicates this call originated from the code execution environment, not from Claude's standard tool use path. Your application returns the tool result, execution resumes, and Claude eventually receives only the final processed output.

Important: PTC does not guarantee parallel execution. The script may express parallelism (via asyncio.gather() as shown above), but whether tool calls actually execute concurrently depends entirely on how your host application handles the emitted requests. PTC's primary win is context and token locality—the orchestration logic lives in code rather than conversation turns—even when execution is ultimately sequential. If you're firing many tool calls in rapid succession, also consider rate-limiting implications on the receiving services.
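
To illustrate the rate-limiting point, a host application can bound how many emitted tool calls it resolves at once. This is only a sketch: the resolve callable stands in for whatever dispatcher your application uses to actually execute a tool request, and is not part of any SDK.

import asyncio
from typing import Any, Awaitable, Callable

MAX_CONCURRENT_CALLS = 5

async def resolve_all(
    pending_calls: list[tuple[str, dict]],
    resolve: Callable[[str, dict], Awaitable[Any]],
) -> list[Any]:
    """Resolve every tool call a PTC script emitted, with bounded concurrency.

    `resolve` is your own (hypothetical) dispatcher, e.g. a function that
    forwards the request to the right MCP server and returns its result.
    """
    semaphore = asyncio.Semaphore(MAX_CONCURRENT_CALLS)

    async def limited(name: str, args: dict) -> Any:
        async with semaphore:  # at most MAX_CONCURRENT_CALLS in flight at once
            return await resolve(name, args)

    return await asyncio.gather(*(limited(name, args) for name, args in pending_calls))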

A Note on Versioning

Throughout this article, you'll see identifiers like:

  • code_execution_20250825
  • advanced-tool-use-2025-11-20
  • claude-sonnet-4-5

These are the current beta identifiers as of January 2026. Anthropic has historically renamed or replaced similar identifiers without extended deprecation windows. Treat them as illustrative of the current implementation, not as stable public contracts. Always check the official documentation for current values before deploying to production.

Prerequisites

Before implementing PTC, ensure you have:

  • Claude Code CLI installed (npm install -g @anthropic-ai/claude-code)
  • Python 3.10+ with uv or pip for package management
  • MCP Python SDK (pip install mcp or uv add mcp)
  • Anthropic API access with the appropriate beta header (currently advanced-tool-use-2025-11-20)

Building a PTC-Compatible MCP Server

Let's build a practical example: a finance tools server that fetches stock prices. We'll structure it to enable programmatic calling, which means Claude can batch-fetch prices for multiple tickers in a single execution pass.

Project Setup

mkdir finance-mcp-server
cd finance-mcp-server
uv init
uv add mcp httpx

Server Implementation

Create server.py:

from mcp.server.fastmcp import FastMCP
import asyncio
import random

mcp = FastMCP("Finance-Tools")

async def fetch_price_from_api(ticker: str) -> float:
    """Simulate external API call with network latency."""
    await asyncio.sleep(0.3)  # Realistic API delay
    # In production: httpx.get(f"https://api.example.com/price/{ticker}")
    return round(random.uniform(50, 500), 2)


@mcp.tool(
    name="get_stock_price",
    description="Fetch current stock price for a ticker symbol. "
                "Efficient for batch operations (parallelized when runner supports it).",
)
async def get_stock_price(ticker: str) -> dict:
    """
    Fetch the current price of a stock.
    
    Args:
        ticker: Stock symbol (e.g., AAPL, MSFT, GOOGL)
    
    Returns:
        Dictionary with ticker and current price
    """
    price = await fetch_price_from_api(ticker)
    return {"ticker": ticker, "price": price, "currency": "USD"}


@mcp.tool(
    name="get_company_info",
    description="Fetch company information including sector and market cap.",
)
async def get_company_info(ticker: str) -> dict:
    """
    Fetch company metadata.
    
    Args:
        ticker: Stock symbol
    
    Returns:
        Company information dictionary
    """
    await asyncio.sleep(0.2)
    # Simulated data
    companies = {
        "AAPL": {"name": "Apple Inc.", "sector": "Technology", "market_cap": "3.0T"},
        "MSFT": {"name": "Microsoft Corporation", "sector": "Technology", "market_cap": "2.8T"},
        "GOOGL": {"name": "Alphabet Inc.", "sector": "Technology", "market_cap": "1.9T"},
    }
    return companies.get(ticker, {"name": "Unknown", "sector": "Unknown", "market_cap": "N/A"})


if __name__ == "__main__":
    mcp.run()

The allowed_callers Configuration

Here's where things get tricky: the FastMCP decorator doesn't currently expose allowed_callers directly in all SDK versions. You need to ensure the tool definition sent to Claude includes this field. There are three approaches, each with trade-offs:

Option 1: Check SDK Support

Recent versions of the MCP SDK (late 2025) may support this directly:

@mcp.tool(
    name="get_stock_price",
    allowed_callers=["code_execution_20250825"]  # If supported
)
async def get_stock_price(ticker: str) -> dict:
    ...

This is the cleanest approach when available. Check your SDK version's documentation.
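
If you're not sure which SDK version you have before choosing between these options, you can check the installed distribution directly. This assumes the SDK is installed under its PyPI name, mcp:

from importlib.metadata import PackageNotFoundError, version

try:
    print("mcp SDK version:", version("mcp"))
except PackageNotFoundError:
    print("mcp SDK is not installed in this environment")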

Option 2: Low-Level Server Definition

If the decorator doesn't support it, use the lower-level Server class:

from mcp.server import Server
from mcp.types import Tool, TextContent
import json

server = Server("Finance-Tools")

@server.list_tools()
async def list_tools():
    return [
        Tool(
            name="get_stock_price",
            description="Fetch current stock price for a ticker symbol.",
            inputSchema={
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Stock symbol"}
                },
                "required": ["ticker"]
            },
            # This is the critical field for PTC
            allowed_callers=["code_execution_20250825"]
        )
    ]

@server.call_tool()
async def call_tool(name: str, arguments: dict):
    if name == "get_stock_price":
        ticker = arguments["ticker"]
        # Reuses the fetch_price_from_api helper defined earlier in server.py
        price = await fetch_price_from_api(ticker)
        return [TextContent(type="text", text=json.dumps({
            "ticker": ticker, 
            "price": price, 
            "currency": "USD"
        }))]

More verbose, but gives you full control over the tool schema.
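
One thing the snippet above omits is the transport loop: unlike FastMCP's mcp.run(), the low-level Server has to be wired to stdio explicitly. A minimal entry point looks roughly like this; verify the helper names against your installed SDK version:

import asyncio
from mcp.server.stdio import stdio_server

async def main():
    # Serve the low-level Server over stdio so Claude Code can launch it
    # as a subprocess, the same transport FastMCP uses by default.
    async with stdio_server() as (read_stream, write_stream):
        await server.run(
            read_stream,
            write_stream,
            server.create_initialization_options(),
        )

if __name__ == "__main__":
    asyncio.run(main())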

Option 3: Post-Process Tool Definitions

Intercept the tool list before it's sent and inject the field:

# Monkey-patch approach — use with caution
original_list_tools = mcp._list_tools_handler

async def patched_list_tools():
    tools = await original_list_tools()
    for tool in tools:
        if tool.name in ["get_stock_price", "get_company_info"]:
            tool.allowed_callers = ["code_execution_20250825"]
    return tools

mcp._list_tools_handler = patched_list_tools

This is fragile—it depends on internal SDK structure that may change. Use it as a stopgap while waiting for proper SDK support, not as a long-term solution.

The correct approach depends on your SDK version. Check the official MCP documentation for current recommendations.

What About Claude Desktop Users?

Since November 2025, Claude Code has been available directly in the Claude Desktop app. This is significant: you no longer need the terminal to access Claude Code's agentic capabilities. Max, Pro, Team, and Enterprise users can run multiple Claude Code sessions in parallel from the Desktop interface—one fixing bugs, another researching GitHub, a third updating documentation.

However, Programmatic Tool Calling is not yet available through Claude Code—whether you access it via terminal or Desktop app. As of January 2026, there's an open feature request on GitHub asking Anthropic to add support for the allowed_callers flag and related PTC betas to Claude Code. The request has significant community support but hasn't been implemented.

Here's the current state:

Access Method                 MCP Support   PTC Support
Claude API (direct)           Yes           Yes (beta)
Claude Code CLI               Yes           Not yet
Claude Code in Desktop        Yes           Not yet
Claude Desktop (chat mode)    Yes           No

What this means for end users:

  • If you're using Claude Code through the Desktop app with MCP servers, you get standard tool calling—which is already powerful for most workflows.
  • Your MCP servers' allowed_callers field will be ignored; Claude Code doesn't currently pass this to the API with the required beta header.
  • For batch operations where PTC would shine, Claude Code's existing capabilities (running bash commands, iterating on code, parallel sessions via git worktrees) often achieve similar practical outcomes through different mechanisms.

If you specifically need PTC today, direct API integration is the only path. This typically means building a custom application that includes both the code_execution_20250825 tool and the advanced-tool-use-2025-11-20 beta header in requests.

The good news: Anthropic is clearly aware of demand for PTC in Claude Code. Given that the infrastructure exists at the API level, it's reasonable to expect this capability will eventually surface in Claude Code—which would automatically make it available in the Desktop app as well. Watch the release notes.

Configuring Claude Code

MCP server configuration differs between Claude Code (CLI) and Claude Desktop. Since this article focuses on Claude Code, we'll cover the CLI workflow—but I'll note the Desktop path for completeness.

Claude Code (CLI)

Use the MCP wizard:

claude mcp add

Enter:

  • Name: finance-tools
  • Command: uv
  • Arguments: run server.py

This creates or updates a .mcp.json file in your project directory (or user-level config). You can also edit it directly:

{
  "mcpServers": {
    "finance-tools": {
      "command": "uv",
      "args": ["run", "server.py"],
      "cwd": "/path/to/finance-mcp-server"
    }
  }
}

For user-scoped configuration (available across all projects), use claude mcp add --scope user.

Claude Desktop (for reference)

If you're using Claude Desktop instead, the configuration file is different:

  • macOS: ~/Library/Application Support/Claude/claude_desktop_config.json
  • Windows: %APPDATA%\Claude\claude_desktop_config.json
  • Linux: ~/.config/Claude/claude_desktop_config.json

The JSON structure is similar, but Claude Code CLI won't read this file—they're separate configuration systems.

Verifying the Configuration

Start Claude Code and check MCP status:

claude
> /mcp

You should see finance-tools: connected. Test the basic tool:

> Get the stock price for AAPL

Claude should call get_stock_price and return a result.

Triggering Programmatic Tool Calling

PTC activates when Claude determines that code-based orchestration would be more efficient. You can encourage this with explicit prompts:

> Get the current stock price for these 10 tickers: AAPL, MSFT, GOOGL, AMZN, 
> TSLA, NVDA, META, NFLX, AMD, INTC. Compare them and tell me which has the 
> highest price. Use code execution to fetch them efficiently.

Without PTC, you'd see ten sequential tool calls in the conversation, each appearing as a separate Tool Use: get_stock_price event.

With PTC, you'll see:

  1. A single code_execution block containing a script
  2. Tool calls originating from within that script (visible in debug logs)
  3. Claude receiving only the final processed result

The visual indicator is subtle—look for a "Running code..." status or a single tool use event that encompasses multiple underlying calls. Per the earlier caveat, though, you'll only observe this in Claude Code once PTC support ships there; today the behavior is visible through the raw API.

Using PTC with the Raw API

If you're building your own agent rather than using Claude Code, here's how to enable PTC in API calls:

from anthropic import Anthropic

client = Anthropic()

response = client.beta.messages.create(
    model="claude-sonnet-4-5",  # Check current model strings
    betas=["advanced-tool-use-2025-11-20"],  # Check current beta header
    max_tokens=4096,
    messages=[
        {
            "role": "user",
            "content": "Get stock prices for AAPL, MSFT, and GOOGL"
        }
    ],
    tools=[
        # Enable code execution
        {
            "type": "code_execution_20250825",  # Check current version
            "name": "code_execution"
        },
        # Your PTC-enabled tool
        {
            "name": "get_stock_price",
            "description": "Fetch current stock price. Efficient for batch operations.",
            "input_schema": {
                "type": "object",
                "properties": {
                    "ticker": {"type": "string", "description": "Stock symbol"}
                },
                "required": ["ticker"]
            },
            "allowed_callers": ["code_execution_20250825"]
        }
    ]
)

When processing the response, check for tool calls with a caller field—these need to be executed and their results fed back into the code execution context, not returned directly to Claude.
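
Here's a minimal sketch of that host-side loop. It assumes a hypothetical execute_tool(name, arguments) dispatcher that actually runs your tool and returns a JSON-serializable result; block attributes and the beta identifiers should be checked against the current SDK.

import json

def run_ptc_loop(client, messages, tools, execute_tool):
    """Keep resolving tool calls until Claude produces a final answer.

    execute_tool(name, arguments) is your own dispatcher (hypothetical here);
    it runs the tool and returns a JSON-serializable result.
    """
    while True:
        response = client.beta.messages.create(
            model="claude-sonnet-4-5",               # check current model strings
            betas=["advanced-tool-use-2025-11-20"],  # check current beta header
            max_tokens=4096,
            messages=messages,
            tools=tools,
        )
        if response.stop_reason != "tool_use":
            return response  # no more tool calls to resolve

        # Echo the assistant turn, then answer every tool_use block it contains.
        messages.append({"role": "assistant", "content": response.content})
        tool_results = []
        for block in response.content:
            if block.type != "tool_use":
                continue  # skips text and server_tool_use (code execution) blocks
            # Calls emitted from inside code execution carry a `caller` field;
            # their results are routed back into the container, not into
            # Claude's conversational context.
            result = execute_tool(block.name, block.input)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": block.id,
                "content": json.dumps(result),
            })
        messages.append({"role": "user", "content": tool_results})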

Troubleshooting Common Issues

"Claude is still calling tools one by one"

Cause: The allowed_callers field is missing from the tool definition.

Diagnosis: Use the MCP Inspector to verify:

npx @modelcontextprotocol/inspector

Connect to your server and inspect the list_tools output. Look for:

{
  "name": "get_stock_price",
  "allowed_callers": ["code_execution_20250825"]
}

If allowed_callers is absent, review your server implementation.
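
You can also dump the advertised tool definitions programmatically with the SDK's stdio client. Adjust the command and arguments to however you launch your server; the exact client API may vary by SDK version.

import asyncio
from mcp import ClientSession, StdioServerParameters
from mcp.client.stdio import stdio_client

async def dump_tool_definitions():
    # Launch the server the same way Claude Code would, then print what it advertises.
    params = StdioServerParameters(command="uv", args=["run", "server.py"])
    async with stdio_client(params) as (read_stream, write_stream):
        async with ClientSession(read_stream, write_stream) as session:
            await session.initialize()
            result = await session.list_tools()
            for tool in result.tools:
                # If the server set allowed_callers, it should survive serialization.
                print(tool.name, getattr(tool, "allowed_callers", None))

asyncio.run(dump_tool_definitions())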

Workaround: Explicitly prompt Claude: "Use code execution to handle these calls efficiently."

"Version string not recognized"

Cause: The version identifier has changed since this article was written.

Fix: Check the current API documentation for the active version string. These identifiers are beta-scoped and subject to change.

"Code execution not available"

Cause: The code execution tool isn't enabled in your request.

Fix: Include both the code execution tool and the beta header:

tools=[
    {"type": "code_execution_20250825", "name": "code_execution"},
    # ... your other tools
]
betas=["advanced-tool-use-2025-11-20"]

"Container expired"

Cause: Code execution containers have a TTL. If you wait too long between tool result submissions, the container expires.

Fix: Process tool results promptly. For long-running operations, consider chunking work into smaller execution blocks.
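
To catch this proactively, the container metadata shown earlier includes an expiry timestamp you can check before resuming. A small helper, assuming the ISO-8601 expires_at field from the response schema above (field access may differ by SDK version):

from datetime import datetime, timezone

def container_seconds_remaining(container: dict) -> float:
    # container is the "container" object from the response, e.g.
    # {"id": "container_xyz789", "expires_at": "2025-01-15T14:30:00Z"}
    expires_at = datetime.fromisoformat(container["expires_at"].replace("Z", "+00:00"))
    return (expires_at - datetime.now(timezone.utc)).total_seconds()

# Before submitting tool results, decide whether resuming is still worthwhile:
# if container_seconds_remaining(response_container) < 10:
#     start a fresh request instead of resuming an expired container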

Security Considerations

PTC involves Claude writing and executing code. This creates surface area for prompt injection and unintended behavior.

Input Validation: Ensure your tools validate inputs. If Claude passes malformed data from the script, your tool should fail gracefully with clear error messages.

Output Sanitization: Tool results fed back into code execution can influence subsequent code generation. Avoid returning data that could be interpreted as executable code or control characters.
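
As an illustration of both points, here are the kinds of helpers you might call inside get_stock_price before accepting input or returning output. The ticker pattern is deliberately simplistic (real symbols can contain dots and hyphens), so treat this as a sketch rather than production-grade validation:

import re
import unicodedata

# Simplistic pattern for illustration only; real tickers can include dots and hyphens.
TICKER_PATTERN = re.compile(r"^[A-Z]{1,5}$")

def validate_ticker(ticker: str) -> str:
    """Fail fast with a clear message instead of passing malformed input downstream."""
    ticker = ticker.strip().upper()
    if not TICKER_PATTERN.fullmatch(ticker):
        raise ValueError(f"Invalid ticker symbol: {ticker!r}")
    return ticker

def sanitize_text(value: str, max_length: int = 1000) -> str:
    """Strip control characters and cap length before a result re-enters code execution."""
    cleaned = "".join(
        ch for ch in value
        if unicodedata.category(ch)[0] != "C" or ch in ("\n", "\t")
    )
    return cleaned[:max_length]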

State Leakage Within Execution: This is the risk many articles miss. Within a single PTC execution block, intermediate variables persist. Malicious or malformed tool output from an early call can poison subsequent steps—variables remain in scope, and Claude's generated code may reference or manipulate them in unexpected ways. PTC increases the blast radius of a single bad tool response compared to standard calling, where each response is isolated in conversation context.

Permission Boundaries: PTC doesn't bypass Claude Code's permission system. Even within a script, tool calls that require approval will pause execution.

Container Isolation: Code execution runs in a sandboxed container. It can't access your file system, network (except through defined tools), or other system resources directly. However, the container itself persists across tool calls within one execution block—that's the state leakage vector mentioned above.

When PTC Might Not Be the Right Choice

PTC isn't universally superior to standard tool calling. Consider the trade-offs:

Debugging Opacity: When something goes wrong in a PTC execution, you're debugging generated code. The orchestration logic that would have been visible as conversational reasoning is now embedded in a script you didn't write. Error localization is harder.

Observability Gaps: Standard tool calling produces a clear audit trail: each call appears in the conversation, with Claude's reasoning visible between calls. PTC compresses this into a code block and final output. For compliance-sensitive workflows, this may be problematic.

Human-in-the-Loop Friction: If your workflow requires human approval at intermediate steps, standard calling is more natural. PTC wants to run to completion; interrupting mid-execution requires breaking the container session.

Simple Tasks: The overhead of PTC—script generation, container management, execution state—makes sense for batch operations, not for checking the weather in one city. If you're calling a tool once, standard calling is simpler and equally fast.

Unpredictable Tool Behavior: If your tools have side effects that require careful sequencing or rollback capabilities, explicit conversational orchestration gives you more control points.

Advanced Pattern: Multi-Tool Orchestration

PTC becomes particularly powerful when orchestrating multiple tools:

> For each of these companies (AAPL, MSFT, GOOGL), get both the stock price 
> and company info. Calculate a "value score" as (price / market_cap_billions) 
> and rank them. Use code execution to handle this efficiently.

Claude might generate something like:

import asyncio

async def analyze_companies(tickers):
    results = []
    for ticker in tickers:
        # Tools are exposed as async functions in the execution environment
        price_data = await get_stock_price(ticker)
        info = await get_company_info(ticker)
        
        # Parse market cap (e.g., "3.0T" -> 3000)
        cap_str = info["market_cap"]
        if cap_str.endswith("T"):
            cap_billions = float(cap_str[:-1]) * 1000
        elif cap_str.endswith("B"):
            cap_billions = float(cap_str[:-1])
        else:
            cap_billions = 1  # fallback
        
        value_score = price_data["price"] / cap_billions
        results.append({
            "ticker": ticker,
            "price": price_data["price"],
            "market_cap": info["market_cap"],
            "value_score": round(value_score, 6)
        })
    
    # Sort by value score descending
    results.sort(key=lambda x: x["value_score"], reverse=True)
    return results

tickers = ["AAPL", "MSFT", "GOOGL"]
rankings = asyncio.run(analyze_companies(tickers))
print(rankings)

Six tool calls, data transformation, ranked output—all happening in one execution block. Claude's context receives only the final rankings, not six separate tool results plus reasoning about how to combine them.
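
If the runner resolves emitted calls concurrently, the same per-ticker work can be expressed as a fan-out instead of a loop. Whether this actually runs in parallel depends on the host application, as noted earlier; the scoring and sorting would then proceed exactly as above. An illustrative variant:

import asyncio

async def fetch_company_data(ticker):
    # Both tool calls for one ticker, issued together.
    price_data, info = await asyncio.gather(
        get_stock_price(ticker), get_company_info(ticker)
    )
    return ticker, price_data, info

async def fetch_all(tickers):
    # Fan out across all tickers; the host decides how concurrently
    # the underlying tool calls are actually resolved.
    return await asyncio.gather(*(fetch_company_data(t) for t in tickers))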

Conclusion

Programmatic Tool Calling changes how Claude interacts with external tools—from conversational back-and-forth to code-driven orchestration. In practice, PTC can significantly reduce latency and token consumption for batch operations and complex multi-tool workflows, though the benefits depend heavily on task characteristics and implementation details.

The implementation requires attention to detail: ensuring the allowed_callers field is properly set, enabling both code execution and the appropriate beta header, and designing tools that benefit from batch execution. The debugging and observability trade-offs are real, and PTC isn't the right choice for every workflow.

For developers building with Claude Code, PTC isn't available yet, but MCP servers that advertise compatibility via allowed_callers will be ready the moment support lands. For those building custom agents on the API, it's a beta feature worth experimenting with—particularly for data-intensive operations where context management becomes a bottleneck.

The agent doesn't just call your tools. It writes code that orchestrates them. Whether that's the right architecture for your use case depends on what you're optimizing for.

