lab 02 | ~7 min | segment 2

One argument moves you to production.

stdio is a local pipe. It works for one client on your machine and dies the moment you want more. Production MCP is Streamable HTTP, never stdio. The good news: the server you wrote in Lab 01 does not change. You flip one argument in run(), and the same tools now answer HTTP at /mcp. Then ngrok puts that endpoint on the public internet so an agent anywhere can reach it.

why stdio dies in prod

stdio binds the server to one process pair on one machine: one client, talking over one stdin/stdout pipe. There is no addressing, no concurrency, no network. It is local glue, and it does not scale to many concurrent clients. The fix is not a rewrite. It is a transport swap.

step 1

Change one argument.

Take server.py from Lab 01. The only edit is the run() call. Everything above it is byte-for-byte identical.

server.py (the diff)
-    mcp.run()
+    mcp.run(transport="http", host="127.0.0.1", port=8000)

transport="http" is FastMCP's Streamable HTTP transport. The string "streamable-http" is an accepted alias for the exact same thing, so you will see both in the wild. The full file, for copy-paste convenience:

server.py (full)
from fastmcp import FastMCP

mcp = FastMCP("hello-toolsmith")

@mcp.tool
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b

if __name__ == "__main__":
    # stdio was a local pipe. "http" is FastMCP's Streamable HTTP transport.
    mcp.run(transport="http", host="127.0.0.1", port=8000)

step 2

Boot it and watch it bind a port.

This time the server does not sit silent. It binds a port and logs that it is listening. The MCP endpoint is mounted at /mcp.

terminal A (leave running)
python server.py

what you'll see
Starting MCP server 'hello-toolsmith' with transport 'http'
Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
  -> MCP endpoint: http://127.0.0.1:8000/mcp/

step 3

Confirm the endpoint answers.

Open a second terminal and POST an initialize to /mcp. Streamable HTTP requires the client to accept both JSON and the SSE stream, so the Accept header lists both. The -L flag is the load-bearing detail (see the trap below). The server replies with its capabilities.

terminal B
curl -s -L -X POST http://127.0.0.1:8000/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'

powershell (terminal B)
$body = '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'
curl.exe -s -L -X POST http://127.0.0.1:8000/mcp `
  -H "Content-Type: application/json" `
  -H "Accept: application/json, text/event-stream" `
  -d $body

what you'll see (HTTP 200, an mcp-session-id header, then the SSE frame)
event: message
data: {"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2025-06-18",
 "capabilities":{"tools":{"listChanged":true}},
 "serverInfo":{"name":"hello-toolsmith","version":"3.3.1"}}}

The reply arrives as a text/event-stream frame: that is the "Streamable" half of Streamable HTTP. The server can also open a long-lived stream to push notifications. A real client reads the mcp-session-id response header and sends it back on every following request.
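The same handshake can be sketched in Python with only the standard library, to make the two moving parts concrete: parsing the SSE frame and echoing the session id. Treat this as a minimal sketch, not a real MCP client. The helper names (build_headers, parse_sse_messages, decode_reply, post_jsonrpc, main) are made up for illustration, the URL assumes the canonical path your server printed in step 2, and it assumes the server closes the response stream once it has answered a POST.

```python
import json
import urllib.request

# Assumption: the canonical endpoint printed by the server in step 2.
MCP_URL = "http://127.0.0.1:8000/mcp/"

INITIALIZE = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "initialize",
    "params": {
        "protocolVersion": "2025-06-18",
        "capabilities": {},
        "clientInfo": {"name": "py-sketch", "version": "0"},
    },
}


def build_headers(session_id=None):
    """Headers for every POST; the session id is added once we have one."""
    headers = {
        "Content-Type": "application/json",
        "Accept": "application/json, text/event-stream",
    }
    if session_id:
        headers["Mcp-Session-Id"] = session_id
    return headers


def parse_sse_messages(body):
    """Return the decoded JSON payload of each SSE event in `body`."""
    messages, data_lines = [], []
    for line in body.splitlines() + [""]:  # trailing "" flushes the last event
        if line.startswith("data:"):
            data_lines.append(line[5:].removeprefix(" "))
        elif line == "" and data_lines:
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    return messages


def decode_reply(content_type, body):
    """The server may answer with an SSE frame or with plain JSON."""
    if content_type == "text/event-stream":
        return parse_sse_messages(body)[0]
    return json.loads(body)


def post_jsonrpc(payload, session_id=None):
    """POST one JSON-RPC message; return (session id, content type, body)."""
    request = urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers=build_headers(session_id),
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        body = response.read().decode("utf-8")
        return (
            response.headers.get("mcp-session-id"),
            response.headers.get_content_type(),
            body,
        )


def main():
    # Needs the server from step 2 running in terminal A.
    session_id, content_type, body = post_jsonrpc(INITIALIZE)
    reply = decode_reply(content_type, body)
    print("session:", session_id)
    print("server:", reply["result"]["serverInfo"])

    # Confirm initialization, then reuse the same session id for a real call.
    post_jsonrpc({"jsonrpc": "2.0", "method": "notifications/initialized"}, session_id)
    _, content_type, body = post_jsonrpc(
        {"jsonrpc": "2.0", "id": 2, "method": "tools/list"}, session_id
    )
    print("tools:", decode_reply(content_type, body)["result"]["tools"])
```

Call main() with the server running. The sketch targets the slash form of the path on purpose: urllib, unlike curl -L, will not quietly replay a POST for you after a redirect.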

the trailing-slash trap (this one bites people)

FastMCP mounts the endpoint at /mcp/ with a trailing slash, and a request to /mcp without the slash gets a 307 redirect to the canonical path. A 307 preserves your POST method and body, but curl does not follow any redirect unless you pass -L. Without -L, curl -s prints the empty 307 response and exits without an error. So either hit the path the server prints exactly, or always pass -L and stop thinking about it. Leaving -L off is a silent "why is my response empty" afternoon.
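Clients that are not curl have no -L to lean on. Python's urllib.request, for one, appears to raise HTTPError on a 307 answer to a POST rather than re-sending it. One way around the trap is to normalize the URL before the first request. The helper below is a hypothetical sketch (canonical_mcp_url is not part of FastMCP), and it assumes the slash form is the canonical one, as described above. If your server prints a different path, trust the server.

```python
from urllib.parse import urlsplit, urlunsplit


def canonical_mcp_url(url):
    """Return `url` with exactly one trailing slash on its path."""
    parts = urlsplit(url)
    path = parts.path.rstrip("/") + "/"
    return urlunsplit(parts._replace(path=path))


print(canonical_mcp_url("http://127.0.0.1:8000/mcp"))
print(canonical_mcp_url("http://127.0.0.1:8000/mcp/"))
```

Both calls print http://127.0.0.1:8000/mcp/, so the client never meets the redirect in the first place.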

step 4

Put it on the public internet with ngrok.

Your server listens on localhost. ngrok opens a secure tunnel and hands you a public HTTPS URL that forwards to port 8000. One-time setup: sign up free, grab your authtoken, register it.

one-time setup
ngrok config add-authtoken YOUR_TOKEN_HERE

Then, in a third terminal, point ngrok at port 8000:

terminal C (leave running)
ngrok http 8000

what you'll see
Forwarding   https://a1b2-203-0-113-7.ngrok-free.app -> http://localhost:8000

Your public MCP endpoint is now that URL with /mcp on the end. Hand it to any remote client. Prove it from anywhere (same -L as before):

from any machine
curl -s -L -X POST https://a1b2-203-0-113-7.ngrok-free.app/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"remote","version":"0"}}}'

stdio

  • reach: one local process pair
  • clients: exactly one
  • addressing: none, it is a pipe
  • use it for: local dev, Claude Desktop, CLI tools
  • scales to: nothing past one client

streamable http

  • reach: any network client over HTTPS
  • clients: many, session-scoped
  • addressing: a URL at /mcp
  • use it for: production, remote agents, the public web
  • scales to: one session per tenant, behind a load balancer

the silent-failure trap: CORS

The moment a browser-based agent tries to reach your server, the browser sends a preflight and your server must return CORS headers on /mcp and on /.well-known/*. Miss them and the browser blocks the request. Nothing useful lands in your server logs; you just get a dead client. curl does not enforce CORS and ngrok forwards requests unchanged, so your terminal tests pass and the failure only shows up once a browser is involved. FastMCP lets you pass CORS middleware when you build the HTTP app; allow your agent's origin, the POST and GET methods, and the Mcp-Session-Id header.
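To see what "return CORS headers" means in practice, here is a framework-free sketch of the headers a preflight answer needs for an allowed origin. It is not FastMCP's API; in a real server the CORS middleware emits these for you. The origin https://agent.example.com and the name cors_headers are placeholders. The Expose-Headers line is an addition beyond the list above: without it, browser JavaScript cannot read the mcp-session-id response header at all.

```python
# Hypothetical values: swap in your agent's real origin.
ALLOWED_ORIGINS = {"https://agent.example.com"}
ALLOWED_METHODS = "GET, POST, OPTIONS"
ALLOWED_HEADERS = "Content-Type, Accept, Mcp-Session-Id"


def cors_headers(origin):
    """Headers to attach for `origin`, or {} when the origin is not allowed."""
    if origin not in ALLOWED_ORIGINS:
        return {}
    return {
        "Access-Control-Allow-Origin": origin,
        "Access-Control-Allow-Methods": ALLOWED_METHODS,
        "Access-Control-Allow-Headers": ALLOWED_HEADERS,
        # Lets browser code read the session id from the response.
        "Access-Control-Expose-Headers": "Mcp-Session-Id",
        # Tells caches that the answer depends on the request's Origin.
        "Vary": "Origin",
    }


for name, value in cors_headers("https://agent.example.com").items():
    print(f"{name}: {value}")
```

An origin outside the allow list gets an empty dict, which is exactly the "missing headers" case the browser blocks.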

production posture (the one-liner to remember)

One MCP session per tenant, isolated. Cloudflare's Agents Week recipe runs each tenant's session in its own Worker with an isolated SQLite-backed Durable Object, so one noisy tenant cannot see or starve another. You do not need that tonight. You need to know that stdio in prod is the number-one architecture mistake, and you have already avoided it.