stdio is a local pipe. It works for one client on your machine and dies the moment you want more. Production MCP is Streamable HTTP, never stdio. The good news: the server you wrote in Lab 01 does not change. You flip one argument in run(), and the same tools now answer HTTP at /mcp. Then ngrok puts that endpoint on the public internet so an agent anywhere can reach it.
stdio binds the server to one process pair on one machine: one client, talking over one stdin/stdout pipe. There is no addressing, no concurrency, no network. It is local glue, and it does not scale to many concurrent clients. The fix is not a rewrite. It is a transport swap.
Take server.py from Lab 01. The only edit is the run() call. Everything above it is byte-for-byte identical.
- mcp.run()
+ mcp.run(transport="http", host="127.0.0.1", port=8000)
transport="http" is FastMCP's Streamable HTTP transport. The string "streamable-http" is an accepted alias for the exact same thing, so you will see both in the wild. The full file, for copy-paste convenience:
from fastmcp import FastMCP

mcp = FastMCP("hello-toolsmith")


@mcp.tool
def add(a: int, b: int) -> int:
    """Add two integers and return the sum."""
    return a + b


if __name__ == "__main__":
    # stdio was a local pipe. "http" is FastMCP's Streamable HTTP transport.
    mcp.run(transport="http", host="127.0.0.1", port=8000)
This time the server does not sit silent. It binds a port and logs that it is listening. The MCP endpoint is mounted at /mcp.
python server.py
Starting MCP server 'hello-toolsmith' with transport 'http'
Uvicorn running on http://127.0.0.1:8000 (Press CTRL+C to quit)
-> MCP endpoint: http://127.0.0.1:8000/mcp/
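Open a second terminal and POST an initialize request to /mcp. Streamable HTTP requires the client to accept both JSON and an SSE stream, so the Accept header lists both. The -L flag is the load-bearing detail (see the trap below). The first command is for bash; the second is the PowerShell equivalent, which calls curl.exe because plain curl is an alias for Invoke-WebRequest in Windows PowerShell. The server replies with its capabilities.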
curl -s -L -X POST http://127.0.0.1:8000/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'
$body = '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"curl","version":"0"}}}'
curl.exe -s -L -X POST http://127.0.0.1:8000/mcp `
-H "Content-Type: application/json" `
-H "Accept: application/json, text/event-stream" `
-d $body
event: message
data: {"jsonrpc":"2.0","id":1,"result":{"protocolVersion":"2025-06-18",
"capabilities":{"tools":{"listChanged":true}},
"serverInfo":{"name":"hello-toolsmith","version":"3.3.1"}}}
The reply arrives as a text/event-stream frame: that is the "Streamable" half of Streamable HTTP. The server can also open a long-lived stream to push notifications. A real client reads the mcp-session-id response header and sends it back on every following request.
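The same handshake can be sketched in Python with only the standard library. This is a minimal illustration, not how FastMCP's own client works: the helper names (build_initialize_request, parse_sse_messages, follow_up_headers) are made up for this example, and the parser handles only the event/data fields shown in the reply above.

```python
import json
from urllib.request import Request

# Use the exact path the server logs at startup so no redirect is involved.
MCP_URL = "http://127.0.0.1:8000/mcp/"
ACCEPT = "application/json, text/event-stream"


def build_initialize_request(url: str = MCP_URL) -> Request:
    """Build (but do not send) the same initialize POST as the curl command."""
    payload = {
        "jsonrpc": "2.0",
        "id": 1,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-06-18",
            "capabilities": {},
            "clientInfo": {"name": "python-sketch", "version": "0"},
        },
    }
    return Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={"Content-Type": "application/json", "Accept": ACCEPT},
    )


def parse_sse_messages(body: str) -> list[dict]:
    """Extract the JSON-RPC messages carried in the data: lines of an SSE body."""
    messages = []
    data_lines = []
    # A blank line ends an event; the extra "" flushes an unterminated last event.
    for line in body.splitlines() + [""]:
        if line.startswith("data:"):
            value = line[len("data:"):]
            # SSE strips one optional space after the colon.
            data_lines.append(value[1:] if value.startswith(" ") else value)
        elif line == "" and data_lines:
            messages.append(json.loads("\n".join(data_lines)))
            data_lines = []
    return messages


def follow_up_headers(session_id: str) -> dict[str, str]:
    """Headers for every request after initialize: echo the session id back."""
    return {
        "Content-Type": "application/json",
        "Accept": ACCEPT,
        "Mcp-Session-Id": session_id,
    }
```

To use it against the running server, pass the request to urllib.request.urlopen, read the session id with resp.headers.get("mcp-session-id"), and feed resp.read().decode() to parse_sse_messages.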
FastMCP mounts the endpoint at /mcp/ with a trailing slash, and a request to the other form 307-redirects to the canonical one. A 307 preserves your POST body and method, but curl does not follow any redirect unless you pass -L. So either hit the path the server prints exactly, or always pass -L and stop thinking about it. Leaving -L off is a silent "why is my body empty" afternoon.
Your server listens on localhost. ngrok opens a secure tunnel and hands you a public HTTPS URL that forwards to port 8000. One-time setup: sign up free, grab your authtoken, register it.
ngrok config add-authtoken YOUR_TOKEN_HERE
Then, in a third terminal, point ngrok at port 8000:
ngrok http 8000
Forwarding https://a1b2-203-0-113-7.ngrok-free.app -> http://localhost:8000
Your public MCP endpoint is now that URL with /mcp on the end. Hand it to any remote client. Prove it from anywhere (same -L as before):
curl -s -L -X POST https://a1b2-203-0-113-7.ngrok-free.app/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2025-06-18","capabilities":{},"clientInfo":{"name":"remote","version":"0"}}}'
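If you would rather not copy the forwarding URL by hand, the ngrok agent also serves a small local API. The sketch below assumes the agent's default inspection address (http://127.0.0.1:4040) and the public_url field in its /api/tunnels response; check both against your ngrok version. The function names are illustrative.

```python
import json
from urllib.request import urlopen

# Assumption: the ngrok agent's local API at its default address.
NGROK_API = "http://127.0.0.1:4040/api/tunnels"


def public_mcp_url(tunnels_payload: dict, path: str = "/mcp") -> str:
    """Pick the first HTTPS tunnel from the payload and append the MCP path."""
    for tunnel in tunnels_payload.get("tunnels", []):
        url = tunnel.get("public_url", "")
        if url.startswith("https://"):
            return url.rstrip("/") + path
    raise LookupError("no HTTPS tunnel found; is `ngrok http 8000` running?")


def fetch_public_mcp_url() -> str:
    """Ask the local ngrok agent for its tunnels and build the endpoint URL."""
    with urlopen(NGROK_API, timeout=5) as resp:
        return public_mcp_url(json.load(resp))
```

With the tunnel from the example above running, fetch_public_mcp_url() would return the forwarding URL with /mcp appended.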
The moment a browser-based agent tries to reach your server, the browser sends an OPTIONS preflight and your server must return CORS headers on /mcp and on /.well-known/*. Miss them and the browser blocks the request before your code ever sees it: no useful error in your logs, just a dead client. ngrok forwards faithfully, so this bites in prod, not in your terminal tests, because curl never enforces CORS. FastMCP lets you pass CORS middleware when you build the HTTP app; allow your agent's origin, the POST and GET methods, and the Mcp-Session-Id header, and expose Mcp-Session-Id so browser code can read it from the response.
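A minimal sketch of that CORS setup follows. It assumes a FastMCP version whose http_app() accepts a middleware list of Starlette Middleware objects; verify that against the version you have installed. The origin https://agent.example.com is a placeholder, and build_app is a name invented for this example.

```python
# Placeholder origin: replace with the origin your browser-based agent runs on.
CORS_OPTIONS = {
    "allow_origins": ["https://agent.example.com"],
    "allow_methods": ["GET", "POST"],
    "allow_headers": ["Content-Type", "Accept", "Mcp-Session-Id"],
    # Without this, browser code cannot read the session id off the response.
    "expose_headers": ["Mcp-Session-Id"],
}


def build_app():
    """Return an ASGI app for the Lab 01 server with CORS applied.

    Imports live inside the function so the options above can be inspected
    without fastmcp or starlette installed.
    """
    from fastmcp import FastMCP
    from starlette.middleware import Middleware
    from starlette.middleware.cors import CORSMiddleware

    mcp = FastMCP("hello-toolsmith")

    @mcp.tool
    def add(a: int, b: int) -> int:
        """Add two integers and return the sum."""
        return a + b

    # Assumption: http_app() takes a `middleware` keyword in your FastMCP version.
    return mcp.http_app(middleware=[Middleware(CORSMiddleware, **CORS_OPTIONS)])
```

Because build_app() returns an ordinary ASGI app, it can be served with uvicorn.run(build_app(), host="127.0.0.1", port=8000) in place of mcp.run().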
One MCP session per tenant, isolated. Cloudflare's Agents Week recipe runs each tenant's session in its own Worker with an isolated SQLite-backed Durable Object, so one noisy tenant cannot see or starve another. You do not need that tonight. You need to know that stdio in prod is the number-one architecture mistake, and you have already avoided it.
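To make the isolation idea concrete, here is a toy sketch in plain Python. It is not Cloudflare's implementation: a separate in-memory SQLite database per tenant stands in for a per-tenant Durable Object, and the class, function, and table names are invented for illustration.

```python
import sqlite3


class TenantStores:
    """Hands each tenant its own private SQLite database."""

    def __init__(self) -> None:
        self._dbs: dict[str, sqlite3.Connection] = {}

    def for_tenant(self, tenant_id: str) -> sqlite3.Connection:
        # Every ":memory:" connection is a distinct database, so tenants
        # never share tables or rows.
        if tenant_id not in self._dbs:
            db = sqlite3.connect(":memory:")
            db.execute("CREATE TABLE notes (body TEXT)")
            self._dbs[tenant_id] = db
        return self._dbs[tenant_id]


def add_note(stores: TenantStores, tenant_id: str, body: str) -> None:
    """Write a note into the calling tenant's database only."""
    db = stores.for_tenant(tenant_id)
    db.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    db.commit()


def list_notes(stores: TenantStores, tenant_id: str) -> list[str]:
    """Read back only the calling tenant's notes."""
    rows = stores.for_tenant(tenant_id).execute(
        "SELECT body FROM notes ORDER BY rowid"
    )
    return [row[0] for row in rows]
```

The point of the shape is that a tool handler never queries a shared table with a tenant filter. It can only reach the store it was handed, so a missing WHERE clause cannot leak another tenant's data.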