2025, Nov 10 13:00

Run a Secure FastMCP Server on Google Cloud Run: Resolve 403 Errors with MCP Client, ID Token, and /mcp/ streamable-http

Troubleshoot 403 on Google Cloud Run MCP servers: use FastMCP client with streamable-http, ID token (ADC), and the /mcp/ endpoint to keep IAM auth enabled.

Running an MCP server on Google Cloud Run looks straightforward until you try to call it programmatically and hit a 403. The container is healthy, the HTTPS URL is live, IAM is required, and you can mint an ID token. Yet the call fails. The missing piece is that MCP endpoints expect the MCP protocol over HTTP, not a bare POST. Below is a concise walkthrough of the error pattern and a working end-to-end setup that keeps Cloud Run authentication enabled.

Problem setup

The service is a FastMCP server packaged into a container and deployed to Cloud Run with authentication required. The server uses streamable-http transport and binds to 0.0.0.0 and Cloud Run’s PORT.

import asyncio
import logging
import os

from fastmcp import FastMCP

logging.basicConfig(format="[%(levelname)s]: %(message)s", level=logging.INFO)
logr = logging.getLogger(__name__)

svc = FastMCP("MCP Server on Cloud Run")


@svc.tool()
def sum_up(a: int, b: int) -> int:
    """Use this to add two numbers together.

    Args:
        a: The first number.
        b: The second number.

    Returns:
        The sum of the two numbers.
    """
    logr.info(f">>> Tool: 'sum_up' called with numbers '{a}' and '{b}'")
    return a + b


if __name__ == "__main__":
    # Cloud Run injects PORT as a string, so convert it before binding.
    port = int(os.getenv("PORT", "8080"))
    logr.info(f"MCP server started on port {port}")
    asyncio.run(
        svc.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            port=port,
        )
    )
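
Because the server binds to whatever Cloud Run places in PORT, it can help to validate that value once at startup. The following is a minimal sketch of such a check; the helper name `resolve_port` and the 8080 fallback are illustrative assumptions and not part of FastMCP.

```python
import os
from collections.abc import Mapping


def resolve_port(env: Mapping[str, str] = os.environ, default: int = 8080) -> int:
    """Return the port to bind to as an int (hypothetical helper)."""
    raw = env.get("PORT")
    if raw is None or raw.strip() == "":
        # PORT is unset or empty, e.g. when running locally.
        return default
    port = int(raw)  # raises ValueError on non-numeric input
    if not 0 < port < 65536:
        raise ValueError(f"PORT out of range: {port}")
    return port


print(resolve_port({"PORT": "9090"}))
print(resolve_port({}))
```

The result can be passed directly as the `port` argument of `run_async`.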

The client attempt fetches an ID token and posts to the service. The token prints fine, but the request returns 403.

import os

import google.auth.transport.requests
import google.oauth2.id_token
import requests


def trigger_call():
    os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = 'path\\to\\file.json'
    req = google.auth.transport.requests.Request()
    target = 'https://cloud_run_service_url'
    # Mint an ID token whose audience is the Cloud Run service URL.
    jwt_token = google.oauth2.id_token.fetch_id_token(req, target)
    print(jwt_token)
    # Bare POST with no MCP message body: this is the request that returns 403.
    resp = requests.post(
        target + '/mcp',
        headers={'Authorization': 'Bearer ' + jwt_token, 'Content-Type': 'application/json'}
    )
    print(resp.status_code)


if __name__ == "__main__":
    trigger_call()

What’s actually going wrong

The MCP server isn’t a generic REST endpoint. It expects the MCP protocol over HTTP. A raw POST without the negotiated MCP exchange won’t work. Using a purpose-built client solves two things at once: it forms the correct MCP messages and it supplies bearer authentication the way the server expects. Another practical detail is to rely on Application Default Credentials by exporting GOOGLE_APPLICATION_CREDENTIALS in your environment, rather than setting it in code. Finally, the MCP URL uses a trailing slash and, when the server is configured with streamable-http, the client should use a matching transport.
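
To make the difference from a bare POST concrete, the sketch below builds the first message an MCP client sends over streamable HTTP: a JSON-RPC `initialize` request, together with an Accept header that allows both JSON and SSE responses. The protocol version string, the client name, and the helper name `build_initialize_request` are assumptions chosen for illustration; in practice fastmcp.Client assembles all of this for you.

```python
import json


def build_initialize_request(request_id: int = 1) -> tuple[dict, dict]:
    """Return (headers, body) for an MCP initialize request (illustrative only)."""
    headers = {
        "Content-Type": "application/json",
        # Streamable HTTP clients accept either a JSON or an SSE response.
        "Accept": "application/json, text/event-stream",
    }
    body = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "initialize",
        "params": {
            "protocolVersion": "2025-03-26",  # assumed protocol revision
            "capabilities": {},
            "clientInfo": {"name": "example-client", "version": "0.0.1"},  # hypothetical
        },
    }
    return headers, body


headers, body = build_initialize_request()
print(json.dumps(body, indent=2))
```

The failing request shown earlier sent only Authorization and Content-Type headers with no body, so it carried none of this structure.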

Working solution

The following end-to-end setup maintains Cloud Run authentication and uses an MCP client to call a tool on the server.

First, set credentials and capture the service URL. Let ADC discover your credentials via the environment.

export GOOGLE_APPLICATION_CREDENTIALS=${PWD}/caller.json
export RUN_BASE_URL=$( \
  gcloud run services describe ${APP_NAME} \
  --region=${GCP_REGION} \
  --project=${GCP_PROJECT} \
  --format="value(status.url)" )
uv run call_plus.py 25 17

Expected output indicates a successful connection and the tool result.

Connected
[TextContent(type='text', text='42', annotations=None)]

Project definition uses FastMCP, google-auth, and requests.

[project]
name = "mcp-cr-guide"
version = "0.0.1"
description = "Stackoverflow: 79685701"
readme = "README.md"
requires-python = ">=3.13"
dependencies = [
    "fastmcp>=2.9.2",
    "google-auth>=2.40.3",
    "requests>=2.32.4"
]

Server code serves the MCP tool over streamable-http.

import asyncio
import os

from fastmcp import Context, FastMCP

bridge = FastMCP("MCP Server on Cloud Run")


@bridge.tool()
async def plus_async(a: int, b: int, cx: Context) -> int:
    # Context is injected by FastMCP; debug messages are sent back to the client.
    await cx.debug(f"[plus_async] {a}+{b}")
    total = a + b
    await cx.debug(f"result={total}")
    return total


if __name__ == "__main__":
    asyncio.run(
        bridge.run_async(
            transport="streamable-http",
            host="0.0.0.0",
            # Cloud Run sets PORT as a string; convert it to an int.
            port=int(os.getenv("PORT", "8080")),
        )
    )

Client code uses fastmcp.Client, passes the bearer token, and targets the /mcp/ endpoint. Note the trailing slash and transport.

import asyncio
import os
import sys

import google.auth.transport.requests
import google.oauth2.id_token
from fastmcp import Client

argv = sys.argv
if len(argv) != 3:
    sys.stderr.write(f"Usage: python {argv[0]} <a> <b>\n")
    sys.exit(1)

# The tool declares int parameters, so convert the command-line strings.
x = int(argv[1])
y = int(argv[2])

service_url = os.getenv("RUN_BASE_URL")
if not service_url:
    sys.exit("RUN_BASE_URL is not set")

# ADC (GOOGLE_APPLICATION_CREDENTIALS) supplies the credentials; the service
# URL is used as the ID token audience.
req = google.auth.transport.requests.Request()
id_tok = google.oauth2.id_token.fetch_id_token(req, service_url)

client_cfg = {
    "mcpServers": {
        "cloud-run": {
            "transport": "streamable-http",
            "url": f"{service_url}/mcp/",
            # A string passed as "auth" is sent as the bearer token, so no
            # separate Authorization header is needed.
            "auth": id_tok,
        }
    }
}
conn = Client(client_cfg)


async def invoke():
    async with conn:
        print("Connected")
        result = await conn.call_tool(
            name="plus_async",
            arguments={"a": x, "b": y},
        )
        print(result)


if __name__ == "__main__":
    asyncio.run(invoke())
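
Since the trailing slash on /mcp/ matters, one way to avoid a doubled or missing slash is to normalise the URL before handing it to the client. This is a small sketch under that assumption; `mcp_endpoint` is a hypothetical helper and the service URL is a placeholder.

```python
def mcp_endpoint(base_url: str, path: str = "mcp") -> str:
    """Join a service base URL and the MCP path, ending in exactly one slash."""
    base = base_url.rstrip("/")
    return f"{base}/{path.strip('/')}/"


# Placeholder URL for illustration; substitute the value of RUN_BASE_URL.
print(mcp_endpoint("https://example-service.a.run.app"))
print(mcp_endpoint("https://example-service.a.run.app/"))
```

Both calls print the same endpoint, so the result does not depend on whether the base URL already ends in a slash.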

Container image for deployment:

# FastMCP Application Dockerfile
FROM docker.io/python:3.13-slim
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl ca-certificates
ADD https://astral.sh/uv/install.sh /uv-installer.sh
RUN sh /uv-installer.sh && \
    rm /uv-installer.sh
ENV PATH="/root/.local/bin:${PATH}"
WORKDIR /app
COPY main.py main.py
COPY pyproject.toml pyproject.toml
COPY uv.lock uv.lock
RUN uv sync --locked
EXPOSE 8080
ENTRYPOINT ["uv", "run", "/app/main.py"]

One possible workflow to build, push, and deploy the service with IAM required:

BILLING_ID="..."
GCP_PROJECT="..."
APP_NAME="fastmcp"
GCP_REGION="..."

# Service account that will call the service
SA_NAME="caller"
SA_EMAIL=${SA_NAME}@${GCP_PROJECT}.iam.gserviceaccount.com

gcloud iam service-accounts create ${SA_NAME} \
  --project=${GCP_PROJECT}

# Key file that GOOGLE_APPLICATION_CREDENTIALS will point to
gcloud iam service-accounts keys create ${PWD}/${SA_NAME}.json \
  --iam-account=${SA_EMAIL} \
  --project=${GCP_PROJECT}

# Allow the service account to invoke Cloud Run services in the project
gcloud projects add-iam-policy-binding ${GCP_PROJECT} \
  --member=serviceAccount:${SA_EMAIL} \
  --role=roles/run.invoker

# Authenticate podman to Artifact Registry
gcloud auth print-access-token \
| podman login ${GCP_REGION}-docker.pkg.dev \
  --username=oauth2accesstoken \
  --password-stdin

# Build and push the image
REPO_NAME="cloud-run-source-deploy"
APP_VERSION="0.0.1"
IMG_URI=${GCP_REGION}-docker.pkg.dev/${GCP_PROJECT}/${REPO_NAME}/${APP_NAME}:${APP_VERSION}

podman build \
  --tag=${IMG_URI} \
  --file=${PWD}/Dockerfile \
  ${PWD}

podman push ${IMG_URI}

# Deploy with IAM authentication required
gcloud run deploy ${APP_NAME} \
  --image=${IMG_URI} \
  --region=${GCP_REGION} \
  --project=${GCP_PROJECT} \
  --no-allow-unauthenticated

Why this matters

Cloud Run’s IAM integrates cleanly with ID tokens, but authentication alone does not make a protocol work. MCP over HTTP has a specific handshake and message format. If you post arbitrary JSON, or no body at all, to /mcp, the server will not process the request. Using a client that speaks MCP ensures the transport, path, and headers are formed correctly and keeps the service secure without disabling authentication.

Takeaways and practical advice

Rely on Application Default Credentials by exporting GOOGLE_APPLICATION_CREDENTIALS rather than hard-wiring paths inside code. Match the client transport to the server’s transport; streamable-http on the server pairs with streamable-http in the client. Use the MCP client to negotiate the protocol instead of raw requests, and include the service URL as the ID token audience. Pay attention to endpoint details such as the trailing slash on /mcp/. With these in place, you keep IAM enforced on Cloud Run and gain a clean, repeatable way to call MCP tools.
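
When debugging, it can be useful to confirm that the token you minted really carries the service URL as its audience. The sketch below decodes the JWT payload without verifying the signature, which is suitable for local inspection only and never for authorization decisions. The helper names, the placeholder URL, and the hand-built unsigned token are all assumptions for illustration; a real token would come from `fetch_id_token`.

```python
import base64
import json


def jwt_claims(token: str) -> dict:
    """Decode the payload of a JWT WITHOUT verifying its signature."""
    payload_b64 = token.split(".")[1]
    # Restore the base64 padding that JWTs strip.
    padded = payload_b64 + "=" * (-len(payload_b64) % 4)
    return json.loads(base64.urlsafe_b64decode(padded))


def audience_matches(token: str, service_url: str) -> bool:
    """Check that the token's 'aud' claim equals the service URL."""
    return jwt_claims(token).get("aud") == service_url


def _b64(obj: dict) -> str:
    # Helper that builds a JWT segment for the demo token below.
    return base64.urlsafe_b64encode(json.dumps(obj).encode()).rstrip(b"=").decode()


service_url = "https://example-service.a.run.app"  # hypothetical service URL
fake_token = ".".join([_b64({"alg": "none"}), _b64({"aud": service_url}), "sig"])
print(audience_matches(fake_token, service_url))
```

A mismatch here, for example a token minted for a different URL, is worth ruling out before looking at the MCP layer.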

Conclusion

The 403 here wasn’t about “turning off” security; it was about speaking the right protocol to an authenticated service. Deploy the FastMCP server on Cloud Run with authentication enabled, fetch an ID token for the service URL, and use fastmcp.Client configured for streamable-http against the /mcp/ endpoint. That combination removes guesswork about request bodies and lets you call tools reliably while preserving security.

The article is based on a question from StackOverflow by Sachu and an answer by DazWilkin.