2025, Nov 19 05:00

Stop Azure OpenAI 404s in LangChain: AzureChatOpenAI is required for gpt‑4o and other chat models

Getting 404 Not Found in LangChain with Azure OpenAI calling gpt‑4o? Use AzureChatOpenAI for modern chat models; AzureOpenAI is only for text completions.

Using LangChain with Azure OpenAI can be deceptively easy—until a minor mismatch between client and model family turns into a head‑scratcher. A common case: the same configuration works with AzureChatOpenAI but fails with AzureOpenAI, even when you only switch the class. The error typically shows up as a 404 Not Found when calling a modern chat model like gpt-4o-2024-08-06.

Problem setup

The following snippet demonstrates the failing scenario: a call through AzureOpenAI using a gpt-4 class model. The environment variables store your endpoint, key, and API version.

import os
from dotenv import load_dotenv
from langchain_openai import AzureOpenAI  # text-completion client

# Load AZURE_ENDPOINT, TOOL_KEY, and API_VERSION from a local .env file
load_dotenv()

endpoint_url = os.getenv("AZURE_ENDPOINT")
access_token = os.getenv("TOOL_KEY")
api_ver = os.getenv("API_VERSION")

# gpt-4o is a chat model, but AzureOpenAI targets the legacy completions API
text_client = AzureOpenAI(
    azure_endpoint=endpoint_url,
    api_key=access_token,
    api_version=api_ver,
    model="gpt-4o-2024-08-06"
)

result = text_client.invoke("what is 2+3?")  # raises a 404 NotFoundError here
print(result.content)

This returns a 404 and the library surfaces it as:

openai.NotFoundError: Error code: 404 - {'detail': 'Not Found'}
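If you want the failure to surface as something more actionable than a bare 404, you can catch the SDK's NotFoundError around the call. This is a minimal sketch that reuses text_client from the snippet above; the hint text is purely illustrative.

import openai

try:
    result = text_client.invoke("what is 2+3?")
    print(result)
except openai.NotFoundError as err:
    # A 404 here usually means the client class and the model family do not
    # match, not that the endpoint or key is wrong.
    print(f"404 from Azure OpenAI: {err}")
    print("Hint: chat models such as gpt-4o require AzureChatOpenAI.")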

Why this happens

The root cause is a client–model mismatch. AzureChatOpenAI is designed for chat completion models, including newer options like gpt-4o-2024-08-06. AzureOpenAI targets text completion models and does not support chat completion models. That is why the exact same model name works with AzureChatOpenAI but fails with AzureOpenAI.

The LangChain integration docs for Azure OpenAI call this out explicitly: AzureChatOpenAI supports chat completion models, including newer ones like gpt-4o-2024-08-06, while AzureOpenAI supports text completion models but not chat completion models such as the gpt-4 family.
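The two classes hit different REST routes on the same Azure resource, which is why the error is a 404 rather than an authentication failure. As a rough illustration using the underlying openai SDK directly (same environment variables as above, and assuming the deployment name matches the model name), a gpt-4o deployment answers on the chat completions route but not on the legacy completions route:

import os
from openai import AzureOpenAI as AzureOpenAIClient  # raw SDK client, not the LangChain class

client = AzureOpenAIClient(
    azure_endpoint=os.getenv("AZURE_ENDPOINT"),
    api_key=os.getenv("TOOL_KEY"),
    api_version=os.getenv("API_VERSION"),
)

# Served: .../openai/deployments/<deployment>/chat/completions
chat = client.chat.completions.create(
    model="gpt-4o-2024-08-06",
    messages=[{"role": "user", "content": "what is 2+3?"}],
)
print(chat.choices[0].message.content)

# Not served by a gpt-4o deployment: .../openai/deployments/<deployment>/completions
# This call is expected to raise openai.NotFoundError.
client.completions.create(model="gpt-4o-2024-08-06", prompt="what is 2+3?")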

Fix

Use the chat-aware client when calling gpt-4 class models. With AzureChatOpenAI, the same prompt completes successfully.

import os
from dotenv import load_dotenv
from langchain_openai import AzureChatOpenAI  # chat-completion client

# Load AZURE_ENDPOINT, TOOL_KEY, and API_VERSION from a local .env file
load_dotenv()

endpoint_url = os.getenv("AZURE_ENDPOINT")
access_token = os.getenv("TOOL_KEY")
api_ver = os.getenv("API_VERSION")

# Same endpoint, key, version, and model name; only the client class changes
chat_client = AzureChatOpenAI(
    azure_endpoint=endpoint_url,
    api_key=access_token,
    api_version=api_ver,
    model="gpt-4o-2024-08-06"
)

reply = chat_client.invoke("what is 2+3?")  # returns an AIMessage
print(reply.content)

The call succeeds and the model returns the expected answer, for example: 2 + 3 equals 5.
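Because AzureChatOpenAI is a chat model, it also accepts a list of messages rather than just a plain string. A small usage sketch building on chat_client above (the system prompt is just an example):

from langchain_core.messages import HumanMessage, SystemMessage

# The same invoke() call works with structured chat messages.
messages = [
    SystemMessage(content="You are a concise math assistant."),
    HumanMessage(content="what is 2+3?"),
]
reply = chat_client.invoke(messages)
print(reply.content)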

Why this matters

Choosing the correct LangChain integration saves time and avoids opaque HTTP errors. Modern gpt-4 family models are chat models, and they require the chat-oriented client. If you wire them to a text completion client, you get a Not Found response even though the endpoint and credentials are correct.
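One way to avoid the mismatch in a codebase that talks to several deployments is to centralize client construction. The helper below is a hypothetical sketch, not a LangChain API; the prefix list is an assumption about which deployment names in your resource are chat models, so adjust it to your own naming.

import os
from langchain_openai import AzureChatOpenAI, AzureOpenAI

# Assumed naming convention: these prefixes mark chat-model deployments.
CHAT_MODEL_PREFIXES = ("gpt-4o", "gpt-4", "gpt-35-turbo")

def make_client(model_name: str):
    """Return a chat client for chat models, a text client otherwise."""
    common = dict(
        azure_endpoint=os.getenv("AZURE_ENDPOINT"),
        api_key=os.getenv("TOOL_KEY"),
        api_version=os.getenv("API_VERSION"),
        model=model_name,
    )
    if model_name.startswith(CHAT_MODEL_PREFIXES):
        return AzureChatOpenAI(**common)
    return AzureOpenAI(**common)

client = make_client("gpt-4o-2024-08-06")  # returns AzureChatOpenAI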

Takeaways

Pair models with the right client: use AzureChatOpenAI for gpt-4 class chat models such as gpt-4o-2024-08-06, and reserve AzureOpenAI for text completion models. If you see a 404 from AzureOpenAI while calling a gpt‑4 model, switch to AzureChatOpenAI. When in doubt, check the LangChain integration docs to confirm which model families a client supports.