2025, Dec 29 01:00

Build a RAG assistant that asks for the product: switch from direct RetrieveAndGenerate to an AWS Bedrock Agent

Learn how to fix RAG product mismatches by using an AWS Bedrock Agent with InvokeAgent to clarify user intent, confirm product, and improve retrieval accuracy

Build a RAG assistant that actually clarifies user intent: when your catalog has similar products backed by near-identical documents, generic questions like “how do I run a calibration test” will return valid but product-mismatched answers. The key is to stop querying the knowledge base directly for every question and introduce an agent that drives the conversation and asks for the product when it’s missing.

Baseline implementation that causes the issue

The following Lambda handler calls Bedrock’s RetrieveAndGenerate API straight against a knowledge base. It returns text and citations just fine, but there is no conversational layer to request clarification about which product the user means.

import json
import os

import boto3

brt = boto3.client("bedrock-agent-runtime", region_name="us-west-1")


def handler(evt, ctx):
    fm_id = "amazon.titan-text-premier-v1:0"
    user_q = evt["queryStringParameters"]["question"]
    sid = evt["queryStringParameters"].get("session_id")

    kb_env_id = os.environ["KNOWLEDGE_BASE_ID"]
    aws_reg = "us-west-1"
    fm_arn = f"arn:aws:bedrock:{aws_reg}::foundation-model/{fm_id}"

    rag_out = kb_search(user_q, kb_env_id, fm_arn, sid)

    out_text = rag_out["output"]["text"].strip()
    sid = rag_out.get("sessionId", "")
    refs = rag_out["citations"]

    hdrs = {
        "Access-Control-Allow-Origin": "*",
        "Access-Control-Allow-Credentials": "true"  # header values must be strings
    }

    return {
        "statusCode": 200,
        "headers": hdrs,
        "body": json.dumps({
            "question": user_q,
            "answer": out_text,
            "citations": refs,
            "sessionId": sid
        }, ensure_ascii=False)
    }


def kb_search(q, kb_id, model_arn, sid=None):
    # boto3 rejects sessionId=None with a validation error,
    # so only include the parameter when a session id exists
    kwargs = {
        "input": {"text": q},
        "retrieveAndGenerateConfiguration": {
            "type": "KNOWLEDGE_BASE",
            "knowledgeBaseConfiguration": {
                "knowledgeBaseId": kb_id,
                "modelArn": model_arn
            }
        }
    }
    if sid:
        kwargs["sessionId"] = sid
    return brt.retrieve_and_generate(**kwargs)

Why this happens

Direct RetrieveAndGenerate targets retrieval and synthesis. It does not manage dialogue, slot-filling, or “ask-back” behavior. With four products sharing similarly named documents, an underspecified question will retrieve plausible passages from any product, and the model will answer without confirming which product is intended. That is expected for a raw RAG call.

There is also a correctness pitfall to watch for in the sample: using a curly opening quote in a key like “statusCode" will prevent the function from running. Make sure you use straight ASCII quotes throughout the code.
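To see the failure mode concretely, Python's parser rejects a curly quote outright. This quick self-contained check (illustrative only, not part of the handler) confirms that a snippet containing U+201C never compiles:

```python
# A curly opening quote (U+201C) where a straight quote belongs is a
# SyntaxError in Python; compiling such a snippet fails immediately.
bad_snippet = '{\u201cstatusCode": 200}'  # renders as {“statusCode": 200}

try:
    compile(bad_snippet, "<sample>", "eval")
    parsed = True
except SyntaxError:
    parsed = False

# parsed is False: the curly quote makes the snippet unparseable
```

Because the error surfaces at parse time, the whole Lambda fails on every invocation, not just on the code path containing the bad quote.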

Make the assistant ask for the product

Introduce a Bedrock Agent backed by the same knowledge base and give it an instruction that enforces a confirmation step. A concise system directive works well here:

You are a product information assistant. Assist the customer with questions they have about our products. Always seek confirmation from the customer regarding which product they're enquiring about.

With this setup, the agent handles the conversational turn-taking and will prompt the user to choose A, B, C, or D when the input doesn't specify a product. This is fundamentally different from a bare knowledge base query and is the right tool for disambiguation.
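If you prefer to script the agent's creation rather than use the console, the instruction can be supplied through the `bedrock-agent` control-plane client's `create_agent` operation. A minimal sketch of the request; the agent name and IAM role ARN below are placeholders, not values from this article:

```python
INSTRUCTION = (
    "You are a product information assistant. Assist the customer with "
    "questions they have about our products. Always seek confirmation "
    "from the customer regarding which product they're enquiring about."
)

# Request shape for create_agent; agentName and agentResourceRoleArn
# are hypothetical placeholders you would replace with your own values
agent_cfg = {
    "agentName": "product-info-assistant",
    "foundationModel": "amazon.titan-text-premier-v1:0",
    "instruction": INSTRUCTION,
    "agentResourceRoleArn": "arn:aws:iam::123456789012:role/BedrockAgentRole",
}

# To actually create the agent (requires AWS credentials):
# client = boto3.client("bedrock-agent", region_name="us-west-1")
# client.create_agent(**agent_cfg)
# ...then associate the knowledge base and prepare an alias for invocation.
```

Keeping the instruction in code also makes it easy to version-control the confirmation behavior alongside the Lambda.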

Switching the call from RetrieveAndGenerate to InvokeAgent

In the Lambda, replace the direct knowledge base query with a call to the agent via the InvokeAgent method. The knowledge base remains attached to the agent; the difference is that the agent mediates the conversation, retrieves from the knowledge base as needed, and applies the instruction to always confirm the product.

import json
import os
import uuid

import boto3

agent_rt = boto3.client("bedrock-agent-runtime", region_name="us-west-1")


def handler(evt, ctx):
    params = evt["queryStringParameters"]
    user_q = params["question"]
    # InvokeAgent requires a session id; create one for a new conversation
    sid = params.get("session_id") or str(uuid.uuid4())

    # Call the Bedrock Agent instead of direct RAG; AGENT_ID and
    # AGENT_ALIAS_ID are environment variables you set on the function
    agent_resp = agent_rt.invoke_agent(
        agentId=os.environ["AGENT_ID"],
        agentAliasId=os.environ["AGENT_ALIAS_ID"],
        sessionId=sid,
        inputText=user_q
    )

    # The response streams back as an event stream; concatenate the chunks
    # (the agent asks the user to pick A/B/C/D when the product is missing)
    answer = "".join(ev["chunk"]["bytes"].decode("utf-8")
                     for ev in agent_resp["completion"] if "chunk" in ev)

    return {"statusCode": 200,
            "headers": {"Access-Control-Allow-Origin": "*"},
            "body": json.dumps({"answer": answer, "sessionId": sid},
                               ensure_ascii=False)}

Author the agent in the console following the Create Bedrock Agents guide, attach your knowledge base, and place the instruction above as its core behavior. Your application continues to pass the session identifier so the agent can keep context across turns.

Improving retrieval precision

When you still need to tighten what gets pulled from the knowledge base, metadata filtering can help improve retrieval accuracy. If your documents are tagged by product, filter accordingly to keep the context product-specific once the user has confirmed their choice.
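For example, if each document carries a `product` metadata attribute (an assumption about your tagging scheme, not something the article prescribes), `RetrieveAndGenerate` accepts a vector-search filter inside its configuration. A sketch of the filtered request body:

```python
def kb_filter_config(kb_id, model_arn, product):
    """Build a retrieveAndGenerateConfiguration that restricts retrieval
    to documents whose 'product' metadata equals the confirmed choice.
    The metadata key name 'product' is an assumption about your tagging."""
    return {
        "type": "KNOWLEDGE_BASE",
        "knowledgeBaseConfiguration": {
            "knowledgeBaseId": kb_id,
            "modelArn": model_arn,
            "retrievalConfiguration": {
                "vectorSearchConfiguration": {
                    "filter": {"equals": {"key": "product", "value": product}}
                }
            }
        }
    }

# Once the agent has confirmed the product, pass the confirmed value through
cfg = kb_filter_config("KB123", "arn:aws:bedrock:...:foundation-model/x", "A")
```

This keeps passages from the other three products out of the context window entirely, so even a strong lexical match on a sibling document cannot leak into the answer.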

Why this matters

Users rarely provide perfect context. Offloading clarification, confirmation, and other dialogue management to a Bedrock Agent prevents silent mismatches, reduces frustration, and avoids brittle application-side heuristics. You keep the same knowledge base but add a layer that enforces the crucial “which product?” checkpoint before answering.

Takeaways

If you need the assistant to ask for missing details like product selection, don't query the knowledge base directly. Configure a Bedrock Agent with a clear instruction to always confirm the product and call it via InvokeAgent. Keep your Lambda free of curly quotes so it actually runs, and consider metadata filtering to narrow retrieval to the selected product. This combination preserves the benefits of RAG while adding the conversational scaffolding needed to deliver the correct answer for A, B, C, or D.