Combining Pydantic models with LangChain's with_structured_output cleanly — avoiding prompt collisions, validation errors, and runaway evaluation loops.
with_structured_output binds a Pydantic BaseModel to an LLM so the model returns a validated object instead of free text. It works well until your own system prompt starts fighting the schema instructions the method already sends on your behalf. Here is what actually goes wrong and how to keep it boring.
Make it explicit that the reply must be a single JSON object. Without this the model tends to wrap the payload in prose or markdown fences, which breaks parsing.
with_structured_output already sends the supplied BaseModel's schema to the model for you, typically as a tool definition or a JSON-schema response format rather than as visible prompt text. If you also restate every field constraint in your system prompt, the two rule sets can drift apart and clash, leading to validation errors or hallucinated keys.
Keep your prompt about the format. Let the method own the schema.
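As a rough sketch of that division of labour, the snippet below keeps the system prompt to format rules and hands the schema to with_structured_output. The Grade model, the SYSTEM_PROMPT wording, the build_grader helper, and the question/document arguments are all illustrative assumptions; llm is assumed to be any LangChain chat model that supports with_structured_output.

```python
from pydantic import BaseModel, Field


class Grade(BaseModel):
    """Hypothetical grader schema; field descriptions travel with the schema."""

    binary_score: str = Field(description="'yes' if the document is relevant, else 'no'")


# Format rules only: no field names, types, or allowed values are restated here.
SYSTEM_PROMPT = (
    "You are a grader. Respond with a single JSON object and nothing else: "
    "no commentary, no markdown fencing."
)


def build_grader(llm):
    """Return a grade(question, document) function backed by `llm`.

    `llm` is assumed to be a LangChain chat model exposing with_structured_output.
    """
    structured_llm = llm.with_structured_output(Grade)  # the method owns the schema

    def grade(question: str, document: str) -> Grade:
        messages = [
            ("system", SYSTEM_PROMPT),
            ("human", f"Question: {question}\n\nDocument: {document}"),
        ]
        return structured_llm.invoke(messages)

    return grade
```

Because the prompt never mentions binary_score, renaming or retyping the field later only requires a change to the model class.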
A common grader pattern polls until binary_score flips to "yes":
result = grader.invoke({...})
while result.binary_score.lower() != "yes":
    result = grader.invoke({...})  # keep polling
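This loop has no exit other than "yes". If the model keeps answering "no", or returns a variant such as "Yes." that the comparison does not recognise, the while never terminates. Malformed JSON or a schema violation makes Pydantic raise a validation error instead; if the surrounding node catches that error and silently retries, the workflow becomes an infinite loop as well.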
| Risk | Mitigation |
|---|---|
| JSON parse failure | Wrap the invoke call in try/except and break after N retries. |
| Unexpected fields | Set extra = "forbid" on the model so issues surface immediately. |
| Non-terminating loop | Add a max_attempts or timeout in the LangGraph node; return a fallback if exceeded. |
| Model drift ("Yes" vs "yes") | Normalise with .strip().lower() before comparison. |
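To see how a strict model surfaces these problems, here is a small sketch assuming Pydantic v2. StrictGrade and try_parse are hypothetical names, and the Literal annotation is one possible way to restrict the score to "yes" or "no", not something the pattern requires.

```python
from typing import Literal

from pydantic import BaseModel, ConfigDict, ValidationError


class StrictGrade(BaseModel):
    model_config = ConfigDict(extra="forbid")  # reject keys the schema does not declare
    binary_score: Literal["yes", "no"]  # reject any other value


def try_parse(payload: dict) -> str:
    """Return 'ok', or a short summary of which fields failed validation and why."""
    try:
        StrictGrade.model_validate(payload)
        return "ok"
    except ValidationError as exc:
        # Each error dict carries the offending location and an error type.
        return ", ".join(f"{err['loc'][0]}: {err['type']}" for err in exc.errors())


print(try_parse({"binary_score": "yes"}))
print(try_parse({"binary_score": "maybe"}))
print(try_parse({"binary_score": "no", "reason": "off-topic"}))
```

Note the trade-off: with a Literal field, "Yes" is rejected outright instead of being normalised, so this is a stricter alternative to comparing with .strip().lower() on a plain str field.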
A bounded version of the loop above:
from pydantic import BaseModel, ConfigDict

class Grade(BaseModel):
    model_config = ConfigDict(extra="forbid")  # Pydantic v2; use `class Config` on v1
    binary_score: str

max_attempts = 5
for attempt in range(max_attempts):
    try:
        result = grader.invoke({...})
    except Exception:  # JSON / validation failure
        continue
    if result.binary_score.strip().lower() == "yes":
        break
else:
    result = fallback_response()  # loop exhausted, don't hang
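The same idea can be wrapped in a helper that also enforces a time budget and logs failures instead of discarding them. This is only a sketch using the standard library: grade_with_budget, invoke_grader, fallback, and timeout_s are hypothetical names, and invoke_grader stands in for something like lambda: grader.invoke(...).

```python
import logging
import time

logger = logging.getLogger(__name__)


def grade_with_budget(invoke_grader, fallback, max_attempts=5, timeout_s=30.0):
    """Call invoke_grader() until it answers "yes", within an attempt and time budget.

    invoke_grader: zero-argument callable returning an object with .binary_score
    fallback: zero-argument callable used once the budget is exhausted
    """
    deadline = time.monotonic() + timeout_s
    for attempt in range(1, max_attempts + 1):
        if time.monotonic() >= deadline:
            logger.warning("grader timed out before attempt %d", attempt)
            break
        try:
            result = invoke_grader()
        except Exception as exc:  # JSON / validation failure: log it, then retry
            logger.warning("grader attempt %d failed: %r", attempt, exc)
            continue
        if result.binary_score.strip().lower() == "yes":
            return result
    return fallback()  # attempts or time exhausted, so don't hang
```

The deadline is only checked between attempts, so it bounds the loop but does not interrupt a call that is already in flight; a per-request timeout on the model client would be needed for that.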
A system prompt along these lines is enough:
You are a JSON-only assistant. Respond with a JSON object that matches the schema
exactly: no commentary, no extra keys, no markdown fencing.
That’s the whole job of your prompt: enforce the format. with_structured_output handles the schema specifics.
Keep the system prompt succinct and delegate schema enforcement to with_structured_output. You avoid prompt collisions, surface validation issues early instead of swallowing them, and never ship a loop that can run forever.