The Problem with Unstructured Output
Imagine asking a colleague for a project status update. They might say:
“Yeah so we’re about 70% done, I think. The frontend is basically finished, but the backend API is still being worked on. We might be done by Friday, possibly Thursday if things go well.”
That’s useful for a conversation. It’s useless for a dashboard.
If your application needs to update a progress bar, log a completion percentage, or send a status email - you need the data in a form your code can actually read. You need the equivalent of a form, not a monologue.
This is the fundamental tension in AI applications: LLMs are optimized to produce helpful human-readable text, but your code needs machine-readable data.
Unstructured vs Structured: The Pipeline Comparison
Unstructured vs Structured AI Output Pipelines
flowchart LR
subgraph Bad ["Unstructured - Brittle"]
B1[Prompt] --> B2[Free text response]
B2 --> B3[Regex parsing]
B3 --> B4[Brittle code]
end
subgraph Good ["Structured - Reliable"]
G1[Prompt + Schema] --> G2[JSON response]
G2 --> G3[json.loads]
G3 --> G4[Reliable code]
end
style Bad fill:#fef2f2,stroke:#ef4444
style Good fill:#f0fdf4,stroke:#22c55e
style B4 fill:#fecaca,stroke:#ef4444,color:#991b1b
style G4 fill:#bbf7d0,stroke:#16a34a,color:#15803d
The unstructured path requires fragile regex patterns like r"(\d+)%" that break, or silently match nothing, whenever the model slightly changes its phrasing. The structured path uses json.loads() - a standard-library function that either returns parsed data or raises a clear exception (json.JSONDecodeError).
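To make the contrast concrete, here is a minimal sketch based on the status update from the opening example. The rephrased sentence, the `percent_complete` key, and the helper names are made up for illustration; they are not real model output.

```python
import json
import re

PERCENT_PATTERN = re.compile(r"(\d+)%")


def progress_from_text(text):
    """Extract a completion percentage with a regex; returns None on no match."""
    match = PERCENT_PATTERN.search(text)
    return int(match.group(1)) if match else None


def progress_from_json(payload):
    """Parse a JSON payload; raises json.JSONDecodeError on malformed input."""
    return json.loads(payload)["percent_complete"]


original = "Yeah so we're about 70% done, I think."
rephrased = "We're roughly seventy percent of the way there."  # hypothetical rewording

print(progress_from_text(original))    # 70
print(progress_from_text(rephrased))   # None: the regex misses without any error
print(progress_from_json('{"percent_complete": 70}'))  # 70
```

The regex failure is the dangerous kind: nothing raises, the value is simply missing. The JSON path fails loudly when the payload is malformed.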
What “Free Text” Looks Like in Practice
Here’s the same question asked to the same model, three times, with no format constraint:
Response 1: “The sentiment is positive.”
Response 2: “I would classify this as a positive review.”
Response 3: “Positive - the customer seems satisfied with their purchase.”
Three different formats for the same answer. Your regex has to handle all three. It won’t. At least one will break in production.
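As a rough illustration, the sketch below runs a pattern written after seeing only the first response against all three. The pattern and the helper name are assumptions chosen for the example.

```python
import re

responses = [
    "The sentiment is positive.",
    "I would classify this as a positive review.",
    "Positive - the customer seems satisfied with their purchase.",
]

# A pattern written after seeing only the first response.
naive_pattern = re.compile(r"^The sentiment is (\w+)\.$")


def extract_sentiment(text):
    """Return the captured sentiment word, or None if the phrasing differs."""
    match = naive_pattern.match(text)
    return match.group(1) if match else None


results = [extract_sentiment(r) for r in responses]
print(results)  # ['positive', None, None]
```

You can keep widening the pattern, but each new phrasing the model invents becomes another production bug.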
JSON Mode: The First Solution
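OpenAI’s Chat Completions API (and several other providers that offer a comparable JSON mode) supports a response_format parameter that constrains the model to output syntactically valid JSON. With OpenAI, the prompt itself must also mention JSON, which the system message here does: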
import json
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "You are a sentiment analyzer. Always respond with valid JSON.",
        },
        {
            "role": "user",
            "content": 'Analyze: "The product arrived on time but packaging was damaged." Return JSON with fields: sentiment (positive/neutral/negative), confidence (0.0-1.0), reason (string).',
        },
    ],
)

# The message content is a JSON string; parse it into a dict.
result = json.loads(response.choices[0].message.content)
print(result["sentiment"])   # e.g. "neutral"
print(result["confidence"])  # e.g. 0.72
JSON mode guarantees valid JSON syntax, but not the fields you asked for. The model might return {"result": "neutral"} when you asked for {"sentiment": "neutral"}. You still need to validate the keys and types. That’s what Tutorial 6 (Pydantic schemas) is for.
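The failure mode is easy to reproduce without calling any API. In this small sketch the `raw` string is a hypothetical model reply, not captured output:

```python
import json

# Hypothetical model reply: valid JSON, wrong key name.
raw = '{"result": "neutral"}'

parsed = json.loads(raw)        # succeeds, because the syntax is valid
print("sentiment" in parsed)    # False

try:
    sentiment = parsed["sentiment"]
except KeyError as exc:
    sentiment = None
    print(f"Missing expected key: {exc}")
```

Parsing succeeded, so nothing upstream signalled a problem. The error only appears at the point where your code reaches for the field.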
Structured Output with JSON Schema: The Better Solution
The newer and more reliable approach is json_schema mode, where you provide a formal schema that the model must conform to:
response = client.chat.completions.create(
    model="gpt-4o",
    response_format={
        "type": "json_schema",
        "json_schema": {
            "name": "sentiment_analysis",
            "strict": True,
            "schema": {
                "type": "object",
                "properties": {
                    "sentiment": {
                        "type": "string",
                        "enum": ["positive", "neutral", "negative"]
                    },
                    "confidence": {
                        "type": "number"
                    },
                    "reason": {
                        "type": "string"
                    }
                },
                "required": ["sentiment", "confidence", "reason"],
                "additionalProperties": False
            }
        }
    },
    messages=[...]
)
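With strict: True, the model’s output is constrained to the schema: exactly these fields, with exactly these types, and sentiment limited to the three enum values. No extra fields, no missing fields. The guarantee covers responses that complete normally; a refusal or a reply cut off by a token limit can still leave you without usable JSON, so check for those before parsing.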
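Schema enforcement applies to a completed response, so it is worth checking how the response ended before parsing it. The sketch below assumes the `finish_reason` and `message.refusal` fields exposed on chat completion objects by the OpenAI Python SDK. The `extract_structured` helper and the stand-in `fake` object are hypothetical, used here so the example runs without an API call.

```python
import json
from types import SimpleNamespace


def extract_structured(response):
    """Return the parsed JSON body, or raise if the reply is unusable."""
    choice = response.choices[0]
    # A refusal arrives as text in message.refusal instead of schema-shaped content.
    if getattr(choice.message, "refusal", None):
        raise RuntimeError(f"Model refused: {choice.message.refusal}")
    # finish_reason == "length" means the output hit the token limit mid-reply.
    if choice.finish_reason == "length":
        raise RuntimeError("Response was truncated before the JSON was complete")
    return json.loads(choice.message.content)


# Stand-in object shaped like an SDK response, for illustration only.
fake = SimpleNamespace(choices=[SimpleNamespace(
    finish_reason="stop",
    message=SimpleNamespace(
        refusal=None,
        content='{"sentiment": "neutral", "confidence": 0.72, "reason": "Mixed feedback"}',
    ),
)])

print(extract_structured(fake)["sentiment"])  # neutral
```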
Structured Input: The Other Half
“Structured output” is well understood. “Structured input” is less discussed but equally important.
When you pass data into an AI call, how you structure it matters. Compare:
Unstructured input:
Here is some customer data: John Smith, 45, premium plan, joined 2021, 3 support tickets last month, last login was 2 weeks ago. Tell me if he's a churn risk.
Structured input:
import json

customer_data = {
    "name": "John Smith",
    "age": 45,
    "plan": "premium",
    "join_year": 2021,
    "support_tickets_last_30_days": 3,
    "days_since_last_login": 14
}

# Serialize the record so the model sees the same layout on every call.
prompt = f"""Analyze churn risk for the following customer:
{json.dumps(customer_data, indent=2)}
Return JSON with: churn_risk (low/medium/high), primary_risk_factor (string), recommended_action (string)."""
Structured input is easier to template, easier to test, and easier to audit. It also tends to produce more consistent outputs, because the model sees the data in the same format every time.
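One way to get the templating and testing benefits is to move prompt construction into a small function. This is a sketch, not a prescribed pattern: `build_churn_prompt` and `OUTPUT_SPEC` are hypothetical names, and `sort_keys=True` is an added choice to keep field order stable regardless of how the dict was built.

```python
import json

OUTPUT_SPEC = (
    "Return JSON with: churn_risk (low/medium/high), "
    "primary_risk_factor (string), recommended_action (string)."
)


def build_churn_prompt(customer):
    """Render a customer record into a prompt with a stable layout."""
    # sort_keys makes the rendered text independent of dict insertion order.
    data = json.dumps(customer, indent=2, sort_keys=True)
    return f"Analyze churn risk for the following customer:\n{data}\n{OUTPUT_SPEC}"


a = build_churn_prompt({"name": "John Smith", "plan": "premium"})
b = build_churn_prompt({"plan": "premium", "name": "John Smith"})
print(a == b)  # True
```

Because the prompt is now the return value of a pure function, you can unit test it and log it for audits like any other piece of code.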
A Complete Working Example
Structured Input → Structured Output
import os
import json
from openai import OpenAI

client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

# Structured input
user_profile = {
    "name": "Alice Chen",
    "account_tier": "basic",
    "days_since_last_login": 21,
    "failed_payments": 2,
    "support_tickets_open": 1,
    "feature_usage_score": 0.3
}

response = client.chat.completions.create(
    model="gpt-4o-mini",
    response_format={"type": "json_object"},
    messages=[
        {
            "role": "system",
            "content": "You are a customer health analyst. Respond only with valid JSON."
        },
        {
            "role": "user",
            "content": f"""Analyze this customer profile for churn risk:
{json.dumps(user_profile, indent=2)}
Return JSON with exactly these fields:
- churn_risk: one of "low", "medium", "high"
- confidence: number between 0 and 1
- top_risk_factor: the single most important risk indicator
- recommended_action: one concrete next step for the account team"""
        }
    ],
    max_tokens=300
)

result = json.loads(response.choices[0].message.content)
print(f"Churn Risk: {result['churn_risk']}")
print(f"Confidence: {result['confidence']:.0%}")
print(f"Key Factor: {result['top_risk_factor']}")
print(f"Action: {result['recommended_action']}")
Validating the Response
json.loads() tells you the response is valid JSON. It does not tell you the response has the fields your code expects.
Always validate the parsed result before using it:
result = json.loads(response.choices[0].message.content)

# Presence validation: every field the application reads must exist
required_fields = ["churn_risk", "confidence", "top_risk_factor", "recommended_action"]
for field in required_fields:
    if field not in result:
        raise ValueError(f"AI response missing required field: {field}")

# Value validation: churn_risk must be one of the allowed levels
valid_risk_levels = {"low", "medium", "high"}
if result["churn_risk"] not in valid_risk_levels:
    raise ValueError(f"Invalid churn_risk value: {result['churn_risk']}")
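The same checks can be wrapped in a reusable function and extended to cover the type and range of confidence, which the prompt asks for but nothing enforces. This is a stdlib-only sketch; the function name `validate_churn_result` is an assumption for illustration.

```python
def validate_churn_result(result):
    """Raise ValueError if the parsed response is unusable; return it otherwise."""
    required = ["churn_risk", "confidence", "top_risk_factor", "recommended_action"]
    missing = [field for field in required if field not in result]
    if missing:
        raise ValueError(f"AI response missing required fields: {missing}")

    risk = result["churn_risk"]
    if not isinstance(risk, str) or risk not in {"low", "medium", "high"}:
        raise ValueError(f"Invalid churn_risk value: {risk!r}")

    confidence = result["confidence"]
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        raise ValueError(f"confidence must be a number, got {type(confidence).__name__}")
    if not 0 <= confidence <= 1:
        raise ValueError(f"confidence out of range: {confidence}")
    return result


ok = validate_churn_result({
    "churn_risk": "high",
    "confidence": 0.8,
    "top_risk_factor": "failed payments",
    "recommended_action": "call the customer",
})
print(ok["churn_risk"])  # high
```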
Tutorial 6 shows how to replace this manual validation with Pydantic models that do it automatically.
What this means for your code: Use Pydantic models as your JSON schema source of truth. Define the expected shape of every AI response as a Pydantic model, then validate all AI output through it before using the data anywhere in your application. This gives you type safety, automatic validation errors, and self-documenting contracts between your AI calls and your business logic.
What this means for testing: Structured output gives you deterministic assertions. Instead of checking if a response “sounds right,” you can assert response["churn_risk"] in ["low", "medium", "high"]. Schema violations become test failures. Write assertions against the schema - field presence, type correctness, enum values - not against the human-readable content inside those fields.
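A sketch of what such schema-level tests might look like, written as plain test functions that a runner such as pytest could collect. `CANNED_REPLY` stands in for a recorded model response and is hypothetical data. Note that none of the assertions inspect the wording of the free-text fields.

```python
import json

REQUIRED_FIELDS = {"churn_risk", "confidence", "top_risk_factor", "recommended_action"}
ALLOWED_RISK = {"low", "medium", "high"}

# Hypothetical recorded reply; in a real suite this might come from a fixture.
CANNED_REPLY = json.dumps({
    "churn_risk": "high",
    "confidence": 0.81,
    "top_risk_factor": "Two failed payments",
    "recommended_action": "Contact the customer about billing",
})


def test_required_fields_present():
    result = json.loads(CANNED_REPLY)
    assert REQUIRED_FIELDS <= result.keys()


def test_churn_risk_is_allowed_value():
    result = json.loads(CANNED_REPLY)
    assert result["churn_risk"] in ALLOWED_RISK


def test_field_types():
    result = json.loads(CANNED_REPLY)
    assert isinstance(result["confidence"], (int, float))
    assert isinstance(result["top_risk_factor"], str)
    assert isinstance(result["recommended_action"], str)


# Allow running the checks directly, without a test runner.
for check in (test_required_fields_present, test_churn_risk_is_allowed_value, test_field_types):
    check()
print("all schema checks passed")
```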
What this means for data pipelines: Structured AI output means AI analysis can feed directly into your existing databases, dashboards, and workflows - without a human parsing step. A churn risk score from an AI model can populate the same CRM field as a score from a traditional ML model. This is what makes AI features operationally viable at scale, not just impressive in demos.
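As a small illustration of that point, the sketch below writes a validated AI result and a score from a traditional model into the same table. It uses an in-memory SQLite database; the table name, column names, and customer IDs are all invented for the example.

```python
import sqlite3


def store_churn_score(conn, customer_id, result, source):
    """Insert one churn score row; `source` records which system produced it."""
    conn.execute(
        "INSERT INTO churn_scores (customer_id, churn_risk, confidence, source) "
        "VALUES (?, ?, ?, ?)",
        (customer_id, result["churn_risk"], result["confidence"], source),
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE churn_scores "
    "(customer_id TEXT, churn_risk TEXT, confidence REAL, source TEXT)"
)

# Both producers write the same shape, so downstream queries need no special cases.
store_churn_score(conn, "cust-001", {"churn_risk": "high", "confidence": 0.81}, source="llm")
store_churn_score(conn, "cust-002", {"churn_risk": "low", "confidence": 0.93}, source="ml_model")

rows = conn.execute(
    "SELECT customer_id, churn_risk, source FROM churn_scores ORDER BY customer_id"
).fetchall()
print(rows)  # [('cust-001', 'high', 'llm'), ('cust-002', 'low', 'ml_model')]
```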
What’s Next
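JSON mode gets you valid JSON. JSON schema mode with strict: True gets you the right shape: the expected fields, their types, and valid enum values. But neither guarantees semantic correctness, and business rules such as a confidence between 0 and 1 still need checking in your own code. That’s where Pydantic schemas come in - and that’s the subject of the next tutorial.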
Any AI output that your code will read - not just display - must be structured. If a human is the consumer, free text is fine. If code is the consumer, use JSON mode at minimum, JSON schema mode for production.
Interview Practice
- Why is free-text output risky for application code?
- What is the difference between structured input and structured output?
- How does JSON mode or schema-constrained output improve reliability?
- What should your app do when model output fails validation?
- Why is deterministic validation still needed after a model returns JSON?
- Give an example of a field that should be an enum rather than free text.