The 30-Second Version
Generative AI is software that predicts what comes next. Give it a question, it predicts the most likely helpful answer. Give it half a sentence, it predicts the rest. That’s genuinely all it is at the core - a very sophisticated autocomplete.
The “generative” part means it creates new content (text, images, code, audio) rather than retrieving stored answers.
How a Prompt Becomes a Response
flowchart LR
    A([Your Prompt]) --> B[Tokenizer<br/>Breaks text into tokens]
    B --> C[LLM Model<br/>Predicts next tokens]
    C --> D[Decoder<br/>Reassembles into text]
    D --> E([Response])
    style A fill:#dbeafe,stroke:#2563eb,color:#1d4ed8
    style E fill:#dcfce7,stroke:#16a34a,color:#15803d
    style C fill:#f3e8ff,stroke:#7c3aed,color:#7c3aed
What the Model Actually Does
The model doesn’t think. It doesn’t understand. It does one thing very well: given all the text before this point, what token is most likely to come next?
It repeats this once for every token it outputs, often hundreds or thousands of times in a single response, until it emits a special stop token or hits a length limit.
A token is roughly ¾ of a word in typical English text. “Understanding” might be 3 tokens: “Under”, “stand”, “ing”.
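To make subword tokens concrete, here is a minimal sketch of a greedy longest-match tokenizer. The vocabulary `TOY_VOCAB` and the function `toy_tokenize` are hypothetical names invented for this illustration. Real tokenizers learn much larger vocabularies from data and may split the same word differently.

```python
# Hypothetical toy vocabulary, invented for illustration only.
TOY_VOCAB = {"Under", "stand", "ing", "token", "s"}


def toy_tokenize(text: str, vocab: set[str] = TOY_VOCAB) -> list[str]:
    """Split text into the longest matching vocabulary pieces, left to right."""
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate starting at position i, then shrink it.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Nothing in the vocabulary matched: emit one character as a token.
            tokens.append(text[i])
            i += 1
    return tokens


print(toy_tokenize("Understanding"))  # ['Under', 'stand', 'ing']
```

The point is only that the model never sees whole words or sentences. It sees a sequence of pieces like these.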
The model has seen so much human text during training that predicting “what comes next” is functionally equivalent to sounding like a knowledgeable human on almost any topic. That’s the magic and the limitation.
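The one-token-at-a-time loop can be sketched with a toy lookup table standing in for the model. Everything here is an assumption made for illustration: the table `NEXT_TOKEN_PROBS`, its probabilities, and the `<start>` and `<stop>` markers. A real LLM computes these probabilities with a neural network that looks at all the preceding text, not just the last token, and it usually samples rather than always taking the top choice.

```python
# Hypothetical next-token probabilities, invented for illustration.
NEXT_TOKEN_PROBS = {
    "<start>": {"The": 0.9, "A": 0.1},
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "ran": 0.3},
    "sat": {"<stop>": 0.8, "down": 0.2},
}


def generate(max_tokens: int = 10) -> list[str]:
    """Pick the most likely next token repeatedly until a stop token or the limit."""
    tokens = []
    current = "<start>"
    for _ in range(max_tokens):
        candidates = NEXT_TOKEN_PROBS.get(current)
        if not candidates:
            break  # The toy table has no entry for this token.
        next_token = max(candidates, key=candidates.get)
        if next_token == "<stop>":
            break  # The "model" signals that the response is complete.
        tokens.append(next_token)
        current = next_token
    return tokens


print(" ".join(generate()))  # The cat sat
```

Note that nothing in the loop checks whether the output is true. It only checks what is likely to come next.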
The Three Main Types
Not all generative AI is the same. The field has three major branches you’ll encounter:
Language models (LLMs) - GPT-4, Claude, Gemini. Take text in, produce text out. This is where most business applications live.
Image models - DALL-E, Midjourney, Stable Diffusion. Take a text description, produce an image.
Multimodal models - Can handle text, images, audio, and code together. GPT-4o and Gemini Ultra are examples.
This tutorial series focuses on LLMs because they power most business AI applications.
Why It Hallucinates
The model doesn’t have a “facts database” it looks up. It predicts based on patterns. If it’s never seen reliable data about a niche topic, it will still generate a confident-sounding response - because that’s what it’s optimized to do.
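One way to see why the model always answers: its final step turns scores into a probability distribution, and a distribution always has a top choice, even when the scores are nearly tied. The sketch below uses the standard softmax function. The score lists and the helper `pick` are hypothetical and invented for illustration.

```python
import math


def softmax(logits: list[float]) -> list[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(logits)
    exps = [math.exp(x - highest) for x in logits]  # Subtract max for stability.
    total = sum(exps)
    return [e / total for e in exps]


def pick(logits: list[float]) -> tuple[int, float]:
    """Return the index of the most likely candidate and its probability."""
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return best, probs[best]


# Hypothetical scores for three candidate continuations.
well_supported = [8.0, 1.0, 0.5]      # Strong evidence for candidate 0.
barely_supported = [1.02, 1.0, 0.99]  # Almost no evidence either way.

for name, logits in [("well supported", well_supported),
                     ("barely supported", barely_supported)]:
    index, prob = pick(logits)
    print(f"{name}: picks candidate {index} with probability {prob:.2f}")
```

In both cases a candidate gets picked. In the second case the choice is close to a three-way coin toss, but the generated text reads just as fluently.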
This is not a bug to be fixed. It’s a consequence of how the system works. Your architecture needs to account for it.
What this means for your code: Every AI response needs validation. Don’t pipe AI output directly into your database, your UI, or your logic without checking it. Hallucinations are real and they keep happening - plan for them with evals, structured output, and fallback handling.
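As a minimal sketch of validation with a fallback, assume you asked the model to reply with a JSON object containing a `sentiment` label and a `confidence` number. That output shape, the allowed labels, and the names `parse_model_output` and `FALLBACK` are assumptions made for this example, not part of any particular vendor's API.

```python
import json

# Hypothetical contract for the model's output, invented for illustration.
ALLOWED_SENTIMENTS = {"positive", "negative", "neutral"}
FALLBACK = {"sentiment": "unknown", "confidence": 0.0}


def parse_model_output(raw: str) -> dict:
    """Return a validated result, or a copy of FALLBACK if the output is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK)  # The model replied with prose or broken JSON.
    if not isinstance(data, dict):
        return dict(FALLBACK)

    sentiment = data.get("sentiment")
    confidence = data.get("confidence")
    if not isinstance(sentiment, str) or sentiment not in ALLOWED_SENTIMENTS:
        return dict(FALLBACK)
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(confidence, bool) or not isinstance(confidence, (int, float)):
        return dict(FALLBACK)
    if not 0.0 <= confidence <= 1.0:
        return dict(FALLBACK)
    return {"sentiment": sentiment, "confidence": float(confidence)}


print(parse_model_output('{"sentiment": "positive", "confidence": 0.92}'))
print(parse_model_output("Sure! The sentiment is positive."))
```

Only the validated dictionary should reach your database or UI. What the fallback should be (a default value, a retry, a human review queue) depends on your application.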
What this means for testing: You cannot write a test that says assert response == "exact expected string" for most AI outputs. You need probabilistic testing: does the response contain what it should? Is it within acceptable range? Tutorial 8 (AI Testing Strategies) covers this in depth.
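A property-based check might look like the sketch below. It assumes a hypothetical prompt that asks the model to summarize a 30-day refund policy. The sample responses are hard-coded stand-ins for real model output, and the checks and the 75% threshold are invented for illustration.

```python
def passes_checks(response: str) -> bool:
    """Check properties of the response instead of comparing exact strings."""
    text = response.lower()
    mentions_refund = "refund" in text
    mentions_window = "30 days" in text
    reasonable_length = 20 <= len(response) <= 400
    return mentions_refund and mentions_window and reasonable_length


def pass_rate(responses: list[str]) -> float:
    """Fraction of responses that pass every check."""
    if not responses:
        return 0.0
    return sum(passes_checks(r) for r in responses) / len(responses)


# Hypothetical sampled outputs; a real test would collect these from the model.
samples = [
    "You can request a refund within 30 days of purchase.",
    "Refunds are available for 30 days after you buy the product.",
    "Our policy is great and customers love it.",
    "Returns are accepted; a full refund is issued within 30 days.",
]

rate = pass_rate(samples)
print(f"pass rate: {rate:.0%}")
assert rate >= 0.75, "pass rate dropped below the agreed threshold"
```

The test asserts on a pass rate across several samples rather than on any single response, so differently worded answers still pass.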
What this means for requirements: Any AI feature you specify needs acceptance criteria that account for variability. “The AI will always return the correct answer” is not a valid acceptance criterion. “The AI will return an answer that passes the following 5 quality checks” is.
What this means for your roadmap: AI features need a different definition of “done.” Plan time for evaluation, iteration, and ongoing monitoring. A feature that works 90% of the time today might work 70% of the time after a model update. Build that into your maintenance budget.
How a Real Application Uses This
A real AI app has layers beyond just the model:
Where the AI Fits in a Real Application
flowchart TB
    U([User]) --> UI[UI Layer<br/>Chat, form, button]
    UI --> API[Application API<br/>Your backend]
    API --> P[Prompt Builder<br/>Templates + context]
    P --> M[AI Model<br/>OpenAI / Anthropic / Google]
    M --> V[Validator<br/>Check output format]
    V --> DB[(Your Database)]
    V --> UI
    style M fill:#f3e8ff,stroke:#7c3aed,color:#7c3aed
    style P fill:#fef3c7,stroke:#d97706,color:#b45309
    style V fill:#dcfce7,stroke:#16a34a,color:#15803d
The model is just one component. The prompt builder, validator, and application logic around it often matter more than the model itself.
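The layering can be sketched in a few lines. In this sketch the model is just a function passed in as `call_model`, so a fake stands in for a real provider. The names `build_prompt`, `validate`, `answer`, and `fake_model`, and the YES/NO task itself, are all hypothetical and invented for illustration. No real vendor SDK is used.

```python
from typing import Callable, Optional


def build_prompt(question: str, context: str) -> str:
    """Prompt builder: combine a fixed template with per-request context."""
    return (
        "Answer using only the context below. Reply with YES or NO.\n"
        f"Context: {context}\n"
        f"Question: {question}"
    )


def validate(output: str) -> Optional[str]:
    """Validator: accept only the two formats the application can handle."""
    cleaned = output.strip().upper()
    return cleaned if cleaned in {"YES", "NO"} else None


def answer(question: str, context: str, call_model: Callable[[str], str]) -> str:
    """Application logic: prompt builder, then model, then validator."""
    raw = call_model(build_prompt(question, context))
    result = validate(raw)
    return result if result is not None else "UNSURE"


def fake_model(prompt: str) -> str:
    """Stand-in for a real model call; always returns the same text."""
    return " yes\n"


print(answer("Is shipping free?", "Shipping is free over $50.", fake_model))  # YES
```

Because the model is injected, you can swap providers or test the surrounding logic without changing `answer`.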
What’s Next
In the next tutorial, you’ll go deeper into how LLMs specifically work - tokens, context windows, temperature settings, and why model choice matters.
You don’t need to understand the math behind transformers to build with AI effectively. Focus on the mental model: inputs in, outputs out, validate everything.
Interview Practice
- Explain generative AI in one sentence without using hype words.
- What is the difference between discriminative AI and generative AI?
- Why is next-token prediction enough to produce useful text?
- Name two risks that come from models generating plausible content instead of verified facts.
- How would you explain GenAI value differently to a developer, QA engineer, BA, and PM?
- What belongs in application logic rather than relying on the model?