GenAI Foundations / Beginner Track Module 1 / 9
GenAI Foundations Beginner ⏱ 15 min
DEV · QA · BA · PM

What is Generative AI and How It Works

Understand what generative AI actually does - not the hype, but the mechanism. How text, images, and code come out of a model and why it matters for your role.

How to Use This Lesson

  • Start with the user problem, then map the pattern to architecture and failure modes.
  • If a code or design example is included, change one assumption and reason through the impact.
  • Use role callouts, checklists, and Q&A sections as implementation or interview prep notes.

The 30-Second Version

Generative AI is software that predicts what comes next. Give it a question, and it predicts the most likely helpful answer. Give it half a sentence, and it predicts the rest. At its core, a text model is a very sophisticated autocomplete.

The “generative” part means it creates new content (text, images, code, audio) rather than retrieving stored answers.

How a Prompt Becomes a Response

flowchart LR
  A([Your Prompt]) --> B[Tokenizer<br/>Breaks text into tokens]
  B --> C[LLM Model<br/>Predicts next tokens]
  C --> D[Decoder<br/>Reassembles into text]
  D --> E([Response])

  style A fill:#dbeafe,stroke:#2563eb,color:#1d4ed8
  style E fill:#dcfce7,stroke:#16a34a,color:#15803d
  style C fill:#f3e8ff,stroke:#7c3aed,color:#7c3aed

What the Model Actually Does

The model doesn’t think. It doesn’t understand. It does one thing very well: given all the text before this point, what token is most likely to come next?

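It repeats this step once for every token it produces, often hundreds or thousands of times in a single response, until it emits a special end-of-sequence token or hits a length limit.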
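The loop can be sketched in a few lines. This is a toy illustration, not how a real LLM is built: the probability table `NEXT_TOKEN_PROBS` and the `<start>` / `<end>` markers are made up for the example, and the toy looks only at the previous token, whereas a real model computes probabilities from all of the preceding text.

```python
import random
from typing import List, Optional

# Hypothetical toy "model": for each token, the possible next tokens and their probabilities.
# A real LLM computes these numbers with a neural network over the whole preceding text.
NEXT_TOKEN_PROBS = {
    "<start>": {"The": 1.0},
    "The": {"cat": 0.6, "dog": 0.4},
    "cat": {"sat": 0.7, "slept": 0.3},
    "dog": {"sat": 0.5, "slept": 0.5},
    "sat": {"<end>": 1.0},
    "slept": {"<end>": 1.0},
}


def generate(max_tokens: int = 10, seed: Optional[int] = None) -> List[str]:
    """Pick one token at a time until the end marker or the length limit is reached."""
    rng = random.Random(seed)
    tokens = ["<start>"]
    for _ in range(max_tokens):
        probs = NEXT_TOKEN_PROBS[tokens[-1]]
        # Sample the next token in proportion to its probability.
        next_token = rng.choices(list(probs), weights=list(probs.values()))[0]
        if next_token == "<end>":
            break  # the model "decides to stop" by producing the end marker
        tokens.append(next_token)
    return tokens[1:]  # drop the <start> marker


print(" ".join(generate(seed=0)))
```

Change `max_tokens` to 2 and the output is cut off mid-sentence. That is the same effect as a length limit on a real model.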

In English, a token averages roughly ¾ of a word. “Understanding” might be 3 tokens: “Under”, “stand”, “ing”.
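To make the split concrete, here is a minimal sketch of subword tokenization. It uses a greedy longest-match rule and a tiny hand-written vocabulary (`VOCAB` is hypothetical). Real tokenizers learn their vocabulary from data and use different algorithms, so treat this as an illustration of the idea only.

```python
from typing import List, Set

# Hypothetical subword vocabulary; real tokenizers learn tens of thousands of pieces from data.
VOCAB = {"Under", "stand", "ing", "un", "der"}


def tokenize(word: str, vocab: Set[str]) -> List[str]:
    """Split a word into the longest known pieces, falling back to single characters."""
    tokens = []
    i = 0
    while i < len(word):
        # Try the longest remaining piece first, then shorter ones.
        for j in range(len(word), i, -1):
            if word[i:j] in vocab:
                tokens.append(word[i:j])
                i = j
                break
        else:
            # No known piece starts here, so emit one character.
            tokens.append(word[i])
            i += 1
    return tokens


print(tokenize("Understanding", VOCAB))  # ['Under', 'stand', 'ing']
```

A word the vocabulary has never seen, such as "cat" here, falls apart into single characters. That is one reason rare words and unusual names cost more tokens.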

The Key Insight

The model has seen so much human text during training that predicting “what comes next” is functionally equivalent to sounding like a knowledgeable human on almost any topic. That’s the magic and the limitation.

The Three Main Types

Not all generative AI is the same. The field has three major branches you’ll encounter:

Language models (LLMs) - GPT-4, Claude, Gemini. Take text in, produce text out. This is where most business applications live.

Image models - DALL-E, Midjourney, Stable Diffusion. Take a text description, produce an image.

Multimodal models - Can handle text, images, audio, and code together. GPT-4o and Gemini Ultra are examples.

This tutorial series focuses on LLMs because they power most business AI applications.

Why It Hallucinates

The model doesn’t have a “facts database” it looks up. It predicts based on patterns. If it’s never seen reliable data about a niche topic, it will still generate a confident-sounding response - because that’s what it’s optimized to do.

This is not a bug to be fixed. It’s a consequence of how the system works. Your architecture needs to account for it.
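One way to see why: the last step of a language model turns raw scores into probabilities that always add up to 1, so some token always gets picked. The sketch below uses the standard softmax function with made-up scores for four hypothetical candidate tokens. Nothing in the mechanism lets the model abstain unless "I don't know" happens to be the likely continuation.

```python
import math
from typing import List


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    highest = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - highest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


candidates = ["token_a", "token_b", "token_c", "token_d"]  # hypothetical tokens
well_known = softmax([9.0, 2.0, 1.0, 1.0])  # made-up scores: strong evidence for one token
niche = softmax([1.1, 1.0, 1.0, 0.9])       # made-up scores: almost no evidence either way

for label, probs in [("well-known topic", well_known), ("niche topic", niche)]:
    best = max(range(len(probs)), key=probs.__getitem__)
    print(f"{label}: picks {candidates[best]} with probability {probs[best]:.2f}")
```

In both cases a token comes out. On the niche topic the model is close to guessing, but the generated text reads just as fluently.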

⚙️ For Developers

What this means for your code: Every AI response needs validation. Don’t pipe AI output directly into your database, your UI, or your logic without checking it. Hallucinations are real and recurring - plan for them with evals, structured output, and fallback handling.
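A minimal sketch of that validation step, assuming you asked the model to reply with JSON. The field names, the allowed categories, and the fallback value are all hypothetical; substitute whatever your feature actually needs.

```python
import json

# Hypothetical schema for this example.
ALLOWED_CATEGORIES = {"billing", "technical", "other"}
FALLBACK = {"category": "other", "summary": ""}


def parse_model_output(raw: str) -> dict:
    """Return a validated dict, or a copy of FALLBACK if the output is unusable."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return dict(FALLBACK)  # the model replied with prose or broken JSON
    if not isinstance(data, dict):
        return dict(FALLBACK)
    category = data.get("category")
    summary = data.get("summary")
    if category not in ALLOWED_CATEGORIES:
        return dict(FALLBACK)  # missing or invented category
    if not isinstance(summary, str) or not summary.strip():
        return dict(FALLBACK)  # missing or empty summary
    # Copy only the fields we expect, so extra keys never reach the database.
    return {"category": category, "summary": summary.strip()}


print(parse_model_output('{"category": "billing", "summary": "Refund request"}'))
print(parse_model_output("Sure! Here is the JSON you asked for..."))
```

The second call returns the fallback rather than raising, so the caller always receives the same shape. Whether to fall back, retry, or show an error is a product decision; this sketch picks the simplest option.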

🧪 For QA Engineers

What this means for testing: You cannot write a test that says assert response == "exact expected string" for most AI outputs. You need probabilistic testing: does the response contain what it should? Is it within an acceptable range? Tutorial 8 (AI Testing Strategies) covers this in depth.
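Here is a rough sketch of what such a test can look like: run the same prompt several times, check a property of each response, and assert on the pass rate. `fake_model` is a hypothetical stand-in for a real model call, and the 30-day policy, the length limit, and the 70% threshold are illustrative values.

```python
from typing import Callable


def fake_model(prompt: str, run: int) -> str:
    """Hypothetical stand-in for a real model call; the wording varies from run to run."""
    variants = [
        "You can return items within 30 days of purchase.",
        "Returns are accepted up to 30 days after you buy.",
        "Our policy allows returns for 30 days.",
        "Please contact support for details.",  # a weak answer that misses the key fact
    ]
    return variants[run % len(variants)]


def passes_checks(response: str) -> bool:
    """Property checks instead of an exact string match."""
    return "30 days" in response.lower() and len(response) <= 200


def pass_rate(model: Callable[[str, int], str], prompt: str, runs: int) -> float:
    passed = sum(passes_checks(model(prompt, i)) for i in range(runs))
    return passed / runs


rate = pass_rate(fake_model, "What is the return policy?", runs=20)
print(rate)  # 0.75
assert rate >= 0.7, f"pass rate too low: {rate}"
```

Three of the four phrasings pass, so the test passes even though no two good answers are identical.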

📊 For Business Analysts

What this means for requirements: Any AI feature you specify needs acceptance criteria that account for variability. “The AI will always return the correct answer” is not a valid acceptance criterion. “The AI will return an answer that passes the following 5 quality checks” is.

🎯 For Product Managers

What this means for your roadmap: AI features need a different definition of “done.” Plan time for evaluation, iteration, and ongoing monitoring. A feature that works 90% of the time today might work 70% of the time after a model update. Build that into your maintenance budget.
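For the team, that monitoring can start as a simple comparison of evaluation results before and after a change. The sketch below is one possible shape; the result lists and the 5-point tolerance are invented for illustration.

```python
from typing import List


def pass_rate(results: List[bool]) -> float:
    """Fraction of evaluation cases that passed."""
    return sum(results) / len(results)


def has_regressed(baseline: List[bool], current: List[bool], max_drop: float = 0.05) -> bool:
    """Flag a regression when the pass rate falls by more than max_drop."""
    return pass_rate(baseline) - pass_rate(current) > max_drop


# Invented results: 9 of 10 cases passed before the model update, 7 of 10 after.
baseline_results = [True] * 9 + [False] * 1
after_update = [True] * 7 + [False] * 3

print(has_regressed(baseline_results, after_update))  # True
```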

How a Real Application Uses This

A real AI app has layers beyond just the model:

Where the AI Fits in a Real Application

flowchart TB
  U([User]) --> UI[UI Layer<br/>Chat, form, button]
  UI --> API[Application API<br/>Your backend]
  API --> P[Prompt Builder<br/>Templates + context]
  P --> M[AI Model<br/>OpenAI / Anthropic / Google]
  M --> V[Validator<br/>Check output format]
  V --> DB[(Your Database)]
  V --> UI

  style M fill:#f3e8ff,stroke:#7c3aed,color:#7c3aed
  style P fill:#fef3c7,stroke:#d97706,color:#b45309
  style V fill:#dcfce7,stroke:#16a34a,color:#15803d

The model is just one component. The prompt builder, validator, and application logic around it often matter more than the model itself.
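The layers can be sketched as three small functions. Everything here is illustrative: the template wording, the 500-character limit, and the fallback message are assumptions, and `stub_model` stands in for a real provider call so the example runs without an API key.

```python
from typing import Callable, Optional

# Hypothetical prompt template: fixed instructions plus per-request context.
PROMPT_TEMPLATE = (
    "Answer the customer question using only the context below.\n"
    "Context: {context}\n"
    "Question: {question}\n"
)

FALLBACK_MESSAGE = "Sorry, I could not produce a reliable answer."


def build_prompt(question: str, context: str) -> str:
    """Prompt builder: fill the template with the user's question and your own data."""
    return PROMPT_TEMPLATE.format(context=context, question=question)


def validate(response: str, max_chars: int = 500) -> Optional[str]:
    """Validator: reject empty or overlong output, otherwise return the cleaned text."""
    cleaned = response.strip()
    if not cleaned or len(cleaned) > max_chars:
        return None
    return cleaned


def handle_request(question: str, context: str, call_model: Callable[[str], str]) -> str:
    """Application logic: build the prompt, call the model, validate, fall back if needed."""
    raw = call_model(build_prompt(question, context))
    answer = validate(raw)
    return answer if answer is not None else FALLBACK_MESSAGE


def stub_model(prompt: str) -> str:
    """Stand-in for a real model call."""
    return "  Returns are accepted within 30 days.  "


print(handle_request("What is the return policy?", "Returns: 30 days.", stub_model))
```

Passing the model in as a function keeps the rest of the code independent of any one provider, and lets tests swap in a stub like the one above.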

What’s Next

In the next tutorial, you’ll go deeper into how LLMs specifically work - tokens, context windows, temperature settings, and why model choice matters.

Before You Continue

You don’t need to understand the math behind transformers to build with AI effectively. Focus on the mental model: inputs in, outputs out, validate everything.

Interview Practice

  1. Explain generative AI in one sentence without using hype words.
  2. What is the difference between discriminative AI and generative AI?
  3. Why is next-token prediction enough to produce useful text?
  4. Name two risks that come from models generating plausible content instead of verified facts.
  5. How would you explain GenAI value differently to a developer, QA engineer, BA, and PM?
  6. What belongs in application logic rather than relying on the model?