Lecture overview:
Total time: ~55 minutes (tight)
Theme: Debugging as detective work—systematic investigation, not random guessing
Prerequisites: L5 readability (naming, code clarity), L13 AI coding assistants (workflow)
Connects to: All future assignments (debugging is constant), final project
Key connections to prior lectures:
L5 Readability: Good names and clear code make program understanding MUCH easier
L13 AI Assistants: The 6-step workflow applies directly to debugging with AI
The narrative:
Understanding code requires tracing control flow and data flow
Diagrams help visualize complex relationships
Debugging is the scientific method applied to code
Debuggers accelerate hypothesis testing
AI can assist, but you must understand the code yourself
→ Transition: Let's start with the learning objectives...
CS 3100: Program Design and Implementation II
Lecture 14: Program Understanding & Debugging
©2026 Jonathan Bell & Ellen Spertus, CC-BY-SA
Quick title slide, move on to learning objectives.
→ Transition: Here's what you'll be able to do after today...
Poll: Why don't you use Oakland office hours or appointments?
A. I didn't know you held them.
B. I'm not available at those times.
C. I prefer getting help from AI.
D. I prefer getting help from friends.
E. I prefer getting help through Pawtograder.
F. I don't think it would be helpful.
G. I don't know
H. other
Text espertus to 22333 if the URL isn't working for you.
https://pollev.com/espertus
Answers are anonymous but count toward your participation grade.
Learning Objectives
After this lecture, you will be able to:
Utilize control flow and data flow analysis to understand a program
Utilize diagrams (call graphs, sequence diagrams) to visualize program behavior
Apply the scientific method to debugging
Utilize a debugger to step through a program and inspect state
Utilize an AI programming agent to assist with debugging
Time allocation:
Control/data flow analysis (~10 min)
Diagrams and Mermaid (~8 min)
Scientific method of debugging (~15 min)
Debugger usage (~10 min)
AI for debugging (~9 min)
The thread: Build from understanding code → visualizing it → systematically debugging it → using tools to accelerate the process.
→ Transition: Let's talk about what's coming...
Coming Soon: Much Bigger Codebases
A3 (this week!):
Starter code with 10+ interfaces: RecipeCollection, Cookbook, UserLibrary...
Inheritance hierarchies you didn't design
Jackson annotations you've never seen
Group project (March):
Teammates' code you need to understand and extend
"How do I even start understanding this codebase?"
The immediate challenge (A3):
RecipeCollection hierarchy: Cookbook, PersonalCollection, WebCollection
UserLibrary that holds collections
JsonRecipeRepository, JsonRecipeCollectionRepository
Jackson annotations for polymorphic serialization
The fear is real:
"There's so much code, where do I even begin?"
"What's @JsonTypeInfo? What's @JsonSubTypes?"
"How does CookbookImpl relate to Cookbook?"
Today's promise: We're learning the skills to tackle A3's codebase systematically—and these same skills scale to any codebase you'll ever work with.
→ Transition: First, let's see what your debugging instincts are...
Poll: How many hours per assignment do you spend debugging?
Text espertus to 22333 if the URL isn't working for you.
https://pollev.com/espertus
Poll: When You Encounter a Bug, What's Your First Instinct?
A. Add print statements everywhere
B. Start changing code to see what happens
C. Read the code carefully to understand it
D. Ask AI to fix it
E. Panic and consider dropping the class
Text espertus to 22333 if the URL isn't working for you.
https://pollev.com/espertus
Discussion points:
No judgment—we've all done each of these
Print statements: Can help, but scattered prints are hard to follow
Changing code: "Vibe debugging"—dangerous without understanding
Reading carefully: Good start, but may need more systematic approach
Ask AI: Can help, but AI can't run your code or understand your context
Panic: Normal! But we'll give you better tools today
Key message: The goal is to move from reactive (panic, random changes) to proactive (systematic investigation).
→ Transition: This is a learnable skill...
Building Mental Models
Understanding code = building a mental model.
Form a hypothesis: "I think UserLibrary stores recipes directly"
Test it: Look at the interface—getCollections(), not getRecipes()
Refine: "It stores collections that contain recipes"
Same process for debugging: hypothesis → test → refine
The key insight:
Understanding code and debugging are the SAME skill
Both involve forming and testing hypotheses about behavior
Both require building accurate mental models
A3 example (point to diagram):
First hypothesis: "UserLibrary holds recipes"
Test: Look at the interface methods → getCollections(), not getRecipes()
Refined model: "UserLibrary holds RecipeCollections, which hold Recipes"
The diagram SHOWS this—follow the arrows!
Connection to patterns (L7-L8):
Patterns are reusable mental models!
When you see CookbookImpl.builder(), you instantly know it's the Builder pattern
Pattern recognition accelerates understanding
Why mental models matter:
You can't hold entire codebase in your head
You need accurate models of how pieces interact
Wrong mental model = wrong assumptions = bugs
→ Transition: Two lenses help us build these models...
Two Lenses for Understanding Code
The two-lens metaphor:
Same code, two different ways to see it
Control flow: Where does execution GO?
Data flow: What VALUES exist?
Most bugs require seeing BOTH
Why two lenses?
Control flow alone misses: "the right path, but wrong value"
Data flow alone misses: "right value at wrong time"
Combining them reveals the full picture
Connection to L5 Readability:
Good naming makes both lenses clearer
containsRecipe immediately tells you what the branch checks
isBlank(author) clearly documents the data transformation
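For reference, a minimal sketch of what such an isBlank helper might look like (hypothetical; A3's actual utility may differ):

private static boolean isBlank(@Nullable String s) {
    // Treat null, empty, and whitespace-only strings the same way.
    return s == null || s.trim().isEmpty();
}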
→ Transition: Let's look at each lens in detail...
Control Flow: Tracing Execution Paths

@Override
public Cookbook addRecipe(Recipe recipe) {
    if (containsRecipe(recipe.getId())) {
        throw new IllegalArgumentException(
            "Recipe with ID '" + recipe.getId() + "' already exists");
    }
    List<Recipe> newRecipes = new ArrayList<>(recipes);
    newRecipes.add(recipe);
    return new CookbookImpl(id, title, newRecipes, author, isbn,
        publisher, publicationYear);
}
Questions to ask: Which paths are possible? What happens if the recipe ID already exists? What happens on the "happy path"?
This is code from A3 you'll work with this week!
Reading control flow:
Identify branch points: containsRecipe check
Path 1: Recipe exists → exception thrown (error path)
Path 2: Recipe doesn't exist → add to list, return new cookbook (happy path)
Why this matters for A3:
This demonstrates immutability: returns NEW cookbook, doesn't modify existing
The duplicate check prevents bugs from having two recipes with same ID
Understanding this helps you implement PersonalCollectionImpl and WebCollectionImpl
Common control flow bugs:
Forgetting the duplicate check → silent data corruption
Missing the "return new instance" → accidentally mutating
Not preserving other fields (author, isbn, etc.) when creating new instance
→ Transition: Now let's look at data flow...
Data Flow: Tracking How Values Change

@JsonCreator
private CookbookImpl(
        @JsonProperty("id") @Nullable String id,
        @JsonProperty("title") String title,
        @JsonProperty("recipes") List<Recipe> recipes,
        @JsonProperty("author") @Nullable String author,
        @JsonProperty("isbn") @Nullable String isbn,
        @JsonProperty("publisher") @Nullable String publisher,
        @JsonProperty("publicationYear") @Nullable Integer publicationYear) {
    this.id = (id != null) ? id : UUID.randomUUID().toString();
    this.title = title;
    this.recipes = List.copyOf(recipes);
    this.author = isBlank(author) ? null : author;
    this.isbn = isBlank(isbn) ? null : isbn;
    this.publisher = isBlank(publisher) ? null : publisher;
    this.publicationYear = publicationYear;
}
Questions to ask: What happens if author = " " (whitespace)? Trace how the data flows through isBlank() to this.author.
This is code from A3! Understanding this data flow helps debug serialization issues.
Reading data flow:
Track where variables are DEFINED: constructor parameters → fields
Track transformations: isBlank(author) changes data in transit
Consider ALL possible values: null, blank string, valid string
The normalization pattern:
Input: author could be null, "", " ", or "Julia Child"
Flow: isBlank(author) ? null : author
Result: blank strings become null (so getAuthor() returns Optional.empty())
Why? Consistent representation: "not specified" is always null, never blank string
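A sketch of the getter this normalization enables (hedged; the lecture only states that getAuthor() returns Optional.empty() for missing authors):

@Override
public Optional<String> getAuthor() {
    // Blank authors were normalized to null in the constructor, so
    // "not specified" is uniformly Optional.empty().
    return Optional.ofNullable(author);
}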
Why this matters for A3:
Jackson deserializes JSON → calls this constructor with @JsonCreator
The @JsonProperty annotations tell Jackson which JSON fields map to parameters
You'll use this same pattern for PersonalCollectionImpl and WebCollectionImpl
Common data flow bugs in serialization:
Forgetting defensive copy → mutable collections leak out
Not normalizing blank strings → inconsistent Optional behavior
Missing null checks before calling methods
→ Transition: Let's see how control and data flow combine...
Combining Control and Data Flow Analysis

public List<Recipe> findRecipesByTitle(String title) {
    return collections.stream()
        .flatMap(c -> c.getRecipes().stream())
        .filter(r -> r.getTitle().equalsIgnoreCase(title))
        .toList();
}
Real bugs often involve both control and data flow issues interacting.
This is code YOU'LL implement for A3!
The interplay of control and data flow:
Control flow: Stream operations create a pipeline of transformations
stream() → flatMap() → filter() → toList() is the execution path
Each operation processes elements in sequence
Data flow: The title parameter flows through the entire pipeline
Read from parameter → flows into equalsIgnoreCase() for each recipe
Recipe objects flow from collections → through flatMap → through filter → into result list
Why this is safe:
collections is immutable (from List.copyOf() in constructor)
Can't be modified during iteration → no ConcurrentModificationException
Each recipe's title is also immutable → consistent comparisons
Common bugs to watch for:
Forgetting case-insensitive comparison → equals() instead of equalsIgnoreCase()
Not handling null title → should add null check at method entry
Accidentally mutating collections during iteration (prevented by immutability here!)
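One possible form of that null check, as a sketch (the assignment may prescribe different error handling):

public List<Recipe> findRecipesByTitle(String title) {
    // Fail fast with a clear message instead of silently returning
    // an empty list when title is null.
    Objects.requireNonNull(title, "title must not be null");
    return collections.stream()
        .flatMap(c -> c.getRecipes().stream())
        .filter(r -> r.getTitle().equalsIgnoreCase(title))
        .toList();
}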
Connection to L5 Readability:
Stream operations make control flow explicit and linear
Method name findRecipesByTitle clearly documents what data flows in/out
→ Transition: For complex code, we need visualization tools...
Interprocedural Analysis: Following Calls Across Methods
When examining control flow, you may need to trace across method boundaries:
IDE Tools:
Find All References : Where is this method called?
Go to Definition : What does this method do?
Call Hierarchy : Who calls whom?
Visualization:
Call graphs : Static view of what CAN call what
Sequence diagrams : Dynamic view of what DID happen
These tools become essential as codebases grow beyond what you can hold in your head.
Why interprocedural matters:
Bug might be in the CALLER, not the method you're looking at
State might be set elsewhere and just manifests here
Understanding requires seeing the bigger picture
IDE tools are your friends:
IntelliJ: Ctrl+Click for definition, Alt+F7 for usages
VS Code: F12 for definition, Shift+F12 for references
These should become muscle memory
When to use visualization:
When you can't hold the call structure in your head
When explaining to others
When debugging complex interactions
→ Transition: But there's a catch with OO code...
Don't Forget: Dynamic Dispatch Affects Control Flow
From A3: When you see collection.addRecipe(recipe), which addRecipe runs?
⚠️ Recall from Quiz 1: The runtime type determines which method executes, not the declared type!
This is the A3 collection hierarchy you'll implement!
This was tricky on Quiz 1!
Many students struggled with dynamic dispatch questions
Static analysis shows what COULD be called
Runtime behavior depends on actual object type
When tracing control flow in A3:
RecipeCollection collection = repository.findById(id).get(); — declared type is RecipeCollection
collection.addRecipe(recipe) — but which addRecipe?
Depends on runtime type: CookbookImpl? PersonalCollectionImpl? WebCollectionImpl?
Each returns its own specific type (Cookbook, PersonalCollection, WebCollection)
Why this matters for serialization:
Jackson uses @JsonTypeInfo to add a "type" field in JSON
When deserializing, Jackson reads "type": "cookbook" and creates CookbookImpl
Dynamic dispatch ensures the right addRecipe implementation runs
This is why polymorphic serialization is crucial!
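A hedged sketch of what those annotations can look like on the interface (subtype names follow the lecture's JSON examples; A3's exact annotation parameters may differ):

import com.fasterxml.jackson.annotation.JsonSubTypes;
import com.fasterxml.jackson.annotation.JsonTypeInfo;

@JsonTypeInfo(use = JsonTypeInfo.Id.NAME, include = JsonTypeInfo.As.PROPERTY, property = "type")
@JsonSubTypes({
    @JsonSubTypes.Type(value = CookbookImpl.class, name = "cookbook"),
    @JsonSubTypes.Type(value = PersonalCollectionImpl.class, name = "personal"),
    @JsonSubTypes.Type(value = WebCollectionImpl.class, name = "web")
})
public interface RecipeCollection {
    // ...addRecipe, getRecipes, etc. as discussed above
}

With this in place, Jackson writes the "type" field on save and uses it to choose the concrete class on load.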
Debugging implication:
Can't always know statically which method runs
Use debugger to see actual runtime type
Or check the JSON file to see "type" field
Key skill: When "Go to Definition" shows RecipeCollection interface, remember to check which implementation is actually being used at runtime.
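A tiny illustration of that situation, using the A3 identifiers from the examples above:

// Declared type is the interface; the debugger shows the runtime type.
RecipeCollection collection = repository.findById(id).get();
// If the stored JSON said "type": "cookbook", the runtime type is
// CookbookImpl, so CookbookImpl.addRecipe is what actually executes.
RecipeCollection updated = collection.addRecipe(recipe);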
→ Transition: Diagrams can help visualize these relationships...
Diagrams Help You See the Big Picture
Complex systems require visualization. Three key diagram types:
Diagram Type | Shows | Use When
Call Graph | Which methods call which | Understanding static structure
Sequence Diagram | Order of calls over time | Understanding dynamic behavior
Class Diagram | Relationships between types | Understanding data model
Why diagrams matter:
Code is linear; systems are not
Diagrams show relationships at a glance
Hand-sketching is often fastest while debugging—formalize later for documentation
Mermaid for formal diagrams:
Text-based, version control friendly
AI can generate and modify diagrams for you
Great for documentation and collaboration
→ Transition: Let's see examples...
Call Graphs: Static View of Method Relationships
From A3: JSON serialization involves many method calls under the hood. Understanding this call graph helps debug serialization issues.
This is the A3 JSON serialization flow!
What call graphs reveal:
Entry point: save(collection) in your repository implementation
Jackson's ObjectMapper does the heavy lifting
@JsonTypeInfo annotation adds type discrimination
Polymorphic serialization: Jackson must figure out concrete types
Is this Ingredient a MeasuredIngredient or VagueIngredient?
Is this Quantity an ExactQuantity, FractionalQuantity, or RangeQuantity?
Finally writes to file via Files.writeString()
Why this matters for debugging:
If JSON is missing a field → check if getter exists
If JSON has wrong type → check @JsonTypeInfo annotations
If serialization fails → check which object in the graph is problematic
Understanding the call flow helps you identify where things break
How to create:
Start with the entry point (save)
Add direct calls as arrows
Highlight key decision points (polymorphism in blue)
Jackson internals (green) vs your code (starting point)
The Mermaid syntax:
graph LR = left-to-right flow
A[text] = node with label
A --> B = arrow from A to B
style B fill:#color = highlight node
→ Transition: Sequence diagrams show the ORDER of execution...
Sequence Diagrams: Dynamic View Over Time
This is the A3 JSON round-trip sequence!
What sequence diagrams reveal:
The ORDER of method calls (top to bottom = time)
SAVE operation:
Test calls save(cookbook)
Repository uses Jackson to serialize: writeValueAsString()
Jackson adds type info via @JsonTypeInfo
Result written to file via Files.writeString()
LOAD operation:
Test calls findById(id)
Repository reads file via Files.readString()
Jackson deserializes: readValue(json, RecipeCollection.class)
Jackson reads "type": "cookbook" field and creates CookbookImpl
Returns as Optional<RecipeCollection>
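The round trip in code, as a minimal sketch (ObjectMapper, writeValueAsString, and readValue are Jackson's real API; the surrounding repository wiring is simplified):

ObjectMapper mapper = new ObjectMapper();
// SAVE: @JsonTypeInfo makes Jackson add "type": "cookbook" to the JSON.
String json = mapper.writeValueAsString(cookbook);
// LOAD: Jackson reads the "type" field and instantiates CookbookImpl.
RecipeCollection loaded = mapper.readValue(json, RecipeCollection.class);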
Key insight:
The "type" field is CRITICAL for polymorphism
Without it, Jackson wouldn't know to create CookbookImpl vs PersonalCollectionImpl
This is why @JsonTypeInfo and @JsonSubTypes are essential
When to use sequence diagrams:
Debugging serialization issues: "Where in this sequence did it fail?"
Understanding round-trip behavior: "Does the loaded object equal the saved one?"
Clarifying Jackson's role: "What does the library do vs what we do?"
The Mermaid syntax:
participant X = declare an actor
A->>B: message = synchronous call
A-->>B: response = return/response
Note over = add annotations
→ Transition: Class diagrams show data relationships...
Class Diagrams: Understanding the Data Model
This is the core A3 data model!
What class diagrams reveal:
RecipeCollection: Interface with id, title, list of recipes
Methods return NEW collections (immutability)
Recipe: Has ingredients, instructions, optional servings
Ingredient: Abstract base class (MeasuredIngredient and VagueIngredient extend it)
Quantity: Also abstract (ExactQuantity, FractionalQuantity, RangeQuantity)
Key relationships:
"1" --> "*" = one-to-many: One collection contains many recipes
"1" --> "*" = one-to-many: One recipe has many ingredients
"1" --> "0..1" = optional: Recipe may have servings (or not)
When to use class diagrams:
Understanding domain model structure
Seeing containment relationships (what holds what?)
Planning implementations: "What fields do I need?"
Debugging: "Where should this data live?"
Why this matters for A3:
You need to serialize/deserialize this entire object graph
Jackson handles nested objects automatically
Understanding the structure helps debug JSON issues
When JSON is missing data, check: "Which relationship broke?"
The symbols:
<<interface>> = interface (not concrete class)
<<abstract>> = abstract class
--> = association (has-a)
Numbers = multiplicity (how many)
→ Transition: Let's talk about practical Mermaid usage...
Practical Diagram Workflow
When debugging, sketching quick ASCII art (or, gasp, paper and pencil) is fastest:

Test -> JsonRecipeCollectionRepository.save(cookbook)
Repo -> ObjectMapper.writeValueAsString()
        -> adds "type": "cookbook"
        -> serializes Recipe[]
Repo -> Files.writeString()   ⚠️ IO Exception?
Faster than looking up syntax
Forces you to understand flow
Easy to annotate with hypotheses
Hand-sketch while debugging (fast, flexible)
Key message: Don't memorize syntax—use references and AI. Focus on WHAT to diagram, not HOW.
→ Transition: Let's see concrete examples of using AI for diagram generation...
Using AI to Generate Diagrams: Concrete Examples
Example 1: Visualizing classes relating to CookbookImpl
Prompt: "Create a class diagram with CookbookImpl and all classes it has relationships with."
Example 2: Understanding CookbookImpl construction flow
Prompt: "Create a sequence diagram showing how CookbookImpl is created from JSON using Jackson."
Example 3: Tracing data flow through UserLibraryImpl
Prompt: "Show the data flow when UserLibraryImpl.findRecipesByTitle() searches across collections."
✅ Great use of AI: Quickly get the benefits of visualization.
Tip: Use mermaid.live for live editing
It's great to use AI to increase your understanding through:
Class diagrams
Complex multi-step flows (Jackson serialization)
Stream pipelines (multiple transformations)
Call sequences across multiple methods
When you need formal documentation
AI saves time on:
Mermaid syntax (you don't need to memorize it)
Consistent formatting
Iterating on the diagram structure
But YOU must:
Verify the diagram is correct
Check it matches the actual code
Understand what each step does
→ Transition: But when should you NOT use AI?
When to Use the IDE, Not AI
❌ Don't use AI to:
find a method's javadoc
go to an implementation
find where a method is called
Use the IDE!
IDE navigation is ALWAYS faster for simple lookups
Go to definition, find usages, type hierarchy are instant and accurate
Don't use AI when reading the code is faster
Transition: So when SHOULD you use AI?
When to Use AI for Diagrams
✅ Use AI for:
Complex call chains: Multiple methods across classes
"Show how save() calls Jackson which calls getters"
Understanding polymorphism:
"Diagram the RecipeCollection hierarchy and show which addRecipe runs for each type"
Stream pipelines:
"Visualize the data transformations in findRecipesByTitle"
Documenting for others:
AI generates shareable Mermaid diagrams
You validate them, others can trust them
Use AI for complex flows you haven't seen before
Use AI for formal diagrams for documentation or team discussion
Use AI to save time on Mermaid syntax
But ALWAYS verify AI output against actual code
Transition: Let's see what effective AI prompts look like
AI Generated This Diagram. Now What?
Prompt used:
"I'm working on implementing JsonRecipeCollectionRepository. Create a mermaid sequence
diagram showing the save() method, from here to disk."
Pause here and ask students: "What would you do with this?"
Key points to draw out:
Verify against actual code:
Does it call writeValueAsString() or something else?
Is the @JsonTypeInfo behavior correct?
Check the actual implementation—AI may have hallucinated details
Test your understanding:
Can you explain each step?
Do you know WHY @JsonTypeInfo is needed?
What would break if you removed it?
Iterate if wrong:
"This diagram shows X, but the actual code does Y. Please fix it."
Include the actual method signature if AI hallucinated
The meta-lesson: AI can generate impressive-looking diagrams, but YOU must verify them. The diagram is only useful if it's accurate AND you understand it.
Why this prompt is effective:
Specific context: names the exact class, JsonRecipeCollectionRepository, not generic code
Clear scope: "from here to disk" bounds what the diagram must cover
Specifies format: a Mermaid sequence diagram
Purpose-driven: understanding the serialization flow
To strengthen it further: mention key concepts (@JsonTypeInfo, polymorphism) and name the participants
Common mistakes to avoid:
❌ "Make a diagram of my code" → too vague
❌ Pasting 500 lines of code → AI will summarize poorly
❌ Accepting diagram without verification → might be wrong
❌ Using AI for simple lookups → IDE is faster
The verification step is CRITICAL:
AI might hallucinate method names
AI might not know about @JsonTypeInfo if not told
AI might oversimplify complex logic
YOU are responsible for correctness
Real A3 scenario:
You're confused about polymorphic serialization
AI helps you visualize the flow
You verify against CookbookImpl's annotations
Now you understand how to implement PersonalCollectionImpl
→ Transition: Let's compare good vs bad AI usage...
AI for Diagrams: What Works vs What Doesn't
❌ Vague: "Diagram the repository pattern"
No diagram type (sequence? class?)
No format → ASCII art 🤮
Generic, not your code
✅ Specific: "Create a mermaid flowchart showing data flow through CookbookImpl constructor"
Diagram type ✓ Format ✓ Your class ✓
❌ Wrong tool: "What does addRecipe do?"
AI reads code, explains it
You could just read it yourself
Slower than Ctrl+Click
✅ Right tool: Press F12 on addRecipe
Instant definition
See params and return type
Can step through in debugger
Key lessons:
For AI prompts:
✅ Be specific: mention class names, method names, annotations
✅ Describe what you want to understand (the WHY)
✅ Specify the diagram type (sequence, class, flow)
❌ Don't be generic: "explain this code"
❌ Don't paste huge code blocks without context
Tool selection:
IDE navigation (instant): Go to definition, find usages, type hierarchy
Hand sketching (fast): Quick flow while debugging
AI diagrams (formal): Complex flows, documentation, sharing
The meta-skill:
Knowing WHEN to use each tool is as important as knowing HOW
Use AI to amplify your understanding, not replace it
Always verify AI output against actual code
A3 specific examples:
❌ AI: "What fields does Recipe have?" → Just open Recipe.java
✅ AI: "Diagram how Recipe with nested Ingredients serializes to JSON"
❌ AI: "Where is Cookbook used?" → IDE Find Usages
✅ AI: "Show the call sequence when deserializing polymorphic collections"
→ Transition: Now that we can visualize effectively, let's learn systematic debugging...
Debugging Is Critical Thinking Applied to Code
"Debugging is twice as hard as writing the code in the first place. Therefore, if you write the code as cleverly as possible, you are, by definition, not smart enough to debug it." — Brian Kernighan
The Kernighan quote:
Debugging requires understanding what you wrote
Plus understanding what went wrong
If code is at your limit, debugging exceeds it
Connection to L5 Readability:
This is WHY readability matters so much
Readable code is debuggable code
"Clever" code becomes debugging nightmare
The mindset shift:
Debugging isn't about being "smart enough"
It's about being SYSTEMATIC
Same skills as scientific inquiry: hypothesize, test, learn
→ Transition: Let's see what the scientific method looks like for debugging...
The Scientific Method Applied to Debugging
1. Observe: Notice unexpected behavior or test failure
2. Hypothesize: Form a theory about the cause
3. Predict: What evidence would support or refute this?
4. Test: Gather evidence (debugger, logging, tests)
5. Analyze: Does the evidence support the hypothesis?
6. Iterate: Refine hypothesis or implement fix

Quick Background: What Is HTML?
HTML uses tags (text in angle brackets) to format web pages:
HTML (what you write):
<b>Hello</b> world
<p class="intro">Welcome!</p>
Rendered (what users see):
Hello world
Welcome!
The task: Write a function that extracts just the text, removing all tags.
Input | Expected Output
<b>Hello</b> | Hello
Click <a href="url">here</a> | Click here
Quick HTML primer:
Tags are surrounded by < and >
Opening tag: <b> (start bold)
Closing tag: </b> (end bold)
Tags can have attributes: <a href="url"> — note the quotes inside!
Why this matters for the example:
We need to recognize when we're INSIDE a tag (between < and >)
We need to handle quotes in attributes (don't treat < inside quotes as a tag start)
The bug we'll find is subtle but common
If students aren't familiar with HTML:
That's fine! The key is: <stuff> = tag to remove, everything else = keep
The complexity comes from quotes inside tags
→ Transition: Now let's see the code that tries to do this...
Example: Debugging HTML Markup Removal

public static String removeHtmlMarkup(String s) {
    boolean tag = false, quote = false;
    StringBuilder out = new StringBuilder();
    for (int i = 0; i < s.length(); i++) {
        char c = s.charAt(i);
        if (c == '<' && !quote) {
            tag = true;
        } else if (c == '>' && !quote) {
            tag = false;
        } else if (c == '"' || c == '\'' && tag) {
            quote = !quote;
        } else if (!tag) {
            out.append(c);
        }
    }
    return out.toString();
}
Adapted from Andreas Zeller's "Introduction to Debugging" (CC-BY-NC-SA).
Step 1: Observe — Build an Observation Table

Input | Expected | Actual | Result
<b>foo</b> | foo | foo | ✓
<b>"foo"</b> | "foo" | foo | ✗
"<b>foo</b>" | "foo" | <b>foo</b> | ✗
<b id="bar">foo</b> | foo | foo | ✓
Both failures involve quotes. Pattern emerging!
Why no code on this slide? Intentional! At this point we're focusing on patterns in input/output, not trying to spot the bug by staring at code. The observation table IS the technique—we let the data guide us to hypotheses.
Building the table:
Test with simple cases first
Include edge cases (quotes in content, quotes around tags)
Record EXACTLY what happens
What we observe:
Basic tag removal works: <b>foo</b> → foo
But quotes cause problems in two different ways
These might be related or separate bugs
Key practice: Write down observations before forming theories. Our brains want to jump to conclusions—resist!
→ Transition: Step 2—let's form hypotheses...
Step 2: Hypothesize
Based on our observations, we form two hypotheses:
Hypothesis | Based On
H1: Double quotes are stripped from output | <b>"foo"</b> → foo (quotes missing)
H2: Quotes outside tags break tag parsing | "<b>foo</b>" → <b>foo</b> (tags kept)
Let's focus on H1 first—it's simpler. We'll refine it:
H1 (refined): Double quotes are stripped from input, even without tags.
Why start with H1:
Simpler to test
May explain H2 as well (related to quote handling)
Start simple, add complexity if needed
The refinement:
Original: "quotes stripped from tagged input"
Refined: "quotes stripped from ALL input"
This is a stronger, more testable claim
Key practice: Make hypotheses SPECIFIC and TESTABLE.
→ Transition: Step 3—let's test this hypothesis...
Step 3 & 4: Predict and Test
Prediction: If H1 is correct, even "foo" (no tags) should lose its quotes.
@Test
public void testPlainQuotes() {
    assertEquals("\"foo\"", removeHtmlMarkup("\"foo\""));
}

Input | Expected | Actual | Result
"foo" | "foo" | foo | ✗
H1 CONFIRMED: Double quotes are stripped even without any HTML tags.
The test:
We predicted quotes would be stripped without tags
We tested with just "foo" (no tags at all)
The prediction was correct—quotes ARE stripped
Why this matters:
We now know the bug isn't about tag handling specifically
It's about quote handling in general
This narrows our focus
Key practice: Make predictions that could REFUTE your hypothesis. If you can only confirm, you're not testing rigorously.
→ Transition: Let's dig deeper into WHY this happens...
Step 5: Analyze — Narrow Down the Cause
Where is quote-stripping happening? The only quote-handling code is:

else if (c == '"' || c == '\'' && tag) {
    quote = !quote;
}

This should only trigger when tag is true. But for input "foo", there are no tags, so tag should always be false...
New hypothesis H3: The error is due to tag being set incorrectly.
for (int i = 0; i < s.length(); i++) {
    char c = s.charAt(i);
    assert !tag;
}
H3 REFUTED: tag is always false, yet quotes are still stripped.
The investigation:
We identified the suspicious code
We formed hypothesis H3: tag is being set wrong
We tested with an assertion
H3 was REFUTED—tag is never true
What this tells us:
The bug isn't in tag-setting logic
The quote condition is triggering when it shouldn't
But HOW? If tag is false, shouldn't the condition be false?
Key practice: Use assertions to test assumptions. When they pass/fail, you learn something.
→ Transition: If tag is false, why does the condition trigger?
Step 5 (continued): The Root Cause Revealed
New hypothesis H4 : The quote condition evaluates to true even when tag is false.
else if (c == '"' || c == '\'' && tag) {
    assert false;
    quote = !quote;
}
H4 CONFIRMED. But wait—let's test with single quotes:
removeHtmlMarkup("'foo'");
Double quotes are stripped. Single quotes are preserved. The difference?
Operator precedence! Because && binds more tightly than ||,
c == '"' || c == '\'' && tag is parsed as
(c == '"') || (c == '\'' && tag)
so the double-quote check ignores tag entirely.
Step 6: Fix — Implement and Verify
Before (buggy):

else if (c == '"' || c == '\'' && tag) {
    quote = !quote;
}

Parsed as: (c == '"') || (c == '\'' && tag)

After (fixed):

else if ((c == '"' || c == '\'') && tag) {
    quote = !quote;
}

Parsed as: (c == '"' || c == '\'') && tag
✓ All tests now pass. The bug was operator precedence—parentheses fix it.
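To lock the fix in, the observation table can become a regression test (a sketch; the test name is hypothetical, and assertEquals is assumed in scope as in the earlier test):

@Test
public void testQuoteHandlingAfterFix() {
    assertEquals("foo", removeHtmlMarkup("<b>foo</b>"));
    assertEquals("\"foo\"", removeHtmlMarkup("<b>\"foo\"</b>"));
    assertEquals("\"foo\"", removeHtmlMarkup("\"<b>foo</b>\""));
    assertEquals("foo", removeHtmlMarkup("<b id=\"bar\">foo</b>"));
}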
The Hypothesis Funnel: How We Narrowed Down the Bug
4 targeted tests took us from "something is wrong" to "operator precedence bug."
The power of systematic debugging:
We didn't guess randomly
Each test ELIMINATED possibilities
H1: Confirmed it's a quote issue (not tags)
H3: Ruled out tag variable (refuted)
H4: Confirmed the condition is wrong
Generalization: Single vs double quotes revealed precedence
Why this is faster:
4 targeted tests vs potentially dozens of random changes
Each test gives INFORMATION even when it fails
We converged on the exact bug
The takeaway:
Systematic doesn't mean slow
It means efficient—maximum information per test
→ Transition: Let's talk about when to use this approach...
When to Use Systematic Debugging
Quick Fixes (1-2 tries):
Obvious typos
Simple logic errors
Familiar patterns
Just fix it.
Systematic Approach:
Bug persists after a few attempts
You don't understand the root cause
Complex interactions involved
Open a debugging log.
Rule of thumb: If you can't fix it in two tries, start writing down your hypotheses. The log helps when you think "but I already checked that!"
Tighten feedback loops:
If you keep reproducing the bug, create a minimal test case
Don't spend more time reproducing than debugging
But don't over-engineer the test case either
The debugging log:
Plain text file
Record hypotheses and outcomes
Prevents going in circles
Helps you resume after breaks
When systematic pays off:
Complex bugs that span multiple sessions
Bugs in unfamiliar code
When you're confused and frustrated
→ Transition: Let's see why the log matters so much...
Why Keep a Debugging Log?
The science behind this:
Working memory holds roughly 4 items at once
Each hypothesis, test, and observation is an "item"
A 15-minute debug session easily generates 20+ items
Without external storage, you WILL lose information
The practical reality:
"I feel like I already tried this" → you probably did
"Let me just try one more thing" → you're going in circles
"Wait, what did that test tell me?" → lost insight
The log doesn't have to be fancy:
Plain text file, markdown, even paper
Just: hypothesis, test, result, next step
30 seconds to write, saves 5 minutes of repeated work
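For example, a log entry from today's HTML bug might read:

H1 (refined): quotes are stripped even without tags
Test: removeHtmlMarkup("\"foo\"") in a unit test
Result: returned foo, quotes gone → H1 confirmed
Next: figure out why the quote branch fires when tag is false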
When students resist:
"It slows me down" → It speeds you up after minute 2
"I can remember" → You can't. Nobody can. That's the point.
"It's overkill for small bugs" → Then you'll fix it in 2 tries and never need the log
→ Transition: Now let's see how debuggers accelerate hypothesis testing...
1. Write an automated test first
You may need to reproduce the bug 100+ times to pinpoint it and confirm the fix
Manual reproduction = slow feedback loop = frustration
A failing test makes each hypothesis test instant
2. Reason through the code (rubber duck debugging)
Explain the code out loud, line by line
"The variable tag starts false, then when we see <..."
Often reveals assumptions you didn't know you were making
3. Add logging statements (cautiously; a sketch follows this list)
Useful when you can't attach a debugger (production, distributed systems)
But: clutters code, requires guessing what to log, noisy output
Remove or use proper logging levels when done
4. Use a debugger ← next slide!
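A minimal sketch of technique 3 using java.util.logging, so diagnostics are toggled by log level rather than deleted (the class and method names here are illustrative, not part of the assignment):

import java.util.logging.Logger;

class MarkupDebug {
    private static final Logger LOG = Logger.getLogger(MarkupDebug.class.getName());

    // Called at the top of removeHtmlMarkup's loop in place of println.
    static void trace(int i, char c, boolean tag, boolean quote) {
        // FINE-level messages are off by default, so normal runs stay
        // quiet; enable the level only while investigating.
        LOG.fine("i=" + i + " c=" + c + " tag=" + tag + " quote=" + quote);
    }
}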
Why automated tests first:
The #1 time sink in debugging is REPRODUCING the bug
If it takes 30 seconds to reproduce manually, and you test 50 hypotheses, that's 25 minutes just clicking around
A test that runs in 100ms means 50 hypotheses = 5 seconds of reproduction time
This is why TDD practitioners often debug faster—they already have the reproduction step automated
Rubber duck debugging:
Named after a programmer who carried a rubber duck and explained code to it
Works because explaining forces you to be explicit about assumptions
Often you'll say "and then X happens" and realize you never verified X actually happens
Can use AI as a rubber duck too: "Let me explain this code to you..."
Logging vs debugger tradeoffs:
Logging: works in production, persists across runs, good for intermittent bugs
Debugger: immediate visibility, no code changes, but requires setup and can't always attach
The key insight:
These tools serve different purposes
Tests = fast reproduction
Rubber duck = surface assumptions
Logging = production/distributed debugging
Debugger = deep inspection
→ Transition: Let's see the debugger in action...
Debugging Our HTML Example
Walk through this screenshot:
Set breakpoint where we suspect the bug
Variables panel shows tag, quote, out values
We can see exactly where the state goes wrong
No print statements needed—just inspect
Why debugger beats print statements:
See ALL variables, not just what you guessed to print
Step through one line at a time
No code changes, no recompilation
→ Demo: Step through failing test case "<b>foo</b>"
Debugger = Interactive Hypothesis Testing
Instead of adding print statements and re-running, the debugger lets you test hypotheses interactively :
To Test This Hypothesis... | Use This Debugger Feature
"Is tag ever true outside a tag?" | Conditional breakpoint: tag == true
"What is quote when we hit this line?" | Watch the quote variable
"Does this branch ever execute?" | Set breakpoint on that line
"What path did execution take?" | Check the call stack
The debugger advantage:
No code changes needed
Interactive exploration
See ALL variables at any point
Step through execution in real-time
The debugger doesn't replace thinking:
You still form hypotheses
You still test them
The debugger just makes testing FASTER
→ Transition: Let's review the core debugger concepts...
Core Debugger Concepts
Breakpoints:
Line breakpoint: Pause at specific line
Conditional: Pause only when condition is true
Exception: Pause when exception thrown
Stepping Commands:
Step Over (F10): Execute line, go to next
Step Into (F11): Enter method call
Step Out (Shift+F11): Finish current method
Continue (F5): Run to next breakpoint
Inspection:
Variables pane: See all values at current point
Watch expressions: Track specific expressions
Call stack: How did execution get here?
Evaluate expression: Test hypotheses interactively
The key features:
Breakpoints let you STOP and look around
Stepping lets you follow execution path
Inspection lets you test hypotheses about data
Keyboard shortcuts vary:
IntelliJ, VS Code, Eclipse all slightly different
Learn your IDE's shortcuts—they become muscle memory
The call stack is crucial:
Shows the chain of method calls
Bug might be in CALLER, not current method
Don't ignore it!
→ Transition: Let's see effective debugger workflow...
Effective Debugger Workflow
1. Reproduce reliably — Write a failing test if possible
2. Form hypothesis — Use control/data flow analysis to guess location
3. Set strategic breakpoints — Start broad, narrow down
4. Step through systematically — Watch for unexpected values/paths
5. Note when values first become wrong — That's where the bug manifests
6. Verify understanding before fixing — Can you explain the bug?
Common pitfall: Stepping through every line. Use conditional breakpoints instead!
Strategic breakpoints:
Don't set a breakpoint and hit F10 100 times
Use conditional breakpoints: i == 47
Use breakpoints at KEY locations, not everywhere
The "first wrong" principle:
Values are correct at some point
Then they become wrong
Finding that transition point finds the bug
Verify before fixing:
Can you explain WHY the bug happens?
If not, you might fix the symptom, not the cause
The scientific method demands understanding
→ Transition: Finally, let's talk about using AI for debugging...
AI + Scientific Debugging Workflow
From L13, we learned a 6-step workflow for AI collaboration. That same framework applies to debugging—AI can assist at each step of the scientific method, but YOU lead:
Step | You Do | AI Assists
1. Observe | Notice the bug, gather symptoms | "Explain what this error message means"
2. Hypothesize | Form theory about cause | "What could cause this behavior?"
3. Predict | Define what evidence to look for | "Generate test cases to isolate this"
4. Test | Run debugger, check values | "Create a Mermaid diagram of this call flow"
5. Analyze | Interpret results | "Does this match pattern X?"
6. Fix | Implement the solution | "Suggest a fix for this root cause"
Notice: YOU lead every step. AI accelerates, but doesn't replace your thinking.
Callback: How Hard Is It to Evaluate?
From L13: AI can generate a fix for any task, but evaluation difficulty varies:

Easy to Evaluate | Hard to Evaluate
Does the test pass now? ✓ | Did we fix the root cause or just the symptom?
Does the error message disappear? ✓ | Will this fix cause bugs elsewhere?
Does output match expected? ✓ | Is this the right fix or a workaround?

AI might "fix" your bug in a way that passes the test but breaks something else.
Your job: Understand the fix, not just accept it.
From L13 - evaluation difficulty:
AI can generate fixes for ANY bug
But some evaluations are trivial, others are hard
"Test passes" ≠ "Bug is correctly fixed"
Why this matters for debugging:
AI fix might introduce NEW bugs
AI fix might address symptom, not cause
AI fix might work for THIS test but fail edge cases
Only YOU can evaluate root cause
The trap:
Test was failing → AI suggested change → test passes → SHIP IT!
But did you understand WHY? Can you verify it's correct?
"It works" is easy to check; "It's the right fix" is hard
Key insight:
Use AI to generate hypotheses and explanations
But YOUR understanding determines if the fix is correct
Don't outsource the EVALUATION step
→ Transition: And just like L13 warned about "vibe coding", we have "vibe debugging"...
The Right Way to Use AI for Debugging The key distinction:
WRONG: "AI, find and fix my bug" → You learn nothing
RIGHT: "AI, help me understand this code" → You learn AND get useful artifacts
Why diagrams are perfect for AI:
Drawing is tedious, not educational
Understanding is educational, not tedious
Let AI do the tedious part
YOU do the understanding part
The verification step is crucial:
AI might hallucinate incorrect diagrams
Checking the diagram against code = learning
You either confirm understanding or catch AI errors
Either way, you win
The artifact value:
You now have a diagram you can reference
You can share it with teammates
You didn't spend 30 minutes in Mermaid syntax
Time saved on drawing → more time for understanding
→ Transition: Let's put all these tools together...
Summary: Program Understanding & Debugging
Debugging is integral to implementation and validation — not a separate phase
Control flow + data flow — trace execution paths AND how values change
Diagrams (call graphs, sequence, class) — visualize what code can't show linearly
Scientific method — observe, hypothesize, predict, test, analyze, iterate (same as L13's 6-step workflow!)
Debuggers — accelerate hypothesis testing without modifying code
AI assistants — use the same 6-step workflow; YOU must evaluate the fix
Avoid "vibe debugging" — if you can't explain the fix, you haven't fixed it
From L13: Task familiarity determines AI appropriateness. If you can't evaluate whether a bug fix is correct, you shouldn't accept AI's suggestion blindly.
From the syllabus: "The hardest parts of building software have never been typing code." Debugging is where understanding proves its value.
Key takeaways:
Debugging is a skill, not a talent
Systematic approach beats random guessing
Tools (debugger, AI) accelerate but don't replace thinking
Readable code pays dividends during debugging
Looking ahead:
Every assignment involves debugging
These skills will serve you throughout your career
Practice the scientific method until it's automatic
→ Transition: Questions?
VS Code Extensions for Mermaid:
Markdown Preview Mermaid Support — Renders Mermaid in preview
Mermaid Editor — Live editing with preview pane
Online Tools:
mermaid.live (live Mermaid editing, mentioned earlier)
Debugging Book:
Andreas Zeller's The Debugging Book (https://www.debuggingbook.org/), source of today's HTML example
IDE Debugger Guides:
IntelliJ: Help → Find Action → "Debug"
VS Code: Run → Start Debugging (F5)
Next Steps
Tomorrow: AI Coding Assistants Lab
Practice the prompting techniques from L13 and today
Thursday: HW3 Due
If you haven't started, start NOW
Recipe collections, JSON serialization, polymorphic types
Use the debugging techniques from today when you get stuck!
Wednesday: Testing
HW3 reminder:
This is the assignment with JsonRecipeCollectionRepository
The diagrams we showed today are directly relevant
If you're stuck on polymorphic serialization, re-watch the sequence diagram walkthrough
Lab tomorrow:
Hands-on practice with AI coding assistants
Try generating diagrams for your own code
Practice the 6-step workflow from L13
→ End of lecture.