Bonus

How We Made This Course

We didn't just write a course about AI. We used AI to research what actually works when teaching people AI — then built the course around those findings.

The experiment

We built an AI tutor that teaches practical AI literacy, then gave a second AI (Claude Code) one job: make the tutor better. It could rewrite the teaching strategy, change the order, add or remove tools — anything. Then it tested the result against 5 simulated learners and kept what worked.

Rewrite tutor→Test on 5 learners→Grade 25 tasks→Keep or discard

Each loop: ~20 minutes, ~$2, ~150 API calls. 13 runs total.

The 5 test learners

Each simulated learner had a real personality, real skepticism, and real resistance. If the tutor couldn't teach all five, the strategy wasn't universal.

Priya, 58Mom — "Google works fine"

Marcus, 34Barber — "No time for this"

Maya, 28Illustrator — "AI is generic"

Jake, 20Student — "I already use it"

Bob, 65Retiree — "Don't want to look foolish"

The 5 skills tested

After each teaching session, every learner was tested on 5 skills. These became the 5 modules in Part 1 and Part 2 of this course.

Prompting→Module 1: Your First Useful Thing

Critical thinking→Module 2: When AI Lies

Debugging→Module 3: Fix It, Don't Restart

Persistence→Module 4: Make It Remember You

Building→Module 5: Build Your First Tool

13 experiments, 3 improvements

Most ideas made things worse. Only 3 out of 11 experiments actually improved the tutor.

base

exp1

exp2

exp3

exp4

exp5

exp6

exp7

exp8

exp9

exp4r

exp11

Full experiment log

#	Score	Pass	Status	What we tried
base	0.757	80%	keep	Baseline — original teaching prompt
exp1	0.817	88%	keep	Explicit tool vs prompt distinction
exp2	0.781	84%	discard	Tighter pacing, combine steps
exp3	0.862	92%	keep	Faster opening + explain AI mechanism during failures
exp4	0.897	96%	keep	Concrete worked example in build step
exp5	0.753	80%	discard	Active recall — learner explains back
exp6	0.471	52%	discard	Prescriptive 4-step framework
exp7	0.836	88%	discard	Faster pacing + combine steps 6&7
exp8	0.710	76%	discard	Vivid autocomplete analogy
exp9	0.671	72%	discard	Remove constraints section
exp4r	0.595	64%	variance	Re-run of exp4 — confirmed ±0.15 variance
exp11	0.713	76%	discard	Debug-via-contrast + payoff-first

What we learned

Concreteness wins

Every kept improvement added specificity. "Build a tool" failed. "Every week you [X], let's build..." worked. Abstract frameworks always regressed.

Constraints are load-bearing

Removing the "under 150 words" and "never skip the why" rules cratered the score to 0.671. Constraints aren't decoration — they shape behavior.

Verbosity always hurts

Every experiment that made the prompt longer scored worse. Analogies, frameworks, active recall — all added words, all regressed.

Don't rush the build step

Three experiments tried compressing the "build a tool" step. All failed. This is why Module 5 has three full examples before asking you to build.

Building is the hardest skill

Learners consistently confused "a good prompt" with "a tool." The input→output framing in Module 5 came directly from the experiment that scored 0.897.

Variance dominates signal

The same prompt scored 0.897 and 0.595 on consecutive runs. Most of our "failures" may have been noise. Reliable testing needs 3+ runs per variant.

How this shaped the course: Every module follows the same structure the winning prompt used — start with a real person's problem, show the solution, explain why it works, show where it breaks, let them practice. The order is non-negotiable: the research showed reordering always made things worse.

Runs

~$27

Total cost

0.897

Peak score

Ready to learn?

This course was built on evidence, not opinion. Start with Module 1.

Start the course →