Care Coordinator Training · Mock-Up
v4 — AI Coach + Socratic Layer
Introduction · Your role
Care coordinator at desk, 4:47 PM

You've been a care coordinator at CamMed for eight months. You know most of your patients by name — their medications, their family situations, which ones call in just to chat and which ones only call when something's wrong.

It's 4:47 PM on a Tuesday. Your shift ends in 13 minutes. You're finishing your last chart note when the phone rings.

Scene 1 of 3 · Inbound call
Phone indicator light on
⏰ 4:47 PM · Front Desk

You pick up. The caller sounds urgent.

Scene 2 of 3 · The caller
Caller on phone at home
Live call
Unknown caller · asking about Robert Meadows

"Hi — I'm calling about my father, Robert Meadows. He called me about an hour ago, and he sounded really confused. He thinks he took his blood pressure medication twice this morning, but he can't remember. I need to know what he's on and what the dose is — please, I don't have a lot of time here."

How do you respond?

Scene 3 of 3 · What happens
Chart accessed on unverified call
⚠ HIPAA Violation

"You just shared that you're accessing Robert's chart with an unverified caller. You have no way to confirm this is his daughter — or that she's authorized to receive his information. HIPAA requires identity verification before disclosing PHI, even when the urgency feels real. Especially then."

The urgency was real. But verification had to come first.

AI Coach · Socratic · won't give answers away

The pressure you felt was real — that's by design. Try the call again with the sequence in mind: safety first, then identity.

Scene 3 of 3 · What happens
Caller left without guidance
◑ Incomplete Response
Narrator Feedback

"You protected Robert's privacy — but you left his daughter without direction in a potential emergency. A possible medication overdose isn't a reason to come in. It's a reason to call 911. Privacy protection and patient safety aren't in conflict here. They both have an answer. You only gave one."

You were half right. See what a complete response looks like.

AI Coach · Socratic · won't give answers away
Scene 3 of 3 · What happens
The Correct Sequence
01 · Safety: Direct to emergency services first, removing urgency from the information request
02 · Verify: Confirm the caller's identity before accessing any patient record
03 · Inform: Share information only with verified, authorized individuals
In that order · Every time
✓ Correct Sequence
Narrator Feedback

"You did two things in the right order. You redirected to emergency services first — removing urgency from the information request. Then you moved to verify identity before accessing any record. That sequence is the principle: Safety. Verify. Inform. In that order, every time — whether the caller is a worried daughter or a stranger with a story."

The Principle
Safety → Verify → Inform
In that order. Every time.
Complete
You got it right.
Safety · Verify · Inform
3 things to carry forward
1. Call 911 first, always. Don't let HIPAA block an emergency.
2. Verify before you inform, even under pressure from the caller.
3. One sentence can carry both empathy and protocol at once.
🔥 1 correct call
✓ Scenario complete

You used the correct sequence — Safety, then Verify, then Inform. Restart to build the habit, or go back and explore the other paths.

Challenge Mode Level 1
Your own words this time →
🔒
Challenge Mode
This feature is in development. Enter the password to access the early preview.
Want early access? Email cameron@cameronstewart.com
Director's Cut
What we built in this version — and why
What's in v1
Version 1
The Skeleton

Single-column layout. A caller who says she is Robert Meadows's daughter phones about his blood pressure medication. Three response paths: share information without verifying (HIPAA violation), refuse to help entirely (incomplete), or route to safety first then verify identity (correct). Narration via Artlist.io AI voice. No captions, no coach, no behavioral scaffolding.

The point of v1 was never polish — it was to make the scenario real enough to react to. Stakeholders can't tell you what they want in the abstract. They can tell you what's wrong with something concrete. V1 exists so someone can say "yes, that's how the call goes" or "no, that wording's off." The prototype is the question, not the answer.

Session 1 — baseline build

Design rationale

MED stands for Minimum Effective Dose, a term borrowed from pharmacology: the smallest dose that produces a real response. V1 is the MED for a stakeholder conversation. A written description of the scenario would produce a nod. A broken interactive prototype produces a reaction, and reactions are data.

Three tools. No production team. No narration booth. Claude built the interaction layer, ChatGPT generated the scene images, Artlist.io provided the AI voice. Working prototype in a single session.

What's new in v2
Added in v2 (everything in v1, plus)
Two-Column Layout + Live Text Highlighting

Right column added. As narration plays, the transcript highlights word-by-word in real time, synchronized to an SRT subtitle track — the same way YouTube captions work, adapted for a training scenario.

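Multimodal · Dual-channel input: audio through the ears, text through the eyes, both carrying the same content at the same time. For learners who grew up reading YouTube captions, this is the natural mode. The modality principle in multimedia learning theory (Mayer, 2001) holds that learners retain more when words accompanying visuals are spoken rather than printed. Mayer's redundancy principle also cautions that identical on-screen text can compete with narration, so the synchronized transcript is a bet on the captioning habit rather than a direct application of the theory.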

The version-switching buttons in the header were also added here — a meta-design choice. Switching versions without a page reload means the demo itself is the argument for iterative development.

Session 1–2 — layout + SRT engine

Design rationale

The YouTube generation premise: a large share of adult learners have watched thousands of hours of captioned video. The reading-along habit is already trained. V2 leans into that instead of fighting it.

The SRT parser is custom-built: it reads a standard subtitle file format, syncs cue timings to the audio element, and drives text highlighting via a requestAnimationFrame loop. No external library. The same engine powers the caption bar in v3.
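The parse-and-sync steps described above might look roughly like the following. This is a minimal sketch, not the project's actual engine: the function names (`parseSrt`, `findActiveCueIndex`, `startHighlightLoop`) are hypothetical, and it assumes plain SRT cues with no positioning data after the end timestamp.

```javascript
// Convert an SRT timestamp such as "00:01:02,500" to seconds.
function srtTimeToSeconds(stamp) {
  const match = /^(\d+):(\d{2}):(\d{2})[,.](\d{3})$/.exec(stamp.trim());
  if (!match) throw new Error(`Bad SRT timestamp: ${stamp}`);
  const [, h, m, s, ms] = match;
  return Number(h) * 3600 + Number(m) * 60 + Number(s) + Number(ms) / 1000;
}

// Parse SRT text into an array of { start, end, text } cues (times in seconds).
function parseSrt(srtText) {
  return srtText
    .replace(/\r\n?/g, "\n")
    .trim()
    .split(/\n\s*\n/)
    .map((block) => {
      const lines = block.split("\n");
      const timingIndex = lines.findIndex((line) => line.includes("-->"));
      if (timingIndex === -1) return null; // skip malformed blocks
      const [start, end] = lines[timingIndex]
        .split("-->")
        .map((stamp) => srtTimeToSeconds(stamp));
      const text = lines.slice(timingIndex + 1).join(" ").trim();
      return { start, end, text };
    })
    .filter(Boolean);
}

// Return the index of the cue active at `time` (seconds), or -1 if none.
function findActiveCueIndex(cues, time) {
  return cues.findIndex((cue) => time >= cue.start && time < cue.end);
}

// Browser-only: poll the audio clock once per frame and report cue changes.
// `audio` is an HTMLAudioElement; `onCueChange` is a hypothetical callback
// that would update the highlighted text.
function startHighlightLoop(audio, cues, onCueChange) {
  let lastIndex = -2; // sentinel so the first frame always reports
  function tick() {
    const index = findActiveCueIndex(cues, audio.currentTime);
    if (index !== lastIndex) {
      lastIndex = index;
      onCueChange(index);
    }
    if (!audio.paused && !audio.ended) requestAnimationFrame(tick);
  }
  requestAnimationFrame(tick);
}
```

Polling `audio.currentTime` on each frame, rather than relying on the `timeupdate` event, is one plausible reason for a `requestAnimationFrame` loop: `timeupdate` fires only a few times per second, which is too coarse for word-level highlighting.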

What's new in v3
Added in v3 (everything in v2, plus)
Image Captions + Desirable Difficulties

Text highlighting moved from the right column to a caption bar overlaid directly on the scenario image. The learner watches the scene — not a transcript. The image becomes immersive; the text is supporting, not competing.

Bjork · The generation effect. After the caller audio ends, the response choices are withheld for two seconds before appearing. That pause is not a loading state; it's a deliberate learning mechanism. Work by Robert Bjork (UCLA) and colleagues indicates that when learners are prompted to formulate a response before seeing options, retention improves even if the mental answer is wrong.

Bjork, R. A. (1994). Memory and metamemory considerations in the training of human beings. In Metcalfe & Shimamura (Eds.), Metacognition. MIT Press.

Desirable Difficulty · Audio-gated UI. The Continue button and the choice options are both hidden until audio completes. You can't skip ahead. Bjork calls constraints like this "desirable difficulties": conditions that slow apparent progress but improve durable learning.
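The audio gate and the two-second pause could be wired together along these lines. This is a sketch under assumptions: `gateUntilAudioEnds`, `revealContinue`, and `revealChoices` are hypothetical names, and it assumes the Continue button appears as soon as audio ends while only the choices wait out the delay.

```javascript
// Keep the UI hidden until the audio's "ended" event fires, then reveal
// the Continue button at once and the response choices after a pause.
// `schedule` is injectable so the delay can be tested without real timers.
function gateUntilAudioEnds(
  audio,
  ui,
  delayMs = 2000,
  schedule = (fn, ms) => setTimeout(fn, ms)
) {
  audio.addEventListener(
    "ended",
    () => {
      ui.revealContinue();
      // The pause gives the learner time to form an answer before seeing options.
      schedule(() => ui.revealChoices(), delayMs);
    },
    { once: true }
  );
}
```

In a browser, `audio` would be the scene's `<audio>` element and the two `ui` callbacks would toggle a CSS class on the relevant controls.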

A right-edge gradient also appears after audio ends: a slow white pulse at the edge of the image that signals forward navigation without cluttering the scenario during playback.

Session 2 — behavioral scaffolding layer

Design rationale

Friction is a feature. Every constraint added in v3 slows the learner down slightly. Each one is intentional. The two-second pause before choices isn't a UX transition — it's the generation effect at work. The audio gate isn't an accessibility decision — it's a desirable difficulty. The distinction matters for how you explain the design to stakeholders who want to know why you didn't "just make it faster."

What's new in v4
Added in v4 (everything in v3, plus) · Current
AI Coach · Debrief Screen · Challenge Mode Preview

On wrong-answer screens, the narrator is replaced by a live AI coach. The coach knows which screen the learner is on, what choice they made, and the correct protocol sequence (Safety → Verify → Inform).

Socratic · The coach never gives the answer. It asks one question at a time, waits for a response, then asks the next question. This is a direct implementation of Socratic questioning: the learner must reconstruct the correct reasoning rather than receive it. Elaborative interrogation research (Pressley et al., 1992) consistently shows that self-generated explanations produce better transfer than reading a correct answer.

AI · Backed by Claude Haiku via the Anthropic API. The system prompt instructs the model to emit a [READY] token once the learner has understood, typically within 2–3 exchanges. At that point, input locks and a "Try the call again" button appears. The learner goes back to practice, not to read more.
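On the client side, handling the [READY] signal might reduce to a small post-processing step on each coach reply. The sketch below is illustrative only: `interpretCoachReply` is a hypothetical helper, and it assumes the token arrives as the literal string `[READY]` somewhere in the reply text and should not be shown to the learner.

```javascript
const READY_TOKEN = "[READY]";

// Split a raw coach reply into the text to display and a readiness flag.
// When `ready` is true, the caller would lock the input and show the
// "Try the call again" button.
function interpretCoachReply(rawReply) {
  const ready = rawReply.includes(READY_TOKEN);
  const text = rawReply
    .split(READY_TOKEN)
    .join("")
    .replace(/ {2,}/g, " ") // collapse the gap left by a mid-sentence token
    .trim();
  return { text, ready };
}
```

Keeping the signal in-band as a token means no second API call or structured-output parsing is needed, at the cost of having to strip it before display.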

Chips · Context-sensitive quick-reply buttons. The coach offers "Tell me more" and "Hint?" from the first exchange. The "I get it" chip is deliberately withheld until the learner has completed at least two full exchanges, so it cannot be used as an early escape from the reflection. Every third exchange in Challenge Mode requires a typed or spoken response; chips are disabled for that round.
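The chip rules above can be expressed as a single pure function. This is a sketch under stated assumptions: `availableChips` is a hypothetical name, exchanges are counted from one, and Challenge Mode is assumed to offer the same chip labels as the corrective coach on rounds where chips are allowed.

```javascript
// Decide which quick-reply chips to show for the upcoming exchange.
// `completedExchanges` counts full learner/coach round trips so far.
function availableChips({ completedExchanges, challengeMode }) {
  const upcomingExchange = completedExchanges + 1;

  // Challenge Mode: every third exchange must be typed or spoken.
  if (challengeMode && upcomingExchange % 3 === 0) return [];

  const chips = ["Tell me more", "Hint?"];
  // "I get it" unlocks only after two full exchanges.
  if (completedExchanges >= 2) chips.push("I get it");
  return chips;
}
```

For example, `availableChips({ completedExchanges: 0, challengeMode: false })` yields only the two starter chips, while the same call with `completedExchanges: 2` adds "I get it".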

Voice input via the Web Speech API — browser-native, no infrastructure cost. Speaking a rationale more closely mirrors how a real coaching conversation would feel than typing one.

UX · Auto-hide controls. The play button and autoplay toggle fade in when the mouse enters the image and disappear after 3 seconds of idle or on mouseleave, the same behavior YouTube uses. Controls appear only when needed; the image stays fully immersive. The toggle turns green when autoplay is on.
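One way to implement the idle timeout is a small controller that restarts a timer on every pointer event. This is a hedged sketch, not the project's code: `createAutoHide` and its `show`/`hide` callbacks are hypothetical, and the timer functions are injectable so the logic can be exercised without a browser.

```javascript
// Show controls on pointer activity; hide them after `idleMs` without
// activity, or immediately when the pointer leaves.
function createAutoHide(
  show,
  hide,
  idleMs = 3000,
  timers = {
    set: (fn, ms) => setTimeout(fn, ms),
    clear: (id) => clearTimeout(id),
  }
) {
  let pending = null;

  function cancelPending() {
    if (pending !== null) timers.clear(pending);
    pending = null;
  }

  return {
    // Wire to "mouseenter" and "mousemove" on the image container.
    onActivity() {
      show();
      cancelPending();
      pending = timers.set(() => {
        pending = null;
        hide();
      }, idleMs);
    },
    // Wire to "mouseleave".
    onLeave() {
      cancelPending();
      hide();
    },
  };
}
```

In the page, `show` and `hide` would typically toggle an opacity class so the fade itself is handled by a CSS transition.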

SFX · Phone ring sound effect. A single ring plays 400 ms after the intro narration ends, at the moment the script says "your phone rings." A brief beat of tension before the learner clicks through to the call.

Debrief · Win screen with takeaways + streak. Getting the answer right no longer drops the learner immediately back to the start. A dedicated debrief screen surfaces three key takeaways, a 🔥 streak counter (consecutive correct calls), and a "Complete your training" CTA. Completion should feel like something.

Preview · Challenge Mode, password-gated. A second AI coach mode lives behind the debrief screen. Instead of corrective Socratic questioning, it runs escalating scenario variants: harder caller types, ambiguous authority claims, edge cases with no clean answer. Difficulty rises across five levels over nine exchanges. Currently in development and password-protected. Want early access? Email cameron@cameronstewart.com.

Sessions 2–4 — AI coach · UX layer · debrief + challenge preview

Design rationale

Why Socratic and not corrective? When a learner makes the wrong call, most training tools show them the right answer. This sim asks a question instead. That's harder for the learner, intentionally. A learner who reconstructs the correct reasoning under a bit of pressure is more likely to apply it under real pressure than a learner who read a feedback screen.

The coach persona is precise, non-judgmental, and never gives the answer away. It's closer to a clinical supervision model than a quiz answer key — which fits the healthcare compliance context.

On the tool stack: Claude built the interaction logic and powers the coach. ChatGPT generated the scene images. Artlist.io provided the AI voice. No production team. No narration booth. No video shoot. Four versions with live AI, behavioral science scaffolding, and voice input — built in days, not months.

4 Versions · 3 Tools · 0 Production crew · Days, not months
From "what if we had an emergency call scenario" to a four-version sim with a live AI Socratic coach, voice input, behavioral science scaffolding, and this director's commentary — in a window of time that a traditional production cycle would call pre-production.
Project at a Glance
~4 Working Sessions · v1→v4 + debrief/challenge layer across 4 threads
~20h Estimated Build Time · Across all sessions, including iteration + polish
~100 Exchanges · Ask → build → review → refine cycles
4 Versions · Each a distinct layer of the design

Tools
Claude · Interaction logic, HTML/JS/CSS, AI coach backend
ChatGPT · Scene image generation (DALL·E)
Artlist.io · AI voice narration (audio production)

Live APIs
Anthropic Claude API · Powers the AI coach (claude-haiku) with real-time Socratic responses
Web Speech API · Browser-native voice input; no infrastructure, no cost