---
title: Real-Time AI Reasoning & Coaching System
tagline: Six-component browser pipeline. Sub-second SOAR coaching nudges streamed live during interviews. Zero server-side data.
year: 2026
status: Live
order: 1
stack:
  - Next.js 15
  - TypeScript
  - Deepgram nova-3
  - Claude Sonnet 4.6
  - Dexie / IndexedDB
  - Zustand
  - Tailwind
highlights:
  - "~1s end-to-end latency from question asked to first nudge"
  - Local-first — all session data stays in IndexedDB
  - BYO API key — zero server-side credential storage
  - Live-tested in production on a real interview call
---
## What it is
A fully functional, local-first web application that provides real-time coaching during live interviews. The app listens to a live call (Zoom, Teams, or Meet), transcribes both speakers in real time, infers the interview question being asked, and instantly streams structured SOAR-format coaching nudges — all anchored to the specific Job Description, tailored resume, and behavioral story library uploaded at session start.
Designed to help candidates deliver executive-quality, metrics-anchored answers without sounding scripted.
## Problem it solves
- Interviewers ask unpredictable behavioral questions; candidates either over-prepare scripts or blank under pressure.
- Existing tools provide generic advice — not real-time, context-aware guidance anchored to the actual role and the candidate's own stories.
- High-stakes roles require metrics, story alignment, and sequencing — hard to recall and deliver live under pressure.
## Architecture
A six-component pipeline running entirely in the browser:
- Chrome tab audio capture — Web Audio API + AudioWorklet, 100ms frames
- PCM16 resampling to 16 kHz
- Deepgram nova-3 streaming WebSocket — ~300ms transcription latency, speaker diarization enabled
- Heuristic question detector — three signals: question mark, behavioral opener phrases, 1,500ms pause
- Claude Sonnet 4.6 via Server-Sent Events streaming
- SOAR-format nudge bullets rendered in real time
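The first two stages (AudioWorklet frames to 16 kHz PCM16) reduce to a clamp, scale, and decimate conversion. A minimal sketch, assuming a 48 kHz capture rate and simple integer decimation; the function name is illustrative, not the app's actual code:

```typescript
// Convert Float32 AudioWorklet frames (assumed 48 kHz) to 16 kHz signed
// 16-bit PCM for the Deepgram streaming socket. Plain decimation; a
// production path might low-pass filter first to avoid aliasing.
function floatTo16kPcm16(input: Float32Array, inputRate: number): Int16Array {
  const ratio = Math.floor(inputRate / 16000); // e.g. 48000 -> 3
  const out = new Int16Array(Math.floor(input.length / ratio));
  for (let i = 0; i < out.length; i++) {
    // Clamp to [-1, 1], then scale to the signed 16-bit range.
    const s = Math.max(-1, Math.min(1, input[i * ratio]));
    out[i] = s < 0 ? s * 0x8000 : s * 0x7fff;
  }
  return out;
}
```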
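The three-signal question detector in step 4 can be sketched as a small pure function. The opener phrase list, event shape, and function name here are assumptions for illustration, not the app's actual heuristics:

```typescript
// Illustrative behavioral-opener phrases; the real list is assumed longer.
const OPENERS = [
  "tell me about a time",
  "describe a situation",
  "walk me through",
  "give me an example",
];

interface TranscriptEvent {
  text: string;
  speaker: "interviewer" | "candidate";
  endMs: number; // timestamp when this utterance ended
}

// Fires on any of the three signals named above: a question mark, a
// behavioral opener phrase, or a >=1,500 ms pause after interviewer speech.
function isLikelyQuestion(ev: TranscriptEvent, nowMs: number): boolean {
  if (ev.speaker !== "interviewer") return false;
  const t = ev.text.trim().toLowerCase();
  if (t.endsWith("?")) return true;                    // signal 1: punctuation
  if (OPENERS.some((o) => t.includes(o))) return true; // signal 2: opener phrase
  return nowMs - ev.endMs >= 1500;                     // signal 3: pause
}
```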
## Key design decisions
- Local-first. All session data (transcript, nudges, knowledge base) stored in IndexedDB via Dexie — nothing persisted server-side.
- Per-session knowledge base. Job Description + tailored resume + story library uploaded per interview (PDF / DOCX / MD parsed client-side); replaceable mid-session without restarting.
- Two-layer state. Zustand for ephemeral live state (transcript, streaming nudge), Dexie for durable session history and export.
- BYO API key architecture. Anthropic and Deepgram keys held in localStorage, passed per-request — zero vendor lock-in, zero server-side credential storage.
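The BYO-key pattern above amounts to building each request with the key attached at call time, never stored on a server. A sketch under assumptions: the helper name is hypothetical, while the header names are Anthropic's documented ones for direct browser access:

```typescript
// Build the request options for a direct browser call to the Anthropic API.
// The key is read from localStorage by the caller and passed in per request.
function buildAnthropicInit(
  apiKey: string,
  body: unknown
): { method: string; headers: Record<string, string>; body: string } {
  return {
    method: "POST",
    headers: {
      "content-type": "application/json",
      "x-api-key": apiKey, // attached per call, never persisted server-side
      "anthropic-version": "2023-06-01",
      // Anthropic requires this opt-in header for browser-originated calls:
      "anthropic-dangerous-direct-browser-access": "true",
    },
    body: JSON.stringify(body),
  };
}
```

In the app this would pair with something like `fetch("https://api.anthropic.com/v1/messages", buildAnthropicInit(storedKey, payload))`, with `storedKey` pulled from localStorage at call time.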
## The SOAR nudge format
Every nudge is ≤120 words, bullets only, structured as:
- Lead — one-line sequencing cue (e.g., "Open with the $85M outcome, then frame the complexity")
- Story — the specific story from the library being referenced
- S — Situation — 1–2 bullets covering scale and organizational context
- O — Obstacle — 1–2 bullets covering complexity, stakes, or constraints
- A — Actions — 2–3 bullets in active verbs — what was specifically done
- R — Results — 1–2 bullets — always includes a real metric (%, $, headcount, time)
Hard rules: never invent a story, never contradict the resume, align vocabulary to the JD without parroting it.
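The format constraints above (word cap, bullets only, a real metric in Results) are mechanically checkable. A minimal guardrail sketch; the function name and regexes are illustrative assumptions:

```typescript
// Check a rendered nudge against the format rules: <=120 words, every line a
// bullet, and at least one concrete metric (%, $, or number + unit) present.
function validateNudge(nudge: string): { ok: boolean; reasons: string[] } {
  const reasons: string[] = [];
  const words = nudge.split(/\s+/).filter(Boolean);
  if (words.length > 120) reasons.push("over 120 words");
  const lines = nudge.split("\n").map((l) => l.trim()).filter(Boolean);
  if (!lines.every((l) => l.startsWith("-"))) reasons.push("non-bullet line");
  // Loose notion of a "real metric": percentage, dollar figure, or
  // number with a unit-like suffix (headcount, weeks, months, days, FTE).
  const metric =
    /(\d+(\.\d+)?\s*%)|(\$\s*\d)|(\d+\s*(headcount|weeks?|months?|days?|fte))/i;
  if (!metric.test(nudge)) reasons.push("no metric found");
  return { ok: reasons.length === 0, reasons };
}
```

The "never invent a story / never contradict the resume" rules are semantic and stay in the prompt; only the structural rules lend themselves to a check like this.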
## Status & outcomes
- Status — Live and tested in production on a real call.
- Latency — End-to-end from question asked to first nudge bullet streaming: approximately 1 second.
- Validated — Auto question-detection fires correctly on behavioral openers and pause signals; manual "Nudge me" override available at any moment.
- Roadmap — Web deployment, speaker identity calibration, post-session analytics.
## Why this matters
This project demonstrates applied AI product thinking — not a demo, not a tutorial follow-along, but a purpose-built tool solving a real problem with a production-grade architecture. It reflects how I approach transformation work: identify a high-friction moment, design a system that removes it, and build something that works under pressure. The same instinct drives how I modernize PMO operations at scale.