Stanford Online

Stanford CS230 | Autumn 2025 | Lecture 6: AI Project Strategy

For more information about Stanford’s Artificial Intelligence professional and graduate programs, visit: https://stanford.io/ai

October 28, 2025

This lecture walks through example AI projects and the day-to-day decisions involved in building AI systems.

To learn more about enrolling in this course, visit: https://online.stanford.edu/courses/cs230-deep-learning

To follow along with the course schedule and syllabus, visit: https://cs230.stanford.edu/syllabus/

More lectures will be published regularly. View the playlist: https://www.youtube.com/playlist?list=PLoROMvodv4rNRRGdS0rBbXOUGA0wjdh1X

NOTE: There was no class on November 4, 2025 (Lecture 7). The next lecture is Lecture 8.

Andrew Ng
Founder of DeepLearning.AI
Adjunct Professor, Stanford University’s Computer Science Department

Kian Katanforoosh
CEO and Founder of Workera
Adjunct Lecturer, Stanford University’s Computer Science Department

Andrew Ng (host)
Nov 4, 2025 | 1h 15m

At a glance

WHAT IT’S REALLY ABOUT

Andrew Ng on speeding AI builds through disciplined error analysis loops

  1. Ng argues that team productivity in AI is dominated by iteration speed and decision quality, often creating 10× differences between teams building similar systems.
  2. Using on-device wake-word detection (“Robert, turn on”), he shows how practical constraints (edge compute, lack of public data) force teams toward targeted models, rapid literature review, and hands-on data collection.
  3. He demonstrates common failure modes like class imbalance and misleading accuracy metrics, then walks through fixes such as reweighting, duplicating positives, and broadening the positive time window to increase signal diversity.
  4. Ng explains why synthetic data is powerful but usually not the first step, highlighting distribution mismatch and limited diversity, while giving effective synthesis techniques like mixing clean speech with varied background noise and non-trigger phrases.
  5. For multi-stage systems (e.g., an LLM-based web “deep researcher”), he emphasizes pipeline-level error analysis to identify the true bottleneck, using manual inspection and spreadsheets to quantify which stage most often causes poor outputs.
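The pipeline-level error analysis in point 5 can be sketched as a simple tally. This is a minimal illustration rather than code from the lecture: the stage names and the hand-labeled rows below are hypothetical, standing in for a spreadsheet in which each poor output has been manually attributed to the stage that caused it.

```python
from collections import Counter

# Hypothetical stage names for a multi-stage web research pipeline (assumed).
STAGES = ["query_generation", "web_search", "page_selection", "summarization"]


def tally_failures(rows):
    """Count how often each stage was blamed for a poor final output.

    rows: list of (example_id, blamed_stage) pairs from manual inspection.
    Returns (stage, count, fraction) tuples, most-blamed stage first.
    """
    counts = Counter(stage for _, stage in rows)
    unknown = set(counts) - set(STAGES)
    if unknown:
        raise ValueError(f"unknown stage labels: {sorted(unknown)}")
    total = sum(counts.values())
    return [(stage, n, n / total) for stage, n in counts.most_common()]


# Made-up labels for illustration only; real rows come from reading outputs.
rows = [
    (1, "web_search"),
    (2, "summarization"),
    (3, "web_search"),
    (4, "page_selection"),
    (5, "web_search"),
]

report = tally_failures(rows)
for stage, n, frac in report:
    print(f"{stage:<18} {n:>2}  {frac:.0%}")
```

The first row of the report names the stage to work on next. With these invented labels that would be the search step, but the point is the counting procedure, not the specific result.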

IDEAS WORTH REMEMBERING

5 ideas

Speed beats perfect architecture choices early on.

Ng recommends building something in days (even if imperfect) and course-correcting, because fast iteration reveals the real bottlenecks sooner than extended upfront design debates.

Start with a literature + open-source sweep, not a deep read of one paper.

He advises skimming many resources quickly to map the space, then returning to the most promising/seminal ones—this finds strong baselines faster than sequential full-paper reading.

Don’t assume the data you need already exists—plan to collect it.

For custom phrases like “Robert, turn on,” there is no ready-made dataset, so teams must gather recordings (with consent) and design negatives that reflect realistic non-trigger speech.

Accuracy can be meaningless under class imbalance—measure what you actually care about.

The model reached ~97% accuracy by always predicting “no trigger”; the fix is to rebalance (duplicate or upweight positives, penalize false negatives, or reduce negatives) and to evaluate with detection-oriented metrics.
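A small sketch of why the accuracy number misleads. The 3-positive, 97-negative split is an assumption chosen to reproduce the ~97% figure, and the inverse-frequency weighting shown is one common reweighting scheme, not necessarily the one used in the lecture.

```python
def accuracy(y_true, y_pred):
    """Fraction of examples where the prediction matches the label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def recall(y_true, y_pred):
    """Fraction of true positives (label 1) that the model detected."""
    positives = sum(y_true)
    if positives == 0:
        return 0.0
    hits = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    return hits / positives


def inverse_frequency_weights(y_true):
    """Per-class loss weights: n_samples / (n_classes * class_count)."""
    n = len(y_true)
    n_pos = sum(y_true)
    n_neg = n - n_pos
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")
    return {0: n / (2 * n_neg), 1: n / (2 * n_pos)}


# Assumed toy data: 3 trigger frames, 97 non-trigger frames.
y_true = [1] * 3 + [0] * 97
# A degenerate model that always predicts "no trigger".
y_pred = [0] * 100

acc = accuracy(y_true, y_pred)
rec = recall(y_true, y_pred)
weights = inverse_frequency_weights(y_true)

print(f"accuracy = {acc:.2f}, recall = {rec:.2f}")
print(f"class weights = {weights}")
```

The always-negative model scores high on accuracy while detecting none of the trigger frames. The weights make each missed positive cost far more than each misclassified negative, which is the effect the rebalancing options are after.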

A small labeling/definition tweak can create more useful positives than naive duplication.

Expanding the “positive” label from a single instant to the 0.5–1 s following phrase completion increases both the number and the variety of positives, which improves learning while still matching acceptable product behavior (turning on slightly late is okay).
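The labeling change can be sketched as follows. The 10 ms frame hop, the 3 s clip length, and the function name are assumptions for illustration; only the 0.5–1 s window comes from the summary above. Times are kept in integer milliseconds to avoid floating-point edge cases at the window boundaries.

```python
def frame_labels(num_frames, hop_ms, phrase_end_ms, window_ms):
    """Return one 0/1 label per frame.

    A frame is positive when its start time falls inside
    [phrase_end_ms, phrase_end_ms + window_ms], both ends inclusive.
    """
    labels = []
    for i in range(num_frames):
        t_ms = i * hop_ms
        in_window = phrase_end_ms <= t_ms <= phrase_end_ms + window_ms
        labels.append(1 if in_window else 0)
    return labels


# Assumed setup: a 3 s clip, 10 ms hop, phrase finishing at 1.5 s.
instant = frame_labels(num_frames=300, hop_ms=10, phrase_end_ms=1500, window_ms=0)
widened = frame_labels(num_frames=300, hop_ms=10, phrase_end_ms=1500, window_ms=500)

print(sum(instant))  # 1 positive frame
print(sum(widened))  # 51 positive frames
```

Under these assumptions one clip goes from a single positive frame to 51, each with slightly different audio context, which plain duplication of the same frame would not provide.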

WORDS WORTH SAVING

5 quotes

But even beyond understanding how the algorithms work, what really drives performance is a team's ability to have an efficient development process.

Andrew Ng

The skill in making those decisions is what often makes a massive literally 10X difference in productivity.

Andrew Ng

I find that, um, of all of these ideas, I think some are better than others, but it doesn't... But, but whether the, the idea is, you know, a bit better or a little bit worse, it is important, but it's actually secondary to how quickly you can just get something built.

Andrew Ng

So this is the kind of stuff that happens in real life, right? And, and by the way, I'm sharing these stories not, you know, just to entertain you, though hopefully you're entertained, but because I think of this by living these experiences that you, you know, go, "Oh, I could see this problem."

Andrew Ng

In contrast, when you're building machine learning system, it's much more like I don't know what's gonna happen next, right? ... And so the workflow of machine learning feels much more like debugging than development.

Andrew Ng

AI iteration speed as competitive advantage
Wake word/trigger word detection on edge devices
Data collection strategies and consent/privacy
Class imbalance and misleading accuracy metrics
Regularization, overfitting, and “more data” remedies
Synthetic data: risks, knobs, and effective mixing methods
Pipeline error analysis for LLM agent/research systems

High quality AI-generated summary created from speaker-labeled transcript.
