How Metaview built self-improving prompts for application review

At Metaview, we help recruiters sift through thousands of resumes a day. Most evaluation systems set the criteria upfront and rebuild every time preferences change. We built one that learns from every decision recruiters make and evolves with them.

May 22, 2026 · 16m

CHAPTERS

0:13 – 1:55
Why application review needs to change: AI-driven surge in job applicants
Nick Mayhew sets the stage with the core problem Metaview is solving: application volume has exploded since LLMs became widely available. Lower effort to apply (and to write longer answers) creates overwhelming load for recruiters and hiring teams.
- LLMs lowered the barrier to applying, increasing submission volume
- Real example: 2,740 applications for one role in 24 hours
- Remote and junior roles are especially affected
- Applicant answers are ~50% longer, largely due to AI assistance
1:55 – 2:56
Stakeholder requirements drift: hiring preferences evolve mid-process
Metaview starts by interviewing hiring managers for evaluation criteria, but quickly runs into shifting requirements. The talk highlights that when human judgment drives decisions, preferences change as managers see more candidates and conduct interviews.
- Initial requirements (e.g., years of experience) change after reviewing early CVs
- New constraints appear later (startup experience, “zero-to-one” work)
- Rewriting prompts/evaluation logic repeatedly becomes costly and slow
- Core insight: systems must be built to adapt to evolving preferences
2:56 – 3:27
Design principle: prompts must evolve as a first-class system feature
Nick emphasizes that prompt evolution shouldn’t be an afterthought. If users are central to the decision, the product must assume and accommodate continual preference changes.
- User preferences will evolve; prompts must evolve with them
- Don’t bolt on adaptation at the end; make it foundational
- Keep user judgment at the forefront of decision-making
- Treat the system as responsive to real-world learning cycles
3:27 – 3:57
Metaview’s workflow overview: redact candidates, compare to an ideal candidate profile
The system begins by redacting personally identifiable information and then evaluates candidates against an ideal candidate profile (ICP). The ICP is the self-improving artifact that updates as user decisions accumulate.
- Redaction removes name/contact info to focus on qualifications
- Candidates are matched to an ICP to generate evaluations
- ICP is the main self-improving prompt/document
- Output supports review rather than making final decisions
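The redaction step can be illustrated with a minimal sketch. The talk does not describe how Metaview implements redaction, so the `redact` function, its regex patterns, and the placeholder tokens below are all assumptions made for illustration; real resumes would need much broader coverage (addresses, links, photos, and so on).

```python
import re

# Hypothetical patterns for illustration only; not Metaview's implementation.
EMAIL_RE = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")
PHONE_RE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact(resume_text: str, candidate_name: str) -> str:
    """Replace the candidate's name, email addresses, and phone numbers with placeholders."""
    text = re.sub(re.escape(candidate_name), "[NAME]", resume_text, flags=re.IGNORECASE)
    text = EMAIL_RE.sub("[EMAIL]", text)
    text = PHONE_RE.sub("[PHONE]", text)
    return text


sample = "Jane Doe, jane.doe@example.com, +1 (555) 010-9999. Built data pipelines in Python."
redacted = redact(sample, "Jane Doe")
print(redacted)
```

The redacted text, rather than the raw resume, is what would then be compared against the ICP.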
3:57 – 4:58
Human-in-the-center: the LLM as apprentice, not decision-maker
Nick distinguishes “human in the loop” from “human in the center.” The model helps summarize and detect relevant evidence, while the user makes the accept/reject decision and remains the authority.
- System spots signals (companies, technologies, experience) to reduce grunt work
- User decides progression or rejection; model does not overrule
- High-risk domain requires deference to human judgment
- System’s role is assistance and evidence gathering
4:58 – 5:28
Learning from decisions: an ICP agent that observes user patterns
Metaview adds an agent layer that watches user actions (progress/reject/edit feedback) and updates the ICP accordingly. This turns day-to-day recruiter decisions into structured improvements of evaluation criteria.
- Agent observes rejections, progressions, written feedback, and manual ICP edits
- Patterns in decisions drive ICP updates over time
- The ICP manager agent’s job: keep the ICP current
- Self-improvement is anchored in real user behavior
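One way to picture the signals the agent observes is as a small event record per recruiter action. The schema below (`DecisionEvent`, `summarize_signals`, and the action names) is a hypothetical sketch, not Metaview's data model.

```python
from collections import Counter
from dataclasses import dataclass
from typing import Literal, Optional

Action = Literal["progress", "reject", "icp_edit"]


@dataclass(frozen=True)
class DecisionEvent:
    """One recruiter action the agent can learn from (hypothetical schema)."""
    candidate_id: str
    action: Action
    feedback: Optional[str] = None  # free-text comment, if the recruiter left one


def summarize_signals(events: list[DecisionEvent]) -> dict:
    """Condense raw events into action counts plus the written feedback."""
    counts = Counter(event.action for event in events)
    notes = [event.feedback for event in events if event.feedback]
    return {"counts": dict(counts), "feedback": notes}


events = [
    DecisionEvent("c1", "reject", "too junior"),
    DecisionEvent("c2", "progress"),
    DecisionEvent("c3", "reject", "not enough Python"),
]
summary = summarize_signals(events)
print(summary)
```

A summary like this is the kind of input an ICP-updating agent could reason over, instead of reading every raw event.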
5:28 – 6:58
Adding context: using candidate history to interpret feedback accurately
User feedback is often relative to the last candidates seen, so Metaview retrieves the relevant resume context to interpret comments like “too junior” or “not enough Python.” A specialized tool helps search unstructured candidate data more effectively than basic grep-style approaches.
- Feedback is relative; agents need the right surrounding context
- Tooling retrieves past redacted resumes to ground feedback
- Specialized “Query Files” tool handles unstructured profiles better than Bash/grep
- Context helps translate vague feedback into actionable ICP updates
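To show why surrounding context matters, here is a minimal sketch that pairs a piece of feedback with the most recently reviewed redacted resumes. It does not reproduce the “Query Files” tool, whose internals the talk does not detail; `Review`, `build_feedback_context`, and the `window` parameter are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class Review:
    """A previously reviewed candidate (hypothetical record)."""
    candidate_id: str
    redacted_resume: str
    decision: str  # "progress" or "reject"


def build_feedback_context(history: list[Review], feedback: str, window: int = 3) -> str:
    """Attach the last `window` reviewed candidates so relative feedback can be grounded."""
    recent = history[-window:]
    lines = [
        f"Feedback to interpret: {feedback!r}",
        "Recently reviewed candidates (oldest first):",
    ]
    for review in recent:
        lines.append(f"- {review.candidate_id} ({review.decision}): {review.redacted_resume}")
    return "\n".join(lines)


history = [
    Review("c1", "8 years backend, Go and Python", "progress"),
    Review("c2", "5 years data engineering, Python", "progress"),
    Review("c3", "1 year internship, JavaScript", "reject"),
]
context = build_feedback_context(history, "too junior", window=2)
print(context)
```

With the neighbouring candidates in view, "too junior" can be read relative to the five to eight years of experience the recruiter had just progressed.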
6:58 – 7:29
Why this is a workflow + agent (not one giant agent): cost and token efficiency at scale
Nick explains the architectural choice: processing thousands of applications requires careful token and cost control. The evaluation workflow is optimized, while the ICP-updating agent runs on top to learn from outcomes.
- At high volume, you can’t “throw everything” into one agent call
- Business constraints: avoid spending large amounts per role/applicant batch
- Workflow handles repeated evaluations efficiently; agent handles learning
- Architecture balances cost, latency, and intelligence
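The cost argument can be made concrete with back-of-the-envelope arithmetic. Only the application count (2,740, from the example earlier in the talk) comes from the source; the token counts and per-token prices below are placeholder assumptions, not real pricing or Metaview's figures.

```python
def batch_cost_usd(n_calls: int, tokens_per_call: int, usd_per_million_tokens: float) -> float:
    """Estimate the cost of a batch of LLM calls from token volume and unit price."""
    return n_calls * tokens_per_call * usd_per_million_tokens / 1_000_000


N_APPLICATIONS = 2_740  # from the talk: one role, 24 hours

# Placeholder assumption: a lean workflow call sees one resume plus the ICP.
workflow_cost = batch_cost_usd(N_APPLICATIONS, tokens_per_call=3_000, usd_per_million_tokens=1.0)

# Placeholder assumption: an open-ended agent call drags in 10x the context
# on a pricier model.
agent_cost = batch_cost_usd(N_APPLICATIONS, tokens_per_call=30_000, usd_per_million_tokens=3.0)

print(f"workflow: ${workflow_cost:.2f}  agent per application: ${agent_cost:.2f}")
```

Under these made-up numbers the per-application agent is roughly 30 times more expensive, which is the shape of the trade-off that motivates keeping evaluation in a lean workflow and reserving the agent for learning.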
7:29 – 8:29
What an ICP looks like: markdown prose over rules, weights, and keywords
Metaview represents the ideal candidate profile as a natural-language markdown document. Instead of weightings, flowcharts, or keyword-matching, the system relies on LLM strength in prose reasoning and mirrors how recruiters describe roles.
- ICP is a plain text/markdown document (role summary + criteria)
- Avoid weightings/if-statements/flowcharts and brittle keyword matching
- Let users express priorities in natural language
- Prose-based criteria better reflect real hiring judgment
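As a rough illustration of a prose-first ICP, the sketch below renders one as markdown and wraps it in an evaluation prompt. The section names follow the structure shown in the product demo (role summary, must-haves, nice-to-haves, red flags); the function names, the sample criteria, and the prompt wording are invented for illustration.

```python
def render_icp(role_summary: str, must_haves: list[str],
               nice_to_haves: list[str], red_flags: list[str]) -> str:
    """Render an ideal candidate profile as a plain markdown document."""
    def section(title: str, items: list[str]) -> str:
        bullets = "\n".join(f"- {item}" for item in items)
        return f"## {title}\n{bullets}"

    parts = [
        f"## Role summary\n{role_summary}",
        section("Must-haves", must_haves),
        section("Nice-to-haves", nice_to_haves),
        section("Red flags", red_flags),
    ]
    return "\n\n".join(parts)


def build_evaluation_prompt(icp_markdown: str, redacted_resume: str) -> str:
    """Combine the ICP and a redacted resume into one prompt (hypothetical wording)."""
    return (
        "Assess the candidate against the ideal candidate profile below. "
        "Cite evidence from the resume; the recruiter makes the final decision.\n\n"
        f"{icp_markdown}\n\n## Candidate resume\n{redacted_resume}"
    )


icp = render_icp(
    role_summary="Senior backend engineer for an early-stage product team.",
    must_haves=["Has shipped production services in Python"],
    nice_to_haves=["Startup experience, ideally zero-to-one work"],
    red_flags=["Claims that are not backed by concrete projects"],
)
prompt = build_evaluation_prompt(icp, "[NAME]. Built data pipelines in Python.")
print(prompt)
```

Note that there are no weights or keyword lists here: priorities live in the wording of each criterion, which a recruiter can edit directly.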
8:29 – 10:00
Model choice for resume evaluation: critical reasoning and right-sizing intelligence
Nick explains Metaview’s choice of Claude models in terms of realism: resumes often contain exaggeration, so the model must reason skeptically. They use smaller, faster models for high-volume scoring and more capable models for pattern discovery and ICP updates.
- CVs contain fluff; model must reason critically, not accept claims at face value
- Haiku used for high-volume evaluations to control cost/latency
- Sonnet used for less constrained pattern-finding and ICP improvement
- Token limits and throughput matter when processing thousands daily
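The right-sizing idea amounts to routing each task type to a model tier. The sketch below is an assumed shape for that routing; the tier labels `"haiku"` and `"sonnet"` mirror the talk and are not API model identifiers, and `Task` and `pick_model_tier` are hypothetical names.

```python
from enum import Enum


class Task(Enum):
    EVALUATE_RESUME = "evaluate_resume"  # runs thousands of times a day
    UPDATE_ICP = "update_icp"            # runs occasionally, needs deeper reasoning


# Tier labels only; map these to concrete model IDs in real configuration.
MODEL_TIER = {
    Task.EVALUATE_RESUME: "haiku",
    Task.UPDATE_ICP: "sonnet",
}


def pick_model_tier(task: Task) -> str:
    """Return the model tier assigned to a task type."""
    return MODEL_TIER[task]


print(pick_model_tier(Task.EVALUATE_RESUME), pick_model_tier(Task.UPDATE_ICP))
```

Keeping the mapping in one place makes it easy to revisit the choice per task as cost, latency, or quality requirements change.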
10:00 – 11:00
Live product walkthrough: viewing fit scores and the ICP structure in the UI
Nick demos Metaview’s interface showing candidates, their “ICP fit,” and the ICP document itself. The ICP is organized into role summary, must-haves, nice-to-haves, and red flags to align with recruiter mental models.
- UI lists candidates with fit assessments against the ICP
- ICP sections: role summary, must-have, nice-to-have, red flags
- Structure mirrors how recruiters naturally evaluate
- Demonstrates transparency into what the system is optimizing for
11:00 – 14:05
Feedback loop in action: progressing a candidate updates the ICP via agent tools
Nick progresses a candidate and submits feedback (e.g., valuing strong engineering-company backgrounds). The agent (running in LangChain/LangGraph) ingests the decision, reasons over context, and proposes an ICP update via an “Upsert ICP” tool call.
- User action + written feedback becomes a learning signal
- Agent run shows reasoning and tool selection in logs
- “Upsert ICP” updates the profile based on observed preference
- Demonstrates human approval/editing of suggested changes
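The propose-then-approve flow around an ICP update can be sketched in plain Python. This is not the actual “Upsert ICP” tool or any LangChain/LangGraph API; `IcpProposal`, `apply_proposal`, and the section keys are hypothetical, and the sketch only shows the guardrail that nothing changes without the user's approval.

```python
from dataclasses import dataclass

SECTIONS = ("must_have", "nice_to_have", "red_flags")


@dataclass(frozen=True)
class IcpProposal:
    """A change suggested by the agent (hypothetical shape)."""
    section: str
    criterion: str
    rationale: str


def apply_proposal(icp: dict[str, list[str]], proposal: IcpProposal,
                   approved: bool) -> dict[str, list[str]]:
    """Return an updated copy of the ICP only if the user approved the proposal."""
    if proposal.section not in SECTIONS:
        raise ValueError(f"unknown ICP section: {proposal.section}")
    if not approved:
        return icp  # the user stays in control; nothing changes
    updated = {name: list(items) for name, items in icp.items()}
    criteria = updated.setdefault(proposal.section, [])
    if proposal.criterion not in criteria:  # upsert: avoid duplicate criteria
        criteria.append(proposal.criterion)
    return updated


icp = {"must_have": ["Production Python experience"], "nice_to_have": [], "red_flags": []}
proposal = IcpProposal(
    section="nice_to_have",
    criterion="Background at companies with strong engineering cultures",
    rationale="Recruiter progressed a candidate and cited this in feedback.",
)
rejected = apply_proposal(icp, proposal, approved=False)
accepted = apply_proposal(icp, proposal, approved=True)
print(accepted["nice_to_have"])
```

Returning a copy rather than mutating in place keeps the previous ICP available, which is convenient if the user wants to compare or revert.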
14:05 – 14:35
Scaling the learning: pattern-based updates and re-evaluations over many decisions
Nick notes that the ICP shouldn’t change from a single datapoint; updates become robust only after patterns accumulate across many decisions (e.g., 100–200). Once updated, candidates can be re-evaluated efficiently in bulk.
- Avoid overfitting ICP to one piece of feedback
- Look for stable patterns across many progress/reject decisions
- Updated ICP enables improved ranking/fit judgments at scale
- Bulk re-evaluation can run on efficient models without manual effort
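A simple way to avoid reacting to a single datapoint is to surface a theme only once enough decisions have accumulated and the theme recurs often enough. The sketch below assumes each decision has already been tagged with at most one theme (in practice a model would extract these); the `min_decisions` default echoes the 100 to 200 range mentioned in the talk, while `min_share` and the function name are assumptions.

```python
from collections import Counter
from typing import Optional


def stable_themes(themes: list[Optional[str]], min_decisions: int = 100,
                  min_share: float = 0.2) -> list[str]:
    """Return themes that recur in at least `min_share` of decisions,
    but only once `min_decisions` decisions have been observed."""
    total = len(themes)
    if total < min_decisions:
        return []  # too little evidence to change the ICP
    counts = Counter(theme for theme in themes if theme is not None)
    return sorted(theme for theme, n in counts.items() if n / total >= min_share)


too_few = ["startup experience"] * 30 + [None] * 69      # 99 decisions
enough = ["startup experience"] * 30 + ["one-off comment"] + [None] * 69  # 100 decisions
print(stable_themes(too_few), stable_themes(enough))
```

Here the one-off comment never reaches the threshold, while the recurring theme becomes a candidate for an ICP update once 100 decisions are in.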
14:35 – 16:45
Three closing takeaways: evolving preferences, prose-first prompts, guardrails by design
Nick closes with three principles: assume user preferences evolve, write evaluation criteria in prose rather than rigid rules, and embed guardrails into the architecture from day one. The system must remain an apprentice that supports, but never replaces, human authority.
- Build for preference evolution as a foundational requirement
- Use prose/markdown instead of rules and flowcharts
- Guardrails must be architectural, not bolted on later
- Keep the user as “master” and the model as “apprentice”
