How Metaview built self-improving prompts for application review

At Metaview, we help recruiters sift through thousands of resumes a day. Most evaluation systems set the criteria upfront and rebuild every time preferences change. We built one that learns from every decision recruiters make and evolves with them.

May 21, 2026 · 16 min

WHAT IT’S REALLY ABOUT

Metaview’s self-improving prompts streamline application review in high-volume recruiting

Metaview built application-review prompts that automatically evolve because hiring preferences routinely change as teams see more candidates and conduct interviews.
Their workflow redacts PII, evaluates candidates against an “ideal candidate profile” (ICP), and keeps humans as the final decision-makers while the system acts as an apprentice.
An ICP agent learns from recruiter actions (progress/reject), explicit feedback, and manual ICP edits, using candidate context to interpret relative feedback like “too junior” or “not enough Python.”
They favor prose/Markdown ICPs over rigid rules, weightings, or keyword matching, aiming to mirror how recruiters actually describe needs and how LLMs reason best.
To control cost at scale, they separate high-volume evaluation (Haiku) from pattern-finding/ICP updates (Sonnet) rather than “throw everything into one agent.”
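
The split between high-volume evaluation and low-volume ICP updates can be sketched roughly as follows. This is a minimal illustration, not Metaview's implementation: the `Router` class, the `call_model` callable, the tier labels, and the regex-based `redact_pii` are all assumptions introduced here, and real PII redaction would need far more than two patterns.

```python
import re
from dataclasses import dataclass
from typing import Callable

# Hypothetical tier labels; the talk names the model families only.
EVAL_MODEL = "haiku"     # high-volume, per-application evaluation
UPDATE_MODEL = "sonnet"  # low-volume pattern-finding and ICP updates

EMAIL = re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+")
PHONE = re.compile(r"\+?\d[\d\s().-]{7,}\d")


def redact_pii(text: str) -> str:
    """Illustrative redaction only: masks emails and phone-like numbers."""
    text = EMAIL.sub("[EMAIL]", text)
    return PHONE.sub("[PHONE]", text)


@dataclass
class Router:
    # Assumed interface: (model label, prompt) -> completion text.
    call_model: Callable[[str, str], str]

    def evaluate(self, icp: str, resume: str) -> str:
        """Runs once per application, so it goes to the cheaper tier."""
        prompt = f"ICP:\n{icp}\n\nResume:\n{redact_pii(resume)}\n\nAssess fit."
        return self.call_model(EVAL_MODEL, prompt)

    def propose_icp_update(self, icp: str, decisions: list[str]) -> str:
        """Runs occasionally over a batch of decisions, so it can use the larger tier."""
        prompt = f"ICP:\n{icp}\n\nRecent decisions:\n" + "\n".join(decisions)
        return self.call_model(UPDATE_MODEL, prompt)
```

Passing `call_model` in as a parameter keeps the routing logic testable without network access; any real client would be wrapped to match that signature.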

IDEAS WORTH REMEMBERING

5 ideas

Assume hiring criteria will change—design prompts to evolve continuously.

Metaview observed that stakeholders refine requirements after seeing real candidates (e.g., “startup experience,” “zero-to-one”), so static prompts quickly become misaligned with actual decision-making.

Keep humans at the center; the LLM should be an apprentice, not a judge.

The system surfaces fit signals and drafts evaluations, but users make the progress/reject decision; the agent then learns from those decisions rather than overruling them.
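
One way to picture the apprentice relationship is a decision log in which the model's draft and the recruiter's choice are stored side by side, and only the recruiter's choice is ever treated as final. The `Review` and `DecisionLog` names below are hypothetical, chosen for this sketch rather than taken from the talk.

```python
from dataclasses import dataclass, field
from typing import Literal

Decision = Literal["progress", "reject"]


@dataclass
class Review:
    candidate_id: str
    suggestion: Decision  # what the model drafted
    decision: Decision    # what the recruiter chose; always final
    feedback: str = ""    # optional explicit feedback from the recruiter


@dataclass
class DecisionLog:
    reviews: list[Review] = field(default_factory=list)

    def record(self, candidate_id: str, suggestion: Decision,
               decision: Decision, feedback: str = "") -> Review:
        """Stores the recruiter's decision unchanged, even when it differs from the draft."""
        review = Review(candidate_id, suggestion, decision, feedback)
        self.reviews.append(review)
        return review

    def disagreements(self) -> list[Review]:
        """Cases where the recruiter chose differently: the signal to learn from."""
        return [r for r in self.reviews if r.suggestion != r.decision]
```

Nothing in this structure lets the suggestion overwrite the decision; disagreements are surfaced only as input for later ICP updates.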

Make the ICP the editable, versioned “source of truth” for evaluation.

Instead of hardcoding logic, Metaview updates a natural-language ICP document (role summary, must-haves, nice-to-haves, red flags) that directly steers future evaluations.
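
A versioned ICP with the four sections named above might be modeled as an immutable record that renders to Markdown. This is a sketch under assumptions: the `ICP` class, its field names, and the `with_must_have` helper are illustrative and not drawn from Metaview's system.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class ICP:
    version: int
    role_summary: str
    must_haves: tuple[str, ...] = ()
    nice_to_haves: tuple[str, ...] = ()
    red_flags: tuple[str, ...] = ()

    def to_markdown(self) -> str:
        """Renders the profile as the prose document that steers evaluation."""
        def section(title: str, items: tuple[str, ...]) -> str:
            body = "\n".join(f"- {item}" for item in items) or "- (none yet)"
            return f"## {title}\n{body}"

        return "\n\n".join([
            f"# Ideal candidate profile (v{self.version})",
            f"## Role summary\n{self.role_summary}",
            section("Must-haves", self.must_haves),
            section("Nice-to-haves", self.nice_to_haves),
            section("Red flags", self.red_flags),
        ])

    def with_must_have(self, item: str) -> "ICP":
        """Returns a new version; the previous version is left untouched."""
        return replace(self, version=self.version + 1,
                       must_haves=self.must_haves + (item,))
```

Because each edit produces a new frozen instance, earlier versions remain available for comparison or rollback.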

User feedback is relative—retrieve the candidate context to interpret it correctly.

Statements like “too junior” or “not enough Python” only make sense when grounded in the specific resume being referenced, so the agent queries the redacted resume files to anchor its updates.
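
Grounding relative feedback in the resume it refers to could look something like the sketch below. The `query_files` function is a hypothetical stand-in for whatever file-query tool the agent actually uses; it does crude word-overlap matching purely for illustration, and `grounded_feedback_prompt` is likewise an assumed helper.

```python
import re

TOKEN = re.compile(r"[a-z0-9+#]+")


def query_files(files: dict[str, str], candidate_id: str,
                query: str, max_lines: int = 3) -> list[str]:
    """Hypothetical stand-in for a file-query tool: returns the lines of one
    candidate's redacted resume that share the most words with the query."""
    terms = set(TOKEN.findall(query.lower()))
    lines = [ln.strip() for ln in files.get(candidate_id, "").splitlines() if ln.strip()]
    scored = [(len(terms & set(TOKEN.findall(ln.lower()))), ln) for ln in lines]
    # Stable sort keeps resume order among lines with equal overlap.
    hits = [ln for score, ln in sorted(scored, key=lambda p: -p[0]) if score > 0]
    return hits[:max_lines]


def grounded_feedback_prompt(icp_markdown: str, feedback: str,
                             evidence: list[str]) -> str:
    """Builds an update prompt that pairs the feedback with the resume lines it refers to."""
    quoted = "\n".join(f"> {line}" for line in evidence) or "> (no matching lines found)"
    return (
        f"Current ICP:\n{icp_markdown}\n\n"
        f"Recruiter feedback: {feedback}\n\n"
        f"Relevant resume excerpts:\n{quoted}\n\n"
        "Suggest an ICP edit that explains this feedback."
    )
```

With the excerpt attached, "not enough Python" can be read against what the candidate actually listed, rather than as an absolute statement.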

Prefer prose/Markdown over rules, weights, and keyword matching.

Recruiters express preferences in natural language, and LLMs reason best in prose; avoiding flowcharts/weighting schemes reduces brittle behavior and “checkbox” hiring patterns.

WORDS WORTH SAVING

5 quotes

The key point I wanna take away from here is that any user-based decision and anywhere where user judgment is at the forefront, your preferences are going to evolve. And so any system you built, your prompts must evolve with them.

— Nick Mayhew

We're working in high-risk areas here where human judgment is at the forefront and is not just human in the loop, but human in the center.

— Nick Mayhew

Lean into what LLMs are good at, which is natural language. Allow them to reason in prose, not in flowcharts.

— Nick Mayhew

You need a system that's efficient in its token usage, which is why we have this workflow underneath and this agent that sits on top to evaluate the progressions and the rejections.

— Nick Mayhew

Build your system from the start to be the apprentice, to learn from the user, but never overrule the user, right? Make the user the, make the user the master and make the system the apprentice.

— Nick Mayhew

AI-driven surge in job applications and recruiter workload
Evolving hiring preferences and prompt drift
Human-in-the-center decision architecture
ICP (Ideal Candidate Profile) as a self-improving prompt artifact
Learning signals: progress/reject actions, feedback, manual edits
Context retrieval over unstructured resumes (Query Files tool)
Model selection and token/cost constraints (Haiku vs Sonnet)

High quality AI-generated summary created from speaker-labeled transcript.
