At a glance
WHAT IT’S REALLY ABOUT
Workshop agent battle: optimize managed AI agents to mine diamonds
- The session frames a competitive “agent battle” where each participant deploys an AI agent to control a Mineflayer bot and mine diamonds within fixed run and workshop time limits.
- Participants learn to build and deploy a new managed agent offering, focusing on tuning configuration levers like system prompts, model choice, skills, and MCP tool access.
- The workshop emphasizes iterative improvement via evals and “hill climbing,” using a quick eval set to test changes before a final scored run.
- Scoring prioritizes not only total diamonds mined but also token efficiency, discouraging brute-force use of heavier models and rewarding prompt and agent optimization.
- Logistics include a shared world seed and start kit for fairness, a leaderboard and agent chat, and unlimited resubmission, with only each participant's top run counted.
IDEAS WORTH REMEMBERING
Treat agent configuration as the primary performance lever.
The workshop highlights that system prompts, model selection, and which skills/MCPs you enable largely determine whether the agent exhibits efficient “diamond mining” behavior.
Optimize for diamonds-per-token, not just raw capability.
Token efficiency is explicitly a tiebreaker, pushing participants to craft tighter prompts and workflows instead of defaulting to the largest model.
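A minimal sketch of how such a ranking could be computed. The summary does not give the exact scoring formula, so the ordering here (most diamonds first, fewer tokens breaking ties) is an assumption, and the `Run` fields and sample numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Run:
    # Hypothetical record of one scored run; field names are illustrative.
    participant: str
    diamonds: int
    tokens: int


def diamonds_per_kilotoken(run: Run) -> float:
    """Diamonds mined per 1,000 tokens spent (0.0 if no tokens were recorded)."""
    return 1000 * run.diamonds / run.tokens if run.tokens else 0.0


def rank(runs: list[Run]) -> list[Run]:
    """Assumed ordering: most diamonds first, fewer tokens breaks ties."""
    return sorted(runs, key=lambda r: (-r.diamonds, r.tokens))


# Made-up sample data, not results from the workshop.
runs = [
    Run("a", diamonds=10, tokens=900_000),
    Run("b", diamonds=10, tokens=400_000),
    Run("c", diamonds=6, tokens=100_000),
]
leaderboard = rank(runs)
for run in leaderboard:
    print(run.participant, run.diamonds, round(diamonds_per_kilotoken(run), 4))
```

Under this assumed ordering, two runs with equal diamond counts are separated by token spend, which is why a tighter prompt can beat a heavier model.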
Use evals to hill-climb toward better agent behavior.
A ~1-minute eval set enables rapid iteration; participants are encouraged to measure, adjust, and re-run to progressively improve outcomes.
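The measure, adjust, re-run loop can be sketched as a simple hill climb over agent configurations. Everything named here is an assumption for illustration: the `OPTIONS` search space, the `toy_eval` scoring function (a stand-in for the workshop's quick eval set, with arbitrary scores), and the single-change mutation strategy.

```python
import random
from typing import Callable

Config = dict[str, str]

# Hypothetical search space; the real levers are prompt, model, skills, MCPs.
OPTIONS: dict[str, list[str]] = {
    "model": ["small", "medium", "large"],
    "prompt": ["terse", "detailed"],
    "tools": ["minimal", "full"],
}


def mutate(config: Config, rng: random.Random) -> Config:
    """Return a copy of config with exactly one lever changed."""
    key = rng.choice(sorted(OPTIONS))
    alternatives = [v for v in OPTIONS[key] if v != config[key]]
    return {**config, key: rng.choice(alternatives)}


def hill_climb(
    start: Config,
    evaluate: Callable[[Config], float],
    steps: int,
    seed: int = 0,
) -> tuple[Config, float]:
    """Keep a candidate only when its eval score beats the current best."""
    rng = random.Random(seed)
    best, best_score = start, evaluate(start)
    for _ in range(steps):
        candidate = mutate(best, rng)
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def toy_eval(config: Config) -> float:
    """Stand-in for the quick eval; a real one would run the agent. Scores are arbitrary."""
    score = {"small": 1.0, "medium": 2.0, "large": 1.5}[config["model"]]
    score += 1.0 if config["prompt"] == "terse" else 0.0
    score += 0.5 if config["tools"] == "minimal" else 0.0
    return score


start = {"model": "large", "prompt": "detailed", "tools": "full"}
best, best_score = hill_climb(start, toy_eval, steps=50)
print(best, best_score)
```

Changing one lever at a time keeps each eval run attributable to a single change, at the cost of possibly missing gains that only appear when two levers change together.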
Fairness comes from controlling the environment, not restricting creativity.
Everyone starts with the same seed and kit, so improvements should come from agent strategy and tooling choices rather than lucky world generation.
Design for the harness constraints: tool-based control without visuals.
Because the bot operates via Mineflayer/MCP actions (e.g., mine block, move, jump), agents must reason and plan using tool outputs rather than relying on a visual interface.
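To make the constraint concrete, here is a toy sketch of a tool-driven loop in which the agent only ever sees text returned by tools. It does not use Mineflayer or MCP; `FakeWorld`, the tool names `look_down` and `mine_block`, and `scripted_policy` (standing in for the model's decisions) are all hypothetical.

```python
from typing import Callable


class FakeWorld:
    """Toy stand-in for the game: a single column of blocks below the bot."""

    def __init__(self, column: list[str]) -> None:
        self.column = list(column)
        self.depth = 0
        self.inventory: list[str] = []

    def look_down(self) -> str:
        """Report the block under the bot as text; there is no image to inspect."""
        if self.depth >= len(self.column):
            return "bedrock"
        return self.column[self.depth]

    def mine_block(self) -> str:
        """Mine the block under the bot and describe the result as text."""
        if self.depth >= len(self.column):
            return "nothing to mine"
        block = self.column[self.depth]
        self.inventory.append(block)
        self.depth += 1
        return f"mined {block}"


def run_agent(
    world: FakeWorld, choose_tool: Callable[[str], str], max_steps: int
) -> list[str]:
    """Feed each tool's text output back in as the next observation."""
    tools = {"look_down": world.look_down, "mine_block": world.mine_block}
    observation = "start"
    transcript: list[str] = []
    for _ in range(max_steps):
        name = choose_tool(observation)
        if name == "stop":
            break
        observation = tools[name]()
        transcript.append(f"{name} -> {observation}")
    return transcript


def scripted_policy(observation: str) -> str:
    """Hypothetical stand-in for the model: plan purely from the last tool output."""
    if observation in ("bedrock", "nothing to mine"):
        return "stop"
    if observation == "start" or observation.startswith("mined"):
        return "look_down"
    return "mine_block"


world = FakeWorld(["stone", "diamond_ore"])
transcript = run_agent(world, scripted_policy, max_steps=20)
print("\n".join(transcript))
```

Each extra tool call in a loop like this also costs tokens in a real agent, so how much the agent inspects before acting is itself something to tune.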
WORDS WORTH SAVING
And whoever has the most diamonds at the end of thirty-five minutes wins.
— Ben
So this is not just Mine the most diamonds, it is get the best diamonds to tokens ratio.
— Ben
The main levers that you're going to be focused on, uh, are about-- I think it's on the next slide, but, uh, basically along the lines of how do I optimize this run?
— Jeff
Seems like 19 might be the upper echelon of what's possible, at least so far.
— Jeff
We actually have somebody who's broken 19 with only one minute and 20 seconds to go. Wow. You'll have to reveal your technique.
— Jeff
High quality AI-generated summary created from speaker-labeled transcript.