The Twenty Minute VC
Aidan Gomez: What No One Understands About Foundation Models | E1191
CHAPTERS
- 0:00 – 2:31
From rural Ontario to computer obsession: dial-up, gaming, and early CS curiosity
Aidan describes growing up in a remote, forested part of Ontario with limited access to technology and internet. That scarcity—paired with a love of gaming—pushed him to obsess over making computers and connectivity work better, ultimately leading him toward computer science.
- Rural upbringing with minimal tech access shaped his motivation
- Years on dial-up created a fixation on speed and systems
- Gaming as an early gateway into technology interest
- Early tinkering led naturally into learning coding and how the web works
- 2:31 – 4:29
What gaming teaches founders: grinding, resilience, and learning through failure
The discussion connects gaming habits to founder traits: willingness to grind, comfort with repetition, and resilience after failure. Aidan highlights the “respawn” mentality as a powerful psychological model for iteration and improvement.
- Games train persistence through repetitive difficulty
- Failure is normalized: retrying is built into the experience
- Iteration leads to measurable progress across attempts
- Mindset shift: you can recover from mistakes and improve
- 4:29 – 7:11
Scaling laws reality check: more compute works, but it’s the most inefficient path
Harry pushes on whether more compute is still the main driver of model improvements. Aidan agrees scaling reliably boosts performance but argues it’s an inefficient, brute-force approach—especially as smaller models rapidly catch up through better techniques.
- Scaling compute and parameters remains the most reliable improvement lever
- It’s low-risk for well-capitalized players but economically wasteful
- Recent progress: dramatically smaller models can match or beat older, much larger ones
- Market forces increasingly reward efficiency over raw scale
- 7:11 – 8:07
Horizontal vs vertical models: prototype big, then distill into focused systems
Aidan predicts a long-term world of both general-purpose foundation models and specialized vertical models. He explains a common pattern: teams prototype with a powerful general model, then distill or fine-tune into cheaper, task-specific models for production.
- Ecosystem will include both horizontal and verticalized models
- Big models are best for fast prototyping and proving feasibility
- Distillation turns expensive prototypes into efficient production models
- Specialization reduces cost and improves deployment practicality
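As a rough illustration of the distillation idea in this chapter, here is a minimal sketch of one common form of it: training a small "student" model to match a large "teacher" model's softened output distribution. The episode does not specify a method, so this soft-target loss, the function names, and the example logits are all assumptions made for illustration.

```python
import math

def softmax(logits, temperature=1.0):
    """Convert raw scores to probabilities; a higher temperature flattens them."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

def distillation_loss(teacher_logits, student_logits, temperature=2.0):
    """KL divergence from the teacher's softened distribution to the student's.

    Hypothetical helper: lower values mean the student imitates the teacher
    more closely on this example.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

# Made-up logits over three classes, for illustration only.
teacher = [4.0, 1.0, 0.5]
close_student = [3.5, 1.2, 0.4]
far_student = [0.5, 1.0, 4.0]

print(distillation_loss(teacher, close_student))
print(distillation_loss(teacher, far_student))
```

In a real pipeline this loss would be minimized over many examples with a training framework. The teacher could also simply generate synthetic training text for the student, which is a different and equally common route.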
- 8:07 – 8:54
Who can afford the AI race: beyond hyperscalers via data and method innovation
They discuss whether only Big Tech can remain competitive given massive training costs. Aidan argues that if you only pursue scaling, you need hyperscaler backing—but there’s still room to compete through data innovation and new training methods.
- Pure scaling favors hyperscalers or their “subsidiaries”
- Alternative paths include data, algorithms, and method breakthroughs
- Economic constraints limit how far expensive models can be pushed
- Pressure on price drives innovation toward smaller, smarter systems
- 8:54 – 13:45
Data and method innovations: better scraping, synthetic data, and reasoning-focused training
Aidan breaks down what “data innovation” means (higher-quality curated datasets and synthetic data) and what “method innovation” could look like (RL, search, and letting models think/try/fail). A key barrier to reasoning is that the internet rarely shows the intermediate steps of thought, so companies must generate that data.
- Open-source gains are largely driven by improved data quality and curation
- Synthetic data is increasingly central to training and distillation pipelines
- Method innovation includes RL, search, and iterative problem-solving loops
- Reasoning data is scarce online because people publish conclusions, not the work behind them
- 13:45 – 15:53
Commoditization and price dumping: models trend to low-margin, value shifts to chips and apps
Harry raises concerns about OpenAI price cuts and Meta’s free releases driving a race to the bottom. Aidan agrees model-only businesses face tight margins in the near term, with value accruing at the chip layer (capex) and at the application layer (monetization).
- Model APIs face pricing pressure from dumping and open-source releases
- Model-only businesses risk becoming near-zero-margin in the short term
- Value concentrates at chips/infrastructure and at applications/products
- Cohere signals it will expand its product suite beyond only model access
- 15:53 – 18:30
Chips, platform optionality, and the next wave of training hardware competition
Aidan describes growing chip spend and why Cohere supports multiple chip providers and clouds: customers demand optionality and avoidance of lock-in. They discuss vertical integration trends and a future where training hardware becomes more competitive beyond Nvidia, including TPUs, AMD, and others.
- Chip spend has become a dominant cost center
- Multi-platform support reduces customer lock-in and meets enterprise requirements
- Verticalization into chips is attractive due to high margins and limited supply
- Training compute is becoming more heterogeneous (TPUs proven; more entrants coming)
- 18:30 – 20:13
Compute supply chain and infrastructure strategy: partner data centers unless economics flip
They explore whether model progress outpaces data center buildout and whether companies should build their own infrastructure. Aidan says Cohere partners today, but would build data centers if economics or access to a compelling chip demanded it; early compute access was easier because Cohere predates the GPU crunch.
- Potential misalignment between model iteration speed and compute availability
- Cohere doesn’t build data centers now, but would if it became cheaper or necessary
- Infrastructure decisions depend on chip availability and provider procurement
- Cohere benefited early on from starting before the current compute supply constraints
- 20:13 – 23:48
Transformers to ChatGPT: why adoption suddenly exploded and why chat/voice interfaces matter
Aidan reflects on co-authoring the Transformer paper and not anticipating the architecture’s massive consolidation across AI. He identifies ChatGPT as the key inflection point because it put the capability directly in users’ hands, then explains why chat isn’t universal and why voice is a uniquely powerful interface.
- The Transformer’s downstream impact was not obvious in 2017
- ChatGPT accelerated adoption by making the tech experiential, not theoretical
- Chat is useful but not the right interface for everything; GUIs still matter
- Voice interaction feels emotionally compelling and could drive major consumer shifts
- 23:48 – 27:06
Short-term vs long-term progress: gains get harder, but the “plateau” narrative is wrong
Harry asks whether we’re underestimating near-term AI progress. Aidan argues improvements are getting more expensive because training increasingly requires scarce domain experts, yet major method-based breakthroughs (reasoning, planning, long-horizon tasks) are still coming and will unlock new capability jumps.
- Incremental improvements require more specialized, expensive human expertise
- Compute costs continue to fall, but data and teaching bottlenecks are growing
- Perceived progress may feel slower to non-experts despite real capability gains
- Upcoming method advances: planners/reasoners, try-fail-recover, long-horizon autonomy
- 27:06 – 37:46
Is it too late for startups in model-building? ‘No market for last year’s model’ + consolidation risks
They debate whether falling compute costs enable new startups to enter the model space. Aidan notes that while last-gen models get cheaper fast, demand concentrates on the newest generation—so being behind is fatal; he also predicts consolidation and warns against becoming a cloud provider’s dependent subsidiary.
- Barrier drops primarily for older generations, not the frontier
- ‘There’s no market for last year’s model’: obsolescence is rapid
- Progress is still worth funding, but value depends on who pays for it
- Consolidation is likely; dependence on a cloud investor/provider can be dangerous
- 37:46 – 43:52
Enterprise adoption: trust, private deployments, RAG to reduce hallucinations, and shift from POC to production
Aidan outlines why enterprises hesitate: security, IP risk, and distrust of data usage. He explains Cohere’s approach (private/VPC/on-prem deployments) and how RAG provides citations and reduces hallucinations, while enterprises increasingly move from experimentation to production urgency with workforce augmentation as the top use case.
- Top blocker: trust/security and fear of training on proprietary enterprise data
- Private deployment models reduce data exposure and alleviate adoption barriers
- RAG enables retrieval + citation, cutting hallucinations and enabling customization
- Enterprise budgets are shifting from POCs to production; employee augmentation leads demand
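To make the retrieval + citation pattern concrete, here is a toy sketch of the retrieval half of RAG: pick the passages most relevant to a question, then build a prompt that asks the model to answer only from those numbered sources. It is not Cohere's implementation or API. The word-overlap scoring, the function names, and the sample documents are assumptions chosen to keep the example self-contained; production systems typically use embedding-based search and a real model call.

```python
import re

def tokenize(text):
    """Lowercase a string and split it into a set of word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))

def retrieve(query, documents, k=2):
    """Return up to k documents sharing the most words with the query."""
    query_tokens = tokenize(query)
    scored = [(len(query_tokens & tokenize(doc["text"])), doc) for doc in documents]
    scored = [pair for pair in scored if pair[0] > 0]  # drop irrelevant documents
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in scored[:k]]

def build_prompt(query, passages):
    """Number each passage so the model's answer can cite it as [n]."""
    sources = "\n".join(
        f"[{i}] ({doc['id']}) {doc['text']}" for i, doc in enumerate(passages, 1)
    )
    return (
        "Answer using only the sources below and cite them as [n].\n"
        f"Sources:\n{sources}\n"
        f"Question: {query}"
    )

# Hypothetical internal documents, invented for this example.
DOCS = [
    {"id": "hr-policy", "text": "Employees accrue 20 vacation days per year."},
    {"id": "it-policy", "text": "Laptops are refreshed every three years."},
    {"id": "expenses", "text": "Travel expenses require manager approval."},
]

question = "How many vacation days do employees get per year?"
hits = retrieve(question, DOCS)
prompt = build_prompt(question, hits)
print(prompt)
```

Because the answer is grounded in numbered sources, a reader can check each claim against the cited passage, which is the mechanism the chapter credits with reducing hallucinations.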
- 43:52 – 50:09
Agents and copilots: why true workforce augmentation needs tool-agnostic platforms and better reasoning
They discuss agent hype and whether the best agent products will be built by model builders or application-layer companies. Aidan argues agents are the promise of AI, but success depends on reasoning quality and model-level control; he also critiques siloed copilots and emphasizes enterprise tool diversity.
- Agent hype is justified: long-horizon autonomous work is transformative
- Agent performance hinges on the underlying model’s reasoning/planning
- Non-model-builders can be structurally disadvantaged without model-level levers
- Enterprise assistants must span many tools (Office, Salesforce, SAP, internal apps), not one ecosystem
- 50:09 – 55:55
Human displacement fears, social implications, and where AI could break through next: robotics
Harry worries about AI replacing human interaction and jobs; Aidan argues humans remain central, with localized displacement but net productivity growth. He then points to robotics as a likely major breakthrough area as foundation-model-based planning reduces brittleness and enables more general-purpose machines.
- AI likely augments rather than replaces humans in most high-accountability contexts
- Some roles (e.g., customer support) may see meaningful localized displacement
- Bots may handle emotionally taxing interactions, with humans focusing on higher-value cases
- Robotics could see big gains as planners/reasoners become more robust and adaptable
- 55:55 – 1:03:26
Quick-fire: underrated importance of data, fundraising realities, Europe vs UK tech culture, and productivity as the north star
In a rapid Q&A, Aidan says he most changed his mind about the importance and sensitivity of data quality. He discusses Cohere’s fundraising scale, remote vs in-person work, his views on UK vs broader European tech attitudes, and closes with a focus on productivity growth as the most important (and under-hyped) AI outcome.
- Data quality is a high-leverage factor; small amounts of bad data can matter a lot
- Raising at the scale of hundreds of millions distorts intuition about money and competition
- UK shows more tech optimism than much of Europe, where regulation-first attitudes dominate
- Desired direction: use AI to boost productivity, abundance, and economic growth