Nikhil Kamath ft. Perplexity CEO, Aravind Srinivas | WTF Online Ep 1.
CHAPTERS
- 0:00 – 0:45
Remote catch-up from SF to Dubai: Chennai roots and travel plans
Nikhil and Aravind open with light context: where Aravind is (San Francisco), when he’s coming to India, and his ties to Chennai. It sets an informal tone before moving into Aravind’s background.
- •Aravind’s base in San Francisco and frequent travel
- •Typical India itinerary: Chennai first, then Mumbai/Delhi/Bangalore
- •Growing up in Chennai and knowing the city well
- 0:45 – 12:14
From cricket stats to code: IIT Madras, Kaggle, and early ML wins
Aravind traces his path from a numbers-first upbringing (cricket statistics) to programming and the IIT track. A Kaggle contest and a fast-paced Bangalore internship push him from curiosity into practical ML.
- •Early intuition for numbers via cricket stats; strong math foundation
- •Programming picked up around 11th standard; JEE/IIT expectations
- •Kaggle contest introduces scikit-learn and applied ML; early success
- •Bangalore internship building recommender systems; self-teaching Andrew Ng/Stanford ML
- •Transition from coursework to research trajectory leading to Berkeley PhD
- 12:14 – 29:06
Berkeley grind and the OpenAI apprenticeship: humility, mentors, and focus
He describes arriving at Berkeley without an advisor, building momentum through intense self-driven work, and earning mentorship. The OpenAI internship becomes a turning point: rigor, blunt feedback, and prioritizing what works over “fancy” ideas.
- •No advisor initially → disciplined routine (Philz Coffee 5:30am–8pm)
- •Learning cloud compute early due to limited local resources
- •Paper leads to advisor Pieter Abbeel; connection to John Schulman/OpenAI
- •OpenAI environment as a humility reset; being “not the best in the room” is okay
- •Core lesson: practical results and simplicity can beat complex academic ideas
- 29:06 – 35:54
Ilya’s ‘two circles’ view: generative AI + RL and the compute bet
Aravind recounts Ilya Sutskever’s framework: generative modeling plus reinforcement learning as the core recipe toward AGI, with scale (compute) as the accelerator. Aravind contrasts this with his own research ambition of models learning their own loss functions.
- •Ilya’s conceptual model: generative AI (big circle) and RL (small circle)
- •“Throw a lot of compute at it” as the long-run driver
- •Aravind’s earlier idea: AI learning/tuning its own loss function iteratively
- •Why the idea was deemed too complicated for real-world progress
- •Takeaway: simplest scalable approaches often win when paired with compute
- 35:54 – 45:25
AI/AGI explained simply: narrow vs general intelligence and economic impact
Nikhil asks for a ‘10-year-old’ explanation of AI, leading to distinctions between narrow task programs (like chess engines) and general systems that handle many tasks. Aravind emphasizes why modern AI matters: it affects paid knowledge work at scale.
- •Definition: programming computers to perform tasks requiring ‘intelligence’
- •Narrow AI examples: chess programs, calculators; limited transferability
- •General intelligence framed as one system doing many tasks without hard-coding
- •Functional lens: compare input/output against human performance on paid work
- •Why the current wave is disruptive: broad usefulness and labor substitution potential
- 45:25 – 53:59
Compute-to-LLM evolution: from circuits and PCs to transformers and chatbots
They build a timeline from calculators (circuits), to personal computing (Moore’s Law, VisiCalc), to internet/mobile/cloud, and finally today’s AI. Aravind highlights what changed since ~2010: neural nets + scale + high-quality data + human feedback.
- •Calculator mechanics: adders/multipliers as circuits; “artifact” that would still work in 1800
- •Personal computer revolution and the spreadsheet as a killer app (VisiCalc)
- •Network effects: internet/web → mobile → cloud as stepping stones to AI
- •Key shift since 2010: neural networks started working reliably at scale
- •Scale recipe: compute + curated data + RLHF + consumer chatbot interface
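The "adders as circuits" point can be made concrete with a minimal sketch. It assumes an 8-bit width and uses Python's bitwise operators as stand-ins for logic gates; the function names are illustrative and do not come from the episode.

```python
def full_adder(a, b, carry_in):
    """One-bit full adder expressed with XOR, AND, and OR gates."""
    partial = a ^ b
    total = partial ^ carry_in
    carry_out = (a & b) | (partial & carry_in)
    return total, carry_out


def ripple_carry_add(x, y, width=8):
    """Add two non-negative integers by chaining full adders bit by bit.

    Returns (sum modulo 2**width, final carry bit). The width of 8 is an
    assumption chosen only to keep the example small.
    """
    carry = 0
    result = 0
    for i in range(width):
        bit_x = (x >> i) & 1
        bit_y = (y >> i) & 1
        sum_bit, carry = full_adder(bit_x, bit_y, carry)
        result |= sum_bit << i
    return result, carry


print(ripple_carry_add(5, 3))      # fits in 8 bits, no carry out
print(ripple_carry_add(200, 100))  # overflows 8 bits, carry out is set
```

The same fixed wiring produces the right answer for any inputs, which is the sense in which a calculator is a narrow, hard-wired artifact rather than a learned system.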
- 53:59 – 57:03
Neural networks vs machine learning: patterns, loss functions, and irreducible noise
Aravind explains neural networks as layered functions that transform inputs into outputs, trained by minimizing a loss across large datasets. Using stock-market prediction as intuition, he clarifies that models can only learn the signal present in the data; the noise is irreducible, and a model that fits it will not generalize.
- •Neural nets as stacked nonlinear functions (matrices + nonlinearities) trained via backprop
- •Training loop: predictions vs targets → compute loss → update weights across many examples
- •Stock-market example: if the data lacks signal, the model overfits to noise and its predictions fail to generalize
- •Machine learning broader than neural nets (SVMs, regressions, etc.)
- •Neural nets scale best with more data/compute; other methods can win with small data
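The training loop described in this chapter (predict, compare with targets, compute a loss, update weights) can be sketched in plain Python. This is a minimal illustration, not anything shown in the episode: the one-hidden-layer network, the tanh nonlinearity, the learning rate, and the toy target y = x² are all assumptions chosen for brevity.

```python
import math
import random


def train_tiny_net(xs, ys, hidden=4, lr=0.05, steps=500, seed=0):
    """Fit a one-hidden-layer tanh network with full-batch gradient descent.

    Returns the mean-squared-error loss recorded before each update.
    """
    rng = random.Random(seed)
    w = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden biases
    v = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    c = 0.0                                          # output bias
    n = len(xs)
    losses = []
    for _ in range(steps):
        gw = [0.0] * hidden
        gb = [0.0] * hidden
        gv = [0.0] * hidden
        gc = 0.0
        loss = 0.0
        for x, y in zip(xs, ys):
            # Forward pass: linear map, nonlinearity, linear map.
            h = [math.tanh(w[j] * x + b[j]) for j in range(hidden)]
            pred = sum(v[j] * h[j] for j in range(hidden)) + c
            err = pred - y
            loss += err * err / n
            # Backward pass: chain rule from the loss to each weight.
            d_pred = 2 * err / n
            gc += d_pred
            for j in range(hidden):
                gv[j] += d_pred * h[j]
                d_z = d_pred * v[j] * (1 - h[j] ** 2)
                gw[j] += d_z * x
                gb[j] += d_z
        losses.append(loss)
        # Gradient step: move every weight against its gradient.
        for j in range(hidden):
            w[j] -= lr * gw[j]
            b[j] -= lr * gb[j]
            v[j] -= lr * gv[j]
        c -= lr * gc
    return losses


xs = [i / 10 for i in range(-10, 11)]
ys = [x * x for x in xs]  # a target with real signal (assumed toy function)
losses = train_tiny_net(xs, ys)
print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Here the targets contain learnable structure, so the loss falls as training proceeds. If the targets were pure random noise, the network could still lower its training loss, but that fit would not carry over to new data.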
- 57:03 – 1:05:13
What an LLM is: pretraining next-token prediction, transformers, and post-training
They define large language models as giant neural nets trained to predict the next token using internet-scale text. Aravind outlines the two-stage pipeline: pretraining (most compute) and post-training/fine-tuning to become a helpful chatbot, plus multimodal capabilities often added later.
- •LLM core task: next-token prediction on massive corpora (trillions of tokens)
- •Tokenization and data pipelines (storage, batching, training at scale)
- •Transformers as the efficient architecture enabling long-context modeling
- •Post-training: instruction tuning/RLHF for conversational usefulness
- •Multimodal (images/video) typically added in later training stages
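To illustrate the next-token objective alone, here is a minimal sketch that swaps the transformer for a simple bigram counting model. The corpus, the whitespace tokenizer, and the function names are assumptions for illustration; real LLMs use subword tokens and learn these probabilities with a neural network rather than by counting.

```python
from collections import Counter, defaultdict


def train_bigram(tokens):
    """Count how often each token follows each previous token."""
    counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        counts[prev][nxt] += 1
    return counts


def next_token_probs(counts, prev):
    """Turn follow-up counts for `prev` into a probability distribution."""
    followers = counts.get(prev, Counter())
    total = sum(followers.values())
    return {tok: n / total for tok, n in followers.items()}


# Toy corpus, split on whitespace (an assumed stand-in for real tokenization).
corpus = "the cat sat on the mat and the cat slept".split()
model = train_bigram(corpus)

probs = next_token_probs(model, "the")
print(probs)                      # "cat" follows "the" twice, "mat" once
print(max(probs, key=probs.get))  # most likely next token after "the"
```

Pretraining is the same idea at a vastly larger scale: the model is scored on how much probability it assigns to the token that actually comes next, over trillions of tokens.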
- 1:05:13 – 1:22:00
Why LLMs may not equal AGI: physical common sense, robotics, and reasoning
Prompted by Yann LeCun’s critiques, Aravind discusses what’s missing: physical reasoning and embodied common sense. He explains why dexterous tasks remain hard and argues that stronger reasoning/planning plus better world models are needed for physical generalization.
- •LeCun’s emphasis: physical common sense as prerequisite for ‘true’ AGI
- •Robotics challenge: data scarcity vs web-scale text; generalization across new objects/materials
- •Training vs inference compute: humans benefit from evolution; robots need heavy training
- •Need for planners/reasoners that can parse scenes and create action plans
- •Video/audio learning plus mental models for unseen scenarios
- 1:22:00 – 1:34:44
Big AI players and the coming differentiation: from chat commodities to agents
Aravind argues most chatbots are converging—benchmarks and similar training lead to similar answers. Differentiation will shift toward richer UI and, more importantly, agentic systems that take actions (book, email, schedule) by integrating tools and personal context.
- •Current reality: limited differentiation between ChatGPT, Gemini, Grok, Anthropic’s Claude, and Meta AI
- •Benchmarks/leaderboards push models toward similar outputs
- •UI upgrades matter (cards, charts, inline shopping/hotels) but aren’t ‘agentic’ by themselves
- •Agentic future: AI that completes tasks end-to-end with integrations (email, calendar, bookings)
- •Product/integration work becomes a major moat vs model quality alone
- 1:34:44 – 1:39:15
How Perplexity works under the hood: multi-model pipelines, speed, and cost dynamics
Nikhil probes Perplexity’s architecture and business tradeoffs. Aravind explains a multi-model workflow per query (rewrite, retrieval/chunking, summarization, suggestions), infrastructure optimizations for tail latency, and why per-query costs and margins are moving targets.
- •Queries use multiple models for distinct subtasks rather than a single monolith
- •Speed focus: tail latency over average latency; streaming as a UX technique
- •Serving efficiency: optimized runtimes, use of different chips (e.g., Cerebras), fast indexing
- •Cost per query declines with model competition/open source, but new features raise costs
- •Subscription economics: deep research is expensive; pricing pressure vs premium tiers
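The "tail latency over average latency" point is easy to see with numbers. The sketch below uses made-up latencies (98 fast requests and 2 slow ones) and a nearest-rank percentile; none of these figures come from Perplexity.

```python
def percentile(values, p):
    """Nearest-rank percentile for an integer p between 1 and 100."""
    ordered = sorted(values)
    # Integer ceiling of p * n / 100, which avoids floating-point rounding.
    rank = max(1, (p * len(ordered) + 99) // 100)
    return ordered[rank - 1]


# Hypothetical latencies in milliseconds: most requests are fast, a few stall.
latencies_ms = [100] * 98 + [3000] * 2

mean_ms = sum(latencies_ms) / len(latencies_ms)
p50_ms = percentile(latencies_ms, 50)
p99_ms = percentile(latencies_ms, 99)

print(f"mean={mean_ms} ms, p50={p50_ms} ms, p99={p99_ms} ms")
```

The mean and median look healthy while the 99th percentile exposes the multi-second stalls. With several models chained per query, one slow stage is enough to put a request into that tail, which is one plausible reason to optimize for the tail rather than the average.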
- 1:39:15 – 1:49:42
Platforms, distribution, and ads: Meta vs Google; why ‘the search bar’ still wins
Discussion shifts to strategy and moats: Aravind picks Meta as the best public-market bet due to social network effects and ads in an AI world. They examine Google’s distribution advantages (defaults, Android, Play Store leverage) and why AI search must also enable transactions to truly disrupt Google.
- •Investment pick: Meta due to human-to-human connection moat and strong ad positioning
- •Google’s incentives conflict: AI-native search vs ads business model
- •Distribution lock-in: default search prompts, OEM/telco deals, Android/Play Store leverage
- •AI assistants must close the loop: research → transaction, not just answers
- •Ads differ: personalized ads can increase engagement; search ads can degrade UX
- 1:49:42 – 2:02:47
India’s opportunity stack: voice, models, data centers, and entrepreneurship paths
Aravind argues India should train competitive global models and build the infrastructure (chips/data centers) to avoid dependency, even if outputs converge. For founders without massive resources, he suggests starting with products on open models, then moving into post-training, pretraining, and infrastructure; he flags Indian voice as a near-term wedge.
- •India should build DeepSeek-like model labs competing on global benchmarks
- •Constraint: compute access is harder to democratize than model weights
- •Founder path: product → users → funding → post-training → pretraining → infra
- •Low-hanging wedge: Indian speech recognition/synthesis across accents and dialects
- •Data sovereignty likely drives local inference/data center demand in the long run
- 2:02:47 – 2:05:26
Compute economics and hardware moats: data centers, NVIDIA, and full-stack challengers
Nikhil asks about data center investing, hyperscaler risk, and whether structural shifts could undermine the thesis. Aravind expects commoditization unless paired with strong software layers; he explains NVIDIA’s durability via general-purpose performance, CUDA lock-in, and the difficulty of replicating a full stack—crediting Google as a rare end-to-end alternative.
- •Data centers: valuable, but long-run margins need software (cloud-like tooling, orchestration)
- •Hyperscalers may build in-house unless local constraints create partnership openings
- •NVIDIA moat: flexible general-purpose chips + interconnect/data center expertise + CUDA ecosystem
- •Inference alternatives exist but may be temporary as new GPU generations arrive
- •Google as notable full-stack competitor (TPUs, JAX/XLA, owned data centers)
- 2:05:26 – 2:08:43
Where the next opportunities are: personal software, ‘build your own app’ platforms, and tools to try
Aravind predicts a wave of personalized apps where users generate software tailored to their needs, reducing reliance on one-size-fits-all SaaS. He points Nikhil to tools like Cursor, Replit, and Bolt as early signs, while noting that top-tier engineering still matters—especially infrastructure and reliability.
- •Future tailwind: platforms that let users create/share personal apps securely
- •Enterprise implication: internal tools become easier to generate on demand
- •Open questions: monetization (micropayments vs ads vs subscriptions) and deployment abstractions
- •Tool recommendations: Cursor (AI coding), Replit/Bolt (agentic build + deploy)
- •Coding education shifts: fundamentals (infra, storage, debugging) remain valuable
- 2:08:43 – 2:16:17
Five-year outlook and regulation: assistants everywhere, displacement risk, and app-level guardrails
Aravind forecasts ubiquitous personal assistants—affordable and widely accessible—plus more creative output and easier software creation. He warns about labor displacement and suggests regulation should focus on harmful applications (especially kids/companionship dynamics) rather than trying to regulate model weights directly.
- •Near-term future: widely available personal assistants; creative tools become mainstream
- •Downside: labor displacement and reduced hiring; pressure on outsourcing firms via price/time expectations
- •Open source spreads capability globally, but compute access remains unequal
- •Regulation stance: regulating models is impractical; regulate risky applications instead
- •Concern area: children forming unhealthy relationships with chatbots; keep AI usage productive
- 2:16:17 – 2:16:30
Closing: paywalls, attribution, and Nikhil’s request to intern at Perplexity
They touch on the uncertain future of paywalled data and whether training should require compensation, contrasting human reading with model distillation. The episode ends with Nikhil asking—seriously—to intern at Perplexity to learn firsthand, and Aravind welcoming the idea.
- •Debate: data ownership, paywalls, and whether model training is ‘different’ consumption
- •Perplexity’s approach emphasized: attribution and sourcing vs training on publisher content
- •Nikhil’s desire to learn by immersion rather than commentary from afar
- •Aravind’s view: proximity matters less now; time spent using tools and talking to experts matters more
- •Mutual close: Aravind plans India visit; Nikhil offers to host in India