No Priors Ep. 8 | With Neeva’s Sridhar Ramaswamy

For the first time in decades, web search might be at risk of disruption. Bing is allied with OpenAI to integrate LLMs. Google has committed to launching new products. New startups are emerging. Sridhar Ramaswamy co-founded Neeva, a challenger AI-powered private search platform, in 2019. He is a 16-year Google veteran who most recently led the internet’s most profitable business as SVP in charge of Google Ads, Commerce and Privacy. Sridhar, Elad and Sarah talk about the challenge of building search, how LLMs have changed the landscape, and how chatbots and "answer services" will affect web publishers.

00:00 - Introduction
01:32 - Why Sridhar started a private search engine after leaving Google
11:11 - Information Retrieval Problems, Mapping Search Queries and LLMs
15:25 - Google and Bing’s approach to search with LLMs
19:06 - Scale challenges when building a search engine startup
22:26 - Distribution challenges and why they released Neeva Gist
24:11 - Why Neeva is a privacy-centric subscription service
28:25 - The relationship between search and publishers/content creators
30:16 - Sridhar’s predictions on how AI will disrupt current ecosystems

Sarah Guo (host), Sridhar Ramaswamy (guest), Elad Gil (host)

Apr 25, 2023 · 35m

CHAPTERS

0:00 – 1:56
Rethinking search after Google: the Neeva origin story
Sridhar explains why he and cofounder Vivek chose to start Neeva after leaving Google, taking a “back to basics” approach to search. The core premise: an ads-free product could be designed around user value first, and AI later unlocked a step-function improvement in experience.
- Motivation to build a new search engine despite Google’s dominance
- Founding insight: removing ads changes product incentives and UX priorities
- First years focused on private search; learned hard lessons about consumer adoption
- LLMs became the later “aha” that enabled a far better end-user experience
- Belief that one search engine shouldn’t be the single default for the world
1:56 – 3:51
What could be better than Google: privacy, ads, and consumer inertia
They unpack what Neeva initially tried to improve—privacy and an ads-free experience—and why that alone wasn’t enough to force switching. Sridhar describes how difficult it is to get users to take the first step away from entrenched defaults, with notable differences between the US and Europe.
- Privacy + ads-free as early differentiators
- User behavior doesn’t match stated preferences; switching is psychologically hard
- Neeva saw stronger adoption in Europe than the US
- Personalization experiments (preferences/personal data) had mixed results
- Once users try an alternative, they notice shortcomings in the default experience
3:51 – 7:02
From featured snippets to LLM answers: why summaries win
Sridhar connects today’s AI summaries to earlier “answer-first” wins like Featured Snippets and integrated vertical results (images, local). The big shift is that LLMs make high-coverage, query-relevant summaries feasible at scale—something older approaches couldn’t reliably deliver.
- History: Featured Snippets and integrated results (images/local) beat standalone destinations
- Users prefer minimized effort; answers beat link lists (Occam’s razor framing)
- Traditional search relies on opaque links and linear scanning
- Old snippet tech had limited coverage and created publisher tension
- LLMs enable page understanding and query-conditioned summarization far more broadly
7:02 – 10:21
Building cited summaries at scale: in-house models and RAG foundations
Neeva’s product leap came from generating “cited summaries” that provide a coherent answer while grounding claims in sources. Sridhar details why they built key capabilities in-house rather than relying fully on third-party APIs, and how retrieval-augmented generation underpins the system.
- Cited summaries: a single fluid answer with citations for trust and navigation
- Need to summarize the right section of a page depending on the query
- Decision to avoid being beholden to external APIs for massive-scale summarization
- Goal: authoritative answers for ~50–70% of information-seeking queries
- RAG as the early template for combining retrieval with LLM generation
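The retrieve-then-generate pattern behind cited summaries can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not Neeva's implementation: `Passage`, `retrieve`, `build_prompt`, and `cited_summary` are hypothetical names, retrieval is plain word overlap rather than a real search index, and `generate` stands in for whatever language model call is available.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Passage:
    url: str   # where the text came from (used for the citation)
    text: str  # the section of the page relevant to the query


def tokenize(text: str) -> set[str]:
    # Lowercase words with basic punctuation stripped; empty strings dropped.
    words = (w.strip(".,?!").lower() for w in text.split())
    return {w for w in words if w}


def retrieve(query: str, corpus: list[Passage], k: int = 2) -> list[Passage]:
    # Assumption: word overlap stands in for a real ranking function.
    q = tokenize(query)
    scored = [(len(q & tokenize(p.text)), p) for p in corpus]
    scored = [sp for sp in scored if sp[0] > 0]
    scored.sort(key=lambda sp: sp[0], reverse=True)
    return [p for _, p in scored[:k]]


def build_prompt(query: str, passages: list[Passage]) -> str:
    # Number the sources so the model can cite them as [1], [2], ...
    sources = "\n".join(f"[{i}] {p.text}" for i, p in enumerate(passages, 1))
    return (
        "Answer the question using only the numbered sources, "
        "citing them like [1].\n\n"
        f"Sources:\n{sources}\n\nQuestion: {query}\nAnswer:"
    )


def cited_summary(
    query: str, corpus: list[Passage], generate: Callable[[str], str]
) -> tuple[str, dict[int, str]]:
    # `generate` is a hypothetical stand-in for an LLM call.
    passages = retrieve(query, corpus)
    answer = generate(build_prompt(query, passages))
    citations = {i: p.url for i, p in enumerate(passages, 1)}
    return answer, citations
```

Returning the citation map alongside the answer is what lets an interface turn each `[n]` marker into a link back to its source, which is the trust and navigation role the bullets describe.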
10:21 – 11:50
Beyond summaries: structured extraction, 1-boxes, and tool-using search
The conversation shifts to how LLMs can reduce bespoke engineering for triggers, parsing, and structured extraction. They discuss replacing brittle regex/scraping (“Beautiful Soup nightmares”) with model-driven approaches and broadening search into a tool-using interface (APIs, interpreters).
- 1Box triggers used to require lots of custom rules; LLMs can simplify query interpretation
- LLMs can extract structure from both queries and web pages more generally
- Replacing hard-coded vertical logic (e.g., weather phrasing) with model-based extraction
- Tool use vision: route arithmetic to a Python interpreter, call APIs as needed
- Search vs chatbot boundaries blur as retrieval + tools become standard
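As a rough illustration of the tool-routing idea, the sketch below sends plain arithmetic to a small, restricted evaluator and everything else to search. The names `eval_arithmetic` and `route` are hypothetical, and one simplification is worth flagging: in the episode's framing a model decides which tool to call, whereas here a parse attempt stands in for that decision.

```python
import ast
import operator

# Only these operators are allowed; anything else is treated as "not arithmetic".
_OPS = {
    ast.Add: operator.add,
    ast.Sub: operator.sub,
    ast.Mult: operator.mul,
    ast.Div: operator.truediv,
    ast.USub: operator.neg,
}


def eval_arithmetic(expr: str):
    """Evaluate +, -, *, / over numeric literals without calling eval()."""

    def ev(node):
        if isinstance(node, ast.Expression):
            return ev(node.body)
        if (
            isinstance(node, ast.Constant)
            and isinstance(node.value, (int, float))
            and not isinstance(node.value, bool)
        ):
            return node.value
        if isinstance(node, ast.BinOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.left), ev(node.right))
        if isinstance(node, ast.UnaryOp) and type(node.op) in _OPS:
            return _OPS[type(node.op)](ev(node.operand))
        raise ValueError("not plain arithmetic")

    return ev(ast.parse(expr, mode="eval"))


def route(query: str):
    # Assumption: a failed parse means "send it to search"; a real system
    # would let a model choose among many tools (APIs, interpreters, search).
    try:
        return ("calculator", eval_arithmetic(query))
    except (ValueError, SyntaxError, ZeroDivisionError):
        return ("search", query)
```

For example, `route("12 * (3 + 4)")` is handled by the calculator, while `route("weather in Paris")` falls through to search.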
11:50 – 14:03
Tail queries and expectations: what LLMs fix—and what stays hard
Sridhar argues that many tail queries have historically been poorly served by classic search, and chat interfaces raise user expectations. Not every question becomes instantly answerable, but a much larger share can be handled meaningfully when LLMs are integrated as one of several tools.
- Tail queries have long been weak in traditional search quality
- Chatbots encourage people to ask more natural, vague questions
- Not all questions are “usefully answerable,” even with LLMs
- Hard questions remain hard; progress comes from combining multiple methods
- User expectations for direct answers will continue to rise
14:03 – 18:44
The economics of LLM search: cost per query, model sizing, and margins
Elad probes whether LLMs are cost-prohibitive for search at scale, especially relative to ad revenue. Sridhar frames the unit economics (cents per call can be enormous at search volumes) but expects rapid cost declines and increasing use of smaller, fine-tuned models across tasks.
- LLM inference can imply very high effective CPM/RPM costs at scale
- OpenAI API cost reductions illustrate rapid early cost compression
- Multiple model calls per query can compound costs
- Strategy: use smaller (5–10B) fine-tuned models for many tasks; run multiple specialized models
- Aggressive distribution deals (even 100% rev share) can be rational to gain a beachhead
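The unit-economics point can be made concrete with a little arithmetic: a per-call cost measured in cents, multiplied by the number of model calls per query, scales to dollars per thousand queries, which is the same unit as CPM/RPM. The function names and the numbers in the usage note are illustrative assumptions, not figures from the episode.

```python
def llm_cost_per_thousand_queries(cents_per_call: float, calls_per_query: int) -> float:
    """Return LLM inference cost in US dollars per 1,000 queries."""
    cents_per_query = cents_per_call * calls_per_query
    return cents_per_query * 1000 / 100  # 1,000 queries, 100 cents per dollar


def margin_per_thousand_queries(
    revenue_usd_per_thousand: float, cents_per_call: float, calls_per_query: int
) -> float:
    """Return revenue minus LLM cost, both per 1,000 queries, in US dollars."""
    cost = llm_cost_per_thousand_queries(cents_per_call, calls_per_query)
    return revenue_usd_per_thousand - cost
```

With hypothetical inputs of 1 cent per call and 3 calls per query, `llm_cost_per_thousand_queries(1, 3)` comes to 30 dollars per thousand queries, so any revenue per thousand queries below that figure would leave a negative margin. This is why cheaper, smaller models and fewer calls per query matter so much at search volumes.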
18:44 – 22:16
Why search startups are brutally hard: crawling, indexing, and team psychology
Sridhar describes search as a scale-dependent product: you need major infrastructure just to be “vaguely competitive.” He highlights the engineering hurdles (crawl, index, iteration speed), the shortcuts enabled by LLMs, and how launching “Answers” energized the team by making improvement obvious to users.
- Search requires significant scale up front unlike many software products
- Crawling and indexing are expensive, foundational challenges
- Engineering innovations: flash-based architecture, rapid index replacement, faster experimentation
- LLMs substitute for some data/scale advantages (query rewriting, structured extraction)
- Answers changed team morale: users can immediately see superiority vs link lists
22:16 – 24:07
Distribution strategy and Neeva Gist: experiments, partnerships, and embedded search
They turn to distribution as the primary bottleneck: habits are sticky and incumbents spend heavily on default placement. Sridhar explains Gist as a deliberate UX experiment and discusses embedding Neeva-like conversational search into publisher sites and partnering with privacy-centric products.
- Distribution is the top worry; superior product helps but isn’t sufficient alone
- Neeva Gist: “search like Instagram Stories” as a new consumption format
- Exploring conversational search for publishers (e.g., Reddit-like or local publishers)
- Partnerships with privacy-oriented products (e.g., Dashlane; discussions with Proton)
- Uncertainty creates openings for new integrations and go-to-market paths
24:07 – 28:19
Subscription vs ads: incentives, quality, and the risks of ad overload
Sridhar explains why Neeva chose a privacy-centric subscription model to differentiate and align incentives. They discuss when ads work, how ad systems can degrade UX over time, and why “answers-first” products may find ads harder to integrate without undermining trust.
- Subscription gave Neeva differentiation and clearer incentive alignment
- Ads can scale massively but create self-destructive pressure to show more ads
- Google historically protected some quality bars; Amazon cited as an ad-heavy counterexample
- Ads are harder to integrate cleanly into a pure “answer” interface
- Hybrid models are possible, though late-stage ad adoption can face scrutiny
28:19 – 30:42
Search and publishers in an answer-first world: traffic, control, and consolidation
Sarah raises the existential question: if summaries reduce clicks, what happens to publishers’ incentives to create content? Sridhar predicts large platforms will demand tighter control over how their content is used in LLM answers, while smaller publishers may struggle—potentially driving consolidation and new content platforms.
- Big unknown: how answer UIs reshape incentives and traffic flows
- Large publishers/platforms may allow indexing but restrict LLM training/answers usage
- Big players may build their own chat experiences over their content
- Smaller blogs may struggle to monetize if users stop reading full pages
- Potential outcome: more centralization and new formats for interactive content consumption
30:42 – 33:58
What AI disrupts next: advertising, tool-using agents, and action transformers
Looking beyond search, Sridhar highlights advertising as a major near-term disruption due to optimization loops and personalization, especially multimodal. He’s most excited about agentic systems (“action transformers”) that combine LLMs with tools to perform tasks—though he notes the technology is still nascent.
- Advertising likely to be heavily disrupted via personalized, multimodal generation
- Incumbents (e.g., Microsoft) are reacting unusually fast compared to past tech cycles
- SaaS disruption may be harder because incumbents can incorporate features quickly
- Agentic/tool-using systems: combine LLMs with search, APIs, programs, and websites
- Potential applications: AI SRE, code review, RPA-like workflows, still early-stage
33:58 – 35:35
Closing reflections: democratization and the ‘WhatsApp moment’ for LLM apps
In closing, Sridhar frames this as a rare, democratizing platform shift where tiny teams can build world-scale products by combining standard infrastructure with language models. The hosts wrap with final thoughts and thanks.
- LLMs are powerful and rapidly democratizing
- Analogy: WhatsApp showed how small teams could reach the world on mobile
- Expectation: small teams will create surprising new LLM-native applications soon
- Optimism about transforming search via Neeva’s direction
- Podcast wrap-up and acknowledgements