At a glance
WHAT IT’S REALLY ABOUT
Pinecone CEO Explains Vector Databases, RAG, And Scalable AI Memory
- Pinecone CEO Edo Liberty discusses how vector databases provide long-term, scalable “memory” for AI systems by storing and retrieving embeddings rather than raw text or keywords.
- He explains why Retrieval-Augmented Generation (RAG) dramatically reduces hallucinations, equalizes performance across LLMs, and enables secure use of proprietary enterprise data without fine-tuning models.
- Liberty details Pinecone’s serverless architecture, the Canopy RAG framework, and why specialized vector databases outperform bolt-on solutions in traditional databases or keyword search engines at production scale.
- He also shares views on open source vs. managed services, the limits of large context windows, privacy benefits of RAG, and a future where reasoning and knowledge are separated into more efficient AI subsystems.
IDEAS WORTH REMEMBERING
5 ideas
Use vector databases and RAG to make LLMs more accurate and consistent.
Retrieving relevant embeddings from a large corpus and injecting them into the prompt can cut hallucinations by up to ~50% and makes different LLMs behave more similarly in terms of factual accuracy.
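The retrieval-then-inject flow described above can be sketched in a few lines. This is a toy illustration, not Pinecone's API: `embed()` is a hypothetical stand-in for a real embedding model, and the corpus and document names are invented for the example.

```python
# Minimal sketch of the RAG retrieval step: embed the query, rank stored
# embeddings by cosine similarity, and inject the top matches into the prompt.
import math

def embed(text):
    # Hypothetical toy embedding: a normalized character-frequency vector.
    # Real systems use a learned model producing dense vectors of ~768+ dims.
    vec = [0.0] * 26
    for ch in text.lower():
        if ch.isalpha():
            vec[ord(ch) - ord("a")] += 1.0
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def cosine(a, b):
    # Vectors are unit-norm, so the dot product is the cosine similarity.
    return sum(x * y for x, y in zip(a, b))

corpus = {
    "doc1": "Pinecone stores embeddings for retrieval.",
    "doc2": "RAG injects retrieved text into the prompt.",
    "doc3": "Serverless architecture removes capacity planning.",
}
index = {doc_id: embed(text) for doc_id, text in corpus.items()}

def retrieve(query, k=2):
    # Rank all stored documents by similarity to the query embedding.
    q = embed(query)
    ranked = sorted(index, key=lambda d: cosine(q, index[d]), reverse=True)
    return ranked[:k]

def build_prompt(query):
    # Injecting retrieved text grounds the LLM's answer in the corpus,
    # which is the mechanism behind the hallucination reduction.
    context = "\n".join(corpus[d] for d in retrieve(query))
    return f"Context:\n{context}\n\nQuestion: {query}"
```

The key property is that the model itself stays frozen; only the prompt changes per query, which is why different LLMs converge in factual accuracy once they share the same retrieved context.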
Store proprietary data in a vector database rather than fine-tuning on it.
By keeping models frozen and only using embeddings for retrieval at inference time, companies avoid data leakage into the model weights, stay GDPR-compliant, and can simply delete vectors to ‘forget’ information.
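The "delete to forget" property follows directly from keeping data out of the model weights. A minimal sketch of the idea, assuming a simple in-memory store (the class and method names here are illustrative, not Pinecone's actual API):

```python
# Sketch of deletion-based "forgetting": proprietary data lives only in the
# vector store, never in model weights, so removing a record removes it from
# every future retrieval.
class VectorStore:
    def __init__(self):
        self._records = {}

    def upsert(self, record_id, vector, metadata):
        # Insert or overwrite a vector plus its associated metadata.
        self._records[record_id] = (vector, metadata)

    def delete(self, record_id):
        # GDPR-style erasure: because the frozen model was never trained on
        # this data, deleting the vector is sufficient to "forget" it.
        self._records.pop(record_id, None)

    def query(self, vector, top_k=1):
        # Rank stored records by dot product (fine for unit-norm vectors)
        # and return the ids of the top_k closest matches.
        scored = sorted(
            self._records.items(),
            key=lambda kv: sum(a * b for a, b in zip(vector, kv[1][0])),
            reverse=True,
        )
        return [rid for rid, _ in scored[:top_k]]
```

Contrast this with fine-tuning, where removing one customer's data would require retraining the model from a checkpoint that never saw it.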
Don’t rely on bolt-on vector features in general-purpose databases for production scale.
Retrofitted options such as pgvector or legacy keyword-search systems can work for small experiments, but they typically fall short on cost, latency, and scalability once production workloads reach hundreds of millions or billions of vectors.
Adopt serverless vector infrastructure to remove scaling and cost bottlenecks.
Pinecone’s serverless model is designed for near-unlimited scale, high efficiency, and easy operation, supporting tens of billions of vectors without complex capacity planning or cluster management.
Use hybrid search pragmatically, but design for embeddings-first search.
While Pinecone’s hybrid mode (dense + sparse/keyword vectors) helps boost and control relevance today, Liberty expects keyword-heavy search to fade as embedding quality and retrieval techniques improve.
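A common way to think about hybrid relevance is a convex combination of a dense (embedding) score and a sparse (keyword) score. The weighting parameter and scoring shape below are assumptions for illustration, not a description of Pinecone's internal ranking:

```python
# Hedged sketch of hybrid scoring: blend a dense (embedding-similarity) score
# with a sparse (keyword/BM25-style) score using a single weight, alpha.
def hybrid_score(dense_score, sparse_score, alpha=0.8):
    # alpha=1.0 -> pure embedding search; alpha=0.0 -> pure keyword search.
    # "Embeddings-first" design corresponds to pushing alpha toward 1.0
    # as embedding quality improves.
    return alpha * dense_score + (1.0 - alpha) * sparse_score

def rank_hybrid(candidates, alpha=0.8):
    # candidates: {doc_id: (dense_score, sparse_score)}
    return sorted(
        candidates,
        key=lambda d: hybrid_score(*candidates[d], alpha=alpha),
        reverse=True,
    )
```

Exposing a single knob like `alpha` is what makes keyword matching a boost-and-control mechanism today, and lets it be dialed down as retrieval quality shifts toward embeddings.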
WORDS WORTH SAVING
5 quotes
Models are mathematical objects… they don’t save the pixels or the words, they save a numeric representation called an embedding or a vector.
— Edo Liberty
If you augment all of them with RAG… you can reduce hallucinations significantly, up to 50% sometimes.
— Edo Liberty
Keyword search is a deeply flawed retrieval system.
— Edo Liberty
One of the main reasons why people attach vector databases to foundational models is it gives you this operational sanity that is almost completely impossible without it.
— Edo Liberty
It hurts the brain to figure out that we take half the internet and cram it into GPU memory. I’m like, ‘Why? This can’t be the right thing to do.’
— Edo Liberty
High quality AI-generated summary created from speaker-labeled transcript.