Pratyush Kumar, Co-founder, Sarvam AI | "Sarvam means everybody: AI should be for everyone." | Ep. 24


Best Place To Build · May 23, 2025 · 1h 4m

Pratyush Kumar (guest)

Key Topics

- AI4Bharat origin and evolution
- Indian-language data scarcity and “culture tokens”
- Code-mixing and Romanized Indian-language input
- Four-layer AI stack: inference, models, orchestration, apps
- Sovereign AI and strategic autonomy
- AI as a public utility akin to UPI
- GPU/compute economics and the rapid “value loop”
- Real deployments: Aadhaar voice workflows, insurance outreach, courts, NITI Aayog analytics
- Ecosystem building: academia–startup–VC integration
- Human–AI dependence and philosophical implications

In this episode of Best Place To Build, Pratyush Kumar, co-founder of Sarvam AI, explores building sovereign AI for India: language, scale, autonomy, and utility access.

Sarvam AI grew out of IIT Madras’ AI4Bharat efforts to build high-quality Indian-language AI using data, compute, and community-driven research.

Kumar frames “sovereign AI” as strategic autonomy: the capability to build and deploy core AI technology domestically without isolation from global collaboration.

He outlines a four-layer full-stack approach—inference, models, orchestration, and applications—arguing that India needs all layers to make AI reliable, low-cost, and scalable.

Sarvam’s work emphasizes India-specific language challenges (low-resource languages, culture tokens, Romanization, and code-mixing) to make AI usable for “everybody.”

He argues AI should become a national-scale utility (like UPI), where per-capita AI usage could become a proxy for national productivity and competitiveness.


Key Takeaways

Indian-language AI requires more than translation; it needs cultural and usage realism.

Kumar highlights not just language tokens, but “culture tokens” from undigitized material and evolving forms like Romanized Hindi and code-mixed text, which must be represented for models to work for everyday Indians.


Sovereign AI is primarily about capability, not isolation.

He defines sovereignty as the ability to build strategic tech “from scratch” domestically while still collaborating globally, giving India leverage and resilience in critical sectors.


Full-stack execution is essential to make AI affordable and reliable at national scale.

Sarvam splits the stack into inference efficiency, model training, orchestration (systems + workflows), and domain applications—because real deployments require low latency, telemetry, reliability, and scalable operations.


Deployments create a fast compounding “value loop” that should remain local.

Unlike slower hardware iteration cycles, AI can improve in months based on usage feedback; Kumar argues India must keep this loop in-country so learnings, data, and economic upside reinforce domestic capability.


AI in India can follow a UPI-like public–private scaling path.

Instead of only US-style big-tech scale or China-style heavy state control, he suggests India can catalyze compute and standards while enabling private innovation on top—making AI a low-cost utility.


Government and high-stakes use cases can pull the ecosystem forward.

Examples include air-gapped, on-prem deployments in UIDAI contexts, court judgment accessibility, and NITI Aayog’s natural-language interface to thousands of government datasets—forcing robustness and trustworthiness.


The biggest bottlenecks are compute, data preparation, and talent—not just algorithms.

He notes model training and data processing are GPU-intensive (“GPU factories”), while frontier know-how remains partially tacit; costs are falling but still limit the number of serious foundation-model builders.


Notable Quotes

“Sarvam in Sanskrit means everybody, everyone, because the intention is that it should be used by everyone.”

Pratyush Kumar

“You should have the ability to build it yourself… happy to collaborate with whoever in the world… but you should have the ability to build it yourself.”

Pratyush Kumar

“I think AI could start looking like [electricity consumption] soon… your per capita consumption of AI is a decent proxy for how advanced or competitive you are as a country.”

Pratyush Kumar

“We see ourselves as a full stack company… we see it as four layers.”

Pratyush Kumar

“In the basement of Aadhaar, we have a set of boxes… GPUs… which contain… models and the orchestration layer… to deal with… calls… when biometric fails.”

Pratyush Kumar

Questions Answered in This Episode

On the ‘four layers’ framing, what metrics do you track to decide where to invest next (inference optimization vs. model capability vs. orchestration reliability vs. app success)?


You mentioned “culture tokens” and undigitized sources—what’s Sarvam’s concrete plan to acquire or digitize this data ethically and at scale across 22 scheduled languages?


How will Sarvam evaluate and publish quality for code-mixing and Romanized input (e.g., Hinglish, Tanglish) compared to pure-script benchmarks?


In sensitive deployments like UIDAI’s air-gapped environment, what are the operational and governance requirements (auditability, update cadence, red-teaming) that differ from cloud AI?


What does “open standards” mean in the sovereign LLM context—APIs, evaluation suites, safety protocols, model weights—and which parts should remain proprietary?


Transcript Preview

Pratyush Kumar

we should build, um, AI for Indian languages, right? Because it was an interesting problem. India is so diverse, very rich culture, and it shows in our languages. So many different scripts, so many different unique ways of speaking, and we said it's a good, uh, challenge, uh, to build AI for Indian languages. You should have the ability to build it yourself, because that gives a very different stance to who we are as a country, right? Um, and I think this is very, very important for strategic, uh, technologies. Uh, and I think it is very true for AI. [upbeat music]

Speaker

Hi, my name is Amrit. We've heard that IIT Madras is the best place to build. So we've come down to the Sudha and Shankar Innovation Hub. We want to meet some people. These are builders. We want to talk to them about their work, and also ask them, "What makes IIT Madras the best place to build?" Hi, and welcome to The Best Place To Build Podcast. This is Amritash. Today I'm sitting with Pratyush, the co-founder and CEO of Sarvam AI. Hi, Pratyush.

Pratyush Kumar

Hi, Amritash.

Speaker

Pratyush, you are, uh... you like to stay away from the limelight, for good reason, so if it's okay, I will run an introduction. Pratyush runs Sarvam, one of India's hottest AI startups. They solve for, uh, AI for India, so Indian languages, Indian use cases, uh, speech to text, text to speech translation, and of course, LLM. Uh, company has raised funding from Peak XV, formerly Sequoia, Khosla Ventures, and Lightspeed. You should know that Khosla Ventures was an early investor in OpenAI also. Uh, recently, it was also announced that, uh, by the Honorable Minister Ashwini Vaishnaw, that Sarvam has been selected by the government of India for- to build India's sovereign LLM, uh, under the India AI mission. So we'll talk a little bit about that. Um, the TLDR version is that Sarvam is one of India's hottest AI companies, uh, and is deeply linked to India's AI objectives, and Pratyush leads Sarvam. Uh, hi, Pratyush.

Pratyush Kumar

Hi, Amritash.

Speaker

Um, that's a very heavy introduction, but I want to say that you were into AI long before this LLM craze began, right? So can you give us a sort of an introduction on how your journey has come to be?

Pratyush Kumar

Uh, certainly. Thanks. Thanks for the invite, and it's great to be sitting here in a Bajaj garage in IIT Madras. Uh, it's definitely a place to build, um, so, so glad to be on the podcast. Um, yeah, my background, um, uh, I studied electrical engineering, um, at, at, uh, IIT Bombay. Uh, I sort of, uh, do believe that electrical engineering is a broad topic that teaches you various things, right? From, you know, understanding linear algebra to, to writing code. Uh, and then went on to do a systems engineering PhD at ETH Zurich, and basically learned more about high-performance computing, uh, reliable computing, and so on. Uh, so not really AI at that point, but then I joined IBM Research. Uh, this was the early, uh, time of deep learning. AlexNet had happened, and so people were aware that you could just make these models larger and get more out of them. But interestingly, the area was taking a turn. Uh, it was going from one where, uh, the algorithms mattered a lot, especially algorithm for computer vision, image, uh, for video, for audio separately, uh, versus it becoming, uh, very compute-driven, where you had to build efficient systems on very large amounts of data, uh, and you did much better. So at IBM Research, I got to see a bit of that and started working in this area, and then decided, uh, that it's so- it's an important area, that it's worth doing fundamental research on. So I joined IIT Madras here, uh, as a faculty member, uh, in the computer science department. Uh, and I met a very willing, uh, colleague in Mitesh Khapra, and we decided to say, "Let's focus on, uh, building AI."
