No Priors Ep. 64 | With Suno CEO and Co-Founder Mikey Shulman

Mikey Shulman, the CEO and co-founder of Suno, can see a future where the Venn diagram of music creators and consumers becomes one big circle. Suno, the AI music generation tool trying to democratize music, has been making waves in the AI community ever since it came out of stealth mode last year. Suno users can make a song, complete with lyrics, just by entering a text prompt such as “koto boom bap lofi intricate beats.” You can hear it in action as Mikey, Sarah, and Elad create a song live in this episode.

Elad, Sarah, and Mikey also talk about how the Suno team took their experience making a transcription tool and applied it to music generation, how the team evaluates aesthetics and taste when there is no standardized test you can give an AI model for music, and why Mikey doesn’t think AI-generated music will affect people’s consumption of human-made music.

Sign up for new podcasts every week. Email feedback to show@no-priors.com

Follow us on Twitter: @NoPriorsPod | @Saranormous | @EladGil | @MikeyShulman

Show Notes:
0:00 Mikey’s background
3:48 Bark and music generation
5:33 Architecture for music generation AI
6:57 Assessing music quality
8:20 Mikey’s music background as an asset
10:02 Challenges in generative music AI
11:30 Business model
14:38 Surprising use cases of Suno
18:43 Creating a song on Suno live
21:44 Ratio of creators to consumers
25:00 The digitization of music
27:20 Mikey’s favorite song on Suno
29:35 Suno is hiring

Sarah Guo (host), Mikey Shulman (guest), Elad Gil (host)

May 15, 2024 | 30m

WHAT IT’S REALLY ABOUT

Suno’s CEO on Democratizing Music Creation With AI-Generated Songs

Suno CEO and co-founder Mikey Shulman discusses how Suno uses transformer-based AI models to generate complete songs—including lyrics, vocals, and instrumentation—from simple text prompts, with the goal of making music creation accessible to everyone. He explains why the team chose to focus on music rather than speech, emphasizing that quality is ultimately judged by human emotion and aesthetics, not standard AI benchmarks. The conversation covers Suno’s technical approach to tokenizing audio, emerging user behaviors around collaborative creation, and the potential impact on how people create, share, and experience music. Shulman predicts that AI tools will expand participation in music, accelerate cultural evolution in sound and song structure, and blur the line between creators and consumers.

IDEAS WORTH REMEMBERING

5 ideas

Aesthetics and emotional impact matter more than traditional AI benchmarks in music.

Unlike text models that optimize for test scores or factual accuracy, Suno evaluates success by how music feels to listeners, relying heavily on human listening, A/B tests, and taste rather than standardized metrics.

Transformers work well for audio, but smart tokenization is the real innovation lever.

Suno uses transformer architectures familiar from text AI, and focuses its R&D on turning continuous, high-sample-rate audio into discrete tokens in ways that preserve nuance and musicality.

Avoiding hard-coded music theory enables more novel and unexpected sounds.

Shulman emphasizes that they deliberately do not encode rules like “12 tones” or fixed instrument sets, instead letting the model learn structures implicitly via next-token prediction, opening space for new timbres and hybrids.

AI music tools can turn music creation into a mainstream, social activity.

Users aren’t just output-focused; they enjoy the creative process itself, co-writing lyrics, trading prompts, and effectively “jamming” with friends and the model, echoing the joy of live jam sessions.

The line between creator and consumer is likely to blur significantly.

Shulman expects future experiences where listening and modifying songs blend together, making “creation vs. consumption” a less meaningful distinction as people interact more actively with music.

WORDS WORTH SAVING

5 quotes

Speech just needs to be right… and the real creativity was happening in a totally different part of audio, which is music.

— Mikey Shulman

Aesthetics matter… you have to use your ears in order to evaluate things.

— Mikey Shulman

The model shouldn’t know about music theory… If I tell my model, ‘There are only 12 tones,’ my model will only know how to output 12 tones.

— Mikey Shulman

Like a video game, music is fun by yourself and maybe more fun in multiplayer mode.

— Mikey Shulman

The machine doesn’t know that there is even a concept of voice… it’s just all sound.

— Mikey Shulman

Mikey Shulman’s background from physics and quantum computing to AI and music
Technical foundations of Suno’s music models (transformers, audio tokenization, evaluation)
Why Suno focuses on music over speech and the role of aesthetics
User behavior, creativity, and collaborative ‘multiplayer’ music-making with AI
Business model questions and pricing for consumer AI creativity tools
How AI might reshape the music industry, culture, and creation-consumption dynamics
New genres, song structures, and the future of AI-driven music innovation

High-quality AI-generated summary created from a speaker-labeled transcript.
