Aakash Gupta
College Dropout Raised $20M Building AI Tools | Cluely, Roy Lee
At a glance
WHAT IT’S REALLY ABOUT
Cluely’s founder on viral growth, AI overlay UX, and the cheating debate.
- Cluely was built in roughly 10 weeks and uses a “launch broad, go viral, then iterate from usage” approach driven by millions of daily requests and direct customer feedback.
- Roy Lee argues virality on X/LinkedIn is engineered by making content highly digestible and “reactionable,” prioritizing controversy and fast iteration over attribution-heavy funnel optimization.
- The product differentiates through an undetectable-feeling overlay UX plus technical plumbing (custom audio capture, screenshot-at-query, image compression) to deliver real-time assistance from models like GPT‑4.1.
- Cluely is positioning itself as an application-layer winner with traditional SaaS margins, pursuing enterprise contracts (e.g., post-call summaries and coaching) while aiming for a consumer-majority revenue mix.
- Lee frames AI “cheating” as an inevitable shift in what skills matter, and sketches ambitious end-states from AI-native CRM replacement to brain-computer interfaces.
IDEAS WORTH REMEMBERING
5 ideas
Extreme distribution can compress the time to credibility, even if the product is brand-new.
Lee claims Cluely feels “established” mainly because it dominates feeds, despite being ~10 weeks old; that attention pulls in users, investors, and inbound job candidates faster than a traditional go-to-market approach would.
Treat virality as a product surface with its own iteration loop.
Cluely runs frequent brainstorming sessions to reverse-engineer what’s trending, generates many “100M-view potential” concepts, and accepts that winning formats expire quickly—so speed matters more than polishing one play.
On X/LinkedIn, “digestible + reactionable” beats “smart-sounding.”
Lee argues most tech content fails because it’s jargon-heavy; posts should be understandable to a mainstream audience and designed so quote-tweets/reposts naturally create debate or strong opinions.
A seamless overlay UX is a key moat for AI assistants—not prompts.
He downplays system prompts as fast-changing and eventually commoditized, while emphasizing the integrated translucent overlay as the real interface shift beyond chatbots.
Real-time assistants need pragmatic context handling, not “continuous video to the model.”
Instead of streaming long screen recordings, Cluely captures a screenshot at query-time and compresses it to manage token costs/latency, pairing it with captured system + mic audio.
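The screenshot-at-query idea above can be sketched in a few lines. This is a minimal illustration and not Cluely's implementation: the `Frame` type, `downscale`, `build_query_payload`, and the `fake_capture` stand-in are all hypothetical names, nearest-neighbour downscaling plus `zlib` is an assumed substitute for whatever image compression the product actually uses, and the transcript is passed in as a plain string.

```python
import zlib
from dataclasses import dataclass


@dataclass
class Frame:
    width: int
    height: int
    pixels: bytes  # row-major RGB, 3 bytes per pixel


def downscale(frame: Frame, factor: int) -> Frame:
    """Nearest-neighbour downscale: keep every `factor`-th pixel on each axis."""
    new_w = frame.width // factor
    new_h = frame.height // factor
    out = bytearray()
    for y in range(new_h):
        row_start = (y * factor) * frame.width * 3
        for x in range(new_w):
            i = row_start + (x * factor) * 3
            out += frame.pixels[i:i + 3]
    return Frame(new_w, new_h, bytes(out))


def build_query_payload(question, transcript, capture_screen, factor=2):
    """Grab one frame only when the user asks, shrink it, and bundle it
    with the question and the audio transcript captured so far."""
    frame = capture_screen()  # hypothetical capture hook, called once per query
    small = downscale(frame, factor)
    return {
        "question": question,
        "transcript": transcript,
        "image_size": (small.width, small.height),
        "image_bytes": zlib.compress(small.pixels),
    }


def fake_capture() -> Frame:
    # Hypothetical stand-in for a real screen grab: a flat grey 64x48 image.
    w, h = 64, 48
    return Frame(w, h, bytes([128]) * (w * h * 3))


payload = build_query_payload("What should I say next?", "(transcript so far)", fake_capture)
print(payload["image_size"], len(payload["image_bytes"]))
```

The design point is that the capture hook runs once per question rather than continuously, so the amount of image data sent to the model scales with the number of queries and not with the length of the session.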
WORDS WORTH SAVING
5 quotes
I think the single most defining character feature of me from, like, literally the second I gained consciousness was, um, provocative.
— Roy Lee
I need to channel this into, like, the biggest, boldest play ever, and that's, like, building a company, bro.
— Roy Lee
Meaning right now is a unique moment in human history where if you make a post or say something that deserves to be seen by millions, you will be seen by millions.
— Roy Lee
The models will get better. Like, literally, the model you use today is the worst of mo- is the worst model you will ever use for the rest of your life.
— Roy Lee
If I expect you to be a part of the cult religion of Cluely, then I should be rewarding you as if, if you are a valid contributing member of the cult.
— Roy Lee
High quality AI-generated summary created from speaker-labeled transcript.