YC Root Access
Artie: Real Time Data Streaming For The AI Age
CHAPTERS
What Artie is and the new $12M Series A
Jared opens by introducing Artie and the founders, Robin Tang and Jacqueline Cheong, alongside the announcement of their Series A. Jacqueline explains Artie as a real-time data streaming platform that moves production data changes into warehouses with low latency.
The pain that sparked Artie: “fresh data” and failed in-house builds
Robin describes repeatedly hitting the same bottleneck across roles: wanting fresher data for experimentation and operational use cases, but being blocked by resourcing and complexity. He explains why buying existing tools didn’t fit production DB needs and why building internally often fails or drags on for years.
Where Robin saw the problem firsthand: Zendesk and Opendoor
Robin cites Zendesk and Opendoor as the environments where CDC (change data capture) and warehouse integration challenges were most acute. Even with tools like Maxwell, integrating and operating CDC reliably was still too hard, which convinced him the problem was not unique to one company.
Building Artie: timeline, YC-era scrappiness, and becoming truly self-serve
The founders compare the long internal build attempts they’d seen to Artie’s initial build speed, noting it still took months due to complexity. They share how early onboarding was manual behind the scenes during YC, and how it took ~10 months to make the product genuinely self-serve.
Selling mission-critical infra: landing Substack as the first big bet
Jacqueline explains how Substack became Artie’s first major customer with a demanding production need: extract from a massive Postgres DB with minimal load and land in Snowflake with low latency. Substack’s proof-of-concept validated Artie under tight constraints and huge row counts, making “buy vs build” compelling.
Enterprise GTM reality: cold email win, then a long gap to the next ‘Substack’
The founders reveal Substack came from a cold email, with an unusually fast response and immediate push to trial the product. They also discuss why growth is lumpy in mission-critical infra: after the first big logo, it took roughly nine months to land another at similar scale.
Tiny team, big throughput: reaching $1M ARR with four people
Jared highlights how much Artie built and operated with an unusually small team, even while processing billions of rows. The founders attribute this to disciplined hiring, founder-led sales, and tight feedback loops between customer conversations and engineering execution.
Married co-founders: decision process, conflict style, and boundary tradeoffs
Robin and Jacqueline discuss the unusual dynamic of being both spouses and co-founders, including initial skepticism and how they decided it could work. They emphasize faster feedback cycles and low-filter communication, while admitting work/life boundaries are minimal at this stage.
Why real-time CDC is so hard: edge cases, backfills, sharding, and data messiness
The founders go deep on the technical and operational realities: production CDC is a compounding set of edge cases across databases and customer architectures. They describe online backfills while streaming, massive table backfills, performance optimizations, and how each customer’s “messy data” breaks assumptions differently.
Owning the entire reliability stack: the Kafka SDK ordering bug story
Robin shares a prolonged debugging saga in which an upstream Kafka client/SDK bug caused out-of-order reads after consumer rebalances under load, corrupting destination data. The key lesson: customers don’t care where the bug lives; Artie must own the end-to-end outcome and build guardrails.
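As an illustration of the kind of guardrail this chapter alludes to, here is a minimal sketch of a per-partition offset check. It is not Artie's implementation, and the names `OffsetGuard` and `check` are hypothetical. It assumes only that offsets within a single Kafka partition must strictly increase, so a record whose offset is not greater than the last applied one is flagged instead of being written to the destination.

```python
from dataclasses import dataclass, field
from typing import Dict, Tuple


@dataclass
class OffsetGuard:
    """Hypothetical guardrail: tracks the highest offset applied per partition."""

    # Maps (topic, partition) -> highest offset applied so far.
    last_applied: Dict[Tuple[str, int], int] = field(default_factory=dict)

    def check(self, topic: str, partition: int, offset: int) -> str:
        """Return "apply" if the record is in order, else "out_of_order".

        Gaps between offsets are allowed (compaction and transactions can
        produce them); only a non-increasing offset is treated as a problem.
        """
        key = (topic, partition)
        last = self.last_applied.get(key)
        if last is not None and offset <= last:
            # A caller might halt the pipeline or alert here rather than write.
            return "out_of_order"
        self.last_applied[key] = offset
        return "apply"


guard = OffsetGuard()
results = [
    guard.check("orders", 0, 5),
    guard.check("orders", 0, 7),
    guard.check("orders", 0, 6),  # arrives after 7 on the same partition
    guard.check("orders", 1, 6),  # different partition, tracked separately
]
```

In this sketch the third record is flagged because offset 6 arrives after offset 7 on the same partition, while the fourth passes because each partition is tracked independently. What to do with a flagged record (halt, alert, re-read) is a design choice the summary does not specify.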
Artie today: 700B+ rows processed and scaling for the AI/agentic era
Jacqueline summarizes current scale and momentum: over 700 billion rows processed in the last 12 months, up ~12x year-over-year. They connect increasing demand for real-time data to emerging AI workloads and outline a focus on scaling reliability from billions to trillions of rows.
Team expansion plan: tripling headcount and the roles they need most
With the Series A, Artie plans to accelerate hiring significantly. Jacqueline lists priority functions across engineering and go-to-market, emphasizing that an in-house recruiter is critical to hitting ambitious hiring goals.
Founder advice and the 2026 roadmap: beyond databases into events and new destinations
Jacqueline advises founders to act and iterate rather than overthink, aligning with YC’s bias toward action. The founders close by outlining product expansion: starting from database CDC, launching an Events API with ~1–200ms queryability in warehouses, adding more sources/destinations, and exploring real-time search indexing use cases.