{"podcast":{"title":"Open Source Startup Podcast","slug":"open-source-startup-podcast","podcast_index_feed_id":3501865,"rss_url":"https://anchor.fm/s/3eab794c/podcast/rss","website_url":"https://oss-startup-podcast.launchnotes.io","image_url":"https://d3t3ozftmdmh3i.cloudfront.net/staging/podcast_uploaded_nologo/10414251/10414251-1718504092058-1eb78ce29b28a.jpg","author":"Robby (MTF); Tim (Essence VC)","episode_count":194,"summary":"The leading podcast on how to build a successful open source company. Learn from the founders of HashiCorp, Chronosphere, Vercel, MongoDB, DBT, mobile.dev and more!","last_synced_at":null,"page_url":"https://stenobird.com/podcast/open-source-startup-podcast"},"episode":{"title":"E186: Unlocking Your Unstructured Data with Typedef","slug":"e186-unlocking-your-unstructured-data-with-typedef","published_at":"2025-11-20T20:05:09+00:00","page_url":"https://stenobird.com/podcast/open-source-startup-podcast/e186-unlocking-your-unstructured-data-with-typedef","show_page_url":"https://stenobird.com/podcast/open-source-startup-podcast","url":"https://podcasters.spotify.com/pod/show/ossstartuppodcast/episodes/E186-Unlocking-Your-Unstructured-Data-with-Typedef-e3b8edk","audio_url":"https://anchor.fm/s/3eab794c/podcast/play/111474548/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2025-10-20%2Fcc7459d9-c988-70ba-3cd1-0a0574f350d5.mp3","summary":"Traditional data pipelines are too brittle for the non-deterministic nature of LLM workloads. 
Typedef introduces Fenic, an open-source engine designed to handle unstructured data and agentic workflows through semantic operations.","meta_description":"Learn how Typedef is building Fenic, an open-source DataFrame library designed to solve the brittleness of data pipelines in the age of AI and LLMs.","key_points":["Main idea: Traditional engines like Spark are optimized for structured data, whereas modern AI workloads require I/O-heavy processing of unstructured data","Practical takeaway: Use semantic operators and agentic loops for entity resolution when deterministic rules fail","Failure mode: Relying solely on deterministic pipelines for LLM inference leads to brittle systems that cannot handle noise or evolving data","Strategic insight: Early GTM and continuous customer validation are essential for navigating the rapidly changing AI infrastructure landscape","Future trend: AI agents are becoming the new SaaS, with companies purchasing domain-specific agentic end-to-end solutions"],"chapters":[{"start_ms":60000,"title":"The Evolution of Data Infrastructure","summary":"The founders discuss their backgrounds at Starburst and Tecton and how the shift from the Trino/Spark era to AI-native infra is happening."},{"start_ms":420000,"title":"Addressing I/O Bottlenecks in LLM Inference","summary":"A deep dive into why LLM workloads are I/O heavy and why existing technologies struggle with the specific demands of inference."},{"start_ms":615000,"title":"Introducing Fenic and Agentic Workflows","summary":"How Fenic simplifies multi-step inference workflows and reduces the operational complexity for data practitioners."},{"start_ms":800000,"title":"Enabling Agents to Interact with Data","summary":"Exploring the integration of semantic operators and tools that allow LLM agents to interact directly with data pipelines."},{"start_ms":1375000,"title":"The Challenge of Developer Adoption","summary":"Why new frameworks must maintain compatibility with established 
languages like SQL and JavaScript to avoid high learning curves."},{"start_ms":1755000,"title":"GTM Strategies for AI Infrastructure","summary":"Advice for founders on using inbound and outbound conversations to drive product development and market validation."},{"start_ms":2335000,"title":"The Rise of Agentic SaaS and Benchmarking","summary":"A discussion on the 'spicy take' that AI agents are the new SaaS and the growing obsession with benchmarks in AI marketing."}],"topics":["Data Infrastructure","LLM Inference","Open Source","Unstructured Data","AI Agents","Data Pipelines","Machine Learning Operations","Software Engineering"],"duration_seconds":2525,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/open-source-startup-podcast/episodes/e186-unlocking-your-unstructured-data-with-typedef/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/open-source-startup-podcast/e186-unlocking-your-unstructured-data-with-typedef.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}