# Spark, AI, and the Future of Data Engineering with Daniel Aronovich

- Page: https://stenobird.com/podcast/data-engineering-central-podcast-7106217/spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich
- Text version: https://stenobird.com/podcast/data-engineering-central-podcast-7106217/spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich.md
- Podcast: [Data Engineering Central Podcast](https://stenobird.com/podcast/data-engineering-central-podcast-7106217)
- Published: 2026-03-24T21:30:13+00:00
- Episode link: https://dataengineeringcentral.substack.com/p/spark-ai-and-the-future-of-data-engineering
- Audio file: https://api.substack.com/feed/podcast/190946157/3623894b48082eae82c0e7edea11f401.mp3
- Processing state: not_requested
- JSON: https://stenobird.com/v1/public/podcasts/data-engineering-central-podcast-7106217/episodes/spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich
- Duration seconds: 2791

## Resource

In this episode of Data Engineering Central, I sit down with the founder of DataFlint, Daniel Aronovich, to talk about the realities of working with Apache Spark, distributed data systems, and the future of data engineering. We start with his early journey into tech: how he first discovered large-scale data systems and the lessons he learned from working with real-world Spark workloads.

The conversation then turns toward the future of data engineering, particularly the growing role of AI in software development and data infrastructure. We discuss why generic AI coding assistants often struggle with complex distributed systems, whether AI will eventually be able to automatically optimize data pipelines, and how the role of the data engineer may evolve in the coming years. We covered a lot of career advice for new and upcoming data professionals.

We also discuss the origin of DataFlint, a tool designed to help engineers better understand and optimize Spark workloads by analyzing execution plans, logs, and runtime context.

If you work with Spark, large-scale data pipelines, or modern data platforms, this conversation will give you a deeper look into how the data engineering landscape is evolving.

This is a public episode. If you'd like to discuss this with other subscribers or get access to bonus episodes, visit dataengineeringcentral.substack.com/subscribe

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/data-engineering-central-podcast-7106217/episodes/spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich/transcription-requests`. Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/data-engineering-central-podcast-7106217/spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich.md`. Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
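
As a rough illustration of the two actions listed under Actions, the sketch below builds the corresponding HTTP requests with Python's standard library without sending them. The helper names (`build_request_transcript`, `build_read_markdown`) and the constants are hypothetical and chosen only for this example. The page does not say whether a request body, headers, or authentication are needed, nor what the responses look like, so none of that is assumed here.

```python
from urllib.request import Request

# URL pieces copied from the action definitions on this page.
BASE = "https://stenobird.com"
PODCAST = "data-engineering-central-podcast-7106217"
EPISODE = "spark-ai-and-the-future-of-data-engineering-with-daniel-aronovich"


def build_request_transcript() -> Request:
    """Build (but do not send) the POST for the request_transcript action."""
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    # Assumption: no body or auth header is required; the page does not specify.
    return Request(url, method="POST")


def build_read_markdown() -> Request:
    """Build (but do not send) the GET for the read_markdown action."""
    url = f"{BASE}/podcast/{PODCAST}/{EPISODE}.md"
    return Request(url, method="GET")


if __name__ == "__main__":
    for req in (build_request_transcript(), build_read_markdown()):
        print(req.get_method(), req.full_url)
```

Passing either object to `urllib.request.urlopen` would actually send it. Since the page describes `request_transcript` as idempotent, repeating the POST should be safe, but the response format is not documented here and would need to be inspected.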