# The Junior Data Engineer is Now an AI Agent

Page: https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/the-junior-data-engineer-is-now-an-ai-agent
Text version: https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/the-junior-data-engineer-is-now-an-ai-agent.md
Podcast: [The Data Exchange with Ben Lorica](https://stenobird.com/podcast/the-data-exchange-with-ben-lorica)
Published: 2026-01-08T12:00:00+00:00
Episode link: https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18437537-the-junior-data-engineer-is-now-an-ai-agent.mp3
Audio file: https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18437537-the-junior-data-engineer-is-now-an-ai-agent.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-junior-data-engineer-is-now-an-ai-agent
Duration seconds: 3273

## Resource

AI agents are moving beyond simple chat interfaces to perform complex, stateful data engineering tasks like building and testing pipelines. Matthew Glickman explains how Genesis Computing uses agentic technology to automate critical workflows and capture institutional knowledge.
## Highlights

- Main idea: AI agents are evolving from simple query interfaces into autonomous workers capable of executing multi-step data engineering workflows
- Practical takeaway: Use AI to capture 'ambient knowledge' from senior engineers to prevent institutional memory loss during migrations
- Failure mode: Automating entry-level tasks risks breaking the talent pipeline, as junior engineers need these foundational tasks to develop into seniors
- Strategic lesson: Only build custom data infrastructure if it provides a core competitive advantage; otherwise, leverage specialized third-party experts
- Technical insight: Unlike stateless software engineering, data engineering requires agents that can manage side effects across distributed systems like Spark and Kafka

## Topics

AI Agents, Data Engineering, Data Pipelines, Enterprise AI, Automation, Genesis Computing, Machine Learning Operations, Distributed Systems

## Chapters

- 1:00 — The Genesis of Genesis Computing: Matthew Glickman discusses his transition from Goldman Sachs and Snowflake to addressing the 'wall' enterprises hit when trying to deploy LLMs for data.
- 5:20 — The Difficulty of the Last 10%: A look at why moving from flashy AI demos to deterministic, production-ready data pipelines is incredibly challenging.
- 9:20 — Targeting the Data Engineer Persona: How Genesis Computing focuses specifically on automating the workflows of the data engineering persona rather than general business users.
- 13:20 — Risks of Expanding the Engineering Pool: The implications of using AI to allow non-engineers to perform data engineering tasks and the potential for introducing errors.
- 18:10 — Capturing Institutional Knowledge: Using AI to ambiently acquire knowledge from experts so that pipeline specifics aren't lost when people leave the organization.
- 22:20 — The Burden of Legacy Migrations: Discussing the massive complexity of migrating legacy systems like SAP HANA and Oracle in large enterprises.
- 26:20 — Closing the Loop with Business Users: How agentic workflows allow business users to validate data logic and revenue calculations directly without a human middleman.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-junior-data-engineer-is-now-an-ai-agent/transcription-requests` — Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/the-junior-data-engineer-is-now-an-ai-agent.md` — Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
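For agents that need a transcript, the two endpoints listed under Actions could be called roughly as in the following Python sketch. The URLs and HTTP methods are taken from this page; everything else is an assumption for illustration, including the empty POST body, the absence of authentication headers, and the helper names `read_markdown_request`, `request_transcript_request`, and `send`. The response format is not documented here, so the sketch returns only the status code and raw body.

```python
from typing import Tuple
from urllib.request import Request, urlopen

BASE = "https://stenobird.com"
PODCAST = "the-data-exchange-with-ben-lorica"
EPISODE = "the-junior-data-engineer-is-now-an-ai-agent"


def read_markdown_request() -> Request:
    """Build the GET request for the Markdown representation (read_markdown)."""
    return Request(f"{BASE}/podcast/{PODCAST}/{EPISODE}.md", method="GET")


def request_transcript_request() -> Request:
    """Build the POST request that asks for transcription (request_transcript).

    Assumption: the endpoint accepts an empty body. This page does not
    document a payload, required headers, or authentication.
    """
    url = (
        f"{BASE}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    return Request(url, data=b"", method="POST")


def send(request: Request, timeout: float = 10.0) -> Tuple[int, str]:
    """Send a prepared request and return (status code, decoded body)."""
    with urlopen(request, timeout=timeout) as response:
        body = response.read().decode("utf-8", errors="replace")
        return response.status, body
```

Because a page view does not enqueue transcription, reading the Markdown with `send(read_markdown_request())` has no side effect; an agent would call `send(request_transcript_request())` separately, and since the action is described as idempotent, repeating that call should be safe.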