Episode

The Junior Data Engineer is Now an AI Agent

Podcast: The Data Exchange with Ben Lorica
Published: Jan 8, 2026
Duration seconds: 3273
Processing state: processed
Canonical source: https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18437537-the-junior-data-engineer-is-now-an-ai-agent.mp3
Audio: https://dts.podtrac.com/redirect.mp3/www.buzzsprout.com/682433/episodes/18437537-the-junior-data-engineer-is-now-an-ai-agent.mp3
JSON: /v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-junior-data-engineer-is-now-an-ai-agent
Markdown: /podcast/the-data-exchange-with-ben-lorica/the-junior-data-engineer-is-now-an-ai-agent.md

Actions

POST https://stenobird.com/v1/public/podcasts/the-data-exchange-with-ben-lorica/episodes/the-junior-data-engineer-is-now-an-ai-agent/transcription-requests
Idempotently request low-priority transcript generation for this episode.
GET https://stenobird.com/podcast/the-data-exchange-with-ben-lorica/the-junior-data-engineer-is-now-an-ai-agent.md
Read the agent-friendly Markdown representation of this episode resource.
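
As a minimal sketch, the two actions above could be prepared from Python's standard library as follows. The URLs are the ones listed above; the function names, the empty POST body, and the absence of authentication headers are assumptions, since this page does not document a payload or any credentials.

```python
import urllib.request

BASE_URL = "https://stenobird.com"
PODCAST = "the-data-exchange-with-ben-lorica"
EPISODE = "the-junior-data-engineer-is-now-an-ai-agent"


def transcription_request() -> urllib.request.Request:
    """Build (but do not send) the POST that requests transcript generation."""
    url = (
        f"{BASE_URL}/v1/public/podcasts/{PODCAST}"
        f"/episodes/{EPISODE}/transcription-requests"
    )
    # Assumption: an empty body is sent; no payload is documented on this page.
    return urllib.request.Request(url, data=b"", method="POST")


def markdown_request() -> urllib.request.Request:
    """Build (but do not send) the GET for the Markdown representation."""
    url = f"{BASE_URL}/podcast/{PODCAST}/{EPISODE}.md"
    return urllib.request.Request(url, method="GET")


def send(req: urllib.request.Request, timeout: float = 10.0) -> tuple[int, str]:
    """Send a prepared request and return (status code, decoded body)."""
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return resp.status, resp.read().decode("utf-8", errors="replace")
```

Calling `send(markdown_request())` would fetch the Markdown text. Because the POST is described as idempotent, calling `send(transcription_request())` more than once should be safe, though the response format is not specified here.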

Summary

AI agents are moving beyond simple chat interfaces to perform complex, stateful data engineering tasks like building and testing pipelines. Matthew Glickman explains how Genesis Computing uses agentic technology to automate critical workflows and capture institutional knowledge.

Topics

AI Agents
Data Engineering
Data Pipelines
Enterprise AI
Automation
Genesis Computing
Machine Learning Operations
Distributed Systems

Highlights

Main idea: AI agents are evolving from simple query interfaces into autonomous workers capable of executing multi-step data engineering workflows
Practical takeaway: Use AI to capture 'ambient knowledge' from senior engineers to prevent institutional memory loss during migrations
Failure mode: Automating entry-level tasks risks breaking the talent pipeline, as junior engineers need these foundational tasks to develop into seniors
Strategic lesson: Only build custom data infrastructure if it provides a core competitive advantage; otherwise, leverage specialized third-party experts
Technical insight: Unlike stateless software engineering, data engineering requires agents that can manage side effects across distributed systems like Spark and Kafka

Chapters

1:00 The Genesis of Genesis Computing: Matthew Glickman discusses his transition from Goldman Sachs and Snowflake to addressing the 'wall' enterprises hit when trying to deploy LLMs for data.
5:20 The Difficulty of the Last 10%: Why moving from flashy AI demos to deterministic, production-ready data pipelines is so difficult.
9:20 Targeting the Data Engineer Persona: How Genesis Computing focuses specifically on automating the workflows of the data engineering persona rather than general business users.
13:20 Risks of Expanding the Engineering Pool: The implications of using AI to allow non-engineers to perform data engineering tasks and the potential for introducing errors.
18:10 Capturing Institutional Knowledge: Using AI to ambiently acquire knowledge from experts so that pipeline specifics aren't lost when people leave the organization.
22:20 The Burden of Legacy Migrations: Discussing the massive complexity of migrating legacy systems like SAP HANA and Oracle in large enterprises.
26:20 Closing the Loop with Business Users: How agentic workflows allow business users to validate data logic and revenue calculations directly without a human middleman.