{"podcast":{"title":"Adventures in DevOps","slug":"adventures-in-devops","podcast_index_feed_id":686419,"rss_url":"https://adventuresindevops.com/episodes/rss.xml","website_url":"https://adventuresindevops.com","image_url":"https://d3wo5wojvuv7l.cloudfront.net/t_rss_itunes_square_1400/images.spreaker.com/original/2f474744f84e93eba827bee58d58c1c9.jpg","author":"Adventures in DevOps","episode_count":274,"summary":"Join us in listening to the experienced experts discuss cutting edge challenges in the world of DevOps. From applying the mindset at your company, to career growth and leadership challenges within engineering teams, and avoiding the common antipatterns. Every episode you'll meet a new industry veteran guest with their own unique story.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/adventures-in-devops"},"episode":{"title":"Solving incidents with one-time ephemeral runbooks","slug":"solving-incidents-with-one-time-ephemeral-runbooks","published_at":"2025-10-20T00:00:00+00:00","page_url":"https://stenobird.com/podcast/adventures-in-devops/solving-incidents-with-one-time-ephemeral-runbooks","show_page_url":"https://stenobird.com/podcast/adventures-in-devops","url":"https://adventuresindevops.com/episodes/2025/10/20/solving-incidents-with-one-time-ephemeral-runbooks","audio_url":"https://dts.podtrac.com/redirect.mp3/api.spreaker.com/download/episode/68206117/download.mp3","summary":"Incident.io's Lawrence Jones explains how to move beyond static documentation by using AI-driven RAG to generate ephemeral, context-aware runbooks during outages. 
The discussion covers the transition from Heroku to GCP for high-compliance environments and the technical hurdles of automating incident investigation.","meta_description":"Learn how to build ephemeral runbooks using RAG, service catalogs, and GitHub integrations to automate incident response and reduce downtime.","key_points":["Main idea: Effective incident response in regulated industries like FinTech requires high-rigor processes and automated, verifiable documentation","Technical strategy: Use a knowledge graph that combines service catalogs, CRM data, and GitHub webhooks to power RAG instead of relying solely on unstructured vector embeddings","Practical takeaway: Ephemeral runbooks should be dynamically generated by analyzing recent PR diffs, Slack discussions, and telemetry to surface relevant dashboards instantly","Failure mode: Avoid 'low-confidence' AI hallucinations in incident channels; implement background verification by cloning codebases to validate assumptions before alerting engineers","Lesson learned: The bottleneck in modern DevOps is no longer the LLM's reasoning capability, but the engineering effort required to build structured, modular data pipelines"],"chapters":[{"start_ms":60000,"title":"The High Stakes of FinTech Incidents","summary":"Why regulatory obligations and the cost of downtime in financial services demand much higher incident response discipline than other industries."},{"start_ms":740000,"title":"Scaling Infrastructure for Security","summary":"The transition from Heroku to GCP and Kubernetes to meet the security and compliance needs of enterprise customers."},{"start_ms":1180000,"title":"Automating Investigation with AI","summary":"An introduction to AI SRE and how the system uses a service catalog to guide automated incident investigations."},{"start_ms":1410000,"title":"Building RAG with Knowledge Graphs","summary":"How to use GitHub integrations, webhooks, and universal adapters to feed relevant context into a RAG-based runbook 
generator."},{"start_ms":1870000,"title":"Verifying AI Assumptions","summary":"The importance of using background processes to double-check PR changes and code diffs before presenting findings to responders."},{"start_ms":2100000,"title":"The Future of LLMs in DevOps","summary":"Why the focus is shifting from model intelligence to the engineering of structured, modular systems and objective metrics."}],"topics":["Incident Response","Retrieval-Augmented Generation","Site Reliability Engineering","DevOps Automation","Cloud Infrastructure","Service Catalogs","FinTech Compliance","AI Agents"],"duration_seconds":2999,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/adventures-in-devops/episodes/solving-incidents-with-one-time-ephemeral-runbooks/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/adventures-in-devops/solving-incidents-with-one-time-ephemeral-runbooks.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}