# The Challenges of Data Processing On Kubernetes - A look at Spark, Flink, Dask, and Ray // Holden Karau (DoK Day North America 2022)

- Page: https://stenobird.com/podcast/data-on-kubernetes-community/the-challenges-of-data-processing-on-kubernetes-a-look-at-spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022
- Text version: https://stenobird.com/podcast/data-on-kubernetes-community/the-challenges-of-data-processing-on-kubernetes-a-look-at-spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022.md
- Podcast: [Data on Kubernetes Community](https://stenobird.com/podcast/data-on-kubernetes-community)
- Published: 2022-10-31T17:00:50+00:00
- Episode link: https://podcasters.spotify.com/pod/show/dokcommunity/episodes/The-Challenges-of-Data-Processing-On-Kubernetes---A-look-at-Spark--Flink--Dask--and-Ray--Holden-Karau-DoK-Day-North-America-2022-e1pquki
- Audio file: https://anchor.fm/s/2d649bc8/podcast/play/59652178/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2022-9-27%2F293395006-44100-2-47d8b4b4b9236.m4a
- Processing state: failed
- JSON: https://stenobird.com/v1/public/podcasts/data-on-kubernetes-community/episodes/the-challenges-of-data-processing-on-kubernetes-a-look-at-spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022
- Duration seconds: 1209

## Resource

From the DoK Day North America 2022 (https://youtu.be/YWTa-DiVljY)

ABSTRACT

This talk will go through both the improvements that have been made in Kubernetes for batch analytic workloads as well as some of the current pain experienced by users and developers moving their workloads to Kube. In this talk you will learn about how we “cheated” back in the YARN and Mesos days to make things go fast, why Kubernetes doesn’t like those cheats, and what some alternatives are.

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/data-on-kubernetes-community/episodes/the-challenges-of-data-processing-on-kubernetes-a-look-at-spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022/transcription-requests`. Idempotently requests low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/data-on-kubernetes-community/the-challenges-of-data-processing-on-kubernetes-a-look-at-spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022.md`. Reads the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.
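
As a rough illustration of the two entries under Actions, the sketch below builds (but does not send) the corresponding HTTP requests with Python's standard library. The helper names `build_transcript_request` and `build_markdown_request` are hypothetical, and the sketch assumes the endpoints need no authentication, headers, or request body, since this page does not specify any.

```python
from urllib.request import Request

# URL pieces copied from the Actions section of this page.
BASE = "https://stenobird.com"
PODCAST = "data-on-kubernetes-community"
EPISODE = (
    "the-challenges-of-data-processing-on-kubernetes-a-look-at-"
    "spark-flink-dask-and-ray-holden-karau-dok-day-north-america-2022"
)


def build_transcript_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build the POST for the request_transcript action (hypothetical helper).

    Assumption: no body or auth header is needed; the page does not say.
    """
    url = f"{BASE}/v1/public/podcasts/{podcast}/episodes/{episode}/transcription-requests"
    return Request(url, method="POST")


def build_markdown_request(podcast: str = PODCAST, episode: str = EPISODE) -> Request:
    """Build the GET for the read_markdown action (hypothetical helper)."""
    url = f"{BASE}/podcast/{podcast}/{episode}.md"
    return Request(url, method="GET")


if __name__ == "__main__":
    # Only print what would be sent; nothing goes over the network here.
    for req in (build_transcript_request(), build_markdown_request()):
        print(req.get_method(), req.full_url)
```

To actually perform an action, the returned object could be passed to `urllib.request.urlopen`. Because the page describes `request_transcript` as idempotent, repeating that POST should not enqueue duplicate work, but the response format is not documented here.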