Episode

#534: diskcache: Your secret Python perf weapon

Podcast
Talk Python To Me
Published
Jan 13, 2026
Duration seconds
4440
Canonical source
https://talkpython.fm/episodes/show/534/diskcache-your-secret-python-perf-weapon
Audio
https://talkpython.fm/episodes/download/534/diskcache-your-secret-python-perf-weapon.mp3
JSON
/v1/public/podcasts/talk-python-to-me/episodes/534-diskcache-your-secret-python-perf-weapon
Markdown
/podcast/talk-python-to-me/534-diskcache-your-secret-python-perf-weapon.md

Summary

Stop over-provisioning Redis and start using your existing cloud SSD. Learn how DiskCache leverages SQLite to provide a high-performance, persistent caching layer for Python applications and data science workflows.

Topics

  • Python
  • DiskCache
  • SQLite
  • Caching Strategies
  • Data Science
  • Django
  • Machine Learning
  • Performance Optimization
  • SSD

Highlights

  • Main idea: DiskCache provides a persistent, disk-based caching mechanism that avoids the operational overhead of managing external services like Redis
  • Practical takeaway: Use DiskCache in data science notebooks to memoize expensive computations and avoid redundant processing across sessions
  • Practical takeaway: Integrate DiskCache into Django applications via a dedicated backend to simplify caching configuration
  • Failure mode: A cache that is written more often than it is read costs more than it saves; avoid caching in write-heavy scenarios where entries are rarely read back
  • Main idea: Caching is not just for web traffic; it is a critical tool for managing expensive LLM API calls and heavy machine learning inference
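
The memoization takeaway above can be illustrated with a minimal sketch built only on the standard library (sqlite3 and pickle). It is not DiskCache's implementation or API; the decorator name disk_memoize, the table layout, and the demo function slow_square are all hypothetical. It only shows the general idea of keeping results of an expensive call in a SQLite file so they survive across sessions.

```python
import functools
import os
import pickle
import sqlite3
import tempfile


def disk_memoize(db_path):
    """Return a decorator that caches results in a SQLite file (hypothetical helper)."""
    def decorator(func):
        conn = sqlite3.connect(db_path)
        conn.execute(
            "CREATE TABLE IF NOT EXISTS cache (key BLOB PRIMARY KEY, value BLOB)"
        )
        conn.commit()

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Pickle the call signature to build a lookup key.
            # Assumption: arguments are simple, picklable values.
            key = pickle.dumps((func.__name__, args, sorted(kwargs.items())))
            row = conn.execute(
                "SELECT value FROM cache WHERE key = ?", (key,)
            ).fetchone()
            if row is not None:
                return pickle.loads(row[0])  # cache hit: skip the computation
            result = func(*args, **kwargs)
            conn.execute(
                "INSERT OR REPLACE INTO cache (key, value) VALUES (?, ?)",
                (key, pickle.dumps(result)),
            )
            conn.commit()  # persist so later sessions can reuse the result
            return result

        return wrapper
    return decorator


# Demo: the function body runs once; the second call is served from disk.
CALLS = []
DB_PATH = os.path.join(tempfile.mkdtemp(), "memo.sqlite3")


@disk_memoize(DB_PATH)
def slow_square(n):
    CALLS.append(n)  # stands in for an expensive computation or API call
    return n * n


first = slow_square(12)
second = slow_square(12)
```

Because the results live in a file rather than in process memory, a fresh notebook kernel pointed at the same path would find the earlier results. Only unpickle cache files you trust, since pickle can execute code on load.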

Chapters

  1. 6:25 Caching for Machine Learning: Exploring the necessity of caching expensive computations in machine learning and cloud-based image classification tasks.
  2. 12:15 Web Development and Stale Data: A discussion on the common pitfalls of web development, specifically regarding stale assets and the importance of efficient caching.
  3. 17:55 SQLite and Persistence: Discussing the potential for backing up SQLite and the utility of using it as a foundation for caching layers.
  4. 24:05 The Cost of Write-Heavy Caches: Analyzing why a cache becomes a performance bottleneck if the overhead of writing to the cache outweighs the benefits of reading from it.
  5. 35:00 Advanced Cache Management: Comparing different caching strategies and the ability to selectively clear specific keys, such as YouTube IDs, without wiping the entire cache.
  6. 40:20 Serialization and Python Objects: A look into how DiskCache handles Python object serialization using techniques like pickling.
  7. 57:00 Optimizing Disk Space: Strategies for reducing disk footprint by adjusting numeric precision, such as moving to float16.
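
Chapters 3, 5, and 7 can be tied together in a small standard-library sketch: a SQLite file as the store, deletion of one key without wiping the rest, and float16 packing to shrink stored vectors. The VectorCache class, its methods, and the keys "video-a" and "video-b" are hypothetical and are not taken from the episode or from DiskCache. Half precision uses 2 bytes per value instead of 8 for a double, at the cost of roughly three significant decimal digits.

```python
import os
import sqlite3
import struct
import tempfile


def pack_half(values):
    """Pack floats as little-endian IEEE 754 half precision (2 bytes each)."""
    return struct.pack("<%de" % len(values), *values)


def unpack_half(blob):
    """Inverse of pack_half; returns a list of Python floats."""
    return list(struct.unpack("<%de" % (len(blob) // 2), blob))


class VectorCache:
    """Hypothetical SQLite-backed store for float vectors keyed by string."""

    def __init__(self, path):
        self.conn = sqlite3.connect(path)
        with self.conn:  # the context manager commits on success
            self.conn.execute(
                "CREATE TABLE IF NOT EXISTS vectors "
                "(key TEXT PRIMARY KEY, data BLOB)"
            )

    def set(self, key, values):
        with self.conn:
            self.conn.execute(
                "INSERT OR REPLACE INTO vectors (key, data) VALUES (?, ?)",
                (key, pack_half(values)),
            )

    def get(self, key):
        row = self.conn.execute(
            "SELECT data FROM vectors WHERE key = ?", (key,)
        ).fetchone()
        return None if row is None else unpack_half(row[0])

    def delete(self, key):
        # Remove a single entry; every other key stays cached.
        with self.conn:
            cur = self.conn.execute("DELETE FROM vectors WHERE key = ?", (key,))
        return cur.rowcount > 0


cache = VectorCache(os.path.join(tempfile.mkdtemp(), "vectors.sqlite3"))
cache.set("video-a", [0.5, 0.25, 1.0])
cache.set("video-b", [0.1, 3.14159])
removed = cache.delete("video-a")  # selectively clear one key

half_size = len(pack_half([0.1, 3.14159]))              # 2 values * 2 bytes
double_size = len(struct.pack("<2d", 0.1, 3.14159))     # 2 values * 8 bytes
```

Whether the precision loss is acceptable depends on the data; values read back from the store are close to, but generally not equal to, what was written.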