{"podcast":{"title":"Data Skeptic","slug":"data-skeptic","podcast_index_feed_id":587881,"rss_url":"https://dataskeptic.libsyn.com/rss","website_url":"https://dataskeptic.com","image_url":"https://static.libsyn.com/p/assets/0/e/4/b/0e4bd71bb64c6e45/DS_-_New_Logo_assets_-_JL_DS_Logo_Stacked_-_Color_2.jpg","author":"Kyle Polich","episode_count":601,"summary":"The Data Skeptic Podcast features interviews and discussion of topics related to data science, statistics, machine learning, artificial intelligence and the like, all from the perspective of applying critical thinking and the scientific method to evaluate the veracity of claims and efficacy of approaches.","last_synced_at":null,"page_url":"https://stenobird.com/podcast/data-skeptic"},"episode":{"title":"Bypassing the Popularity Bias","slug":"bypassing-the-popularity-bias","published_at":"2025-10-15T15:33:00+00:00","page_url":"https://stenobird.com/podcast/data-skeptic/bypassing-the-popularity-bias","show_page_url":"https://stenobird.com/podcast/data-skeptic","url":"https://dataskeptic.com/blog/episodes/2025/bypassing-the-popularity-bias","audio_url":"https://pscrb.fm/rss/p/mgln.ai/e/35/traffic.libsyn.com/secure/dataskeptic/Vaclav_No_Ads_V1.mp3?dest-id=201630","summary":"Popularity bias in recommendation engines creates a feedback loop that favors mainstream content while burying the 'long tail.' This episode explores technical strategies to repurpose models to surface niche, high-quality items.","meta_description":"Learn how to bypass popularity bias in recommender systems using inverse recommendation and content-based embeddings to surface long-tail content.","key_points":["Main idea: Popularity bias occurs when systems prioritize broadly upvoted items, unintentionally suppressing niche but high-quality content","Practical takeaway: Using 'inverse recommendation' as a batch process can help redistribute exposure to the bottom 50% of publishers","Failure mode: Relying solely on interaction-based embeddings can leave niche items with poor representations due to lack of historical data","Technical strategy: Replacing interaction-based embeddings with content-based embeddings can bridge the information gap for new or rare items","Business trade-off: Increasing content diversity may lead to a temporary decrease in click-through rates (CTR) in exchange for better long-term ecosystem health"],"chapters":[{"start_ms":60000,"title":"The Problem of Popularity Bias","summary":"An introduction to how recommendation signals can create a feedback loop that favors generic, high-engagement content over niche quality."},{"start_ms":210000,"title":"Transitioning from Academia to Industry","summary":"A brief look at the evolution of NLP and machine learning tools like BERT in real-world production environments."},{"start_ms":495000,"title":"Bypassing Bias with Inverse Recommendation","summary":"An exploration of the paper 'Bypassing the popularity bias' and the use of bandit algorithms for diverse sampling."},{"start_ms":985000,"title":"Measuring Long-Tail Exposure","summary":"Discussing the 'bottom fifty percent share' metric to track whether niche publishers are gaining visibility."},{"start_ms":1615000,"title":"Retrieval Pipelines and TensorFlow","summary":"How the retrieval stage uses libraries like TensorFlow Recommenders to pre-select candidates for the ranking pipeline."},{"start_ms":1915000,"title":"The Future of Content-Based Embeddings","summary":"Moving beyond user-item interactions toward multi-embedding user profiles to capture diverse, shifting interests."}],"topics":["Recommender Systems","Popularity Bias","Machine Learning","Long Tail Content","Embeddings","Information Retrieval","Content-Based Filtering","Algorithm Diversity"],"duration_seconds":2073,"processing_state":"processed","actions":[{"name":"request_transcript","method":"POST","url":"https://stenobird.com/v1/public/podcasts/data-skeptic/episodes/bypassing-the-popularity-bias/transcription-requests","description":"Idempotently request low-priority transcript generation for this episode."},{"name":"read_markdown","method":"GET","url":"https://stenobird.com/podcast/data-skeptic/bypassing-the-popularity-bias.md","description":"Read the agent-friendly Markdown representation of this episode resource."}]}}