# Fairness in PCA-Based Recommenders Page: https://stenobird.com/podcast/data-skeptic/fairness-in-pca-based-recommenders Text version: https://stenobird.com/podcast/data-skeptic/fairness-in-pca-based-recommenders.md Podcast: [Data Skeptic](https://stenobird.com/podcast/data-skeptic) Published: 2026-01-26T15:56:00+00:00 Episode link: https://dataskeptic.com/blog/episodes/2026/fairness-in-PCA-based-recommenders Audio file: https://pscrb.fm/rss/p/mgln.ai/e/35/traffic.libsyn.com/secure/dataskeptic/David_With_Ads_V1.mp3?dest-id=201630 Processing state: processed JSON: https://stenobird.com/v1/public/podcasts/data-skeptic/episodes/fairness-in-pca-based-recommenders Duration seconds: 2999 ## Resource In this episode, we explore the fascinating world of recommender systems and algorithmic fairness with David Liu, Assistant Research Professor at Cornell University's Center for Data Science for Enterprise and Society. David shares insights from his research on how machine learning models can inadvertently create unfairness, particularly for minority and niche user groups, even without any malicious intent. We dive deep into his groundbreaking work on Principal Component Analysis (PCA) and collaborative filtering, examining why these fundamental techniques sometimes fail to serve all users equally. David introduces the concept of "power niche users" - highly active users with specialized interests who generate valuable data that can benefit the entire platform. We discuss his paper "When Collaborative Filtering Is Not Collaborative," which reveals how PCA can over-specialize on popular content while neglecting both niche items and even failing to properly recommend popular artists to new potential fans. David presents solutions through item-weighted PCA and thoughtful data upweighting strategies that can improve both fairness and performance simultaneously, challenging the common assumption that these goals must be in tension. The conversation spans from theoretical insights to practical applications at companies like Meta, offering a comprehensive look at the future of personalized recommendations. ## Actions - request_transcript: `POST https://stenobird.com/v1/public/podcasts/data-skeptic/episodes/fairness-in-pca-based-recommenders/transcription-requests` — Idempotently request low-priority transcript generation for this episode. - read_markdown: `GET https://stenobird.com/podcast/data-skeptic/fairness-in-pca-based-recommenders.md` — Read the agent-friendly Markdown representation of this episode resource. A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed. ## Transcript Full transcripts are not published on public pages unless there is a clear rights basis.