# Adversarial Examples and Data Modelling - Andrew Ilyas (MIT)

Page: https://stenobird.com/podcast/machine-learning-street-talk/adversarial-examples-and-data-modelling-andrew-ilyas-mit
Text version: https://stenobird.com/podcast/machine-learning-street-talk/adversarial-examples-and-data-modelling-andrew-ilyas-mit.md
Podcast: [Machine Learning Street Talk (MLST)](https://stenobird.com/podcast/machine-learning-street-talk)
Published: 2024-08-22T07:02:03+00:00
Episode link: https://podcasters.spotify.com/pod/show/machinelearningstreettalk/episodes/Adversarial-Examples-and-Data-Modelling---Andrew-Ilyas-MIT-e2nfov9
Audio file: https://anchor.fm/s/1e4a0eac/podcast/play/90743209/https%3A%2F%2Fd3ctxlq1ktw2nl.cloudfront.net%2Fstaging%2F2024-7-22%2F55ac46ce-094a-59bf-3bca-0792f898e1e2.mp3
Processing state: processed
JSON: https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/adversarial-examples-and-data-modelling-andrew-ilyas-mit
Duration seconds: 5280

## Resource

Andrew Ilyas is a PhD student at MIT who is about to start as a professor at CMU. We discuss data modeling and understanding how datasets influence model predictions; adversarial examples in machine learning and why they occur; robustness in machine learning models; black box attacks on machine learning systems; biases in data collection and dataset creation, particularly in ImageNet; and self-selection bias in data and methods to address it.

MLST is sponsored by Brave: The Brave Search API covers over 20 billion webpages, built from scratch without Big Tech biases or the recent extortionate price hikes on search API access. Perfect for AI model training and retrieval augmented generation.
Try it now - get 2,000 free queries monthly at http://brave.com/api

Andrew's site: https://andrewilyas.com/
https://x.com/andrew_ilyas

TOC:

- 00:00:00 - Introduction and Andrew's background
- 00:03:52 - Overview of the machine learning pipeline
- 00:06:31 - Data modeling paper discussion
- 00:26:28 - TRAK: Evolution of data modeling work
- 00:43:58 - Discussion on abstraction, reasoning, and neural networks
- 00:53:16 - "Adversarial Examples Are Not Bugs, They Are Features" paper
- 01:03:24 - Types of features learned by neural networks
- 01:10:51 - Black box attacks paper
- 01:15:39 - Work on data collection and bias
- 01:25:48 - Future research plans and closing thoughts

References:

- Adversarial Examples Are Not Bugs, They Are Features https://arxiv.org/pdf/1905.02175
- TRAK: Attributing Model Behavior at Scale https://arxiv.org/pdf/2303.14186
- Datamodels: Predicting Predictions from Training Data https://arxiv.org/pdf/2202.00622
- IMAGENET-TRAINED CNNS https://arxiv.org/pdf/1811.12231
- ZOO: Zeroth Order Optimizati…

## Actions

- request_transcript: `POST https://stenobird.com/v1/public/podcasts/machine-learning-street-talk/episodes/adversarial-examples-and-data-modelling-andrew-ilyas-mit/transcription-requests` - Idempotently request low-priority transcript generation for this episode.
- read_markdown: `GET https://stenobird.com/podcast/machine-learning-street-talk/adversarial-examples-and-data-modelling-andrew-ilyas-mit.md` - Read the agent-friendly Markdown representation of this episode resource.

A page view does not enqueue transcription. Agents should invoke `request_transcript` explicitly when they need this episode processed.

## Transcript

Full transcripts are not published on public pages unless there is a clear rights basis.