Modal came up in “⚡ Inside GitHub’s AI Revolution: Jared Palmer Reveals Agent HQ & The Future of Coding Agents” from Latent Space: The AI Engineer Podcast.
Quote
Right. Do you see that as a standard that we should invest in? Because it's supported in VS Code. I don't think it's that popular outside of VS Code. Yeah. It's used internally at GitHub too, for development at GitHub. Oh yeah. Which is cool. Yeah, I think they were so far ahead, almost. But now, sandboxes, there are so many these days, right? I think Cloudflare just launched theirs. Yeah, there's Daytona, they're here, Russell, Modal, yeah, which I think Lovable... I don't know, what do you guys do? Just some Kubernetes pods. Okay, you guys roll it yourself. I think that's maybe the runtime. Even internally in Microsoft, I've got a couple of different competing things. So we'll figure it out in the next... Isn't a dev container, you've already got VS Code loaded, you've got a file system, you've got a sandbox, you've got the security protocol. Yeah. Yeah. And, like, ready to be packaged. So there's lots of goodness there. But also Codex, and presumably the other guys, is repo setup. Effectively what dev containers and a Dockerfile does for you is, like, run this
Modal came up in “Can AI Agents Safely Become DevOps Engineers?” from Agentic DevOps : AI Engineering for Infrastructure.
Quote
for the right model for the right task. Mangol is also built this way. We use the three Anthropic models today, and we're going to move to multiple models and multiple providers soon too. Because I believe there are models that are really good for certain things. Opus is really good for complex reasoning, but it's also very slow, and sometimes you don't know which if you summarise an email or something, or if you want to rewrite some things that you wrote. So yeah, I'd say if I have to pick one, yes, that would be my go-to, but I usually got to be multimodal based on what I'm doing. Nice. Thank you so much, Sam, for being here. We could talk for another hour. People are already gonna go, Brett, like the... And it's great to actually talk to not just founders but engineers in the thick of it who are also living and drinking the, you know, Silicon Valley Kool-Aid a little bit. It's great to see the different maturity levels of everyone that we have on the show and to see where everyone is at. And it's always fun to have people on the show that are ahead of us and, I feel like, touching the AI even closer to what we think is the utopia of this Star Trek future I feel like we're in. So I'm excited that you guys are progressing so quickly on the product, and I'm looking forward to using more of it, especially since it scratches an i…
Modal came up in “Why Vision Language Models Ignore What They See with Munawar Hayat - #758” from The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence).
Quote
Charrington. Today I'm joined by Munawar Hayat. Munawar is a researcher at Qualcomm AI Research. Before we get going, be sure to take a moment to hit that subscribe button wherever you're listening to today's show. Munawar, welcome to the podcast. Hey Sam, thanks a lot. Thanks for having me. I'm really looking forward to digging into our conversation. We're going to be talking about some of Qualcomm's papers on multimodal AI and visual AI from the recent NeurIPS conference. To get us started, I'd love to have you share a little bit about your background. How'd you get into the field? So I did my PhD in 2015 from Australia. I got interested in computer vision, and I took some courses in my undergrad on image processing. That's what fascinated me. During my PhD I worked on visual analysis. And then in twenty... after my PhD I stayed in academia, I had a faculty position, and then I moved to Qualcomm in
Modal came up in “Building AI Ads with Ray Jang (Atria) | EP115” from AI Agents Podcast.
Quote
improvements in image and video generation. But, you know, going multimodal, right? To be able to string together image, videos, the full stack, with AI. Okay. It's interesting to me. I mean, that's a pretty bold thing to do, right? Just completely pivot and be like, all right, we've got to make this work in three months. What was going through your head? Wow. I must say that in my adult life it was the hardest period in my life, and it was self-imposed. So I think to that level I was able to go through the pain. But I must say that I was, and to some degree still am, an entrepreneur that believes that if you have conviction you're able to go much further
Modal came up in “Jeppe Reinhold - Storybook Modernization” from devtools.fm: Developer Tools, Open Source, Software Development.
Quote
And interact with the UI. So that's Storybook in a nutshell, I guess. Yeah. Going back to that, in my current workplace we have a mix: some things are in stories and some things are not. When things are not in stories, it's like pulling teeth to even try to understand how do I get this thing to display in the UI. That alone is sometimes jumping through hoops. You have this form that opens in a modal and you're doing development on it. But to get to that state in the application you have to click four buttons, and whenever you reload, you have to click these four buttons again to open the modal, and that's just super annoying whenever you want to iterate on your UI, right? And so if you put it into Storybook, you control the state, you control what it does, so you can immediately see the changes. I think that's the flagship use case for that. So that ESM journey and the changing of the API really enabled a lot of stuff. I don't think you could do the same type of testing if you were going the imperative, side-effectful way that you used to do stories. So what are the different types of testing that got enabled by moving to CSF? So again, this ability to see the stories up front has a
Modal came up in “E181: Why Multimodal Is the Future of AI Data Workloads” from Open Source Startup Podcast.
Quote
Welcome back to the Open Source Startup Podcast. This is Tim from SNBC and my lovely co-host Robbie from Modern Technical Fund. We're super excited to have Chang, the CEO of LanceDB. LanceDB is the AI-native multimodal lakehouse. So welcome, Chang. Hey Tim. Hey Robbie. Super excited to... We're thrilled to have you on. And on this podcast, we like to go all the way back to the beginning. So why don't you take us back to very, very late 2021, early 2022, to the very beginnings of LanceDB and talk about where the idea came from. Yeah. So my co-founder is Lei, and I had been working with him for a long time, since our days at Cloudera, like 2014 through 2017 or so. But I would say we credit one man for the idea of LanceDB, and that man's name is Henry Chung. He was a diplomat for the Republic of China who, after retirement, opened a string of Hunan restaurants in San Francisco. And Lei and I would always have dinner at Henry's Hunan, and it was like amazing
Modal came up in “Beyond RPA: How Agentic AI Transforms Invoice Processing” from Agentic AI Podcast.
Quote
They aren't just blindly. Right. And our sources break down that autonomy into three key parts working together: perception, reasoning, and action. Let's start with perception. What can an agent see that an RPA bot just completely misses? Perception is about taking in information. The agent uses really advanced tech, things like multimodal large language models, LLMs, combined with sophisticated OCR and natural... This means it can ingest and actually understand data from pretty much any source. Scribbled on the side, an image and an email, complex XML data feeds. It doesn't need a predefined template, it just figures it out. Okay, so it sees and understands the raw data. Then comes reasoning. This sounds like the really smart part. It's not just extracting invoice number one, two, three, right? Exactly. It doesn't just pull the data, it puts it in context. It instantly checks that invoice against, say, the supplier... even past transactions with that vendor. It's figuring out, is this valid? Is it compliant? Does it match what we expected? And the third piece is action. Once it's reasoned things out, based on its decision, the agent takes action directly, natively,
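The perception, reasoning, and action split described in the quote above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not how any particular product works: `perceive`, `reason`, `act`, and `process` are hypothetical names, perception is a trivial `key: value` parser standing in for the multimodal LLM and OCR step, and the "more than twice the largest past amount" rule is an invented threshold used only to show a check against past transactions.

```python
from dataclasses import dataclass
from typing import Dict, List, Set


@dataclass
class Invoice:
    vendor: str
    number: str
    amount: float


@dataclass
class Decision:
    approved: bool
    reasons: List[str]


def perceive(raw_text: str) -> Invoice:
    """Perception stand-in: parse 'key: value' lines (a real agent would use OCR / an LLM)."""
    fields: Dict[str, str] = {}
    for line in raw_text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            fields[key.strip().lower()] = value.strip()
    return Invoice(fields["vendor"], fields["number"], float(fields["amount"]))


def reason(invoice: Invoice, known_vendors: Set[str],
           past_amounts: Dict[str, List[float]]) -> Decision:
    """Reasoning: put the extracted data in context (known vendor? in line with history?)."""
    reasons: List[str] = []
    if invoice.vendor not in known_vendors:
        reasons.append("unknown vendor")
    history = past_amounts.get(invoice.vendor, [])
    # Assumed rule for illustration: flag amounts over 2x the largest past invoice.
    if history and invoice.amount > 2 * max(history):
        reasons.append("amount far above past transactions")
    return Decision(approved=not reasons, reasons=reasons)


def act(invoice: Invoice, decision: Decision) -> str:
    """Action: here just a status string; a real agent would call downstream systems."""
    if decision.approved:
        return f"queued payment for {invoice.number}"
    return f"flagged {invoice.number} for review: " + "; ".join(decision.reasons)


def process(raw_text: str, known_vendors: Set[str],
            past_amounts: Dict[str, List[float]]) -> str:
    """Run the three stages in order."""
    invoice = perceive(raw_text)
    decision = reason(invoice, known_vendors, past_amounts)
    return act(invoice, decision)
```

Keeping the three stages as separate functions mirrors the breakdown in the quote and makes each one replaceable on its own, for example swapping the toy parser for a real extraction step without touching the checks.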
Modal came up in “#159 - Inflection-2.5, Devin, OpenAI board update, SIMA, EU AI Act passed” from Last Week in AI.
Quote
Which is exceeding the previous state of the art by a lot. Before, it was just a couple percent. And they say that it's also been tested on real-world tasks such as jobs on Upwork. And yeah, they have various demos. They also showed how it can write a little code to use a computer vision model using Modal, and it can do all of this from documentation without being trained specifically to do that. So lots of examples of what this... most people in the community have seen the most impressive, fully autonomous... to see somebody who got access to Devin and then started looking at some of its different capabilities. There's a guy named Andrew Gao, I want to say, and he's tested it out on a couple of different things and found that, I think there was one task, I can't quite remember, it was related a bit to graphics, that it didn't quite do so well on and made this like
Modal came up in “From Tailnet to platform (Interview)” from The Changelog: Software Development, Open Source.
Quote
When I first started using Proxmox, I set it up on Tailscale, and I'd hit the internal IP on it so I could access it over my tailnet wherever I was in the world. But I'd still have to log in with my credentials. There is a way to set up authentication with Proxmox that uses tsidp so that it basically just automatically authenticates you, just by virtue of the fact that you're on the tailnet. It knows who you are, right? So I don't have to type my username and password into a modal and hit submit. I basically just visit my Proxmox instance locally and I'm in. I'm listening very closely now. Yeah. So, Alex. Yeah, I published a lot of videos on all Tailscale tech. He actually has a video on this. I think an update for it too, after we brought tsidp up to the MCP spec last year. He refreshed his video on how to set up tsidp to work with Proxmox. But yeah, it's super convenient like that. At Tailscale we use it internally. For instance, we have it set up so that our revenue team, if they need to access Salesforce, they don't have to type a username and password, they can just visit Tailscale like Salesforce. Salesforce is configured to basically jump through our local tsidp instance instead of having to do an OAuth flow. So this is like clickless login e…
Modal came up in “Private Governance: Creating a Market in AI Regulation, with Dr. Gillian Hadfield & Andrew Freedman” from "The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis.
Quote
If you're ready to transform your customer experience, scale your support, and give your customer service team time to focus on higher-level work, find out how at fin.ai/cognitive. AI researchers and builders who are pushing the frontier know that what's powering today's most advanced models is the highest quality training data, whether it's for agentic tasks, complex coding and reasoning, or multimodal use cases for audio and video. The data behind the most advanced models is created with a hybrid of software automation, expert human judgment, and reinforcement learning, all working together to shape intelligent systems. And that's exactly where Labelbox comes in. As their CEO Manu Sharma told me on a... We have a very vast network of domain experts and we build powerful software with operational excellence and experts ranging from STEM PhDs to software... Frontier data for the world's top AI labs, and a partner of choice for companies seeking to maximize the performance of their task-specific models. As we move closer to supernatural,
Modal came up in “Hermes Agent clearly explained (and how to use it)” from The Startup Ideas Podcast.
Quote
So Hermes has knowledge of where the keys are stored and your configuration. And you can say, like, is this a secure setup? Tell me why or why not. And it'll go through and let you know if there are any secret keys exposed on your computer, if they're in plain text, if a firewall is set up poorly, it'll let you know. The other thing that's very unique to Hermes is that it's built to also, out of the box, be able to be run inside of a Docker container in case you want it on your machine but isolated from the rest of your files. And then you can also run it on Modal as a serverless service as well. So it's really flexible in how you run it. I personally am a little bit risky and I just kind of run it on the bare metal. And I'm just routinely making sure, like every day, I'm updating it, and I'm also making sure that I ask it to, you know, secure my own setup. Cool. Let's keep going. All right. So the installation, if you're on a Mac, is pretty straightforward. You can just head over to the Hermes Agent documentation. It's on the Nous Research website. And if you're on Linux, macOS, or even Windows Subsystem for Linux, it's just this one-line command. If it's your first time installing a tool like this on a Mac, you'll probably need to install the Xcode developer tools. So I covered that in the video that Greg found me through, but…
Modal came up in “Why Physical AI Needed a Completely New Data Stack” from Gradient Dissent: Conversations on AI.
Quote
computing and that kind of field. So yeah, where do we want to dig in for our space? And how does Rerun work there? Got it. Yeah. So maybe just to set the scene: Rerun, we have an open source project, and that is very popular. And that project is an SDK for logging, modelling, querying, and visualizing really multimodal data, and particular... So I'm picturing, like, sense... Yeah, yeah. Multiple cameras moving around. Maybe you have regular RGB, and you have motion sensors, and you have whatever other normal signals and tensors, and you just... If you think it could be... so our actual early beachhead was in spatial computing or augmented reality, so companies building headsets and that kind of stuff. And so what are they logging there? You're logging, like, sensor, like images, you're logging camera calibrations, where a camera is in space, as well as just, yeah, normal text logs, and nearly... Logging time series of, that could mean whatever, just, you know, CPU time.