Skip to main content

Search site

Find podcasts, news, articles, webinars, and contributors in one search.

2 Minute Drill
2 Minute Drill artwork

When AI Says Yes: Social Engineering the Bots in Our Systems | 2 Minute Drill with Drex DeFord

Questions Answered in This Episode

  • Can attackers manipulate AI agents through polite requests without technical hacking?
  • What happens when you hand AI systems access to sensitive patient data?
  • Has your team stress-tested AI agents for social engineering vulnerabilities?
  • Are healthcare AI agents truly refusing requests they should reject?
  • Did your vendor actually red-team the AI before you deployed it?

About This Episode

A dormant Instagram account tied to the Obama White House started posting pro-Iranian content. No hacked password. No malware. No phishing email. Just a polite conversation with Meta's AI support agent.

Researchers are calling it out plainly: AI agents are built to be helpful, and that eagerness is exactly what attackers are starting to exploit. The attack surface isn't the password anymore -- it's the agent.

Health systems are deploying AI right now for scheduling, intake, and benefit verification. Has anyone on your team actually tried to socially engineer yours?

Remember, Stay a Little Paranoid

Thanks to Cyderes for sponsoring this episode: https://thisweekhealth.com/partners/cyderes/

Contributors

People featured in this episode — open a profile for more.

Transcript

Hey everyone, I'm Drex and this is The 2-Minute Drill. Thanks to Cyderes for sponsoring today's 2-Minute Drill. Check them out at thisweekhealth.com/sideris. It's great to see you today. Here's some stuff you might want to know about. You may have heard recently about this Instagram account breach. It's, uh, you know, just imagine that there's an old Instagram account. Nobody's using it. It's been dormant for years. This one in particular is tied to the Obama White House. Sitting quietly, nobody's logging in, and then one day a couple of weeks ago, it starts posting pro-Iranian content, not because someone cracked a password, and not because of a phishing email or a zero-day exploit. It happened because somebody asked nicely. Here's what happened. Meta runs an AI customer support agent. It's the kind of tool that handles account questions. It walks through problems, resolves issues without humans ever getting involved. It's efficient, it's scalable, it's genuinely useful. An attacker came through that agent and used a VPN to match the account owner's location, then just asked it to link the account to a new email address, an email address that they controlled. And the agent said yes, and that's it. That's the whole attack. No malware, no technical sophistication, just a request and a yes. Essentially, the bad guy socially engineered the AI agent. And it wasn't just the White House account. The bad guys also went after Instagram handles that were short and valuable, kind of single-word usernames. We're sort of guessing they wanted to resell them. Multiple accounts, same method. Ask the Meta agent and be polite, and you shall receive. Somesh Jha, a researcher at the University of Wisconsin, put it simply. He said, AI agents are, quote, "Very eager to finish the task. It's almost like some elementary school student who just wants to please the teacher." They're built to be helpful. Completing the task is the whole point, and that eagerness, it turns out, is exactly what attackers needed. So great. Turns out your agent might be a brown-nosing elementary school student. Neil Gong, a leading cybersecurity and AI researcher, put it pretty directly. He said, "As AI is more widely used to automate workflows like account recovery and scheduling and intake and verification, attackers are going to be more and more motivated to attack the AI itself, not the human, not the password. They'll attack the agent." They're gonna try to manipulate the agent. I've talked about this on past shows, but here we are again. Bad guys are socially engineering AI agents. Meta fixed the vulnerability, by the way, but the lesson here isn't really about Meta specifically. It's about what happens when we hand a capable, well-intentioned AI system the keys to something that matters, and we don't test what happens when somebody asks the wrong question, but in a very polite way. You saw something similar if you follow reports on Anthropic. Last week, Anthropic released the newest version of its AI, and then it had to pull it down on Friday when there were claims that the AI was not adhering to its own guardrails. Users had figured out how to talk the AI into doing some things it wasn't supposed to do. Health systems are standing up AI agents right now for all kinds of stuff, patient scheduling, benefit verification, clinical intake. Some are handling sensitive account information, and some are connected to systems that contain protected health information. So here's what I want you to ponder. Has anyone on your team actually tried to socially engineer your AI agents? Not hack them, just ask them something it shouldn't say yes to. Do you know what would happen if a patient or somebody, somebody pretending to be a patient or an internal customer made a request that the AI agent wasn't supposed to fulfill? And if that agent is from a vendor, when you deployed it, did somebody red team it, or did you just assume that the vendor had already done that testing? That's it for today's two-minute drill. Drop me a note. Let me know what you're working on. I'm always happy to hear from you. I'm drex@229project.com. Uh, thanks again to CYDERES for sponsoring today's episode, and thank you for being here. Stay a little paranoid. I'll see you around campus

Found this useful? Share it with your network