71. How Startups Can Fight AI Deepfake Fraud (with Ben Colman)

Hello, everyone, and welcome to another episode of the Security Podcast of Silicon Valley. This is a Y Security production. I'm one of the hosts, John McLaughlin. I'm joined by the other host, Sasha Sienkiewicz.

And today we have an amazing guest, Ben Colman, the CEO and co-founder of Reality Defender. Welcome to the show. Thank you. Excited to be here.

Ben, you have an amazing technical background. It looks like you launched your career at Google as an intern. From there, you jumped right into the entrepreneurial thing and co-founded a company called PlaceAVote. Sounds like a very conscious, very socially aware company.

From there, you jumped into Goldman Sachs as vice president. And you were CEO of another startup called CoVortex. From there, you jumped into angel investing.

Xoogler.co: you're a member and past NYC lead there, which helps ex-Googlers launch successful startups. Awesome. Awesome. You're definitely part of that crowd for sure.

And today, you're leading the charge as co-founder and CEO of Reality Defender, which is actually a Y Combinator Winter 2022 backed startup. Incredible. That's right. Yeah, it's been quite a journey.

For those listening, you know, being a startup founder has its ups and downs. And Reality Defender is certainly having a lot of excitement in the news right now. But our first few years, it was much the opposite. We couldn't raise any money.

Nobody wanted to talk to us. Folks wanted us to get into NFT authentication. If you guys remember, NFTs were kind of the thing for a bit. But this is an amazing point, which is time to market always matters.

And there's usually a very short window when the timing is perfect. And if you're just a couple of years too early or a couple of years too late, you never get to experience that hockey stick growth. Where do you see AI defense fitting into the market needs today? And what pain point do you guys solve?

So, you know, throughout my career, I've always focused on cybersecurity in a defensive capacity, whether it's protecting companies or consumers or their data, or finding the data and removing it. The way I think about our space is as just another application of cybersecurity to a new venue, which is AI. So I think of it very similar to, you know, the antivirus industry and the history of antivirus. We think that our space, detecting AI fraud and deepfakes, is absolutely following the same growth story in terms of education and the explosion in fraud usage.

How do you educate people? Or do you have to educate people before you start a discussion and conversations around AI and the need to defend against deepfakes? Do you still see the need to educate, or do people reach out directly to you? You know, we started the company five years ago and we're incredibly stubborn.

So we stuck with it. But what we saw was that once OpenAI launched ChatGPT, which went really mass market in winter of '22, suddenly people started really thinking about what we're doing. And so what started off as us trying to educate people, now they're educating themselves, because the problem is so prevalent. So it's a fire hose of inbound interest.

Our challenge is trying to understand which industry leaders, which company leaders, understand the existential risk versus seeing this as just another piece of software they should think about. And so, you know, last year I'd say half our clients said, hey, we don't know if this is an issue, but we believe we should get started on it. Well, today there is the data, so they don't have to guess anymore. But then the other half of clients said, not an issue, we'll come back when it's a problem.

Now all those who said it wasn't an issue are running to us saying, we've had an incident, we need this yesterday. So it's accelerating very quickly, both in terms of industry research groups understanding the data to be able to present it, but also the clients, the targets of this fraud, seeing the risks firsthand. Yeah. And maybe for our listeners, would you mind giving us an example of what that issue is, or an example of an issue?

Yeah. Put bluntly, it's, you know, impersonations on the phone: impersonating company leadership, impersonating clients, executing wire transfers, getting access to accounts. And then similarly on video, which is all the same use cases, but also AI-generated interviews, you know, whether it's someone interviewing someone else, or a hacker or a state-level actor trying to get a job at a company to steal information. Yeah, that's a huge problem. I remember in the news, there was a case where someone had gone through the quote-unquote processes to wire transfer $20 million out of a company.

And they had done it because they thought that they were actually talking to their colleagues inside the company who were okaying the transfer. Okay. Everything looks good. Okay.

Yup. Everything is signed. All right. Transfer the money.

And then it happened. And then boom, the money was gone. And it turned out that the whole thing was just deepfake, AI-enhanced fraud. I really believe that's the tip of the iceberg too.

Absolutely. I mean, I think what we're seeing now is that, you know, we're all equally at risk. Fraud is equal opportunity, whether it's my seven-year-old son or my seventy-year-old mother: all are great targets. And in a space where generative AI platforms, which create deepfakes, require zero technical ability, you know, that same seven-year-old son with his tablet or my seventy-year-old mother with her iPhone or her Android, these tools are everywhere. You just go onto the app store and there's face-swap tools, there's synthetic voice generation tools, no requirements, no regulations, and that makes things very entertaining.

So I think the challenge is that, you know, the majority of use cases for generative AI are incredible, whether it's entertainment or productivity. But in this edge case, the risks are infinite and there's no regulation yet. And we're incredibly pro-AI and we're pro-innovation, and we're really working hard in DC on a bipartisan basis to set up some, you know, AI-friendly regulations that focus on this incredibly dangerous edge case. And I think AI-friendly regulation is the key.

Otherwise, often we fall into this state where we over-regulate, and usually what that means is that we slow down innovation, but on the attack side, the innovation will never be slowed down. And so there's always this cat and mouse game. How do you guys approach the cat and mouse game? How do you stay, if not ahead of the curve, at least following the same techniques as the attackers do?

Yeah. I mean, it's certainly a cat and mouse game. That's also true in all of cybersecurity. And I think that it's much more a feature than a bug.

The good news is that I don't think anytime soon AI is going to suddenly be impossible for us to detect. I think that most standard AI tools are focused on recreating what we observe, like literally how we hear things or see things. They're not trying to recreate how sound waves move. They're not trying to recreate how photons interact in physics.

So, you know, I think we have many years of exciting problems to solve. It's obviously going to get harder, but we're working on both sides of the equation: detecting both fakeness, which is getting better, and realness, which is not changing. If we can detect whether you're blinking or your heart's beating or you're breathing, and John, if you're not, then either you're a werewolf or a vampire, or you're, you know, an AI-generated bot. Now it gets really exciting with the invention of AI agents, which creates new opportunities for Reality Defender, given that companies do want to engage your AI agent for information sharing, but maybe not for authorization.

So we're thinking long and hard not only about detecting AI-generated deepfakes, but about whether permissioned deepfakes suddenly add additional calls to action for enterprise users. So let's frame the problem for all of our listeners out there. In order to pull off a deepfake today, with today's technology, do you think this podcast would be enough to go off and train something that looks just like Ben Colman for the purposes of a deepfake Zoom call? Or would it just take a photo and a little bit of your voice? How good is that technology? You're on the cutting edge of all of this stuff.

So help us frame it. If this was six months or a year ago, we would have gone through all the things to look for, you know, issues around symmetry or hourglassing or pixelation or anti-aliasing, or sound-versus-mouth correlation, or kind of where the head and the neck touch. But you know, one of my co-founders, who leads R&D, is a Harvard PhD, and even he can't tell the difference anymore. And that kind of demonstrates that average people, you know, my parents, don't stand a chance.

And very similar to how you don't expect my parents to be able to scan code and detect a virus, we similarly shouldn't expect people to be able to detect a deepfake, because it's just too advanced for everybody, including the experts. And so in a world where the only thing that can detect AI is AI, it's just a matter of us ensuring accessibility, but also educating users on how scary it is. And I, you know, I want to be very careful to say that there is no silver bullet here.

Reality Defender alone is not going to solve these issues. But in a world where all of your personal data is available on the dark web or online, and your name and your face and your voice are available on social media or on YouTube, it's paramount that organizations don't just check for your personal information matching. All that means is that I have your information. It's checking that whoever claims to be you, and has your information, isn't using AI to do it. Everyone can connect with numbers: what is the cost of deepfaking someone's audio?

And what is the cost of deepfaking someone's video? You know, a year ago it was, you know, hundreds if not thousands of dollars to do a good job. Now it's basically free. The tools are all over.

Just Google it. Unfortunately, I'll avoid sharing the names of my favorites, but the good news is not only does our solution detect them, but more importantly, a lot of the generative AI tools are actually partnering with us to make sure their tools don't fall into, you know, dangerous hands. Or if they do, they can say, hey, well, Reality Defender can for sure detect it. So we announced a partnership with ElevenLabs, and we have about a dozen other generative AI platforms that we're going to be announcing over the coming months.

We're partnering on the data side. And so is there some type of watermarking in the partnership that you just mentioned? Is there an easy identifier that something was generated using AI? So we avoid watermarks.

We avoid any provenance-based tools, because we assume they're either going to be hacked or just ignored. You know, for the tech folks who are following this podcast, anyone can upload anything to Photoshop and add a watermark. And my parents might say, oh, it must be real. Like that fake explosion at the Pentagon that led to a hundred-billion-dollar flash crash in the market.

All the watermark means is that at a given day and time, a thing was uploaded to a platform and that platform confirms it. So one, watermarks can be faked, but also they can be ignored. That same image was shared on Telegram and WhatsApp, which don't share watermarks, and on social media.

People don't know how to read watermarks. And so, while we are a proponent of content authenticity platforms and standards like C2PA, which we're a member of, Reality Defender's platform is focused on supporting when those fail: when an image or a photo or a video has gone through so many different conversions, when the watermark's missing, when we have no idea who it is, our focus is on detecting indications of AI. You've got to be a little bit crazy to get into security.

It's like a little bit of a cat and mouse game. There's so much at stake. You know, I've always been blessed to know that this stuff gets me irrationally excited. But maybe you have a story. Did something in particular connected to security really just grab your interest and pull you in?

I'll be honest. I grew up having a lot of fun with technology, and I was fortunate that my best friend when I was a child, his father had a computer store. So we were having a lot of fun, you know, working with computers.

We were in fifth grade, and I don't know how our parents let us do this, but we flew unaccompanied to Las Vegas to go to the big COMDEX tech conference. So we were just exposed to a lot of technology then. And, you know, we tried a lot of tools that were questionable in nature. And, you know, it certainly is a lot more fun being on the good side than the bad side.

So I'd say, you know, really just being exposed to technology is great, but also, if you're going to be involved with things that might be a little, or a lot, ethically dubious, you might as well be on the good side, because it's just as much fun. Yeah, no, that's a great way of seeing it. Like, you know, it's a cat and mouse game, and it's your decision whether you feel like the cat or you feel like the mouse. And it's also an area where the kind of cat and mouse game means it's always exciting and ever-changing.

And, you know, every single day brings new excitement. And the challenge for us is to understand which of those advancements are mass-marketable, both from the hacker side and from the detection side. You know, hackers are the best product managers: if they could use a new tool to do more hacking, they absolutely will. And that's really where the initial genesis of this really interesting research we developed turned into the commercialized solution of Reality Defender: thinking about how bad actors and hackers and state-level groups will use these widgets to cause more harm at a greater level.

No longer, you know, a Nigerian prince scam or trying to attack one person, like me attacking you, John. Suddenly I could do it to thousands, tens of thousands, millions of people at the same time. And I only have to be right a few times, or even just once, for it to work. Which is why, you know, I try and force my family members to use password managers. You know, my parents should not use my birthday as their password.

If I get one breach on one site and they're using the same password everywhere else, the attacker only needs to be right once. So anyway, I'm very optimistic on AI. I'm very optimistic on deepfake detection. But I think we've got to stay vigilant.

Yeah, definitely. And you mentioned, you know, building a company. I think there's a certain type of responsibility that comes with that. It's very different than just getting irrationally excited about security or a topic, though I think that that's very helpful, because it's a lot of work to build a company, right?

And so congratulations on all of the success with Reality Defender so far. I'm sure that the grand reward is still out there in the future. And I'm super curious, speaking about the future: if you look into the future, I know with entrepreneurs and founders in particular, there's always a vision that's driving things forward. Maybe you'd like to share with the audience: what is the vision that you see as success for Reality Defender?

You know, based on the way we see the world and how that's evolving, we want to see solutions like Reality Defender everywhere, you know, wherever companies, but also consumers, are engaging with communications or media. A lot of that comes down to, you know, compute, which is still quite expensive, but also battery life on phones. You know, I can't get by on my phone for more than, I don't know, half a day without charging it. As compute gets cheaper and moves to the edge and to on-device GPUs, which NVIDIA is working on, and as battery life continues to expand, we want to see Reality Defender scanning everything we're doing, all the time.

You know, we're seeing each other on video now; we're speaking in real time on audio as well. Similar to, you know, your phone saying, hey, this call might be spam. Which maybe you're okay with, because it's an AI agent, or it's, you know, United saying that I've been upgraded, and it's an AI voice saying it, and that's nice. I want that call.

You know, similar to when you're on a call, finding out that you have another call, someone also calling you, or maybe letting you know that you got an email or your battery is low: getting an alert saying, you know what? At four minutes, 20 seconds into the call, there's an 80% chance that John's voice is AI-generated. I think that will start becoming very common. Not all of them will be fraudulent.

A lot of them will be fantastic permissioned agents of John's voice, calling to change a reservation, or calling Ben to say, hey, Ben, I'm going to be late to the meeting. That'll all be done in our devices, in our phones, and that'll be done securely. And that's incredibly exciting for Reality Defender and its role in a more secure, you know, maybe simpler future where you can trust what you see and what you hear. So in other words, in the near future, there will be a parallel to what we currently have in the engineering space, which is service-to-service or service-account communications between systems.

And to paraphrase, in the future there will be agents that will represent an entity or an individual, and those will actually be positive agents. And the...

Both will be positive or negative. Yeah, yeah. I mean, it could be positive John letting me know he's going to be late to dinner. It could also be someone committing fraud just doing it on a massive scale.

So there is a delta there, but it's the same exact technology and framework. But it'll be everywhere. It'll help us understand, like, oh, this was actually human-generated, this was machine-generated.

Yeah. Or it's machine-generated, and I'm saying, hey, John, I want you to send me your credit card number to pay for, you know, these concert tickets. And then the AI agent says, hey, John, I'm your AI agent.

I need you to come on the call, because we need you to confirm it with Ben. So there'll be a lot of that kind of handoff between AI and non-AI. Ben, you mentioned something very interesting, which is edge compute, the reduction in resources required for SLMs to process the audio. Are you currently using any SLMs on the local device? And SLM stands for small language model.

Are you fine tuning your own models? How do you currently use the technology that is available on the market? If that is something that you can share. Yeah.

So, you know, we have over a dozen patents and white papers; we are selected for peer review at all the big conferences in computer vision and audio: CVPR, NeurIPS, ECCV, AAAI. Which really differentiates us from most companies in our space, in that the majority of our company are PhD researchers and engineers. We see our product deployed everywhere, starting with cloud, moving to edge, moving ultimately on-device as well. I can't share too much now, but yes to everything that you're saying. But as far as what we're doing, you know, we develop our IP, our models, our data sets in-house.

It's expensive and hard to do, but we think that's important, versus a lot of other platforms that claim that they are highly effective but in reality are just using open source. We think that's a great way of, you know, getting in trouble as a corporate entity.

But the good news is that it's really easy to grade: you know, you do a benchmarking data set test, and you can grade it literally like a multiple-choice exam in high school. So for us, it's not only important to demonstrate what we're doing, which is, you know, unfortunately considered just marketing, but also to have our clients prove it for themselves. So, you know, our large clients, you know, tier-one banks, government organizations, Fortune 100s, they will literally recreate all of our testing themselves and grade the results. You know, they'll upload 10,000 or a thousand or more real-time or asynchronous audio clips or videos or images, and then grade the results.
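The client-side grading Ben describes, running a labeled set of real and synthetic clips through a detector and scoring the verdicts like a multiple-choice exam, could be sketched roughly like this. The `detect` function here is a hypothetical stand-in for any deepfake-detection API, and the `ai_score` field is an assumed model confidence, not anything Reality Defender has described:

```python
# Sketch of client-side benchmarking: run a labeled set of clips through a
# detector and grade the verdicts against ground truth.

def detect(clip: dict) -> bool:
    """Hypothetical detector: True if the clip is judged AI-generated."""
    return clip["ai_score"] > 0.5  # threshold on an assumed confidence score

def grade(benchmark: list[dict]) -> dict:
    """Grade detector verdicts against ground-truth labels."""
    tp = fp = tn = fn = 0
    for clip in benchmark:
        verdict, truth = detect(clip), clip["is_fake"]
        if verdict and truth: tp += 1
        elif verdict and not truth: fp += 1
        elif not verdict and not truth: tn += 1
        else: fn += 1
    return {
        "accuracy": (tp + tn) / len(benchmark),
        "false_positive_rate": fp / max(fp + tn, 1),  # real clips flagged as fake
        "false_negative_rate": fn / max(fn + tp, 1),  # fakes that slipped through
    }

# A toy benchmark: two fakes the detector catches, one real clip it clears,
# and one real clip it wrongly flags.
benchmark = [
    {"is_fake": True,  "ai_score": 0.97},
    {"is_fake": True,  "ai_score": 0.81},
    {"is_fake": False, "ai_score": 0.12},
    {"is_fake": False, "ai_score": 0.64},  # false positive
]
print(grade(benchmark))  # accuracy 0.75, FPR 0.5, FNR 0.0
```

Separating the false-positive and false-negative rates matters in this setting: a bank cares very differently about flagging a real customer as a deepfake than about letting a fake through.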

We definitely look forward to that announcement. Once it becomes available, we'll celebrate the partnership with the manufacturers together. But that highlights another important point, which is trust. The modern business environment in general is built on communications, and people have to be able to trust each other in order to make deals, in order to create new businesses, in order to create partnerships. And going beyond the fraud itself: if someone is not able to trust another party, then essentially you hold up business deals, you slow down time to market, and those deals are very important.

And it seems, and it sounds, based on what you've shared, that the manufacturers themselves see the problem. They see the problem will become even more severe in the near future. And there needs to be a solution. Absolutely.

Did I hear correctly? Did you make a prediction that we're going to go from LLMs that are mostly up in the cloud, to eventually migrating them to edge compute, and then ultimately on-device? And I assume that's a little bit more of the small language models, the SLMs, that Sasha was mentioning. I think there's other things that can be done, for multiple reasons, you know, for security and scalability.

We've done a lot of that work ourselves anyway, in terms of quantizing our models to use less compute, or to be able to scale in more secure environments where, for a variety of reasons, the client or we want to ensure that the models are more protected or protective. So we have successfully already deployed our solution on commodity laptops and commodity phones. We're not going to market with that ourselves; that's with partners who are the OEMs, or for, you know, various enterprise or government use cases that require a different kind of environment.

And so, yes, we've already done what we're discussing, and we're excited to really scale out the rollout with our partners. What was the pressure? Was it a security pressure to keep things off of the cloud? Was it performance, maybe you get better performance from SLMs, or?

The pressure comes from all directions. You know, I'd say, obviously, compute is not infinite on the cloud. It's incredibly expensive, but sometimes it's also not available. Sometimes you just have no connectivity to the internet to access cloud compute.

But also, depending on where we're deploying our models, the models have to be even more secure: keeping things, you know, encrypted, being able to not only protect our IP, but also protect what we're scanning on behalf of our clients. And so, you know, without regard to the size of language models or vision models or audio models and beyond, we've developed some novel approaches to, without changing the models themselves, effectively compress them and quantize them without reducing the accuracy of what they're doing. So, you know, there's still a lot of fun work to do there, but certainly, it's an observation and a prediction mixed into one.
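To make the quantization idea concrete, here is a toy illustration of the core arithmetic: mapping float32 weights to int8 with a single per-tensor scale shrinks storage roughly 4x while keeping reconstruction error within half a quantization step. This is not Reality Defender's method; production systems use far more sophisticated schemes (per-channel scales, calibration data, quantization-aware training):

```python
# Symmetric linear quantization: w ≈ q * scale, with q an int8 in [-127, 127].

def quantize_int8(weights: list[float]) -> tuple[list[int], float]:
    """Quantize float weights to int8 codes plus one shared scale factor."""
    scale = max(abs(w) for w in weights) / 127.0
    q = [round(w / scale) for w in weights]
    return q, scale

def dequantize(q: list[int], scale: float) -> list[float]:
    """Recover approximate float weights from int8 codes."""
    return [qi * scale for qi in q]

weights = [0.52, -1.27, 0.004, 0.9986, -0.75]
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)

# Worst-case rounding error is half a quantization step (scale / 2).
max_err = max(abs(w - r) for w, r in zip(weights, restored))
assert max_err <= scale / 2 + 1e-9
print(q)  # → [52, -127, 0, 100, -75]
```

The accuracy question Ben raises is exactly whether that bounded per-weight error, accumulated across millions of weights, changes the model's decisions; that is what the benchmarking-style grading is meant to verify after compression.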

Since we dipped into a bit of a technical layer of this discussion, I have one more question for you. It's an interesting space in general, where we have a new technology, LLMs and SLMs, but it still rides on the backbone of existing good engineering approaches, good engineering architecture, and good engineering design. And I imagine that you don't really have to process the entire audio stream in order to determine that something is a deepfake. You can probably chunk it up into smaller chunks and process them in parallel.

And once you determine that something is a deepfake, you can make a decision that the entire stream is a deepfake, or that there is a high likelihood of a deepfake. Hence you reduce the cost that goes into the detection. We could do that. Some of our clients want us to do that, but we prefer to make it cheap enough and scalable enough that they don't have to make a decision that way.

We're just scanning the whole thing. And so instead of traditional authentication, where you're checking, you know, Sasha's data at the beginning but you're not checking it again at minute two or minute 20, we're not scanning just the first minute; we're scanning the whole call in real time. And we don't think that folks should compromise by scanning just the first minute. They ask if they can.
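The chunked, whole-call scanning being discussed could be sketched as follows: split the audio into fixed-length windows, score each one, and surface an alert the moment any window crosses a confidence threshold, while still scanning to the end of the call rather than stopping after the first flag. The `score_chunk` heuristic here is a hypothetical stand-in for a real detection model, used only so the example runs:

```python
# Sketch of continuous chunked scanning with per-window alerts.

def score_chunk(chunk: list[float]) -> float:
    """Hypothetical per-chunk deepfake confidence in [0, 1].
    Toy heuristic for illustration only: flag unnaturally uniform energy."""
    mean = sum(abs(s) for s in chunk) / len(chunk)
    spread = max(abs(s) for s in chunk) - min(abs(s) for s in chunk)
    return 0.9 if spread < 0.1 * mean else 0.1

def scan_call(samples: list[float], rate: int, window_s: float = 5.0,
              threshold: float = 0.8) -> list[tuple[float, float]]:
    """Scan the whole call; return (timestamp_seconds, score) for each alert."""
    step = int(rate * window_s)
    alerts = []
    for start in range(0, len(samples) - step + 1, step):
        score = score_chunk(samples[start:start + step])
        if score >= threshold:
            alerts.append((start / rate, score))  # alert, but keep scanning
    return alerts

# 15 seconds of toy "audio" at 100 Hz: natural-looking segments around a
# suspiciously flat middle segment.
rate = 100
natural = [0.2 * ((i % 7) - 3) for i in range(5 * rate)]
flat = [0.5] * (5 * rate)
alerts = scan_call(natural + flat + natural, rate)
print(alerts)  # → [(5.0, 0.9)] : one alert at the 5-second mark
```

Since the windows are independent, they could also be scored in parallel, which is Sasha's cost-reduction point; the design choice Ben describes is simply to keep scoring every window through the end of the call instead of stopping at the first verdict.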

And we say, guys, you know, we're making it easy and inexpensive enough that you don't have to make the compromise yourself. Awesome. And this goes back to the user experience, which is very important. UI and UX in security in general: we've been in this space where we've often made things very complex as a solution, but it sounds like you guys are working from a different point of view, which is: whatever we put in our customers' hands, it needs to be seamless.

The customer does not need to be involved in the decision-making process of how to set it up, how to configure it. They should just be able to easily consume it and be protected. Absolutely. We couldn't agree more.

I accuse innovation that's not easy to use of actually not being innovation. I will go that far, because anything that's not easy to adopt just won't get used. It's maybe an interesting side thing, but you know, if it's not adoptable, if it's not easy to understand, if it's not available for consumption, it's not going to make the world a better place, because it won't get adopted. Right.

So. Yeah. No, we couldn't agree more. Yes.

All of the gratitude in the world for thinking through these hard problems. Being an entrepreneur is really like being on a rollercoaster: you're going to have really high highs and really low lows. And so I'm super curious, what's the best day that you've ever had as co-founder and CEO of Reality Defender? Best day.

Yeah. You know, we run up against these massive companies that have every single advantage. They're legacy providers. They have a ton of money, a ton of reputation.

And we've been able to not only match them, but more importantly, beat them in bake-offs with, you know, large client opportunities, ones that they're already working with or we're working with. We recently bested one of them with a top-five global bank. We heard their CEO called up the bank leadership and had some strong words, and they said, point blank, we chose Reality Defender because it was better. It's measurable.

On the flip side of that, what's been the most challenging day at Reality Defender? Hiring is a huge challenge. Not only is there not a lot of people in our space, but also anyone touching AI just has a lot of opportunities. I'd say what's really helpful for us is that we're really helping to mold the public narrative in the space.

You know, we're not only leaders in the market, but we're also kind of starting the market. And so it's a lot of responsibility. And we spend a lot of time with folks in grad school, folks switching or considering switching between jobs. I mean, for me, I connect with as many cold emails as I can.

If you ping me on LinkedIn, I'm always happy to find a few minutes, you know, beyond just deepfakes, on all of cybersecurity. You know, there's over a million jobs needing to be filled across the US alone, let alone the world. And so it's a space that needs a lot of new folks. And that leads a lot of people, whether they're career switchers or early in their careers or before their careers, to start thinking about it. It's just a lot of fun.

There's never a boring day. I know. I love that. And, you know, sometimes I see that as organizations grow, you pull in a lot of research components; you're on the very bleeding edge of new technology.

You have to have that research component. And then you get a really healthy mix of engineering, you know, where the rubber hits the road, but also the cutting edge in terms of the research itself. And it sounds like you're right there on that edge as well. And you need expertise on both sides of that.

Like, absolutely. Yeah. We're working hard. We're having a lot of fun.

And we welcome anybody who's interested in us to either apply for roles we have on our site, or reach out and say, hey, I don't see a role that fits me, but here's what I do, and here's how I think I can be additive toward your goals. We have met our best partners on our team literally from receiving their cold emails. And to reach out to Ben, please check out the description of the podcast.

If you had an opportunity to go back in time and meet your younger self, would you? And what advice would you have for yourself? Absolutely. Oh, that one.

That's quick. You know, I think about this a lot. And, you know, I spent most of my twenties working at huge companies, sometimes regretting it and saying, what if I did startups early on? But I learned so much at these large companies. I don't think I'd change anything.

I do look at my friends in startups who started them early, and, you know, there's certainly a world where you could do things earlier in a much larger way, but I don't think I would know what I know, or question what I question, without those experiences and really putting myself in the seat of our clients. That's helped us a lot. Awesome. Awesome.

Yeah. Check out the description for the LinkedIn, and Ben, you sound super active, super engaged, so that's perfect. Ben, thank you so much for your time.

This has been an amazing show, and we look forward to the exciting news that will be coming out of the organization that you are leading very successfully. And thank you for being part of a great show. Thank you, Sasha. Thank you, John.

Always a pleasure. Ben, it was an absolute pleasure. It's always a real treat to stand on the shoulders of giants. I wouldn't go that far.

To see the world. I would. No, to see the world from their point of view, and get a glimpse into a better future. We're optimistic.

We are very optimistic, but thank you both. And thank you to all of our listeners for tuning in to another show of the Security Podcast of Silicon Valley. This has been a Y Security production. I'm John McLaughlin, one of your hosts, joined by the other host, Sasha Sienkiewicz. Ben Colman, everyone, co-founder and CEO of Reality Defender.

Thanks, guys. Thanks. Thanks, both, appreciate it. Thanks.
