WEBVTT

00:00:00.000 --> 00:00:07.040
Bitdefender sponsored this video so I can show you that I can juggle computers

00:00:07.040 --> 00:00:13.880
and tell you about our amazing new product, Linus Strength Pills. Watch as I make David watch all of 2019.

00:00:15.000 --> 00:00:17.000
Not this.

00:00:30.000 --> 00:00:36.160
Thanks to Linus Pills.

00:00:36.160 --> 00:00:42.960
I'm unstoppable. Start your subscription today by emailing me at Linus.Linus.Linus.com.

00:00:42.960 --> 00:00:49.080
I don't remember recording any of this. Bitdefender did sponsor a video about protecting yourself from deepfakes, though.

00:00:49.080 --> 00:00:52.600
Because now, these have gotten so good that you didn't even realize that I'm not the

00:00:52.600 --> 00:00:55.840
real Linus either.

00:00:55.840 --> 00:01:07.520
What are you guys doing?

00:01:07.520 --> 00:01:13.520
Fully AI-generated videos, at least for now, still have some easy-to-spot tells, especially

00:01:13.520 --> 00:01:20.600
if they're long or if they involve a lot of movement. But a simple head-replacement deepfake, or a stationary figure at a desk telling you

00:01:20.600 --> 00:01:23.600
all about their latest get-rich-quick scheme?

00:01:23.600 --> 00:01:27.280
Those have gotten shockingly convincing over the last five years.

00:01:27.280 --> 00:01:32.840
Let's talk about both how we deepfaked and how we fully generated me for this video using

00:01:32.840 --> 00:01:37.320
just commodity hardware. But before we do that, an important message.

00:01:37.320 --> 00:01:41.280
If you've got a loved one who's going to need some help recognizing what's AI and

00:01:41.280 --> 00:01:44.680
what's not, send them to this timestamp.

00:01:44.680 --> 00:01:50.400
We've got you. Starting with our deepfake, the process is still pretty similar to how we did it last

00:01:50.400 --> 00:01:53.600
time, just five years refined.

00:01:53.600 --> 00:01:56.840
First, find an actor with a similar body shape to your target.

00:01:56.840 --> 00:02:01.100
You can see why this matters in this test that we did with Plouffe, who loves displays.

00:02:01.100 --> 00:02:05.440
When you blow it up on a TV, it's pretty easy to spot anomalies like his hat or his

00:02:05.440 --> 00:02:12.520
beard leaving a soft outline. But on your phone, especially with poor eyesight, it mostly just looks like a guy with a slim face

00:02:12.520 --> 00:02:16.840
on a chunky body. Oh my god, am I allowed to say that? It's okay, I wrote the script.

00:02:16.840 --> 00:02:23.400
So we parsed the subreddit for the latest "is this Linus?" meme, then chose Chase for our

00:02:23.400 --> 00:02:31.680
real attempt. We used DeepFaceLab to train a model on about 7,000 recent images of my face and, boom, much

00:02:31.680 --> 00:02:36.880
more convincing. And this is both super cool and super scary.
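
NOTE
DeepFaceLab did the heavy lifting here, and its exact commands vary by version, so treat this as a minimal sketch of just the first step (harvesting face crops from footage) using plain OpenCV. The file names are made up, and DeepFaceLab uses stronger detectors (e.g. S3FD) than this bundled Haar cascade.
import pathlib
import cv2  # pip install opencv-python
cascade = cv2.CascadeClassifier(
    cv2.data.haarcascades + "haarcascade_frontalface_default.xml")
out = pathlib.Path("faces")
out.mkdir(exist_ok=True)
video = cv2.VideoCapture("source_footage.mp4")  # hypothetical input clip
count = 0
while True:
    ok, frame = video.read()
    if not ok:
        break
    gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
    # Detect faces and save each crop as a training image.
    for (x, y, w, h) in cascade.detectMultiScale(gray, scaleFactor=1.3, minNeighbors=5):
        cv2.imwrite(str(out / f"face_{count:05d}.jpg"), frame[y:y + h, x:x + w])
        count += 1
video.release()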

00:02:36.880 --> 00:02:41.160
It was at least a hundred times easier than when we did it last time.

00:02:41.160 --> 00:02:45.600
Syncing up the lips with the audio is still pretty tough, but if you keep your clips short

00:02:45.720 --> 00:02:50.400
and you edit them together in a punchy manner with alternate angles and close ups, like

00:02:50.400 --> 00:02:55.040
you typically would, you can splice together something pretty convincing and pretty long,

00:02:55.040 --> 00:02:58.120
one 5-7 second shot at a time.

00:02:58.120 --> 00:03:03.040
With that said, I'm sure that a lot of you guys could still tell, because realistically

00:03:03.040 --> 00:03:06.680
you dwell in the same corners of the internet that I do, and you've been watching with

00:03:06.680 --> 00:03:13.040
a sense of awe and with dread as deepfakes and video generation have gone from obviously

00:03:13.040 --> 00:03:17.520
fake garbage to "oh, this requires a little bit of scrutiny."

00:03:17.520 --> 00:03:21.520
But you, savvy viewer, are not my concern.

00:03:21.520 --> 00:03:25.520
Many, maybe even most people, can't tell anymore.

00:03:25.520 --> 00:03:31.200
And online scams and fraud are increasing every year to the point where in 2024 losses

00:03:31.200 --> 00:03:34.760
are estimated at over $1 trillion.

00:03:34.760 --> 00:03:38.800
That number came from Bitdefender, who is in the business of knowing these things.

00:03:38.800 --> 00:03:44.640
And the scariest part is that a huge portion of the scam industry is still using old-school

00:03:44.640 --> 00:03:51.200
text-to-voice phone calls. I mean, imagine if you could hit victims with something more like this.

00:03:51.200 --> 00:03:54.800
Hey, I'm in a jam. Can I borrow $5,000?

00:03:54.800 --> 00:03:59.360
And I know some of you guys are probably thinking, tough break, boomers.

00:03:59.360 --> 00:04:05.280
You had enough of the wealth anyway. But that's easy to say until it's your family member.

00:04:05.280 --> 00:04:09.920
We have multiple people in this office who have had family members impacted.

00:04:09.920 --> 00:04:14.880
And again, the scariest part is how easy it was to generate that clip.

00:04:14.880 --> 00:04:20.680
All we needed for it and for the ones at the start of this video were start and end keyframes.

00:04:20.680 --> 00:04:26.040
Once those were captured, or in many cases just scraped off of an existing video or Facebook

00:04:26.040 --> 00:04:31.440
photo... My name is Nicholas Plouffe and I hate displays.

00:04:31.440 --> 00:04:36.680
We got an advanced subscription to openart.ai and it was off to the races.
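
NOTE
To be clear, this is not OpenArt's real API. The endpoint, field names, and model id below are invented, but a keyframe-driven generation request generally has this shape: a prompt plus a start and end frame in, a short clip out.
import base64
import requests  # pip install requests
def b64(path):  # inline a keyframe image as base64
    with open(path, "rb") as f:
        return base64.b64encode(f.read()).decode()
resp = requests.post(
    "https://api.example-videogen.test/v1/generate",  # hypothetical endpoint
    headers={"Authorization": "Bearer YOUR_API_KEY"},
    json={
        "model": "some-video-model",                # hypothetical model id
        "prompt": "a man at a desk talks to camera",
        "first_frame": b64("start_keyframe.png"),   # hypothetical field names
        "last_frame": b64("end_keyframe.png"),
        "duration_seconds": 7,
    },
)
resp.raise_for_status()
with open("clip.mp4", "wb") as f:
    f.write(resp.content)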

00:04:36.680 --> 00:04:39.840
Now to be clear, we're not actually recommending openart.ai.

00:04:39.840 --> 00:04:46.640
They've gotten a lot of deserved hate. It's just that we used it for this project because Sora 2, which was our first choice,

00:04:46.640 --> 00:04:51.320
won't let us generate anything with me in it and openart allows us to quickly experiment

00:04:51.320 --> 00:04:57.680
with many different models. Choosing a model, by the way, is not as simple as just grabbing the latest one.

00:04:57.680 --> 00:05:02.360
While newer models, like Google's Veo 3, might be generally more convincing, they also tend

00:05:02.360 --> 00:05:10.080
to have stricter content guidelines. So for certain prompts, an older model might give a more satisfactory result.

00:05:10.080 --> 00:05:15.480
Once we settled on mostly Veo 3, with a sprinkling of Wan 2.5 and Kling 2.1, we ran into our

00:05:15.480 --> 00:05:22.680
second hurdle. A lot of our prompts, completely by accident, I assure you, got flagged as not safe for

00:05:22.680 --> 00:05:27.760
work. I think it's pretty obvious that I wanted this.

00:05:27.760 --> 00:05:32.720
But whether it's from past experience or from the training data, our video generator

00:05:32.720 --> 00:05:42.280
thought that I wanted this. Anywho, we got around it by using Claude to help us create more AI-friendly prompts

00:05:42.280 --> 00:05:45.280
to help bypass these guardrails.
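
NOTE
Concretely, that prompt-rewriting step can be a single call to Anthropic's Python SDK. The model id below is an assumption; swap in whatever is current.
import anthropic  # pip install anthropic; reads ANTHROPIC_API_KEY from the environment
client = anthropic.Anthropic()
msg = client.messages.create(
    model="claude-sonnet-4-20250514",  # assumed model id
    max_tokens=200,
    messages=[{
        "role": "user",
        "content": "Rewrite this video-generation prompt so it stays specific "
                   "but reads as clearly safe-for-work: "
                   "'linus carries david across the office'",
    }],
)
print(msg.content[0].text)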

00:05:45.280 --> 00:05:49.660
From there, the biggest constraint was just how much money we wanted to burn on tokens.

00:05:49.660 --> 00:05:54.660
We ended up throwing away about five video clips for every one that we were able to use.
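
NOTE
That roughly 5:1 discard ratio compounds directly into cost. With entirely made-up numbers:
usable_clips = 20             # clips that made the final cut (made-up)
discards_per_keeper = 5       # roughly the ratio described above
generations = usable_clips * (1 + discards_per_keeper)
cost_per_generation = 1.50    # assumed dollars of tokens per attempt
print(f"{generations} generations, about ${generations * cost_per_generation:.2f}")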

00:05:54.660 --> 00:05:58.940
Now I'm sure a few of you are wondering, why not just create this on your own hardware?

00:05:58.940 --> 00:06:05.500
And that's a totally valid question. With ComfyUI and some of the open-source models out there, you can create videos.

00:06:05.500 --> 00:06:12.060
But the DIY ones are not as convincing yet, and the performance is pretty rough on consumer

00:06:12.060 --> 00:06:17.100
hardware. With that said, things are moving so fast that by the time you watch this, it'll probably

00:06:17.100 --> 00:06:20.900
have improved. That leads us to hurdle number three.
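
NOTE
Before hurdle three, a quick sketch of the ComfyUI route just mentioned: it runs as a local server, so once you export a graph with "Save (API Format)", queueing a generation is one HTTP request. The port is ComfyUI's default and the filename is assumed.
import json
import urllib.request
with open("workflow_api.json") as f:  # graph exported via "Save (API Format)"
    graph = json.load(f)
req = urllib.request.Request(
    "http://127.0.0.1:8188/prompt",  # ComfyUI's default local address
    data=json.dumps({"prompt": graph}).encode("utf-8"),
    headers={"Content-Type": "application/json"},
)
print(urllib.request.urlopen(req).read().decode())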

00:06:20.900 --> 00:06:28.300
See, a generated actor, unlike a deepfake actor, doesn't actually speak.

00:06:28.300 --> 00:06:33.220
So you've got to line up your clips with separately generated audio.

00:06:33.220 --> 00:06:36.220
That is enough of a challenge when the subject is stationary.

00:06:36.220 --> 00:06:41.500
You throw in some walk-and-talk, or walk-and-carry-your-colleague, and things get pretty

00:06:41.500 --> 00:06:46.980
rough. Unlike this WAN desk pad from lttstore.com. We use it all the time and it's still nice

00:06:46.980 --> 00:06:50.460
and soft to the touch. But back on subject, look at this clip.

00:06:50.460 --> 00:06:55.500
Not too bad. Add audio. "So I can show you that I can juggle computers."

00:06:55.500 --> 00:06:59.020
Ooh, that's a yikes.

00:06:59.020 --> 00:07:04.340
Now we tried lip-syncing services, but those completely fell apart when it wasn't a simple

00:07:04.340 --> 00:07:11.500
talking-head shot. So our editing supervisor extraordinaire Emily used Fish Audio to generate a bunch of different

00:07:11.500 --> 00:07:14.620
audio versions, and then we picked the closest matches.
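
NOTE
We don't know Emily's exact process. One crude way to shortlist generated takes against a clip, assuming the takes are WAV files and the target length is known, is to rank them by how closely their duration matches:
import pathlib
import wave  # stdlib; assumes the takes are WAV files
def duration_s(path):
    with wave.open(str(path)) as w:
        return w.getnframes() / w.getframerate()
target = 6.8  # seconds of on-screen mouth movement (made-up number)
takes = sorted(pathlib.Path("takes").glob("*.wav"),
               key=lambda p: abs(duration_s(p) - target))
print("closest take:", takes[0] if takes else "none found")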

00:07:14.620 --> 00:07:20.620
Perfect? No. But this was a just-for-fun video to see what we could do with limited time and with the

00:07:20.620 --> 00:07:26.620
tools at hand. Scammers on the other hand, they can spend a lot more time and money on these clips if

00:07:26.620 --> 00:07:31.140
they think they can make money on them. That's where the sponsor of today's video comes in.

00:07:31.140 --> 00:07:37.460
That is one thing fake Linus was right about. It's Bitdefender. And their message today is pretty simple, but very important.

00:07:37.460 --> 00:07:42.580
And it's one that I agree with. Common sense is not always enough anymore.

00:07:42.580 --> 00:07:46.540
What we showed you guys today is just the beginning for bad actors out there.

00:07:46.540 --> 00:07:51.820
Scammers are constantly innovating their tactics and rapidly adopting AI, making it more difficult

00:07:51.820 --> 00:07:56.540
than ever to distinguish between what's a real emergency and what's a manufactured

00:07:56.540 --> 00:08:03.020
call that might fool a scared loved one. If you're interested in protecting yourself and your loved ones from all sorts of scams,

00:08:03.020 --> 00:08:07.300
you can get 90 days of Bitdefender's premium security product absolutely free at the link

00:08:07.300 --> 00:08:12.580
below, which includes scam protection, your AI-powered defense against online fraud.

00:08:12.580 --> 00:08:16.900
Now let's talk about how to spot a deepfake or a generated scam video.

00:08:16.900 --> 00:08:21.020
Professor Hany Farid does this for a living and gave a pretty cool TED talk earlier this

00:08:21.020 --> 00:08:26.180
year here in Vancouver, highlighting cutting edge strategies for detecting AI imagery.

00:08:26.180 --> 00:08:30.780
Some of them, like analyzing image noise, are probably not going to help the average person

00:08:30.780 --> 00:08:36.740
that much. But what we can do is look at things like shadows and vanishing points.

00:08:36.740 --> 00:08:41.740
See, we live in a 3D space. Whoa.

00:08:41.740 --> 00:08:48.620
But AI is creating 2D images in an attempt to simulate a 3D space, and it's doing that

00:08:48.620 --> 00:08:54.180
without really a proper understanding of the laws of physics that govern light.

00:08:54.180 --> 00:09:00.020
Let's say you stick a light in a room. You know intuitively that shadows will be cast away from it.

00:09:00.020 --> 00:09:05.460
AI kind of gets this and tries to get it right, but if you try to align the shadows

00:09:05.460 --> 00:09:10.060
with the light source, odds are that they're not going to converge properly.

00:09:10.060 --> 00:09:14.940
And the same goes for something that human artists have understood for centuries: perspective

00:09:14.940 --> 00:09:22.220
drawing and vanishing points. See, in a real picture, every set of parallel lines eventually converges towards what's called the vanishing

00:09:22.220 --> 00:09:27.660
point on the horizon, like this, and this, and this.

00:09:27.660 --> 00:09:33.180
But AI fails at this and will draw lines that don't converge properly.
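
NOTE
Farid's vanishing-point check can be done numerically. A toy sketch with hand-traced pixel coordinates that are entirely made up: extend edges that should be parallel in 3D and see whether their pairwise intersections cluster.
import itertools
import numpy as np  # pip install numpy
def line_intersection(a1, a2, b1, b2):
    # Homogeneous coordinates: the cross product of two points gives the
    # line through them; the cross product of two lines gives their meeting point.
    l1 = np.cross([*a1, 1.0], [*a2, 1.0])
    l2 = np.cross([*b1, 1.0], [*b2, 1.0])
    x, y, w = np.cross(l1, l2)
    return np.array([x / w, y / w])
# Endpoints of traced edges that should share one vanishing point (made up).
segments = [((100, 400), (300, 320)),
            ((120, 500), (340, 380)),
            ((90, 600), (330, 430))]
points = [line_intersection(*s1, *s2)
          for s1, s2 in itertools.combinations(segments, 2)]
print("intersections:", np.round(points, 1))
print("spread:", np.round(np.std(points, axis=0), 1))
# A tight cluster suggests consistent perspective; wide scatter is a red flag.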

00:09:33.180 --> 00:09:36.540
Is this always easy or convenient to check?

00:09:36.620 --> 00:09:41.180
No, because unfortunately we are far beyond the days of Will Smith's face melting as

00:09:41.180 --> 00:09:45.460
he eats his non-convergent spaghetti, and depending on the model and how much effort

00:09:45.460 --> 00:09:50.260
someone puts in, sure, you might spot an extra limb, or the lack thereof, and instantly

00:09:50.260 --> 00:09:55.060
know the image is fake, but it's gotten much harder to tell right away and is advancing

00:09:55.060 --> 00:10:01.260
at a rapid pace. So quickly, in fact, that today's physics quirks could probably be ironed out in the

00:10:01.260 --> 00:10:05.940
coming months rather than years. So what can you count on?

00:10:06.020 --> 00:10:11.380
For now, your best bet is to be more skeptical about any content that you see, especially

00:10:11.380 --> 00:10:16.060
on social media. Or at least, don't rely on it for trustworthy news.

00:10:16.060 --> 00:10:19.500
And if you see anything suspicious, do some digging.

00:10:19.500 --> 00:10:24.180
Find out if it's real or just more AI propaganda. Use other people to help you.

00:10:24.180 --> 00:10:29.340
Have a look at the discussion. Even just looking a little closer, to see if there's kind of a weird shimmer around an

00:10:29.340 --> 00:10:33.860
object or if some of the lighting doesn't look quite right, can help you spot the slop.

00:10:33.860 --> 00:10:38.620
And whatever you do, please don't answer any urgent requests for your information or

00:10:38.620 --> 00:10:41.740
for your money or click on any sketchy links.

00:10:41.740 --> 00:10:46.020
If someone reaches out to you, the safest thing to do is say, hey, I'm going to call

00:10:46.020 --> 00:10:50.380
you back at the number that I already have for you and confirm if this is you.

00:10:50.380 --> 00:10:54.180
Thanks again to Bitdefender for sponsoring this video. They're a global leader in cybersecurity.

00:10:54.180 --> 00:10:58.740
They've got over 17 years of AI innovation under their belts, starting in 2008 when they

00:10:58.740 --> 00:11:02.340
introduced AI- and machine-learning-based threat detection.

00:11:02.340 --> 00:11:07.540
While AI's ability to reproduce and scale tactics is a threat, its replication patterns

00:11:07.540 --> 00:11:13.420
can also present opportunities for detection. We're going to have a link for their services in the video description and I sincerely hope

00:11:13.420 --> 00:11:18.620
this video helps you and your loved ones avoid getting fooled by AI.

00:11:18.620 --> 00:11:23.580
Thanks for watching, guys. If you liked this video, maybe check out the last time we tried to deepfake me over five

00:11:23.580 --> 00:11:27.620
years ago. It really has come a long way.

00:11:27.620 --> 00:11:31.620
Not even necessarily in terms of the convincingness, but in terms of the ease.
