WEBVTT

00:00:00.080 --> 00:00:05.200
over the last couple of years one of my goals as a business owner has been to

00:00:03.520 --> 00:00:11.519
deepen our pool of talent and intellectual properties so that if Linus

00:00:07.919 --> 00:00:14.559
the guy gets hit by a bus Linus Media

00:00:11.519 --> 00:00:17.359
Group the company can survive but i know

00:00:14.559 --> 00:00:20.640
that for many of you Linus Tech Tips just wouldn't be the same without the

00:00:19.600 --> 00:00:26.480
Linus which is why we formulated a plan

00:00:23.840 --> 00:00:31.760
we posted this video on our sister channel ShortCircuit where i appeared

00:00:29.119 --> 00:00:36.480
to be the host but in actual fact i had no involvement whatsoever

00:00:34.239 --> 00:00:42.079
here's how my team made it and how monday.com who generously sponsored this

00:00:39.120 --> 00:00:47.520
time-consuming but super fun and very educational journey made the process a

00:00:45.040 --> 00:00:51.600
whole lot easier if you haven't heard of monday.com it's a productivity tool to

00:00:49.600 --> 00:00:55.160
help you manage you and your team's work from start to finish

00:01:02.079 --> 00:01:09.119
most deep fakes you see online are made by taking an existing video clip and

00:01:06.000 --> 00:01:11.119
then using software to replace the face

00:01:09.119 --> 00:01:15.119
and only the face unfortunately our team's task was a

00:01:13.360 --> 00:01:20.720
little bit more complicated because we wanted our final video to not only look

00:01:17.280 --> 00:01:22.400
like me but also sound like me Linus's

00:01:20.720 --> 00:01:26.880
voice is proof that his balls are the only two things he hasn't dropped yet

00:01:25.040 --> 00:01:30.400
this goal meant that we would have to write a script

00:01:28.159 --> 00:01:36.000
hire a Linus impersonator to read the script on camera record and edit that

00:01:33.280 --> 00:01:40.159
video and then deep fake it we used monday.com's recruitment management

00:01:38.000 --> 00:01:45.680
template with a few tweaks to keep track of everything one small problem there

00:01:43.200 --> 00:01:49.360
are no Linus impersonators well none that are usable anyway that we could

00:01:47.280 --> 00:01:52.399
find and even if we did find someone who could do Linus's voice what are the odds

00:01:51.280 --> 00:01:56.799
that that guy or gal also looks like Linus and i'm not

00:01:55.439 --> 00:02:00.880
even talking about facial features although that helps no matter how

00:01:58.640 --> 00:02:04.320
convincing the face swap the final video simply wouldn't work if the impersonator

00:02:03.280 --> 00:02:08.800
was say built like a brick or had curly hair

00:02:06.719 --> 00:02:13.680
okay then so we go the other way then get someone who looks like me and then

00:02:11.120 --> 00:02:16.879
deep fake the face and voice right not quite

00:02:14.480 --> 00:02:21.520
you see deep fake for voice or high fidelity voice conversion mostly exists

00:02:19.599 --> 00:02:26.879
in research papers right now so we couldn't record like a look-alike's take

00:02:23.840 --> 00:02:28.480
and then morph that audio clip to sound

00:02:26.879 --> 00:02:33.680
like Linus instead we sent high-quality audio and

00:02:31.440 --> 00:02:39.760
script reference pairs to a company called Resemble AI who used them to train

00:02:37.040 --> 00:02:42.879
a neural network to generate the voice from scratch

00:02:41.040 --> 00:02:46.800
i give consent to have my voice cloned by Resemble

00:02:44.480 --> 00:02:51.680
meanwhile james wrote a fresh script for our eventual fake Linus actor to read a

00:02:49.599 --> 00:02:55.280
keyboard review for the cleave from truly ergonomic the machine's first pass

00:02:54.319 --> 00:03:02.080
was a little rough just when you think you've reviewed every ergonomic keyboard

00:02:59.680 --> 00:03:06.080
on the planet a new company with a name like truly ergonomic launches something

00:03:03.760 --> 00:03:10.480
like this the cleave keyboard apparently the world's most comfortable keyboard

00:03:07.920 --> 00:03:15.360
but with some patience and a ton of help from Resemble by the way shout out to

00:03:12.720 --> 00:03:19.120
sakib muhammad you are awesome we were able to get several options for each

00:03:17.200 --> 00:03:24.239
sentence arrange our selections in premiere along with some room tone et

00:03:21.120 --> 00:03:26.640
voila an imperfect but definitely

00:03:24.239 --> 00:03:30.080
recognizable voice clone just when you think you've reviewed every ergonomic

00:03:28.239 --> 00:03:33.280
keyboard on the planet a new company with a name like truly ergonomic

00:03:31.680 --> 00:03:37.760
launches something like this the cleave keyboard apparently the world's most

00:03:35.040 --> 00:03:42.480
comfortable keyboard that is a pretty sexy sounding robot if i do say so

00:03:40.400 --> 00:03:46.239
myself now for the visuals we made a casting

00:03:44.319 --> 00:03:52.720
call on vancouver actor's guide looking for someone with my hair type face shape

00:03:49.120 --> 00:03:54.799
skin tone and well stature and boy did

00:03:52.720 --> 00:03:59.480
some people ever not read the ad but some people did and we ended up hiring a

00:03:56.799 --> 00:04:04.480
lovely gentleman named dylan tebow lttstore.com i surrendered my favorite

00:04:02.000 --> 00:04:08.400
clothes to help him get into character and well the COVID-19 pandemic had made

00:04:07.200 --> 00:04:12.879
his hair a little longer than the picture but we decided to run with it

00:04:10.720 --> 00:04:17.680
it's up here now that's way too far how often i use it now instead i have two

00:04:15.599 --> 00:04:22.639
control keys right next to each other how is that helpful hey dylan can you

00:04:20.320 --> 00:04:26.800
act more unreasonably upset about a small detail okay

00:04:25.360 --> 00:04:31.280
losing so many keys that your productivity suffers and it's priced at

00:04:28.800 --> 00:04:34.720
300 us dollars which cut dylan i don't want you to break any

00:04:33.120 --> 00:04:38.479
company property but do you think you could drop it

00:04:36.479 --> 00:04:42.720
oh my god armed with the footage from dylan's shoot we opted to use deep face

00:04:40.639 --> 00:04:46.880
lab a free tool that is surprisingly easy to use you just supply a video of

00:04:44.720 --> 00:04:51.680
your source so that's me and your destination so that's dylan and then use

00:04:49.440 --> 00:04:57.120
the various batch files to split the videos into frames extract faces from

00:04:54.400 --> 00:05:00.960
the frames train the ai model and then convert the final video

00:04:59.040 --> 00:05:05.919
let's look at each of those steps in a bit more detail most deep fakes are made

00:05:03.840 --> 00:05:10.080
by taking a famous video that already exists and then swapping a new face onto

00:05:07.919 --> 00:05:14.720
it but we're actually making a brand new video so that gives us a huge advantage

00:05:12.880 --> 00:05:19.039
not only are we able to use a set that Linus has shot on before

00:05:16.560 --> 00:05:22.560
but this one has a giant softbox that's always there that makes matching the

00:05:20.960 --> 00:05:27.440
lighting between our new video and our existing Linus footage really easy but

00:05:25.199 --> 00:05:30.960
just in case we also took brightness sweeps of dylan from multiple angles

00:05:29.280 --> 00:05:34.479
which helped the ai model become more generalizable

00:05:32.800 --> 00:05:38.080
we then grabbed the footage from Linus's ShortCircuit videos plus a couple LTT's

00:05:36.479 --> 00:05:42.160
with different lighting and face angles and put them all on a new timeline in

00:05:40.080 --> 00:05:47.759
premiere it's a good idea to remove any extra b-roll sponsor spots or non-Linus

00:05:45.280 --> 00:05:50.800
faces at this point to save time later next we exported the video and dropped

00:05:49.360 --> 00:05:55.039
it into our deep fake folder with the correct name so the batch files can

00:05:52.639 --> 00:05:58.240
point to it the first batch file splits the video into frames which you could

00:05:57.039 --> 00:06:04.479
have actually done in your editing software which also means less compression while the next one extracts

00:06:02.240 --> 00:06:09.680
faces from those frames now for the all-important step of massaging our data

00:06:06.720 --> 00:06:13.600
set for both size and quality monday.com's shareable boards helped us

00:06:11.759 --> 00:06:17.520
work through all the things we needed you need at least a couple thousand

00:06:15.680 --> 00:06:22.400
images for your source and your destination but the thing is you don't

00:06:19.680 --> 00:06:26.400
want too much so somewhere between two and ten thousand pictures of each face

00:06:24.479 --> 00:06:30.880
will do nicely depending on how much time you have and for quality you want

00:06:28.479 --> 00:06:35.680
to remove any faces that are blurry are of other people or are not actually

00:06:33.280 --> 00:06:40.960
faces at all it can be a bit tedious to go through so many images even if they

00:06:38.160 --> 00:06:44.960
are of someone so handsome but fortunately deep face lab has some

00:06:43.199 --> 00:06:49.520
built-in sorting tools to help it go faster with that done it's finally time

00:06:47.680 --> 00:06:53.120
to train the model there are a number of knobs to adjust in the training step to

00:06:51.600 --> 00:06:56.960
ensure that our hardware is being used to its maximum potential and a few that

00:06:55.280 --> 00:07:01.280
have to be tweaked over the course of training so that our faces have detail

00:06:58.720 --> 00:07:05.840
in the eyes teeth and skin tone for example it's basically a matter of

00:07:03.680 --> 00:07:09.280
turning up the key parameters until you get an out of memory error and then

00:07:08.160 --> 00:07:13.520
backing off to the point where you're confident it won't crash to get the best

00:07:11.680 --> 00:07:17.280
possible results we needed to build a pretty badass deep fake rig we started

00:07:15.759 --> 00:07:21.919
with the unlimited budget pc from a previous video it's equipped with a 9900KS

00:07:19.360 --> 00:07:25.680
at 5.1 gigahertz along with 64 gigabytes of RAM but since video memory

00:07:24.080 --> 00:07:31.360
is everything when it comes to deep fakes we upgraded the gaming video cards to an

00:07:27.840 --> 00:07:33.039
RTX Quadro 8000 with a tear-inducing 48

00:07:31.360 --> 00:07:36.080
gigabytes of video memory and while we do have two of these cards we were

00:07:34.400 --> 00:07:39.440
actually worried that sli would hurt our performance rather than help because

00:07:37.520 --> 00:07:44.400
deep face lab really isn't optimized for it so i guess rip sli

00:07:42.000 --> 00:07:49.199
fast forward a couple of days or weeks rather actually and our model previews

00:07:46.639 --> 00:07:54.400
were starting to look pretty detailed that means it's time to merge

00:07:51.759 --> 00:07:58.160
this can be done automatically or you can use the interactive converter to go

00:07:56.479 --> 00:08:03.440
through your video frame by frame and tweak settings like super resolution the

00:08:00.639 --> 00:08:07.360
blur of the mask edge or the face scale after that our settings were used to

00:08:05.120 --> 00:08:09.919
output a final video and

00:08:08.639 --> 00:08:13.599
just when you think you've reviewed every ergonomic keyboard on the planet a

00:08:12.000 --> 00:08:16.720
new company with the name like truly ergonomic launches something like this

00:08:15.199 --> 00:08:20.319
the cleave keyboard apparently the world's most comfortable keyboard

00:08:20.560 --> 00:08:26.319
well the face swap is good but the lip sync

00:08:23.919 --> 00:08:30.960
pretty far off the body language doesn't look very Linus-y my hair is curly and

00:08:29.360 --> 00:08:35.760
the voice sounds frankly just kind of tacked on at this

00:08:33.519 --> 00:08:39.680
point i thought we were pretty screwed but fortunately while this video was

00:08:37.680 --> 00:08:44.320
being made a new version of deep face lab came out that supports whole head

00:08:42.159 --> 00:08:48.720
swaps meaning we no longer needed an actor with similar hair so i took it

00:08:46.480 --> 00:08:53.600
upon myself to don the Costco jeans slather Dippity-do in my hair to make it

00:08:50.480 --> 00:08:56.360
as small and shiny as possible and talk

00:08:53.600 --> 00:09:01.200
with exaggerated facial expressions lttstore.com and we also used this

00:08:58.880 --> 00:09:05.519
opportunity to collect some foley this is the ambient noises you'd expect to

00:09:03.120 --> 00:09:11.200
hear when the host handles packaging for example the new model showed promise but

00:09:08.160 --> 00:09:13.040
it had its own new issues number one see

00:09:11.200 --> 00:09:17.279
how the hair derps out when he turns his head that's because our source data set

00:09:15.040 --> 00:09:23.040
contained images from multiple LTT videos meaning multiple slightly

00:09:20.240 --> 00:09:27.920
different gelled hairstyles that limits the machine to low detail approximations

00:09:25.760 --> 00:09:31.680
of my hair to make it crispier we removed most of the images from our data

00:09:29.680 --> 00:09:36.480
set keeping only stills from the original ShortCircuit video but that

00:09:34.240 --> 00:09:40.800
didn't end up solving it the hair looks great when i'm facing forward but since

00:09:38.560 --> 00:09:45.760
i actually never turned my head in that video we had no data for my right side

00:09:44.240 --> 00:09:50.640
so i did some detective work and i learned that Linus shot that ShortCircuit

00:09:47.440 --> 00:09:52.080
video on november 29 2019 so i

00:09:50.640 --> 00:09:56.320
looked through our internal email and chat history to try to find out if he

00:09:53.920 --> 00:10:01.760
shot anything else that same day as luck would have it he did including multiple

00:09:59.360 --> 00:10:06.959
takes where he turns the correct way over and over again issue number two you

00:10:05.040 --> 00:10:09.200
can still see some remnants of james in there

00:10:08.080 --> 00:10:14.240
handsome but not necessary to fix that we ran the

00:10:12.000 --> 00:10:18.640
merge three times to create three different masks then brought those into

00:10:16.399 --> 00:10:22.560
premiere and created a composite that completely removed james while

00:10:20.480 --> 00:10:26.880
preserving the background and the new generated Linus

00:10:24.800 --> 00:10:30.480
not new and improved just new and generated i see you there

00:10:29.200 --> 00:10:35.360
and i see a result that i'm honestly

00:10:32.800 --> 00:10:40.079
really proud of i can't believe what the team accomplished here like it's not

00:10:38.240 --> 00:10:43.680
perfect james could have spent a little bit more time mastering my cat-like

00:10:42.480 --> 00:10:49.519
grace but given that this was our first attempt

00:10:46.959 --> 00:10:54.720
and we learned so much doing it not to mention had a ton of fun i am glad that

00:10:52.480 --> 00:10:58.240
we completely blew our time budget for this project

00:10:55.920 --> 00:11:02.720
because hey at least we had monday.com to help us keep that under control

00:11:00.320 --> 00:11:07.279
monday.com made it way easier to keep track of tasks recruit actors and

00:11:05.120 --> 00:11:11.279
delegate assignments across departments monday.com is built for the way you want

00:11:09.200 --> 00:11:15.360
to use it with tons of customization you can import any data you need track

00:11:13.279 --> 00:11:19.120
progress among team members and automate tasks so you can focus on getting work

00:11:17.200 --> 00:11:24.079
done don't take my word for it though give it a try with a 30-day free trial

00:11:21.680 --> 00:11:28.160
at monday.com when you sign up today at the link below which is monday.com it's

00:11:26.640 --> 00:11:32.959
really easy when your service is the same as your url monday.com right it's

00:11:30.800 --> 00:11:37.279
easy to find guys if you haven't seen it already the finished fully deep faked

00:11:35.360 --> 00:11:42.800
video is up on our ShortCircuit channel so go have a look thanks for watching

00:11:40.240 --> 00:11:45.839
is this really me you'll never know