WEBVTT

00:00:00.320 --> 00:00:08.200
okay do you remember that project I was working on where for the better part of

00:00:03.639 --> 00:00:10.440
6 months I built up this badass 36 core

00:00:08.200 --> 00:00:15.400
dual Xeon server machine to handle our video encoding and transcoding tasks

00:00:12.639 --> 00:00:21.560
over the network here well fast forward almost a year and many many hours spent

00:00:19.359 --> 00:00:26.039
on diagnosis not to mention a kick in the right direction from this post over

00:00:23.199 --> 00:00:32.759
on Puget Systems I think I finally figured out why we never got quite the

00:00:29.279 --> 00:00:35.040
performance that I expected is it

00:00:32.759 --> 00:00:41.760
possible then that a $4,000 22 core CPU could be outperformed

00:00:39.480 --> 00:00:47.160
by one that costs only a few hundred bucks for video encoding is it possible

00:00:45.039 --> 00:00:52.640
that I made a mistake nothing to hold on

00:00:50.000 --> 00:00:56.480
to fails a lot if they reading the sign I'm definitely getting their attention

00:00:54.199 --> 00:01:00.680
so does one of the recurring themes of these laptop or bus videos

00:00:59.239 --> 00:01:06.119
become Linus failure

00:01:02.399 --> 00:01:09.410
montages I mean aside from those

00:01:06.119 --> 00:01:12.470
ones let's find

00:01:16.759 --> 00:01:24.920
out FreshBooks is the super simple invoicing solution that lets you get

00:01:21.320 --> 00:01:26.640
organized save time and get paid faster

00:01:24.920 --> 00:01:30.960
click now at the link in the video description to try it for

00:01:28.520 --> 00:01:36.040
free Okay so to open this video up we need to take a closer than usual look at

00:01:33.560 --> 00:01:41.640
my test bench I wanted to eliminate bottlenecks wherever possible so that

00:01:38.280 --> 00:01:44.040
the CPU is the only factor in my

00:01:41.640 --> 00:01:49.479
performance evaluation so for that reason most of the performance testing

00:01:46.439 --> 00:01:54.479
was done on an Intel 750 series 1.2 TB

00:01:49.479 --> 00:01:57.320
NVMe SSD a GTX Titan X 128 gigs of DDR4

00:01:54.479 --> 00:02:03.039
quad-channel memory on an X99 Deluxe II motherboard and the CPUs tested are as

00:01:59.920 --> 00:02:07.560
follows Intel's top of the server line

00:02:03.039 --> 00:02:10.640
2699 v4 22-core Xeon their top of the

00:02:07.560 --> 00:02:11.840
high-end desktop line 10-core Core i7

00:02:10.640 --> 00:02:19.519
Extreme 6950X the 8-core and 6-core 6900K and

00:02:16.680 --> 00:02:25.120
6800K and finally I decided to throw in their flagship mainstream 6700K quad

00:02:22.920 --> 00:02:31.080
core to give us the most complete picture possible at the end of the day

00:02:28.239 --> 00:02:35.000
as for the video tests I apologize in advance if the codec or encoder

00:02:33.400 --> 00:02:38.840
application that you personally prefer wasn't covered but this was done as much

00:02:37.200 --> 00:02:43.640
to optimize the Linus Media Group workflow as it was for the purposes of

00:02:40.840 --> 00:02:49.400
creating a video so I'm looking at four different scenarios that we encounter

00:02:45.920 --> 00:02:53.680
pretty much daily one transcoding a 4K

00:02:49.400 --> 00:02:55.800
MXF off of our Sony FS5 to 1080p CineForm

00:02:53.680 --> 00:03:01.400
our mezzanine codec of choice for editing two exporting a finished project

00:02:59.319 --> 00:03:07.680
in this case a green screened episode as fast as possible directly to H.264 for

00:03:04.560 --> 00:03:10.440
publication to YouTube three a quick

00:03:07.680 --> 00:03:14.920
export in CineForm how we normally export so that a network Media Encoder machine

00:03:12.480 --> 00:03:21.040
with a watch folder can transcode it to H.264 and automatically upload it to the

00:03:17.360 --> 00:03:24.879
channel and four finally the performance

00:03:21.040 --> 00:03:27.560
of that CineForm to H.264 conversion with

00:03:24.879 --> 00:03:32.799
the 1080p to 4K upsampling that we perform for the reasons we covered more

00:03:29.360 --> 00:03:35.519
thoroughly in this video here so I ran

00:03:32.799 --> 00:03:39.920
every test with and without CUDA acceleration enabled in Adobe Media

00:03:37.360 --> 00:03:45.560
Encoder and used a second machine to capture the screen output with CPU and

00:03:42.640 --> 00:03:52.560
GPU usage displayed so I could review it later let's begin then with scenario one

00:03:49.439 --> 00:03:55.560
this is what most people probably expect

00:03:52.560 --> 00:03:57.840
from a multicore CPU in a video encoding

00:03:55.560 --> 00:04:02.920
benchmark traditionally this is one of the easiest workloads to scale

00:03:59.760 --> 00:04:04.879
across more cores and our CPU usage graph

00:04:02.920 --> 00:04:09.840
indicates that all is working beautifully throwing a GPU into the mix

00:04:07.879 --> 00:04:14.680
levels the playing field somewhat but this won't surprise anyone who knows how

00:04:11.799 --> 00:04:20.919
GPU dependent a video codec CineForm is and how that bastard law of diminishing

00:04:17.519 --> 00:04:24.080
returns works moving on to exporting a

00:04:20.919 --> 00:04:27.280
project directly from our CineForm timeline

00:04:24.080 --> 00:04:30.680
in CPU only mode we see nice scaling

00:04:27.280 --> 00:04:32.800
with more cores but maybe not quite the

00:04:30.680 --> 00:04:37.240
dominance we'd expect from a chip with and yes I know it doesn't quite work

00:04:34.160 --> 00:04:40.440
this way like 60 GHz of theoretical

00:04:37.240 --> 00:04:44.919
total performance this is a hint of

00:04:40.440 --> 00:04:47.840
things to come and Bam throwing a GPU

00:04:44.919 --> 00:04:54.320
into the mix paints a much more extreme picture here the CUDA accelerated code

00:04:50.880 --> 00:04:57.759
path not only reaps very little benefit

00:04:54.320 --> 00:05:00.280
from more than six cores it punishes

00:04:57.759 --> 00:05:07.680
CPUs with lower clock speed in a way that I really didn't expect observed GPU

00:05:04.199 --> 00:05:10.720
usage is much lower than any other

00:05:07.680 --> 00:05:14.240
processor in this test for our $4,000

00:05:10.720 --> 00:05:17.840
chip and the CPU usage we see of about

00:05:14.240 --> 00:05:20.240
25% tells us this is not a heavily

00:05:17.840 --> 00:05:25.440
threaded workload oops all right so let's break that down

00:05:23.000 --> 00:05:31.880
then into the individual steps and find out where our heavy multi-thousand dollar

00:05:28.400 --> 00:05:34.319
investment in an uber Xeon falls apart

00:05:31.880 --> 00:05:38.880
exporting the project from a CineForm 1080p timeline to a CineForm 1080p file

00:05:37.039 --> 00:05:43.280
theoretically elsewhere on the network but I'm using my NVMe drive as a target

00:05:41.400 --> 00:05:48.880
for these benchmarks for consistency's sake is pretty flat across the board and

00:05:46.120 --> 00:05:53.880
curiously this is true with or without CUDA acceleration enabled in Media

00:05:50.600 --> 00:05:56.360
Encoder GPU usage is 85% regardless of

00:05:53.880 --> 00:06:02.759
which drop down so this is clearly nearly 100% GPU dependent which leads us

00:06:00.160 --> 00:06:10.319
then to the second step in the process converting from CineForm 1080 to H.264 4K

00:06:07.039 --> 00:06:14.080
in CPU only mode we see a similar trend

00:06:10.319 --> 00:06:16.599
to our initial ingest test more horses

00:06:14.080 --> 00:06:22.880
is better but only to a point then in GPU assisted mode there it is we are

00:06:20.120 --> 00:06:28.280
almost entirely bound by per-core performance with a lowly quad core

00:06:25.479 --> 00:06:33.680
costing one-tenth as much handily beating our Xeon be

00:06:31.039 --> 00:06:38.000
so then did I horribly misconfigure our video encoding ingest stations and

00:06:35.520 --> 00:06:42.520
output server are Xeons basically pointless in video

00:06:39.800 --> 00:06:48.280
work well if you're looking simply at the graphs I just showed you along with

00:06:44.759 --> 00:06:50.599
these charts of approximate CPU and GPU

00:06:48.280 --> 00:06:56.560
usage in all the different scenarios I tested then it's pretty clear that these

00:06:53.199 --> 00:06:58.879
lower clocked many core chips are being

00:06:56.560 --> 00:07:04.160
underutilized and the money though I fortunately didn't pay for them would be

00:07:00.759 --> 00:07:06.280
better invested almost anywhere else but

00:07:04.160 --> 00:07:10.400
as always the real world isn't really that simple and it's going to come down

00:07:07.879 --> 00:07:15.560
to the needs and workflow of each individual or organization

00:07:12.960 --> 00:07:20.560
virtualization can be used to get damn near 100% scaling out of as many cores

00:07:17.919 --> 00:07:25.199
as you please encoding software like Sorenson Squeeze can process many files at a

00:07:23.120 --> 00:07:31.440
time and on the subject of different software testing any given codec in any

00:07:28.960 --> 00:07:36.000
given software could yield very different results from what you're

00:07:32.599 --> 00:07:38.879
looking at here so there's no way around

00:07:36.000 --> 00:07:43.759
testing just make sure that when you do so for yourself you go in without any

00:07:41.960 --> 00:07:49.039
assumptions about what the right tool for the job will end up being so you can

00:07:46.479 --> 00:07:55.159
avoid pulling a Linus speaking of tools for the job it's

00:07:52.319 --> 00:07:58.840
summer apparently something something boarding Planes Trains driving a car

00:07:57.360 --> 00:08:05.039
leave your worries behind okay I don't know what any of that stuff in my notes is but today's sponsor is TunnelBear

00:08:03.080 --> 00:08:11.080
and if today's lack of online privacy brings out your inner grizzly

00:08:07.720 --> 00:08:14.639
bear rawr then you can try TunnelBear

00:08:11.080 --> 00:08:16.720
it's simple and it is free to try at the

00:08:14.639 --> 00:08:21.560
link in the video description it's the easy to use VPN that makes it so you can

00:08:19.120 --> 00:08:26.199
browse privately and enjoy a more open internet without all that hassle

00:08:23.800 --> 00:08:31.319
associated with more complex VPN solutions any you know port forwarding

00:08:28.560 --> 00:08:37.120
or DNS or any nonsense like that you just click the button and boom you can

00:08:34.279 --> 00:08:41.200
tunnel into up to 20 different countries and it will appear to the websites and

00:08:39.159 --> 00:08:44.640
services that you are using as though you are coming from that country and

00:08:43.039 --> 00:08:50.959
TunnelBear has a top-rated privacy policy and does not log your activity so

00:08:48.240 --> 00:08:54.959
try it free with 500 megabytes and no credit card required and if you decide

00:08:53.320 --> 00:08:59.519
you like it and you want to get a year of unlimited data you can save 10% by

00:08:57.680 --> 00:09:03.920
going to tunnelbear.com T linked in the video description so

00:09:02.600 --> 00:09:07.760
thanks for watching guys if this video sucked you know what to do but if it was awesome get subscribed hit that like

00:09:06.440 --> 00:09:12.720
button or even check out the link to where to buy the stuff we featured at

00:09:10.200 --> 00:09:16.040
Amazon in the video description also linked in the description is our merch

00:09:14.120 --> 00:09:19.399
store which has cool t-shirts just like this one and our community forum which

00:09:17.839 --> 00:09:22.880
you should totally join now that you're done doing all that stuff you're probably wondering what to watch next so

00:09:21.560 --> 00:09:28.640
check out that little button in the top right to check out our latest video over

00:09:24.880 --> 00:09:28.640
on Channel Super Fun
