Our 36 Core Video Rendering Server – Finally Explained
Linus Tech Tips
·2016-05-06
0:00
this journey begins over six months ago
0:03
when i reached out to Intel about supporting us with some chips a
0:07
low-power xeon to build the high-speed storage server for our new office that i
0:11
first showed off here then a pair of their top of the line e5
0:17
2699 v3 18 core xeon processors to build
0:21
a network video rendering server also
0:24
for the new office well as it turns out they couldn't send us the low power chip
0:28
for the storage server so we bought our own but whatever the reason was they
0:32
were able to honor our request for the pair of $4,500
0:37
processors so it is with much thanks to Intel along
0:41
with supermicro who provided a dual socket motherboard kingston who provided
0:45
128 gigs of ddr4 ecc RAM and norco
0:49
Noctua and fsp who provided our case cooling and redundant power that we are
0:54
able to bring you these findings because you see the deal was this
0:58
send us the chips and we'll make a video about how we're using them which sort of
1:03
puts a lot of pressure on us to figure out not only if the concept of network
1:07
rendering also known as a render farm works i mean that's been pretty standard
1:11
stuff for years especially in animation but also to find a way to efficiently
1:17
use those resources in our workflow
1:21
so without further ado thanks to weeks of work by edsel and much patience from
1:26
the rest of the team i am pleased to present our new editing workflow it's
1:32
fast it has built-in redundancy for our files and to quote dimitri from hardware
1:36
canucks who has already switched to it it brought the joy back to 4k video
1:41
editing for me so here we go
1:53
the logitech g303 features a lightweight design and advanced optical sensor with
1:57
delta zero technology for precise tracking and RGB lighting to match your
2:01
setup check out the link in the video description to learn more so the most
2:05
obvious bottleneck in a video editor's daily life is waiting around for
2:10
encoding tasks to complete outputting a finished ready to upload
2:15
h.264 video file can take anywhere from 15 to 30 minutes for us with one pass
2:20
vbr or even over an hour with two passes
2:24
so that was the first thing we tried to tackle with the 36 core server machine
2:29
for software telestream episode and sorenson squeeze desktop were the front
2:34
runners initially telestream was intriguing thanks to its unique ability
2:38
to split an encoding project into pieces
2:42
process them across many cores and then stitch them back together at the
2:46
end regardless of the codec and sorenson
2:50
due to its excellent handling of multiple concurrent projects also a time
2:54
saver if you have many processing cores and its ability to utilize all cores for
2:59
a single project with supported codecs
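[Editor's sketch] The split, process, and stitch idea described above can be illustrated with a few lines of Python. This is only a toy model under stated assumptions: `encode_segment` is a hypothetical stand-in for a real encoder, the frame labels are made up, and nothing here reflects how Telestream Episode or Sorenson Squeeze are actually implemented.

```python
from concurrent.futures import ThreadPoolExecutor


def split_ranges(total_frames, pieces):
    """Divide [0, total_frames) into contiguous, near-equal (start, end) ranges."""
    base, extra = divmod(total_frames, pieces)
    ranges, start = [], 0
    for i in range(pieces):
        end = start + base + (1 if i < extra else 0)
        if end > start:  # skip empty ranges when pieces > total_frames
            ranges.append((start, end))
        start = end
    return ranges


def encode_segment(frame_range):
    """Hypothetical stand-in for a real encoder: labels each frame as encoded."""
    start, end = frame_range
    return [f"enc{n}" for n in range(start, end)]


def split_encode_stitch(total_frames, workers):
    """Split the job, encode the pieces concurrently, then stitch them in order."""
    ranges = split_ranges(total_frames, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # map() returns results in input order, so stitching is a concatenation.
        segments = list(pool.map(encode_segment, ranges))
    return [frame for segment in segments for frame in segment]
```

Threads are used here only to keep the sketch short. A real encoder would more likely run each piece in its own process so that every CPU core does work at the same time.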
3:03
so episode is a great concept but we abandoned it quickly due to stability
3:08
issues sorenson on the other hand impressed the snot out of us the
3:12
software worked their support staff was professional and even as a trial
3:16
customer we were escalated to engineering whenever we encountered more
3:19
complex issues outstanding so next we
3:22
tested a variety of different output formats and found that thanks to
3:26
optimizations within premiere pro our projects could be exported very quickly
3:31
by our editing workstations in dnxhd to
3:35
our server where sorenson would utilize all CPU cores to output an h.264 master
3:40
copy that was suitable for upload to youtube and other video sharing sites in
3:44
a fraction of the time that adobe media encoder could do it and all of this
3:49
while leaving the video editors computers free to work on other things
3:53
instead of just sitting there barely usable while they encoded video so
3:58
mission accomplished then right well
4:01
you know how the rabbit hole is the discoveries we made about how
4:06
dramatically a program's optimizations around a given codec could affect
4:10
performance raised more questions than they answered
4:13
and while premiere pro's claim to fame is that
4:16
unlike competitors like avid and final cut it allows any video file you want to
4:22
simply be plunked onto the timeline and edited in real time it made us consider
4:27
the way that 4k footage off our panasonic gh4 camera just
4:32
seemed to chug as you scrub through it on the timeline even on six CPU cores
4:38
and a 10 gigabit network connection maybe there's some merit then to going
4:43
back to the old way so we devised a
4:46
workflow that would utilize our copious amounts of CPU horsepower to transcode
4:52
footage from whatever format our various
4:55
cameras captured in natively to an intermediary or mezzanine codec that was
5:01
compatible with all the programs in our workflow so for a number of reasons
5:05
avid's dnxhd was chosen and would you
5:08
look at that comparing pre-fetch latencies with
5:13
native gh4 footage the delay when moving the playhead in premiere was reduced by
5:18
nearly 25 times at 4k depending which program
5:23
exactly was used for the transcode so it was at that point that the goal
5:28
actually changed obviously we could just have the
5:32
individual video editors convert all the footage off the cameras to our mezzanine
5:37
codec when they're working but then we'd be right back where we damn well left
5:41
off with highly skilled video editors staring at their barely functional
5:44
computers waiting for a big queue of videos to transcode so no
5:49
we needed a way to avoid that by using
5:52
our overpowered hardware and the answer of course is to do the transcode at the
5:57
time of ingest or when the footage is initially removed from the camera
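[Editor's sketch] Transcoding at ingest usually comes down to a watch folder: something scans the dump location and queues every clip that has no mezzanine copy yet. The Python below is a minimal illustration of that scan only. The folder layout, the extension list, and the `.mov` output suffix are all assumptions for the example, and the actual transcode step is left out.

```python
from pathlib import Path

# Assumed camera file types; adjust to whatever the cameras actually produce.
VIDEO_EXTENSIONS = {".mp4", ".mov", ".mts"}


def pending_transcodes(watch_dir, output_dir, out_suffix=".mov"):
    """Return (source, target) pairs for clips that have no transcoded copy yet."""
    watch_dir, output_dir = Path(watch_dir), Path(output_dir)
    jobs = []
    for source in sorted(watch_dir.iterdir()):
        # Ignore subfolders and anything that is not a recognised video file.
        if not source.is_file() or source.suffix.lower() not in VIDEO_EXTENSIONS:
            continue
        target = output_dir / (source.stem + out_suffix)
        if not target.exists():
            jobs.append((source, target))
    return jobs
```

Calling `pending_transcodes` on a timer and handing each pair to an encoder would give a crude version of the behaviour described; a production tool would also need to wait until a file has finished copying before picking it up.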
6:02
and here's some bad and some good news while squeeze desktop sorenson's low end
6:08
offering can perform a task like this across many
6:11
CPU cores because we dump so many video
6:15
clips off our sd cards at a time it just
6:18
wasn't stable enough with our workload so we turned to their server offering
6:24
which operated much more smoothly to automatically monitor our video file
6:28
dumping folders and transcode everything we dropped in them so the benchmark was
6:33
a folder of 41 video files totaling 16.7
6:37
gigs and by prioritizing multiple tasks
6:40
this could be processed in about 14 minutes a small price to pay even on a
6:45
video that needed to be edited immediately for the improved timeline
6:49
performance but unfortunately time wasn't the only price the server version
6:56
requires a Windows server operating system to run on top of and costs $5,000
7:02
plus yearly maintenance fees and furthermore despite the assurance we
7:07
received from sorenson's engineers that there shouldn't be any gamma or color
7:11
shifts using quicktime as a wrapper between squeeze's dnx
7:15
export and premiere's import it was there
7:19
and very difficult to compensate for so
7:22
it was back to the drawing board somewhat which led us to a conversation
7:27
with blackmagic design where they said that cineform could also be a great
7:32
mezzanine codec an option that had been dismissed early on due to its limited
7:37
compatibility with most software including sorenson squeeze although they
7:41
had said they could add compatibility with the next yearly release
7:46
so could we quickly transcode our footage to cineform it turns out that
7:51
yes even with only 30 percent CPU utilization
7:55
effectively 10 and a half of our 36 cores adobe media encoder yes back to
8:00
that again managed to kick sorenson's ass converting to cineform versus
8:05
sorenson converting to dnxhd and all of
8:08
this without a significant loss in
8:11
quality regardless of whether we're working with native 4k footage for
8:16
better green screen and punch-in performance or settling for upsampling
8:20
1080p footage for our finished project by the way please see this video for
8:25
more details about the benefits and the drawbacks of 4k
8:29
so that's all fine and good Linus but does cineform deliver the answer again
8:35
yes while file sizes are significantly
8:39
larger especially at 4k than even the
8:42
source files timeline performance is
8:45
better than even dnxhd thanks to an extraordinarily poorly documented
8:51
feature of cineform it's GPU accelerated
8:55
so even though dnxhd also performs like
8:59
a champ it can eat 50 to 60 percent
9:02
of a 12 core xeon while scrubbing through footage while cineform is using
9:07
the fancy titan x graphics cards that NVIDIA sent us for our workstations to
9:12
keep CPU usage much lower
9:15
so then here is the process that we finally settled on we're using adobe
9:21
prelude 2015 to ingest our footage
9:24
automatically dumping the raw files off of the camera to a local storage array
9:29
on the machine in case of an emergency and then queuing up transcode jobs for
9:34
each of those clips in media encoder 2015 to send to our network share we
9:40
then use media encoder 2014 which is
9:43
included with your creative cloud license by the way to monitor the watch
9:47
folders that we export our finished jobs into and turn those into h.264 files
9:54
ready for publishing on websites like youtube vessel youku bilibili and
9:59
facebook and while hitting both instances of media
10:03
encoder we've seen CPU usage as high as
10:06
90 percent but that doesn't mean that
10:10
you need a multi-thousand dollar network
10:13
render machine to utilize this workflow
10:16
all we've demonstrated here is that it's scalable to that kind of hardware for a
10:22
small team you could easily take advantage of this on a smaller scale
10:27
with a low power networked machine if you just wanted to improve your timeline
10:31
performance and not sit around waiting for exports
10:34
on your main station while something else works on that in the background
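[Editor's sketch] For anyone sizing a smaller machine, the benchmark figures quoted earlier (41 files, 16.7 gigs, about 14 minutes, and roughly 30 percent utilization of 36 cores) can be turned into rough rates. The helpers below are plain arithmetic; treating "gigs" as decimal gigabytes is an assumption, and the quoted percentages are approximate, so the results are ballpark numbers only.

```python
def throughput_mb_per_s(total_gb, minutes):
    """Average throughput in MB/s, assuming 1 GB = 1000 MB."""
    return total_gb * 1000 / (minutes * 60)


def clips_per_minute(clip_count, minutes):
    """Average number of clips finished per minute."""
    return clip_count / minutes


def core_equivalents(utilization, cores):
    """Number of fully busy cores that a given overall utilization corresponds to."""
    return utilization * cores


if __name__ == "__main__":
    print(f"{throughput_mb_per_s(16.7, 14):.1f} MB/s of source footage")
    print(f"{clips_per_minute(41, 14):.1f} clips per minute")
    print(f"{core_equivalents(0.30, 36):.1f} cores busy at 30 percent of 36")
```

With the quoted figures this works out to a little under 20 MB/s of source footage and about three clips per minute.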
10:39
speaking of things that run in the background tunnelbear is an easy to use
10:44
privacy app for mobile desktop and
10:47
browser so they got support for iOS Android mac pc and chrome it allows you
10:51
to tunnel to 14 different countries
10:54
allowing you to browse the internet as if you're in that country it works for
11:00
accessing things like geo-blocked websites
11:03
the apps are super easy to use so you just like pick your country and turn
11:07
tunnelbear on and your internet connection gets fully encrypted and you
11:11
don't have to be technical to use or install tunnelbear and if you get stuck
11:15
you can contact their friendly support bears that are standing by 24 hours a
11:21
day they've got a plain english privacy policy and they've got 5 million users
11:26
that trust them already so you can try it out for free tunnelbear actually
11:31
gives you 500 megs of data for free every month and an extra gig if you
11:35
tweet at them but if you need more prices for unlimited plans start at $6.99
11:39
a month so head over to tunnelbear.com/LTT linked in the video description to
11:44
try it out today thanks for watching guys if this video sucked
11:48
come on this was a lot of work
11:52
but if it was awesome please hit that like button
11:57
get subscribed or even consider supporting us directly by using our
12:00
affiliate code to shop at amazon buying a cool t-shirt like this one or with a
12:04
direct monthly contribution through our community forum which you should definitely join links up there and now
12:09
that you're done doing all that stuff you're probably wondering what to watch next so click that little button up in
12:14
the top right to check out luke's video where he goes through the ins and outs
12:19
of password protection that is
12:22
protecting your passwords making it so other people don't have them
12:27
see you next time