WEBVTT

00:00:00.060 --> 00:00:06.359
this has happened more often than I care to think about and this time of year is

00:00:03.959 --> 00:00:11.760
particularly bad I mean obviously we want to cover all the new phones and

00:00:08.820 --> 00:00:16.139
CPUs and GPUs and I mean we've got this incredible video coming where we bought

00:00:13.440 --> 00:00:20.220
20 used mining GPUs so we could find out once and for all if they're safe to buy

00:00:17.820 --> 00:00:23.340
but we've also got a corporate mandate to maintain healthy work-life balance

00:00:21.840 --> 00:00:29.039
for our team so who's going to actually do all of this testing meet the subject

00:00:26.279 --> 00:00:33.600
of today's video markbench or well rather workbench Mark here is an

00:00:31.980 --> 00:00:39.360
automated benchmarking tool that our Labs team has been cooking up for the last six months it's still early days

00:00:36.780 --> 00:00:44.160
but even now Mark is able to improve our test efficiency by some percent and in

00:00:42.180 --> 00:00:47.579
time we intend to make it freely available for personal use to our

00:00:45.780 --> 00:00:51.360
community so let's take a deeper dive together and maybe you guys can tell us

00:00:49.020 --> 00:00:55.079
what you think is good and also give us your feedback about what you'd like to

00:00:53.160 --> 00:01:00.180
see us work on as we continue Mark's development and continue to tell you

00:00:57.180 --> 00:01:02.100
about our sponsor SimpleMDM they provide

00:01:00.180 --> 00:01:05.760
ridiculously simple Apple device management for IT enrolling your

00:01:04.199 --> 00:01:09.720
company's Apple devices and keeping them up to date doesn't have to be

00:01:07.619 --> 00:01:15.180
frustrating try it for free for 30 days on unlimited devices at simplemdm.com

00:01:12.439 --> 00:01:19.500
Linus let's begin with some napkin math to explain why we decided that it was

00:01:17.460 --> 00:01:23.640
finally time to bite the bullet and build markbench we'll use that upcoming

00:01:21.360 --> 00:01:28.020
mining GPU video as our example each of those cards will be subjected to 12

00:01:25.680 --> 00:01:32.159
different benchmarks to ensure that it is free of strange performance anomalies

00:01:29.640 --> 00:01:36.479
let's say optimistically that the 12 benchmarks take three minutes each

00:01:33.840 --> 00:01:40.320
that's 36 minutes factor in that each test runs five times now we're up to

00:01:38.280 --> 00:01:43.439
three hours plus half an hour of thermal stress testing that lands you at about

00:01:41.759 --> 00:01:49.259
three and a half hours per card multiplied by 24 cards 20 from eBay and

00:01:47.159 --> 00:01:54.360
four lightly used control cards and that is 84 hours of testing and that doesn't

00:01:52.380 --> 00:01:59.579
even account for reinstalling drivers swapping cards or taking bio breaks so

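NOTE
Editor's sketch, not part of the video: the napkin math above can be checked in a few lines.
Every number is quoted in the transcript; the function and parameter names are made up for illustration.
```python
def total_test_hours(benchmarks=12, minutes_per_run=3, runs=5, stress_minutes=30, cards=24):
    """Return total testing time in hours for a batch of cards."""
    # Benchmark time per card, plus the half hour of thermal stress testing.
    per_card_minutes = benchmarks * minutes_per_run * runs + stress_minutes
    return per_card_minutes * cards / 60
print(total_test_hours())
```
Passing cards=1 gives the three and a half hours per card quoted above.
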
00:01:57.360 --> 00:02:03.960
it's pretty clear that even if we could just pound back Red Bulls and power

00:02:01.740 --> 00:02:08.520
through it that is not the kind of thing that we'd want to do regularly and do it

00:02:06.540 --> 00:02:14.459
regularly is basically in our job description oh yeah not like that I mean

00:02:12.180 --> 00:02:17.700
to say that it is techtober and our corporate overlords seem to get their

00:02:16.260 --> 00:02:21.599
jollies out of scheduling product releases to cause as much inconvenience

00:02:19.739 --> 00:02:25.980
as possible I'm pretty sure the target is actually each other but retailers

00:02:23.700 --> 00:02:29.280
media and consumers definitely end up getting caught in the crossfire to

00:02:27.300 --> 00:02:33.660
varying degrees so if we want to keep up the answer is automation and while

00:02:32.220 --> 00:02:38.459
Mark doesn't look like much at the moment everything has to start somewhere

00:02:35.400 --> 00:02:40.319
so for now markbench is a Golang GUI

00:02:38.459 --> 00:02:44.160
with a Python framework that collects all the sensor and frame data that is

00:02:42.420 --> 00:02:49.080
output from our system during each test he collects this data using PresentMon

00:02:46.440 --> 00:02:52.980
and Libre Hardware Monitor PresentMon is a tool for collecting frame data and is

00:02:51.180 --> 00:02:57.480
actually the basis of NVIDIA's FrameView software and as for Libre Hardware

00:02:54.840 --> 00:03:01.140
Monitor it's an open source fork of Open Hardware Monitor which gives us access

00:02:59.040 --> 00:03:06.900
to all of the sensors in our system you know fan RPMs CPU GPU temperature power

00:03:04.680 --> 00:03:11.099
consumption stuff like that after a test is finished our Python framework outputs

00:03:09.239 --> 00:03:15.900
the data in the form of CSVs then converts those into protobufs a smaller

00:03:13.800 --> 00:03:20.099
binary format the data gets uploaded to a local ingest server before being sent

00:03:17.760 --> 00:03:24.360
to our cloud hosted Postgres database in layman's terms the Labs team builds

00:03:22.080 --> 00:03:29.760
what's called a harness for every game we want to test then using scripts Mark

00:03:27.900 --> 00:03:34.620
adjusts the settings of the game launches the game loads up a benchmark

00:03:32.280 --> 00:03:39.300
and then records all of the relevant data while the benchmark is running and

00:03:36.900 --> 00:03:44.159
stores it in a database rinse and repeat until all the benchmarks are done and we

00:03:41.819 --> 00:03:47.700
can swap off the card and put on the next one now obviously we could get a

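NOTE
Editor's sketch, not part of the video: one way to picture the harness loop described above.
The Harness class, its fields, and run_suite are hypothetical names; the real markbench API is not shown in the transcript.
```python
from dataclasses import dataclass
from typing import Callable, List
@dataclass
class Harness:
    """Hypothetical per-game harness; these field names are not markbench's."""
    game: str
    apply_settings: Callable[[], None]
    launch: Callable[[], None]
    run_benchmark: Callable[[], List[float]]  # returns frame times in ms
def run_suite(harnesses, runs, store):
    """Run each harness `runs` times, passing every result to `store`."""
    for harness in harnesses:
        for run_index in range(runs):
            harness.apply_settings()
            harness.launch()
            store(harness.game, run_index, harness.run_benchmark())
# Stand-in callables replace a real game so the sketch runs anywhere.
results = []
demo = Harness("demo_game", lambda: None, lambda: None, lambda: [16.7, 16.9, 17.1])
run_suite([demo], runs=2, store=lambda game, i, frames: results.append((game, i, frames)))
print(len(results))
```
Here store stands in for the upload step; a real pipeline would write to the ingest server instead of a list.
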
00:03:46.260 --> 00:03:52.319
similar level of automation using commercial software like 3DMark that

00:03:49.980 --> 00:03:56.819
conveniently already exists but scripting automation into real games has

00:03:54.900 --> 00:04:03.659
a few major benefits for you the consumer first up while a single bigger

00:04:00.299 --> 00:04:05.819
is better number is convenient it really

00:04:03.659 --> 00:04:11.519
doesn't tell the full story take Intel's Arc A750 for example it might actually

00:04:08.640 --> 00:04:16.799
perform well on average compared to say NVIDIA's RTX 3060 but if your main game

00:04:14.099 --> 00:04:21.419
is CS:GO you are not going to be happy with that purchase which leads perfectly

00:04:18.600 --> 00:04:25.440
to reason number two a collection of individual game benchmarks allows you to

00:04:23.580 --> 00:04:29.100
focus on what matters most to you Geekbench for example contains

00:04:27.240 --> 00:04:33.360
cryptography tests that heavily influence the final score but yet have

00:04:31.500 --> 00:04:36.960
very little bearing on how most people will actually use the products being

00:04:34.919 --> 00:04:41.940
tested it's so bad that it's often dismissed as kinda Irrelevant in media

00:04:39.419 --> 00:04:46.139
circles even though it does also contain tests that are perfectly valid finally

00:04:44.100 --> 00:04:50.759
markbench is a great way to keep manufacturers honest everyone from

00:04:48.840 --> 00:04:55.800
Samsung to Volkswagen has been caught cheating on standardized synthetic tests

00:04:54.120 --> 00:05:00.120
to make their products look better than they are so by giving ourselves the

00:04:58.259 --> 00:05:04.320
option to run any number of different real games all of which will

00:05:02.580 --> 00:05:08.880
automatically be updated with new patches that would be hard to optimize

00:05:06.300 --> 00:05:13.259
for we are making it extremely impractical to try to game the system

00:05:10.800 --> 00:05:16.680
and artificially Elevate test scores I mean unless they just want to optimize

00:05:15.060 --> 00:05:20.400
their product for real games in which case well that's not really cheating and

00:05:18.600 --> 00:05:24.360
we all win then right once we've got our juicy data we use Grafana to transform

00:05:22.620 --> 00:05:29.220
it into nice pretty graphs for your viewing pleasure or well at least that's

00:05:27.000 --> 00:05:33.539
the plan we still have a lot of work to do as some of you have helpfully pointed

00:05:31.139 --> 00:05:38.280
out on automating our data visualization because depending on what we're trying

00:05:35.759 --> 00:05:42.840
to convey it can be really challenging to quickly and effectively present this

00:05:40.979 --> 00:05:46.560
much data it's still a Big Time Saver already though let's compare it to our

00:05:44.520 --> 00:05:50.699
current process first we choose from our suite of benchmarks

00:05:48.000 --> 00:05:54.720
like say these ones which is going to come down to what we're trying to learn

00:05:52.620 --> 00:05:59.160
about the product does it perform well in lighter titles what about the latest

00:05:56.460 --> 00:06:02.580
AAA games what about older DirectX 9 games that sort of thing then we get

00:06:01.199 --> 00:06:08.280
everything installed and patched and adjust the in-game settings to our liking oh and don't forget to reboot the

00:06:06.479 --> 00:06:11.880
game if you happen to adjust that setting then we fire up FrameView set

00:06:10.440 --> 00:06:15.539
it for the length of the benchmark go into the game run the benchmark wait for

00:06:13.800 --> 00:06:20.699
the game to load then press the record button right at the exact right time as

00:06:17.699 --> 00:06:23.940
we load in and then we play the waiting

00:06:20.699 --> 00:06:26.699
game it's a very manual and tedious

00:06:23.940 --> 00:06:30.900
process that requires just enough of our attention that it's pretty hard to get

00:06:28.620 --> 00:06:35.400
any other real work done at the same time but because markbench has all of

00:06:33.600 --> 00:06:39.660
that built in once it's up and running you're free to do whatever you want

00:06:37.139 --> 00:06:45.300
until each card has completed the entire test Suite want to test 20 games easy

00:06:42.539 --> 00:06:49.020
want to repeat every test five times so that we can throw out the early cold

00:06:46.860 --> 00:06:53.039
runs and then average the last three results no problem

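NOTE
Editor's sketch, not part of the video: throwing out cold runs and averaging the last three could look like this.
The function name and the sample numbers are invented for illustration; only the five-run, keep-three scheme comes from the transcript.
```python
from statistics import mean
def warm_average(results, keep_last=3):
    """Average the last `keep_last` results, discarding earlier cold runs."""
    if len(results) < keep_last:
        raise ValueError("not enough runs to average")
    return mean(results[-keep_last:])
# Five made-up average-FPS results; the first two count as cold runs.
print(warm_average([88.0, 95.0, 100.0, 101.0, 102.0]))
```
Slicing from the end keeps the rule the same if the run count changes later.
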
00:06:50.940 --> 00:06:57.419
the other big difference maker is that markbench all but eliminates human

00:06:55.259 --> 00:07:05.280
error and trust me once you've been at it for two four eight hours it is really

00:07:02.160 --> 00:07:06.960
easy to forget a small step like opening

00:07:05.280 --> 00:07:11.880
up your background data logging software or to accidentally leave DLSS enabled or

00:07:10.319 --> 00:07:16.500
something like that and if you don't notice until you've already moved on to

00:07:13.919 --> 00:07:20.759
another card then those kinds of mistakes can cost a lot of time given

00:07:18.780 --> 00:07:25.979
that in order to do things properly you need to not only swap the cards but also

00:07:22.979 --> 00:07:27.660
remove and reinstall your GPU drivers in

00:07:25.979 --> 00:07:31.680
order to redo the run and you might think this kind of thing affects me and

00:07:29.819 --> 00:07:36.720
not you guys but here's the thing whenever we post a review We invariably

00:07:34.440 --> 00:07:42.000
see questions like why didn't you test this or how come nobody ever talks about

00:07:38.759 --> 00:07:45.120
that and we feel the same way we want to

00:07:42.000 --> 00:07:47.759
know these answers but in many cases our

00:07:45.120 --> 00:07:52.199
hands are tied companies like AMD and Intel send out review samples for their

00:07:49.560 --> 00:07:58.160
products with only seven to ten days until the Embargo lifts or four Intel

00:07:55.919 --> 00:08:02.819
that means our testing needs to be done extremely quickly so that we can analyze

00:08:00.900 --> 00:08:07.380
the data write a script film the video edit the video and finally upload and

00:08:05.160 --> 00:08:13.080
release all of that takes a lot of time that we don't really have meaning that

00:08:09.840 --> 00:08:14.639
we can narrow our scope which sucks ask

00:08:13.080 --> 00:08:19.680
our employees to give up their precious time off which really sucks or miss the

00:08:17.340 --> 00:08:22.979
Embargo which as a business where views and clicks Drive income is frankly

00:08:21.479 --> 00:08:26.160
unsustainable let's look at some numbers to demonstrate that Apple is a great

00:08:24.660 --> 00:08:30.599
example since they don't send us stuff ahead of release at all

00:08:30.599 --> 00:08:36.959
this means that unless we can get an early hookup from somewhere we can't

00:08:34.680 --> 00:08:39.659
perform any meaningful tests on their products if we want to get our videos

00:08:38.039 --> 00:08:44.700
out in a timely manner and to give you some idea why that is so important look

00:08:42.360 --> 00:08:48.959
at this we rushed out a video on the M1 MacBook Pro on our ShortCircuit sister

00:08:46.620 --> 00:08:52.980
channel right near the release day it was super shallow because we had no time

00:08:50.820 --> 00:08:56.580
to prepare anything but it got triple the usual views that we see on that

00:08:54.600 --> 00:09:01.380
channel then when we covered that same product on our main Channel this one a

00:08:59.160 --> 00:09:05.580
few weeks later in way more depth we ended up with

00:09:02.700 --> 00:09:09.600
whoops below average viewership for our trouble and again this isn't just our

00:09:07.740 --> 00:09:14.160
problem it's a problem for consumers because favoring friendly media is one

00:09:12.420 --> 00:09:18.060
of the best ways for companies to control the narrative around their

00:09:15.540 --> 00:09:23.040
products that initial boost by being one of the first to cover a new device often

00:09:20.519 --> 00:09:27.180
creates a positive feedback loop that continues to drive increased viewership

00:09:24.779 --> 00:09:32.880
over the entire sales cycle so if you do a search for say M1 MacBook Pro review

00:09:29.940 --> 00:09:36.899
you are much more likely to end up with an Apple approved media outlet and the

00:09:34.980 --> 00:09:40.320
most Insidious part of this is that the companies that play this game well are

00:09:39.000 --> 00:09:45.120
smart enough to keep the Rules of Engagement so vague and nebulous that

00:09:43.019 --> 00:09:49.740
they create this environment where every media outlet even ones they've never

00:09:46.980 --> 00:09:54.540
spoken to will carefully control their criticism to avoid stepping over some

00:09:51.779 --> 00:09:58.500
invisible line and this kind of horse is why we push back so hard when NVIDIA

00:09:56.760 --> 00:10:02.880
threatened to stop sending pre-launch GPUs to Tim and Steve from Hardware

00:10:00.120 --> 00:10:07.620
Unboxed of course as NVIDIA pointed out they are well within their rights to

00:10:05.040 --> 00:10:11.399
send GPUs or not send GPUs to whomever they please and besides they're more

00:10:09.720 --> 00:10:16.019
than welcome to cover their GPUs later except that for the reasons I just

00:10:13.260 --> 00:10:20.459
outlined this was a clear attempt to suppress Hardware Unboxed's influence and

00:10:18.600 --> 00:10:24.779
their growth by killing their launch day viewership to NVIDIA's credit unlike

00:10:22.740 --> 00:10:29.100
Apple they actually cared about the outrage from The Gaming Community who to

00:10:27.060 --> 00:10:33.000
their credit recognized this for what it was and to my knowledge Hardware Unboxed

00:10:31.260 --> 00:10:37.800
is reinstated in the reviewer program but there are many other companies who

00:10:35.399 --> 00:10:42.120
like Apple maintain much more strict control over who is allowed to review

00:10:40.440 --> 00:10:46.440
their products which is why it's time to break that cycle and markbench is the

00:10:44.519 --> 00:10:50.399
key by automating this testing we're going to be able to piss off whoever we

00:10:47.940 --> 00:10:55.079
want and still deliver near launch day data to our viewers and over time we

00:10:52.260 --> 00:10:59.399
plan to publish not only videos but also written articles which you can expect to

00:10:56.880 --> 00:11:03.839
find on the Labs website along with the mother of all testing databases

00:11:00.959 --> 00:11:07.140
obviously none of that is ready but in the meantime we're going to have much

00:11:05.579 --> 00:11:11.279
more in-depth testing in our regular videos and we're hoping to publish some

00:11:09.360 --> 00:11:15.000
extra data or content on our forums or on Floatplane.com we're not 100% sure

00:11:13.440 --> 00:11:18.600
what this is going to look like yet but you should sign up for both our forum is

00:11:16.860 --> 00:11:21.839
free and the link below will be a thread where you can submit your suggestions

00:11:19.980 --> 00:11:25.500
for markbench features and as for Floatplane it's got great extras and

00:11:23.339 --> 00:11:28.680
exclusives right now like Dennis's epic martial arts training sessions leading

00:11:27.360 --> 00:11:34.019
up to our fight the only thing I need to do now is uh use it to get all that

00:11:31.980 --> 00:11:38.220
testing finished and compiled for the 20 mining GPU video and oh I guess I also

00:11:36.360 --> 00:11:42.779
need to find a way to segue to our sponsor Squarespace if you're building

00:11:40.200 --> 00:11:46.440
your brand online in 2022 you should absolutely have a website and if you

00:11:45.000 --> 00:11:50.760
need a tool to help build that brand look no further than Squarespace

00:11:48.720 --> 00:11:54.600
Squarespace is the all-in-one platform to help expand your brand online make a

00:11:53.040 --> 00:11:58.260
beautiful website engage with your audience and sell anything and

00:11:56.100 --> 00:12:02.160
everything from products to content we love Squarespace so much we use it here

00:11:59.940 --> 00:12:05.760
at LMG its custom templates make it easy to stand out with a beautiful

00:12:03.779 --> 00:12:09.899
website that fits your needs you can maximize your visibility thanks to a

00:12:07.500 --> 00:12:13.500
suite of integrated SEO features and their analytic insights help you

00:12:11.579 --> 00:12:17.160
optimize for performance so you can see what's going well and what needs a

00:12:15.480 --> 00:12:22.320
little work so get started today and head to squarespace.com forward slash

00:12:19.200 --> 00:12:23.700
LTT to get 10% off your first purchase if

00:12:22.320 --> 00:12:27.240
you guys enjoyed this video why don't you check out our Labs video about our

00:12:25.500 --> 00:12:31.320
headphone testing device that believe it or not we are still waiting to get

00:12:29.220 --> 00:12:36.260
delivered that was a rental unit we paid for it months ago

00:12:33.839 --> 00:12:36.260
cool
