1
00:00:00,060 --> 00:00:06,359
this has happened more often than I care to think about and this time of year is

2
00:00:03,959 --> 00:00:11,760
particularly bad I mean obviously we want to cover all the new phones and

3
00:00:08,820 --> 00:00:16,139
CPUs and gpus and I mean we've got this incredible video coming where we bought

4
00:00:13,440 --> 00:00:20,220
20 used mining gpus so we could find out once and for all if they're safe to buy

5
00:00:17,820 --> 00:00:23,340
but we've also got a corporate mandate to maintain healthy work-life balance

6
00:00:21,840 --> 00:00:29,039
for our team so who's going to actually do all of this testing meet the subject

7
00:00:26,279 --> 00:00:33,600
of today's video markbench or well rather workbench Mark here is an

8
00:00:31,980 --> 00:00:39,360
automated benchmarking tool that our Labs team has been cooking up for the last six months it's still early days

9
00:00:36,780 --> 00:00:44,160
but even now Mark is able to improve our test efficiency by some percent and in

10
00:00:42,180 --> 00:00:47,579
time we intend to make it freely available for personal use to our

11
00:00:45,780 --> 00:00:51,360
community so let's take a deeper dive together and maybe you guys can tell us

12
00:00:49,020 --> 00:00:55,079
what you think is good and also give us your feedback about what you'd like to

13
00:00:53,160 --> 00:01:00,180
see us work on as we continue Mark's development and continue to tell you

14
00:00:57,180 --> 00:01:02,100
about our sponsor simplemdm they provide

15
00:01:00,180 --> 00:01:05,760
ridiculously simple Apple device management for IT enrolling your

16
00:01:04,199 --> 00:01:09,720
company's Apple devices and keeping them up to date doesn't have to be

17
00:01:07,619 --> 00:01:15,180
frustrating try it for free for 30 days on unlimited devices at simplemdm.com

18
00:01:12,439 --> 00:01:19,500
Linus let's begin with some napkin math to explain why we decided that it was

19
00:01:17,460 --> 00:01:23,640
finally time to bite the bullet and build markbench we'll use that upcoming

20
00:01:21,360 --> 00:01:28,020
mining GPU video as our example each of those cards will be subjected to 12

21
00:01:25,680 --> 00:01:32,159
different benchmarks to ensure that it is free of strange performance anomalies

22
00:01:29,640 --> 00:01:36,479
let's say optimistically that the 12 benchmarks take three minutes each

23
00:01:33,840 --> 00:01:40,320
that's 36 minutes factor in that each test runs five times now we're up to

24
00:01:38,280 --> 00:01:43,439
three hours plus half an hour of thermal stress testing that lands you at about

25
00:01:41,759 --> 00:01:49,259
three and a half hours per card multiplied by 24 cards 20 from eBay and

26
00:01:47,159 --> 00:01:54,360
four lightly used control cards and that is 84 hours of testing and that doesn't

27
00:01:52,380 --> 00:01:59,579
even account for reinstalling drivers swapping cards or taking bio breaks so

28
00:01:57,360 --> 00:02:03,960
it's pretty clear that even if we could just pound back Red Bulls and power

29
00:02:01,740 --> 00:02:08,520
through it that is not the kind of thing that we'd want to do regularly and do it

30
00:02:06,540 --> 00:02:14,459
regularly is basically in our job description oh yeah not like that I mean

31
00:02:12,180 --> 00:02:17,700
to say that it is techtober and our corporate overlords seem to get their

32
00:02:16,260 --> 00:02:21,599
jollies out of scheduling product releases to cause us as much inconvenience

33
00:02:19,739 --> 00:02:25,980
as possible I'm pretty sure the target is actually each other but retailers

34
00:02:23,700 --> 00:02:29,280
media and consumers definitely end up getting caught in the crossfire to

35
00:02:27,300 --> 00:02:33,660
varying degrees so if we want to keep up the answer is automation and while

36
00:02:32,220 --> 00:02:38,459
Mark doesn't look like much at the moment everything has to start somewhere

37
00:02:35,400 --> 00:02:40,319
so for now markbench is a Golang GUI

38
00:02:38,459 --> 00:02:44,160
with a python framework that collects all the sensor and Frame data that is

39
00:02:42,420 --> 00:02:49,080
output from our system during each test he collects this data using PresentMon

40
00:02:46,440 --> 00:02:52,980
and Libre Hardware Monitor PresentMon is a tool for collecting frame data and is

41
00:02:51,180 --> 00:02:57,480
actually the basis of NVIDIA's FrameView software and as for Libre Hardware

42
00:02:54,840 --> 00:03:01,140
Monitor it's an open source fork of Open Hardware Monitor which gives us access

43
00:02:59,040 --> 00:03:06,900
to all of the sensors in our system you know fan RPMs CPU GPU temperature power

44
00:03:04,680 --> 00:03:11,099
consumption stuff like that after a test is finished our python framework outputs

45
00:03:09,239 --> 00:03:15,900
the data in the form of csvs then converts those into protobufs a smaller

46
00:03:13,800 --> 00:03:20,099
binary format the data gets uploaded to a local ingest server before being sent

47
00:03:17,760 --> 00:03:24,360
to our Cloud hosted postgres database in layman's terms the Labs team builds

48
00:03:22,080 --> 00:03:29,760
what's called a harness for every game we want to test then using scripts Mark

49
00:03:27,900 --> 00:03:34,620
adjusts the settings of the game launches the game loads up a benchmark

50
00:03:32,280 --> 00:03:39,300
and then records all of the relevant data while The Benchmark is running and

51
00:03:36,900 --> 00:03:44,159
stores it in a database rinse and repeat until all the benchmarks are done and we

52
00:03:41,819 --> 00:03:47,700
can swap off the card and put on the next one now obviously we could get a

53
00:03:46,260 --> 00:03:52,319
similar level of automation using commercial software like 3DMark that

54
00:03:49,980 --> 00:03:56,819
conveniently already exists but scripting automation into real games has

55
00:03:54,900 --> 00:04:03,659
a few major benefits for you the consumer first up while a single bigger

56
00:04:00,299 --> 00:04:05,819
is better number is convenient it really

57
00:04:03,659 --> 00:04:11,519
doesn't tell the full story take Intel's Arc a750 for example it might actually

58
00:04:08,640 --> 00:04:16,799
perform well on average compared to say NVIDIA's RTX 3060 but if your main game

59
00:04:14,099 --> 00:04:21,419
is CS:GO you are not going to be happy with that purchase which leads perfectly

60
00:04:18,600 --> 00:04:25,440
to reason number two a collection of individual game benchmarks allows you to

61
00:04:23,580 --> 00:04:29,100
focus on what matters most to you geekbench for example contains

62
00:04:27,240 --> 00:04:33,360
cryptography tests that heavily influence the final score but yet have

63
00:04:31,500 --> 00:04:36,960
very little bearing on how most people will actually use the products being

64
00:04:34,919 --> 00:04:41,940
tested it's so bad that it's often dismissed as kinda Irrelevant in media

65
00:04:39,419 --> 00:04:46,139
circles even though it does also contain tests that are perfectly valid finally

66
00:04:44,100 --> 00:04:50,759
markbench is a great way to keep manufacturers honest everyone from

67
00:04:48,840 --> 00:04:55,800
Samsung to Volkswagen has been caught cheating on standardized synthetic tests

68
00:04:54,120 --> 00:05:00,120
to make their products look better than they are so by giving ourselves the

69
00:04:58,259 --> 00:05:04,320
option to run any number of different real games all of which will

70
00:05:02,580 --> 00:05:08,880
automatically be updated with new patches that would be hard to optimize

71
00:05:06,300 --> 00:05:13,259
for we are making it extremely impractical to try to game the system

72
00:05:10,800 --> 00:05:16,680
and artificially Elevate test scores I mean unless they just want to optimize

73
00:05:15,060 --> 00:05:20,400
their product for real games in which case well that's not really cheating and

74
00:05:18,600 --> 00:05:24,360
we all win then right once we've got our juicy data we use Grafana to transform

75
00:05:22,620 --> 00:05:29,220
it into nice pretty graphs for your viewing pleasure or well at least that's

76
00:05:27,000 --> 00:05:33,539
the plan we still have a lot of work to do as some of you have helpfully pointed

77
00:05:31,139 --> 00:05:38,280
out on automating our data visualization because depending on what we're trying

78
00:05:35,759 --> 00:05:42,840
to convey it can be really challenging to quickly and effectively present this

79
00:05:40,979 --> 00:05:46,560
much data it's still a Big Time Saver already though let's compare it to our

80
00:05:44,520 --> 00:05:50,699
current process first we choose from our suite of benchmarks

81
00:05:48,000 --> 00:05:54,720
like say these ones which is going to come down to what we're trying to learn

82
00:05:52,620 --> 00:05:59,160
about the product does it perform well in lighter titles what about the latest

83
00:05:56,460 --> 00:06:02,580
AAA games what about older DirectX 9 games that sort of thing then we get

84
00:06:01,199 --> 00:06:08,280
everything installed and patched and adjust the in-game settings to our liking oh and don't forget to reboot the

85
00:06:06,479 --> 00:06:11,880
game if you happen to adjust that setting then we fire up FrameView set

86
00:06:10,440 --> 00:06:15,539
it for the length of the Benchmark go into the game Run The Benchmark wait for

87
00:06:13,800 --> 00:06:20,699
the game to load then press the record button right at the exact right time as

88
00:06:17,699 --> 00:06:23,940
we load in and then we play the waiting

89
00:06:20,699 --> 00:06:26,699
game it's a very manual and tedious

90
00:06:23,940 --> 00:06:30,900
process that requires just enough of our attention that it's pretty hard to get

91
00:06:28,620 --> 00:06:35,400
any other real work done at the same time but because markbench has all of

92
00:06:33,600 --> 00:06:39,660
that built in once it's up and running you're free to do whatever you want

93
00:06:37,139 --> 00:06:45,300
until each card has completed the entire test Suite want to test 20 games easy

94
00:06:42,539 --> 00:06:49,020
want to repeat every test five times so that we can throw out the early cold

95
00:06:46,860 --> 00:06:53,039
runs and then average the last three results no problem
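
[Editor's sketch: the "throw out the early cold runs, then average the last three" step described above could look roughly like this in Python. The function name, the `keep_last` parameter, and the sample FPS values are illustrative assumptions, not markbench's actual code.]

```python
from statistics import mean


def steady_state_average(runs, keep_last=3):
    """Average the final `keep_last` results, discarding earlier cold runs.

    `runs` is a list of per-run results (for example average FPS) in the
    order the runs were executed. Hypothetical helper, not part of markbench.
    """
    if len(runs) < keep_last:
        raise ValueError("need at least %d runs, got %d" % (keep_last, len(runs)))
    return mean(runs[-keep_last:])


# Five runs of one benchmark; the first two are treated as cold runs.
fps_runs = [92.0, 97.0, 100.0, 101.0, 102.0]
print(steady_state_average(fps_runs))
```

[Keeping the discard count as a parameter is only a convenience here, so the same helper would cover suites that repeat a test more or fewer than five times.]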

96
00:06:50,940 --> 00:06:57,419
the other big difference maker is that markbench all but eliminates human

97
00:06:55,259 --> 00:07:05,280
error and trust me once you've been at it for two four eight hours it is really

98
00:07:02,160 --> 00:07:06,960
easy to forget a small step like opening

99
00:07:05,280 --> 00:07:11,880
up your background data logging software or to accidentally leave DLSS enabled or

100
00:07:10,319 --> 00:07:16,500
something like that and if you don't notice until you've already moved on to

101
00:07:13,919 --> 00:07:20,759
another card then those kinds of mistakes can cost a lot of time given

102
00:07:18,780 --> 00:07:25,979
that in order to do things properly you need to not only swap the cards but also

103
00:07:22,979 --> 00:07:27,660
remove and reinstall your GPU drivers in

104
00:07:25,979 --> 00:07:31,680
order to redo the run and you might think this kind of thing affects me and

105
00:07:29,819 --> 00:07:36,720
not you guys but here's the thing whenever we post a review We invariably

106
00:07:34,440 --> 00:07:42,000
see questions like why didn't you test this or how come nobody ever talks about

107
00:07:38,759 --> 00:07:45,120
that and we feel the same way we want to

108
00:07:42,000 --> 00:07:47,759
know these answers but in many cases our

109
00:07:45,120 --> 00:07:52,199
hands are tied companies like AMD and Intel send out review samples for their

110
00:07:49,560 --> 00:07:58,160
products with only seven to ten days until the embargo lifts or for Intel

111
00:07:55,919 --> 00:08:02,819
that means our testing needs to be done extremely quickly so that we can analyze

112
00:08:00,900 --> 00:08:07,380
the data write a script film the video edit the video and finally upload and

113
00:08:05,160 --> 00:08:13,080
release all of that takes a lot of time that we don't really have meaning that

114
00:08:09,840 --> 00:08:14,639
we can narrow our scope which sucks ask

115
00:08:13,080 --> 00:08:19,680
our employees to give up their precious time off which really sucks or miss the

116
00:08:17,340 --> 00:08:22,979
Embargo which as a business where views and clicks Drive income is frankly

117
00:08:21,479 --> 00:08:26,160
unsustainable let's look at some numbers to demonstrate that Apple is a great

118
00:08:24,660 --> 00:08:30,599
example since they don't send us stuff ahead of release at all

119
00:08:30,599 --> 00:08:36,959
this means that unless we can get an early hookup from somewhere we can't

120
00:08:34,680 --> 00:08:39,659
perform any meaningful tests on their products if we want to get our videos

121
00:08:38,039 --> 00:08:44,700
out in a timely manner and to give you some idea why that is so important look

122
00:08:42,360 --> 00:08:48,959
at this we rushed out a video on the M1 MacBook Pro on our ShortCircuit sister

123
00:08:46,620 --> 00:08:52,980
channel right near the release day it was super shallow because we had no time

124
00:08:50,820 --> 00:08:56,580
to prepare anything but it got triple the usual views that we see on that

125
00:08:54,600 --> 00:09:01,380
channel then when we covered that same product on our main Channel this one a

126
00:08:59,160 --> 00:09:05,580
few weeks later in way more depth we ended up with

127
00:09:02,700 --> 00:09:09,600
whoops below average viewership for our trouble and again this isn't just our

128
00:09:07,740 --> 00:09:14,160
problem it's a problem for consumers because favoring friendly media is one

129
00:09:12,420 --> 00:09:18,060
of the best ways for companies to control the narrative around their

130
00:09:15,540 --> 00:09:23,040
products that initial boost by being one of the first to cover a new device often

131
00:09:20,519 --> 00:09:27,180
creates a positive feedback loop that continues to drive increased viewership

132
00:09:24,779 --> 00:09:32,880
over the entire sales cycle so if you do a search for say M1 MacBook Pro review

133
00:09:29,940 --> 00:09:36,899
you are much more likely to end up with an Apple-approved media outlet and the

134
00:09:34,980 --> 00:09:40,320
most Insidious part of this is that the companies that play this game well are

135
00:09:39,000 --> 00:09:45,120
smart enough to keep the Rules of Engagement so vague and nebulous that

136
00:09:43,019 --> 00:09:49,740
they create this environment where every media outlet even ones they've never

137
00:09:46,980 --> 00:09:54,540
spoken to will carefully control their criticism to avoid stepping over some

138
00:09:51,779 --> 00:09:58,500
invisible line and this kind of horse is why we push back so hard when NVIDIA

139
00:09:56,760 --> 00:10:02,880
threatened to stop sending pre-launch gpus to Tim and Steve from Hardware

140
00:10:00,120 --> 00:10:07,620
Unboxed of course as NVIDIA pointed out they are well within their rights to

141
00:10:05,040 --> 00:10:11,399
send gpus or not send gpus to whomever they please and besides they're more

142
00:10:09,720 --> 00:10:16,019
than welcome to cover their gpus later except that for the reasons I just

143
00:10:13,260 --> 00:10:20,459
outlined this was a clear attempt to suppress Hardware Unboxed's influence and

144
00:10:18,600 --> 00:10:24,779
their growth by killing their launch day viewership to NVIDIA's credit unlike

145
00:10:22,740 --> 00:10:29,100
Apple they actually cared about the outrage from The Gaming Community who to

146
00:10:27,060 --> 00:10:33,000
their credit recognized this for what it was and to my knowledge Hardware Unboxed

147
00:10:31,260 --> 00:10:37,800
is reinstated in the reviewer program but there are many other companies who

148
00:10:35,399 --> 00:10:42,120
like Apple maintain much more strict control over who is allowed to review

149
00:10:40,440 --> 00:10:46,440
their products which is why it's time to break that cycle and markbench is the

150
00:10:44,519 --> 00:10:50,399
key by automating this testing we're going to be able to piss off whoever we

151
00:10:47,940 --> 00:10:55,079
want and still deliver near launch day data to our viewers and over time we

152
00:10:52,260 --> 00:10:59,399
plan to publish not only videos but also written articles which you can expect to

153
00:10:56,880 --> 00:11:03,839
find on the Labs website along with the mother of all testing databases

154
00:11:00,959 --> 00:11:07,140
obviously none of that is ready but in the meantime we're going to have much

155
00:11:05,579 --> 00:11:11,279
more in-depth testing in our regular videos and we're hoping to publish some

156
00:11:09,360 --> 00:11:15,000
extra data or content on our forums or on Floatplane.com we're not 100% sure

157
00:11:13,440 --> 00:11:18,600
what this is going to look like yet but you should sign up for both our forum is

158
00:11:16,860 --> 00:11:21,839
free and the link below will be a thread where you can submit your suggestions

159
00:11:19,980 --> 00:11:25,500
for markbench features and as for Floatplane it's got great extras and

160
00:11:23,339 --> 00:11:28,680
exclusives right now like Dennis's epic martial arts training sessions leading

161
00:11:27,360 --> 00:11:34,019
up to our fight the only thing I need to do now is uh use it to get all that

162
00:11:31,980 --> 00:11:38,220
testing finished and compiled for the 20 mining GPU video and oh I guess I also

163
00:11:36,360 --> 00:11:42,779
need to find a way to segue to our sponsor Squarespace if you're building

164
00:11:40,200 --> 00:11:46,440
your brand online in 2022 you should absolutely have a website and if you

165
00:11:45,000 --> 00:11:50,760
need a tool to help build that brand look no further than Squarespace

166
00:11:48,720 --> 00:11:54,600
Squarespace is the all-in-one platform to help expand your brand online make a

167
00:11:53,040 --> 00:11:58,260
beautiful website engage with your audience and sell anything and

168
00:11:56,100 --> 00:12:02,160
everything from products to content we love Squarespace so much we use it here

169
00:11:59,940 --> 00:12:05,760
at LMG its custom templates make it easy to stand out with a beautiful

170
00:12:03,779 --> 00:12:09,899
website that fits your needs you can maximize your visibility thanks to a

171
00:12:07,500 --> 00:12:13,500
suite of integrated SEO features and their analytic insights help you

172
00:12:11,579 --> 00:12:17,160
optimize for performance so you can see what's going well and What needs a

173
00:12:15,480 --> 00:12:22,320
little work so get started today and head to squarespace.com forward slash

174
00:12:19,200 --> 00:12:23,700
LTT to get 10% off your first purchase if

175
00:12:22,320 --> 00:12:27,240
you guys enjoyed this video why don't you check out our Labs video about our

176
00:12:25,500 --> 00:12:31,320
headphone testing device that believe it or not we are still waiting to get

177
00:12:29,220 --> 00:12:36,260
delivered that was a rental unit we paid for it months ago

178
00:12:33,839 --> 00:12:36,260
cool
