1
00:00:00,000 --> 00:00:04,320
Why do search engines suck now? Wait, before we get ahead of ourselves,

2
00:00:04,320 --> 00:00:07,720
do search engines suck now? Are they actually getting worse,

3
00:00:07,720 --> 00:00:13,600
or have they just changed in a way that I personally hate? Just by glancing at the first page of a Google or Bing search,

4
00:00:13,600 --> 00:00:16,760
it's easy to find a long list of potential complaints.

5
00:00:16,760 --> 00:00:21,320
Paid sponsors crowd out the first few search results. There's annoying little widgets everywhere,

6
00:00:21,320 --> 00:00:25,360
and it keeps giving you barely related suggestions for what to search next,

7
00:00:25,360 --> 00:00:29,960
like a lame Choose Your Own Adventure book. But "why does this search page have so many ads?"

8
00:00:30,080 --> 00:00:33,680
isn't an interesting question. It's because money.

9
00:00:33,680 --> 00:00:38,880
The more interesting question is why it feels like it's gotten harder to find the information you want

10
00:00:38,880 --> 00:00:42,400
despite all those supposedly helpful widgets.

11
00:00:42,400 --> 00:00:47,560
The answer is also because money, but it's worth unpacking why.

12
00:00:47,560 --> 00:00:52,520
It's difficult to get a lot of rigorous data on this subject, but at least a few researchers have tried to answer

13
00:00:52,520 --> 00:00:57,120
whether or not search engines are, in fact, getting worse. One recent German study did a survey

14
00:00:57,120 --> 00:01:02,480
of over 7,000 product review searches on Bing, Google, and DuckDuckGo over the course of a year,

15
00:01:02,480 --> 00:01:05,880
and concluded that you could still find useful information,

16
00:01:05,880 --> 00:01:12,720
but it was being drowned out by a torrent of low-quality content, especially SEO spam.

17
00:01:12,720 --> 00:01:15,960
Top ranked pages typically were heavily optimized

18
00:01:15,960 --> 00:01:22,320
and littered with affiliate marketing links, and they also showed clear markers of lower content quality.

19
00:01:22,320 --> 00:01:26,520
There's not a ton of academic papers on this issue, but there's plenty of data on the web

20
00:01:26,520 --> 00:01:32,520
showing how users have changed their behavior to sidestep low-quality, highly optimized results.

21
00:01:32,520 --> 00:01:39,040
One possible indicator that search engines suck now is the growing use of Reddit as a de facto search engine.

22
00:01:39,040 --> 00:01:45,280
Sadly, Reddit's own internal search function has long been considered what the experts call absolute trash,

23
00:01:45,280 --> 00:01:48,920
but that hasn't stopped savvy users from just sticking the word Reddit

24
00:01:48,920 --> 00:01:54,080
on otherwise unrelated Google search queries. It's a well-known tactic for cutting out SEO-ified garbage

25
00:01:54,080 --> 00:02:00,760
and vapid listicles because it effectively bypasses the weaknesses of both search engines and Reddit itself.

26
00:02:00,760 --> 00:02:06,320
Reddit isn't perfect, just ask any Redditor, but on the modern internet, it does a rare and special thing.

27
00:02:06,320 --> 00:02:11,920
It allows users to direct their question to a bunch of big old nerds who care more about being right

28
00:02:11,920 --> 00:02:18,560
than they care about making money off the interaction. If you search "site:reddit.com search engine bad,"

29
00:02:18,560 --> 00:02:22,480
you'll find plenty of posts complaining about the decaying state of modern search engines

30
00:02:22,480 --> 00:02:26,240
going back over a decade. There has, however, been a pretty major uptick

31
00:02:26,240 --> 00:02:29,360
in such posts over the last two years.

32
00:02:29,360 --> 00:02:34,320
Separately, Google Trends data shows that Reddit has been steadily gaining popularity

33
00:02:34,320 --> 00:02:38,880
as a search term since 2010, when news aggregator Digg shot itself in the foot

34
00:02:38,880 --> 00:02:42,400
with a controversial redesign and started bleeding users.

35
00:02:42,400 --> 00:02:45,800
That growth remained steady until December 2021,

36
00:02:45,800 --> 00:02:50,280
when users started appending Reddit to their searches at an increased rate.

37
00:02:50,280 --> 00:02:53,600
Over 40% of Reddit's growth as a Google search term

38
00:02:53,600 --> 00:02:57,680
since 2010 is from the end of 2021 onward,

39
00:02:57,680 --> 00:03:02,040
a bit over two years. Now, there are a lot of potential confounding factors here,

40
00:03:02,040 --> 00:03:07,080
but this could be, at least in part, a consequence of widespread dissatisfaction

41
00:03:07,080 --> 00:03:10,800
with search engines. An interesting contrast to Reddit's upward trend

42
00:03:10,800 --> 00:03:15,000
as a search term is Wikipedia, which long predated Reddit as the kind of word

43
00:03:15,000 --> 00:03:18,000
you add to the end of a search query in order to cut through the noise.

44
00:03:18,000 --> 00:03:21,040
Wikipedia is a far more popular website,

45
00:03:21,040 --> 00:03:24,640
currently ranked seventh for global traffic to Reddit's 16th,

46
00:03:24,640 --> 00:03:28,040
but it's been on the decline as a search term since 2010,

47
00:03:28,040 --> 00:03:31,640
in part because Google heavily prioritizes Wikipedia already,

48
00:03:31,640 --> 00:03:36,200
both in search results and as part of its knowledge panel widget.

49
00:03:36,200 --> 00:03:40,240
But this might also indicate that the decline in quality for search engine results

50
00:03:40,240 --> 00:03:43,320
isn't hitting every search subject equally.

51
00:03:43,320 --> 00:03:48,200
There's not a ton of money riding on a search query like, when was the war of 1812?

52
00:03:48,200 --> 00:03:52,320
So the top results are mostly authoritative or reliable history sources.

53
00:03:52,320 --> 00:03:56,840
But if the most important goal of search engines is to find useful results,

54
00:03:56,840 --> 00:04:01,080
why haven't they fixed the problem? It's not that search engine companies don't care

55
00:04:01,080 --> 00:04:04,280
that spam is cluttering up the first two pages of results.

56
00:04:04,280 --> 00:04:07,360
They've been in an arms race with spam since the very beginning.

57
00:04:07,360 --> 00:04:10,920
It's just that the spam is now apparently winning.

58
00:04:10,920 --> 00:04:16,040
According to the authors of the German study that we mentioned earlier, search engine companies banning spam sites

59
00:04:16,040 --> 00:04:21,000
and readjusting their parameters had a positive, but ultimately temporary impact.

60
00:04:21,000 --> 00:04:24,600
There was still a general downward trend in terms of text quality and relevance

61
00:04:24,600 --> 00:04:29,240
for all three search engines studied, which could imply that this isn't a problem

62
00:04:29,240 --> 00:04:33,560
with search engines, but a problem with the internet itself.

63
00:04:33,560 --> 00:04:37,280
Appearing on the first page of Google can be life or death for a company.

64
00:04:37,280 --> 00:04:40,580
So there's massive financial incentive to game that system.

65
00:04:40,580 --> 00:04:45,040
The same as how there's a massive financial incentive to game ratings on sites like Amazon

66
00:04:45,040 --> 00:04:49,480
where fake reviews are notoriously rampant. Companies both big and small have realized

67
00:04:49,480 --> 00:04:54,000
that word-of-mouth personal recommendations from a financially disinterested human being

68
00:04:54,000 --> 00:04:59,200
are far more effective than traditional advertising, which means that the shadier among them

69
00:04:59,200 --> 00:05:03,600
put a lot of effort and resources into infiltrating so-called organic,

70
00:05:03,600 --> 00:05:09,040
user-generated systems of validation, drowning out authentic user reviews.

71
00:05:09,040 --> 00:05:12,040
Not to get too metaphorical, but the only way to be heard in a room

72
00:05:12,040 --> 00:05:15,560
where everybody's already yelling is to scream even louder.

73
00:05:15,560 --> 00:05:18,880
Everyone's incentive is to make more and better garbage.

74
00:05:19,560 --> 00:05:24,440
Large, vertically integrated companies like Google don't really help this hyper-competitive,

75
00:05:24,440 --> 00:05:27,680
low-effort environment when they leverage their platform

76
00:05:27,680 --> 00:05:31,520
to prioritize their own products over competitors'.

77
00:05:31,520 --> 00:05:34,800
Google has had long-standing beef with Yelp

78
00:05:34,800 --> 00:05:38,560
since at least 2011 when the FTC investigated allegations

79
00:05:38,560 --> 00:05:43,920
that Google's search algorithm consistently favored Google Places over Yelp.

80
00:05:43,920 --> 00:05:48,080
That allegation was serious enough that Google actually agreed to allow online services

81
00:05:48,080 --> 00:05:53,760
to opt out of data scraping. Yelp further contributed data to a 2015 academic paper

82
00:05:53,760 --> 00:05:57,360
claiming that Google manipulates search results to favor itself.

83
00:05:57,360 --> 00:06:02,880
Small companies perceive, often accurately, that the platform they essentially have to use

84
00:06:02,880 --> 00:06:06,560
is a potential competitor that can and will replace them

85
00:06:06,560 --> 00:06:12,120
with a store-brand version at any time. Those fancy widgets and rich snippets exist

86
00:06:12,120 --> 00:06:15,840
so that the engine that can take you anywhere you wanna go

87
00:06:15,840 --> 00:06:19,960
is now a place that you never have to leave. So what are you gonna do?

88
00:06:19,960 --> 00:06:23,720
Build a better product? Pay for your position at the top

89
00:06:23,720 --> 00:06:28,080
or find a louder way to scream? That's why, even though we said earlier

90
00:06:28,080 --> 00:06:31,200
that it's not necessarily search engine companies' fault

91
00:06:31,200 --> 00:06:34,360
that they've gotten worse, it also kinda is.

92
00:06:34,360 --> 00:06:39,000
You know how CAPTCHAs have gotten harder and harder over time? Well, that's in part because they were used

93
00:06:39,000 --> 00:06:43,080
to train machine learning, which then led to the bots becoming more sophisticated,

94
00:06:43,080 --> 00:06:46,800
which then led to the need for stronger and stronger CAPTCHAs, and so on.

95
00:06:46,800 --> 00:06:51,840
What we're seeing here is likely something similar, only internet-wide, where search engines are struggling

96
00:06:51,840 --> 00:06:56,080
to distinguish between quality content and spam from AI systems trained

97
00:06:56,080 --> 00:06:59,120
on traditionally trusted sources like Wikipedia.

98
00:06:59,120 --> 00:07:02,640
The increasing cheapness of relatively sophisticated spam tools

99
00:07:02,640 --> 00:07:08,600
has resulted in numerous odd trends. Some funny, like when oodles of products on Amazon

100
00:07:08,600 --> 00:07:12,320
wound up being named "I'm sorry, I cannot fulfill that request.

101
00:07:12,320 --> 00:07:16,840
It goes against OpenAI use policy." And others, disturbing,

102
00:07:16,840 --> 00:07:20,360
like the rise of procedurally generated clickbait obituaries,

103
00:07:20,360 --> 00:07:23,400
often for private citizens, many of whom aren't even dead.

104
00:07:23,400 --> 00:07:27,240
You used to have to be at least a D-list celebrity in order to get that kind of treatment.

105
00:07:27,240 --> 00:07:30,240
Search engines have not lost the fight against spam,

106
00:07:30,240 --> 00:07:33,920
at least not yet. But as machine generation continues to progress

107
00:07:33,920 --> 00:07:37,680
and proliferate, your search experience probably isn't going to get any better.

108
00:07:37,680 --> 00:07:44,760
Thanks for watching, guys. If you liked this video, maybe you'd like another video we have about how streaming is basically becoming cable.

109
00:07:44,760 --> 00:07:46,960
You can click on it somewhere.
