WEBVTT

00:00:00.080 --> 00:00:04.960
Google Translate is an amazing tool but

00:00:03.120 --> 00:00:09.719
it also kind of sucks it's a service that's come a long way over the last 18

00:00:07.759 --> 00:00:13.240
years but it's still nowhere near the perfect universal translator we're used

00:00:11.719 --> 00:00:18.000
to seeing in science fiction like Star Trek and The Hitchhiker's Guide to the Galaxy

00:00:16.000 --> 00:00:23.080
but how far away from that kind of seamless translation technology are we

00:00:20.359 --> 00:00:27.400
really Google Translate is an example of machine translation a subfield of

00:00:25.519 --> 00:00:32.480
computational linguistics that has been under development since the 1950s early

00:00:30.199 --> 00:00:36.480
machine translators used a rule-based approach which required programmers to

00:00:34.280 --> 00:00:41.440
explicitly describe every rule governing a language which was both labor

00:00:38.360 --> 00:00:43.200
intensive and flawed in the 1980s

00:00:41.440 --> 00:00:47.480
statistical translation models became popular these used probability and

00:00:45.480 --> 00:00:51.719
heuristics to determine the best translation for a given phrase which was

00:00:49.840 --> 00:00:55.559
more efficient but they struggled to translate between languages with

00:00:53.399 --> 00:01:00.519
significant differences in terms of grammar and word order je t'aime is a pretty

00:00:58.800 --> 00:01:06.720
normal French phrase but its word order isn't I love you it's

00:01:03.480 --> 00:01:09.119
I you love this is one of the reasons

00:01:06.720 --> 00:01:14.080
why even relatively fast real-time interpretation programs always come with

00:01:11.360 --> 00:01:17.960
a significant delay the software needs to hear the full sentence before it

00:01:15.720 --> 00:01:22.280
knows how it's supposed to order the words in its translation when it

00:01:20.000 --> 00:01:25.880
launched in 2006 Google Translate originally used a statistical

00:01:24.000 --> 00:01:29.920
translation model and typically translated languages by converting them

00:01:27.720 --> 00:01:34.680
to English then converting them into the desired target language in late

00:01:32.200 --> 00:01:39.479
2016 Google transitioned its translation service to neural machine translation

00:01:36.920 --> 00:01:44.399
which uses a neural network to translate phrases and passages holistically

00:01:41.920 --> 00:01:49.119
meaning that it can use the larger text as context when translating specific

00:01:46.880 --> 00:01:53.960
phrases Google Translate in particular excels because it's so widely used and

00:01:51.719 --> 00:02:00.119
has a massive flow of user feedback to help refine its output that being said

00:01:57.479 --> 00:02:05.039
experts do not consider any translation programs today to be capable of fully

00:02:02.280 --> 00:02:09.879
automated high-quality translation as the technology is still too unreliable

00:02:07.560 --> 00:02:14.920
for formal purposes like translating works for publication or interpreting a

00:02:12.360 --> 00:02:19.480
witness statement during a court hearing however professional translators do use

00:02:17.280 --> 00:02:24.680
a fair number of automated tools in what is called computer-assisted translation

00:02:22.080 --> 00:02:29.959
they're cyborgs to be fair accurate enough for a court hearing is a pretty high bar

00:02:27.319 --> 00:02:34.599
so how accurate is Google Translate really well it can vary a lot depending

00:02:32.280 --> 00:02:38.800
on the language pairing if you're a unilingual English speaker Google

00:02:36.879 --> 00:02:43.159
Translate might actually seem to work really really well that's because

00:02:40.599 --> 00:02:47.879
English has the most speakers out of any human language with almost 1.5 billion

00:02:46.000 --> 00:02:51.480
and half of websites are written in English Google Translate has tons of

00:02:49.840 --> 00:02:56.360
English language reference material to draw from but the majority of the other

00:02:54.040 --> 00:03:01.360
132 languages supported by Google Translate are only represented by a

00:02:58.640 --> 00:03:07.440
small fraction of users and websites according to a 2019 UCLA study English

00:03:04.519 --> 00:03:12.000
to Spanish was around 94% accurate but English to Armenian was only 55%

00:03:12.519 --> 00:03:19.879
accurate but isn't this accuracy gap just going to go away with

00:03:17.000 --> 00:03:23.920
time well yes and no this technology will inevitably become more accurate as

00:03:22.200 --> 00:03:28.319
time goes on but there are several deeper problems with machine translation

00:03:25.920 --> 00:03:32.480
that we have yet to really solve we'll tell you what they are after this

00:03:29.799 --> 00:03:36.200
message from our sponsor iFixit if your laptop is broken or acting funky forget

00:03:34.720 --> 00:03:41.360
going out and buying a whole new one repair it with the help of iFixit their

00:03:38.840 --> 00:03:44.439
exhaustive selection of parts like SSDs and batteries along with their

00:03:42.760 --> 00:03:48.519
comprehensive repair guides means opening up your laptop and fixing it is

00:03:46.360 --> 00:03:52.920
easier than ever check out iFixit using the link in the description and give

00:03:50.439 --> 00:03:57.560
your busted laptop a new lease on life it's tempting to think that the route to

00:03:54.439 --> 00:03:59.439
a universal translator is just more data

00:03:57.560 --> 00:04:03.680
and that would definitely help in Armenian's case but even with an extremely

00:04:02.000 --> 00:04:08.239
sophisticated neural network and plenty of training data machine translators

00:04:06.000 --> 00:04:13.879
tend to struggle in a few key areas of human communication most notably slang

00:04:11.319 --> 00:04:17.359
jokes and figurative language a lot of human communication relies on

00:04:15.439 --> 00:04:22.919
abstraction and double meanings especially metaphors and idioms the

00:04:19.440 --> 00:04:24.720
French phrase haut comme trois pommes I'm sorry is a

00:04:22.919 --> 00:04:29.240
whimsical idiom typically used to describe the height of young children

00:04:27.040 --> 00:04:33.960
the rough English equivalent is knee high to a grasshopper but if you

00:04:31.280 --> 00:04:39.479
translate it literally it means three apples high Google Translate interprets

00:04:36.960 --> 00:04:45.120
this French phrase as little person which is sort of right but also very

00:04:41.840 --> 00:04:47.800
very wrong similarly the humor of it

00:04:45.120 --> 00:04:51.680
ain't rocket surgery just isn't going to translate well without a human being

00:04:49.880 --> 00:04:56.039
willing to put in the groundwork to find a suitable cultural equivalent because

00:04:53.919 --> 00:05:00.400
French people don't use rocket science or brain surgery as benchmarks for

00:04:58.120 --> 00:05:04.000
difficulty they use sorcery now this isn't an impossible hurdle for

00:05:02.120 --> 00:05:08.000
a machine translator to handle because idioms tend to be said the exact same

00:05:05.880 --> 00:05:12.240
way every time you could teach the software how to translate a long list of

00:05:10.000 --> 00:05:16.720
specific idioms so long as you are willing to put enough resources into it

00:05:14.680 --> 00:05:21.919
teaching it to consistently translate word play and jokes however might very

00:05:19.759 --> 00:05:25.319
well be impossible slang is always a real headache for translators both machine

00:05:23.440 --> 00:05:29.919
and human because it's typically used by relatively small niche subcultures and

00:05:27.960 --> 00:05:35.479
it's unlikely to wind up in main language repositories like dictionaries

00:05:32.199 --> 00:05:37.759
and thesauruses slang also tends to change

00:05:35.479 --> 00:05:41.800
quickly and rely on community specific cultural references if you were to try

00:05:39.919 --> 00:05:45.919
and translate a slang-heavy Kendrick Lamar song by pushing it through a

00:05:43.560 --> 00:05:51.000
machine translator the result would not be just aesthetically questionable but

00:05:48.600 --> 00:05:55.120
also basically incoherent to a person unfamiliar with American rap culture

00:05:53.440 --> 00:05:58.880
helping a machine translator to understand slang is again possible it

00:05:57.400 --> 00:06:03.280
would just be a bit expensive and require near constant updating where

00:06:01.520 --> 00:06:07.800
things get tricky is that there's a slight difference between a translation

00:06:05.039 --> 00:06:12.720
being accurate and a translation being good especially when it comes to art

00:06:10.599 --> 00:06:16.960
machine translation can often be technically correct but still fail to

00:06:14.680 --> 00:06:22.560
communicate the tone and cultural connotations of the original on a purely

00:06:19.960 --> 00:06:27.319
technical level how are you what's up and how's it hanging all mean basically

00:06:24.919 --> 00:06:31.000
the same thing but one is a double entendre and a weird thing to say to your

00:06:28.880 --> 00:06:34.919
grandma we might be happy enough with a machine translation of a menu if it just

00:06:33.280 --> 00:06:39.680
accurately communicates what food is available but a translated novel or poem

00:06:38.000 --> 00:06:43.880
needs to find a good balance between accuracy and aesthetics in order to

00:06:41.840 --> 00:06:48.479
create a similar experience to the original this means that a literary

00:06:45.800 --> 00:06:52.919
translator needs to be creative and make astute artistic judgments in the same

00:06:50.919 --> 00:06:56.840
way that an author does the basic problem with translation software is

00:06:54.560 --> 00:07:01.120
that there's no mind behind it that truly understands the purpose or intent

00:06:58.919 --> 00:07:04.720
of the words it's processing an optimistic goal of machine translation

00:07:02.879 --> 00:07:08.160
as a discipline is to one day have translation software that is so

00:07:06.080 --> 00:07:12.800
sophisticated and nuanced that it only needs a human editor to check its work

00:07:10.000 --> 00:07:17.160
and fix whatever errors they find more pessimistically high-quality fully

00:07:14.800 --> 00:07:22.000
automated machine translation might require the development of something

00:07:18.879 --> 00:07:23.560
like AGI artificial general intelligence

00:07:22.000 --> 00:07:28.800
a machine capable of human-like intelligence and making nuanced judgment

00:07:25.479 --> 00:07:31.800
calls so Google Translate might need to

00:07:28.800 --> 00:07:33.960
be capable of forming opinions before

00:07:31.800 --> 00:07:37.960
it's ready to translate poetry thanks for watching guys if you like this video

00:07:35.680 --> 00:07:43.240
why don't you check out our one on X's or Twitter's Community Notes program

00:07:40.520 --> 00:07:46.360
it's actually pretty cool and maybe the future of the internet
