How Does Optical Character Recognition (OCR) Work?
Techquickie
·2017-05-06
0:00
you know it's pretty easy to take words on your computer screen and put them on
0:04
a physical sheet of paper just click print and unless you've forgotten to
0:08
fork out an extortion level amount of money for a new cartridge you'll have
0:13
fresh warm satisfying documents just a
0:16
few moments later but going in the opposite direction scanning dead tree
0:20
information into your pc is actually quite a bit trickier i mean sure flatbed
0:25
scanners aren't all that difficult to operate per se
0:28
but many of them are basically just taking a picture of the document and
0:32
saving it onto your pc meaning not only will it probably not look very crisp due
0:37
to file compression and little bits of dust in your scanner but you can't edit
0:42
a clean copy of your document in your favorite word processor because the
0:46
scanner won't recognize each individual character
0:49
fortunately there are a number of devices out there that enable optical
0:52
character recognition or ocr where each character on a page is scanned
0:57
individually so your papers are uploaded as actual text documents instead of
1:02
messy jpegs but how exactly does that
1:05
work and is one kind of optical scanner better than another well because the
1:10
whole concept of translating text into electronic signals is pretty broad there
1:15
have been lots of different implementations of ocr over the years in
1:19
fact one of the earliest electric ocr devices the optophone was invented all
1:24
the way back in 1914 this bizarre looking contraption relied
1:29
on the special behavior of selenium
1:33
which conducts electricity differently in light and darkness
1:37
as it scanned the words on a page the optophone distinguished between the dark
1:42
ink of text and lighter blank spaces
1:45
generating tones that corresponded to different letters making it possible for
1:49
blind people to read with some practice
1:53
later in 1931 a machine was developed that could convert printed text to
1:58
telegraph code one of the first technologies to translate printed
2:01
characters to electrical impulses rather than sounds but it wasn't until the
2:06
1960s and 70s that ocr began to take a
2:09
more familiar modern form with postal services using ocr to read addresses and
2:15
software that could recognize many different fonts
2:18
so back to present day when you scan a document how exactly does the software
2:23
know what it's looking at well the first step is to cut out artifacts so your ocr
2:28
program can concentrate on the text and nothing else so it attempts to remove
2:32
dust and other various graphics align the text properly and convert any colors
2:38
or shades of grey in the image to black and white only making the words
2:42
themselves easier to recognize the next step is to figure out which characters
2:47
are on the page simpler forms of ocr compare each scanned letter pixel by
2:52
pixel to a known database of fonts and decide on the closest match smarter ocr
2:58
however takes this step further by breaking each character down to
3:02
constituent elements like curves and corners and looking for matching
3:07
physical features in actual letters you can think of the differences between
3:11
these two approaches similarly to the difference between raster and vector
3:15
images which you can learn more about up here ocr software can also make use of a
3:20
dictionary so it won't accidentally spit out nonsense words due to inaccurate
3:24
scanning for example if your scanner sees this but it can't quite tell
3:28
whether the middle letter is an o or an a it can check its own dictionary to
3:33
decide that the word is actually dog and not dag giving ocr software situational
3:40
information can further cut down on errors such as telling it to only try to
3:44
match numbers if it's reading zip codes on an envelope
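
To make those steps concrete, here is a minimal sketch in Python of the simpler approach: convert the scan to black and white, compare each glyph pixel by pixel against a font database, and let a dictionary settle the dog-versus-dag question. Everything in it is an assumption made for illustration: the 3x3 bitmaps in TEMPLATES, the tiny DICTIONARY, the threshold of 128, and the helper names are all hypothetical, and real OCR software works on far larger images with far more sophisticated matching.

```python
from itertools import product

# Toy 3x3 "font database" (hypothetical; '1' = ink, '0' = blank paper).
TEMPLATES = {
    "d": ("001", "011", "111"),
    "o": ("111", "101", "111"),
    "a": ("010", "101", "111"),
    "g": ("111", "111", "001"),
}
DICTIONARY = {"dog", "god", "add"}  # made-up word list for this example


def binarize(gray, threshold=128):
    """Turn grayscale rows (0 = black, 255 = white) into strings of '1' and '0'."""
    return tuple("".join("1" if px < threshold else "0" for px in row) for row in gray)


def distance(glyph, template):
    """Count the pixels that differ between two same-sized bitmaps."""
    return sum(a != b for g_row, t_row in zip(glyph, template) for a, b in zip(g_row, t_row))


def candidates(glyph):
    """Return every character tied for the smallest pixel distance."""
    scores = {ch: distance(glyph, template) for ch, template in TEMPLATES.items()}
    best = min(scores.values())
    return sorted(ch for ch, score in scores.items() if score == best)


def read_word(gray_glyphs):
    """Match each glyph, then let the dictionary break any ties."""
    options = [candidates(binarize(glyph)) for glyph in gray_glyphs]
    guesses = ["".join(chars) for chars in product(*options)]
    for guess in guesses:
        if guess in DICTIONARY:
            return guess
    return guesses[0]  # no dictionary hit: fall back to the first raw match


INK, PAPER = 40, 230  # imperfect grays, standing in for a real scan


def fake_scan(bitmap):
    """Build a grayscale glyph from a bitmap so binarize() has something to do."""
    return [[INK if c == "1" else PAPER for c in row] for row in bitmap]


scan = [
    fake_scan(TEMPLATES["d"]),
    fake_scan(("110", "101", "111")),  # smudged: one pixel off from both "o" and "a"
    fake_scan(TEMPLATES["g"]),
]
print(read_word(scan))
```

The smudged middle glyph is exactly one pixel away from both the o and the a template, so pixel matching alone cannot choose between them. The dictionary lookup is what picks dog over dag. Restricting the match to digits for zip codes would work the same way, by shrinking the set of templates the matcher is allowed to consider.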
3:49
even with these tricks however ocr obviously is not perfect which you've
3:54
probably seen for yourself if you've ever used it but with greater
3:58
processing power and machine learning techniques that allow software to
4:02
recognize more subtle patterns over time ocr has become versatile enough to
4:07
recognize harder to read typefaces inconsistently printed material and even
4:12
handwriting and free ocr cloud processing services
4:15
like google drive which has a lot more machine learning capability than your
4:19
home pc for what i hope are fairly obvious reasons have made ocr more
4:24
accessible than ever no word yet though on whether google will take it a step
4:29
further and launch google interpretive dance translator
4:32
i don't know what i'm doing ignore me
4:36
are you racing against the clock as a freelancer trying to start your
4:40
challenging but rewarding interpretive dance company with the growth of the
4:43
internet there have never been more opportunities for the self-employed to
4:47
meet this need check out freshbooks cloud accounting software designed for
4:50
the way that you work it's the simplest and easiest way
4:54
to be more productive organized and more importantly get paid quickly
4:59
you can create and send professional looking invoices in less than 30 seconds
5:03
which is super important and you can set up online payments with
5:06
just a couple clicks and get paid up to four days faster you can even see when a
5:11
client has seen your invoice so there's no more guessing games freshbooks is
5:15
offering a 30-day unrestricted free trial to our viewers to claim it go to
5:20
freshbooks.com/techquickie and enter techquickie in the how did you hear
5:23
about us section thanks for watching this video don't forget to like it or
5:27
dislike it uh get subscribed check out our other channels
5:31
and don't forget that i'm the worst dancer that has
5:38
wow dennis just roasted the crap out of
5:41
me