How Does Optical Character Recognition (OCR) Work?

Techquickie · 2017-05-06 · 962 words · ~4 min read

Transcript

0:00 you know it's pretty easy to take words on your computer screen and put them on
0:04 a physical sheet of paper just click print and unless you've forgotten to
0:08 fork out an extortion level amount of money for a new cartridge you'll have
0:13 fresh warm satisfying documents just a
0:16 few moments later but going in the opposite direction scanning dead tree
0:20 information into your pc is actually quite a bit trickier i mean sure flatbed
0:25 scanners aren't all that difficult to operate per se
0:28 but many of them are basically just taking a picture of the document and
0:32 saving it onto your pc meaning not only will it probably not look very crisp due
0:37 to file compression and little bits of dust in your scanner but you can't edit
0:42 a clean copy of your document in your favorite word processor because the
0:46 scanner won't recognize each individual character
0:49 fortunately there are a number of devices out there that enable optical
0:52 character recognition or ocr where each character on a page is scanned
0:57 individually so your papers are uploaded as actual text documents instead of
1:02 messy jpegs but how exactly does that
1:05 work and is one kind of optical scanner better than another well because the
1:10 whole concept of translating text into an electronic signal is pretty broad there
1:15 have been lots of different implementations of ocr over the years in
1:19 fact one of the earliest electric ocr devices the optophone was invented all
1:24 the way back in 1914 this bizarre looking contraption relied
1:29 on the special behavior of selenium
1:33 which conducts electricity differently in light and darkness
1:37 as it scanned the words on a page the optophone distinguished between the dark
1:42 ink of text and lighter blank spaces
1:45 generating tones that corresponded to different letters making it possible for
1:49 blind people to read with some practice
1:53 later in 1931 a machine was developed that could convert printed text to
1:58 telegraph code one of the first technologies to translate printed
2:01 characters to electrical impulses rather than sounds but it wasn't until the
2:06 1960s and 70s that ocr began to take a
2:09 more familiar modern form with postal services using ocr to read addresses and
2:15 software that could recognize many different fonts
2:18 so back to present day when you scan a document how exactly does the software
2:23 know what it's looking at well the first step is to cut out artifacts so your ocr
2:28 program can concentrate on the text and nothing else so it attempts to remove
2:32 dust and other various graphics align the text properly and convert any colors
2:38 or shades of grey in the image to black and white only making the words
2:42 themselves easier to recognize the next step is to figure out which characters
2:47 are on the page simpler forms of ocr compare each scanned letter pixel by
2:52 pixel to a known database of fonts and decide on the closest match smarter ocr
2:58 however takes this step further by breaking each character down to its
3:02 constituent elements like curves and corners and looking for matching
3:07 physical features in actual letters you can think of the differences between
3:11 these two approaches similarly to the difference between raster and vector
3:15 images which you can learn more about up here ocr software can also make use of a
3:20 dictionary so it won't accidentally spit out nonsense words due to inaccurate
3:24 scanning for example if your scanner sees this but it can't quite tell
3:28 whether the middle letter is an o or an a it can check its own dictionary to
3:33 decide that the word is actually dog and not dag giving ocr software situational
3:40 information can further cut down on errors such as telling it to only try to
3:44 match numbers if it's reading zip codes on an envelope
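
[editor's sketch] The steps described above (reducing the scan to black and white, comparing each character pixel by pixel against known shapes, checking a dictionary, and restricting the allowed characters) can be illustrated with a toy example. This is a minimal sketch under stated assumptions, not how any particular ocr product works: the 3x3 bitmaps in TEMPLATES, the helper names (bits, binarize, rank, read_word) and the threshold of 128 are all made up for illustration, and real software uses far larger templates and smarter matching.

```python
from itertools import product

def bits(s):
    """Turn a string like '111101111' into a tuple of 0/1 pixels (1 = ink)."""
    return tuple(int(c) for c in s)

# Hypothetical 3x3 "font database", flattened row by row (illustration only).
TEMPLATES = {
    "d": bits("001111111"),
    "o": bits("111101111"),
    "a": bits("011101110"),
    "g": bits("111111001"),
    "0": bits("111101111"),  # deliberately identical to "o"
}

def binarize(gray_rows, threshold=128):
    """Convert grayscale rows (0 = black, 255 = white) to flat 0/1 pixels."""
    return tuple(1 if px < threshold else 0 for row in gray_rows for px in row)

def rank(glyph, allowed=None):
    """Return (distance, char) pairs, closest template first.

    Distance is the number of pixels that differ from the template.
    """
    chars = TEMPLATES if allowed is None else [c for c in TEMPLATES if c in allowed]
    return sorted((sum(p != q for p, q in zip(glyph, TEMPLATES[c])), c) for c in chars)

def read_word(glyphs, dictionary=(), allowed=None):
    """Pick the closest character per glyph; on ties, prefer a dictionary word."""
    options = []
    for glyph in glyphs:
        ranked = rank(glyph, allowed)
        best = ranked[0][0]
        options.append([c for d, c in ranked if d == best])
    for combo in product(*options):
        if "".join(combo) in dictionary:
            return "".join(combo)
    # No dictionary hit: fall back to the first closest match for each glyph.
    return "".join(opts[0] for opts in options)

# The middle glyph is one pixel away from both "a" and "o".
scan = [TEMPLATES["d"], bits("011101111"), TEMPLATES["g"]]
letters = "adgo"
print(read_word(scan, allowed=letters))                      # dag
print(read_word(scan, dictionary={"dog"}, allowed=letters))  # dog
print(read_word([bits("111101111")], allowed="0123456789"))  # 0
```

In this sketch the dictionary only breaks exact ties, and the allowed argument plays the role of the "only match numbers" hint for zip codes: the same shape reads as a letter in a word and as a zero when only digits are permitted.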
3:49 even with these tricks however ocr obviously is not perfect which you've
3:54 probably seen for yourself if you've ever used it but with greater
3:58 processing power and machine learning techniques that allow software to
4:02 recognize more subtle patterns over time ocr has become versatile enough to
4:07 recognize harder to read typefaces inconsistently printed material and even
4:12 handwriting and free ocr cloud processing services
4:15 like google drive which has a lot more machine learning capability than your
4:19 home pc for what i hope are fairly obvious reasons have made ocr more
4:24 accessible than ever no word yet though on whether google will take it a step
4:29 further and launch google interpretive dance translator
4:32 i don't know what i'm doing ignore me
4:36 are you racing against the clock as a freelancer trying to start your
4:40 challenging but rewarding interpretive dance company with the growth of the
4:43 internet there have never been more opportunities for the self-employed to
4:47 meet this need check out freshbooks cloud accounting software designed for
4:50 the way that you work it's the simplest and easiest way
4:54 to be more productive organized and more importantly get paid quickly
4:59 you can create and send professional looking invoices in less than 30 seconds
5:03 which is super important and you can set up online payments with
5:06 just a couple clicks and get paid up to four days faster you can even see when a
5:11 client has seen your invoice so there's no more guessing games freshbooks is
5:15 offering a 30-day unrestricted free trial to our viewers to claim it go to
5:20 freshbooks.com/techquickie and enter techquickie in the how did you hear
5:23 about us section thanks for watching this video don't forget to like it or
5:27 dislike it uh get subscribed check out our other channels
5:31 and don't forget that i'm the worst dancer that has
5:38 wow dennis just roasted the crap out of
5:41 me