{"video_id":"fp_G6AU5EwzxR","title":"What is an NPU?","channel":"Techquickie","show":"Techquickie","published_at":"2024-04-16T20:06:00.020Z","duration_s":299,"segments":[{"start_s":0.0,"end_s":3.76,"text":"AI chips have suddenly become a big selling point for phones,","speaker":null,"is_sponsor":0},{"start_s":3.76,"end_s":8.8,"text":"but that might seem a little surprising that your little smartphone, which already","speaker":null,"is_sponsor":0},{"start_s":8.8,"end_s":15.84,"text":"has serious limitations on power consumption and heat generation, can run something as seemingly complicated as AI.","speaker":null,"is_sponsor":0},{"start_s":15.84,"end_s":22.24,"text":"So how exactly do they pull this off? Well, these neural processing units, or NPUs,","speaker":null,"is_sponsor":0},{"start_s":22.24,"end_s":28.0,"text":"are quite a bit different than your phone's main CPU cores, features like Apple's neural engine","speaker":null,"is_sponsor":0},{"start_s":28.0,"end_s":31.08,"text":"or the machine learning engine on a Google Tensor chip","speaker":null,"is_sponsor":0},{"start_s":31.08,"end_s":37.36,"text":"are highly optimized for AI tasks, but probably suck at pretty much anything else.","speaker":null,"is_sponsor":0},{"start_s":37.36,"end_s":42.44,"text":"It's kind of like how a GPU works. Although they are much better for rendering graphics","speaker":null,"is_sponsor":0},{"start_s":42.44,"end_s":46.36,"text":"than a more general-purpose CPU, you're not going to run your operating system off","speaker":null,"is_sponsor":0},{"start_s":46.36,"end_s":50.28,"text":"of your graphics card. 
AI tasks, though, are embarrassingly parallel.","speaker":null,"is_sponsor":0},{"start_s":50.28,"end_s":56.38,"text":"A relatively small amount of die area dedicated to AI, then, can effectively run machine learning","speaker":null,"is_sponsor":0},{"start_s":56.38,"end_s":59.78,"text":"based tasks without sucking down too much power.","speaker":null,"is_sponsor":0},{"start_s":59.78,"end_s":64.14,"text":"But that doesn't answer the question of why there's such a push to put these chips in our phones","speaker":null,"is_sponsor":0},{"start_s":64.14,"end_s":68.1,"text":"in the first place. I mean, we hear so much about cloud AI,","speaker":null,"is_sponsor":0},{"start_s":68.1,"end_s":74.42,"text":"where neural networks run on powerful servers. So can't we just offload tasks like image optimization","speaker":null,"is_sponsor":0},{"start_s":74.42,"end_s":80.62,"text":"and voice recognition to the cloud? Well, the answer lies in how large and complex the AI models","speaker":null,"is_sponsor":0},{"start_s":80.62,"end_s":85.14,"text":"are that your device needs to use. Models for common smartphone AI features,","speaker":null,"is_sponsor":0},{"start_s":85.18,"end_s":90.14,"text":"such as voice recognition, facial recognition, and some kinds of image correction","speaker":null,"is_sponsor":0},{"start_s":90.14,"end_s":93.9,"text":"are often relatively small, meaning that they can be run on device","speaker":null,"is_sponsor":0},{"start_s":93.9,"end_s":98.42,"text":"on a limited amount of silicon. 
And if these functions can be run locally","speaker":null,"is_sponsor":0},{"start_s":98.42,"end_s":102.18,"text":"instead of in the cloud, it's generally better to do so.","speaker":null,"is_sponsor":0},{"start_s":102.18,"end_s":105.98,"text":"For example, if you use an Android phone's speech recognition","speaker":null,"is_sponsor":0},{"start_s":105.98,"end_s":112.02,"text":"button, you will wait around for your phone to send your speech to a server over the internet,","speaker":null,"is_sponsor":0},{"start_s":112.02,"end_s":117.26,"text":"wait for that server to figure out what you're trying to say, and then wait to get the results back to your phone.","speaker":null,"is_sponsor":0},{"start_s":117.26,"end_s":122.1,"text":"If you could get results right now, that would be a big selling point for a modern phone.","speaker":null,"is_sponsor":0},{"start_s":122.1,"end_s":125.18,"text":"So even though cloud hardware might be more powerful,","speaker":null,"is_sponsor":0},{"start_s":125.18,"end_s":128.18,"text":"the latency advantage of having a chip on your device","speaker":null,"is_sponsor":0},{"start_s":128.18,"end_s":132.22,"text":"makes this trade-off worth it. Not to mention that it helps protect your privacy","speaker":null,"is_sponsor":0},{"start_s":132.22,"end_s":135.78,"text":"by keeping as much of your data on your phone as possible.","speaker":null,"is_sponsor":0},{"start_s":135.78,"end_s":139.5,"text":"But when may it not make sense to rely on a phone's NPU?","speaker":null,"is_sponsor":0},{"start_s":139.5,"end_s":143.58,"text":"More advanced forms of generative AI aren't quite at the point","speaker":null,"is_sponsor":0},{"start_s":143.62,"end_s":149.26,"text":"where you can run them on a phone efficiently. And by generative AI, I mean artificial intelligence","speaker":null,"is_sponsor":0},{"start_s":149.26,"end_s":154.78,"text":"that can create new media. 
Think about the stories that get generated by ChatGPT","speaker":null,"is_sponsor":0},{"start_s":154.78,"end_s":157.98,"text":"or the AI art from services like Midjourney.","speaker":null,"is_sponsor":0},{"start_s":157.98,"end_s":163.34,"text":"Now, you probably don't expect to run an entire advanced image generation model on a phone,","speaker":null,"is_sponsor":0},{"start_s":163.34,"end_s":168.26,"text":"at least with NPUs the size they are now. But what about commonly touted features","speaker":null,"is_sponsor":0},{"start_s":168.26,"end_s":174.38,"text":"like Google's Magic Editor on its Pixel lineup? Well, Magic Editor appears to need an internet connection","speaker":null,"is_sponsor":0},{"start_s":174.38,"end_s":179.98,"text":"since the feature uses enough generative AI that the phone has to rely on cloud servers","speaker":null,"is_sponsor":0},{"start_s":179.98,"end_s":183.42,"text":"in order to give you the image you want in a reasonable amount of time.","speaker":null,"is_sponsor":0},{"start_s":183.42,"end_s":189.58,"text":"However, less demanding features, such as Live Translate, can run on device.","speaker":null,"is_sponsor":0},{"start_s":189.58,"end_s":194.86,"text":"Since the idea of AI-specific hardware on consumer devices is still relatively new,","speaker":null,"is_sponsor":0},{"start_s":194.86,"end_s":199.3,"text":"tech companies are still trying to figure out exactly where the sweet spot is","speaker":null,"is_sponsor":0},{"start_s":199.3,"end_s":203.34,"text":"in terms of which tasks can and should be done on device","speaker":null,"is_sponsor":0},{"start_s":203.34,"end_s":209.3,"text":"versus which ones should be offloaded to the cloud. 
In fact, lots of AI-as-a-service-type products","speaker":null,"is_sponsor":0},{"start_s":209.3,"end_s":212.62,"text":"don't yet have a clear pathway to monetization.","speaker":null,"is_sponsor":0},{"start_s":212.62,"end_s":216.46,"text":"Instead, it's more common for tech firms to roll the features out now,","speaker":null,"is_sponsor":0},{"start_s":216.46,"end_s":220.58,"text":"figure out how they work, and then jam them into their business model","speaker":null,"is_sponsor":0},{"start_s":220.58,"end_s":226.02,"text":"at some point down the line. This is actually part of the reason that the die areas of NPUs in phones","speaker":null,"is_sponsor":0},{"start_s":226.02,"end_s":230.22,"text":"are still relatively small. Hardware manufacturers would rather have","speaker":null,"is_sponsor":0},{"start_s":230.22,"end_s":233.62,"text":"enough inside the phone to enable AI features,","speaker":null,"is_sponsor":0},{"start_s":233.62,"end_s":237.14,"text":"but then figure out exactly what the use cases are","speaker":null,"is_sponsor":0},{"start_s":237.14,"end_s":240.7,"text":"before they dedicate more hardware to AI.","speaker":null,"is_sponsor":0},{"start_s":240.7,"end_s":243.82,"text":"You're also seeing this on the desktop and laptop side of things,","speaker":null,"is_sponsor":0},{"start_s":243.82,"end_s":249.06,"text":"with both AMD and Intel coming out with consumer processors that include NPUs.","speaker":null,"is_sponsor":0},{"start_s":249.06,"end_s":252.26,"text":"And the idea is that features like Windows Studio Effects","speaker":null,"is_sponsor":0},{"start_s":252.26,"end_s":255.42,"text":"will run on device so your video calls look a little bit nicer.","speaker":null,"is_sponsor":0},{"start_s":255.42,"end_s":258.7,"text":"But as time goes on, both PC and phone manufacturers","speaker":null,"is_sponsor":0},{"start_s":258.7,"end_s":262.22,"text":"are aiming to get more and more AI functions running 
locally.","speaker":null,"is_sponsor":0},{"start_s":262.22,"end_s":266.1,"text":"You're already seeing the push for this with how both Team Red and Team Blue","speaker":null,"is_sponsor":0},{"start_s":266.1,"end_s":271.78,"text":"have partnered with a number of outside software developers to make applications that can take advantage of their NPUs.","speaker":null,"is_sponsor":0},{"start_s":271.78,"end_s":275.22,"text":"While it remains to be seen what AI features will become mainstays,","speaker":null,"is_sponsor":0},{"start_s":275.22,"end_s":280.38,"text":"it's clear that your gadgets are going to have significantly more brain power going forward.","speaker":null,"is_sponsor":0},{"start_s":280.38,"end_s":282.66,"text":"For better or for worse.","speaker":null,"is_sponsor":0},{"start_s":285.7,"end_s":289.98,"text":"If you guys enjoyed this video, leave a like or dislike depending on how you feel.","speaker":null,"is_sponsor":0},{"start_s":289.98,"end_s":293.1,"text":"Check out our video on the hardware that runs ChatGPT","speaker":null,"is_sponsor":0},{"start_s":293.1,"end_s":296.1,"text":"if you're looking for something else to watch and leave a comment if you have a suggestion","speaker":null,"is_sponsor":0},{"start_s":296.1,"end_s":299.58,"text":"for a future video. And of course, don't forget to subscribe.","speaker":null,"is_sponsor":0}],"full_text":"AI chips have suddenly become a big selling point for phones, but that might seem a little surprising that your little smartphone, which already has serious limitations on power consumption and heat generation, can run something as seemingly complicated as AI. So how exactly do they pull this off? Well, these neural processing units, or NPUs, are quite a bit different than your phone's main CPU cores, features like Apple's neural engine or the machine learning engine on a Google Tensor chip are highly optimized for AI tasks, but probably suck at pretty much anything else. It's kind of like how a GPU works. 
Although they are much better for rendering graphics than a more general-purpose CPU, you're not going to run your operating system off of your graphics card. AI tasks, though, are embarrassingly parallel. A relatively small amount of die area dedicated to AI, then, can effectively run machine learning based tasks without sucking down too much power. But that doesn't answer the question of why there's such a push to put these chips in our phones in the first place. I mean, we hear so much about cloud AI, where neural networks run on powerful servers. So can't we just offload tasks like image optimization and voice recognition to the cloud? Well, the answer lies in how large and complex the AI models are that your device needs to use. Models for common smartphone AI features, such as voice recognition, facial recognition, and some kinds of image correction are often relatively small, meaning that they can be run on device on a limited amount of silicon. And if these functions can be run locally instead of in the cloud, it's generally better to do so. For example, if you use an Android phone's speech recognition button, you will wait around for your phone to send your speech to a server over the internet, wait for that server to figure out what you're trying to say, and then wait to get the results back to your phone. If you could get results right now, that would be a big selling point for a modern phone. So even though cloud hardware might be more powerful, the latency advantage of having a chip on your device makes this trade-off worth it. Not to mention that it helps protect your privacy by keeping as much of your data on your phone as possible. But when may it not make sense to rely on a phone's NPU? More advanced forms of generative AI aren't quite at the point where you can run them on a phone efficiently. And by generative AI, I mean artificial intelligence that can create new media. 
Think about the stories that get generated by ChatGPT or the AI art from services like Midjourney. Now, you probably don't expect to run an entire advanced image generation model on a phone, at least with NPUs the size they are now. But what about commonly touted features like Google's Magic Editor on its Pixel lineup? Well, Magic Editor appears to need an internet connection since the feature uses enough generative AI that the phone has to rely on cloud servers in order to give you the image you want in a reasonable amount of time. However, less demanding features, such as Live Translate, can run on device. Since the idea of AI-specific hardware on consumer devices is still relatively new, tech companies are still trying to figure out exactly where the sweet spot is in terms of which tasks can and should be done on device versus which ones should be offloaded to the cloud. In fact, lots of AI-as-a-service-type products don't yet have a clear pathway to monetization. Instead, it's more common for tech firms to roll the features out now, figure out how they work, and then jam them into their business model at some point down the line. This is actually part of the reason that the die areas of NPUs in phones are still relatively small. Hardware manufacturers would rather have enough inside the phone to enable AI features, but then figure out exactly what the use cases are before they dedicate more hardware to AI. You're also seeing this on the desktop and laptop side of things, with both AMD and Intel coming out with consumer processors that include NPUs. And the idea is that features like Windows Studio Effects will run on device so your video calls look a little bit nicer. But as time goes on, both PC and phone manufacturers are aiming to get more and more AI functions running locally. 
You're already seeing the push for this with how both Team Red and Team Blue have partnered with a number of outside software developers to make applications that can take advantage of their NPUs. While it remains to be seen what AI features will become mainstays, it's clear that your gadgets are going to have significantly more brain power going forward. For better or for worse. If you guys enjoyed this video, leave a like or dislike depending on how you feel. Check out our video on the hardware that runs ChatGPT if you're looking for something else to watch and leave a comment if you have a suggestion for a future video. And of course, don't forget to subscribe."}