{"video_id":"fp_MsdI6X6AAD","title":"YouTube transcripts scraped, Record companies suing, Meta does a nice thing, and  more!","channel":"TechLinked","show":"TechLinked","published_at":"2024-07-18T02:00:00.054Z","duration_s":378,"segments":[{"start_s":0.0,"end_s":4.8,"text":"Welcome to another episode of Tech News. Siri, we know you're watching and you're welcome.","speaker":null,"is_sponsor":0},{"start_s":4.8,"end_s":8.16,"text":"Because according to new research from Proof News, tech giants including Apple,","speaker":null,"is_sponsor":0},{"start_s":8.16,"end_s":13.52,"text":"Anthropic, and NVIDIA have been training their AI models using transcripts from over 170,000","speaker":null,"is_sponsor":0},{"start_s":13.52,"end_s":18.4,"text":"YouTube videos. This trove included work of big internet stars like MrBeast, PewDiePie,","speaker":null,"is_sponsor":0},{"start_s":18.4,"end_s":23.2,"text":"Jacksepticeye, and Marques Brownlee, but also clips from major television shows like Jimmy","speaker":null,"is_sponsor":0},{"start_s":23.2,"end_s":27.44,"text":"Kimmel Live, Last Week Tonight, and The Late Show with Stephen Colbert. These transcripts","speaker":null,"is_sponsor":0},{"start_s":27.44,"end_s":33.52,"text":"weren't taken directly by these companies, but were collected by a third-party non-profit called","speaker":null,"is_sponsor":0},{"start_s":33.52,"end_s":39.76,"text":"EleutherAI. That's a weird combination of letters. As part of The Pile, an open-source","speaker":null,"is_sponsor":0},{"start_s":39.76,"end_s":44.96,"text":"dataset intended to help academics and small developers to train their own AI models and","speaker":null,"is_sponsor":0},{"start_s":44.96,"end_s":50.08,"text":"not, as it might sound, a giant haemorrhoid. The material appears to have been scraped entirely","speaker":null,"is_sponsor":0},{"start_s":50.08,"end_s":54.56,"text":"without the original creators' permission. 
Obviously, these transcripts are generated from","speaker":null,"is_sponsor":0},{"start_s":54.64,"end_s":60.24,"text":"copyrighted material made by human artists with bills and mortgages, but as pointed out by Marques","speaker":null,"is_sponsor":0},{"start_s":60.24,"end_s":65.12,"text":"Brownlee, he uses a paid transcription service rather than generated subtitles, meaning those","speaker":null,"is_sponsor":0},{"start_s":65.12,"end_s":70.32,"text":"transcripts are kind of copyrighted squared. In addition to educational videos from Khan Academy,","speaker":null,"is_sponsor":0},{"start_s":70.32,"end_s":74.96,"text":"MIT, and Harvard, the database also contains materials from conspiracy theorists and flat","speaker":null,"is_sponsor":0},{"start_s":74.96,"end_s":79.6,"text":"earth videos, so don't be shocked when you ask directions and Siri gives you a lecture about","speaker":null,"is_sponsor":0},{"start_s":79.6,"end_s":85.84,"text":"the danger of chemtrails. Three of the world's largest record companies, Universal, Warner, and","speaker":null,"is_sponsor":0},{"start_s":85.84,"end_s":92.0,"text":"Sony Music, have launched a $2.6 billion lawsuit against Verizon, alleging that the telecom company","speaker":null,"is_sponsor":0},{"start_s":92.0,"end_s":97.04,"text":"has profited from ignoring widespread piracy. According to the suit, Verizon should be held liable","speaker":null,"is_sponsor":0},{"start_s":97.04,"end_s":101.6,"text":"for its decision to continue to provide internet services to thousands of flagrant pirates. It's","speaker":null,"is_sponsor":0},{"start_s":101.6,"end_s":106.64,"text":"not clear how likely this lawsuit is to succeed, given that it closely resembles a previous lawsuit","speaker":null,"is_sponsor":0},{"start_s":106.64,"end_s":111.92,"text":"by the same plaintiffs against Cox Communications. 
That suit found that Cox was guilty of failing to","speaker":null,"is_sponsor":0},{"start_s":111.92,"end_s":116.96,"text":"stop piracy, but wound up overturned on appeal because Cox didn't actually make any money from","speaker":null,"is_sponsor":0},{"start_s":116.96,"end_s":121.92,"text":"the infringement. See, while it's obviously true that Verizon makes more money if it doesn't force","speaker":null,"is_sponsor":0},{"start_s":121.92,"end_s":125.52,"text":"its paying customers to walk the plank the moment they download BitTorrent, it actually makes the","speaker":null,"is_sponsor":0},{"start_s":125.52,"end_s":129.92,"text":"same amount of money whether or not you steal Birdemic: Shock and Terror, or you pay for it like","speaker":null,"is_sponsor":0},{"start_s":129.92,"end_s":134.48,"text":"a good boy. Meta has launched a small pilot program that will give researchers access to","speaker":null,"is_sponsor":0},{"start_s":134.48,"end_s":139.28,"text":"Instagram's own data so they can study the app's effect on teenagers and young adults.","speaker":null,"is_sponsor":0},{"start_s":139.28,"end_s":143.2,"text":"Meta is now accepting proposals for studies and will accept up to seven submissions,","speaker":null,"is_sponsor":0},{"start_s":143.2,"end_s":148.48,"text":"one for each deadly sin. However, researchers still need to obtain consent from study participants","speaker":null,"is_sponsor":0},{"start_s":148.48,"end_s":152.88,"text":"and their parents. It's still unclear how extensive this information access will be,","speaker":null,"is_sponsor":0},{"start_s":152.88,"end_s":157.76,"text":"but it's a notable step given Meta's past hostility towards researchers. Prior research has","speaker":null,"is_sponsor":0},{"start_s":157.76,"end_s":162.96,"text":"shown a distinct correlation between heavy social media use and mental health issues like anxiety","speaker":null,"is_sponsor":0},{"start_s":162.96,"end_s":167.44,"text":"and depression, especially in teenagers. 
What's less clear, however, is whether the kids are sad","speaker":null,"is_sponsor":0},{"start_s":167.44,"end_s":171.36,"text":"because they're on Instagram all the time or if they're on Instagram all the time because they're","speaker":null,"is_sponsor":0},{"start_s":171.36,"end_s":175.84,"text":"sad. Regardless, parents and legislators are concerned and it's driving more and more scrutiny","speaker":null,"is_sponsor":0},{"start_s":175.84,"end_s":180.8,"text":"from regulators. TikTok lost its recent legal challenge to the EU's decision to classify it","speaker":null,"is_sponsor":0},{"start_s":180.8,"end_s":185.52,"text":"as a very large online platform on account of the size of its parent company and because","speaker":null,"is_sponsor":0},{"start_s":185.52,"end_s":189.68,"text":"it's the social media equivalent of free-basing. If you made it through our main stories,","speaker":null,"is_sponsor":0},{"start_s":189.68,"end_s":194.0,"text":"but wish they could have been slightly quicker? Congratulations, it's the Quick Bits, they're","speaker":null,"is_sponsor":0},{"start_s":194.0,"end_s":201.44,"text":"here. Noctua's new NH-D15 G2 cooler is leaving some users rattled, specifically their coolers.","speaker":null,"is_sponsor":0},{"start_s":201.44,"end_s":205.6,"text":"They're rattling. After a report from Hardware Busters and customer complaints, Noctua says","speaker":null,"is_sponsor":0},{"start_s":205.6,"end_s":209.92,"text":"they're investigating the issue. Noctua believes the problem could be temporarily fixed by using","speaker":null,"is_sponsor":0},{"start_s":209.92,"end_s":214.0,"text":"tape or foam until they find a permanent solution, but they're also offering full refunds.","speaker":null,"is_sponsor":0},{"start_s":214.72,"end_s":220.16,"text":"As much as we love to hawk tuah for Noctua, it seems that even the beige and brown giant","speaker":null,"is_sponsor":0},{"start_s":220.16,"end_s":225.2,"text":"can't avoid QA issues. 
Thankfully, Noctua has a good reputation for customer support. So as","speaker":null,"is_sponsor":0},{"start_s":225.2,"end_s":229.44,"text":"they say in the company's native Austria, you got a problem with Noctua, you talk to her.","speaker":null,"is_sponsor":0},{"start_s":231.92,"end_s":236.56,"text":"Google has announced Project Oscar, an open source platform allowing developers to create AI","speaker":null,"is_sponsor":0},{"start_s":236.56,"end_s":241.52,"text":"agents to help manage their open source projects. For example, the agent can summarize and highlight","speaker":null,"is_sponsor":0},{"start_s":241.52,"end_s":247.6,"text":"relevant information in issue reports from users. Rather than have the AI code, which is an awful","speaker":null,"is_sponsor":0},{"start_s":247.6,"end_s":252.32,"text":"idea that has only ever gone wrong, the hope seems to be that the agents can help with the","speaker":null,"is_sponsor":0},{"start_s":252.32,"end_s":257.28,"text":"disruptions and toil. I guess I don't understand why someone would want to use this. I'm pretty sure","speaker":null,"is_sponsor":0},{"start_s":257.28,"end_s":261.68,"text":"the only reason to maintain an open source project is to suffer thanklessly while making no money.","speaker":null,"is_sponsor":0},{"start_s":263.2,"end_s":267.84,"text":"Apple is making a Squidwardian attempt to prevent a future where everything is chrome","speaker":null,"is_sponsor":0},{"start_s":267.84,"end_s":273.04,"text":"with a new Hitchcockian ad for Safari. Apple continues to beat the dead horse saying they","speaker":null,"is_sponsor":0},{"start_s":273.04,"end_s":277.6,"text":"care about privacy, and I will keep pummeling the deceased stallion saying they don't care about","speaker":null,"is_sponsor":0},{"start_s":277.6,"end_s":282.72,"text":"your privacy, they just want your data with them in their sandbox. 
Also, it's not like they've","speaker":null,"is_sponsor":0},{"start_s":282.72,"end_s":286.96,"text":"stopped accepting Google's multi-billion dollar bribes to make Google the default search engine","speaker":null,"is_sponsor":0},{"start_s":286.96,"end_s":291.04,"text":"in Safari. At least not until regulators force them to stop. Also, Apple, I don't know whose","speaker":null,"is_sponsor":0},{"start_s":291.04,"end_s":296.08,"text":"idea this was, but maybe you should stop feeding your AI videos about how birds aren't real.","speaker":null,"is_sponsor":0},{"start_s":296.24,"end_s":306.16,"text":"Research and recovery company RMS Titanic, Inc. is prepared to send remote submersibles down 12,000","speaker":null,"is_sponsor":0},{"start_s":306.16,"end_s":311.12,"text":"feet to take 3D scans of the wreck of the Titanic. This is the first mission to the wreck since the","speaker":null,"is_sponsor":0},{"start_s":311.12,"end_s":315.84,"text":"implosion of the Titan submersible last year, but this expedition is intended to preserve","speaker":null,"is_sponsor":0},{"start_s":315.84,"end_s":323.12,"text":"knowledge for study rather than turn thinking, breathing humans into soup. In similar news,","speaker":null,"is_sponsor":0},{"start_s":323.12,"end_s":328.48,"text":"drone manufacturer DJI has released high-quality footage of Mount Everest taken with the DJI","speaker":null,"is_sponsor":0},{"start_s":328.48,"end_s":332.8,"text":"Mavic 3 Pro. Now you can travel from base camp to summit and you don't even have to step over","speaker":null,"is_sponsor":0},{"start_s":332.8,"end_s":338.48,"text":"the frozen bodies of dead billionaires to get there. The FTC has said HeHealth can no longer scan","speaker":null,"is_sponsor":0},{"start_s":338.48,"end_s":343.92,"text":"your little man with their app, Calmara. 
It aims to be an AI pecker protector that could detect","speaker":null,"is_sponsor":0},{"start_s":343.92,"end_s":349.92,"text":"10 STIs that scourge your courgette with 94% accuracy. That's a 1 in 20 chance they can screw","speaker":null,"is_sponsor":0},{"start_s":349.92,"end_s":355.92,"text":"up identifying a bug in your love slug and that's just the STIs you can see. Turns out the app can","speaker":null,"is_sponsor":0},{"start_s":355.92,"end_s":361.04,"text":"only accurately detect four diseases between your knees and could be fooled by inanimate","speaker":null,"is_sponsor":0},{"start_s":361.04,"end_s":366.24,"text":"nobjects like phallic vases and pastries. I mean that's what I think every time I look at a croissant.","speaker":null,"is_sponsor":0},{"start_s":366.24,"end_s":371.04,"text":"Looks like you should block the AI schlock. Get a doctor to gawk at your... penis.","speaker":null,"is_sponsor":0},{"start_s":372.08,"end_s":375.52,"text":"And doctors recommend you get a tri-weekly injection of tech news, so make sure you come back","speaker":null,"is_sponsor":0},{"start_s":375.52,"end_s":378.72,"text":"on Friday. Personally, I've got something I need you to take a look at.","speaker":null,"is_sponsor":0}],"full_text":"Welcome to another episode of Tech News. Siri, we know you're watching and you're welcome. Because according to new research from Proof News, tech giants including Apple, Anthropic, and NVIDIA have been training their AI models using transcripts from over 170,000 YouTube videos. This trove included work of big internet stars like MrBeast, PewDiePie, Jacksepticeye, and Marques Brownlee, but also clips from major television shows like Jimmy Kimmel Live, Last Week Tonight, and The Late Show with Stephen Colbert. These transcripts weren't taken directly by these companies, but were collected by a third-party non-profit called EleutherAI. That's a weird combination of letters. 
As part of The Pile, an open-source dataset intended to help academics and small developers to train their own AI models and not, as it might sound, a giant haemorrhoid. The material appears to have been scraped entirely without the original creators' permission. Obviously, these transcripts are generated from copyrighted material made by human artists with bills and mortgages, but as pointed out by Marques Brownlee, he uses a paid transcription service rather than generated subtitles, meaning those transcripts are kind of copyrighted squared. In addition to educational videos from Khan Academy, MIT, and Harvard, the database also contains materials from conspiracy theorists and flat earth videos, so don't be shocked when you ask directions and Siri gives you a lecture about the danger of chemtrails. Three of the world's largest record companies, Universal, Warner, and Sony Music, have launched a $2.6 billion lawsuit against Verizon, alleging that the telecom company has profited from ignoring widespread piracy. According to the suit, Verizon should be held liable for its decision to continue to provide internet services to thousands of flagrant pirates. It's not clear how likely this lawsuit is to succeed, given that it closely resembles a previous lawsuit by the same plaintiffs against Cox Communications. That suit found that Cox was guilty of failing to stop piracy, but wound up overturned on appeal because Cox didn't actually make any money from the infringement. See, while it's obviously true that Verizon makes more money if it doesn't force its paying customers to walk the plank the moment they download BitTorrent, it actually makes the same amount of money whether or not you steal Birdemic: Shock and Terror, or you pay for it like a good boy. Meta has launched a small pilot program that will give researchers access to Instagram's own data so they can study the app's effect on teenagers and young adults. 
Meta is now accepting proposals for studies and will accept up to seven submissions, one for each deadly sin. However, researchers still need to obtain consent from study participants and their parents. It's still unclear how extensive this information access will be, but it's a notable step given Meta's past hostility towards researchers. Prior research has shown a distinct correlation between heavy social media use and mental health issues like anxiety and depression, especially in teenagers. What's less clear, however, is whether the kids are sad because they're on Instagram all the time or if they're on Instagram all the time because they're sad. Regardless, parents and legislators are concerned and it's driving more and more scrutiny from regulators. TikTok lost its recent legal challenge to the EU's decision to classify it as a very large online platform on account of the size of its parent company and because it's the social media equivalent of free-basing. If you made it through our main stories, but wish they could have been slightly quicker? Congratulations, it's the Quick Bits, they're here. Noctua's new NH-D15 G2 cooler is leaving some users rattled, specifically their coolers. They're rattling. After a report from Hardware Busters and customer complaints, Noctua says they're investigating the issue. Noctua believes the problem could be temporarily fixed by using tape or foam until they find a permanent solution, but they're also offering full refunds. As much as we love to hawk tuah for Noctua, it seems that even the beige and brown giant can't avoid QA issues. Thankfully, Noctua has a good reputation for customer support. So as they say in the company's native Austria, you got a problem with Noctua, you talk to her. Google has announced Project Oscar, an open source platform allowing developers to create AI agents to help manage their open source projects. For example, the agent can summarize and highlight relevant information in issue reports from users. 
Rather than have the AI code, which is an awful idea that has only ever gone wrong, the hope seems to be that the agents can help with the disruptions and toil. I guess I don't understand why someone would want to use this. I'm pretty sure the only reason to maintain an open source project is to suffer thanklessly while making no money. Apple is making a Squidwardian attempt to prevent a future where everything is chrome with a new Hitchcockian ad for Safari. Apple continues to beat the dead horse saying they care about privacy, and I will keep pummeling the deceased stallion saying they don't care about your privacy, they just want your data with them in their sandbox. Also, it's not like they've stopped accepting Google's multi-billion dollar bribes to make Google the default search engine in Safari. At least not until regulators force them to stop. Also, Apple, I don't know whose idea this was, but maybe you should stop feeding your AI videos about how birds aren't real. Research and recovery company RMS Titanic, Inc. is prepared to send remote submersibles down 12,000 feet to take 3D scans of the wreck of the Titanic. This is the first mission to the wreck since the implosion of the Titan submersible last year, but this expedition is intended to preserve knowledge for study rather than turn thinking, breathing humans into soup. In similar news, drone manufacturer DJI has released high-quality footage of Mount Everest taken with the DJI Mavic 3 Pro. Now you can travel from base camp to summit and you don't even have to step over the frozen bodies of dead billionaires to get there. The FTC has said HeHealth can no longer scan your little man with their app, Calmara. It aims to be an AI pecker protector that could detect 10 STIs that scourge your courgette with 94% accuracy. That's a 1 in 20 chance they can screw up identifying a bug in your love slug and that's just the STIs you can see. 
Turns out the app can only accurately detect four diseases between your knees and could be fooled by inanimate nobjects like phallic vases and pastries. I mean that's what I think every time I look at a croissant. Looks like you should block the AI schlock. Get a doctor to gawk at your... penis. And doctors recommend you get a tri-weekly injection of tech news, so make sure you come back on Friday. Personally, I've got something I need you to take a look at."}