WEBVTT

00:00:00.160 --> 00:00:05.640
there it is I have a problem I just

00:00:03.439 --> 00:00:11.080
updated the BIOS on my workstation and ever since I did it's been randomly

00:00:08.280 --> 00:00:15.000
shutting off just while I'm not doing anything and while normally that's the

00:00:13.360 --> 00:00:19.840
sort of thing that I would troubleshoot myself I have another problem and that

00:00:17.920 --> 00:00:26.199
is that if I don't leave to start my vacation in the next 3 minutes my wife

00:00:22.880 --> 00:00:28.519
is going to absolutely murder me

00:00:26.199 --> 00:00:34.040
so we're going to call in the cavalry here dialing A for

00:00:32.120 --> 00:00:41.000
Anthony uh Hey Anthony uh

00:00:36.800 --> 00:00:43.399
hello can you fix something for me maybe

00:00:41.000 --> 00:00:50.120
okay cool my computer's turning off and I got to

00:00:45.280 --> 00:00:53.320
go man I just got back from vacation and

00:00:50.120 --> 00:00:55.000
I'm already doing tech support well hey

00:00:53.320 --> 00:00:59.160
why not come along for the ride and I'll show you my method for troubleshooting

00:00:57.280 --> 00:01:02.760
with barely any information to go off of right after this information for our

00:01:00.399 --> 00:01:07.439
sponsor thanks to Hulkman for sponsoring this video Hulkman's Alpha 85S helps

00:01:05.560 --> 00:01:16.439
you instantly jump start your dead car battery with no aid required even in up

00:01:10.280 --> 00:01:16.439
to minus 40° F learn more at lmg.gg/

00:01:22.880 --> 00:01:28.560
Hulkman Linus tells me that he updated the BIOS in order to upgrade to Windows

00:01:26.880 --> 00:01:33.560
11 and the computer's been shutting off randomly ever since doesn't matter what

00:01:30.799 --> 00:01:37.880
he's doing even absolutely nothing let's see if we can track down the issue now

00:01:35.680 --> 00:01:42.119
the first rule of thumb when doing any kind of troubleshooting always check the

00:01:40.159 --> 00:01:49.479
most basic solutions first so we'll start with the wall power and um hm

00:01:46.560 --> 00:01:52.880
actually he's got a UPS down here and I don't think anything else has been

00:01:50.640 --> 00:01:57.680
turning off randomly like the TV still got power from what he told me so that's

00:01:55.680 --> 00:02:02.240
probably not going to be our problem that leaves us to the

00:02:00.520 --> 00:02:06.640
cables which well I'm going to have a hard time

00:02:04.479 --> 00:02:12.840
getting in under there and you're not going to be able to see that either uh

00:02:09.440 --> 00:02:15.280
because not only is it kind of dark but

00:02:12.840 --> 00:02:19.840
the TV is also covering it all up but it looks

00:02:17.680 --> 00:02:23.280
like the 8-pin 12-volt connector looks like it's okay actually there's multiple

00:02:21.959 --> 00:02:28.319
there and they both look like they're down all the way and the 24-pin ATX

00:02:26.800 --> 00:02:34.280
connector is also pretty solid there so I don't think

00:02:31.360 --> 00:02:38.480
that's our problem either now the system has a Threadripper CPU and sometimes

00:02:36.640 --> 00:02:42.840
over-torquing the retention mechanism can flatten out the socket's pins so that

00:02:40.080 --> 00:02:47.640
the CPU is not making contact which can make for a no power on but that doesn't

00:02:46.120 --> 00:02:51.800
match the description of the issue where it's happening even when the system is

00:02:49.519 --> 00:02:57.040
idle and besides I don't think I'm going to be able to open this thing up without

00:02:54.360 --> 00:03:00.920
some help and I'd really rather not do that for now we'll fire it up and move

00:02:59.120 --> 00:03:04.080
on maybe we'll get lucky first we'll open up a system monitoring tool like

00:03:02.879 --> 00:03:10.159
HWiNFO and see what our temperatures and voltages look like and we'll also load up Task Manager to see if any processes

00:03:08.280 --> 00:03:13.040
are running wild and check to make sure that there's nothing weird showing up in

00:03:11.920 --> 00:03:19.000
the startup items CPU usage looks okay to me right

00:03:16.319 --> 00:03:24.040
now we're at 3% with uh only a few things actually being used right now

00:03:21.640 --> 00:03:28.720
you've got OBS running and uh I guess Linus is doing some idle mining which is

00:03:26.640 --> 00:03:33.680
fine as long as you are not specifically using your GPU for that as for the

00:03:31.640 --> 00:03:40.439
temperatures they look okay they're a little high but at the same time we've

00:03:35.799 --> 00:03:42.319
got a GPU doing idle mining on a closed

00:03:40.439 --> 00:03:46.159
loop like this or it's not really a closed loop but like it's sharing the

00:03:44.280 --> 00:03:51.599
cooling between the CPU and GPU and the CPU is currently idling at around uh in

00:03:49.360 --> 00:03:56.360
the 40s 50s or 60s depending on what's happening normally depending on the CPU

00:03:54.120 --> 00:04:01.560
the enclosure and the cooling you have on your system it'll usually hover

00:03:58.760 --> 00:04:05.120
around 30 or 40 somewhere in there as it is right now I'm not concerned about the

00:04:03.480 --> 00:04:10.239
thermals at all it seems perfectly fine to me yeah that's not even all that hot

00:04:07.280 --> 00:04:13.560
the hot spot temperature's in the 60s so this cooling setup is definitely enough

00:04:12.239 --> 00:04:18.120
we'll have to find something else that might be the cause of the problem we'll keep these tools open for now so we can

00:04:16.799 --> 00:04:22.880
keep monitoring it as we use the computer that way if anything does

00:04:21.239 --> 00:04:28.479
happen we can kind of be tipped off about it before it inevitably shuts down

00:04:26.240 --> 00:04:33.000
before we move on we should look at the voltages now something to keep in mind

00:04:31.080 --> 00:04:36.639
about software reported CPU voltages is that they're not always correct they can

00:04:35.000 --> 00:04:40.960
change more quickly than they update on screen so take them with some salt it's

00:04:39.240 --> 00:04:44.120
also worth remembering that CPU core voltage will go higher or lower

00:04:42.720 --> 00:04:48.000
depending on the core clocks and what the CPU is doing thanks to load-line

00:04:46.000 --> 00:04:51.800
calibration a feature that tries to keep the CPU from under or overshooting its

00:04:49.960 --> 00:04:56.280
voltage range and crashing or potentially taking damage over time this

00:04:54.560 --> 00:05:00.280
throws a lot of people for a loop because even if you set the voltage

00:04:57.759 --> 00:05:03.800
manually in the BIOS it's not guaranteed you're going to get that voltage under

00:05:02.000 --> 00:05:09.520
all circumstances now this is an overclocked Threadripper 3000 so I

00:05:07.039 --> 00:05:14.800
can't really expect any specific voltage range but I think usually if it's in a

00:05:12.880 --> 00:05:19.840
low power state you can go from 0.6 all the way up to around 1.4 or more

00:05:17.520 --> 00:05:24.160
volts depending on the load and while nothing stands out to me right now this

00:05:22.440 --> 00:05:27.919
is one of the potential culprits for a random blue screen or power off so it's

00:05:25.960 --> 00:05:31.680
worth keeping an eye on next it's time to see what Windows recorded last time

00:05:29.840 --> 00:05:35.080
the PC unexpectedly powered off using Event Viewer this is built into Windows

00:05:33.840 --> 00:05:39.120
just right click on the Start button and go up to Event Viewer here or you

00:05:37.600 --> 00:05:43.400
can access it by typing this on the command prompt if you're ever really

00:05:40.720 --> 00:05:47.240
stuck this particular version of Event Viewer has been around since Windows

00:05:45.000 --> 00:05:51.639
2000 and is basically unchanged even in Windows 11 we'll head over to Windows

00:05:49.039 --> 00:05:55.600
logs and then system now by default this is sorted by date and time so we'll

00:05:54.120 --> 00:05:59.199
scroll down to the last time we know what happened and see what Windows sees

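NOTE
A minimal sketch, assuming Python on a Windows machine: the System log opened above can also be queried from a script with the built-in wevtutil tool.
The function name build_wevtutil_query and its defaults are hypothetical, chosen for illustration; the wevtutil options used (qe, /q, /c, /rd, /f) are the tool's own.
```python
import subprocess
import sys
def build_wevtutil_query(log="System", max_events=20):
    """Build a wevtutil command that lists the newest Critical and Error events."""
    xpath = "*[System[(Level=1 or Level=2)]]"  # Level 1 = Critical, Level 2 = Error
    # /rd:true reads newest first, /c caps the count, /f:text gives readable output
    return ["wevtutil", "qe", log, f"/q:{xpath}", f"/c:{max_events}", "/rd:true", "/f:text"]
if __name__ == "__main__" and sys.platform == "win32":
    # Only meaningful on Windows; prints the matching events as plain text.
    result = subprocess.run(build_wevtutil_query(), capture_output=True, text=True)
    print(result.stdout)
```
Filtering to Critical and Error is only one possible choice; widening the query to Level=3 would include warnings as well.
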
00:05:57.560 --> 00:06:05.440
now I actually already see some things here uh so that's interesting but you can

00:06:03.039 --> 00:06:09.560
actually tell when Windows was shut down by a big gap in time followed by three event

00:06:07.319 --> 00:06:13.120
log items when the service is restarted and if there was a blue screen you'll

00:06:11.160 --> 00:06:16.880
see some bug check items labeled as critical as well you can use these items

00:06:15.319 --> 00:06:20.759
to see the blue screen's error code which you can then use to narrow things down

00:06:18.319 --> 00:06:24.560
via Google this PC isn't blue screening so we don't have any bug checks well let

00:06:23.280 --> 00:06:30.080
us see what else is happening at the time it went down and it looks like we have

00:06:26.800 --> 00:06:32.880
a bunch of ACPI events here

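NOTE
A rough sketch of the "big gap in time" check described above, assuming the event timestamps have already been exported into Python datetime objects.
The function name find_gaps, the 10 minute threshold, and the sample timestamps are all made up for illustration and are not taken from the machine in the video.
A gap only hints that the PC was off; it does not say whether the shutdown was expected.
```python
from datetime import datetime, timedelta
def find_gaps(timestamps, min_gap=timedelta(minutes=10)):
    """Return (last_event_before, first_event_after) pairs separated by at least min_gap."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a >= min_gap]
# Hypothetical sample data: two events, a long silence, then two more events.
events = [
    datetime(2022, 1, 10, 9, 0),
    datetime(2022, 1, 10, 9, 2),
    datetime(2022, 1, 10, 11, 30),
    datetime(2022, 1, 10, 11, 31),
]
for before, after in find_gaps(events):
    print(f"possible shutdown: last event {before}, next event {after}")
```
The events right after each gap are the ones worth reading first, since that is where the restart entries and any bug check items would show up.
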
00:06:30.080 --> 00:06:36.319
these are related to power management and that's interesting remember what I

00:06:34.280 --> 00:06:40.759
said about voltages earlier we might have just found some evidence for it and

00:06:38.880 --> 00:06:45.680
by the way you'll notice that there are a lot of items in the event log just in

00:06:42.919 --> 00:06:49.639
general so how do I know that what I'm looking at is what I'm supposed to look

00:06:47.400 --> 00:06:55.080
at well informational stuff like these things here are usually not that useful

00:06:53.800 --> 00:07:00.280
unless you're trying to track down something very specific and if you've been fixing Windows PCs for long enough

00:06:58.759 --> 00:07:06.599
you'll know that some of these uh DistributedCOM things are usually just

00:07:04.400 --> 00:07:10.440
there on a fresh install and they're entirely benign if you do want to know

00:07:09.000 --> 00:07:15.400
what exactly these are referring to though you can copy the string of letters and numbers and search it online

00:07:13.919 --> 00:07:22.840
or if you're feeling adventurous in the registry this particular one is for

00:07:18.840 --> 00:07:24.800
immersive shell not exactly exciting or

00:07:22.840 --> 00:07:28.280
relevant to what we're doing but knowing how to do this can be useful for

00:07:26.639 --> 00:07:32.360
tracking down misbehaving drivers and services and just knowing about it will

00:07:30.960 --> 00:07:37.599
help you if a scammer ever tries to convince you Windows is broken by showing you these errors it's a pretty

00:07:35.400 --> 00:07:41.680
common trap coming back to our main issue we've got a candidate for the

00:07:39.560 --> 00:07:45.639
problem but we should also check the application log to see if anything weird's

00:07:43.479 --> 00:07:50.080
going on there this is where things like app crashes go and while these don't

00:07:47.720 --> 00:07:53.599
usually coincide with Windows issues if something specific keeps happening at

00:07:51.800 --> 00:07:59.319
the same time as a recurring problem it could be related now it doesn't look

00:07:57.120 --> 00:08:04.199
like there's any patterns showing up here so we can rule this out for

00:08:01.280 --> 00:08:08.080
now the security logs aren't usually relevant on single user PCs but they can

00:08:06.400 --> 00:08:12.440
show you when you've logged in when you've used stored credentials and so on

00:08:10.680 --> 00:08:17.520
useful if you suspect you've been hacked or if there's a problem with Windows log

00:08:14.039 --> 00:08:19.440
on but it's not what we're after setup

00:08:17.520 --> 00:08:22.840
logs on the other hand are related to Windows updates and a failed or broken Windows

00:08:21.560 --> 00:08:27.520
update can sometimes do what we're seeing albeit usually with a reboot

00:08:24.919 --> 00:08:30.800
instead of a shutdown as expected most of this is unremarkable information

00:08:29.080 --> 00:08:35.440
about which updates have been installed when but a broken update or Windows

00:08:33.279 --> 00:08:39.159
update service will show errors here for now our best lead is the power

00:08:36.919 --> 00:08:43.640
management situation by default modern CPUs will enter a low power state when

00:08:41.240 --> 00:08:47.440
they're not in use called C-states not to be confused with S-states like

00:08:45.480 --> 00:08:50.680
sleep or hibernation these are like the performance oriented P-states you might

00:08:49.200 --> 00:08:55.200
already be familiar with in the form of Turbo Boost where the CPU will ramp up

00:08:52.800 --> 00:08:59.600
its performance in response to load C-states are just in the opposite

00:08:56.519 --> 00:09:01.480
direction now depending on the state the

00:08:59.600 --> 00:09:06.640
system might turn off some parts of the CPU or Park inactive cores to

00:09:04.000 --> 00:09:12.000
significantly reduce power consumption and the voltage supplied to the cores

00:09:09.480 --> 00:09:16.720
and therefore the CPU you can see where I'm going with this right but this is a

00:09:14.279 --> 00:09:21.920
desktop PC so idle power consumption isn't a major concern therefore let's

00:09:19.399 --> 00:09:26.519
try turning off C states to do that we'll restart into the BIOS and to do

00:09:24.480 --> 00:09:32.200
that first I need to stop all of these logging programs and restart the system

00:09:30.760 --> 00:09:35.040
oh I probably shouldn't have done that while Linus was still like doing stuff

00:09:33.959 --> 00:09:40.120
but oh well this is how you get into the BIOS

00:09:38.000 --> 00:09:43.600
usually sometimes you can get into it with F2 and you can get into it through

00:09:41.959 --> 00:09:48.240
Windows as well if you hold shift while clicking on restart then using the

00:09:45.360 --> 00:09:55.120
advanced options uh function to select UEFI firmware settings but uh we didn't

00:09:51.480 --> 00:09:55.120
do that and so I have to mash

00:09:55.720 --> 00:10:02.399
delete from here we're going to look for C-states now some BIOSes like ASUS's

00:10:00.800 --> 00:10:07.000
let you use the search function which we will use but sometimes you'll find it

00:10:04.720 --> 00:10:12.560
in uh like CPU configuration under Advanced or uh sometimes you'll find it

00:10:09.399 --> 00:10:14.560
in like power management configuration

00:10:12.560 --> 00:10:19.800
uh for now we'll just hit F9 to do search and do

00:10:17.720 --> 00:10:24.120
C-state and it is enabled sometimes you'll see two of them one for Global

00:10:21.760 --> 00:10:28.440
C-state control and one for C1E or enhanced halt uh if we had that here we

00:10:26.560 --> 00:10:34.440
would disable it as well for now let's go ahead and disable C-states hit uh F10

00:10:32.000 --> 00:10:40.760
doesn't work from here okay so F10 to save and reset that's pretty normal

00:10:37.800 --> 00:10:45.720
and okay if F10 doesn't work on your BIOS for Save and reset you can just go

00:10:42.399 --> 00:10:48.120
to the um exit screen and it'll ask you

00:10:45.720 --> 00:10:53.680
if you want to save usually now that we've

00:10:49.040 --> 00:10:53.680
rebooted how can we know that this

00:10:54.720 --> 00:11:01.760
worked we can't we can't really test for a random power off event so we'll just

00:10:59.519 --> 00:11:06.399
need to wait I'll let it go for a few days and then we'll come back to it now

00:11:04.079 --> 00:11:10.800
thankfully I've got my water bottle from lttstore.com to keep my drink cold till

00:11:08.760 --> 00:11:14.320
then I'll just add a little bit of ice every now and then all right let's take

00:11:12.720 --> 00:11:21.120
a look it's still powered on that's a good sign let's see the event

00:11:18.399 --> 00:11:26.480
log pixel refresher I do not care about you I want to see Windows I don't see

00:11:23.639 --> 00:11:33.440
any of those ACPI events that's a good sign that's

00:11:30.440 --> 00:11:35.120
really good I think we're good this was

00:11:33.440 --> 00:11:38.639
happening multiple times per day before and it's been a few days now and it's

00:11:36.680 --> 00:11:43.639
still chugging along I think it's safe to call this one fixed as for why the

00:11:41.880 --> 00:11:48.360
problem started happening in the first place well power management has always

00:11:46.360 --> 00:11:52.839
been and always will be a little bit of black magic a little tweak to how power

00:11:51.000 --> 00:11:56.320
management works in a BIOS update silicon degradation or even just the

00:11:54.440 --> 00:11:59.959
silicon lottery can affect how stable power management features will be for a

00:11:57.880 --> 00:12:03.600
given system speaking of which get subscribed because we finally have our

00:12:02.240 --> 00:12:08.279
last guide you'll ever need for how to build a PC coming up and if you liked

00:12:06.200 --> 00:12:11.639
this video you'll love that one it's the it's the longest video we've ever done

00:12:10.519 --> 00:12:16.639
except for the uog back to power management though it's

00:12:14.680 --> 00:12:20.920
usually fine for laptops because of how tightly integrated the CPU and the logic

00:12:18.320 --> 00:12:24.720
board are but not so much for desktops with modular components and longer

00:12:22.519 --> 00:12:28.120
traces many power management features are in fact disabled by default on a

00:12:26.560 --> 00:12:32.560
desktop particularly anything to do with PCI Express in my experience it's almost

00:12:30.920 --> 00:12:38.120
never a good idea to enable these on a desktop unless you're doing it specifically to see if it does work for

00:12:35.120 --> 00:12:40.160
your Hardware C states are usually okay

00:12:38.120 --> 00:12:45.399
but there's a reason you'll see many threads about them on forums going back

00:12:42.399 --> 00:12:48.320
many many years in this

00:12:45.399 --> 00:12:52.880
case whatever it was it seems like it was just dumb luck but it's fixed now

00:12:50.959 --> 00:12:57.040
and hopefully you'll come away from this video not only knowing that C-states can

00:12:54.519 --> 00:13:01.760
be flaky but also a little more about how to dig in and find out what a

00:12:58.920 --> 00:13:06.760
problem might actually be and Linus will come back from vacation and I'll still

00:13:03.959 --> 00:13:09.880
have a job and hey you'll also know a little bit more about our sponsor

00:13:08.000 --> 00:13:13.600
Pulseway thanks to Pulseway for sponsoring today's video Pulseway lets

00:13:11.880 --> 00:13:17.399
you centrally manage all your desktops servers network devices and cloud

00:13:15.240 --> 00:13:20.760
infrastructure in one place you'll be the first person to know when a user has

00:13:18.959 --> 00:13:24.680
an issue or when there's a problem with your IT environment you'll have out of

00:13:22.839 --> 00:13:28.760
the box commands to take action such as killing processes resetting user

00:13:26.639 --> 00:13:33.040
passwords running PowerShell commands backing up files and even remote control

00:13:31.320 --> 00:13:37.160
with powerful auto remediation tools Pulseway can automatically resolve

00:13:34.680 --> 00:13:40.720
critical IT failures like low disk space high CPU and even restart your services

00:13:39.560 --> 00:13:45.800
the patching engine will prevent vulnerabilities by checking for updates for both your operating system and

00:13:43.800 --> 00:13:49.839
third-party applications and run those updates for you on a schedule that you

00:13:47.519 --> 00:13:53.519
define and the best part is that you can do all of that from the mobile app or

00:13:51.560 --> 00:13:57.240
from the desktop try it for free today at pulseway.com or through our link

00:13:55.399 --> 00:14:01.560
below thanks for watching guys this one's a bit different so why not check

00:13:59.199 --> 00:14:04.959
out Linus's new rack it's also a bit different and a whole lot of fun
