I have recently built a new PC to be used as a server. For months now, I have been getting unexplained crashes, sometimes after a few minutes, sometimes after a few days, where the PC just reboots without any trace in the logs. The logs show only the normal occasional status messages and then, a few seconds later, a normal boot process.
This is slowly driving me crazy because I just can’t make out the issue. I have tried multiple different Linux installs, swapped out the SSD and PSU, and ran a RAM test, but this behaviour still persists.
Today something was different. Instead of rebooting, it showed me this blue screen, this time finally with a log. But I still can’t seem to make out the issue. Some quick internet searches turn up only very vague answers, blaming everything from software to hardware and from the PSU to the CPU.
Can any Linux wizard help me fix my problem? Link to the log
Update: I have now faced an even weirder issue. I booted up, installed cpupower like a comment suggested, installed man to look up its documentation, and then the screen froze, so I was forced to reboot the PC by pressing the power button for 3s. When I booted back up, my bash history was reset to a state from a few days back (~/.bash_history mod time from 2 days ago), even though I have rebooted several times since then and have not had any persistence errors like this before. man was also not installed anymore. Even weirder is that cpupower was still installed. So it seems like some data was saved, while other files were discarded. I will now use a second SSD and try to replicate this. I now suspect some kind of storage issue, even though the two SSDs in question have never caused issues in my laptop. This seems scary; I have never witnessed such a weirdly corrupted Linux install, ever.
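One way to probe the suspected storage issue from the update above is to write a small marker file, force it to disk, and check whether it survives the next crash or forced power-off. Below is a minimal Python sketch of that idea, not a diagnosis tool. The function names `write_marker` and `read_marker` and the marker path are made up for illustration; the only system-specific assumption is that `/proc/sys/kernel/random/boot_id` exists, which is the case on Linux.

```python
import json
import os
import time
from pathlib import Path

# Assumption: Linux exposes a per-boot ID here; falls back to "unknown" elsewhere.
BOOT_ID_FILE = Path("/proc/sys/kernel/random/boot_id")


def current_boot_id() -> str:
    """Return the kernel's ID for the current boot, or 'unknown'."""
    if BOOT_ID_FILE.exists():
        return BOOT_ID_FILE.read_text().strip()
    return "unknown"


def write_marker(path: Path) -> dict:
    """Write a timestamped marker and ask the OS to flush it to the drive."""
    record = {"written_at": time.time(), "boot_id": current_boot_id()}
    tmp = path.with_suffix(".tmp")
    with open(tmp, "w") as f:
        json.dump(record, f)
        f.flush()              # Python buffer -> kernel
        os.fsync(f.fileno())   # kernel page cache -> drive
    os.replace(tmp, path)      # atomic rename over the old marker
    # Sync the directory too, so the rename itself is persisted.
    dir_fd = os.open(path.parent, os.O_RDONLY)
    try:
        os.fsync(dir_fd)
    finally:
        os.close(dir_fd)
    return record


def read_marker(path: Path):
    """Return the stored record, or None if the marker does not exist."""
    if not path.exists():
        return None
    return json.loads(path.read_text())
```

Calling `write_marker(Path.home() / "persistence_marker.json")` before provoking a crash and `read_marker` on the same path after the reboot would show whether explicitly synced data survives. If a marker that was synced is missing, or an older one reappears, that would point toward the drive, cable, or controller rather than toward data that was simply still sitting in the write cache, though it would not prove it.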
Yes, I went too fast because I have been sitting on this for months now. Normally I would only change one thing at a time, but with this situation it can take anywhere from 5 minutes to multiple days to test one single thing. If it doesn’t crash for 48 hours, it might be because I fixed the issue, or it might just be a coincidence and it will crash in hour 49 ¯_(ツ)_/¯.
But you’re right, I will attempt it the right way when I find the time, even though it will probably take weeks 😮‍💨.
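The "fixed or just a coincidence" worry in the comment above can be put into rough numbers. The sketch below assumes crashes happen at a constant average rate (an exponential model), which may well not hold for this machine, and the 24-hour average between crashes is a made-up value chosen only for illustration; the thread itself only says "a few minutes" to "a few days".

```python
import math


def chance_of_surviving(hours: float, mean_hours_between_crashes: float) -> float:
    """Probability that a still-broken machine runs `hours` without crashing,
    assuming crashes occur at a constant average rate (exponential model)."""
    if mean_hours_between_crashes <= 0:
        raise ValueError("mean_hours_between_crashes must be positive")
    return math.exp(-hours / mean_hours_between_crashes)


def hours_needed(mean_hours_between_crashes: float, max_fluke: float = 0.05) -> float:
    """Crash-free hours to wait so that a lucky streak on a still-broken
    machine has at most `max_fluke` probability."""
    if not 0 < max_fluke < 1:
        raise ValueError("max_fluke must be between 0 and 1")
    return -mean_hours_between_crashes * math.log(max_fluke)


if __name__ == "__main__":
    # Hypothetical: one crash per 24 hours on average before the change.
    print(round(chance_of_surviving(48, 24.0), 3))  # 0.135
    print(round(hours_needed(24.0), 1))             # 71.9
```

Under those assumptions, 48 crash-free hours would still be a fluke roughly one time in seven, and about three days would be needed to get that below 5%. With a longer real average between crashes, the required wait grows in proportion, which fits the "it will probably take weeks" estimate.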
I know it sucks, but I’m glad you seem to have corrected the problem. As someone who does more harm than good with Linux systems myself, I think fixing a Linux issue without completely reinstalling the OS is impressive, and you should be proud to have accomplished such a feat!
Well, I’ve not fixed anything yet 😅. It was sadly just a hypothetical. Sorry if that wasn’t clear from the comment.
Well I’m still rooting for your success!