How I almost bricked my machine.
Aug 30, 2024
At the time of writing this, I have fixed ALMOST everything, but not everything works the way it once did.
The story begins with a simple mistake: enabling systemd-boot while grub was already installed as my bootloader. This happened because I was following a How-To guide on enabling secure boot. I was trying to configure secure boot for grub by signing the correct EFI files using sbctl, and I risked it all by accidentally entering the command: bootctl enable
In the moment, I forgot that I didn't need to run this command, because I don't use systemd-boot. This led to roughly an hour of headaches, Google searches, and a total reorganization of my desk area.
When I first rebooted my machine, I noticed that the boot menu I was so familiar with was COMPLETELY missing. I was left with a strange boot menu that I didn't recognize. All of my kernel images were missing, and I had only one option: "Boot to firmware settings". Now, I am no stranger to bootloader issues. I have had my fair share of time messing around with the bootloader on the spare servers that I had lying around; but I never, EVER tried to mess with the bootloader on my main workstation, precisely because of the situation I am currently writing about. At this point, I internally said to myself "oh sh*t, I really just ruined my whole night", and I did. The good thing was, I knew how to fix this.
"Now where did that USB go?"
For about 20 minutes, I "rearranged" my entire desk space looking for an old USB drive with a live installation of Arch Linux, which I kept just in case I needed to chroot into one of my old servers to fix bootloader issues like this one. Now, my main workstation uses Manjaro, which is based on Arch Linux but is not quite Arch Linux. Nonetheless, chroot would be able to do the job just fine.
Now, if you don't know what 'chroot' is, it is a command you should definitely learn if you plan on becoming a system administrator, or messing around with Linux machines in general. The great thing about 'chroot' is that you can boot a live USB on the broken machine, mount the installed system, and have the command switch the root directory to whatever mount point you give it, so you can work inside the installed system as if it had booted normally. In my case, the commands looked a little something like this:
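A minimal sketch of that mount sequence follows. It assumes the partition layout described in this post and is not a record of the exact commands typed that night. The `run` helper and the `DRY_RUN` switch are hypothetical additions, so the script only prints what it would do until told otherwise.

```shell
#!/bin/sh
# Sketch of the mount step from a live USB. Device names are the ones
# described in this post; confirm yours with lsblk before running anything.
ROOT_PART="${ROOT_PART:-/dev/nvme0n1p2}"   # large partition holding /
EFI_PART="${EFI_PART:-/dev/nvme0n1p1}"     # small EFI system partition

# Hypothetical safety helper: prints each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

mount_target() {
    run lsblk                          # list disks and partitions
    run mount "$ROOT_PART" /mnt        # root filesystem first
    run mount "$EFI_PART" /mnt/boot    # then the EFI partition inside it
}

mount_target
```

Once the printed plan looks right, running the script as root with DRY_RUN=0 executes the same commands for real.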
Now this might look a little confusing if you have never messed with partitions on Linux before, but I will try to explain this as simply as possible for those uninitiated in this form of technomancy.
The listed drive above (nvme0n1) is my NVMe drive, which contains all of my precious files, currently locked inside my unbootable machine. The drive has 2 partitions: 'nvme0n1p1' and 'nvme0n1p2'. The partition holding my root directory is 'nvme0n1p2', as it is the one that is roughly 931 GiB. The partition holding my EFI bootloader is the other one, by process of elimination. You can also tell that it is the EFI partition because it is 300 MiB, which is the default EFI partition size on Manjaro Linux. So what you have to do in this instance is mount the root partition first at the live boot's /mnt directory, and then mount the EFI partition at the /boot directory of the target root, in this case /mnt/boot. After all is said and done, all I have to do is the following.
And voilà, I am now able to fix my oopsies.
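As a rough sketch of that last step, assuming the `arch-chroot` wrapper that ships on the Arch live ISO is used: it sets up /proc, /sys and /dev inside the target before changing root, whereas plain `chroot` would need those bind-mounted by hand. The `TARGET` variable and the dry-run check are illustrative additions.

```shell
#!/bin/sh
# Sketch: enter the installed system once / and /boot are mounted under /mnt.
TARGET="${TARGET:-/mnt}"   # mount point of the installed system's root

enter_target() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        echo "+ arch-chroot $TARGET"   # dry run: only print the command
    else
        arch-chroot "$TARGET"          # opens a root shell inside the target
    fi
}

enter_target
```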
Finding the culprit
Now here's where all the time and headaches came in. I spent at least an hour trying to reinstall the grub bootloader, to no avail. If you had been sitting behind me, watching me fix this, you would have seen these commands over and over again on the blank tty:
Reboot after reboot after reboot, and nothing. Still that same damn screen. I had no clue that I had accidentally enabled systemd-boot until I looked at my command history and saw the culprit command: bootctl enable
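For reference, a typical UEFI reinstall of grub from inside the chroot looks roughly like the sketch below. The bootloader ID `manjaro` and the EFI directory `/boot` are assumptions based on the setup described in this post, and the `run` helper is a hypothetical dry-run guard.

```shell
#!/bin/sh
# Sketch of a GRUB reinstall, meant to be run inside the chroot.
EFI_DIR="${EFI_DIR:-/boot}"                 # where the EFI partition is mounted
BOOTLOADER_ID="${BOOTLOADER_ID:-manjaro}"   # assumption: name of the firmware boot entry

# Hypothetical safety helper: prints each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

reinstall_grub() {
    # Write GRUB's EFI binary to the EFI partition and register a boot entry.
    run grub-install --target=x86_64-efi --efi-directory="$EFI_DIR" --bootloader-id="$BOOTLOADER_ID"
    # Regenerate the menu so the installed kernels are listed.
    run grub-mkconfig -o /boot/grub/grub.cfg
}

reinstall_grub
```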
which is when I literally just put my head down on my desk, and nearly smacked myself silly.
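This post does not record exactly how systemd-boot was taken back out, so the following is only one plausible way to do it. systemd-boot is normally put in place by `bootctl install`, and `bootctl remove` is its counterpart: it deletes the systemd-boot files from the EFI partition and removes its firmware boot entry. The `run` helper is a hypothetical dry-run guard.

```shell
#!/bin/sh
# Sketch: inspect and then remove systemd-boot from inside the chroot.

# Hypothetical safety helper: prints each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

undo_systemd_boot() {
    run bootctl status   # show which boot loader is installed and active
    run bootctl remove   # remove systemd-boot files and its firmware boot entry
}

undo_systemd_boot
```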
Oh my god, is it finally over? I rebooted and was met with an entire drive missing inside of my firmware settings.
I was freaking out. "Where the hell did my drive go? I literally just reinstalled grub like 40 million times". Then it hit me: there was no drive listed because there was no bootloader. So I quickly reinstalled grub, and now everything is OK, sort of.
Aftermath
So far, the important things work. I can boot into my Linux partition with no issue, but now I can't access my other kernel images or the grub menu. That is an issue for me to solve tomorrow. At the time of writing this, it is almost 4:40 am and I have been working on this issue since 12:30 am. The worst part is that I didn't even get secure boot working yet, so that has to be fixed WITH the menu problem tomorrow. (Yes, I tried editing /etc/default/grub and remaking grub's config file, but that isn't working for some weird reason.)
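For anyone wondering what that edit usually involves, these are the two settings in /etc/default/grub that normally control whether the menu shows up. The values are illustrative rather than copied from my file, and the change only takes effect once grub's config file is regenerated.

```shell
# /etc/default/grub (excerpt): settings that normally make the menu visible.
GRUB_TIMEOUT_STYLE=menu   # show the menu instead of booting silently
GRUB_TIMEOUT=5            # seconds to wait before booting the default entry
```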
EDIT:
I got the grub menu working again. For some reason, I have a new /boot/etc/grub/grub.cfg file that has to be overwritten each time that I have a new grub configuration. I might symlink the original file /boot/grub/grub.cfg to this file, but I don't want any more headaches. If it works, I'm leaving it as is.
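The overwrite step could be scripted roughly like this, assuming the two paths named above. The `run` helper is a hypothetical dry-run guard, and copying is just one option; it avoids the symlink for now.

```shell
#!/bin/sh
# Sketch: regenerate grub's config once, then copy it over the second file.
PRIMARY_CFG="${PRIMARY_CFG:-/boot/grub/grub.cfg}"
EXTRA_CFG="${EXTRA_CFG:-/boot/etc/grub/grub.cfg}"   # path as reported in this post

# Hypothetical safety helper: prints each command unless DRY_RUN=0 is set.
run() { if [ "${DRY_RUN:-1}" = "1" ]; then echo "+ $*"; else "$@"; fi; }

sync_grub_cfg() {
    run grub-mkconfig -o "$PRIMARY_CFG"   # regenerate the usual config file
    run cp "$PRIMARY_CFG" "$EXTRA_CFG"    # overwrite the copy that grub reads
}

sync_grub_cfg
```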