Microwave Intensifies
This reminds me of the tale of the coder tasked to write an input validator for IPv4 addresses. Poor bastard.
Another fun one: 0177.042.017.066
PSA: Don't zero-pad your IPv4 octets. Decimal is for simpletons.
Yes. 127.0.0.0/8 is reserved IPv4 address space for Loopback. It is perfectly valid, and occasionally useful, to use other loopback addresses that are functionally identical, like 127.0.1.1 or 127.0.0.53, which carry semantic information for the initiated, like "53? Must be DNS-related, obviously!"
That's just science as applied by engineers.
You can fix this, and this certainly isn't "basic".
I figure you didn't see your Linux option, because your EFI boot variables (boot order and boot loader locations) are stored "on the mainboard", and you didn't manage to reset those on the new board according to your current layout. Windows still boots, as it installs its bootloader in a generic location intended for removable devices, instead of properly registering itself with EFI boot variables. Because of course they do.
To fix this, I'd recommend a deep breath first, and then set the BIOS to UEFI boot only, no CSM/legacy at all for now, not even as fallback. If you're lucky, you can boot into your Linux system from the BIOS boot menu right now, and skip the archiso boot and chroot shenanigans. Have a look. Otherwise boot into the archiso as you did before.
Identify your EFI system partition(s) (ESP)
Run sfdisk -l /dev/nvme0n1 and sfdisk -l /dev/nvme1n1, and note the "EFI System" type partitions. Ideally, there's only one, at nvme1n1p3. Multiple ESPs would be trickier, but let's assume your singular ESP is nvme1n1p3.
Have a look at the ESP, to understand its layout and confirm this is really what you're looking for: mkdir /esp, mount /dev/nvme1n1p3 /esp, find /esp.
You should find the Windows bootloader at EFI/Boot/bootx64.efi and EFI/Microsoft/Boot/bootmgfw.efi.
You might also find your grub bootloader in a subdirectory like EFI/arch/grubx64.efi. Find all of your instances with find /esp -iname grubx64.efi, and note the paths. If you find multiple grubx64.efi, I'd recommend picking only the newest file, unless you know for a fact which one is supposed to be the one you want to use. An ls -l <file> gives you the date of the file to check. If you don't have any grub bootloader installed yet, that's fine, too.
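If you want the date stamps for every candidate in one go, a little one-liner should do (assuming the ESP is still mounted at /esp as above):

find /esp -iname 'grubx64.efi' -exec ls -l {} +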
Create a boot entry with grub in a chrooted system
If you can arch-chroot into your Linux system, make sure your ESP is mounted in your chroot as well, let's say at /esp again, and that your /boot directory is mounted, too, otherwise grub-install will fail. Then grub-install --efi-directory=/esp should do its magic just fine.
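For reference, a sketch of the whole mount-and-chroot dance from the archiso, assuming the layout we've been discussing (btrfs root subvolume root on nvme0n1p2, boot partition nvme0n1p1, ESP nvme1n1p3); the bootloader id arch is my pick, not gospel:

# from the archiso -- partitions and paths are assumptions from this thread, adjust to yours
mount -o subvol=root /dev/nvme0n1p2 /mnt    # root subvolume
mount /dev/nvme0n1p1 /mnt/boot              # boot partition
mkdir -p /mnt/esp
mount /dev/nvme1n1p3 /mnt/esp               # ESP
arch-chroot /mnt
# now inside the chroot:
grub-install --efi-directory=/esp --bootloader-id=arch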
Use efibootmgr to display/edit the EFI boot variables, and check if the entries in there look correct. The ESP will be referenced by UUID, and the list will look pretty busy, but you should recognize the EFI paths, and your new arch entry should be at the top of the BootOrder.
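If an entry is missing, stale, or the order is off, efibootmgr can also fix that directly. A hedged sketch; the entry numbers are made up, check your own listing first:

efibootmgr                  # list entries, BootOrder and BootCurrent
efibootmgr -v               # verbose: shows EFI paths and partition UUIDs
efibootmgr -o 0003,0000     # put entry 0003 (e.g. your arch entry) first in BootOrder
efibootmgr -b 0002 -B       # delete a stale entry 0002, if you're sure it's dead weight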
Make sure you've got a /boot/grub/grub.cfg in place, otherwise grub won't do you much good! Create one with grub-mkconfig -o /boot/grub/grub.cfg from within your chroot, after editing /etc/default/grub if you need to add any kernel arguments for your system. Usually you do not, so fire away.
If chrooting doesn't work for you (btrfs can be a little tricky), you should be able to install grub with only the ESP and boot partition mounted in the archiso. nvme0n1p1 looks like your boot partition, so it'd go down like this:
mkdir /esp /linuxboot
mount /dev/nvme1n1p3 /esp
mount /dev/nvme0n1p1 /linuxboot
grub-install --efi-directory=/esp --boot-directory=/linuxboot
You can pacman -S grub if grub-install isn't available on the archiso yet.
Make sure you've got a /linuxboot/grub/grub.cfg in place here as well. Unfortunately you cannot use grub-mkconfig effectively without the chroot, but if you are at this point, and you're dropped into the grub rescue shell, you can try a minimal, lovingly handcrafted grub.cfg:
You need to obtain the UUIDs of your ESP, boot and root partition for the menuentries, and replace the placeholders with your values. You can get those values with lsblk -oNAME,UUID /dev/nvme1n1p3 /dev/nvme0n1p1 /dev/nvme0n1p2, in this order (ESP, BOOTPART, ROOTPART UUID). I assume your root subvolume is named root, and your kernel is named vmlinuz-linux on the boot partition, with an initramfs-linux.img initramfs. You should adapt these filenames in the grub.cfg if they are different, of course, but I think this is a pretty good guess. :)
insmod part_gpt
insmod part_msdos

set default="0"

if [ x"${feature_menuentry_id}" = xy ]; then
  menuentry_id_option="--id"
else
  menuentry_id_option=""
fi
export menuentry_id_option

function load_video {
  if [ x$feature_all_video_module = xy ]; then
    insmod all_video
  else
    insmod efi_gop
    insmod efi_uga
    insmod ieee1275_fb
    insmod vbe
    insmod vga
    insmod video_bochs
    insmod video_cirrus
  fi
}

terminal_input console
terminal_output console

set timeout=5

menuentry 'Arch Linux' --class arch --class gnu-linux --class gnu --class os $menuentry_id_option 'gnulinux-simple' {
  load_video
  set gfxpayload=keep
  insmod gzio
  insmod part_gpt
  insmod ext2
  search --no-floppy --fs-uuid --set=root <BOOTPART UUID>
  echo 'Loading Linux linux ...'
  linux /vmlinuz-linux root=UUID=<ROOTPART UUID> rw rootflags=subvol=root
  echo 'Loading initial ramdisk ...'
  initrd /initramfs-linux.img
}

if [ "$grub_platform" = "efi" ]; then
  insmod bli
fi

if [ "$grub_platform" = "efi" ]; then
  menuentry 'Windows Boot Manager' --class windows --class os $menuentry_id_option 'osprober-efi' {
    insmod part_gpt
    insmod fat
    search --no-floppy --fs-uuid --set=root <ESP UUID>
    chainloader /EFI/Microsoft/Boot/bootmgfw.efi
  }
fi
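If you'd rather not hand-edit the placeholders, you could let sed do it; the UUIDs below are obviously made up, use whatever lsblk printed for your partitions:

sed -i \
  -e 's/<ESP UUID>/ABCD-1234/' \
  -e 's/<BOOTPART UUID>/11111111-2222-3333-4444-555555555555/' \
  -e 's/<ROOTPART UUID>/66666666-7777-8888-9999-000000000000/' \
  /linuxboot/grub/grub.cfg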
Let's see how this goes.
My version of a for loop
Wow.. I.. I did not expect such density of triggers in a single panel. Trolling truly is a art. The more you look, the worse it gets. I love it.
They were holding it wrong, obviously.
What AI-generated, non-working, obviously incorrect garbage is this? Also, you want to define this as an alias to type the command 33% faster, too!
alias fc='ffmpeg -c copy -map 0:0 -f data - 2>/dev/null -i '
Amateurs.
In some retirement homes, we hear feeble cries for justice, lamenting "source tarballs are even cross-platform, just build it yourself already, as intended", but nobody received that suggestion from their AI assistant, just a list of packaging services you should subscribe to instead.
NaCl (Technical Grade)
s/s/iss/g
This is the mostestest blursed thing I am going to see today. True art.
"I'm not alone here. Gotta setup a k-line to this surfer's cyberjack and ice his codes before he cracks my firewalls!"
I'm in. $10 on "this reported kernel panic is not resolved by any change to which nvidia kernel driver is loaded, patched or not, or how anything pertaining to nvidia is configured".
nvidia is at fault for many issues, agreed, but not this one.
You could increase verbosity, and try working your way up from booting a bare minimum, to see when the system hangs, and whether it persistently hangs at the same time, in the same way.
My usual go-to is to add debug apic=debug init=/bin/sh vga=0 nomodeset acpi=off to the kernel boot arguments and see if I consistently drop into the bare initramfs shell that way, without switching to any framebuffer graphics mode, while also avoiding potential ACPI breakage that may manifest as early boot freezes. Yes, vga=0 is legacy BIOS only, feel free to skip that one if you're booting UEFI. This is not likely to avoid your problem, anyway.
If that works, remove the arguments, from the right, one after another, to re-enable ACPI, then KMS, then the automatic framebuffer console setting. If you're still going, change init=/bin/sh to emergency, then to rescue, then remove it to boot normally, always with excessive debug output. At that point, boot should freeze again, as you've only increased verbosity. The messages leading up to the freeze should always give a hint as to what subsystem might be worth looking into further - be it a specific module that freezes, which can subsequently be blacklisted by kernel parameter, for example. Let the system tell you its woes before stabbing at its parts randomly.
This does not assume you have a software fault. This procedure uses the kernel init and the following boot process as diagnostics, in a way. Unfortunately, it is pretty easy to miss output that is "out of the ordinary" if you're not used to how a correct boot is supposed to look, but the info you need is typically there. I typically try this before unplugging all optional hardware, but both approaches go hand in hand, really. I've found that in modern, highly integrated systems, there's just not that much available to unplug anymore that would make a difference at boot time, but the idea is still sound.
If this becomes involved, you might want to look into using netconsole to send the kernel messages somewhere else to grab with netcat, and store them in a plain text file to post here for further assistance. You might just get a good hint when reading the debug kernel messages yourself already, though!
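A minimal netconsole setup might look roughly like this; the addresses, interface name and MAC are placeholders for your crashing machine (192.168.1.20 on enp4s0) and the receiving box (192.168.1.10):

# on the crashing machine, appended to the kernel command line:
netconsole=6665@192.168.1.20/enp4s0,6666@192.168.1.10/aa:bb:cc:dd:ee:ff
# on the receiving machine (some netcat flavors want -p before the port):
nc -u -l 6666 | tee kernel-netconsole.log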
EDIT: If those two colorful, pixely dotted lines in the lower half of your literal screen shot happen to flicker into view during boot somewhat consistently right before freezing, my gut feeling says it's likely a graphics-related issue. You might want to short-circuit your tests by trying only debug nomodeset, a more brutal debug nomodeset module_blacklist=amdgpu,radeon, or replacing your GPU with a known good model, as suggested.
Do I need to run the machine for longer or should it have crashed right away according to your hypothesis?
Sorry for muddying the waters with my verbosity. It should not crash anymore. I believe your kernel panic was caused when an idle CPU 6 was sent to sleep. Disabling C-states, or limiting them to C0 or C1, prevents your CPUs from going into (deep) sleep. Thus, by disabling or limiting c-states, a kernel panic should not happen anymore.
I haven't found a way to explicitly put a core into a specific c-state of your choosing, so the best I can recommend for now is to keep your c-states disabled or limited to C1, and just use your computer normally. If this kernel panic shows up again, and you're sure your c-state setting was effective, then I would consider my c-state hypothesis falsified.
If, however, your system runs normally for a few days, or "long enough for you to feel good about it" with disabled c-states, that would be a strong indication for having some kind of issue when entering deeper sleep modes. You may then try increasing the c-state limit again until your system becomes unstable. Then you know at least a workaround at the cost of some loss of power savings, and you can try to find specific issues with your CPU or mainboard concerning the faulty sleep mode on Linux.
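To double-check that your c-state setting actually took effect, sysfs tells you which cpuidle driver is active and which states are currently disabled; a quick sketch:

cat /sys/devices/system/cpu/cpuidle/current_driver
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/name       # names of the available idle states
grep . /sys/devices/system/cpu/cpu0/cpuidle/state*/disable    # 1 = disabled, 0 = enabled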
Best of luck!
No, it is not. There is an issue with the installed GPU not being supported by the initializing driver, but this is entirely irrelevant for the reported fault and panic happening more than 1600 seconds later.
Or would you argue the NIC is 100% the issue, because r8169 0000:04:00.0 enp4s0: Link is Down is literally right in the logs?
Screen freezes should also leave traces in your syslog, if they're caused by any panic or GPU driver issue. You might want to check if your system is still accessible via SSH, if only the screen froze, and try killing X from there, if switching to text VTs doesn't work. SysRq might become helpful, too.
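If you want SysRq available as a last resort for the next freeze, it has to be enabled beforehand; a sketch (the sysctl.d file name is just my pick):

sysctl kernel.sysrq=1                                  # enable all magic SysRq functions now
echo 'kernel.sysrq=1' > /etc/sysctl.d/90-sysrq.conf    # keep it enabled across reboots

Then Alt+SysRq+r,e,i,s,u,b gives you a sync-and-reboot that's a fair bit gentler than holding the power button.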
screen froze, and I was forced to reboot the PC by pressing the power button for 3s
seems like some data was saved, while other files were discarded
I would not worry too much about a somehow "forgetful" file system immediately after a hard power cycle. This is exactly what happens if data could not be flushed to disk. Thanks to journaling, your FS does not get corrupted, but data lingering in caches is still lost and discarded on fsck, to retain a consistent fs. I would recommend repeating the installations you did before the crash, and maybe shove a manual sync behind it, to make sure you don't encounter totally weird "bugs" with man later, when you don't remember this as a cause anymore. Your bash history is saved to file on clean shell exit only, and is generally a bit non-intuitive, especially with multiple interactive shells in parallel, so I would personally disregard the old .bash_history file as "not a fault, only confusing" and let that rest, too.
Starting a long SMART self-test and keeping a keen eye on the drive's error logs (smartctl -l error <drive>), or better yet, all available SMART info (-x), to see if anything seems fishy with your drive is a good idea, anyway. Keep in mind that your mainboard / drive controller or its connection may just as well be (intermittently) faulty. In ye olden times, a defective disk cable or socket was messing up my system once or twice. You will see particular faults in your syslog, though - this is not invisible. You don't get a kernel panic from a failing disk without some sprinkling of I/O errors as well. If your drive is SMART-OK, but you clearly get disk I/O errors, it's time to inspect and clean the SSD socket and contacts and re-seat once more. If you never saw any disk I/O errors, and your disk's logs are clean, I'd consider the SSD as not an issue.
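Concretely, the smartctl invocations I mean would be something along these lines; replace <drive> with your device:

smartctl -t long <drive>        # start the long self-test in the background
smartctl -l selftest <drive>    # check the self-test results once it's done
smartctl -l error <drive>       # the drive's own error log
smartctl -x <drive>             # everything smartctl can report, including the above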
If you encounter random kernel panics, random as in "in different and unrelated call stacks that do not make sense in any other way", I agree that RAM is a likely culprit, or an electrical fault somewhere on the mainboard. It's rare, but it happens. If you can, replace (only) the mainboard, or better yet, take a working PC with compatible parts, and replace its working mainboard with your suspected broken one to see if the previously working machine now faults. "Carrying the fault with you" is easier/quicker than proving an intermittent fault gone.
Unless you get different kernel panics, my money's still on your c-states handling. I'd prefer the lowest level you can find to inhibit your CPUs from going to sleep, i.e. BIOS > kernel boot args > sysctl > cpupower, to keep the stack thin. If that is finicky somehow, you could alternatively boot with a single CPU and leave the rest disabled (bootarg nosmp). The point is just to find out where to focus your attention, not to keep this as a long-term workaround.
To keep N CPUs running, I usually just background N infinite loops in bash:
$ cpus=4; for i in $(seq 1 $cpus); do { while true; do true; done; } & done
[1] 7185
[2] 7186
[3] 7187
[4] 7188
In your case you might change that to:
cpus=4; for i in $(seq 0 $((cpus - 1))); do { taskset -c $i bash -c 'while true; do sleep 1; done'; } & done
This just kicks each CPU every second; it does not have to be stressed. The taskset will bind each loop to one CPU, to prevent the system from cleverly distributing the tiny load. This could also become a terrible, terrible workaround to keep running if all else fails. :)
Looking at the call trace:
[ 1641.073507] RIP: 0010:rb_erase+0x199/0x3b0
...
[ 1641.073601] Call Trace:
[ 1641.073608]  <TASK>
[ 1641.073615]  timerqueue_del+0x2e/0x50
[ 1641.073632]  tmigr_update_events+0x1b5/0x340
[ 1641.073650]  tmigr_inactive_up+0x84/0x120
[ 1641.073663]  tmigr_cpu_deactivate+0xc2/0x190
[ 1641.073680]  __get_next_timer_interrupt+0x1c2/0x2e0
[ 1641.073698]  tick_nohz_stop_tick+0x5f/0x230
[ 1641.073714]  tick_nohz_idle_stop_tick+0x70/0xd0
[ 1641.073728]  do_idle+0x19f/0x210
[ 1641.073745]  cpu_startup_entry+0x29/0x30
[ 1641.073757]  start_secondary+0x11e/0x140
[ 1641.073768]  common_startup_64+0x13e/0x141
[ 1641.073794]  </TASK>
What's happening here leading up to the panic is start_secondary followed by cpu_startup_entry, eventually ending up in CPU idle time management (tmigr), giving a context of "waking up/sleeping an idle CPU". I've had a few systems in my life where somewhat aggressive power-saving settings in the BIOS were not cleanly communicated to Linux, so to say, causing such issues.
This area is notorious for being subtly borked, but you can test this hypothesis easily by either disabling a setting akin to "Global C States" in your BIOS, which effectively disables power-saving for your CPUs, or trying an equivalent setting via the kernel arguments processor.max_cstate=1 intel_idle.max_cstate=0, or even cpuidle.off=1.
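If you go the kernel argument route and you're booting via grub, a sketch of where they'd go before regenerating the config:

# /etc/default/grub -- append to whatever is already between the quotes
GRUB_CMDLINE_LINUX_DEFAULT="... processor.max_cstate=1 intel_idle.max_cstate=0"
# then regenerate the config and reboot
grub-mkconfig -o /boot/grub/grub.cfg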
This obviously loses the power-saving capability of your CPUs, but if your system runs stable that way, you're likely in the right ballpark and can find a specific solution for that issue, possibly in a BIOS/firmware update. Here's a not too shabby gist roughly explaining what c-states are. Don't read too many of the comments, they're more confusing than enlightening.
The kernel docs I linked to above are comprehensive, and utterly indecipherable for a layperson. Instead of fumbling about in sysfs, try the cpupower tool/package to visualize the CPU idle settings, and try increasing enabled idle states until your system crashes again, to find out if a specific (deep) sleep state triggers your issue, and disable exactly that if you cannot find a bugfix/BIOS update.
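With the cpupower package installed, the idle-state juggling looks roughly like this; the state numbers are examples, check idle-info first:

cpupower idle-info          # list the available idle states and whether they're enabled
cpupower idle-set -D 0      # disable every state with wakeup latency above 0, i.e. everything but polling
cpupower idle-set -e 1      # re-enable state 1, then 2, and so on, until it breaks again
cpupower idle-set -d 3      # or disable just the one state that turns out to be the culprit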
If this is your problem, to reproduce the panic, try leaving your system as idle as possible after bootup. If a panic happens regularly that way, try starting processes exercising all your CPUs - if the hypothesis holds, this should not panic at any time, as no CPU is ever idle.
I still have a soft spot for troll physics. Needs more magnets, though.