
  • It's not that clear-cut a problem. There seem to be two elements: the kernel driver had a memory-safety bug, and a definitions file was deployed incorrectly, triggering that bug. The kernel driver definitely deserves a lot of scrutiny, and static analysis should have told them this bug existed. The live updates are a bit different, since this is a real-time response system. If malware starts actively exploiting a software vulnerability, they can't wait for distribution maintainers to package the mitigation; it has to be deployed ASAP. They certainly should roll out definitions progressively and monitor for anything anomalous, but it has to be quick or the malware could beat them to it.

    This is more a code-safety issue than a CI/CD one. The bug was in the driver all along, but it had never been triggered before, so it passed the tests and got rolled out to everyone. Critical code like this ought to be written in a memory-safe language like Rust.
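
    As a rough illustration of the difference, here's a minimal Rust sketch of parsing a definitions record with bounds-checked access. The record layout (a length-prefixed payload) and field names are assumptions for illustration, not CrowdStrike's actual format; the point is that a corrupt file becomes an error value rather than an out-of-bounds read.

    ```rust
    // Hypothetical definitions record: [u32 id][u32 payload_len][payload bytes].
    // The layout is invented for illustration only.
    fn parse_record(buf: &[u8]) -> Result<(u32, &[u8]), String> {
        // get() returns None instead of reading out of bounds, so a short or
        // corrupt file becomes an Err for the caller to handle, not a page fault.
        let id = buf.get(0..4).ok_or("truncated id")?;
        let len = buf.get(4..8).ok_or("truncated length")?;
        let id = u32::from_le_bytes(id.try_into().unwrap());
        let len = u32::from_le_bytes(len.try_into().unwrap()) as usize;
        let payload = buf.get(8..8 + len).ok_or("payload shorter than declared")?;
        Ok((id, payload))
    }

    fn main() {
        // A malformed file that declares 100 payload bytes but provides none
        // yields an Err instead of an out-of-bounds read.
        let bad = [1u8, 0, 0, 0, 100, 0, 0, 0];
        println!("{:?}", parse_record(&bad));
    }
    ```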

  • The Deep

    Jump
  • I'd unsubscribe from !linux@lemmy.ml for a start.

    I'm pretty sure this update didn't get pushed to Linux endpoints, but sure, Linux machines running the CrowdStrike driver are probably vulnerable to panicking on malformed config files. There are a lot of weirdos claiming this is a uniquely Windows issue.

  • IFERROR(;0)

    Maybe they should use a more appropriate development tool for their critical security platform than Excel.

  • This error isn't an intentional crash in response to a security risk, though that can happen. It's a null pointer dereference, and there were no static or runtime checks in place that could have prevented it or handled it more gracefully. This was presumably a bug in the driver for a long time; then a faulty config file came along and triggered the crashes. Better static analysis and testing of the kernel driver is one aspect; how these live config updates are deployed and monitored is another.

  • You can still catch the error at runtime and do something appropriate. That might be to say this update might have been tampered with and refuse to boot, but more likely it'd be to send an error report back to the developers that an unexpected condition is being hit and continue without loading that one faulty definition file.
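
    A minimal sketch of that skip-and-report behaviour, written as a user-space loop for illustration; the directory name, load_definition function, and error reporting are hypothetical stand-ins, not CrowdStrike's actual mechanism:

    ```rust
    use std::fs;

    // Hypothetical check of one definition file; errors on malformed content.
    fn load_definition(bytes: &[u8]) -> Result<(), String> {
        if bytes.is_empty() {
            return Err("empty definition file".into());
        }
        Ok(())
    }

    fn main() -> std::io::Result<()> {
        // Load every channel file; skip and report the bad ones instead of
        // letting one malformed file take the whole system down.
        for entry in fs::read_dir("channel_files")? {
            let path = entry?.path();
            let bytes = fs::read(&path)?;
            if let Err(e) = load_definition(&bytes) {
                // Stand-in for "send an error report back to the developers".
                eprintln!("skipping {}: {}", path.display(), e);
            }
        }
        Ok(())
    }
    ```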

  • A page fault can be what triggers a catch, but you can't unwind what a loaded module (the CrowdStrike driver) did before it crashed. It could have messed with Windows kernel internals and left them in a state that is not safe to continue from. Rather than potentially damage the system, Windows stops with a BSOD. The only alternative would be to not allow code to be loaded into the kernel at all, but that would make hardware drivers basically impossible.

  • The driver is in kernel mode. If it crashes, the kernel has no idea if any internal structures have been left in an inconsistent state. If it doesn't halt then it has the potential to cause all sorts of damage.

  • I don't think the kernel could continue like that. The driver runs in kernel mode and took a null pointer exception. The kernel can't know how badly it's been screwed by that, so the only feasible option is to BSOD.

    The driver itself is where the error handling should take place. First off, it ought to have static checks to prove it can't have trivial memory errors like this. Secondly, if a configuration file fails to load, it should make a determination about whether it's safe to continue or whether to halt the system to prevent a potential exploit. You know, instead of shitting its pants and letting Windows handle it.
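
    As a sketch of what that determination could look like, with an invented policy type (nothing here is from the actual driver):

    ```rust
    // Hypothetical policy for what to do when a definitions file fails validation.
    enum FailurePolicy {
        // Keep running with the last known-good definitions (risk: reduced protection).
        ContinueWithLastGood,
        // Refuse to continue (risk: availability, as this outage showed).
        HaltSystem,
    }

    fn on_bad_config(policy: FailurePolicy) {
        match policy {
            FailurePolicy::ContinueWithLastGood => {
                eprintln!("bad definitions file: ignoring it, keeping previous set");
            }
            FailurePolicy::HaltSystem => {
                // In a kernel driver this would be a deliberate, controlled stop,
                // not an accidental null pointer dereference.
                panic!("bad definitions file: halting by policy");
            }
        }
    }

    fn main() {
        on_bad_config(FailurePolicy::ContinueWithLastGood);
    }
    ```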

  • This doesn't really answer my question, but CrowdStrike does explain a bit here: https://www.crowdstrike.com/blog/technical-details-on-todays-outage/

    These channel files are configuration for the driver and are pushed several times a day. It seems the driver can take a page fault if certain conditions are met. A mistake in a config file triggered this condition and put a lot of machines into a BSOD bootloop.

    I think it makes sense that this was a preexisting bug in the driver which was triggered by an erroneous config. What I still don't know is whether these channel updates have a staged deployment (presumably driver updates do), and what fraction of machines that got the bad update actually had a BSOD.

    Anyway, they should rewrite it in Rust.
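
    On the staged-deployment question, here's a back-of-the-envelope sketch of how a progressive rollout gate for channel files could work. The hash-based cohorts and the 1% threshold are assumptions for illustration, not how CrowdStrike actually pushes updates:

    ```rust
    use std::collections::hash_map::DefaultHasher;
    use std::hash::{Hash, Hasher};

    // Deterministically place a machine into a rollout bucket 0..=99.
    fn rollout_bucket(machine_id: &str, update_id: &str) -> u64 {
        let mut h = DefaultHasher::new();
        (machine_id, update_id).hash(&mut h);
        h.finish() % 100
    }

    fn main() {
        // Phase 1: only machines in the first 1% of buckets get the new channel
        // file; widen the percentage only if crash telemetry stays clean.
        let rollout_percent = 1u64;
        for machine in ["host-a", "host-b", "host-c", "host-d"] {
            let update_now = rollout_bucket(machine, "channel-291") < rollout_percent;
            println!("{machine}: {}", if update_now { "update now" } else { "wait" });
        }
    }
    ```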

  • Does anyone know how these CrowdStrike updates are actually deployed? Presumably the software has its own update mechanism to react to emergent threats without waiting for Patch Tuesday. Can users control the update policy for these 'channel files' themselves?

  • The switches do suck, but they can usually be revived with contact cleaner. If you open the mouse, you can spray around the switch plunger or, better yet, pop off the top half of the switch case and spray the contact directly. That completely cleared up the double-click on my G402 and even revived an old MX510 that was missing clicks.

  • So build concentrated solar power and store the heat for after the sun sets. Bonus: thermal power plant turbines give inertia to the grid, which photovoltaics don't.

  • Because then it would be 'a;imodo not qazimodo.

  • I don't think that was a malfunction...
    That was 'working as intended'.

  • wouldn’t changing it just end up performative

    Exactly. Sidereal time does get rid of time zones and leap years, but it's still referenced to a single physical object and relies on an arbitrary choice of start point. So it doesn't create some perfect cosmic time standard.

    The international date line doesn't help since that's just 180° offset from Greenwich itself.

    The point of standards is that they can be followed by everyone. The AD/BC epoch is fine. The Greenwich meridian is fine. UTC is fine. Changing them would cause so much disruption that it cannot be worth it.

    Daylight savings can go die in a ditch though.