R, the language where dependency resolution is built upon thoughts and prayers.
Say what you want about Excel, but compatibility is kinda decent (ignoring locales and DNA sequences). Meanwhile, good luck replicating your R installation on another machine.
The H200 has a very impressive bandwidth of 4.89 TB/s, but for the same price you can get 37 TB/s spread across 58 RX 9070s. Whether this actually works in practice, I don’t know.
Your math checks out, but only for some workloads. Other workloads scale out like shit, and then you want all your bandwidth concentrated. At some point you’ll also want to consider power draw:
- One H200 is like 1500W when including support infrastructure like networking, motherboard, CPUs, storage, etc.
- 58 consumer cards will be like 8 servers loaded with GPUs, at like 5kW each, so say 40kW in total.
Now include power and cooling over a few years and do the same calculations.
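To make that concrete, here’s a rough power-only cost sketch over three years. The electricity price and PUE are made-up placeholder numbers, so substitute your own:

```python
# Rough 3-year power-only cost comparison. kwh_price and pue are assumptions.
kwh_price = 0.15   # $/kWh, substitute your local rate
pue = 1.4          # facility overhead multiplier for cooling etc. (guess)
hours = 3 * 365 * 24

def power_cost(watts):
    return watts / 1000 * hours * kwh_price * pue

print(f"1x H200 node (~1.5 kW):      ${power_cost(1500):,.0f}")    # ~$8,000
print(f"58x consumer cards (~40 kW): ${power_cost(40000):,.0f}")   # ~$220,000
```

The hardware price delta can evaporate pretty quickly once the electricity bill shows up.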
As for apples and oranges, this is why you can’t look at the marketing numbers, you need to benchmark your workload yourself.
Well, a few issues:
- For hosting or training large models you want high bandwidth between GPUs. PCIe is too slow; NVLink has literally an order of magnitude more bandwidth. See what Nvidia is doing with NVLink and AMD is doing with Infinity Fabric. Only available if you pay the premium, and if you need the bandwidth, you are most likely happy to pay.
- Same thing as above, but with memory bandwidth. The HBM chips in an H200 will run circles around the GDDR garbage they hand out to the poor people with filthy consumer cards. By the way, your inference and training are most likely bottlenecked by memory bandwidth, not available compute.
- Commercially supported cooling of gaming GPUs in rack servers? Lol. Good luck getting any reputable hardware vendor to sell you that, and definitely not at the power densities you want in a data center.
- FP16 TFLOPS alone isn’t the whole picture. Look at the 4- and 8-bit tensor numbers; that’s where the expensive silicon is used.
- Nvidia’s licensing agreements basically prohibit gaming cards in servers. No one will sell them to you at any scale.
For fun, home use, research or small-time hacking? Sure, buy all the gaming cards you can. If you actually need support and have a commercial use case? Pony up. Either way, benchmark your workload, don’t look at marketing numbers.
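If you want a starting point for that benchmarking, here’s a minimal sketch of an effective memory-bandwidth test in PyTorch. The buffer size and iteration count are arbitrary picks, and a device-to-device copy is only a crude proxy for a real workload:

```python
# Crude effective-bandwidth probe: time big device-to-device copies.
import time
import torch

def measure_bandwidth(n_bytes=1024**3, iters=20):
    x = torch.empty(n_bytes // 2, dtype=torch.float16, device="cuda")  # fp16: 2 bytes/element
    y = torch.empty_like(x)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        y.copy_(x)  # reads x, writes y
    torch.cuda.synchronize()
    dt = time.perf_counter() - t0
    return 2 * n_bytes * iters / dt / 1e12  # TB/s (read + write)

print(f"effective bandwidth: {measure_bandwidth():.2f} TB/s")
```

Compare what this reports against the spec-sheet number before trusting either.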
Is it a scam? Of course, but you can’t avoid it.
Your numbers are old. If you are building today with anyone so much as mentioning AI, you might as well consider 100kW/rack as ”normal”. An off-the-shelf CPU today runs at 500W, and you usually have two of them per server, along with memory, storage and networking. With old school 1U pizza boxes, that’s basically 100kW/rack. If you start adding GPUs, just double or quadruple power density right off the bat. Of course, assume everything is direct liquid cooled.
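Spelled out, with my own guess for the non-CPU overhead per box:

```python
# Back-of-the-envelope rack power. overhead_watts is a guess, not a spec.
cpus_per_server, cpu_watts = 2, 500
overhead_watts = 1400   # memory, storage, NICs, fans, PSU losses (assumption)
servers = 42            # 1U pizza boxes in a standard 42U rack
rack_kw = servers * (cpus_per_server * cpu_watts + overhead_watts) / 1000
print(f"~{rack_kw:.0f} kW per rack, before any GPUs")  # ~101 kW
```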
Very easy to solve - just make the entire IPv6 address space have low reputation. (/s)
Disempower users until they stop leaking data.
Infantilise users until they stop clicking random links in shitty phishing emails.
Disempower power users until they can’t create security incidents by running shittily patched shadow IT on random open ports.
If you don’t like it, don’t operate in organisations beholden to:
- GDPR
- ISO 27001
- PCI DSS
- NIS2
- IP range reputation
- Public reputation
At least for organisations. As a private individual, I want my wide open ports on a public static IP at home.
I kinda get why organisations don’t migrate.
IPv6 just hands you a bag of footguns. Yes, I want all my machines to have random unpredictable IPs. Having some additional link-local garbage can’t hurt either, can it? Oh, and you can’t run exhaustive scans over your IP ranges to map out your infra.
I’m not saying people shouldn’t migrate, but large orgs like universities have real challenges to solve, without any obvious upside. All of the above can be solved, but at a cost.
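To put a number on the ”can’t scan your ranges” point, a back-of-the-envelope comparison. The probe rate is a made-up assumption:

```python
# Sweeping a legacy /16 vs a single IPv6 /64 at an optimistic probe rate.
ipv4_hosts = 2 ** (32 - 16)       # 65,536 addresses: done in seconds
ipv6_hosts = 2 ** 64              # ~1.8e19 addresses in ONE subnet
probes_per_sec = 1_000_000        # generous assumption for an internal scanner
years = ipv6_hosts / probes_per_sec / (3600 * 24 * 365)
print(f"/64 sweep at 1M probes/s: ~{years:,.0f} years")  # ~585,000 years
```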
A few years ago my old university finally went with NAT instead of handing out public IPs to all servers, workstations and random wifi clients. (Yes, you got a public IP on the wifi. Behind a firewall, but still public.) I think they have a /16 and a few extra /24s in total.
I’ll just go ahead and start the flame war.
I totally agree with the functionality of systemd. We need that. But the implementation… Why the fuck do we need to cram everything into pid 1? At least delegate the parsing to another process, god damn. And could we all just agree that ’systemd-{networkd,resolved,homed}’ don’t really have a reason to exist, and definitely not coupled that tightly to a fucking init system. Systemd timers are wonderful, but why are we running cron-but-better in pid 1?
We have an init system where the developers are afraid of using things like processes and separation of privileges. I’m just tired of patching fleets of servers in a panic every time Poettering’s bad design decisions hit the fan with their CVEs and consequences.
Exactly. The malware can do whatever, but as long as the TPM measurements don’t add up the drive will remain encrypted. Given stringent enough TPM measurements and config you can probably boot signed malware without yielding access to the encrypted data.
In my view, SecureBoot is just icing on the cake that is measured boot via TPM. Nice icing though.
True. Personally, I’m hoping for easier use of SecureBoot, TPM and encryption on Linux overall. People are complaining about BitLocker, but try doing the same on Linux. All the bits and pieces are there, but integrating everything and having it keep working through kernel upgrades isn’t fun at all.
For you? No. For most people? Nope, not even close.
However, it mitigates certain threat vectors on both Windows and Linux, especially when paired with a TPM and disk encryption. Basically, you can no longer (terms and conditions apply) physically unscrew the storage, inject malware, and pop it back in. Nor can you just read data off the drive.
The threat vector is basically ”our employees keep leaving their laptops unattended in public”.
(Does LUKS with a password mitigate most of this? Yes. But normal people can’t be trusted with passwords and need the TPM to do it for them. And that basically requires SecureBoot to do properly.)
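(For what it’s worth, the TPM-binding step itself is a single systemd-cryptenroll call these days. A minimal sketch, wrapped in Python purely for illustration; the device path is a placeholder and PCR 7 covers SecureBoot state:)

```python
# Bind an existing LUKS volume to the TPM (run as root; path is hypothetical).
import subprocess

subprocess.run(
    ["systemd-cryptenroll", "--tpm2-device=auto", "--tpm2-pcrs=7", "/dev/nvme0n1p3"],
    check=True,  # raise if enrollment fails
)
```

Getting that to keep working through kernel and firmware updates is the part that isn’t fun.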
VSCode is just Emacs with a weirder Lisp. (/s)
(You can tear my Emacs from my cold dead hands)
I’m quite fucking good at Linux. I’m fine with embracing open source, and I think Proton is the best thing ever.
I drew the line at audio, video and graphics on Linux, especially anything realtime.
I bought a MacBook for that. I feel dirty, but all my ”work” is done on remote Linux systems anyway, so my Mac just needs to provide an editor and a terminal emulator, and I can even make do with my editor over SSH given reasonable latencies. On the other hand, all my audio/video/graphics work flawlessly on macOS, and that’s what I need locally.
The thing is, Wayland does kind of prevent it by forcing the GPU into the rendering pipeline far harder than Xorg. The GPU assumptions throughout the code base(s) make latency shoot through the roof when running software rendered. If you want decent latency, you need a GPU, and if you want to run multiuser you are going to pay Nvidia a shitton of money.
I can also imagine it’s hard (impossible?) to do performant damage tracking in a VNC server without implementing at least parts of the VNC server inside the compositor. This means that the compositor and VNC server get tightly coupled by necessity. Choice will be limited. Would you like the bad DE with the good VNC server, or the good DE with the bad VNC server? Bad damage tracking means shit latency and high bandwidth usage, or other tradeoffs. So even if someone managed to implement what I want on Wayland, it would most likely be limited to a single compositor and not a general solution allowing a free choice of compositor.
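A quick back-of-the-envelope on why damage tracking decides your bandwidth; the 2% dirty-pixel figure is my own assumption for light desktop use like typing:

```python
# Full-frame streaming vs damage-tracked updates, uncompressed 1080p30.
w, h, bpp, fps = 1920, 1080, 24, 30
full_gbps = w * h * bpp * fps / 1e9        # ship every frame whole
dirty_mbps = 0.02 * full_gbps * 1000       # ship only ~2% changed pixels
print(f"full frames: {full_gbps:.1f} Gbit/s, damage-tracked: ~{dirty_mbps:.0f} Mbit/s")
```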
Best software suite I know of for it is Cendio ThinLinc, on top of TigerVNC. Free for up to 5 users. There are some others in the same niche. My recommendation would be to try ThinLinc on Rocky 9 or Ubuntu 24 and configure it to use XFCE. MATE, KDE, or Cinnamon all work fine. Turn off compositing! Over a good WAN link it feels mostly local unless playing fullscreen videos. On a LAN link, the only thing giving it away is extra tearing and compression artifacts when playing YouTube videos fullscreen. Compared to many other solutions I have tried, the latency and ”immersion” are incredible.
As for me, I’ll try to never manage Linux desktop fleets or remote desktops again.
What I’ve seen of RustDesk so far is that it’s absolutely not even close to the options available for X. It replaces TeamViewer, not thin clients.
You would need the following to get viability in my eyes:
- Multiple users per server (~50 users)
- Enterprise SSO authentication, working Kerberos on the desktop
- Good and easily deployable native clients for Windows, Linux and Mac, plus an HTML5 client
- Performant headless software rendered desktops
- GPU acceleration possible but not required
- Clustering, HA control plane, load balancing
- Configuration management available
This isn’t even an edge case. Current and upcoming regulations on information security drag the entire industry this way. Medical, research, defence, banking, basically every regulated landscape gets easier to work in when going down this route. Close to zero worries about endpoint security. Microsoft is working hard on this. It’s easy to do with X. And the best thing on Wayland is RustDesk? As stated earlier, these issues were brought up and discarded as FUD in 2008, and here we are.
Wayland isn’t a better replacement; after 15 years it’s still not a replacement. The Wayland implementations certainly haven’t been rushed, but the architecture was. At this point, fucking Arcan will be viable before Wayland.
Exactly my point. The issues people consider ”solved” with Wayland today will be solved in production in 3-5 years.
People are still running RHEL 7, and Wayland in RHEL 9 isn’t that polished. In 4-5 years when RHEL 10 lands, it might start to be usable. Oh right, then we need another few years for vendors to port garbage software that’s absolutely mission critical, barely works on Xorg, and sure as fuck won’t work in Xwayland. I’m betting several large RHEL clients will either remain on RHEL 8 far past EOL or just switch to alternative distros.
Basically, Xorg might be dead, but in some (paying commercial) contexts, Wayland won’t be a viable option within the next 5-10 years.
Yeah, the few thousand users I managed desktops for will remain on X for the next few years, last I heard from my old colleagues.
Because of my points above.
But good that your laptop works now and that I can help my grandma over TeamViewer again.
I have fucked around enough with R’s package management. It makes Python look like a god damn dream. Wrapping containers around it is just polishing a turd. I still have nightmares from building containers with R in automated pipelines, ending up at like 8 GB per container.
Also, good luck getting reproducible container builds.
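(For contrast, pinning a Python environment down to exact versions is a stdlib one-liner away. A minimal sketch; pip can consume the output directly:)

```python
# Dump installed packages as pip-style pins (Python 3.8+, stdlib only).
from importlib.metadata import distributions

for dist in sorted(distributions(), key=lambda d: d.metadata["Name"].lower()):
    print(f"{dist.metadata['Name']}=={dist.version}")
```

The R equivalents exist (renv and friends), but they’re third-party bolt-ons rather than something boring and dependable in the base language, which is the whole complaint.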
Regarding locales - yes, I mentioned that. That’s a shitty design decision if I ever saw one. But within a locale, most Excel documents from the last century onwards should work reasonably well. (Well, normal Excel files. Macros and VB really shouldn’t work…) And it works on normal office machines, and you can email the files, and you can give it to your boss. And your boss can actually do something with it.
I also think Excel should be replaced by something. But not R.