I use the *arrs to make a well-named hard link to the file in my media library right after the download completes. The torrents can then be removed from the torrent client after an appropriate seeding time/ratio.
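For anyone curious what that hard-link step boils down to, here's a rough Python sketch. It is not what the *arrs actually run; the function name and paths are made up for illustration, and it assumes the download folder and the library sit on the same filesystem, since hard links can't cross filesystems.

```python
import os
from pathlib import Path


def hardlink_into_library(download_path, library_dir, new_name):
    """Create a hard link to a finished download under a cleaner name.

    Hypothetical helper for illustration only; not the *arrs' real code.
    Assumes source and destination are on the same filesystem.
    """
    src = Path(download_path)
    dest_dir = Path(library_dir)
    # Create the library folder (and any parents) if it is missing.
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / new_name
    if not dest.exists():
        # Both names now point at the same data on disk,
        # so the library copy takes no extra space.
        os.link(src, dest)
    return dest
```

Deleting the torrent's copy later only removes one of the two names. The data stays on disk as long as the library link still exists.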
Sora can sometimes do 1-minute clips that mostly look OK as long as you don't pay too close attention. We are incredibly far away from coherent, feature-length narratives, and even then, those aren't likely to be thematically interesting or engaging.
I don't see it as hypocritical at all. Public comments are, for me at least, put out for the public good. It's the same reason someone might license open source code under the MIT license. My issue with Reddit is that they restricted who can obtain the data and then privately sold it to only the highest bidder. It should be freely available to all who want to view it, without restrictions based on money or power.
But also, Tim Cook's total compensation for 2022 was $99 million and Satya Nadella's for 2023 was $48 million. Paying him more than the CEOs of actually profitable companies, to the tune of nearly 1/4 of revenue, is a pretty big outlier.
I wonder what the risks are of including deleted and pre-edit content in training data. Most of the edits are going to be typos and formatting; do you want 2-3 copies of the same message with typos in them as training data? Similarly, deleted comments are mostly nonsense, unhelpful, duplicates, or highly controversial things.
If someone wants to dig through and find individual users to restore, that's one thing, but I don't think I'd immediately choose to train on that other data unless I had to.
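To make that concrete, here's a minimal sketch of the kind of filter I have in mind: keep only the newest revision of each comment and drop anything whose latest state is deleted. The record layout (`id`, `revision`, `deleted`, `text`) is purely hypothetical, not any real dump format.

```python
def latest_visible_versions(records):
    """Return the text of the newest revision of each comment,
    skipping comments whose newest revision is marked deleted.

    Field names are hypothetical, chosen just for this example.
    """
    latest = {}
    for rec in records:
        current = latest.get(rec["id"])
        # Keep whichever revision of this comment has the highest number.
        if current is None or rec["revision"] > current["revision"]:
            latest[rec["id"]] = rec
    return [r["text"] for r in latest.values() if not r["deleted"]]


records = [
    {"id": 1, "revision": 0, "deleted": False, "text": "teh cat"},
    {"id": 1, "revision": 1, "deleted": False, "text": "the cat"},
    {"id": 2, "revision": 0, "deleted": False, "text": "spam"},
    {"id": 2, "revision": 1, "deleted": True, "text": "spam"},
]
print(latest_visible_versions(records))
```

Anyone who specifically wants the pre-edit or deleted material can still pull it from the raw records; it just isn't in the default training set.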
That's what finally did in my 10-year-old Corsair. I was technically within spec on wattage with my new 4070, but certain loads would cause it to trip the overcurrent protection anyway.
We made a tag that can't be reliably and deterministically scanned so we also included a machine learning model that takes a good guess at it.
I just don't see how you could possibly rely on a black box model for anything important. You have no way to mathematically prove if there are collisions in the model output or not, and newer versions of the model can't be made backwards compatible. So if you have a database of thousands of these tags scanned, then they discover a critical vulnerability and provide a new model, you're SOL and everything you have is worthless.
Even better, a community ground-source heat pump. It can be impractical for a single household to invest in drilling deep wells or digging up a whole yard, but a community together could do so and get serious efficiency improvements from it.
I'm not a huge Python fan, but I'm pretty confused about point 3. What alternative languages have decent linear algebra support built in?