I can’t claim to know what the designers intended, but having users spread across a large number of servers is terribly inefficient for how Lemmy works: each server maintains a copy of each community that its users are subscribed to, and changes to those communities need to be communicated across each of those instances.
Given this architecture, it is much more efficient and robust to have users concentrated on what are effectively high-performance caching servers, and communities spread out across smaller, interest-focused instances.
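To put rough numbers on it, here's a toy sketch of how the federation fan-out scales with where the users live. The figures are made up for illustration, and this isn't Lemmy's actual federation code, just the back-of-the-envelope logic:

```python
# Toy model of federation fan-out (hypothetical, not Lemmy's real protocol
# code): each new post/vote/comment in a community must be pushed to every
# *other* instance that has at least one subscriber to that community.

def federation_messages(activity_count: int, subscriber_instances: int) -> int:
    """Messages the home instance sends: one per activity per remote instance."""
    return activity_count * (subscriber_instances - 1)

# 1,000 activities in a community whose users are scattered over 500 tiny instances:
print(federation_messages(1_000, 500))  # 499,000 messages
# The same 1,000 activities with users concentrated on 5 big "cache" instances:
print(federation_messages(1_000, 5))    # 4,000 messages
```

Same content either way; the concentrated layout just has two orders of magnitude less chatter.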
That’s pretty much my thinking. There is an added advantage in that having a large number of users on an instance amplifies its caching effect, though as you say, if their interests are spread too widely, that effect is diminished.
Exactly. The mechanisms needed to implement it are there, but I don’t think the devs are interested in much more than making it more stable and robust right now.
Not harmful, but I would agree that the network seems optimized for a small number of user-focused servers, and a large number of community-focused servers.
Why do people insist that there needs to be (for example) a /c/politics on every instance? Really, there are only 3 or 4 with any substantial traffic, there are good reasons to pick one over the others, and those same reasons are good reasons for them to stay separate.
There is a cross-post feature, and the resulting post appears to be aware it was cross-posted. It would be nice if Lemmy would consolidate those into one post that appears in multiple communities, or at least show you only one of them.
My recollection is that access to Usenet requires a paid feed. Anyone can spin up a Lemmy server if they are willing to deal with the administrative hassle.
AI content isn’t watermarked, or detection would be trivial. What he’s talking about is that certain words have a certain probability of appearing after certain other words in a certain context. While there is some randomness to the output, certain words or phrases are unlikely to appear because the data the model was based on didn’t use them.
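For a feel of what "a certain probability of appearing after certain other words" means, here's a minimal bigram model in Python. Real LLMs condition on far more context than one previous word, but the principle is the same, and the toy corpus is obviously invented:

```python
from collections import Counter, defaultdict

# Minimal bigram model: count which word follows which, then turn the
# counts into probabilities. Words the corpus never used in a given
# context get probability zero, which is the point being made above.
corpus = "the cat sat on the mat the cat ate the fish".split()

counts = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    counts[prev][nxt] += 1

def next_word_probs(word: str) -> dict[str, float]:
    total = sum(counts[word].values())
    return {w: c / total for w, c in counts[word].items()}

print(next_word_probs("the"))  # {'cat': 0.5, 'mat': 0.25, 'fish': 0.25}
```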
All I’m saying is that the more a writer’s style and word choice resemble the data set, the more likely their original content is to be flagged as AI-generated.
Here's the thing though - the probabilities for word choice come from the data the model was trained on. While someone who uses a substantially different writing style / word choice than the LLM could easily be identified as not being the LLM, someone with a similar writing style might be indistinguishable from it.
Or, to oversimplify: given that Reddit was a large portion of the input data for ChatGPT, all you need to do is write like a Redditor to sound like ChatGPT.
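That's also roughly how the detectors work, as far as I can tell: they score how predictable your words are under the model, and flag text that is "too predictable." A hypothetical sketch, with probability numbers invented purely for illustration:

```python
import math

# Hypothetical detector logic (a sketch, not any real product's algorithm):
# score text by the average surprisal (negative log-probability) of each
# word under the model. Low surprisal = the model finds the text very
# predictable = flagged as AI. Prose that resembles the training data
# (e.g., typical Reddit style) naturally scores low.

def avg_surprisal(word_probs: list[float]) -> float:
    """Mean negative log-probability of each word under the model."""
    return -sum(math.log(p) for p in word_probs) / len(word_probs)

model_like   = [0.4, 0.3, 0.5, 0.35]    # invented probs: "Redditor-style" prose
model_unlike = [0.05, 0.02, 0.1, 0.04]  # invented probs: an unusual style

print(avg_surprisal(model_like))    # ~1.0 -> low surprisal, flagged as "AI"
print(avg_surprisal(model_unlike))  # ~3.1 -> high surprisal, looks "human"
```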
I think this is essentially the answer to OP - Cats understand the concept of feeding their family, and eventually figure out that the person is a pretty effective provider.
If it could, it couldn’t claim that the content it produced was original. If AI-generated content were detectable, that would be a tacit admission that it is entirely plagiarized.
Get rid of the senate. It is the US aristocracy, anti democratic, and serves no useful purpose.
Require the House to have more votes (or a supermajority, whichever is less) to repeal a law than were needed to pass it. Edit: this reduces the instability that removing the Senate would produce, while still allowing the House to respond quickly to injustice.
Require the House to pass a budget once per term. If they (and the president) can’t pass a budget, the session ends, and they all (including the president) go up for re-election.
I’d say congress should pick the president, but that would tip my hand that I think Parliament is a better system of government.
The base assumption of those with that argument is that an AI is incapable of being original, so it is "stealing" anything it is trained on. The problem with that logic is that's exactly how humans work - everything they say or do is derived from their experiences. We combine pieces of information from different sources and connect them in a way that is original - at least from our perspective. And not surprisingly, that's what we've programmed AI to do.
Yes, AI can produce copyright violations. They should be programmed not to. They should cite their sources when appropriate. AI needs to "learn" the same lessons we learned about not copy-pasting Wikipedia into a term paper.