Posts: 7 · Comments: 148 · Joined: 1 yr. ago

  • If you're running a small server it's pretty straightforward and hands-off. It's only when you get to the scale of these larger instances that you run into issues. I just rerun Ansible every time there's an update; otherwise it manages itself.

  • My understanding is that the copyright applies to reproductions of the work, which this is not. If I provide a summary of a copyrighted summary of a copyrighted work, am I in violation of either copyright because I created a new derivative summary?

  • Just because it's not using your preferred form of containerization doesn't make it "hacked together". Docker is a perfectly acceptable solution for what Lemmy is.

  • Quoting this comment from the HN thread:

    On information and belief, the reason ChatGPT can accurately summarize a certain copyrighted book is because that book was copied by OpenAI and ingested by the underlying OpenAI Language Model (either GPT-3.5 or GPT-4) as part of its training data.

    While it strikes me as perfectly plausible that the Books2 dataset contains Silverman's book, this quote from the complaint seems obviously false.

    First, even if the model never saw a single word of the book's text during training, it could still learn to summarize it from reading other summaries which are publicly available. Such as the book's Wikipedia page.

    Second, it's not even clear to me that a model which only saw the text of a book, but not any descriptions or summaries of it, during training would even be particularly good at producing a summary.

    We can test this by asking for a summary of a book which is available through Project Gutenberg (which the complaint asserts is Books1 and therefore part of ChatGPT's training data) but for which there is little discussion online. If the source of the ability to summarize is having the book itself during training, the model should be equally able to summarize the rare book as it is Silverman's book.

    I chose "The Ruby of Kishmoor" at random. It was added to PG in 2003. ChatGPT with GPT-3.5 hallucinates a summary that doesn't even identify the correct main characters. The GPT-4 model refuses to even try, saying it doesn't know anything about the story and it isn't part of its training data.

    If ChatGPT's ability to summarize Silverman's book comes from the book itself being part of the training data, why can it not do the same for other books?

    As the commenter points out, I could recreate this result using a smaller offline model and an excerpt from the Wikipedia page for the book.

  • I don't know that 1Password should be on that list. The first two are free and open source. The last one is paid and proprietary.

    Don't put your credentials in the hands of a company that requires you to trust them to not fuck up. Everyone thought LastPass was great until it wasn't.

  • If you're using a password manager you'd be doing this for every site and without even having to think about it. Bitwarden is a great choice.

  • To me there’s a bunch of red flags, but I can’t put my finger on what I reckon they’re flagging.

    Let's start with the fact that the only way to participate currently is to make a "donation" for which you then receive a passphrase which will allegedly give you access once this thing releases. That release presumably depends on them receiving enough of these "donations". They then instruct you to go hype this to your social network, no doubt with the goal of convincing them to donate.

    Then once you move past that there's the fact that they're claiming this platform operates on a "custom blockchain". If I'm to take that at face value it means they're spinning up their own chain for which there will be an incredibly limited number of nodes. Even if you have users running their own nodes this is going to result in centralization out of the gate, and it would only be anywhere near decentralized if enough active users of the network (which isn't out yet) decide to turn into network operators. In other blockchains this is done using economic incentives because running these types of networks is neither financially nor technically trivial. This no doubt means there would eventually be a Panquake token. 🥞🚀🌙!!

    We've seen this model before and if the network gains any traction it results in a handful of supernodes controlled either by the central entity or a cabal of entities associated with them.

    They could have saved themselves a ton of time and engineering hours by docking themselves to Ethereum either as a rollup or even by using existing Layer 2 networks. Then they would have inherited the decentralization and security guarantees that network provides and additionally opened the opportunity to market to the participants of that network.

    Then setting all of that aside there is the more obvious question of why a social network needs to be built on a blockchain. What part of an append-only ledger of immutable records aligns with the operational requirements of a social network? The only overlap between the needs of the two is decentralization, but ActivityPub already exists.
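
    To make that mismatch concrete, here's a minimal Python sketch of an append-only hash chain, the generic structure underneath any blockchain. This is not Panquake's actual design; `Record`, `append`, and `first_broken_link` are names I made up for illustration, and there's no consensus or networking here at all. It only shows what happens when you try to do something a social network needs constantly: edit a post.

```python
import hashlib
import json
from dataclasses import dataclass
from typing import List, Optional

# Placeholder "previous hash" for the very first record (illustrative choice).
GENESIS = "0" * 64


@dataclass(frozen=True)
class Record:
    """One ledger entry; each record commits to the hash of the one before it."""
    index: int
    body: str
    prev_hash: str

    def digest(self) -> str:
        payload = json.dumps([self.index, self.body, self.prev_hash])
        return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append(ledger: List[Record], body: str) -> None:
    """Add a record that points at the current tip of the ledger."""
    prev_hash = ledger[-1].digest() if ledger else GENESIS
    ledger.append(Record(len(ledger), body, prev_hash))


def first_broken_link(ledger: List[Record]) -> Optional[int]:
    """Return the index of the first record whose prev_hash no longer
    matches its predecessor, or None if the chain is intact."""
    expected = GENESIS
    for record in ledger:
        if record.prev_hash != expected:
            return record.index
        expected = record.digest()
    return None


ledger: List[Record] = []
for post in ["first post", "a post with a typo", "a reply"]:
    append(ledger, post)

# Untouched chain verifies cleanly.
assert first_broken_link(ledger) is None

# "Edit" post 1 in place, the way a normal social network would.
edited = list(ledger)
edited[1] = Record(1, "a post with the typo fixed", ledger[1].prev_hash)

# The reply at index 2 still commits to the old post, so the chain breaks there.
print(first_broken_link(edited))  # 2
```

    Fixing one typo invalidates every record after it, so edits and deletions either have to be bolted on as new "this supersedes that" records or kept off-chain entirely, at which point the chain isn't doing much.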

    Every single part of this looks like people trying to monetize a solution that doesn't solve the problem it claims to address, while going about all of it in the worst way possible. And all of that assumes this will ever see the light of day rather than them just running off into the sunset with those "donations", which conveniently create a form of legal protection since you gave that money under the pretense of supporting an effort without a clearly defined expectation of a deliverable.

    TLDR: Avoid