
  • Maybe I misread it but this was the source of the 5T remark…

    https://news.ycombinator.com/item?id=38077521#38080442

    What we make available is:

    (A) the dataset after pre-processing the raw CommonCrawl data (e.g., text extraction and language identification) and some minimal filtering; and (B) for each document in (A), we also pre-computed 40+ "features" (we call them "quality annotations") you can use to further filter it or deduplicate it. For example, one such feature is "how similar this document is to Wikipedia".

    (A) is around 30T tokens, but you might want to use features in (B) to further filter/dedup it down, e.g., to 5T. For example, if in your application documents similar to Wikipedia are the most helpful documents, you can take the top documents with the highest score for the feature "how similar this document is to Wikipedia". Of course, the really interesting case happens when you consider a larger subset of these features (or maybe even automatically learn what the best way of filtering it is).

    Our goal is to make this as flexible as possible such that you can fit this into your own application. What we have released is both (A) and (B).

    If you have any questions, please let us know! Thanks for your interest, have fun with the data!
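To make the filtering idea in the quote a bit more concrete, here is a minimal sketch of "take the top documents by one feature until you hit a token budget". Everything in it is an assumption for illustration: the `Doc` struct, its field names, and `filter_to_budget` are hypothetical and are not the dataset's real schema or tooling.

```rust
/// Hypothetical document record: these field names are illustrative and are
/// not the dataset's real schema.
#[derive(Debug, Clone, PartialEq)]
struct Doc {
    id: u32,
    tokens: u64,
    wiki_similarity: f64, // stand-in for one "quality annotation"
}

/// Keep the highest-scoring documents until the token budget is used up.
fn filter_to_budget(mut docs: Vec<Doc>, token_budget: u64) -> Vec<Doc> {
    // Sort by score, highest first; total_cmp gives a total order over f64.
    docs.sort_by(|a, b| b.wiki_similarity.total_cmp(&a.wiki_similarity));

    let mut kept = Vec::new();
    let mut used = 0u64;
    for doc in docs {
        if used + doc.tokens > token_budget {
            break; // stop at the first document that would exceed the budget
        }
        used += doc.tokens;
        kept.push(doc);
    }
    kept
}

fn main() {
    let docs = vec![
        Doc { id: 1, tokens: 10, wiki_similarity: 0.2 },
        Doc { id: 2, tokens: 10, wiki_similarity: 0.9 },
        Doc { id: 3, tokens: 10, wiki_similarity: 0.5 },
    ];
    let kept = filter_to_budget(docs, 20);
    let ids: Vec<u32> = kept.iter().map(|d| d.id).collect();
    println!("{:?}", ids); // prints [2, 3]
}
```

A real pipeline would presumably combine several features and deduplicate as well, as the quote suggests; this only shows the single-feature case.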

  • Yes, that’s the major selling point of the Rust language in my opinion: memory safety. Most of the security issues you hear about come from mismanaged memory, specifically buffer overflows. My understanding is that Rust reduces the risk of those by catching many memory errors at compile time and bounds-checking array accesses at run time.
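A small sketch of what that looks like in practice, as far as I understand it. Strictly speaking, out-of-bounds accesses are stopped by run-time bounds checks, while lifetime mistakes such as dangling references are rejected at compile time. The helper name `read_at` is made up for this example; `slice::get` is the standard library call.

```rust
/// Read one element without risking an out-of-bounds access.
/// `slice::get` returns None instead of reading past the end.
fn read_at(buf: &[u8], index: usize) -> Option<u8> {
    buf.get(index).copied()
}

fn main() {
    let buf = [1u8, 2, 3, 4];
    println!("{:?}", read_at(&buf, 2)); // in bounds: Some(3)
    println!("{:?}", read_at(&buf, 10)); // out of bounds: None, no overflow

    // Compile-time side: the borrow checker rejects dangling references.
    // Uncommenting the lines below fails to compile, because `r` would
    // outlive the value it borrows from:
    //
    // let r;
    // { let s = String::from("hi"); r = &s; }
    // println!("{}", r);
}
```

Plain indexing such as `buf[10]` is also checked: it panics at run time rather than reading adjacent memory.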

  • If they have the only copy and their datacenter goes belly up, having the only remaining copy did no good, because now it’s lost for good. Offsite backups, ideally held by many different organizations, are the only sure-fire way to preserve this stuff. I donate to archive.org because I believe in what they’re trying to do, and I hope they can continue on as long as needed.

  • Things nearest the center would move towards the center at an accelerated rate. So from the perspective of an object falling into the black hole, could it look as though everything is expanding, since everything is getting compressed as it goes toward the center? I’m not an expert on anything, but it seems like an intriguing concept.

  • No need to apologize, but thank you all the same. I remember back when Compiz first came out and had a rotating cube desktop (virtual desktops). I kind of wish that stuff would make a comeback.

  • As far as I’ve been able to tell, OP is interested in an OS for their desktop or laptop, not a Steam Deck, so the desktop experience is likely a valid concern of theirs.

    I agree the Steam Deck’s proprietary interface is decently polished.

  • The desktop environment is clunky at best and the default GUI is 100% proprietary. I own a Steam Deck and use it near daily in both modes. If that’s the most polished option, I shudder to think what you normally use. The other guy linked Nobara, I think? From the home page it looks like a pretty decent option.