Posts 15 · Comments 661 · Joined 2 yr. ago

  • I use Firefox and I mostly like it, but it still doesn't support Chromium-style tab groups (no, that one extension is not similar), and its WebGPU implementation still doesn't work on most websites more than a year after Google made their version available by default

  • Read the response from the thing that read the Arch wiki

  • Yeah, ThinkPads aren't the only laptops that can be bought used

  • I've been looking at the paper; some things about it:

    • the paper and article are from 2021
    • the model needs to be able to use optional data such as age and family history, but not be reliant on it
    • it needs to combine information from multiple views
    • it predicts risk for each year in the next 5 years
    • it has to produce consistent results with different sensors and diverse patients
    • it's not the first model to do this, and it is more accurate than previous methods
  • I actually thought that subplot was one of the more interesting ones

  • I think they would be good books if he took the whole plot and compressed it into 3, maybe 5 books. It’s just too long, with too many pointless tangents and too many random characters to remember who may or may not reappear at some point in the next 10 books… and as soon as you get to an interesting part, it switches perspectives to the most boring events imaginable.

  • Is there even a protagonist? Yeah, I agree though.

  • It’s “set realistic playercount expectations”, not “we should shut our games’ servers down earlier”

  • You don't need shift+right-click to do either of those things on YouTube: you can always right-click on a thumbnail to get the normal menu, and if you right-click twice on a video you get the normal menu too

  • Yes, but 200 GB is probably already with 4-bit quantization; the weights in fp16 would be more like 800 GB. IDK if it's even possible to quantize further, but if it is, you're probably better off going with a smaller model anyway (rough math below).
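
    A quick sketch of where those numbers come from, weights only, in decimal GB, assuming the ~405B-parameter model being discussed (the real footprint is a bit higher once you add KV cache and runtime overhead):

    ```python
    # Back-of-envelope only: weight memory for a dense 405B-parameter model.
    # Ignores KV cache, activations, and runtime overhead.
    PARAMS = 405e9  # assumed parameter count of the model being discussed

    for name, bits in [("fp16", 16), ("int8/fp8", 8), ("int4/fp4", 4)]:
        gb = PARAMS * bits / 8 / 1e9  # bits -> bytes -> decimal GB
        print(f"{name:>9}: ~{gb:,.0f} GB")

    # fp16 ~810 GB, 8-bit ~405 GB, 4-bit ~202 GB, which lines up with the
    # ~800 GB vs ~200 GB figures above.
    ```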

  • Also worth noting that the 200 GB is for fp4; fp16 would be more like 800 GB

  • If you're planning on storing the model in RAM, PCIe will probably be the bottleneck way before the number of GPUs is. Probably better to get a high-end server CPU and run it from system memory; see the rough numbers below.
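
    A rough illustration of why. The bandwidth figures are assumed nominal numbers (not measurements), the 210 GB is the 4-bit figure from this thread, and it uses the usual approximation that generating one token means reading essentially all of the weights once:

    ```python
    # Back-of-envelope: dense-model token generation is roughly memory-bandwidth
    # bound, since each new token needs essentially all of the weights read once.
    # Bandwidth numbers are rough nominal figures (assumptions), not measurements.
    MODEL_GB = 210  # assumed: a ~405B model at 4-bit is roughly 210 GB of weights

    scenarios = {
        "PCIe 4.0 x16 (streaming weights from RAM to a GPU)": 32,
        "dual-channel desktop DDR5 (CPU inference)": 80,
        "12-channel server DDR5 (CPU inference)": 460,
    }

    for name, bw_gb_s in scenarios.items():
        print(f"{name}: ~{bw_gb_s / MODEL_GB:.2f} tokens/s")
    ```

    Streaming the weights over PCIe caps you well below one token per second, while a server CPU with many memory channels can at least get into the couple-of-tokens-per-second range.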

  • I don't have access to Llama 3.1 405B, but I can see that Llama 3 70B takes up ~145 GB, so 405B would probably take around 840 GB just to download the uncompressed fp16 (16 bits/weight) model. With 8-bit quantization it would probably be closer to 420 GB, and with 4-bit closer to 210 GB. 4-bit quantization is really going to start hurting the model outputs, and it's still probably not going to fit in your RAM, let alone VRAM.

    So yes, it is a crazy model. You'd probably need at least 3 or 4 A100s to have a good experience with it.
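
    For a sense of where that GPU count comes from, a minimal sketch assuming 80 GB A100s and ~20% headroom for KV cache/activations (both assumptions, not exact figures):

    ```python
    # Rough estimate: how many 80 GB A100s it takes just to hold the weights of a
    # dense 405B-parameter model at different precisions.
    import math

    PARAMS = 405e9
    GPU_MEM_GB = 80   # assuming the 80 GB A100 variant
    OVERHEAD = 1.2    # assumed ~20% headroom for KV cache / activations

    for name, bits in [("fp16", 16), ("int8", 8), ("int4", 4)]:
        weights_gb = PARAMS * bits / 8 / 1e9
        gpus = math.ceil(weights_gb * OVERHEAD / GPU_MEM_GB)
        print(f"{name:>5}: ~{weights_gb:,.0f} GB of weights -> at least {gpus} A100s")
    ```

    At 4-bit that lands right around the "3 or 4 A100s" ballpark; fp16 would need more than a dozen.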

  • For that sort of thing you can look it up in a private window, where at least Google has to pretend not to be tracking you

  • If everybody else is doing the same thing, yeah.

  • Next up: Microsoft announces development of Bethesda's next game will be largely outsourced

  • There's someone who uploaded this same meme before the Reddit thing, but instead of 2000 it said 20

    edit: I was linked to the post maybe a few months ago, but I can't find it now

  • OK, I guess it's just kinda similar to dynamic overclocking/underclocking with a dedicated NPU. I don't really see why a tiny $2 microcontroller, or just the CPU, couldn't accomplish the same task though.

  • RAM is slower than GPU VRAM, but that extreme slowdown is due to the bottleneck of the PCIe bus that the data has to go through to get to the GPU (rough numbers below).
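
    Illustrative nominal bandwidths (assumed figures for a 3090-class card and desktop DDR5, not measurements from this thread):

    ```python
    # Illustrative bandwidth comparison (rough nominal numbers, not measured).
    # Weights offloaded to system RAM have to cross the PCIe link every time
    # they're needed, and that link is the narrowest hop in the chain.
    bandwidth_gb_s = {
        "GPU VRAM (GDDR6X on a 3090-class card)": 936,
        "dual-channel DDR5 system RAM": 80,
        "PCIe 4.0 x16 link between them": 32,
    }

    for name, bw in bandwidth_gb_s.items():
        print(f"{name:>42}: ~{bw} GB/s")
    ```

    The offload path is limited by its slowest hop, the PCIe link, which is well below even plain system RAM bandwidth, let alone VRAM.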

  • There are some local GenAI music models, although I don't know how good they are yet as I haven't tried any myself (Stable Audio is one, but I'm sure there are others)

    also, minor linguistic nitpick, but LLM stands for 'large language model' (you could maybe get away with it for PixArt and SD3 since they use T5 for prompt encoding, which is an LLM, and I'm sure some audio models with lyrics use them too); the term you're looking for is probably 'generative'