I use Firefox, and I mostly like it, but it still doesn't support Chromium-style tab groups (no, that one extension is not similar), and its WebGPU implementation also doesn't work on most websites more than a year after Google made their version available by default
I think they would be good books if he took the whole plot and compressed it into 3, maybe 5 books. It’s just too long: too many pointless tangents, too many random characters to remember who may or may not reappear at some point in the next 10 books… and as soon as you get to an interesting part, it switches perspectives to the most boring events imaginable.
you don't need shift + right click to do either of those things on YouTube. You can always right click on a thumbnail and get the normal menu, and if you right click twice on a video you get the normal menu
Yes, but 200 GB is probably already with 4-bit quantization; the weights in FP16 would be more like 800 GB
IDK if it's even possible to quantize further, and even if it is, you're probably better off going with a smaller model anyway
PCIe will probably be the bottleneck way before the number of GPUs is, if you're planning on storing the model in RAM. Probably better to get a high-end server CPU.
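To see why PCIe dominates in that setup, here's a back-of-the-envelope sketch (the ~32 GB/s figure is the theoretical PCIe 4.0 x16 bandwidth, and the 4-bit 405B model size is an assumption based on the numbers discussed below; real throughput depends on batching and how much of the model stays resident in VRAM):

```python
# Rough upper bound on tokens/sec when every generated token has to stream
# all the weights from RAM to the GPUs over PCIe (no weight caching in VRAM).
def tokens_per_sec(model_gb: float, link_gb_s: float) -> float:
    # one full pass over the weights per generated token
    return link_gb_s / model_gb

pcie4_x16_gb_s = 32.0              # theoretical PCIe 4.0 x16 bandwidth, GB/s
model_4bit_gb = 405 * 0.5          # 405B params at 4 bits (0.5 bytes) per weight

print(tokens_per_sec(model_4bit_gb, pcie4_x16_gb_s))  # well under 1 token/sec
```

So even with unlimited GPUs, weight streaming over a single PCIe 4.0 x16 link caps you at a fraction of a token per second, which is why high memory bandwidth on the CPU side matters more than GPU count here.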
I don't have access to Llama 3.1 405B, but I can see that Llama 3 70B takes up ~145 GB, so 405B would probably take ~840 GB just to download the uncompressed FP16 (16 bits/weight) model. With 8-bit quantization it would probably take closer to 420 GB, and with 4-bit it would probably take closer to 210 GB. 4-bit quantization is really going to start harming the model outputs, and it's still probably not going to fit in your RAM, let alone VRAM.
So yes, it is a crazy model. You'd probably need at least 3 or 4 A100s to have a good experience with it.
OK, I guess it's just kinda similar to dynamic overclocking/underclocking with a dedicated NPU. I don't really see why a tiny $2 microcontroller, or just the CPU, can't accomplish the same task though.
there are some local GenAI music models, although I don't know how good they are yet as I haven't tried any myself (Stable Audio is one, but I'm sure there are others)
also, minor linguistic nitpick, but the "LM" in LLM stands for 'language model' (you could maybe get away with it for PixArt and SD3, since they use T5, which is an LLM, for prompt encoding; I'm sure some audio models with lyrics use them too). The term you're looking for is probably 'generative'