In the space of a week, a second open-source Chinese AI model matches the best models that investors are pouring tens of billions of dollars into.
BetaDoggo_ @lemmy.world
For a 16k context window using q4_k_s quants with llama.cpp, it requires around 32GB. You can get away with less by using a smaller context window and lower-accuracy quants, but quality will degrade. Each chain of thought also consumes a few thousand tokens, so you will lose earlier messages quickly.
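As a rough sketch of how those settings translate into practice, here's the equivalent setup with the llama-cpp-python bindings (`pip install llama-cpp-python`). The GGUF filename is a placeholder for whichever q4_k_s quant you actually downloaded:

```python
from llama_cpp import Llama

# Placeholder filename; substitute your downloaded q4_k_s GGUF quant.
llm = Llama(
    model_path="model-q4_k_s.gguf",
    n_ctx=16384,      # 16k context window; lower this to reduce memory use
    n_gpu_layers=-1,  # offload all layers to the GPU if VRAM allows
)

# Each reply's chain of thought eats a few thousand of those 16k tokens,
# so older messages fall out of the window after a handful of turns.
out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Hello"}],
    max_tokens=512,
)
print(out["choices"][0]["message"]["content"])
```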