
  • What are you talking about? RAG is a method you use; it only has the limitations you design in. Your datastore can be whatever you want it to be. The LLM performs a tool use YOU define. RAG isn't one thing: you can build a RAG system out of flat files or a huge vector datastore, and you determine how much data is returned to the context window. Python and ChromaDB easily scale to gigabytes on consumer hardware, completely suitable for local RAG.
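
    The flat-file point is easy to demonstrate. Below is a minimal, hypothetical RAG loop using only the Python standard library: documents are plain text bodies (they could just as well be files on disk), retrieval is naive keyword overlap, and retrieved snippets are packed into the prompt under a character budget. The `docs` contents and the budget are made-up illustrations, not anyone's actual system.

    ```python
    def score(query: str, text: str) -> int:
        """Naive retrieval: count query words that appear in the document."""
        words = set(query.lower().split())
        return sum(1 for w in words if w in text.lower())

    def retrieve(query: str, docs: dict[str, str], k: int = 2) -> list[str]:
        """Return the k best-matching document bodies."""
        ranked = sorted(docs.items(), key=lambda kv: score(query, kv[1]), reverse=True)
        return [body for _, body in ranked[:k]]

    def build_prompt(query: str, snippets: list[str], budget: int = 2000) -> str:
        """Pack retrieved snippets into the context, respecting a character budget."""
        context = ""
        for s in snippets:
            if len(context) + len(s) > budget:
                break
            context += s + "\n---\n"
        return f"Context:\n{context}\nQuestion: {query}\nAnswer:"

    # docs could come straight from flat files, e.g.
    # {p.name: p.read_text() for p in pathlib.Path("notes").glob("*.txt")}
    docs = {
        "backup.txt": "Backups run nightly at 2am via cron on the NAS.",
        "vpn.txt": "The VPN uses WireGuard; configs live in /etc/wireguard.",
    }
    query = "when do backups run"
    prompt = build_prompt(query, retrieve(query, docs))
    ```

    The resulting `prompt` is what you'd hand to whatever local model you're running; swapping keyword overlap for embeddings in ChromaDB changes the `retrieve` step and nothing else.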

  • Nailed it. I've tried taking notification contexts and generally seeing how hard the task is. Their foundation model is, I think, a 4-bit quantized, 3-billion-parameter model.

    So I loaded up Llama, Phi, and picoLLM to run some unscientific tests. Honestly, they had way better results than I expected. Phi and Llama both handled notification summaries great (I modeled the context window myself; nothing official). I have no idea wtf AFM is doing, but it's awful.
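
    For anyone wanting to reproduce this kind of unscientific test: since Apple doesn't publish AFM's notification-summary prompt, you have to mock the context window yourself. Here's a hypothetical sketch of one way to do that with plain Python: render notifications newest-first into a prompt under a rough token budget (the template, budget, and chars-per-token heuristic are all assumptions, not Apple's format).

    ```python
    # Hypothetical mock-up of a notification-summary context window, for testing
    # local models (Llama, Phi, etc.). Nothing here is Apple's actual prompt.

    SYSTEM = "Summarize the following notifications in one short sentence each."
    TOKEN_BUDGET = 512       # assumed small on-device context
    CHARS_PER_TOKEN = 4      # rough heuristic, not a real tokenizer

    def build_context(notifications: list[dict]) -> str:
        """Render newest-first notifications into the prompt until the budget is hit."""
        budget = TOKEN_BUDGET * CHARS_PER_TOKEN - len(SYSTEM)
        lines = []
        for n in reversed(notifications):    # newest first
            line = f"[{n['app']}] {n['title']}: {n['body']}"
            if budget - len(line) < 0:
                break
            budget -= len(line) + 1
            lines.append(line)
        return SYSTEM + "\n" + "\n".join(lines)

    notifications = [
        {"app": "Mail", "title": "Invoice", "body": "Your March invoice is attached."},
        {"app": "Messages", "title": "Sam", "body": "Running 10 min late, order for me?"},
    ]
    prompt = build_context(notifications)
    # `prompt` then goes to the local model, e.g. via llama.cpp or Ollama.
    ```

    Feeding the same mocked prompt to each model is what makes the comparison apples-to-apples, even if the window itself is a guess.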