Posts 89 · Comments 149 · Joined 2 yr. ago

  • It's a standardization on a universal GGML format, which means no more breaking changes going forward when new formats are worked on. It also includes the same llama.cpp functionality for all GGML model types (Falcon, MPT, StarCoder, etc.)

  • When you make a Docker image and push it to Docker Hub, all of the build instructions appear there, so it's very transparent. It's also super easy for anyone to build it themselves, unlike executables: just download the Dockerfile and run a single command.
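
    For the skeptical, here's a minimal sketch of that flow (the URL and image tag below are placeholders, not my actual image):

        # Fetch the published Dockerfile (placeholder URL)
        curl -O https://example.com/Dockerfile
        # Build it yourself; every step is right there in the Dockerfile
        docker build -t my-image .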

  • Yeah, it's a step in the right direction at least. Though now that you mention it, doesn't LMSYS or someone do the same with human evals and side-by-side comparisons?

    It's such a tricky line to walk between deterministic questions (repeatable but cheatable) and user questions (real-world but potentially unfair)

  • I have the snap installed; for what it's worth, it's pretty painless AS LONG AS YOU DON'T WANT TO DO ANYTHING SILLY.

    I've found it nearly impossible to alter the base behaviour without it entirely breaking, so if Nextcloud out of the box does exactly what you want, go ahead and install it via snap...

    I predict that on Docker you're going to have a bad time if you can't give it host network mode and instead try to just forward ports (rough sketch at the end of this comment)

    That said, docker >>>> VM in my book
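
    For reference, roughly what I mean by host network mode (the image name and volume path here are just examples, adjust to taste):

        # Run the official image with host networking instead of per-port forwards
        # (the volume path is a placeholder)
        docker run -d \
          --name nextcloud \
          --network host \
          -v /srv/nextcloud:/var/www/html \
          nextcloud:latest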

  • I've managed to get it running in koboldcpp; I had to add --forceversion 405 because it wasn't being detected properly. Even with q5_1 I was getting an impressive 15 T/s, and the code actually seemed decent. This might be a really good candidate for fine-tuning on large datasets and passing massive contexts of basically entire small repos, or at least several full files (launch command below)

    Odd that they chose NeoX as their architecture; I think only CTranslate2 can offload those? I had trouble getting the GPTQ running in AutoGPTQ... maybe the Hugging Face TGI would work better
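
    For anyone trying to reproduce this, roughly the launch I used (only --forceversion 405 is the relevant part; the model path is a placeholder):

        # Force GGML version detection, since auto-detect failed for this model
        python koboldcpp.py --model /models/model.q5_1.bin --forceversion 405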

  • I've been impressed with Vicuna 1.5; it seems quite competent and enjoyable. Unfortunately I'm only able to run 13B at any reasonable speed, so that's where I tend to stay. Funny enough, I haven't tried any 70Bs since llama.cpp added support, so I'll have to start some downloads...

  • The same thing is happening here that happened with smartphones: we started out treating benchmarks as the be-all and end-all, largely because the devices were all so goddamn different that it was impossible to compare them 1:1 in any meaningful way without some kind of automation like benchmarks.

    Then some people started cheating them, and we noticed that the benchmarks, while nice for generating big pretty numbers, don't actually have much correlation to real-world performance, and more often than not misrepresent what the product is capable of.

    Eventually we'll get to a point where we can harmonize the two: benchmarks providing useful metrics and frames of reference for flagging that something is wrong, and real reviews that dive into how the actual model behaves in the real world.

  • I would love to see more of this, and maybe make it its own post for more traction and discussion. Do you have a link to those pictures elsewhere? I can't seem to get a large version loaded on desktop, haha.

  • I've been beating my head against a Docker issue for a couple of days, so I didn't get to try this one yet, but I did notice Vicuna was VERY ready to output code, and pretty coherently I might add. You should check that one out if you're interested in coding ability.

  • Yeah, no, for sure, it didn't sound too negative; your concerns are definitely valid.

    As for whether people on Reddit know about this one, it's likely several don't, but I'm also not sure how well the mods there would take to advertising an alternative. Probably the best bet is to have some high-value posts here that get posted on Reddit for awareness, and maybe some people will feel like joining :D

  • As weird as this is, I'm looking forward to seeing what their watch will look like; I'm highly impressed by the care and attention to detail they gave to their minimal but cohesive reskinning of Android.

  • My biggest idea is just to get people to post 😅 I want to make sure this place doesn't stagnate, and I want to know what the community thinks will help with that.

    I obviously can post 24/7, but I also don't want to spam this place, so finding a middle ground is important.

    I think discussion threads might help, because as it stands it's hard to know where to just, you know, post tech issues, or cool prompts you made, or tools you're using.

  • I don't think there's harm in a sticky that contains the latest model releases. I agree that it shouldn't devolve into spamming automated posts; maybe weekly is more appropriate, or every X days.

    I agree overall with your list; the question is just how we foster that, and I'm wondering if discussion threads will be a good middle ground until we have a reasonable flow of new posts.

    Either way I want to keep an eye on it and find a way to help the community grow organically

  • LocalLLaMA @sh.itjust.works

    Microsoft makes new 1.3B coding LLM that outperforms all models on MBPP except GPT-4, reaches third place on HumanEval above GPT-3.5, and shows emergent properties

    Android @lemmy.ml

    Samsung Electronics Expands the Self-Repair Program to Europe

    Android @lemmy.ml

    Samsung Galaxy Watch6 prices in France

    Android @lemmy.ml

    Quick pixel tablet unboxing (JUST UNBOXING, no info)

    LocalLLaMA @sh.itjust.works

    Update to my text-generation-webui docker image - now with ExLlama support!

    Selfhosted @lemmy.world

    GitHub - jesseduffield/horcrux: Split your file into encrypted fragments so that you don't need to remember a passcode

    LocalLLaMA @sh.itjust.works

    More docker images! Now for oobabooga's text-generation-webui, CPU and GPU versions

    LocalLLaMA @sh.itjust.works

    New Wizard coder model posted and quantized by TheBloke!

    LocalLLaMA @sh.itjust.works

    GitHub - UnderstandGPT/UnderstandGPT: A source of knowledge for all things LLM.