Posts
89
Comments
149
Joined
2 yr. ago

LocalLLaMA @sh.itjust.works

My personal collection of interesting models I've quantized from the past week (yes, just week)

  • Colour me intrigued. I want more manufacturers that go against the norm. If they put out a generic slab with normal specs at an expected price, I won't be very interested, but if they do something cool I'm all for it.

    Except I just noticed the part where it's developed by Meizu, so never mind, it'll probably be a generic Chinese phone.

  • Stop making me want to buy more graphics cards...

    Seriously though, this is an impressive result. "Beating" gpt3.5 is a huge milestone and I love that we're continuing the trend. I'll need to try out a quant of this to see how it does in real-world usage. Hope it gets added to the lmsys arena!

  • LocalLLaMA @sh.itjust.works

    itsme2417/PolyMind: A multimodal, function calling powered LLM webui.

    LocalLLaMA @sh.itjust.works

    Introducing Nomic Embed: A Truly Open Embedding Model

  • Btw I know this is old and you may have already figured out your hardware and setup, but P40s and P100s go for super cheap on eBay.

    The P40 is an amazing $/GB deal. The only issue is that its fp16 performance is abysmal, so you'll want to either run full fp32 models or use llama.cpp, which is able to cast up to fp32.

    The P100 has less VRAM but really good fp16 performance, which makes it ideal for exllamav2 usage. I picked up one of each recently. The P40 failed to deliver and the P100 was delivered while I'm away, but once I have both on hand I'll probably post a comparison to my 3090 for interest's sake.

    Also I run all my stuff on Linux (Ubuntu 22.04) with no issues
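
    For anyone wondering what "cast up" means in practice, here is a minimal sketch, assuming the general idea of keeping weights in 16-bit storage and doing the math at higher precision. It is not how llama.cpp actually implements it; the function names are made up for illustration, and plain Python does its arithmetic in double precision rather than fp32.

```python
import struct


def to_fp16_bytes(x: float) -> bytes:
    """Store a value as IEEE 754 half precision (2 bytes). Hypothetical helper."""
    return struct.pack("<e", x)


def upcast_to_fp32(raw: bytes) -> float:
    """Read a half-precision value and pass it through a 4-byte fp32 encoding."""
    (half_value,) = struct.unpack("<e", raw)
    # Every fp16 value is exactly representable in fp32, so nothing is lost here.
    (single_value,) = struct.unpack("<f", struct.pack("<f", half_value))
    return single_value


def dot_upcast(weights_fp16: list[bytes], activations: list[float]) -> float:
    """Dot product with 16-bit storage; arithmetic runs on the upcast values.

    Python floats are doubles, so the point is only that the multiply-adds
    are not performed in fp16.
    """
    return sum(upcast_to_fp32(w) * a for w, a in zip(weights_fp16, activations))


# Weights cost 2 bytes each in storage but are widened before use.
weights = [to_fp16_bytes(w) for w in (0.5, -1.25, 2.0)]
result = dot_upcast(weights, [2.0, 4.0, 0.5])
```

    The trade-off is the one described above: storage stays small, but a card with weak fp16 throughput never has to run fp16 math.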

  • Yeah, q2 logic is definitely a sore point. I'd highly recommend going with Mistral dolphin 2.6 DPO instead; the answers have been very high quality for a 7b model.

    But good info for anyone wanting to keep up to date on very low bit rate quants!

  • I don't have a lot of experience with either at this time. I've used them here and there for programming questions, but usually I stick to 7b models because I use them for code completion, and I only find that useful if it completes the code before I do lol

    That said, I've had overall good answers from either whenever I've decided to pull them out. It feels like wizard coder should be better since it's so much newer, but overall it hasn't been that different. Wish phind would release an update :(

  • LocalLLaMA @sh.itjust.works

    InternLM2 models llama-fied

    LocalLLaMA @sh.itjust.works

    WizardLM/WizardCoder-33B-V1.1 released!

  • I live in Ontario where we go down to -30C in the harshest conditions.

    We have a heat pump and a furnace and they alternate based on efficiency

    Somewhere around -5 to +5 C it switches from the heat pump to the furnace

    I think you could get by a bit colder but it really loses out on efficiency vs burning gas unless you invest in a geothermal heat pump

  • LocalLLaMA @sh.itjust.works

    Microsoft announces WaveCoder

  • It's definitely a little odd. I'm glad they did any kind of official release for 0.2, but yeah, information is sorely lacking and it would be nice to have more, especially with how revolutionary the previous one was. Is this incremental? Is it a huge change? Is it just more fine tuning? Did they start from scratch? We'll never know 🤷‍♂️

  • LocalLLaMA @sh.itjust.works

    Mixture of Experts Explained (Huggingface blog)

    LocalLLaMA @sh.itjust.works

    Mistral releases version 0.2 of their 7B model

  • The only concern I had was, my god, it's a lot of faith to put in this random Twitter account. Hope they never get hacked lol. But otherwise yes, it's a wonderful idea, and it would be a good feature for huggingface to speed up downloads/uploads.

  • LocalLLaMA @sh.itjust.works

    Mistral drops a new magnet download

  • Better finetuning is such an important factor. I feel like the future is all of us having our own personal tunes for models that work well with our lives, and iterating to learn more basically every day is also really helpful, so the more barriers we can take down the better!

  • LocalLLaMA @sh.itjust.works

    Orca 2: Teaching Small Language Models How to Reason

    LocalLLaMA @sh.itjust.works

    Hundreds of OpenAI employees threaten to resign and join Microsoft

    LocalLLaMA @sh.itjust.works

    Catch me if you can! How to beat GPT-4 with a 13B model | LMSYS Org

    LocalLLaMA @sh.itjust.works

    TensorRT-LLM evaluation of the new H200 GPU achieves 11,819 tokens/s on Llama2-13B

    LocalLLaMA @sh.itjust.works

    ExUI - a lightweight web UI for ExLlamaV2 by turboderp

    LocalLLaMA @sh.itjust.works

    Phind V7 subjectively performing at GPT4 levels for coding

    LocalLLaMA @sh.itjust.works

    Min P sampler (an alternative to Top K/Top P) has been merged into llama.cpp

    LocalLLaMA @sh.itjust.works

    HUGE dataset released for open source use

    LocalLLaMA @sh.itjust.works

    I've started uploading quants of exllama v2 models, taking requests

    LocalLLaMA @sh.itjust.works

    Text Generation Web-UI has been updated to CUDA 12.1, and with it new docker images are needed

    LocalLLaMA @sh.itjust.works

    Single Digit tokenization improves LLM math abilities by up to 70x