Probably better to ask on !localllama@sh.itjust.works. Ollama should be able to give you a decent LLM, and RAG (Retrieval Augmented Generation) will let it reference your dataset.
The only issue is that you asked for a smart model, which usually means a larger one, and the RAG portion consumes even more memory on top of that, which may be more than a typical laptop can handle. Smaller models also have a higher tendency to hallucinate, i.e. produce confident but incorrect answers.
Short answer - yes, you can do it. It's just a matter of how much RAM you have available and how long you're willing to wait for an answer.
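To make the RAG part concrete, here's a minimal sketch of the retrieval step: embed your documents, embed the question, and stuff the closest chunk into the prompt. The vectors below are toy stand-ins for what a real embedding model (e.g. one served by Ollama) would return, and the document text is made up for illustration.

```python
import math

def cosine(a, b):
    # Cosine similarity between two equal-length vectors.
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb)

# Toy "embeddings" standing in for real embedding-model output.
docs = {
    "Invoices are due within 30 days.": [0.9, 0.1, 0.2],
    "The server room is on floor 3.":   [0.1, 0.8, 0.3],
}

def retrieve(query_vec, top_k=1):
    # Rank stored chunks by similarity to the query embedding.
    ranked = sorted(docs, key=lambda d: cosine(query_vec, docs[d]), reverse=True)
    return ranked[:top_k]

# A query embedding close to the first document's vector.
context = retrieve([0.85, 0.15, 0.25])[0]
prompt = f"Answer using this context:\n{context}\n\nQuestion: When are invoices due?"
print(context)  # -> Invoices are due within 30 days.
```

In a real setup the embeddings come from a model and the final prompt goes to the LLM; the retrieval logic itself is this simple, it's the embedding model and the vector store that eat the extra RAM.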
Also, to @projectmoon@lemm.ee, you might want to wait and see what gets announced at Computex next month. Hopefully they announce some new stuff and the current gen prices drop.
One thing to keep in mind about adding RAM: your speed could drop depending on how many slots you populate. For me, I have a 5700G, and with 2x16GB it runs at 3200MHz, but with 4x16GB (the same exact product) it only runs at 1800MHz. In my case, RAM speed has a huge effect on tokens/sec if a model has to use some system RAM.
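The reason RAM speed matters so much: LLM inference is usually memory-bound, so a rough upper bound on tokens/sec is memory bandwidth divided by the bytes read per token (roughly the model size). Here's a back-of-the-envelope sketch; the model size and the dual-channel DDR4 figures are my own ballpark assumptions, not measurements.

```python
def max_tokens_per_sec(bandwidth_gb_s, model_size_gb):
    # Memory-bound inference reads roughly the whole model once per token,
    # so bandwidth / model size gives a rough tokens/sec ceiling.
    return bandwidth_gb_s / model_size_gb

# Dual-channel DDR4: 2 channels * 8 bytes per transfer * transfer rate (MT/s).
ddr4_3200 = 2 * 8 * 3200 / 1000  # ~51.2 GB/s
ddr4_1800 = 2 * 8 * 1800 / 1000  # ~28.8 GB/s

model = 4.0  # assumed ~4 GB, e.g. a 7B model at 4-bit quantization

print(max_tokens_per_sec(ddr4_3200, model))  # ~12.8 tok/s ceiling
print(max_tokens_per_sec(ddr4_1800, model))  # ~7.2 tok/s ceiling
```

So dropping from 3200 to 1800 nearly halves the ceiling, which matches my experience whenever a model spills out of VRAM.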
You can check AMD's spec page for your processor, but they don't really document a lot of this stuff.
I guess "It’s not for everyone" is the real takeaway here. I'm not a phone guy in general, but I've been using cards since BK was still selling 99¢ Whoppers. I'm guessing both of us are ready to pay before the cashier has our order rung up.
To each their own. (I'm finally admitting that I'm fighting a losing battle on writing checks though.)
Have you tried the guide on AMD's site? It looks like it's for Windows, and I don't know what you're running. Plus, I use Ollama, so I probably can't be of much help.
For programming, my favorite is Dolphin-Mixtral, but I've had good results with Dolphin-Mistral and Llama2.
Same here. I guess I should have pointed out that I'm not really much of a phone guy to begin with. I don't install many apps, and I stay logged out of Google. To me, losing a phone really just means losing my pictures and videos. The most expensive phone I've ever had was $200.
Using a phone sounds inconvenient to me. I usually just pull my card out of my wallet, wave it over the terminal until I hear a beep, and that's it. Worst case scenario, I have to insert it into the chip reader or, God forbid, swipe it through the slot like some kind of Neanderthal.
I'm kidding, but seriously, that's easier than screwing around with a phone, to me.
I'm slightly pissed about the shrinkage, but really pissed that they don't come in packages anymore (at least not where I live). It's bad enough I have to scan my own groceries, but now I have to scan 8-12 individual eggs in a row. What's next?
I pirated a certain 'crash cars and shoot'em up' game because, even though I own it on Steam, the gameplay (especially the launcher) absolutely sucks.
No more automatically downloading online content when I don't even play online and no more updates breaking my mods. It's worked out so well that I'm looking at pirating other games I already own.
I really hope those patches make their way into the other distros. I've got a few Linux machines and the Steam Deck is the only one that wakes from sleep without locking up. It's also the only one that allocates VRAM for the iGPU automatically when a game needs more.