Can you fine-tune on localized steering of an LLM?
lynx @sh.itjust.works · Posts: 1 · Comments: 20 · Joined: 2 yr. ago
I don't know what you mean by steering, but here are a few options.
First of all, have you tried giving the model multiple input/output example pairs in the context? This alone already helps the model a lot with producing the correct format.
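As a minimal sketch of that few-shot approach (the example pairs and the JSON label format here are made up for illustration):

```python
def build_prompt(examples, query):
    """Assemble a few-shot prompt from (input, output) example pairs plus the new query."""
    parts = [f"Input: {inp}\nOutput: {out}" for inp, out in examples]
    # End with the new input and a bare "Output:" so the model completes it
    # in the same format as the examples.
    parts.append(f"Input: {query}\nOutput:")
    return "\n\n".join(parts)

examples = [
    ("The movie was great", '{"sentiment": "positive"}'),
    ("Terrible service", '{"sentiment": "negative"}'),
]
prompt = build_prompt(examples, "Not bad at all")
print(prompt)
```

You then send `prompt` as the model's input; with two or three consistent examples, most models will stick to the demonstrated output shape.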
Second, you can force a specific output structure by using a regex or a grammar: https://python.langchain.com/docs/integrations/chat/outlines/#constrained-generation https://github.com/ggerganov/llama.cpp/blob/master/grammars/README.md
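For the llama.cpp route, grammars are written in GBNF (described in the README linked above). A tiny example that constrains the model to answer only "yes" or "no":

```
# GBNF grammar for llama.cpp: the model can only emit "yes" or "no"
root ::= answer
answer ::= "yes" | "no"
```

You pass a grammar file like this via llama.cpp's grammar option, and decoding will only follow token sequences the grammar allows.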
And third, in case you want to train the model to respond differently and the previous steps were not good enough, you can fine-tune. I can recommend this project, as it teaches how to fine-tune a model: https://github.com/huggingface/smol-course
Depending on the size of the model you want to fine-tune and the amount of compute you have available, you can either do a full fine-tune that updates all parameters (for example with a method like ORPO) or train via PEFT (e.g. LoRA).