Post-training LLMs

Post-training / Fine-tuning of Large Language Models

SFT, PEFT, LoRA, RLHF, PPO, DPO