SLIME is a novel reference-free alignment objective for large language models that decouples preference learning from generation quality by combining three components: likelihood maximization, probability stabilization, and dual-margin constraints.
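The summary names the three components but not their exact form. The sketch below is purely speculative: `slime_style_loss`, its hyperparameters (`lower_margin`, `upper_margin`, `lam_stab`), and the use of initialization-time log-probs as the stabilization anchor are all assumptions, not the paper's definition. It only illustrates how a reference-free objective could combine an NLL term, a stabilization penalty, and two hinge margins on the chosen/rejected gap.

```python
import math

def slime_style_loss(logp_chosen, logp_rejected, logp_chosen_init,
                     lower_margin=1.0, upper_margin=5.0, lam_stab=0.1):
    """Speculative three-term preference loss (hypothetical form, not the paper's).

    logp_chosen / logp_rejected: per-example sequence log-probs under the policy.
    logp_chosen_init: the same log-probs at initialization, used only to anchor
    the stabilization term -- no reference model is queried (reference-free).
    """
    n = len(logp_chosen)
    # 1) Likelihood maximization: plain NLL on the preferred responses,
    #    intended to preserve generation quality.
    nll = -sum(logp_chosen) / n
    # 2) Probability stabilization (assumed form): quadratic penalty keeping
    #    chosen log-probs from drifting far from their initial values.
    stab = sum((lp - lp0) ** 2
               for lp, lp0 in zip(logp_chosen, logp_chosen_init)) / n
    # 3) Dual-margin constraints (assumed form): the chosen/rejected gap must
    #    exceed a lower margin, and is also penalized past an upper margin.
    gaps = [c - r for c, r in zip(logp_chosen, logp_rejected)]
    lower = sum(max(0.0, lower_margin - g) for g in gaps) / n
    upper = sum(max(0.0, g - upper_margin) for g in gaps) / n
    return nll + lam_stab * stab + lower + upper

# Example: one gap (2.0) already clears the lower margin, the other (0.5) does
# not, so only the second contributes a hinge penalty.
loss = slime_style_loss([-1.0, -2.0], [-3.0, -2.5], [-1.0, -2.0])
```

With these inputs the loss is 1.5 (NLL) + 0.25 (lower-margin hinge on the 0.5 gap) = 1.75; the stabilization and upper-margin terms are zero.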
Selective knowledge distillation in autoregressive language models using student-entropy-guided position selection improves accuracy and efficiency while reducing memory and storage requirements.
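The selection criterion is described only as "student-entropy-guided", so the following is a hedged sketch of one plausible instantiation: compute the student's per-position predictive entropy, keep the top-k highest-entropy positions, and apply a distillation KL only there. The function names (`select_positions`, `kd_loss_at_positions`), the top-k rule, and the forward-KL choice are illustrative assumptions, not the paper's specification.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def entropy(probs):
    return -sum(p * math.log(p) for p in probs if p > 0)

def select_positions(student_logits, k):
    """Assumed rule: keep the k positions where the student is most uncertain.

    student_logits: [seq_len][vocab_size] raw logits from the student.
    Returns selected position indices in sequence order.
    """
    ents = [entropy(softmax(row)) for row in student_logits]
    top = sorted(range(len(ents)), key=lambda i: ents[i], reverse=True)[:k]
    return sorted(top)

def kd_loss_at_positions(student_logits, teacher_logits, positions):
    """Forward KL(teacher || student), averaged over the selected positions only.

    Skipping unselected positions is what saves compute, and teacher logits
    need only be stored for selected positions, reducing memory and storage.
    """
    total = 0.0
    for i in positions:
        t = softmax(teacher_logits[i])
        s = softmax(student_logits[i])
        total += sum(tp * (math.log(tp) - math.log(sp))
                     for tp, sp in zip(t, s) if tp > 0)
    return total / len(positions)

# Toy example: position 0 is near-uniform (high entropy), position 1 is sharply
# peaked (low entropy), position 2 is mildly peaked. Top-2 selection keeps 0 and 2.
student = [[0.0, 0.0, 0.0], [10.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
teacher = [[0.5, 0.0, 0.0], [9.0, 0.0, 0.0], [1.0, 0.5, 0.0]]
positions = select_positions(student, k=2)
loss = kd_loss_at_positions(student, teacher, positions)
```

A design note on the assumed criterion: high student entropy marks positions where the student has the most to learn from the teacher, while confident positions are skipped, which is where the claimed efficiency gains would come from.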