29 articles in total
2025
Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
Instruction Tuning with Loss Over Instructions
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting
Different Designs for LLM KD Loss (II)
Importance-Aware Data Selection for Efficient LLM Instruction Tuning
Training-Inference Mismatch in LLM KD (II)
From Correction to Mastery: Reinforced Distillation of Large Language Model Agents
Merge-of-Thought Distillation
Delta Knowledge Distillation for Large Language Models