Study Notes
Distilling the Essence: Efficient Reasoning Distillation via Sequence Truncation
Instruction Tuning With Loss Over Instructions
Selective Reflection-Tuning: Student-Selected Data Recycling for LLM Instruction-Tuning
Retaining by Doing: The Role of On-Policy Data in Mitigating Forgetting
Proximal Gradient and Subgradients
Importance-Aware Data Selection for Efficient LLM Instruction Tuning
Scaling Up Dataset Distillation to ImageNet-1K with Constant Memory
Squeeze, Recover and Relabel: Dataset Condensation at ImageNet Scale From A New Perspective
From Correction to Mastery: Reinforced Distillation of Large Language Model Agents
Merge-of-Thought Distillation