6 articles in total
2025
SVD-LLM: TRUNCATION-AWARE SINGULAR VALUE DECOMPOSITION FOR LARGE LANGUAGE MODEL COMPRESSION
Training-Inference Mismatch In LLM KD
Dual-Space Knowledge Distillation for Large Language Models
Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation
NOT ALL LLM-GENERATED DATA ARE EQUAL: RETHINKING DATA WEIGHTING IN TEXT CLASSIFICATION
Different Designs For LLM KD Loss