共计 16 篇文章
2025
ASVD: ACTIVATION-AWARE SINGULAR VALUE DECOMPOSITION FOR COMPRESSING LARGE LANGUAGE MODELS
Dual-Space Knowledge Distillation for Large Language Models
Why Exposure Bias Matters: An Imitation Learning Perspective of Error Accumulation in Language Generation
NOT ALL LLM-GENERATED DATA ARE EQUAL: RETHINKING DATA WEIGHTING IN TEXT CLASSIFICATION
关于浮点数存储精度
2024
LightGCN:简化和增强图卷积网络的推荐