Structure-Aware Low-Rank Adaptation for Parameter-Efficient Fine-Tuning

Cited by: 3
Authors
Hu, Yahao [1]
Xie, Yifei [1]
Wang, Tianfeng [1]
Chen, Man [1]
Pan, Zhisong [1]
Affiliations
[1] Army Engineering University of PLA, Command & Control Engineering College, Nanjing 210007, People's Republic of China
Funding
National Natural Science Foundation of China
Keywords
pre-trained language models; parameter-efficient fine-tuning; low-rank adaptation; intrinsic rank; training efficiency;
DOI
10.3390/math11204317
Chinese Library Classification
O1 [Mathematics]
Subject Classification Codes
0701; 070101
Abstract
With the growing scale of pre-trained language models (PLMs), full-parameter fine-tuning becomes prohibitively expensive and practically infeasible. Therefore, parameter-efficient adaptation techniques for PLMs have been proposed that learn incremental updates to the pre-trained weights, such as low-rank adaptation (LoRA). However, LoRA relies on heuristics to select the modules and layers to which it is applied, and it assigns them all the same rank. As a consequence, any fine-tuning that ignores the structural information between modules and layers is suboptimal. In this work, we propose structure-aware low-rank adaptation (SaLoRA), which adaptively learns the intrinsic rank of each incremental matrix by removing rank-0 components during training. We conduct comprehensive experiments using pre-trained models of different scales in both task-oriented (GLUE) and task-agnostic (Yelp and GYAFC) settings. The experimental results show that SaLoRA effectively captures the structure-aware intrinsic rank. Moreover, our method consistently outperforms LoRA without significantly compromising training efficiency.
Pages: 16
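The abstract describes the core mechanism only at a high level. As a rough illustrative sketch of the general idea (not the authors' implementation, whose details are not given in this record), the following PyTorch snippet augments a frozen linear layer with a LoRA-style update whose rank-1 components are individually gated: components whose gates are driven to zero during training are effectively removed, and the number of surviving gates plays the role of the learned intrinsic rank. The class, parameter, and method names (GatedLoRALinear, gate, intrinsic_rank) are assumptions introduced here for illustration.

```python
import torch
import torch.nn as nn


class GatedLoRALinear(nn.Module):
    """Illustrative LoRA-style layer with one gate per rank component.

    The frozen weight W is augmented with a low-rank update
    B @ diag(g) @ A. Gates g that reach zero during training remove
    the corresponding rank-1 components, so the surviving gates
    determine the learned ("intrinsic") rank of the update.
    """

    def __init__(self, in_features: int, out_features: int, r: int = 8, alpha: float = 16.0):
        super().__init__()
        # Frozen stand-in for the pre-trained weight matrix.
        self.weight = nn.Parameter(torch.empty(out_features, in_features), requires_grad=False)
        nn.init.normal_(self.weight, std=0.02)

        # Trainable low-rank factors and per-rank gates.
        self.lora_A = nn.Parameter(torch.randn(r, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, r))
        self.gate = nn.Parameter(torch.ones(r))
        self.scaling = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        base = x @ self.weight.t()
        # Project to rank space, gate each component, project back.
        delta = (x @ self.lora_A.t()) * self.gate
        delta = delta @ self.lora_B.t()
        return base + self.scaling * delta

    def intrinsic_rank(self, tol: float = 1e-3) -> int:
        # Number of rank components whose gate has not collapsed to zero.
        return int((self.gate.abs() > tol).sum())
```

In practice, a sparsity-inducing penalty on the gates (e.g., an L0- or L1-style term added to the training loss) would be needed to push redundant components to zero; the snippet omits that objective for brevity.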