
Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge

Research output: Papers in Conference Proceedings › Conference paper (refereed) › peer-review

Abstract

Large language models (LLMs) have achieved remarkable success in tasks such as decision-making, reasoning, and question answering, and they are increasingly deployed on edge devices. However, fine-tuning LLMs for specific tasks at the edge is challenging because of the high computational cost and the limited storage and energy resources available there. To address this issue, we propose TaskEdge, a task-aware parameter-efficient fine-tuning framework for the edge, which allocates the most effective parameters to the target task and updates only those task-specific parameters. Specifically, we first design a parameter importance criterion that incorporates both weights and input activations into the computation of weight importance. We then propose a model-agnostic task-specific parameter allocation algorithm that distributes the task-specific parameters evenly across the model rather than concentrating them in specific regions. In doing so, TaskEdge significantly reduces computational cost and memory usage while maintaining performance on the target downstream tasks, updating less than 0.1% of the parameters. In addition, TaskEdge can be integrated with structured sparsity to enable acceleration on NVIDIA's specialized sparse tensor cores, and it combines seamlessly with LoRA for efficient sparse low-rank adaptation. Extensive experiments on various tasks demonstrate the effectiveness of TaskEdge.
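To make the abstract's two main ideas concrete, the sketch below illustrates (a) a weight-and-activation importance score and (b) an even, per-layer allocation of a small trainable-parameter budget. This is a minimal, hypothetical PyTorch sketch, not the paper's released implementation: it assumes a Wanda-style score of |weight| × input-activation norm, and the function names (`importance_scores`, `build_task_mask`, `masked_sgd_step`) and the 0.1% per-layer budget are illustrative assumptions.

```python
# Hypothetical sketch (not the authors' code): a plausible weight-and-activation
# importance score and an even, per-layer allocation of task-specific parameters.

import torch


def importance_scores(weight: torch.Tensor, act: torch.Tensor) -> torch.Tensor:
    """Score each weight by |W_ij| * ||X_j||_2, i.e. weight magnitude scaled by
    the norm of the input-activation feature it multiplies (Wanda-style score,
    assumed here as one way to combine weights and input activations)."""
    # act: (num_tokens, in_features) calibration activations for this layer
    act_norm = act.norm(p=2, dim=0)              # (in_features,)
    return weight.abs() * act_norm.unsqueeze(0)  # (out_features, in_features)


def build_task_mask(weight: torch.Tensor, act: torch.Tensor,
                    budget_ratio: float = 0.001) -> torch.Tensor:
    """Mark the top `budget_ratio` fraction of weights in *this* layer as
    trainable. Applying the same ratio to every layer keeps the selected
    parameters spread evenly across the model instead of clustering."""
    scores = importance_scores(weight, act)
    k = max(1, int(budget_ratio * weight.numel()))
    threshold = scores.flatten().topk(k).values.min()
    return scores >= threshold                   # boolean mask, True = trainable


def masked_sgd_step(weight: torch.Tensor, mask: torch.Tensor, lr: float = 1e-4):
    """Update only the selected task-specific weights; all others stay frozen."""
    with torch.no_grad():
        weight -= lr * weight.grad * mask


# Example: score and mask one linear layer with a stand-in calibration batch.
if __name__ == "__main__":
    layer = torch.nn.Linear(512, 512)
    calib_acts = torch.randn(128, 512)
    mask = build_task_mask(layer.weight.data, calib_acts, budget_ratio=0.001)
    print(f"trainable fraction: {mask.float().mean().item():.4%}")
```

Under these assumptions, restricting the top-k selection to each layer independently is what keeps the trainable parameters distributed across the whole model; a global top-k over all layers could instead concentrate them in a few regions.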
Original language: English
Title of host publication: 2025 IEEE Global Communications Conference, GLOBECOM 2025: Proceedings
Publisher: IEEE
Pages: 1035-1040
Number of pages: 6
ISBN (Electronic): 9798331577810
ISBN (Print): 9798331577827
DOIs
Publication status: Published - Dec 2025
Event: The 2025 IEEE Global Communications Conference - Taipei, Taiwan, China
Duration: 8 Dec 2025 - 12 Dec 2025

Publication series

Name: IEEE Conference on Global Communications
Publisher: IEEE
ISSN (Print): 1930-529X
ISSN (Electronic): 2576-6813

Conference

Conference: The 2025 IEEE Global Communications Conference
Abbreviated title: GLOBECOM 2025
Country/Territory: Taiwan, China
City: Taipei
Period: 8/12/25 - 12/12/25

Funding

The research work described in this paper was conducted in the JC STEM Lab of Smart City, funded by The Hong Kong Jockey Club Charities Trust under Contract 2023-0108. The work was supported in part by the Hong Kong SAR Government under the Global STEM Professorship and Research Talent Hub. The work of S. Hu was supported in part by the Hong Kong Innovation and Technology Commission under InnoHK Project CIMDA. The work of Y. Deng was supported in part by the National Natural Science Foundation of China under Grant No. 62301300.

Keywords

  • Large Language Models
  • Large Pre-Trained Models
  • Parameter-Efficient Fine-Tuning
  • Edge Computing

