Abstract
Large language models (LLMs) have achieved remarkable success in various tasks, such as decision-making, reasoning, and question answering, and are increasingly deployed on edge devices. However, fine-tuning LLMs for specific tasks at the edge is challenging due to the high computational cost and the limited storage and energy resources of edge devices. To address this issue, we propose TaskEdge, a task-aware parameter-efficient fine-tuning framework for the edge, which allocates the most effective parameters to the target task and updates only those task-specific parameters. Specifically, we first design a parameter importance criterion that incorporates both weights and input activations into the computation of weight importance. Then, we propose a model-agnostic task-specific parameter allocation algorithm that distributes task-specific parameters evenly across the model, rather than concentrating them in specific regions. In doing so, TaskEdge significantly reduces computational cost and memory usage while maintaining performance on the target downstream tasks, updating less than 0.1% of the parameters. In addition, TaskEdge can be easily integrated with structured sparsity to enable acceleration by NVIDIA's specialized sparse tensor cores, and it can be seamlessly combined with LoRA to enable efficient sparse low-rank adaptation. Extensive experiments on various tasks demonstrate the effectiveness of TaskEdge.
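The abstract does not give the exact formulas, but the two core ideas — an importance score combining weight magnitudes with input activations, and a block-wise selection that spreads trainable parameters evenly across the model — can be illustrated with a minimal sketch. The scoring below follows the common |weight| × activation-norm pattern, and the even allocation is modeled as per-block top-k selection; the function names, the blocking scheme, and all details are assumptions for illustration, not the paper's actual algorithm.

```python
import math

def importance_scores(W, X):
    """Score each weight by |w| scaled by the L2 norm of its input
    activation channel. W: rows of a (out x in) weight matrix;
    X: rows of observed input activations (tokens x in)."""
    n_in = len(W[0])
    act_norm = [math.sqrt(sum(x[j] ** 2 for x in X)) for j in range(n_in)]
    return [[abs(w) * act_norm[j] for j, w in enumerate(row)] for row in W]

def even_topk_mask(scores, budget_ratio, num_blocks):
    """Select the top-k scoring weights within each row-block so the
    trainable-parameter budget is spread evenly across the matrix,
    instead of clustering in one region."""
    n_rows, n_cols = len(scores), len(scores[0])
    rows_per_block = n_rows // num_blocks
    k_per_block = int(budget_ratio * n_rows * n_cols) // num_blocks
    mask = [[False] * n_cols for _ in range(n_rows)]
    for b in range(num_blocks):
        # Rank all entries of this block and keep the k_per_block best.
        entries = [(scores[i][j], i, j)
                   for i in range(b * rows_per_block, (b + 1) * rows_per_block)
                   for j in range(n_cols)]
        entries.sort(reverse=True)
        for _, i, j in entries[:k_per_block]:
            mask[i][j] = True
    return mask
```

During fine-tuning, only the weights where the mask is `True` would receive gradient updates (e.g., by zeroing the gradients of all other entries), which is how a sub-0.1% update budget keeps memory and compute low.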
| Original language | English |
|---|---|
| Title of host publication | 2025 IEEE Global Communications Conference, GLOBECOM 2025: Proceedings |
| Publisher | IEEE |
| Pages | 1035-1040 |
| Number of pages | 6 |
| ISBN (Electronic) | 9798331577810 |
| ISBN (Print) | 9798331577827 |
| DOIs | |
| Publication status | Published - Dec 2025 |
| Event | The 2025 IEEE Global Communications Conference, Taipei, Taiwan, China. Duration: 8 Dec 2025 → 12 Dec 2025 |
Publication series
| Name | IEEE Conference on Global Communications |
|---|---|
| Publisher | IEEE |
| ISSN (Print) | 1930-529X |
| ISSN (Electronic) | 2576-6813 |
Conference
| Conference | The 2025 IEEE Global Communications Conference |
|---|---|
| Abbreviated title | GLOBECOM 2025 |
| Country/Territory | Taiwan, China |
| City | Taipei |
| Period | 8/12/25 → 12/12/25 |
Funding
The research work described in this paper was conducted in the JC STEM Lab of Smart City funded by The Hong Kong Jockey Club Charities Trust under Contract 2023-0108. The work was supported in part by the Hong Kong SAR Government under the Global STEM Professorship and Research Talent Hub. The work of S. Hu was supported in part by the Hong Kong Innovation and Technology Commission under InnoHK Project CIMDA. The work of Y. Deng was supported in part by the National Natural Science Foundation of China under Grant No. 62301300.
Keywords
- Large Language Models
- Large Pre-Trained Models
- Parameter-Efficient Fine-Tuning
- Edge Computing
Title: Task-Aware Parameter-Efficient Fine-Tuning of Large Pre-Trained Models at the Edge