Abstract
Medical image segmentation is fundamental to computer-assisted diagnosis but faces challenges across diverse imaging modalities. Linear attention mechanisms succeed on natural images but are limited in medical segmentation because they model spatial dependencies and tissue heterogeneity insufficiently. Research indicates that successful dense prediction requires balanced permutation variance, strong inductive capabilities, and precise absolute position information. Current linear attention approaches satisfy the first two requirements but critically lack the third, which significantly impairs medical segmentation, where spatial localization is essential. To address these limitations, we propose U-MLLA, which integrates U-Net with Mamba-like linear attention (MLLA) to capture multiscale features and context. We further introduce complementary conditional and absolute positional encoding (APE) to compensate for the position information deficit in linear attention. Experiments show that U-MLLA provides robust features and that the complementary strategies significantly improve multi-organ and tumor segmentation. APE particularly excels with complex structures requiring precise boundary delineation. This cognitively inspired architecture adapts 93% of ImageNet-1k weights and increases effectiveness. In comprehensive evaluations across six challenging datasets (e.g., FLARE22, AMOS22CT/MR, ACDC) and 24 tasks, U-MLLA achieves state-of-the-art performance with an average DSC of 88.32%, outperforming nnUNetV2-2D and SwinUNetR by 4.37% and 1.98%, respectively. These results highlight U-MLLA’s potential for clinical applications that require precise anatomical delineation, where APE is essential for maintaining spatial context and differentiating similar structures. The code is available at https://github.com/csyfjiang/U-MLLA.
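The claim that linear attention lacks absolute position information can be illustrated with a toy example. The sketch below is not the authors' MLLA block: it assumes a generic kernelized linear attention with an `elu(x) + 1` feature map, identity query/key/value projections, and a hypothetical fixed `position_table` standing in for a learned absolute positional encoding. Without position information, two identical patches at different locations receive identical output features; adding an absolute position vector to each token makes them distinguishable.

```python
import math


def feature_map(x):
    # elu(x) + 1, a common positive feature map for kernelized linear attention.
    return x + 1.0 if x > 0 else math.exp(x)


def linear_attention(tokens):
    """Linear attention with Q = K = V = tokens (assumption: no learned projections)."""
    d = len(tokens[0])
    phi = [[feature_map(x) for x in tok] for tok in tokens]
    # Global summaries shared by every query:
    #   kv[i][j] = sum_n phi(k_n)[i] * v_n[j],  ksum[i] = sum_n phi(k_n)[i]
    kv = [[sum(p[i] * t[j] for p, t in zip(phi, tokens)) for j in range(d)]
          for i in range(d)]
    ksum = [sum(p[i] for p in phi) for i in range(d)]
    outputs = []
    for q in phi:
        norm = sum(q[i] * ksum[i] for i in range(d))
        outputs.append([sum(q[i] * kv[i][j] for i in range(d)) / norm
                        for j in range(d)])
    return outputs


def add_absolute_positions(tokens, table):
    # Absolute positional encoding as an element-wise sum of token and position vector.
    return [[t + p for t, p in zip(tok, pos)] for tok, pos in zip(tokens, table)]


# Two identical patches (positions 0 and 1) and one different patch (position 2).
patch_a = [1.0, 0.0]
patch_b = [0.0, 1.0]
tokens = [patch_a, patch_a, patch_b]

# Hypothetical fixed position vectors; a real model would typically learn these.
position_table = [[0.0, 0.0], [0.5, 0.0], [1.0, 0.0]]

plain = linear_attention(tokens)
with_ape = linear_attention(add_absolute_positions(tokens, position_table))

print("identical without positions:", plain[0] == plain[1])
print("identical with APE:", with_ape[0] == with_ape[1])
```

In this toy setting the output for a token depends only on its content and on summaries pooled over all tokens, so location never enters the computation unless it is injected into the tokens themselves.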
| Original language | English |
|---|---|
| Title of host publication | Pattern Recognition and Computer Vision - 8th Chinese Conference, PRCV 2025, Proceedings |
| Editors | Josef KITTLER, Hongkai XIONG, Weiyao LIN, Jian YANG, Xilin CHEN, Jiwen LU, Jingyi YU, Weishi ZHENG |
| Publisher | Springer Science and Business Media Deutschland GmbH |
| Pages | 76-90 |
| Number of pages | 15 |
| ISBN (Electronic) | 9789819556342 |
| ISBN (Print) | 9789819556335 |
| Publication status | Published - 21 Jan 2026 |
| Event | 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 - Shanghai, China. Duration: 15 Oct 2025 → 18 Oct 2025 |
Publication series
| Name | Lecture Notes in Computer Science |
|---|---|
| Volume | 16284 LNCS |
| ISSN (Print) | 0302-9743 |
| ISSN (Electronic) | 1611-3349 |
Conference
| Conference | 8th Chinese Conference on Pattern Recognition and Computer Vision, PRCV 2025 |
|---|---|
| Country/Territory | China |
| City | Shanghai |
| Period | 15/10/25 → 18/10/25 |
Bibliographical note
Publisher Copyright: © The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd. 2026.
Funding
This work was supported by the General Research Fund (GRF 15104323) and the RGC Theme-based Research Scheme (project no. T45-401/22-N). Additional support was provided by Lingnan University through the Faculty Research Grant (No. SDS24A2, SDS24A8, SDS24A12, and SDS24A19), Direct Grant (DR25E8), and Lam Woo Research Fund (LWP20040), as well as by the Hong Kong Research Grants Council through the Faculty Development Scheme (Project No. UGC/FDS16/E10/23).
Keywords
- Linear Attention in Vision
- Medical Image Segmentation
- Position Encoding
- Semantic Segmentation
- UNet
Fingerprint
Dive into the research topics of 'U-MLLA: A Cognitive-Inspired Enhancement of Linear Attention for Medical Image Segmentation'. Together they form a unique fingerprint.
- Hierarchical Evaluation Framework for AI Scientists: Beyond Surface Metrics to Deep Scientific Understanding
  LI, Z. (PI)
  1/08/25 → 31/07/26
  Project: Grant Research
- An Integrated Fake Financial News Detection Framework: Knowledge Graph, Large Language Models, Uncertainty Modeling, and Contrastive Learning
  XIE, H. (PI)
  1/07/25 → 30/06/27
  Project: Grant Research
- Scalable Sentence Representation with Mixture-of-Experts and Dynamic Routing
  LI, Z. (PI), CHEN, X. (CoI) & WANG, W. (CoI)
  1/07/25 → 30/06/28
  Project: Grant Research