Multi-Scale Feature Compression via Multi-Receptive-Field Convolutional Neural Network for Machine Vision

  • Zhaoqing PAN*
  • Haihang WANG
  • Tiesong ZHAO
  • Haoran XIE
  • Sam KWONG
  • *Corresponding author for this work

Research output: Journal Publications, Journal Article (refereed), peer-review

Abstract

Multi-scale feature compression is essential in machine vision tasks for reducing storage and transmission costs while maintaining task performance. However, existing multi-scale feature compression methods fail to effectively extract and aggregate the local and global correlations of multi-scale features, resulting in incomplete elimination of feature redundancies. Moreover, these multi-scale feature compression methods mainly rely on mean square error-based loss functions to optimize signal fidelity, but they fail to adequately preserve semantic information critical to machine vision tasks. To address these issues, a Multi-receptive-field Convolutional Neural Network (MCNN)-based multi-scale feature compression method is proposed in this paper, which not only achieves compact fusion of multi-scale features but also enhances the semantic fidelity of reconstructed features. To effectively eliminate feature redundancies, a multi-receptive-field-based feature fusion module is designed for capturing both local and global correlations in multi-scale features. To enhance the quality of the reconstructed features, a cosine similarity-based multi-fidelity loss function is developed by considering both signal and semantic fidelity. Extensive experiments on the object detection and instance segmentation tasks show that the proposed MCNN outperforms the state-of-the-art multi-scale feature compression methods in terms of compression efficiency.
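
The abstract does not state the loss formula, so the following is only a minimal sketch of how a loss that weighs both signal fidelity and semantic fidelity might be assembled: a mean squared error term plus a (1 - cosine similarity) term, summed over feature scales. The function names, the weights `alpha` and `beta`, and the way the two terms are combined are assumptions made for illustration, not the paper's formulation.

```python
import numpy as np


def multi_fidelity_loss(original, reconstructed, alpha=1.0, beta=1.0, eps=1e-8):
    """Hypothetical loss: alpha * MSE + beta * (1 - cosine similarity).

    `alpha`, `beta`, and the additive form are illustrative assumptions.
    """
    x = np.asarray(original, dtype=np.float64).ravel()
    y = np.asarray(reconstructed, dtype=np.float64).ravel()

    # Signal fidelity: mean squared error between the two feature tensors.
    mse = np.mean((x - y) ** 2)

    # Semantic fidelity proxy: cosine similarity of the flattened features.
    # It depends on direction only, so a pure rescaling is not penalized here.
    cos = np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y) + eps)

    return alpha * mse + beta * (1.0 - cos)


def multi_scale_loss(originals, reconstructions, alpha=1.0, beta=1.0):
    """Sum the per-scale loss over a list of feature maps (one per scale)."""
    return sum(
        multi_fidelity_loss(o, r, alpha, beta)
        for o, r in zip(originals, reconstructions)
    )


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Three hypothetical feature scales with shape (channels, height, width).
    feats = [rng.standard_normal((4, s, s)) for s in (32, 16, 8)]
    noisy = [f + 0.1 * rng.standard_normal(f.shape) for f in feats]
    print(multi_scale_loss(feats, noisy))
```

In this sketch the two terms respond to different kinds of error: the MSE term grows with any element-wise deviation, while the cosine term grows only when the direction of the feature vector changes.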

Original language: English
Pages (from-to): 1-12
Number of pages: 12
Journal: IEEE Transactions on Multimedia
Publication status: E-pub ahead of print - 27 Feb 2026

Bibliographical note

Publisher Copyright:
© 1999-2012 IEEE.

Keywords

  • cosine similarity
  • Machine vision
  • multi-receptive-field convolutional neural network
  • multi-scale feature compression
