
Assessing AI-Generated Image Quality Using a Cross-Modal Hierarchical Perception Network

  • Zhaoqing PAN*
  • Yi YANG
  • Feng YUAN
  • Haoran XIE
  • Fu Lee WANG
  • Sam KWONG

*Corresponding author for this work

Research output: Journal Publications, Journal Article (refereed), peer-review

Abstract

AI-Generated Images (AGIs) are increasingly used in various multimedia applications, making it essential to accurately assess the quality of AGIs to enhance user experience and optimize generative models. However, existing AI-Generated Image Quality Assessment (AGIQA) methods struggle to align fine-grained cross-modal semantics or capture diverse quality factors across multiple perceptual levels, limiting their effectiveness. To address these limitations, a Cross-modal Hierarchical Perception Network (CHPNet) is proposed for AGIQA, which simulates the hierarchical visual perception and adaptive decision-making mechanisms of the human brain. The proposed CHPNet comprises two key components: a Multi-level Cross-modal Interaction Network (MCINet) and an Adaptive Hierarchical Scoring Network (AHSNet). The MCINet is designed to generate multi-level quality-aware features by aligning and fusing visual and textual features at multiple semantic levels. To enhance semantic alignment, a Cross-modal Bidirectional Semantic Alignment Module (CBSAM) is built to improve the quality-aware feature extraction of MCINet by mitigating the semantic gap between cross-modal features. The AHSNet is developed to adaptively evaluate the importance of each perceptual level and assign importance-based weights to compute the final quality score. Extensive experiments on three AGIQA databases have demonstrated the effectiveness of the proposed CHPNet. The code of the proposed CHPNet is released at https://github.com/NUIST-Videocoding/CHPNet.git
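
The importance-based weighting attributed to AHSNet above can be illustrated with a minimal sketch. The abstract does not specify how the importance values are produced or normalized, so the softmax normalization, the function names `softmax` and `fuse_level_scores`, and the sample numbers below are all assumptions made for illustration, not the authors' implementation (see the released code for that).

```python
import math


def softmax(values):
    """Normalize raw importance values into weights that sum to 1."""
    peak = max(values)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_level_scores(level_scores, importance_logits):
    """Combine per-level quality scores into one final score.

    level_scores: one quality estimate per perceptual level (hypothetical).
    importance_logits: one raw importance value per level (hypothetical;
        in a learned model these would be predicted from the features).
    """
    if len(level_scores) != len(importance_logits):
        raise ValueError("need exactly one importance value per level")
    weights = softmax(importance_logits)  # assumed normalization
    final_score = sum(w * s for w, s in zip(weights, level_scores))
    return final_score, weights


# Three hypothetical perceptual levels with equal importance:
# the final score reduces to the plain average of the level scores.
score, weights = fuse_level_scores([3.0, 4.0, 5.0], [0.0, 0.0, 0.0])
```

Raising the importance value of one level shifts the final score toward that level's score, which is the behavior the abstract describes in words.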

Original language: English
Pages (from-to): 291-302
Number of pages: 12
Journal: IEEE Transactions on Broadcasting
Volume: 72
Issue number: 1
Early online date: 28 Oct 2025
Publication status: Published - Mar 2026

Bibliographical note

Publisher Copyright:
© 1963-12012 IEEE.

Funding

This work was supported by the National Natural Science Foundation of China under Grant 62322116.

Keywords

  • AI-generated images
  • quality assessment
  • cross-modal hierarchical perception network
  • multi-level cross-modal interaction network
  • adaptive hierarchical scoring network
