Two-Dimensional Data Partitioning for Non-negative Matrix Tri-Factorization

Jiaxing YAN, Hai LIU, Zhiqi LEI, Yanghui RAO, Guan LIU*, Haoran XIE, Xiaohui TAO, Fu Lee WANG

*Corresponding author for this work

Research output: Journal Publications › Journal Article (refereed) › peer-review

Abstract

As a two-sided clustering and dimensionality reduction paradigm, Non-negative Matrix Tri-Factorization (NMTF) has attracted much attention from machine learning and data mining researchers due to its excellent performance and reliable theoretical support. Unlike Non-negative Matrix Factorization (NMF) methods, which are applicable to one-sided clustering only, NMTF introduces an additional factor matrix and exploits the inherent duality of data so that sample clustering and feature clustering reinforce each other, which gives it clear advantages in many scenarios (e.g., text co-clustering). However, existing methods for solving NMTF usually involve intensive matrix multiplication with high time and space complexity; in practice, the multiplicative update rules converge slowly and the memory overhead is high. To address these problems, this paper develops a distributed parallel algorithm for NMTF with a 2-dimensional data partitioning scheme (PNMTF-2D). Experiments on multiple text datasets show that PNMTF-2D substantially improves the computational efficiency of NMTF (e.g., the average iteration time is reduced by up to 99.7% on the Amazon dataset) while preserving convergence and co-clustering effectiveness.
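For readers unfamiliar with the setting, the sketch below illustrates the two ingredients the abstract names. It is not the paper's PNMTF-2D algorithm, whose update rules and communication scheme are not given in this record. It assumes the common unconstrained objective ||X - F S Gᵀ||² with textbook multiplicative updates, and it simulates a p × q block grid serially in NumPy. The function names `nmtf` and `blockwise_XG` and all parameters are hypothetical.

```python
import numpy as np

def nmtf(X, k_rows, k_cols, n_iter=200, eps=1e-9, seed=0):
    """Serial NMTF sketch: X (m x n) ~ F (m x k_rows) @ S @ G.T (k_cols x n).

    Assumption: plain squared-error objective without orthogonality
    constraints; this is a generic textbook variant, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    m, n = X.shape
    F = rng.random((m, k_rows))       # row (sample) cluster indicators
    S = rng.random((k_rows, k_cols))  # the additional middle factor
    G = rng.random((n, k_cols))       # column (feature) cluster indicators
    for _ in range(n_iter):
        # Each factor is scaled by (negative gradient part) / (positive part),
        # which keeps every entry non-negative.
        F *= (X @ G @ S.T) / (F @ (S @ (G.T @ G) @ S.T) + eps)
        G *= (X.T @ F @ S) / (G @ (S.T @ (F.T @ F) @ S) + eps)
        S *= (F.T @ X @ G) / ((F.T @ F) @ S @ (G.T @ G) + eps)
    return F, S, G

def blockwise_XG(X, G, p, q):
    """Compute X @ G from a p x q grid of blocks of X (serial simulation).

    Each (rows, cols) block needs only the matching rows of G, so in a
    distributed setting a worker would hold just that block and that slice.
    """
    row_parts = np.array_split(np.arange(X.shape[0]), p)
    col_parts = np.array_split(np.arange(X.shape[1]), q)
    out = np.zeros((X.shape[0], G.shape[1]))
    for rows in row_parts:
        for cols in col_parts:
            # Local product of one block; partial results are summed across
            # the column blocks that share the same row block.
            out[rows] += X[np.ix_(rows, cols)] @ G[cols]
    return out
```

The dominant costs in each update are products such as `X @ G` and `X.T @ F`. Splitting X along both rows and columns means no single worker has to store or multiply the full matrix, which is the general motivation for a 2-dimensional partition.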
Original language: English
Article number: 100473
Journal: Big Data Research
Volume: 37
Early online date: 19 Jun 2024
Publication status: Published - 28 Aug 2024

Bibliographical note

Publisher Copyright:
© 2024 Elsevier Inc.

Funding

The work of Yanghui Rao was supported in part by the National Natural Science Foundation of China (62372483) and Guangdong Philosophy and Social Sciences (GD24CGL57). The work of Haoran Xie was supported in part by the Lam Woo Research Fund (LWP20019), the Direct Grant (DR23B2), and the Faculty Research Grants (DB23A3 and DB23B2) of Lingnan University, Hong Kong Special Administrative Region of China. The work of Xiaohui Tao was supported in part by the Australian Research Council under Grant DP220101360.

Keywords

  • 2-Dimensional data partitioning
  • Non-negative matrix tri-factorization
  • Text co-clustering
