Trends and Features of the Applications of Natural Language Processing Techniques for Clinical Trials Text Analysis

Xieling CHEN, Haoran XIE, Gary CHENG, Leonard K. M. POON, Mingming LENG, Fu Lee WANG*

*Corresponding author for this work

Research output: Journal Publications > Journal Article (refereed)

Abstract

Natural language processing (NLP) is an effective tool for generating structured information from unstructured data, which is commonly found in clinical trial texts. This interdisciplinary area has gradually grown into a flourishing research field with accumulated scientific output. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases from 2001 to 2018 were investigated using three prominent methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and test analysis were further employed to quantify the effects of the year of publication on topic proportions. Topic distributions across prolific countries/regions and institutions were visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced clinical trial text processing, boosting scientific and technological NLP-enhanced clinical trial research, and facilitating inter-country/region and inter-institution collaborations.
Original language: English
Article number: 2157
Number of pages: 36
Journal: Applied Sciences (Switzerland)
Volume: 10
Issue number: 6
Early online date: 22 Mar 2020
Publication status: Published - 22 Mar 2020

Bibliographical note

(This article belongs to the Section Computing and Artificial Intelligence)

The work has been supported by the Interdisciplinary Research Scheme of the Dean’s Research Fund 2018–19 (FLASS/DRF/IDS-3), Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19), and One-off Special Fund from the Central and Faculty Fund in Support of Research (MIT02/19-20) entitled “Facilitating Artificial Intelligence and Big Data Analytics Research in Education” of The Education University of Hong Kong, Research Seed Fund, Hong Kong Institute of Business Studies Research Seed Fund (HKIBS RSF-190-009) and LEO Dr. David P. Chan Institute of Data Science, Lingnan University, Hong Kong.

Keywords

  • Bibliometrics
  • Clinical trials text
  • Collaboration
  • Natural language processing
  • Structural topic modeling
