Abstract
Natural language processing (NLP) is an effective tool for generating structured information from unstructured data, such as that commonly found in clinical trial texts. This interdisciplinary research has gradually grown into a flourishing field with a substantial body of accumulated scientific output. In this study, bibliographical data collected from the Web of Science, PubMed, and Scopus databases for the period 2001 to 2018 were investigated using three methods: performance analysis, science mapping, and, in particular, an automatic text analysis approach named structural topic modeling. Topical trend visualization and test analysis were further employed to quantify the effects of the year of publication on topic proportions. The distributions of topics across prolific countries/regions and institutions were also visualized and compared. In addition, scientific collaborations between countries/regions, institutions, and authors were explored using social network analysis. The findings are essential for facilitating the development of NLP-enhanced clinical trial text processing, boosting scientific and technological NLP-enhanced clinical trial research, and facilitating inter-country/region and inter-institution collaborations.
Original language | English |
---|---|
Article number | 2157 |
Number of pages | 36 |
Journal | Applied Sciences (Switzerland) |
Volume | 10 |
Issue number | 6 |
Early online date | 22 Mar 2020 |
Publication status | Published - 22 Mar 2020 |
Bibliographical note
(This article belongs to the Section Computing and Artificial Intelligence.) The work has been supported by the Interdisciplinary Research Scheme of the Dean’s Research Fund 2018–19 (FLASS/DRF/IDS-3), Departmental Collaborative Research Fund 2019 (MIT/DCRF-R2/18-19), and One-off Special Fund from the Central and Faculty Fund in Support of Research (MIT02/19-20) entitled “Facilitating Artificial Intelligence and Big Data Analytics Research in Education” of The Education University of Hong Kong, Research Seed Fund, Hong Kong Institute of Business Studies Research Seed Fund (HKIBS RSF-190-009) and LEO Dr. David P. Chan Institute of Data Science, Lingnan University, Hong Kong.
Keywords
- Bibliometrics
- Clinical trials text
- Collaboration
- Natural language processing
- Structural topic modeling