Discovering research topics from library electronic references using latent Dirichlet allocation

Debin Fang (Economics and Management School, Wuhan University, Wuhan, China)
Haixia Yang (Economics and Management School, Wuhan University, Wuhan, China)
Baojun Gao (Economics and Management School, Wuhan University, Wuhan, China)
Xiaojun Li (College of Accounting, Yunnan University of Finance and Economics, Kunming, China)

Library Hi Tech

ISSN: 0737-8831

Article publication date: 19 February 2018

Issue publication date: 4 June 2018

Abstract

Purpose

Discovering research topics and trends in a large body of library electronic references is essential for scientific research. Current work of this kind depends mainly on human judgment. The purpose of this paper is to demonstrate how automatic text analysis algorithms can identify research topics and their evolving trends from library electronic references efficiently and effectively.

Design/methodology/approach

The authors used latent Dirichlet allocation (LDA), a probabilistic generative topic model, to extract latent topics from a large collection of research abstracts. They then ran a regression analysis on the document-topic distributions generated by LDA to identify hot and cold topics.
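The regression step can be illustrated with a minimal sketch. The abstract does not specify the regression form, so this example assumes one common choice: average each topic's proportion per publication year, regress that yearly mean on the year, and label a topic "hot" or "cold" when the slope's t-statistic exceeds a threshold in either direction. The function names, the `t_crit` threshold, and the toy `theta` matrix (standing in for the document-topic distributions a fitted LDA model would produce) are all hypothetical and are not taken from the paper.

```python
from collections import defaultdict
from math import sqrt


def yearly_topic_means(years, theta):
    """Average each topic's proportion over the documents published in each year."""
    sums = {}
    counts = defaultdict(int)
    for year, row in zip(years, theta):
        if year not in sums:
            sums[year] = [0.0] * len(row)
        for k, p in enumerate(row):
            sums[year][k] += p
        counts[year] += 1
    ordered = sorted(sums)
    means = [[s / counts[y] for s in sums[y]] for y in ordered]
    return ordered, means


def slope_and_t(xs, ys):
    """Ordinary least squares slope of ys on xs, with the slope's t-statistic."""
    n = len(xs)
    if n < 3:
        raise ValueError("need at least three distinct years")
    mx = sum(xs) / n
    my = sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    rss = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    se = sqrt(rss / (n - 2) / sxx)  # standard error of the slope
    if se == 0.0:
        # Perfect fit: the t-statistic is unbounded unless the slope is zero.
        if slope == 0.0:
            return slope, 0.0
        return slope, float("inf") if slope > 0 else float("-inf")
    return slope, slope / se


def classify_topics(years, theta, t_crit=2.0):
    """Label each topic 'hot', 'cold', or 'stable' (t_crit is an assumed cutoff)."""
    ordered, means = yearly_topic_means(years, theta)
    labels = []
    for k in range(len(theta[0])):
        series = [row[k] for row in means]
        _, t = slope_and_t(ordered, series)
        if t > t_crit:
            labels.append("hot")
        elif t < -t_crit:
            labels.append("cold")
        else:
            labels.append("stable")
    return labels


# Toy data: one row per document, one column per topic; each row sums to 1.
years = [2000, 2000, 2001, 2002, 2003, 2004]
theta = [
    [0.05, 0.65, 0.30],
    [0.15, 0.55, 0.30],
    [0.18, 0.48, 0.34],
    [0.32, 0.42, 0.26],
    [0.38, 0.28, 0.34],
    [0.50, 0.20, 0.30],
]
labels = classify_topics(years, theta)
print(labels)
```

In practice `theta` would come from a topic-modeling library rather than being typed by hand, and a proper p-value from the t-distribution would replace the fixed cutoff used here.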

Findings

First, this paper discovers 32 significant research topics from the abstracts of 3,737 articles published in six top accounting journals between 1992 and 2014. Second, based on the document-topic distributions generated by LDA, the authors identify seven hot topics and six cold topics among the 32.

Originality/value

The topics discovered by LDA are highly consistent with the topics identified by human experts, indicating the validity and effectiveness of the methodology. Therefore, this paper provides novel knowledge to the accounting literature and demonstrates a methodology and process for topic discovery with lower cost and higher efficiency than the current methods.

Acknowledgements

Conflicts of interest: the authors declare no conflict of interest.

The authors would like to thank the National Natural Science Foundation of China (NSFC) for its support (71771182, 71673210, 71725007).

Citation

Fang, D., Yang, H., Gao, B. and Li, X. (2018), "Discovering research topics from library electronic references using latent Dirichlet allocation", Library Hi Tech, Vol. 36 No. 3, pp. 400-410. https://doi.org/10.1108/LHT-06-2017-0132

Publisher

Emerald Publishing Limited

Copyright © 2018, Emerald Publishing Limited
