Topics modeling in computer science articles

Citation Author(s):
Jose Melendez Barros, Universidade de São Paulo
Rosa V. Encinas Quille, Universidade de São Paulo
Márcio Barbado Júnior, Universidade de São Paulo
Pedro Luiz Pizzigatti Corrêa, Universidade de São Paulo
Submitted by: Jose Melendez
Last updated: Mon, 07/13/2020 - 08:44
DOI: 10.21227/7exb-wb55

Abstract 

By querying the open data of well-known scientific databases through REST (Representational State Transfer) APIs, and then applying data management practices together with a dynamic topic modeling approach to the available metadata, this work achieves a feasible form of analysis and classification for sets of articles. It identifies the research trends of a given field at specific moments, as well as how those trends evolve over the years. This makes it possible to detect the associated lexical variation over time in published content and, ultimately, to determine the so-called hot topics at arbitrary points in time, including the present. This work probes three prominent databases of scientific articles: arXiv, IEEE Xplore, and Springer Nature.
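As a rough illustration of the metadata-harvesting step, the sketch below builds a query URL for the public arXiv API and extracts the identifier, title, publication year, and abstract from an Atom response. The category `cs.AI`, the helper names `build_arxiv_query` and `parse_atom_feed`, and the inline sample feed are assumptions made for illustration; the queries actually used in the study are not stated on this page.

```python
import urllib.parse
import xml.etree.ElementTree as ET

# Public arXiv API endpoint; responses are Atom XML feeds.
ARXIV_API = "http://export.arxiv.org/api/query"
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_arxiv_query(category, start=0, max_results=100):
    """Return a query URL for one arXiv category (hypothetical helper)."""
    params = {
        "search_query": f"cat:{category}",
        "start": start,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def _clean(text):
    """Collapse the line breaks and repeated spaces found in feed text."""
    return " ".join(text.split())


def parse_atom_feed(xml_text):
    """Extract id, title, year, and abstract from an Atom feed string."""
    root = ET.fromstring(xml_text)
    records = []
    for entry in root.findall("atom:entry", ATOM_NS):
        published = entry.findtext("atom:published", default="", namespaces=ATOM_NS)
        year = published[:4]
        records.append({
            "id": entry.findtext("atom:id", default="", namespaces=ATOM_NS),
            "title": _clean(entry.findtext("atom:title", default="", namespaces=ATOM_NS)),
            "year": int(year) if year.isdigit() else None,
            "abstract": _clean(entry.findtext("atom:summary", default="", namespaces=ATOM_NS)),
        })
    return records


# Made-up sample feed (not real data) so the parser can be tried offline.
SAMPLE_FEED = """<feed xmlns="http://www.w3.org/2005/Atom">
  <entry>
    <id>http://arxiv.org/abs/0000.00000v1</id>
    <published>2020-01-15T00:00:00Z</published>
    <title>An Example
      Title</title>
    <summary>A placeholder abstract.</summary>
  </entry>
</feed>"""

records = parse_atom_feed(SAMPLE_FEED)
```

To fetch live data, the URL returned by `build_arxiv_query` could be passed to `urllib.request.urlopen` and the decoded response handed to `parse_atom_feed`. IEEE Xplore and Springer Nature expose their own APIs with different parameters and response formats, which this sketch does not cover.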

 

The dataset contains:
- Identification of the articles used in the study
- The proportion of the topics in each document
- Number of articles per year per topic
- Distribution of the words that make up each topic
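To show how the per-document topic proportions relate to the per-year counts, here is a minimal sketch that assigns each article to its highest-proportion topic and tallies articles per year and topic. The dominant-topic rule, the function name `articles_per_year_per_topic`, and the toy values are assumptions for illustration; the dataset's actual file layout and counting rule are documented in readme.pdf.

```python
from collections import Counter


def articles_per_year_per_topic(years, doc_topic):
    """Count articles per (year, topic) using each article's dominant topic.

    years     -- publication year of each article
    doc_topic -- per-article list of topic proportions (same order as years)
    """
    counts = Counter()
    for year, proportions in zip(years, doc_topic):
        # Index of the largest proportion; ties resolve to the lowest index.
        dominant = max(range(len(proportions)), key=proportions.__getitem__)
        counts[(year, dominant)] += 1
    return dict(counts)


# Toy values, not taken from the dataset.
toy_years = [2018, 2018, 2019]
toy_doc_topic = [
    [0.7, 0.3],
    [0.2, 0.8],
    [0.6, 0.4],
]
toy_counts = articles_per_year_per_topic(toy_years, toy_doc_topic)
```

An alternative would be to sum the proportions per year instead of counting dominant topics, which keeps the contribution of secondary topics at the cost of producing fractional article counts.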

Instructions: 

Instructions and documentation are given in readme.pdf.
