Sentiment Analysis Approach-digital

Citation Author(s):
Anhui University of Finance and Economics
Submitted by:
Qiang Cao
Last updated:
Mon, 11/13/2023 - 23:19
Data Format:


Compared with traditional finance, digital finance applies digital technology to financial innovation, which largely reduces financial exclusion and discrimination. However, the improved financial services it brings, such as mobile payment, online lending, virtual currency, and investment and wealth management, also involve potential risks. Hence, we propose a sentiment analysis model, GABP-News, to study how well the information contained in news texts predicts digital financial development in China. The Valence Aware Dictionary and sEntiment Reasoner (VADER) is used to extract useful information from news texts and to construct two sentiment indices, one from news titles and one from news contents, whose predictive power is tested separately. Results demonstrate the usefulness of the GABP-News model for forecasting digital finance and suggest that adding sentiment variables from news titles and contents improves accuracy. The feature variables affecting the development of digital finance are also not static: their significant dynamic changes indicate the uncertainty involved in predicting digital finance. Finally, the feature variables that predict the development of digital finance are regionally heterogeneous. Compared with news content sentiment, news title sentiment is more relevant to each province. Trade openness is more critical for provinces with less developed digital finance, whereas government expenditure is a vital feature variable for regions with more developed digital finance.


The second part of this paper uses R to screen the feature variables affecting the development of digital finance in China with the adaptive-lasso model. The code is in the program adaptive-lasso.R in the folder 2-Feature Selection Code. The first six lines of the program contain the path from which the dataset diglasso.csv is imported. Running the R program yields the estimation results in Table 2 of the paper, i.e., the eight selected feature variables: fdl, econ, gfe, ndl, open, edu, title and context.
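
For readers unfamiliar with the adaptive lasso, the following is a minimal, self-contained sketch of the general technique in Python. It is not the adaptive-lasso.R program from the folder and does not use diglasso.csv. The data are synthetic, the column names (signal_a, noise_a, and so on) are hypothetical, and the settings lam=0.05, gamma=1.0 and the small ridge penalty used for the initial estimate are illustrative assumptions rather than values taken from the paper. The idea is to fit an initial estimate, turn it into per-variable penalty weights 1/|b|^gamma, and then solve a weighted lasso so that weak variables are penalized more heavily and dropped.

```python
import random


def soft_threshold(value, threshold):
    """Shrink value toward zero by threshold; return 0 inside the band."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def centered(values):
    """Subtract the mean so that no intercept term is needed."""
    mean = sum(values) / len(values)
    return [v - mean for v in values]


def coordinate_descent(cols, y, lam, weights, l2=0.0, n_iter=200):
    """Minimize (1/2n)||y - Xb||^2 + lam*sum(w_j*|b_j|) + (l2/2)*sum(b_j^2).

    cols is a list of feature columns, each a list of n values.
    """
    n = len(y)
    beta = [0.0] * len(cols)
    resid = list(y)
    col_sq = [sum(v * v for v in col) / n for col in cols]
    for _ in range(n_iter):
        for j, col in enumerate(cols):
            # Correlation of column j with the residual that excludes column j.
            rho = sum(v * r for v, r in zip(col, resid)) / n + col_sq[j] * beta[j]
            new = soft_threshold(rho, lam * weights[j]) / (col_sq[j] + l2)
            delta = new - beta[j]
            if delta:
                resid = [r - delta * v for r, v in zip(resid, col)]
                beta[j] = new
    return beta


def adaptive_lasso(cols, y, lam, gamma=1.0, ridge=0.01):
    """Two-step adaptive lasso: ridge initial fit, then weighted lasso."""
    p = len(cols)
    # Step 1: initial estimate with a small ridge penalty (an assumption here).
    init = coordinate_descent(cols, y, lam=0.0, weights=[0.0] * p, l2=ridge)
    # Step 2: small initial coefficients receive large penalty weights.
    weights = [1.0 / (abs(b) ** gamma + 1e-8) for b in init]
    return coordinate_descent(cols, y, lam, weights)


# Synthetic illustration: only the first two columns drive the response.
random.seed(0)
n = 200
names = ["signal_a", "signal_b", "noise_a", "noise_b"]  # hypothetical names
cols = [[random.gauss(0, 1) for _ in range(n)] for _ in names]
y = [2.0 * a - 1.5 * b + random.gauss(0, 0.5) for a, b in zip(cols[0], cols[1])]

cols = [centered(c) for c in cols]
y = centered(y)

beta = adaptive_lasso(cols, y, lam=0.05)
selected = [name for name, b in zip(names, beta) if b != 0.0]
print(selected)
```

In this sketch a variable counts as selected when its final coefficient is not exactly zero, which is how a list of retained feature variables such as the one reported in Table 2 would be read off. In R, the same two-step scheme is commonly written with the glmnet package by passing the weights through its penalty.factor argument; whether adaptive-lasso.R does so is not stated here.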