Software descriptions from Softpedia (antivirus and compression tools)
These data are natural-language software product descriptions from Softpedia. They have been used for research on software feature extraction and recommendation.
The original data scraped from Softpedia are in the dataset_original folder. The data in the dataset_filtered folder are the result of removing duplicate and flawed entries and filling in missing user ratings. Each data file has four columns: the first is the software product name, the second is the product description, the third is the number of downloads, and the fourth is the user rating.
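The four-column layout can be read with a few lines of Python. This is only a minimal sketch: it assumes the data files are comma-separated text with no header row, which this README does not state, so adjust the delimiter or parser to the actual file format. The names COLUMNS and read_products and the sample row are hypothetical and used for illustration only.

```python
import csv
import io

# Column order as described above; the key names themselves are assumptions.
COLUMNS = ["name", "description", "downloads", "rating"]


def read_products(handle, delimiter=","):
    """Parse (name, description, downloads, rating) rows from an open text file.

    Assumes delimited text without a header row. Values are kept as strings
    because the exact number formats in the data files are not documented here.
    """
    products = []
    for row in csv.reader(handle, delimiter=delimiter):
        if len(row) != len(COLUMNS):
            continue  # skip rows that do not have exactly four columns
        products.append(dict(zip(COLUMNS, row)))
    return products


# Made-up sample row, not taken from the dataset.
sample = io.StringIO(
    'ExampleZip,"Packs and unpacks archives, with encryption.","1,234",4.5\n'
)
products = read_products(sample)
print(products[0]["name"], products[0]["rating"])
```

To read a real file, open it with open(path, newline="", encoding="utf-8") and pass the file object to read_products; the path and the encoding depend on the file you choose from dataset_original or dataset_filtered.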