Included Papers, venues and partial data for TSE
- Submitted by: Simin Wang
- Last updated: Wed, 03/23/2022 - 21:28
- DOI: 10.21227/xcg0-ad18
Abstract
To improve the applicability and generalizability of machine learning/deep learning (ML/DL) studies in software engineering (SE), we conducted a Systematic Literature Review (SLR) of 1,428 ML/DL-related SE papers published over the 12 years from 2009 to 2020. Our trend analysis demonstrates the impact that ML/DL has had on SE. We also examined the complexity of applying ML/DL solutions to SE problems and how that complexity leads to issues with the reproducibility and replicability of ML/DL studies in SE.
1. Paper_FinalList_2009-2020.csv contains the full list of the 1,428 included papers
2. Venue_Dictionary.csv contains the full list of included conferences and journals (abbreviations and corresponding full names)
3. DataExtraction_Task_Technique_2009-2020.csv contains, for all 1,428 papers, the extracted SE tasks and the specific ML/DL algorithms employed under each category
4. Task_DataPreprocessing.csv contains all 77 SE tasks and statistics on the types of data preprocessing techniques used for each SE task
Dataset Files
- Paper_FinalList_2009-2020.csv (162.13 kB)
- Task_DataPreprocessing.csv (4.40 kB)
- DataExtraction_Task_Technique_2009-2020.csv (243.33 kB)
- Venue_Dictionary.csv (7.92 kB)
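A quick way to inspect the files after downloading them is to print each file's header row and row count. The sketch below is a minimal example using only the Python standard library. It makes no assumption about column names, since this page does not document them. The helper names (`summarize_csv`, `summarize_dataset`), the download directory, and the UTF-8 encoding are assumptions made for illustration.

```python
import csv
from pathlib import Path

# File names as listed on this page; where they are stored locally is an assumption.
DATASET_FILES = [
    "Paper_FinalList_2009-2020.csv",
    "Venue_Dictionary.csv",
    "DataExtraction_Task_Technique_2009-2020.csv",
    "Task_DataPreprocessing.csv",
]


def summarize_csv(path):
    """Return (header, data_row_count) for one CSV file without assuming its columns."""
    # "utf-8-sig" drops a byte-order mark if one is present; the encoding is an assumption.
    with open(path, newline="", encoding="utf-8-sig") as f:
        reader = csv.reader(f)
        header = next(reader, [])  # empty list if the file has no rows at all
        row_count = sum(1 for _ in reader)
    return header, row_count


def summarize_dataset(data_dir):
    """Summarize every listed file found in data_dir; files that are missing are skipped."""
    summary = {}
    for name in DATASET_FILES:
        path = Path(data_dir) / name
        if path.is_file():
            summary[name] = summarize_csv(path)
    return summary


if __name__ == "__main__":
    # Assumes the CSV files were downloaded into the current directory.
    for name, (header, rows) in summarize_dataset(".").items():
        print(f"{name}: {rows} data rows, columns = {header}")
```

If the description above holds, the row count reported for Paper_FinalList_2009-2020.csv should be close to 1,428, which gives a simple sanity check on the download.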