Dependency and size knowledge graphs for npm and pypi
- Submitted by: chenhui zhang
- Last updated: Mon, 11/25/2024 - 07:28
- DOI: 10.21227/yh00-bx90
Abstract
This dataset accompanies our paper "RED-Scenario: A Resource-Efficient Deployment Framework for Scenarios through Dependency Package Management".
It contains dependency and size knowledge graphs covering 10,979 Python packages with 597,049 versions and 28,151 Node.js packages with 738,927 versions. Each version records its size and dependency information.
We collect packages from two sources: vulnerable packages and the packages used by regular applications. The vulnerable packages come from the availability testing module. To obtain the packages of regular applications, we retrieve GitHub projects written in Node.js and Python that have more than 10,000 stars, then keep those that include a dependency specification file (requirements.txt for Python projects, package.json for Node.js projects). The packages parsed from these dependency files, merged with the vulnerable packages, form the initial nodes of the graph. We then expand the graph to cover all versions of the initial nodes and their sub-dependencies. Dependency and size information is gathered from the statistical application programming interfaces (APIs) of PyPI and npm.
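The collection procedure described above can be sketched roughly as follows. This is a minimal illustration, not the authors' pipeline: the function names, the simplified requirements parsing, the choice to read only the "dependencies" field of package.json, and the shape returned by the hypothetical fetch_versions callback (which stands in for the PyPI and npm queries) are all assumptions.

```python
import json
import re
from collections import deque
from typing import Callable, Dict, Iterable, Set

# Assumed shape: {version: {"size": int, "dependencies": [package names]}}
VersionInfo = Dict[str, dict]


def parse_requirements(text: str) -> Set[str]:
    """Extract package names from requirements file content (simplified)."""
    names = set()
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        # Skip blank lines, pip options (-r, -e, ...) and URL requirements.
        if not line or line.startswith("-") or "://" in line:
            continue
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.add(match.group(0).lower())
    return names


def parse_package_json(text: str) -> Set[str]:
    """Extract package names from the "dependencies" field of package.json."""
    data = json.loads(text)
    return set(data.get("dependencies") or {})


def expand_graph(
    seeds: Iterable[str],
    fetch_versions: Callable[[str], VersionInfo],
) -> Dict[str, VersionInfo]:
    """Breadth-first expansion over all versions and their sub-dependencies.

    fetch_versions is a hypothetical callback that returns every known
    version of a package together with its size and dependency names.
    """
    graph: Dict[str, VersionInfo] = {}
    queue = deque(sorted(seeds))
    while queue:
        name = queue.popleft()
        if name in graph:
            continue  # already expanded; also guards against cycles
        versions = fetch_versions(name)
        graph[name] = versions
        for info in versions.values():
            for dep in info.get("dependencies", []):
                if dep not in graph:
                    queue.append(dep)
    return graph
```

The initial nodes would be the union of the names parsed from the dependency files and the vulnerable packages; passing that set to expand_graph visits each package once, even when dependencies form cycles.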
Unzip the archive to obtain two JSON files.
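One possible way to extract and load the files with the Python standard library is sketched below. The archive path and output directory are placeholders, and load_graphs is a hypothetical helper. The internal schema of the JSON files is not described here, so the sketch only parses each file and leaves interpretation to the reader.

```python
import json
import zipfile
from pathlib import Path
from typing import Any, Dict


def load_graphs(archive_path: str, out_dir: str = "graphs") -> Dict[str, Any]:
    """Extract the archive and parse every .json file it contains.

    Returns a mapping from file name to the parsed JSON content.
    """
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(out_dir)

    graphs: Dict[str, Any] = {}
    for path in sorted(Path(out_dir).rglob("*.json")):
        with path.open(encoding="utf-8") as handle:
            graphs[path.name] = json.load(handle)
    return graphs
```

Note that json.load reads a whole file into memory. With several hundred thousand versions per ecosystem, the files may be large, so a streaming parser could be preferable on memory-constrained machines.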