Standard Dataset
Cloud Stateless System Performance Metrics and Status
- Submitted by: Nutt Chairatana
- Last updated: Wed, 01/31/2024 - 12:12
- DOI: 10.21227/8wf2-2y40
Abstract
We used DigitalOcean's cloud service to set up three Linux virtual machines, each with 1 vCPU, 1 GB of memory, and a 10 GB disk. The architecture comprises an API gateway that routes requests to a stateless application service, which is backed by a database for storing application data. The application service runs under a fluctuating workload generated by a load-testing script to simulate real-world usage. The application service is also the monitoring target: it is integrated with Prometheus, a monitoring tool that gathers system metrics. To extract data from Prometheus, we wrote a custom script that reads its local storage and collects resource utilization and performance metrics. The resulting dataset contains roughly 8,000 data points gathered at 5-second intervals. The data points cover CPU and memory usage (in percent), inbound and outbound network traffic rates (in GB/s or MB/s), transactions per second (TPS), and response times (in seconds or milliseconds). A key element of the dataset is the real-time health status of the system, assessed through HTTP response codes. Our custom script monitored these codes: if predefined error codes (5xx) were detected, the system was marked as "unhealthy"; in all other cases, it was deemed "healthy."
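The health-labeling rule described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' script: the function names `is_server_error` and `label_status` are hypothetical, and it assumes the script sees the list of HTTP response codes observed during one sampling interval. The 0/1 encoding follows the status column of the dataset.

```python
from typing import Iterable


def is_server_error(code: int) -> bool:
    """Return True for HTTP 5xx (server error) response codes."""
    return 500 <= code <= 599


def label_status(codes: Iterable[int]) -> int:
    """Label one sampling interval: 1 (unhealthy) if any 5xx code was seen, else 0 (healthy).

    Assumption: `codes` holds the response codes observed in that interval.
    An interval with no 5xx codes, including an empty one, counts as healthy.
    """
    return 1 if any(is_server_error(c) for c in codes) else 0
```

Under this sketch, client errors such as 404 do not affect the label; only 5xx codes mark an interval as unhealthy.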
Data features include:
- Time
  - labeled as Time in the CSV header
  - expressed as yyyy-mm-dd hh:mm:ss
- Timestamp
  - labeled as Timestamp in the CSV header
  - expressed as milliseconds
- CPU Request
  - labeled as cpu_usage in the CSV header
  - expressed as percentage
- Memory Request
  - labeled as memory_usage in the CSV header
  - expressed as percentage
- Inbound Bandwidth
  - labeled as bandwidth_inbound in the CSV header
  - expressed as gigabytes per second (GB/s) or megabytes per second (MB/s)
- Outbound Bandwidth
  - labeled as bandwidth_outbound in the CSV header
  - expressed as gigabytes per second (GB/s) or megabytes per second (MB/s)
- Transactions Per Second
  - labeled as tps in the CSV header
  - expressed as requests per second (req/s)
- Average Response Time
  - labeled as response_time in the CSV header
  - expressed as seconds (s) or milliseconds (ms)
- System Status
  - labeled as status in the CSV header
  - expressed as a binary label: 0 for healthy and 1 for unhealthy
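As a rough guide to working with these columns, the following Python sketch parses the CSV using only the standard library. The column names come from the list above; everything else is an assumption made for illustration: the helper names `read_metrics` and `unhealthy_fraction` are hypothetical, the numeric columns are assumed to hold plain numbers without unit suffixes, and the two sample rows are made up and are not taken from the dataset.

```python
import csv
import io
from datetime import datetime

# Numeric feature columns, named as in the CSV header described above.
NUMERIC_COLUMNS = (
    "cpu_usage", "memory_usage", "bandwidth_inbound",
    "bandwidth_outbound", "tps", "response_time",
)


def read_metrics(stream):
    """Parse an open CSV text stream into a list of typed dictionaries."""
    rows = []
    for raw in csv.DictReader(stream):
        row = {
            "Time": datetime.strptime(raw["Time"], "%Y-%m-%d %H:%M:%S"),
            "Timestamp": int(float(raw["Timestamp"])),  # milliseconds
            "status": int(raw["status"]),  # 0 = healthy, 1 = unhealthy
        }
        for name in NUMERIC_COLUMNS:
            # Assumption: values are bare numbers with no unit suffix.
            row[name] = float(raw[name])
        rows.append(row)
    return rows


def unhealthy_fraction(rows):
    """Share of rows labeled unhealthy (status == 1); 0.0 for no rows."""
    return sum(r["status"] for r in rows) / len(rows) if rows else 0.0


# Made-up sample in the documented format (NOT real dataset values).
SAMPLE = (
    "Time,Timestamp,cpu_usage,memory_usage,bandwidth_inbound,"
    "bandwidth_outbound,tps,response_time,status\n"
    "2024-01-01 00:00:00,1704067200000,12.5,40.0,0.8,1.2,35.0,0.12,0\n"
    "2024-01-01 00:00:05,1704067205000,97.0,88.5,2.4,3.1,120.0,1.85,1\n"
)

if __name__ == "__main__":
    sample_rows = read_metrics(io.StringIO(SAMPLE))
    print(len(sample_rows), unhealthy_fraction(sample_rows))
```

To read the actual file, pass an open file handle instead of the `io.StringIO` sample, for example `open(path, newline="")` with `path` pointing at the downloaded CSV. Because the bandwidth and response-time columns may be expressed in either of two units, check which unit the file uses before comparing values.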