Data for the study has been retrieved from a publicly available data set of a leading European P2P lending platform, Bondora (https://www.bondora.com/en). The retrieved data is a pool of both defaulted and non-defaulted loans from the time period between 1st March 2009 and 27th January 2020. The data comprises demographic and financial information of borrowers and loan transactions. In P2P lending, loans are typically uncollateralized and lenders seek higher returns as compensation for the financial risk they take.

Instructions: 

The dataset also includes a data-preprocessing Jupyter notebook that helps in working with the data and performing basic preprocessing. The zip file contains both the preprocessed dataset and the raw dataset extracted directly from the Bondora website (https://www.bondora.com/en).

Disclaimer:
In the attached notebook, the data-preprocessing choices are based on my own intuition and assumptions.


 

This dataset is a set of eighteen directed networks that represent message exchanges among Twitter accounts during eighteen crisis events. The dataset comprises 645,339 anonymized unique user IDs and 1,396,709 edges that are labeled with respect to Plutchik's basic emotions (anger, fear, sadness, disgust, joy, trust, anticipation, and surprise) or "neutral" (if a tweet conveys no emotion).


This dataset contains request execution times for comparing direct requests to a test API with requests sent through an API gateway.


Background: Insomnia, as one of the dominant diseases of traditional Chinese medicine (TCM), has been extensively studied in recent years. To explore novel approaches to research on TCM diagnosis and treatment, this paper presents a strategy for insomnia research based on machine learning.


The rise of the Internet of Things (IoT) has opened new research lines that focus on applying IoT to domains beyond basic user-grade applications, such as Industry or Healthcare. These domains demand a very high Quality of Service (QoS), mainly a very short response time. To meet these demands, some works evaluate how to modularize and deploy IoT applications across different nodes of the infrastructure (edge, fog, cloud), as well as how to place the network controllers, since these decisions affect the response time of the application.


This dataset contains nearly 1 million unique movie reviews from 1,150 different IMDb movies spread across 17 IMDb genres: Action, Adventure, Animation, Biography, Comedy, Crime, Drama, Fantasy, History, Horror, Music, Mystery, Romance, Sci-Fi, Sport, Thriller and War. The dataset also contains movie metadata such as the release date, run length, IMDb rating, movie rating (PG-13, R, etc.), number of IMDb raters, and number of reviews per movie.

Instructions: 

Movie details can be found in the per-genre files inside the 1_movie_per_genre folder.

Reviews of every movie can be found in the 2_reviews_per_movie_raw folder.

Note that each file name in the second folder equals the movie name + year of release (both found in the first folder).
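
As a rough illustration of the naming rule above, the sketch below builds the path of a review file from a movie name and release year. The separator between name and year and the file extension are not stated here, so `sep` and `ext` are hypothetical parameters with placeholder defaults; check them against the actual files before relying on this.

```python
from pathlib import Path


def review_file_for(
    movie_name: str,
    year: int,
    reviews_dir: str = "2_reviews_per_movie_raw",
    sep: str = " ",      # assumption: separator between name and year is not documented
    ext: str = ".csv",   # assumption: file extension is not documented
) -> Path:
    """Build the expected review-file path as movie name + year of release."""
    return Path(reviews_dir) / f"{movie_name}{sep}{year}{ext}"


if __name__ == "__main__":
    # "Example Movie" is a made-up title used only to show the call.
    print(review_file_for("Example Movie", 2000))
```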


Vehicular networks have various characteristics that can help identify the relationships between vehicles. When two vehicles are moving at given speeds and separated by a given distance, it is important to know whether they are able to communicate. Vehicles can communicate only within their communication range. Given previously recorded data for a road segment, our dataset can be used to identify the compatibility time between two selected vehicles. The compatibility time is defined as the time two vehicles will be within communication range of each other.

Instructions: 

Note: If you use this dataset, please cite our work: https://ieeexplore.ieee.org/abstract/document/9186099

 

F. H. Kumbhar and S. Y. Shin, "DT-VAR: Decision Tree Predicted Compatibility based Vehicular Ad-hoc Reliable Routing," in IEEE Wireless Communications Letters, doi: 10.1109/LWC.2020.3021430.

 

Each row contains characteristic information related to two vehicles at time t. The features (column headings) are as follows:

 

- Euclidean Distance: The shortest distance between the two vehicles, in meters.

- Relative Velocity: The velocity of the second vehicle as seen from the first vehicle.

- Direction Difference: Given the direction of each vehicle, this feature is the angle between the two directions of travel. For instance, two vehicles traveling the same way on the same road have a direction difference of 0, whereas two vehicles moving in opposite directions have a difference of 180. We calculated the direction difference using: |((Direction of i - Direction of j + 180) % 360 - 180)|.

- Direction Difference Label: To ease the process for supervised learning models, we also included a direction difference label with three possible values (0 if difference < 60, 2 if difference > 120, and 1 otherwise).

- Tendency: A label needed to distinguish, for two vehicles moving in opposite directions, whether they are approaching each other or moving away from each other.
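
The direction difference formula and its three-way label can be sketched in Python as follows. This is only an illustration of the formula as written above; it assumes directions are given in degrees and that `%` behaves like Python's modulo (non-negative result for a positive divisor). The function names are our own and do not come from the dataset.

```python
def direction_difference(direction_i: float, direction_j: float) -> float:
    """|((Direction of i - Direction of j + 180) % 360 - 180)|, in degrees (assumed)."""
    return abs((direction_i - direction_j + 180) % 360 - 180)


def direction_difference_label(difference: float) -> int:
    """0 if difference < 60, 2 if difference > 120, 1 otherwise."""
    if difference < 60:
        return 0
    if difference > 120:
        return 2
    return 1


if __name__ == "__main__":
    for i, j in [(90, 90), (0, 180), (350, 10), (0, 90)]:
        diff = direction_difference(i, j)
        print(i, j, diff, direction_difference_label(diff))
```

Note that headings of 350 and 10 yield a difference of 20 rather than 340, which is the reason for the wrap-around arithmetic in the formula.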

 

Target Label (Compatibility time): Our goal is to identify how long two vehicles will be within communication range of each other. The compatibility time label takes one of five possible values:

L0 means Compatibility Time is 0

L1 means Compatibility Time is more than 2 seconds but less than 5 seconds

L2 means Compatibility Time is more than 5 seconds but less than 10 seconds

L3 means Compatibility Time is more than 10 seconds but less than 15 seconds

L4 means Compatibility Time is more than 15 seconds
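
The label ranges above can be expressed as a small lookup function. This sketch follows the ranges literally: values the description does not cover (for example, times between 0 and 2 seconds, or exactly 5, 10, or 15 seconds) return `None` rather than guessing how the dataset treats them. The function name is hypothetical.

```python
from typing import Optional


def compatibility_label(seconds: float) -> Optional[str]:
    """Map a compatibility time in seconds to L0..L4 using the stated ranges only."""
    if seconds == 0:
        return "L0"
    if 2 < seconds < 5:
        return "L1"
    if 5 < seconds < 10:
        return "L2"
    if 10 < seconds < 15:
        return "L3"
    if seconds > 15:
        return "L4"
    return None  # value falls outside the ranges given in the description


if __name__ == "__main__":
    for t in [0, 1, 3, 7.5, 12, 20]:
        print(t, compatibility_label(t))
```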


One-way delay (OWD) is the transmission time of the network packet from the first to the last bit from the sender node to the receiver node. The data set presented here was obtained as a result of measurements performed for the paper “Improving the Accuracy of One-Way Delay Measurements”.

One-way delay measurements were performed using three different utilities:

* the utility from the OWAMP protocol;

* first version of our utility, owping1; and

* the new version of our utility, owping2.

Instructions: 

The graph shown in Figure 3 and the values in Table 2 are derived from data from files located in the Fig3andTab2 folder.

The OWAMP_chrony.csv file contains the results of measurements made on the local network with an IP packet size of 46 bytes, OWAMP as the measurement utility, and chrony as the NTP server. The file contains the numerical OWAMP measurement data in microseconds and can be opened in Excel.

The OWAMP_ntpd.csv file contains the results of measurements made on the local network with an IP packet size of 46 bytes, OWAMP as the measurement utility, and ntpd as the NTP server.

The owping2_chrony.csv file contains the results of measurements made on the local network with an IP packet size of 46 bytes, owping2 as the measurement utility, chrony as the NTP server, and UDP as the protocol.

The owping2_ntpd.csv file contains the results of measurements made on the local network with an IP packet size of 46 bytes, owping2 as the measurement utility, ntpd as the NTP server, and UDP as the protocol.

 

The graph displayed in Figure 5 and the values from Table 3 are derived from data from files located in the Fig5andTab3 folder.

All these files contain the results of measurements across a local network without a switch; the IP packet size is 46 bytes. The measurements in the files are presented in microseconds. They can be displayed via Excel.

In the owping1_icmp.csv file, the data is derived from owping1 utility measurements of ICMP packets.

In the owping1_udp.csv file, the data is derived from owping1 utility measurements of UDP packets.

In the owping2_icmp.csv file, the data is derived from owping2 utility measurements of ICMP packets.

In the owping2_udp.csv file, the data is derived from owping2 utility measurements of UDP packets.

 

The graph displayed in Figure 6 and the values in Table 4 are derived from data from a file located in the Fig6andTab4 folder.

The owamp_smr-crm_udp.csv file contains the OWD measurements across the global network, in the Samara-Crimea direction, using the OWAMP measurement utility.

Column A – represents the measurements made when the server was located in Crimea.

Column B – represents the measurements made when the server was located in Samara.
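
Two-column files such as owamp_smr-crm_udp.csv can also be read without Excel. The sketch below is a minimal example that assumes a comma-delimited file with numeric values in the first two columns; the delimiter and the presence of a header row are not documented here, so rows that do not parse as numbers are simply skipped. The function names are our own.

```python
import csv
import io
import statistics
from typing import Dict, Iterable, List, Tuple


def read_two_columns(
    lines: Iterable[str], delimiter: str = ","  # assumption: comma-delimited
) -> Tuple[List[float], List[float]]:
    """Return (column A, column B) as floats, skipping non-numeric rows."""
    col_a: List[float] = []
    col_b: List[float] = []
    for row in csv.reader(lines, delimiter=delimiter):
        if len(row) < 2:
            continue
        try:
            a, b = float(row[0]), float(row[1])
        except ValueError:
            continue  # header or malformed row
        col_a.append(a)
        col_b.append(b)
    return col_a, col_b


def summarize(values: List[float]) -> Dict[str, float]:
    """Basic summary statistics for one column of delay measurements."""
    return {
        "count": len(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
    }


if __name__ == "__main__":
    # Made-up sample values, only to show the call; not taken from the dataset.
    sample = io.StringIO("A,B\n10,20\n30,40\n")
    a, b = read_two_columns(sample)
    print(summarize(a), summarize(b))
```

To read a real file, pass an open file object, for example `read_two_columns(open(path, newline=""))`.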

 

Table 5 was built using data from files located in the Tab5 folder.

The ping.csv file contains the results of RTT measurements across the global network, in the Samara-Crimea direction, using the RIPE Atlas measuring system.

The file 1 Client in Crimea.csv contains the results of OWD measurements across the Samara-Crimea section, with an IP packet size of 46 bytes and owping2 as the measurement utility. The first column represents the measurements for the route from Samara to Crimea; the second represents the measurements for the route from Crimea to Samara. The values are in milliseconds. The file can be opened in Excel.

The file 2 Client in Crymea.csv contains the results of OWD measurements across the Crimea-Samara section, with an IP packet size of 46 bytes and owping2 as the measurement utility. The first column represents the measurements for the route from Crimea to Samara; the second represents the measurements for the route from Samara to Crimea.

 

The graph displayed in Figure 7 was constructed using data from a file located in the Fig5 folder.

The owping2-owamp.csv file contains the OWD measurements for the Crimea-Samara direction. Column A contains data measured with owping2, Column B contains data measured with OWAMP.

 

The values shown in Table 6 were obtained using data from files located in the Tab6 folder.

OWAMP.csv contains the results of measurements across a global network in the Crimea-Samara direction (client in Crimea), where the IP packet size is 1500 bytes, and the measurement utility is OWAMP.

Column A - OWD from Crimea to Samara.

Column B - OWD from Samara to Crimea.

owping2.csv contains the results of measurements across a global network in the Crimea-Samara direction (client in Crimea), where the IP packet size is 1500 bytes, the measurement utility is owping2, and the protocol is UDP.

Column A - OWD from Crimea to Samara.

Column B - OWD from Samara to Crimea.

 

In addition to the data for the present paper, this set includes several additional files located in the Add folder.

The Rostov-Samara.csv file contains the results of OWD measurements in the Rostov-on-Don to Samara direction. Column A contains data for the Rostov-Samara direction, measured with owping2. Column B contains data for the return direction, Samara-Rostov.

The Rostov-Moscow.csv file contains the results of OWD measurements in the Rostov-on-Don to Moscow direction. Column A contains data for the Rostov-Moscow direction, measured with owping2. Column B contains data for the return direction, Moscow-Rostov.

The Rostov-Crimea.csv file contains the results of OWD measurements in the Rostov-on-Don to Crimea direction. Column A contains data for the Rostov-Crimea direction, measured with owping2. Column B contains data for the return direction, Crimea-Rostov.


This dataset contains the results of the simulation runs of the experiments performed to evaluate and compare the proposed spatial model for situated multi-agent systems. The model was introduced in a paper entitled "BioMASS, a spatial model for situated multiagent systems that optimizes neighborhood search". In this paper we presented a new model to implement a spatially explicit environment that supports constant-time sensory (neighborhood search) and locomotion functions for situated multiagent systems.

Instructions: 

The dataset includes a compressed file in zip format. It contains the directory structure shown below. Each directory corresponds to a specific experiment with a given simulation toolkit and set of parameters. Inside each directory there are 50 CSV files, one for each simulation run. Each file has a header describing the main parameters of the corresponding experiment. We used the Repast and Mason toolkits to benchmark the proposed BioMASS spatial model.

 

Experiments
    heterogeneous
        heter-biomass
        heter-mason.20
        heter-mason.40
        heter-mason.200
        heter-mason.400
    Homegeneous
        homo-biomass
        homo-mason.200
        homo-mason.400
        homo-repast
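
After unzipping, a quick way to check that every experiment directory holds the expected 50 runs is to count the CSV files per directory. The sketch below assumes the archive has been extracted to a local folder and that run files carry a `.csv` extension; the function name and the example path are placeholders.

```python
from pathlib import Path
from typing import Dict


def count_runs(root: str) -> Dict[str, int]:
    """Map each directory under root that contains CSV files to its CSV file count."""
    root_path = Path(root)
    csv_dirs = sorted({p.parent for p in root_path.rglob("*.csv")})
    return {
        str(d.relative_to(root_path)): sum(1 for _ in d.glob("*.csv"))
        for d in csv_dirs
    }


if __name__ == "__main__":
    # "Experiments" is the top-level folder named in the listing; adjust the path
    # to wherever the archive was extracted.
    for experiment, n_runs in count_runs("Experiments").items():
        flag = "" if n_runs == 50 else "  <- expected 50"
        print(f"{experiment}: {n_runs}{flag}")
```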

 


This is a dataset of Finite Difference Time Domain (FDTD) simulation results for 13 defective crystals and one non-defective crystal. There are four fields in the dataset: Real, Img, Int, and Attribute. Real holds the real part of the simulated result, Img holds the imaginary part, and Int holds the intensity, all in superimposed form. Attribute denotes the label of the simulated crystal. Label 0 is for the non-defective crystal. The other 13 labels, from crystal 1 to crystal 13, are assigned to the 13 different crystals whose simulations are studied.

Instructions: 

Read the abstract.

