Artificial Intelligence

This data set contains: 

- Training dataset: 271 CT scans of inner ears used for optimization and training of the model.

- Validation dataset: 70 CT scans of inner ears used for external validation.


SSADLog is a novel log-based anomaly detection framework. It introduces a hyper-efficient log data pre-processing method that generates a representative subset of small sample logs. This is the SSADLog pre-processed BGL dataset, which is used in the training, test1, and test2 stages. The small sample datasets significantly reduce the time required to execute the entire SSADLog framework while still providing a holistic understanding of the original log sequences.
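The description above does not spell out how SSADLog selects its small sample logs, so the following is only a generic sketch of the underlying idea: collapse log lines that differ only in variable parameters and keep one representative per group. The masking rules, the function names `to_template` and `representative_subset`, and the sample lines are all hypothetical and are not taken from SSADLog or from the BGL files.

```python
import re

# Hypothetical masking rules (assumption): hex values and integers are treated
# as variable parameters and replaced by placeholders.
_HEX = re.compile(r"\b0x[0-9a-fA-F]+\b")
_NUM = re.compile(r"\b\d+\b")


def to_template(line: str) -> str:
    """Mask hex values and integers so lines differing only in parameters collapse."""
    line = _HEX.sub("<HEX>", line)
    line = _NUM.sub("<NUM>", line)
    return " ".join(line.split())  # normalize whitespace


def representative_subset(lines, per_template=1):
    """Keep at most `per_template` raw lines per distinct template, in first-seen order."""
    kept = {}
    for line in lines:
        bucket = kept.setdefault(to_template(line), [])
        if len(bucket) < per_template:
            bucket.append(line)
    return [line for bucket in kept.values() for line in bucket]


# Illustrative, made-up log messages (not real BGL records).
logs = [
    "instruction cache parity error corrected",
    "generating core.2275",
    "generating core.862",
    "data TLB error interrupt at 0x00544eb8",
    "data TLB error interrupt at 0x1a2b3c4d",
    "instruction cache parity error corrected",
]
subset = representative_subset(logs)
for line in subset:
    print(line)
```

In this toy input, six lines reduce to three representatives, which illustrates why a sampled subset can cut processing time while still covering every distinct message pattern.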


Data on 2355 COVID-19 cases recorded from July to December 2021 were extracted from a data set maintained by COVID-19 referral centers in Qazvin province, Iran. We recorded a wide range of clinical characteristics, including age, sex, previous diseases, and hospitalization time. We also collected data on the medications administered, including Atorvastatin 20 mg, Atorvastatin 40 mg, Ivermectin 3 mg, Ivermectin 40 mg, Dexamethasone, Kaletra, Favipiravir, Famotidine 40 mg, Interferon, Remdesivir, and Hydroxychloroquine.


During our research on generating and optimizing molecules as drug candidates by extending deep reinforcement learning and graph neural network algorithms, we used GEOM data [1]. This gave us the idea of building a dataset from GEOM molecules to predict activity against COVID and drug-likeness. We calculated over 200 descriptors for the molecules using RDKit [2]. We hope you enjoy using it.
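As a rough illustration of the descriptor step, the sketch below computes every descriptor registered in RDKit's `Descriptors.descList` for a single molecule. The aspirin SMILES string is an arbitrary example rather than a GEOM entry, and using RDKit's QED score as a stand-in for drug-likeness is an assumption; the description does not say which descriptors or which drug-likeness measure the authors used.

```python
from rdkit import Chem
from rdkit.Chem import Descriptors, QED

# Arbitrary example molecule (aspirin); not taken from the GEOM data.
smiles = "CC(=O)Oc1ccccc1C(=O)O"
mol = Chem.MolFromSmiles(smiles)
if mol is None:
    raise ValueError(f"Could not parse SMILES: {smiles}")

# Descriptors.descList is a list of (name, function) pairs; the exact number
# of entries depends on the installed RDKit version.
descriptors = {name: func(mol) for name, func in Descriptors.descList}

# Assumption: QED is used here only as one common drug-likeness estimate.
qed_score = QED.qed(mol)

print(f"{len(descriptors)} descriptors computed")
print(f"MolWt = {descriptors['MolWt']:.2f}")
print(f"QED   = {qed_score:.3f}")
```

Applying the same dictionary comprehension to each molecule in a collection yields one row of descriptor values per molecule, which is the tabular shape a descriptor dataset typically takes.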




This dataset consists of 200 occurrences extracted from three fields of a web analytics tool over time, along with labels indicating the service availability status at that moment. The data pertains to customer access to a real financial institution. The columns are named with a type and a unique identifier number. The column TX_ACAO_EVT represents an action performed by the customer, such as a click, system message, or background application action. The column TX_CTGR_EVT represents the category of the action, such as an error message or a specific type of action.



The datasets under discussion present detailed records for two of the world's most cultivated crops: wheat and rice. They aim to provide comprehensive insights into the environmental and soil-related factors traditionally considered influential in determining the yield of these crops. By analyzing these datasets, researchers, agronomists, and farmers can gain a better understanding of the interplay between different attributes and how they might impact overall crop yield.


Wheat Dataset:



The pathology files of 194 colon cancer patients, 137 breast cancer patients, 124 gastric cancer patients, and 169 thyroid cancer patients who were referred to the healthcare facilities of Qazvin Province, Iran, were examined for age, sex, surgery type, and pathological information. The information was collected between 2010 and 2020.


Social Media Big Dataset for Research, Analytics, Prediction, and Understanding the Global Climate Change Trends focuses on understanding climate science, trends, and public awareness of climate change. Using the dataset for analytics supports research into, and comprehension of, global climate change trends.


The Numerical Latin Letters (DNLL) dataset consists of Latin numeric letters organized into 26 distinct letter classes, corresponding to the Latin alphabet. Each class within this dataset encompasses multiple letter forms, resulting in a diverse and extensive collection. These letters vary in color, size, writing style, thickness, background, orientation, luminosity, and other attributes, making the dataset highly comprehensive and rich.


This paper conducts a systematic bibliometric analysis in the Artificial Intelligence (AI) domain to explore privacy protection research as AI technologies integrate and data privacy concerns rise. Understanding evolutionary patterns and current trends in this research is crucial. Leveraging bibliometric techniques, the authors analyze 3,061 papers from the Web of Science (WoS) database, spanning 1994 to 2023.