Artificial Intelligence

Knowledge Distillation Method Comparison

Artificial Intelligence (AI) has increasingly influenced modern society, most recently through significant advances in Large Language Models (LLMs). However, the high computational and storage demands of LLMs still limit their deployment in resource-constrained environments. Knowledge distillation addresses this challenge by training a smaller language model (the student) from a larger one (the teacher). Previous research has introduced several distillation methods, both for generating training data and for training the student model.
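The entry does not say which distillation methods are compared. As a purely illustrative sketch, the following shows one widely used training objective, the temperature-scaled soft-target loss, in which the student is penalized for diverging from the teacher's softened output distribution. The function names, the temperature value, and the example logits are assumptions made for illustration and are not taken from the dataset.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float], temperature: float = 1.0) -> List[float]:
    """Temperature-scaled softmax, shifted by the max for numerical stability."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(
    teacher_logits: Sequence[float],
    student_logits: Sequence[float],
    temperature: float = 2.0,  # assumed value, not from the source
) -> float:
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return kl * temperature ** 2


# Hypothetical logits over a three-token vocabulary.
teacher = [2.0, 1.0, 0.1]
matching_student = [2.0, 1.0, 0.1]
diverging_student = [0.1, 1.0, 2.0]

loss_same = distillation_loss(teacher, matching_student)
loss_diff = distillation_loss(teacher, diverging_student)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In practice this term is usually combined with a standard cross-entropy loss on ground-truth labels.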

Categories:: Artificial Intelligence
Machine Learning

20 Views

S5P2PM

Data associated with the article: "PM2.5 Retrieval with Sentinel-5P Data over Europe Exploiting Deep Learning"

Categories:: Artificial Intelligence
Signal Processing
Remote Sensing

23 Views

Wafer

Semiconductor manufacturing is a highly complex process requiring precise control and monitoring to maintain product quality and yield. This research presents a comparative analysis of three machine learning algorithms, namely Random Forest, Support Vector Machine (SVM), and XGBoost, for anomaly detection in semiconductor fabrication. Through extensive experimentation using a real-world wafer dataset, we demonstrate that XGBoost outperforms the other two models, achieving 97.1% accuracy, 96.4% precision, and 95.0% recall.
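For readers unfamiliar with the three reported metrics, the sketch below shows how accuracy, precision, and recall are derived from confusion-matrix counts in a binary anomaly-detection setting. The counts used here are invented for illustration only; they are not the study's data and do not reproduce its reported figures.

```python
from typing import Tuple


def classification_metrics(tp: int, fp: int, tn: int, fn: int) -> Tuple[float, float, float]:
    """Return (accuracy, precision, recall) from confusion-matrix counts.

    tp: anomalies correctly flagged      fp: normal wafers wrongly flagged
    tn: normal wafers correctly passed   fn: anomalies that were missed
    """
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return accuracy, precision, recall


# Hypothetical counts, chosen only to make the arithmetic easy to follow.
accuracy, precision, recall = classification_metrics(tp=8, fp=2, tn=88, fn=2)
```

With these made-up counts, accuracy is 96/100, while precision and recall are both 8/10. This illustrates why accuracy alone can look high on imbalanced anomaly data even when one in five anomalies is missed.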

Categories:: Artificial Intelligence

36 Views

historical information-event emotion dataset

This dataset, constructed around the Jilin Baishan Incident, aims to enhance the emotion prediction capabilities of large language models. Approximately 3.5 million raw comments were collected via the Weibo API, covering key information such as user identifiers, text content, timestamps, and interaction metrics. The data underwent preprocessing steps including normalization, Chinese tokenization, stopword removal, deduplication, and anomalous sample exclusion.

Categories:: Artificial Intelligence
Social Sciences

Views

UQTR Dataset - Snowy and non-snowy road images

The UQTR dataset consists of 7838 real and synthetic images of the Université du Québec à Trois-Rivières (UQTR) campus road under normal and snow conditions. The image resolution is 1280×720. It includes lane labels in .txt files, where each row stores the set of points of a lane. The points are stored as x1 y1 x2 y2, as in the tutorial by Ruijin Liu, Zejian Yuan, Tie Liu, Zhiliang Xiong: Train and Test Your Custom Data.
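The label layout described above can be read with a few lines of Python. This is a minimal sketch, assuming the coordinates in each row are separated by whitespace and alternate between x and y; the function names and the sample coordinates are hypothetical and should be checked against the actual files.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def parse_lane_row(row: str) -> List[Point]:
    """Turn one row 'x1 y1 x2 y2 ...' into a list of (x, y) points."""
    values = [float(v) for v in row.split()]
    if len(values) % 2 != 0:
        raise ValueError("expected an even number of coordinates in a lane row")
    return list(zip(values[0::2], values[1::2]))


def parse_lane_text(text: str) -> List[List[Point]]:
    """Parse the contents of a label file: one lane per non-empty row."""
    return [parse_lane_row(line) for line in text.splitlines() if line.strip()]


# Hypothetical label content for a 1280x720 image with two lanes.
sample = "100 700 150 650 200 600\n900 700 850 650\n"
lanes = parse_lane_text(sample)
```

To read a real label file, pass the file's text to `parse_lane_text`, for example via `pathlib.Path(path).read_text()`.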

Categories:: Artificial Intelligence
Transportation
Image Processing
Computer Vision
Weather

217 Views

Review data and processing documents of driverless and human driving

The data consist of 22,898 comments on driverless and human driving, collected with a web crawler from China's Weibo and XiaoHongshu platforms between May 1 and August 31, 2024. The main file formats are xlsx, py, txt, and json, among others. The py files are scripts used to process the data. The dataset was used for topic mining, sentiment analysis, and related analyses of Chinese users' comments on driverless and human driving.

Categories:: Artificial Intelligence
Transportation
Security

33 Views

Walnuts

The Walnut and Heart CT data accompanying Noisier2Inverse consist of high-resolution computed tomography (CT) scans used to evaluate deep learning-based image reconstruction under severe noise conditions. The dataset includes walnut CT scans from controlled experimental settings and clinical cardiac CT images. The Walnut data comes from https://paperswithcode.com/dataset/cbct-walnut; the Heart CT data was preprocessed in Python and is provided in .pt format.

Categories:: Artificial Intelligence

25 Views

GMIMDA: Interpretable miRNA-Disease Association Prediction via Game Optimization and Multi-view Representation Learning

miRNAs influence cellular functions by regulating gene expression and interacting with diverse biomolecules within the cell. Accurate prediction of miRNA-disease associations (MDA) plays a crucial role in disease diagnosis, treatment, and drug development. However, existing computational methods focus on network structure and ignore multi-view (e.g., linear and non-linear) information when extracting miRNA and disease features. In addition, these models are generally “black-box” in nature, which limits the understanding of their prediction mechanisms.

Categories:: Artificial Intelligence

29 Views

Experimental data

During the course of this experimental study, we meticulously collected and recorded a comprehensive set of data. These data not only reflect the precise outcomes of the experimental procedures but also directly correspond to the contents presented in the tables within the research paper. These results are crucial for validating our research hypotheses, providing a solid quantitative foundation for our understanding and analysis of the experimental phenomena.

Categories:: Artificial Intelligence
Security

23 Views
