
Numerous studies have demonstrated that microbes play a vital role in human health, making the identification of potential microbe-drug associations critical for drug discovery and clinical treatment. In this manuscript, we propose a novel prediction model named GTDEKAN, which integrates an aware Transformer network with a Dual Cross-Attention (DCA) module (comprising a Channel Cross-Attention and a Spatial Cross-Attention) and an Enhanced Kolmogorov-Arnold Network (EKAN) to infer potential microbe-drug associations.


The Thai Deaf Corpus (TDC) is constructed from a writing activity in which deaf students randomly select picture words using an image picker wheel, then write sentences corresponding to these words on a writing sheet. The sentences are transcribed and corrected manually to create the TDC.


This dataset includes 841 nodes in a mobile social network and is used to simulate how users are interconnected and influence each other within that network. Each row of data consists of two numbers representing the current location of a node. In the process of information dissemination, each user is a node that is influenced by its neighboring nodes and influences those neighbors in return.
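As a rough illustration of how such location rows might be turned into a neighbor structure, the sketch below parses two-number rows and links nodes that lie within a fixed distance of each other. The file layout (whitespace- or comma-separated values, no header row), the function names, and the distance-based neighbor rule with its `radius` parameter are all assumptions made for this example; the dataset description does not specify them.

```python
import math
from typing import Dict, List, Tuple

Point = Tuple[float, float]


def parse_locations(text: str) -> List[Point]:
    """Parse one node per row; each row holds two numbers (assumed x and y)."""
    points: List[Point] = []
    for line in text.splitlines():
        fields = line.replace(",", " ").split()
        if len(fields) != 2:
            continue  # skip blank or malformed rows
        points.append((float(fields[0]), float(fields[1])))
    return points


def neighbors_within(points: List[Point], radius: float) -> Dict[int, List[int]]:
    """Link every pair of nodes whose Euclidean distance is at most `radius`.

    The distance threshold is a hypothetical neighbor rule, not one taken
    from the dataset description.
    """
    adjacency: Dict[int, List[int]] = {i: [] for i in range(len(points))}
    for i in range(len(points)):
        for j in range(i + 1, len(points)):
            if math.dist(points[i], points[j]) <= radius:
                adjacency[i].append(j)
                adjacency[j].append(i)
    return adjacency


# Tiny made-up sample standing in for the real 841-row file.
sample = "0 0\n3 4\n10 10\n"
points = parse_locations(sample)
adjacency = neighbors_within(points, radius=5.0)
```

The pairwise loop is quadratic in the number of nodes, which is acceptable for 841 nodes; a spatial index would be the natural replacement for much larger networks.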


This dataset comprises over 38,000 seed inputs generated by a range of Large Language Models (LLMs), including ChatGPT-3.5, ChatGPT-4, Claude-Opus, Claude-Instant, and Gemini Pro 1.0, and designed for fuzzing Python functions. These seeds were produced as part of a study evaluating the utility of LLMs in automating the creation of effective fuzzing inputs, a method crucial for uncovering software defects in the Python programming environment, where traditional methods show limitations.


Nasal Cytology, or Rhinology, is the subfield of otolaryngology focused on the microscopic observation of samples of the nasal mucosa, with the aim of recognizing cells of different types and of spotting and diagnosing ongoing pathologies. This methodology can claim good accuracy in diagnosing rhinitis and infections, while being very cheap and accessible, since it requires no instrument more complex than a microscope, even an optical one.


This dataset contains results of the 60 GHz indoor sensing measurement campaign using a bistatic OFDM radar based on 5G-specified positioning reference signals (PRSs). The data can be used for testing end-to-end indoor millimeter-wave radio positioning as well as simultaneous localization and mapping (SLAM) algorithms, including channel parameter estimation. Beamformed PRS with dense angular sampling in transmission and reception allows efficient capture of line-of-sight (LoS) as well as multipath components.


Scatterplots provide a visual representation of bivariate data (or 2D embeddings of multivariate data) that allows for effective analyses of data dependencies, clusters, trends, and outliers. Unfortunately, classical scatterplots suffer from scalability issues, since growing data sizes eventually lead to overplotting and visual clutter on a screen with a fixed resolution, which hinders the data analysis process. We propose an algorithm that compensates for irregular sample distributions by a smooth transformation of the scatterplot's visual domain.


This dataset contains the online appendix of the paper titled "The effectiveness of hidden dependence metrics in bug prediction".



Wikipedia is a free encyclopedia written collaboratively by volunteers around the world. A small proportion of Wikipedia contributors are administrators, who are users with access to additional technical features that aid in maintenance. The data comprise 2,794 elections with 103,663 total votes and 7,066 users participating in the elections (either casting a vote or being voted on). Of these, 1,235 elections resulted in a successful promotion, while 1,559 did not.
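The figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below only uses the numbers stated in the description; the variable names are chosen for this example.

```python
# Counts taken directly from the dataset description.
elections = 2_794
total_votes = 103_663
successful = 1_235
unsuccessful = 1_559

# The two outcome counts should account for every election.
outcomes_consistent = (successful + unsuccessful == elections)

# Share of elections that ended in promotion, and average votes per election.
success_rate = successful / elections
votes_per_election = total_votes / elections

print(f"outcomes add up: {outcomes_consistent}")
print(f"success rate: {success_rate:.1%}")
print(f"average votes per election: {votes_per_election:.1f}")
```

Under these counts, a little under half of the elections ended in promotion.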


In the domain of Natural Language Processing (NLP), improving the English writing fluency of non-native speakers, particularly in academic contexts, poses significant challenges. While Sentence-level Revision (SentRev) endeavors to address this concern, the existing evaluation corpus, SMITH, falls short of offering a robust and comprehensive assessment of the task. To bridge this gap, our research offers a novel evaluation corpus generation scheme, leading to the creation of the Ten-Country Non-native Academic English Corpus (TCNAEC).

