
Large Language Models

The LLM-RIMSA dataset is designed to advance 6G networks through ultra-massive connectivity and intelligent radio environments. It is built around a novel framework that integrates large language models (LLMs) with a reconfigurable intelligent metasurface antenna (RIMSA) architecture. This integration addresses the limitations in hardware efficiency, dynamic control, and scalability seen in existing reconfigurable intelligent surface (RIS) technologies.



This is the dataset for the paper "Privacy-Preserving Federated Embedding Learning for Localized Retrieval-Augmented Generation."
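The paper's exact protocol lives in the dataset itself; as a rough orientation, here is a minimal sketch of federated averaging (FedAvg) over embedding-model parameters, a typical starting point for privacy-preserving embedding learning. All names and shapes below are illustrative, not taken from the paper.

```python
import numpy as np

def federated_average(client_weights, client_sizes):
    """FedAvg: sample-weighted average of client embedding-model parameters.

    client_weights: list of dicts mapping parameter name -> np.ndarray
    client_sizes:   number of local training samples per client, used as
                    aggregation weights so larger local corpora count more.
    """
    total = float(sum(client_sizes))
    return {
        name: sum(w[name] * (n / total)
                  for w, n in zip(client_weights, client_sizes))
        for name in client_weights[0]
    }

# Toy round: three clients share a two-parameter embedding model.
clients = [
    {"emb.weight": np.ones((4, 3)) * i, "proj.weight": np.ones((3, 3)) * i}
    for i in (1.0, 2.0, 3.0)
]
global_model = federated_average(clients, client_sizes=[100, 200, 700])
print(global_model["emb.weight"][0])  # weighted mean, dominated by client 3
```

Only parameter updates (or aggregates of them) leave each client; the raw local documents used for retrieval never do, which is what makes the embedding learning privacy-preserving.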


IRWOZ has improved industrial human-robot interaction (HRI) dialogue systems through domain-specific annotations. However, its initial version contains substantial noise in dialogue states and utterances, limiting state-tracking accuracy. We introduce IRWOZ 2.0, which addresses these limitations through large language model (LLM)-enhanced generation (Mistral/Claude-3.5) and quality refinements.
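For readers unfamiliar with dialogue-state annotations, a hedged sketch of what a state-tracked turn typically looks like; the field names here are made up for illustration and are not IRWOZ 2.0's actual schema.

```python
# Illustrative shape of one annotated turn in a task-oriented HRI corpus.
turn = {
    "utterance": "Please move the gripper to the assembly station.",
    "state": {                      # slot-value pairs tracked across turns
        "device": "gripper",
        "target_location": "assembly_station",
        "action": "move",
    },
}

def joint_goal_match(pred: dict, gold: dict) -> float:
    """Joint-goal style check: every slot must match for the turn to count,
    which is why noisy gold states directly depress state-tracking accuracy."""
    return float(pred == gold)

print(joint_goal_match(turn["state"], turn["state"]))  # 1.0
```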


We released TrafficLLM's training datasets, which contain over 0.4M traffic data samples and 9K human instructions for adapting LLMs to different traffic analysis tasks.
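As a sketch of how such instruction data is commonly paired with traffic features for LLM tuning, here is a hypothetical record layout; TrafficLLM's actual schema may differ, so check the released files.

```python
import json

# Hypothetical record pairing raw flow features with a human instruction.
example = {
    "instruction": "Classify the application protocol of this flow.",
    "traffic": {"duration_ms": 420, "pkt_sizes": [60, 1500, 1500, 52]},
    "output": "HTTPS",
}

def to_training_text(record):
    """Flatten one record into the prompt/response pair used for tuning."""
    prompt = f"{record['instruction']}\nTraffic: {json.dumps(record['traffic'])}"
    return {"prompt": prompt, "response": record["output"]}

print(to_training_text(example)["prompt"])
```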


Computational experiments within metaverse service ecosystems enable the identification of social risks and governance crises, and the optimization of governance strategies through counterfactual inference, dynamically guiding real-world service ecosystem operations. The advent of Large Language Models (LLMs) has enabled LLM-based agents to function as autonomous service entities capable of executing diverse service operations within metaverse ecosystems, thereby facilitating the governance of metaverse service ecosystems through computational experiments.


This dataset was constructed in a study that addresses the gap between text summarization and content readability for diverse Turkish-speaking audiences. It contains paired original texts and corresponding summaries optimized for different readability levels using the YOD (Yeni Okunabilirlik Düzeyi) formula.
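For orientation, a minimal sketch of computing a YOD score using the coefficients commonly reported for the Bezirci-Yılmaz formula; verify the exact definition and constants against the paper before relying on them.

```python
import math

def yod(avg_words_per_sentence, h3, h4, h5, h6):
    """Bezirci-Yılmaz 'Yeni Okunabilirlik Düzeyi' (YOD) score.

    h3..h6 are the average per-sentence counts of words with 3, 4, 5,
    and 6+ syllables. Coefficients are the commonly cited ones
    (Bezirci & Yılmaz, 2010); confirm against the paper's definition.
    """
    return math.sqrt(
        avg_words_per_sentence * (h3 * 0.84 + h4 * 1.5 + h5 * 3.5 + h6 * 26.25)
    )

# Text averaging 9 words/sentence with mostly short words scores low
# (easier to read); long-word-heavy text pushes the score up (harder).
print(round(yod(9, h3=1.2, h4=0.4, h5=0.1, h6=0.0), 2))
```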


Facilities for the developmentally disabled face the challenge of detecting abnormal behaviors due to limited staffing and the difficulty of spotting subtle movements. Traditional methods often struggle to identify these behaviors because abnormal actions are irregular and unpredictable, leading to frequent misses or misclassifications.


The growing adoption of declarative software specification languages, coupled with their inherent difficulty in debugging, has underscored the need for effective and automated repair techniques applicable to such languages. Researchers have recently explored various methods to automatically repair declarative software specifications, such as template-based repair, feedback-driven iterative repair, and bounded exhaustive approaches. The latest developments in Large Language Models (LLMs) provide new opportunities for the automatic repair of declarative specifications.
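To make the "feedback-driven iterative repair" strategy mentioned above concrete, here is a minimal sketch of the loop. `check` and `llm_repair` are hypothetical stand-ins (for, say, an Alloy Analyzer run and an LLM call), not APIs from the paper or any library.

```python
def repair_loop(spec: str, check, llm_repair, max_rounds: int = 5) -> str:
    """Iteratively repair a declarative specification.

    Each round feeds the analyzer's failures (counterexamples, failed
    assertions) back into the prompt, so the model repairs against
    concrete feedback instead of guessing blindly.
    """
    for _ in range(max_rounds):
        failures = check(spec)        # run the analyzer on the current spec
        if not failures:
            return spec               # all checks pass: repair found
        spec = llm_repair(spec, failures)  # ask the LLM for a patched spec
    raise RuntimeError("no passing repair within the round budget")
```

Template-based and bounded exhaustive approaches differ mainly in how the candidate patch is produced; the check-and-retry skeleton stays the same.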


Code search is essential for code reuse, allowing developers to efficiently locate relevant code snippets. Traditional encoder-based models, however, face challenges with poor generalization and input length limitations. In contrast, decoder-only large language models (LLMs), with their larger size, extensive pre-training, and ability to handle longer inputs, present a promising solution to these issues. However, their effectiveness in code search has not been fully explored.
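One common way to use a decoder-only LLM as a code-search encoder is last-token pooling over its hidden states. The sketch below uses GPT-2 purely as a stand-in model; the actual pooling, prompting, and fine-tuning choices studied in this line of work may differ.

```python
import torch
from transformers import AutoModel, AutoTokenizer

# Placeholder model; any decoder-only LM with hidden states works the same way.
tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2").eval()

@torch.no_grad()
def embed(text: str) -> torch.Tensor:
    ids = tok(text, return_tensors="pt", truncation=True)
    hidden = model(**ids).last_hidden_state   # (1, seq_len, dim)
    vec = hidden[0, -1]                       # last-token pooling
    return vec / vec.norm()                   # unit-normalize for cosine sim

query = embed("reverse a linked list in python")
snippets = ["def rev(head): ...", "def quicksort(a): ..."]
scores = torch.stack([embed(s) for s in snippets]) @ query  # cosine similarities
print(snippets[int(scores.argmax())])
```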


This dataset provides the foundational resources for evaluating and optimizing Formula L, a novel mathematical framework for semantic-driven task allocation in multi-agent systems (MAS) powered by large language models (LLMs). The dataset includes Python code and both empirical and synthetic data, specifically designed to validate the effectiveness of Formula L in improving task distribution, contextual relevance, and dynamic adaptation within MAS.
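Formula L itself is defined in the paper; as a generic sketch of the surrounding allocation loop, semantic-driven task allocation can be pictured as matching task embeddings to agent-capability embeddings. The random vectors below stand in for an LLM encoder and are not the dataset's data.

```python
import numpy as np

def allocate(task_embs: np.ndarray, agent_embs: np.ndarray) -> list[int]:
    """Return the best-matching agent index for each task (cosine similarity)."""
    t = task_embs / np.linalg.norm(task_embs, axis=1, keepdims=True)
    a = agent_embs / np.linalg.norm(agent_embs, axis=1, keepdims=True)
    return (t @ a.T).argmax(axis=1).tolist()

rng = np.random.default_rng(0)
tasks = rng.normal(size=(5, 8))    # 5 task embeddings, dim 8
agents = rng.normal(size=(3, 8))   # 3 agent-capability embeddings
print(allocate(tasks, agents))     # one agent index per task
```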

The dataset comprises:
