Standard Dataset
Dataset on RAG Pipeline Evaluation for Retrieval and Generative Response Accuracy Testing
- Submitted by: Pruthvi Raj Ven...
- Last updated: Sat, 08/31/2024 - 13:01
- DOI: 10.21227/hg2s-f350
Abstract
This dataset was curated to evaluate Retrieval-Augmented Generation (RAG) pipelines on both retrieval accuracy and generative accuracy, with a particular focus on scenarios involving overlapping contexts. It has two primary components: Motor data and Employee data. The Motor dataset contains master data for various motor models along with their corresponding manuals, linked by the motor's model name. Similarly, the Employee dataset contains employee master data and associated policy documents, linked by department. Because the structured records and the documents share closely related context, the dataset can be used to test how well a RAG pipeline handles complex queries and generates precise responses in domains where context overlap is prevalent.
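The linkage described in the abstract (master records tied to documents through a shared key) can be illustrated with a minimal Python sketch. The field names (`model_name`, `doc_id`, `rated_power_kw`), the sample values, and the helper functions `link_by_key` and `retrieval_hit_rate` are all hypothetical and are not taken from the dataset files; the actual schema is described in the Dataset Instruction attachment. The hit-rate measure is one simple way to score retrieval, not necessarily the one the dataset authors intend.

```python
from collections import defaultdict

# Hypothetical records: field names and values are illustrative only,
# not taken from the dataset.
motor_master = [
    {"model_name": "M-100", "rated_power_kw": 1.5},
    {"model_name": "M-200", "rated_power_kw": 3.0},
]
motor_manuals = [
    {"model_name": "M-100", "doc_id": "manual-m100"},
    {"model_name": "M-200", "doc_id": "manual-m200"},
]


def link_by_key(master_rows, documents, key):
    """Map each master record's linking key to the ids of its documents."""
    docs_by_key = defaultdict(list)
    for doc in documents:
        docs_by_key[doc[key]].append(doc["doc_id"])
    return {row[key]: docs_by_key.get(row[key], []) for row in master_rows}


def retrieval_hit_rate(retrieved, expected):
    """Fraction of queries whose retrieved ids include at least one expected id."""
    if not expected:
        return 0.0
    hits = sum(
        1
        for query, gold_ids in expected.items()
        if set(retrieved.get(query, [])) & set(gold_ids)
    )
    return hits / len(expected)


# Ground-truth links: which manual belongs to which motor model.
links = link_by_key(motor_master, motor_manuals, "model_name")
```

The same pattern would apply to the Employee data by swapping the key to a department field. Comparing a retriever's output against `links` with `retrieval_hit_rate` shows whether overlapping contexts cause it to return a manual for the wrong model.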
Documentation
Attachment | Size |
---|---|
Dataset Instruction | 5.99 KB |