ArPC is an Arabic paraphrase identification corpus. It consists of 1,331 sentence pairs, each with a binary score that indicates whether the pair is a paraphrase or not. The corpus was manually annotated by three native Arabic speakers.
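
The description above implies a simple record shape: two sentences plus a binary label. The following is a minimal sketch of reading such records, not the corpus's documented format. The tab-separated layout, the column order, the header row, and the names `SentencePair` and `read_pairs` are all assumptions made for illustration, and the sample sentences are placeholders rather than entries from ArPC.

```python
import csv
import io
from typing import NamedTuple


class SentencePair(NamedTuple):
    """One record: two sentences and a binary paraphrase label (hypothetical layout)."""
    first: str
    second: str
    is_paraphrase: bool


def read_pairs(handle):
    """Parse tab-separated rows assumed to be: sentence1, sentence2, label (0 or 1)."""
    pairs = []
    for row in csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE):
        if len(row) != 3:
            continue  # skip blank or malformed rows
        first, second, label = row
        label = label.strip()
        if label not in {"0", "1"}:
            continue  # skip a header row or any non-binary label
        pairs.append(SentencePair(first, second, label == "1"))
    return pairs


# In-memory sample standing in for a real file; placeholder sentences, not ArPC data.
sample = io.StringIO(
    "sentence1\tsentence2\tlabel\n"
    "الجو جميل اليوم\tالطقس رائع اليوم\t1\n"
    "الجو جميل اليوم\tذهب الولد إلى المدرسة\t0\n"
)
pairs = read_pairs(sample)
positives = sum(p.is_paraphrase for p in pairs)
print(len(pairs), positives)
```

With a real file, the `io.StringIO` sample would be replaced by `open(path, encoding="utf-8", newline="")`, after checking the README for the actual delimiter and column order.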

Dataset Files

Documentation: 
Attachment    Size
README.txt    537 bytes
[1] Alaa Altheneyan, Mohamed Menai, "ArPC a corpus for paraphrase identification in Arabic text", IEEE Dataport, 2019. [Online]. Available: http://dx.doi.org/10.21227/jtmt-zg81. Accessed: Feb. 01, 2023.
@data{jtmt-zg81-19,
doi = {10.21227/jtmt-zg81},
url = {http://dx.doi.org/10.21227/jtmt-zg81},
author = {Alaa Altheneyan and Mohamed Menai},
publisher = {IEEE Dataport},
title = {ArPC a corpus for paraphrase identification in Arabic text},
year = {2019} }
TY  - DATA
T1  - ArPC a corpus for paraphrase identification in Arabic text
AU  - Altheneyan, Alaa
AU  - Menai, Mohamed
PY  - 2019
PB  - IEEE Dataport
DO  - 10.21227/jtmt-zg81
UR  - http://dx.doi.org/10.21227/jtmt-zg81
ER  - 