Arabic Sentiment Embeddings

Citation Author(s):
Nora Al-Twairesh (King Saud University)
Hadeel Al-Negheimesh (King Saud University)
Submitted by: Nora Al-Twairesh
Last updated: Fri, 03/29/2019 - 09:02
DOI: 10.21227/aavk-g896
Abstract: 

This dataset contains sentiment-specific distributed word representations trained on 10M Arabic tweets that were labeled by distant supervision using positive and negative keywords. As described in [1], we follow the three neural architectures of Tang et al. [2], which encode the sentiment of a word in addition to its semantic and syntactic representation.

 

Specifications Table

Subject area: Natural Language Processing
More specific subject area: Arabic Sentiment Embeddings
Type of data: Text files
How data was acquired: By training the models of Tang et al. [2] on an independently collected dataset of Arabic tweets.
Data format: Raw
Data source location: Not applicable
Data accessibility:

Value of the data

· May replace hand-engineered features for sentiment classification.
· Can be used for benchmarking other Arabic sentiment embeddings.
· The Arabic sentiment embeddings can be used for other NLP tasks where sentiment is important.
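
As one illustration of the first point, a tweet can be turned into a fixed-length feature vector by averaging the embeddings of its tokens. The sketch below shows a common baseline and is not necessarily the method used in [1]. The function name `tweet_features` and the toy two-dimensional vectors are hypothetical and chosen only for the example; the real embeddings have 50 dimensions.

```python
def tweet_features(tokens, vectors, dim=50):
    """Average the vectors of the tokens that appear in the vocabulary.

    tokens:  list of already-tokenized words from one tweet
    vectors: dict mapping a word to its list of floats
    dim:     vector length (50 for the released files)

    Out-of-vocabulary tokens are skipped. If no token is known,
    a zero vector is returned so the output length stays fixed.
    """
    known = [vectors[t] for t in tokens if t in vectors]
    if not known:
        return [0.0] * dim
    # zip(*known) walks the vectors dimension by dimension.
    return [sum(column) / len(known) for column in zip(*known)]


# Toy vocabulary (hypothetical values, 2 dimensions for readability).
toy_vectors = {"a": [1.0, 2.0], "b": [3.0, 4.0]}
print(tweet_features(["a", "b", "unknown"], toy_vectors, dim=2))  # [2.0, 3.0]
```

The resulting vector can be passed to any standard classifier in place of, or alongside, hand-engineered features.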

References

  1. N. Al-Twairesh, H. Al-Negheimish, Surface and Deep Features Ensemble for Sentiment Analysis of Arabic Tweets, in submission.
  2. D. Tang, F. Wei, N. Yang, M. Zhou, T. Liu, B. Qin, Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification, in: Proc. 52nd Annu. Meet. Assoc. Comput. Linguist. Vol. 1 Long Pap., Association for Computational Linguistics, Baltimore, Maryland, 2014: pp. 1555–1565. http://www.aclweb.org/anthology/P14-1146 (accessed May 18, 2018).
Instructions: 

Data

We include three files, one for each of the models described in detail in [1]:

1. embeddings_ASEP.txt: the Arabic Sentiment Embeddings built using the Prediction model.
2. embeddings_ASER.txt: the Arabic Sentiment Embeddings built using the Ranking model.
3. embeddings_ASEH.txt: the Arabic Sentiment Embeddings built using the Hybrid model.

Each file contains 212,976 lines. Each line starts with a word from the vocabulary, followed by a space and then 50 decimal numbers separated by spaces, which form that word's vector.
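
The line format described above can be parsed with a few lines of Python. This is a minimal sketch, not an official loader: the function name `load_embeddings` is hypothetical, and UTF-8 encoding of the files is an assumption that the description does not confirm. The demo at the end uses made-up three-dimensional vectors so that it runs without the data files.

```python
EMBEDDING_DIM = 50  # vector length stated in the file description


def load_embeddings(lines, dim=EMBEDDING_DIM):
    """Parse lines of the form '<word> <v1> ... <v_dim>' into a dict.

    lines: any iterable of strings, e.g. an open text file
    Returns a dict mapping each word to a list of `dim` floats.
    """
    vectors = {}
    for line_no, line in enumerate(lines, start=1):
        line = line.rstrip()
        if not line:
            continue  # tolerate blank lines
        # Split from the right so that only the last `dim` fields
        # are treated as numbers; the remainder is the word.
        parts = line.rsplit(" ", dim)
        if len(parts) != dim + 1:
            raise ValueError(
                f"line {line_no}: expected a word followed by {dim} numbers"
            )
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    return vectors


# Demo with made-up values (3 dimensions instead of 50).
sample = ["جيد 0.5 -0.25 1.0\n", "سيئ -0.5 0.25 -1.0\n"]
toy = load_embeddings(sample, dim=3)
print(toy["جيد"])  # [0.5, -0.25, 1.0]
```

For a real file, the call would look like `with open("embeddings_ASEP.txt", encoding="utf-8") as f: vectors = load_embeddings(f)`, again assuming UTF-8. If the file matches the description, `len(vectors)` should be at most 212,976, and exactly that number when no word is repeated.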

Dataset Files

No Data files have been uploaded.

Documentation

Attachment: Arabic Sentiment Embeddings.pdf (350.35 KB)

Cite as: Nora Al-Twairesh, Hadeel Al-Negheimesh, "Arabic Sentiment Embeddings", IEEE Dataport, 2019. [Online]. Available: http://dx.doi.org/10.21227/aavk-g896. Accessed: Jul. 22, 2019.