Arabic Sentiment Embeddings

Citation Author(s):
King Saud University
Submitted by:
Nora Al-Twairesh
Last updated:
Fri, 03/29/2019 - 09:02

This dataset includes sentiment-specific distributed word representations trained on 10M Arabic tweets that were labeled by distant supervision using positive and negative keywords. As described in the paper [1], we follow the three neural architectures of Tang et al. [2], which encode the sentiment of a word in addition to its semantic and syntactic representation.
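To make the idea of keyword-based distant supervision concrete, the following is a minimal sketch and not the procedure used in [1]. The keyword sets, the whitespace tokenization, and the rule of discarding tweets that match both polarities are all assumptions made for illustration; the actual keyword lists are not given here.

```python
from typing import Optional, Set

# Hypothetical keyword lists, for illustration only.
# The keywords actually used to label the tweets are not listed in this description.
POSITIVE_KEYWORDS: Set[str] = {"سعيد", "رائع"}
NEGATIVE_KEYWORDS: Set[str] = {"حزين", "سيء"}


def distant_label(
    tweet: str,
    positive: Set[str] = POSITIVE_KEYWORDS,
    negative: Set[str] = NEGATIVE_KEYWORDS,
) -> Optional[str]:
    """Assign a noisy sentiment label to a tweet from keyword matches.

    Returns "positive" or "negative" when only one polarity matches,
    and None when neither or both match (an assumed filtering rule).
    """
    tokens = set(tweet.split())  # naive whitespace tokenization (assumption)
    has_positive = bool(tokens & positive)
    has_negative = bool(tokens & negative)
    if has_positive and not has_negative:
        return "positive"
    if has_negative and not has_positive:
        return "negative"
    return None
```

Under these assumptions, a tweet containing only a positive keyword receives the label "positive", and a tweet with no keyword match is left unlabeled.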


Specifications Table

Subject area: Natural Language Processing
More specific subject area: Arabic Sentiment Embeddings
Type of data: Text files
How data was acquired: Training Tang’s [2] models on an Arabic tweets dataset that was independently collected.
Data format:
Data source location: Not applicable
Data accessibility:

Value of the data  

· May replace hand-engineered features for sentiment classification.
· Can be used for benchmarking other Arabic sentiment embeddings.
· The Arabic sentiment embeddings can be used for other NLP tasks where sentiment is important.


  1. N. Al-Twairesh, H. Al-Negheimish, Surface and Deep Features Ensemble for Sentiment Analysis of Arabic Tweets, in submission.
  2. D. Tang, F. Wei, N. Yang, M. Zhou, T. Liu, B. Qin, Learning Sentiment-Specific Word Embedding for Twitter Sentiment Classification, in: Proc. 52nd Annu. Meet. Assoc. Comput. Linguist. Vol. 1 Long Pap., Association for Computational Linguistics, Baltimore, Maryland, 2014: pp. 1555–1565. (accessed May 18, 2018).


We include three files, each corresponding to one of the models described in detail in [1]:

1. embeddings_ASEP.txt: the Arabic Sentiment Embeddings built using the Prediction model.
2. embeddings_ASER.txt: the Arabic Sentiment Embeddings built using the Ranking model.
3. embeddings_ASEH.txt: the Arabic Sentiment Embeddings built using the Hybrid model.

Each file contains 212,976 lines. Each line starts with a word from the vocabulary, followed by a space and then 50 decimal numbers separated by spaces, which represent the word vector.
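The file layout described above can be read with a few lines of Python. This is a minimal sketch, not an official loader: the function name `load_embeddings` is hypothetical, and UTF-8 encoding is an assumption, since the encoding of the files is not stated here.

```python
from typing import Dict, List


def load_embeddings(
    path: str, dim: int = 50, encoding: str = "utf-8"
) -> Dict[str, List[float]]:
    """Read a 'word v1 v2 ... v_dim' text file into a word-to-vector dict.

    The encoding default is an assumption; adjust it if decoding fails.
    """
    vectors: Dict[str, List[float]] = {}
    with open(path, encoding=encoding) as handle:
        for line_number, line in enumerate(handle, start=1):
            parts = line.split()
            if not parts:
                continue  # skip blank lines
            if len(parts) < dim + 1:
                raise ValueError(
                    f"line {line_number}: expected a word and {dim} numbers, "
                    f"found {len(parts)} fields"
                )
            # The last `dim` fields are the vector; everything before is the word.
            word = " ".join(parts[:-dim])
            vectors[word] = [float(value) for value in parts[-dim:]]
    return vectors
```

For example, `load_embeddings("embeddings_ASEP.txt")` would return a dictionary in which every value is a list of 50 floats. Taking the last 50 fields as the vector, rather than everything after the first field, keeps the parse robust if a vocabulary entry ever contains whitespace.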



Nora Al-Twairesh, Hadeel Al-Negheimesh, "Arabic Sentiment Embeddings", IEEE Dataport, 2019. doi: 10.21227/aavk-g896. Accessed: May 20, 2019.