This multilingual Twitter dataset spans over 2 years from October 2019 to the end of 2021, including 3 months before the outbreak of the COVID-19 pandemic.


The presence of organisations in Online Social Networks (OSNs) has motivated malicious users to look for attack vectors, which are then used to increase the possibility of carrying out successful attacks and obtaining either private information or access to the organisation. This article hypothesised that organisations have specific languages that their members use in OSNs, which malicious users could potentially use to carry out an impersonation attack.


Twitter is one of the most popular social networks for sentiment analysis. This data set consists of tweets related to the stock market. We collected 943,672 tweets between April 9 and July 16, 2020, using the S&P 500 tag (#SPX500), references to the top 25 companies in the S&P 500 index, and the Bloomberg tag (#stocks). Of the 943,672 tweets, 1,300 were manually annotated as positive, neutral, or negative. A second independent annotator reviewed the manually annotated tweets.


This dataset is a set of eighteen directed networks that represent message exchanges among Twitter accounts during eighteen crisis events. The dataset comprises 645,339 anonymized unique user IDs and 1,396,709 edges that are labeled with respect to Plutchik's basic emotions (anger, fear, sadness, disgust, joy, trust, anticipation, and surprise) or "neutral" (if a tweet conveys no emotion).
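As a rough illustration of the structure described above, the following Python sketch represents one directed network as a list of labeled edges. The names `Edge`, `build_network`, `label_counts`, and `out_degree`, and the `(source, target, label)` row layout, are assumptions made for this example; the dataset's actual file format is not specified here.

```python
from collections import Counter
from dataclasses import dataclass

# Plutchik's eight basic emotions plus the "neutral" label, as listed above.
LABELS = {
    "anger", "fear", "sadness", "disgust",
    "joy", "trust", "anticipation", "surprise", "neutral",
}


@dataclass(frozen=True)
class Edge:
    """One directed message exchange between two anonymized user IDs."""
    source: str
    target: str
    label: str


def build_network(rows):
    """Turn (source, target, label) rows into a list of validated edges.

    The row layout is hypothetical; adapt it to the real file format.
    """
    edges = []
    for source, target, label in rows:
        if label not in LABELS:
            raise ValueError(f"unknown emotion label: {label!r}")
        edges.append(Edge(source, target, label))
    return edges


def label_counts(edges):
    """Count how many edges carry each emotion label."""
    return Counter(edge.label for edge in edges)


def out_degree(edges):
    """Count outgoing edges per user ID."""
    return Counter(edge.source for edge in edges)
```

With a structure like this, per-event statistics such as the share of "fear" edges or the most active senders can be computed without a graph library.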


This dataset includes 24,201,654 tweets related to the US Presidential Election on November 3, 2020, collected between July 1, 2020, and November 11, 2020. The related party name and sentiment score of each tweet, along with the words that affect the score, were added to the data set.


This dataset contains tweets related to COVID-19. It comprises 226,668 unique tweet IDs, covering December 2019 to May 2020. The keywords used to crawl the tweets are 'corona', 'covid', 'sarscov2', 'covid19', and 'coronavirus'.
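A keyword filter of the kind described above could be sketched as follows. This is only an illustration: the function name `matches_keywords` is hypothetical, the keywords are taken from the list above with surrounding whitespace trimmed, and case-insensitive substring matching is an assumption, since the matching rule actually used for the crawl is not stated.

```python
# Keywords as quoted in the dataset description, whitespace trimmed.
KEYWORDS = ("corona", "covid", "sarscov2", "covid19", "coronavirus")


def matches_keywords(text, keywords=KEYWORDS):
    """Return True if any keyword occurs in the text, ignoring case.

    Substring matching is an assumption made for this sketch.
    """
    lowered = text.casefold()
    return any(keyword in lowered for keyword in keywords)
```

Under substring matching, 'corona' already covers 'coronavirus' and 'covid' covers 'covid19', so the longer keywords only matter if a stricter rule such as whole-word matching is used.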


This data set includes Covid-19 related tweets written in Turkish that contain at least one of four keywords (Covid, Kovid, Corona, Korona). These keywords are used to refer to the Covid-19 virus in Turkey. Tweet collection started on 11 March 2020, the day the first Covid-19 case was seen in Turkey.

Currently, the dataset contains 4.8 million tweets, each with 6 attributes, that were sent from 9 March 2020 until 6 May 2020.

The data file contains comma-separated values (CSV). It holds the following information (6 columns) for each tweet: