Tweets Originating from India During COVID-19 Lockdowns

Abstract 

This India-specific COVID-19 tweets dataset has been developed using the large-scale Coronavirus (COVID-19) Tweets Dataset, which currently contains more than 700 million COVID-19-specific English-language tweets. This dataset contains tweets originating from India during the first week of each of the four phases of the nationwide lockdown initiated by the Government of India. For more information on filtering keywords, please visit the primary dataset page.

Dataset usage terms: By using this dataset, you agree to (i) use the content of this dataset and the data generated from it for non-commercial research only, (ii) remain in compliance with Twitter's Developer Policy, and (iii) cite the following paper:

Lamsal, R. Design and analysis of a large-scale COVID-19 tweets dataset. Applied Intelligence (2020). https://doi.org/10.1007/s10489-020-02029-z

-------------------------------------

Related datasets:

(a) Coronavirus (COVID-19) Tweets Dataset

(b) Coronavirus (COVID-19) Geo-tagged Tweets Dataset

(c) Coronavirus (COVID-19) Tweets Sentiment Trend (Global)

-------------------------------------

What's inside the dataset? The files in the dataset contain IDs of the tweets present in the Coronavirus (COVID-19) Tweets Dataset. Note: Below, (all files) means that every file in the stated range, including the first and last files named, was used to build the ID file, while (only even-numbered files) means that only the even-numbered files in the range were used.

Lockdown period tweets: (all files)

Lockdown1.zip: March 25, 2020 - April 02, 2020; corona_tweets_08.csv to corona_tweets_14.csv

Lockdown2.zip: April 14, 2020 - April 21, 2020; corona_tweets_27.csv to corona_tweets_33.csv

Lockdown3.zip: May 01, 2020 - May 07, 2020; corona_tweets_44.csv to corona_tweets_49.csv

Lockdown4.zip: May 18, 2020 - May 23, 2020; corona_tweets_61.csv to corona_tweets_66.csv

Extras: (all files)

extras_june1_june7.zip: corona_tweets_75.csv to corona_tweets_80.csv

Extras: (only even-numbered files)

extras_june24_july1.zip: corona_tweets_96.csv to corona_tweets_104.csv

extras_july2_july15.zip: corona_tweets_106.csv to corona_tweets_118.csv

extras_july16_august4.zip: corona_tweets_120.csv to corona_tweets_138.csv

extras_august5_august18.zip: corona_tweets_140.csv to corona_tweets_152.csv

extras_august19_september1.zip: corona_tweets_154.csv to corona_tweets_166.csv

extras_september2_september15.zip: corona_tweets_168.csv to corona_tweets_180.csv
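
The file ranges above can be expanded into explicit file names with a short Python sketch. The helper name `source_files` is hypothetical, and the naming pattern (two-digit zero padding for numbers below 100) is inferred from the names listed above rather than taken from the primary dataset's documentation.

```python
def source_files(first, last, even_only=False):
    """Return the corona_tweets_NN.csv names from first to last, inclusive.

    With even_only=True, only even-numbered files are returned, matching
    the "(only even-numbered files)" archives listed above.
    """
    if even_only and first % 2 != 0:
        raise ValueError("first must be even when even_only is True")
    step = 2 if even_only else 1
    # {n:02d} pads single-digit numbers (08) and leaves 104 unchanged.
    return [f"corona_tweets_{n:02d}.csv" for n in range(first, last + 1, step)]


# Lockdown1.zip draws on every file in its range.
lockdown1 = source_files(8, 14)

# extras_june24_july1.zip draws on the even-numbered files only.
extras_june24 = source_files(96, 104, even_only=True)
```

This only reproduces the list of source file names; it does not download or read any of the files.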

Comments

Please also develop a Pakistan-specific COVID-19 tweets dataset.

Submitted by Saghir Ahmed on Sat, 10/24/2020 - 10:35

Hello Saghir.
You can hydrate the IDs present in the primary dataset to create country-specific datasets. If you closely follow the instructions that I emailed you earlier, you can easily extract Pakistan-specific tweets based on geo-tagged info and/or Twitter place info.
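
A minimal Python sketch of the country filter, assuming the IDs have already been hydrated and each tweet is a dict in the Twitter API v1.1 JSON layout, where `place` is either null or an object with a `country_code` field. The function name `tweets_from_country` and the sample tweets are made up for illustration; the hydration step itself is not shown.

```python
def tweets_from_country(tweets, country_code):
    """Yield hydrated tweets whose place object matches the country code."""
    wanted = country_code.upper()
    for tweet in tweets:
        # "place" is null for tweets without Twitter place info.
        place = tweet.get("place") or {}
        if (place.get("country_code") or "").upper() == wanted:
            yield tweet


# Made-up sample records; real hydrated tweets carry many more fields.
sample = [
    {"id_str": "1", "place": {"country_code": "PK", "full_name": "Lahore, Pakistan"}},
    {"id_str": "2", "place": {"country_code": "IN", "full_name": "Mumbai, India"}},
    {"id_str": "3", "place": None},
]

pakistan_ids = [t["id_str"] for t in tweets_from_country(sample, "PK")]
```

Tweets that carry only point coordinates and no place object would need a separate check that is not covered here.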

Submitted by Rabindra Lamsal on Sun, 10/25/2020 - 00:50

Sir, how were the scores generated?

Submitted by Aswin Krishna M on Mon, 11/02/2020 - 08:49

Hello Aswin.
The scores are generated by TextBlob's sentiment analysis module. For more info please visit the primary dataset's page.

Submitted by Rabindra Lamsal on Tue, 11/03/2020 - 23:54

Sir, are these tweets unique, or do they include retweets?

Submitted by GONGATI REDDY on Thu, 11/19/2020 - 05:35

Tweets in this dataset are unique because the retweets have NULL geo and place objects.

Submitted by Rabindra Lamsal on Fri, 11/20/2020 - 23:48

Sir, how can I get daily tweets from India, especially from specific states or cities?
I am using the geocode option in Python, but it is not responding.
In R, I also get this warning:
Warning message in doRppAPICall("search/tweets", n, params = params, retryOnRateLimit = retryOnRateLimit, : "100 tweets were requested but the API can only return 0"

Submitted by GONGATI REDDY on Tue, 12/29/2020 - 01:34

Please refer to my comment posted below.

Submitted by Rabindra Lamsal on Sat, 01/02/2021 - 02:16

How can I get daily COVID-related tweets from cities such as Delhi, Mumbai, Kolkata, Hyderabad, and Bangalore directly from R or Python?

Submitted by GONGATI REDDY on Tue, 12/29/2020 - 01:59

Once the IDs are hydrated, you can filter the tweets as you prefer (any spreadsheet application will do).
Or, if you are comfortable with some programming, you can apply conditions to the "place" Twitter Geo Object:
E.g., tweet["place"]["full_name"] == "New Delhi, India" or tweet["place"]["full_name"] == "Mumbai, India", and so on.
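
The condition on the "place" object could be applied to a whole batch of hydrated tweets roughly as follows. This is a sketch that assumes each tweet is a dict in the Twitter API v1.1 JSON layout; the function name `tweets_from_places` and the sample records are hypothetical.

```python
def tweets_from_places(tweets, full_names):
    """Return the tweets whose place full_name is one of full_names."""
    wanted = set(full_names)
    selected = []
    for tweet in tweets:
        # Guard against tweets whose "place" is null or missing.
        place = tweet.get("place") or {}
        if place.get("full_name") in wanted:
            selected.append(tweet)
    return selected


# Made-up sample records for illustration only.
hydrated = [
    {"id_str": "10", "place": {"full_name": "New Delhi, India"}},
    {"id_str": "11", "place": {"full_name": "Pune, India"}},
    {"id_str": "12", "place": {"full_name": "Mumbai, India"}},
    {"id_str": "13", "place": None},
]

city_tweets = tweets_from_places(hydrated, ["New Delhi, India", "Mumbai, India"])
city_ids = [t["id_str"] for t in city_tweets]
```

The match is exact, so the strings passed in must be spelled the same way as the full_name values in the hydrated data.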

I hope this helps.

Submitted by Rabindra Lamsal on Mon, 01/04/2021 - 01:25