Tweets Originating from India During COVID-19 Lockdowns 1, 2, 3, 4



This India-specific COVID-19 tweets dataset has been developed from the large-scale Coronavirus (COVID-19) Tweets Dataset, which currently contains more than 600 million COVID-19 specific English-language tweets. It contains tweets originating from India during the first week of each of the four phases of the nationwide lockdown initiated by the Government of India. For more information on the filtering keywords, please visit the primary dataset page.


Related: Coronavirus (COVID-19) Tweets Dataset, Coronavirus (COVID-19) Geo-tagged Tweets Dataset, and Coronavirus (COVID-19) Tweets Sentiment Trend (Global)


What's inside the dataset?

The files in the dataset contain IDs of the tweets present in the Coronavirus (COVID-19) Tweets Dataset. Note: In the listings below, (all files) means that every file in the stated range, including both endpoints, was used to build the ID file, while (only even-numbered files) means that only the even-numbered files in the range were used.

Lockdown period tweets: (all files)
March 25, 2020 - April 02, 2020; corona_tweets_08.csv to corona_tweets_14.csv
April 14, 2020 - April 21, 2020; corona_tweets_27.csv to corona_tweets_33.csv
May 01, 2020 - May 07, 2020; corona_tweets_44.csv to corona_tweets_49.csv
May 18, 2020 - May 23, 2020; corona_tweets_61.csv to corona_tweets_66.csv

Extras: (all files) corona_tweets_75.csv to corona_tweets_80.csv

Extras: (only even-numbered files)
corona_tweets_96.csv to corona_tweets_104.csv
corona_tweets_106.csv to corona_tweets_118.csv
corona_tweets_120.csv to corona_tweets_138.csv
corona_tweets_140.csv to corona_tweets_152.csv
corona_tweets_154.csv to corona_tweets_166.csv
corona_tweets_168.csv to corona_tweets_180.csv
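To make the (all files) and (only even-numbered files) conventions concrete, here is a minimal sketch that expands a range into the file names it covers. The helper name corona_tweet_files is hypothetical, and the two-digit zero padding is an assumption inferred from names such as corona_tweets_08.csv.

```python
def corona_tweet_files(first, last, even_only=False):
    """Return the primary-dataset file names numbered first..last, inclusive.

    Hypothetical helper: the zero padding to two digits is inferred from
    names such as corona_tweets_08.csv and is an assumption.
    """
    return [
        f"corona_tweets_{n:02d}.csv"
        for n in range(first, last + 1)
        # Keep every file, or only the even-numbered ones when requested.
        if not even_only or n % 2 == 0
    ]


# (all files): first lockdown phase
phase_one = corona_tweet_files(8, 14)

# (only even-numbered files): first "Extras" range
extras = corona_tweet_files(96, 104, even_only=True)
```

Here phase_one covers seven files (08 through 14), while extras covers five (96, 98, 100, 102, 104).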


The zipped files contain .db (SQLite database) files. Each .db file has a table named 'geo'. To hydrate the IDs, you can load the table into a pandas DataFrame and then export it to .CSV or .TXT for hydration. For more details on hydrating the IDs, please visit the primary dataset page.

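import sqlite3
import pandas as pd

# Open the SQLite database extracted from the zipped file
conn = sqlite3.connect('/path/to/the/db/file')

# Load the tweet IDs from the 'geo' table into a DataFrame
data = pd.read_sql("SELECT tweet_id FROM geo", conn)

conn.close()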
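The export step can be sketched as follows. This is a minimal, standard-library-only variant that skips pandas and writes one tweet ID per line with no header row, a layout that hydration tools commonly accept (an assumption; check the requirements of the tool you use). The function name export_tweet_ids and the output file name are hypothetical.

```python
import sqlite3


def export_tweet_ids(db_path, out_path):
    """Write every tweet_id in the 'geo' table to out_path, one per line.

    Hypothetical helper; returns the number of IDs written.
    """
    conn = sqlite3.connect(db_path)
    count = 0
    try:
        with open(out_path, "w", encoding="utf-8") as out:
            # Stream rows from the cursor so large tables are not held in memory.
            for (tweet_id,) in conn.execute("SELECT tweet_id FROM geo"):
                out.write(f"{tweet_id}\n")
                count += 1
    finally:
        conn.close()
    return count
```

A call such as export_tweet_ids('/path/to/the/db/file', 'tweet_ids.txt') would produce a text file ready to pass to a hydration tool. With pandas, the equivalent is data.to_csv('tweet_ids.txt', index=False, header=False).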