Coronavirus (COVID-19) Tweets Dataset


Abstract 

This dataset includes CSV files that contain the IDs and sentiment scores of tweets related to the COVID-19 pandemic. The tweets have been collected by an ongoing project deployed at https://live.rlamsal.com.np. The model monitors the real-time Twitter feed for coronavirus-related tweets using 90+ different keywords and hashtags that are commonly used when referring to the pandemic. This dataset was completely redesigned on March 20, 2020, to comply with the content redistribution policy set by Twitter.

The paper associated with this dataset is available here: Design and analysis of a large-scale COVID-19 tweets dataset

-------------------------------------

Related datasets:

(a) Tweets Originating from India During COVID-19 Lockdowns

(b) Coronavirus (COVID-19) Tweets Sentiment Trend (Global)

-------------------------------------

Below is a quick overview of this dataset.

— Dataset name: COV19Tweets Dataset

— Number of tweets : 797,355,603 tweets

— Coverage : Global

— Language : English (EN)

— Dataset usage terms : By using this dataset, you agree to (i) use the content of this dataset and the data generated from the content of this dataset for non-commercial research only, (ii) remain in compliance with Twitter's Developer Policy and (iii) cite the following paper:

Lamsal, R. Design and analysis of a large-scale COVID-19 tweets dataset. Applied Intelligence (2020). https://doi.org/10.1007/s10489-020-02029-z

— Geo-tagged Version: Coronavirus (COVID-19) Geo-tagged Tweets Dataset (GeoCOV19Tweets Dataset)

— Dataset updates : Every day

— Active keywords and hashtags (archive: keywords.tsv) : "corona", "#corona", "coronavirus", "#coronavirus", "covid", "#covid", "covid19", "#covid19", "covid-19", "#covid-19", "sarscov2", "#sarscov2", "sars cov2", "sars cov 2", "covid_19", "#covid_19", "#ncov", "ncov", "#ncov2019", "ncov2019", "2019-ncov", "#2019-ncov", "pandemic", "#pandemic", "#2019ncov", "2019ncov", "quarantine", "#quarantine", "flatten the curve", "flattening the curve", "#flatteningthecurve", "#flattenthecurve", "hand sanitizer", "#handsanitizer", "#lockdown", "lockdown", "social distancing", "#socialdistancing", "work from home", "#workfromhome", "working from home", "#workingfromhome", "ppe", "n95", "#ppe", "#n95", "#covidiots", "covidiots", "herd immunity", "#herdimmunity", "pneumonia", "#pneumonia", "chinese virus", "#chinesevirus", "wuhan virus", "#wuhanvirus", "kung flu", "#kungflu", "wearamask", "#wearamask", "wear a mask", "vaccine", "vaccines", "#vaccine", "#vaccines", "corona vaccine", "corona vaccines", "#coronavaccine", "#coronavaccines", "face shield", "#faceshield", "face shields", "#faceshields", "health worker", "#healthworker", "health workers", "#healthworkers", "#stayhomestaysafe", "#coronaupdate", "#frontlineheroes", "#coronawarriors", "#homeschool", "#homeschooling", "#hometasking", "#masks4all", "#wfh", "wash ur hands", "wash your hands", "#washurhands", "#washyourhands", "#stayathome", "#stayhome", "#selfisolating", "self isolating"

Dataset Files (the local time mentioned below is GMT+5:45)

corona_tweets_01.csv + corona_tweets_02.csv + corona_tweets_03.csv: 2,475,980 tweets (March 20, 2020 01:37 AM - March 21, 2020 09:25 AM)

corona_tweets_04.csv: 1,233,340 tweets (March 21, 2020 09:27 AM - March 22, 2020 07:46 AM)

corona_tweets_05.csv: 1,782,157 tweets (March 22, 2020 07:50 AM - March 23, 2020 09:08 AM)

corona_tweets_06.csv: 1,771,295 tweets (March 23, 2020 09:11 AM - March 24, 2020 11:35 AM)

corona_tweets_07.csv: 1,479,651 tweets (March 24, 2020 11:42 AM - March 25, 2020 11:43 AM)

corona_tweets_08.csv: 1,272,592 tweets (March 25, 2020 11:47 AM - March 26, 2020 12:46 PM)

corona_tweets_09.csv: 1,091,429 tweets (March 26, 2020 12:51 PM - March 27, 2020 11:53 AM)

corona_tweets_10.csv: 1,172,013 tweets (March 27, 2020 11:56 AM - March 28, 2020 01:59 PM)

corona_tweets_11.csv: 1,141,210 tweets (March 28, 2020 02:03 PM - March 29, 2020 04:01 PM)

corona_tweets_12.csv: 793,417 tweets (March 30, 2020 02:01 PM - March 31, 2020 10:16 AM)

corona_tweets_13.csv: 1,029,294 tweets (March 31, 2020 10:20 AM - April 01, 2020 10:59 AM)

corona_tweets_14.csv: 920,076 tweets (April 01, 2020 11:02 AM - April 02, 2020 12:19 PM)

corona_tweets_15.csv: 826,271 tweets (April 02, 2020 12:21 PM - April 03, 2020 02:38 PM)

corona_tweets_16.csv: 612,512 tweets (April 03, 2020 02:40 PM - April 04, 2020 11:54 AM)

corona_tweets_17.csv: 685,560 tweets (April 04, 2020 11:56 AM - April 05, 2020 12:54 PM)

corona_tweets_18.csv: 717,301 tweets (April 05, 2020 12:56 PM - April 06, 2020 10:57 AM)

corona_tweets_19.csv: 722,921 tweets (April 06, 2020 10:58 AM - April 07, 2020 12:28 PM)

corona_tweets_20.csv: 554,012 tweets (April 07, 2020 12:29 PM - April 08, 2020 12:34 PM)

corona_tweets_21.csv: 589,679 tweets (April 08, 2020 12:37 PM - April 09, 2020 12:18 PM)

corona_tweets_22.csv: 517,718 tweets (April 09, 2020 12:20 PM - April 10, 2020 09:20 AM)

corona_tweets_23.csv: 601,199 tweets (April 10, 2020 09:22 AM - April 11, 2020 10:22 AM)

corona_tweets_24.csv: 497,655 tweets (April 11, 2020 10:24 AM - April 12, 2020 10:53 AM)

corona_tweets_25.csv: 477,182 tweets (April 12, 2020 10:57 AM - April 13, 2020 11:43 AM)

corona_tweets_26.csv: 288,277 tweets (April 13, 2020 11:46 AM - April 14, 2020 12:49 AM)

corona_tweets_27.csv: 515,739 tweets (April 14, 2020 11:09 AM - April 15, 2020 12:38 PM)

corona_tweets_28.csv: 427,088 tweets (April 15, 2020 12:40 PM - April 16, 2020 10:03 AM)

corona_tweets_29.csv: 433,368 tweets (April 16, 2020 10:04 AM - April 17, 2020 10:38 AM)

corona_tweets_30.csv: 392,847 tweets (April 17, 2020 10:40 AM - April 18, 2020 10:17 AM)

> With the addition of some more coronavirus-specific keywords, the number of tweets captured per day has increased significantly; therefore, the CSV files hereafter will be zipped. Let's save some bandwidth.

corona_tweets_31.csv: 2,671,818 tweets (April 18, 2020 10:19 AM - April 19, 2020 09:34 AM)

corona_tweets_32.csv: 2,393,006 tweets (April 19, 2020 09:43 AM - April 20, 2020 10:45 AM)

corona_tweets_33.csv: 2,227,579 tweets (April 20, 2020 10:56 AM - April 21, 2020 10:47 AM)

corona_tweets_34.csv: 2,211,689 tweets (April 21, 2020 10:54 AM - April 22, 2020 10:33 AM)

corona_tweets_35.csv: 2,265,189 tweets (April 22, 2020 10:45 AM - April 23, 2020 10:49 AM)

corona_tweets_36.csv: 2,201,138 tweets (April 23, 2020 11:08 AM - April 24, 2020 10:39 AM)

corona_tweets_37.csv: 2,338,713 tweets (April 24, 2020 10:51 AM - April 25, 2020 11:50 AM)

corona_tweets_38.csv: 1,981,835 tweets (April 25, 2020 12:20 PM - April 26, 2020 09:13 AM)

corona_tweets_39.csv: 2,348,827 tweets (April 26, 2020 09:16 AM - April 27, 2020 10:21 AM)

corona_tweets_40.csv: 2,212,216 tweets (April 27, 2020 10:33 AM - April 28, 2020 10:09 AM)

corona_tweets_41.csv: 2,118,853 tweets (April 28, 2020 10:20 AM - April 29, 2020 08:48 AM)

corona_tweets_42.csv: 2,390,703 tweets (April 29, 2020 09:09 AM - April 30, 2020 10:33 AM)

corona_tweets_43.csv: 2,184,439 tweets (April 30, 2020 10:53 AM - May 01, 2020 10:18 AM)

corona_tweets_44.csv: 2,223,013 tweets (May 01, 2020 10:23 AM - May 02, 2020 09:54 AM)

corona_tweets_45.csv: 2,216,553 tweets (May 02, 2020 10:18 AM - May 03, 2020 09:57 AM)

corona_tweets_46.csv: 2,266,373 tweets (May 03, 2020 10:09 AM - May 04, 2020 10:17 AM)

corona_tweets_47.csv: 2,227,489 tweets (May 04, 2020 10:32 AM - May 05, 2020 10:17 AM)

corona_tweets_48.csv: 2,218,774 tweets (May 05, 2020 10:38 AM - May 06, 2020 10:26 AM)

corona_tweets_49.csv: 2,164,251 tweets (May 06, 2020 10:35 AM - May 07, 2020 09:33 AM)

corona_tweets_50.csv: 2,203,686 tweets (May 07, 2020 09:55 AM - May 08, 2020 09:35 AM)

corona_tweets_51.csv: 2,250,019 tweets (May 08, 2020 09:39 AM - May 09, 2020 09:49 AM)

corona_tweets_52.csv: 2,273,705 tweets (May 09, 2020 09:55 AM - May 10, 2020 10:11 AM)

corona_tweets_53.csv: 2,208,264 tweets (May 10, 2020 10:23 AM - May 11, 2020 09:57 AM)

corona_tweets_54.csv: 2,216,845 tweets (May 11, 2020 10:08 AM - May 12, 2020 09:52 AM)

corona_tweets_55.csv: 2,264,472 tweets (May 12, 2020 09:59 AM - May 13, 2020 10:14 AM)

corona_tweets_56.csv: 2,339,709 tweets (May 13, 2020 10:24 AM - May 14, 2020 11:21 AM)

corona_tweets_57.csv: 2,096,878 tweets (May 14, 2020 11:38 AM - May 15, 2020 09:58 AM)

corona_tweets_58.csv: 2,214,205 tweets (May 15, 2020 10:13 AM - May 16, 2020 09:43 AM)

> The server and the databases have been optimized; therefore, there is a significant rise in the number of tweets captured per day.

corona_tweets_59.csv: 3,389,090 tweets (May 16, 2020 09:58 AM - May 17, 2020 10:34 AM)

corona_tweets_60.csv: 3,530,933 tweets (May 17, 2020 10:36 AM - May 18, 2020 10:07 AM)

corona_tweets_61.csv: 3,899,631 tweets (May 18, 2020 10:08 AM - May 19, 2020 10:07 AM)

corona_tweets_62.csv: 3,767,009 tweets (May 19, 2020 10:08 AM - May 20, 2020 10:06 AM)

corona_tweets_63.csv: 3,790,455 tweets (May 20, 2020 10:06 AM - May 21, 2020 10:15 AM)

corona_tweets_64.csv: 3,582,020 tweets (May 21, 2020 10:16 AM - May 22, 2020 10:13 AM)

corona_tweets_65.csv: 3,461,470 tweets (May 22, 2020 10:14 AM - May 23, 2020 10:08 AM)

corona_tweets_66.csv: 3,477,564 tweets (May 23, 2020 10:08 AM - May 24, 2020 10:02 AM)

corona_tweets_67.csv: 3,656,446 tweets (May 24, 2020 10:02 AM - May 25, 2020 10:10 AM)

corona_tweets_68.csv: 3,474,952 tweets (May 25, 2020 10:11 AM - May 26, 2020 10:22 AM)

corona_tweets_69.csv: 3,422,960 tweets (May 26, 2020 10:22 AM - May 27, 2020 10:16 AM)

corona_tweets_70.csv: 3,480,999 tweets (May 27, 2020 10:17 AM - May 28, 2020 10:35 AM)

corona_tweets_71.csv: 3,446,008 tweets (May 28, 2020 10:36 AM - May 29, 2020 10:07 AM)

corona_tweets_72.csv: 3,492,841 tweets (May 29, 2020 10:07 AM - May 30, 2020 10:14 AM)

corona_tweets_73.csv: 3,098,817 tweets (May 30, 2020 10:15 AM - May 31, 2020 10:13 AM)

corona_tweets_74.csv: 3,234,848 tweets (May 31, 2020 10:13 AM - June 01, 2020 10:14 AM)

corona_tweets_75.csv: 3,206,132 tweets (June 01, 2020 10:15 AM - June 02, 2020 10:07 AM)

corona_tweets_76.csv: 3,206,417 tweets (June 02, 2020 10:08 AM - June 03, 2020 10:26 AM)

corona_tweets_77.csv: 3,256,225 tweets (June 03, 2020 10:27 AM - June 04, 2020 10:23 AM)

corona_tweets_78.csv: 2,205,123 tweets (June 04, 2020 10:26 AM - June 05, 2020 10:03 AM) (tweet IDs were extracted from the backup server for this session)

corona_tweets_79.csv: 3,381,184 tweets (June 05, 2020 10:11 AM - June 06, 2020 10:16 AM)

corona_tweets_80.csv: 3,194,500 tweets (June 06, 2020 10:17 AM - June 07, 2020 10:24 AM)

corona_tweets_81.csv: 2,768,780 tweets (June 07, 2020 10:25 AM - June 08, 2020 10:13 AM)

corona_tweets_82.csv: 3,032,227 tweets (June 08, 2020 10:13 AM - June 09, 2020 10:12 AM)

corona_tweets_83.csv: 2,984,970 tweets (June 09, 2020 10:12 AM - June 10, 2020 10:13 AM)

corona_tweets_84.csv: 3,068,002 tweets (June 10, 2020 10:14 AM - June 11, 2020 10:11 AM)

corona_tweets_85.csv: 3,261,215 tweets (June 11, 2020 10:12 AM - June 12, 2020 10:10 AM)

corona_tweets_86.csv: 3,378,901 tweets (June 12, 2020 10:11 AM - June 13, 2020 10:10 AM)

corona_tweets_87.csv: 3,011,103 tweets (June 13, 2020 10:11 AM - June 14, 2020 10:08 AM)

corona_tweets_88.csv: 3,154,328 tweets (June 14, 2020 10:09 AM - June 15, 2020 10:10 AM)

corona_tweets_89.csv: 3,837,552 tweets (June 15, 2020 10:10 AM - June 16, 2020 10:10 AM)

corona_tweets_90.csv: 3,889,262 tweets (June 16, 2020 10:11 AM - June 17, 2020 10:10 AM)

corona_tweets_91.csv: 3,688,348 tweets (June 17, 2020 10:10 AM - June 18, 2020 10:09 AM)

corona_tweets_92.csv: 3,673,328 tweets (June 18, 2020 10:10 AM - June 19, 2020 10:10 AM)

corona_tweets_93.csv: 3,634,172 tweets (June 19, 2020 10:10 AM - June 20, 2020 10:10 AM)

corona_tweets_94.csv: 3,610,992 tweets (June 20, 2020 10:10 AM - June 21, 2020 10:10 AM)

corona_tweets_95.csv: 3,352,643 tweets (June 21, 2020 10:10 AM - June 22, 2020 10:10 AM)

corona_tweets_96.csv: 3,730,105 tweets (June 22, 2020 10:10 AM - June 23, 2020 10:09 AM)

corona_tweets_97.csv: 3,936,238 tweets (June 23, 2020 10:10 AM - June 24, 2020 10:09 AM)

corona_tweets_98.csv: 3,858,387 tweets (June 24, 2020 10:10 AM - June 25, 2020 10:09 AM)

corona_tweets_99.csv: 3,883,506 tweets (June 25, 2020 10:10 AM - June 26, 2020 10:09 AM)

corona_tweets_100.csv: 3,941,476 tweets (June 26, 2020 10:09 AM - June 27, 2020 10:10 AM)

corona_tweets_101.csv: 3,816,987 tweets (June 27, 2020 10:11 AM - June 28, 2020 10:10 AM)

corona_tweets_102.csv: 3,743,358 tweets (June 28, 2020 10:10 AM - June 29, 2020 10:10 AM)

corona_tweets_103.csv: 3,880,998 tweets (June 29, 2020 10:10 AM - June 30, 2020 10:10 AM)

corona_tweets_104.csv: 3,926,862 tweets (June 30, 2020 10:10 AM - July 01, 2020 10:10 AM)

corona_tweets_105.csv: 4,365,171 tweets (July 01, 2020 10:11 AM - July 02, 2020 12:28 PM)

corona_tweets_106.csv: 3,563,659 tweets (July 02, 2020 12:29 PM - July 03, 2020 10:10 AM)

corona_tweets_107.csv: 3,446,100 tweets (July 03, 2020 10:10 AM - July 04, 2020 07:00 AM)

corona_tweets_108.csv: 4,076,176 tweets (July 04, 2020 07:01 AM - July 05, 2020 09:16 AM)

corona_tweets_109.csv: 3,827,904 tweets (July 05, 2020 09:17 AM - July 06, 2020 10:10 AM)

corona_tweets_110.csv: 3,991,881 tweets (July 06, 2020 10:10 AM - July 07, 2020 10:10 AM)

corona_tweets_111.csv: 4,104,245 tweets (July 07, 2020 10:11 AM - July 08, 2020 10:10 AM)

corona_tweets_112.csv: 4,032,945 tweets (July 08, 2020 10:10 AM - July 09, 2020 10:10 AM)

corona_tweets_113.csv: 3,912,560 tweets (July 09, 2020 10:10 AM - July 10, 2020 10:12 AM)

corona_tweets_114.csv: 4,024,227 tweets (July 10, 2020 10:12 AM - July 11, 2020 10:20 AM)

corona_tweets_115.csv: 3,746,316 tweets (July 11, 2020 10:20 AM - July 12, 2020 10:09 AM)

corona_tweets_116.csv: 3,902,393 tweets (July 12, 2020 10:10 AM - July 13, 2020 10:09 AM)

corona_tweets_117.csv: 4,045,441 tweets (July 13, 2020 10:10 AM - July 14, 2020 10:09 AM)

corona_tweets_118.csv: 4,130,726 tweets (July 14, 2020 10:10 AM - July 15, 2020 10:25 AM)

corona_tweets_119.csv: 4,106,648 tweets (July 15, 2020 10:26 AM - July 16, 2020 10:10 AM)

corona_tweets_120.csv: 4,083,573 tweets (July 16, 2020 10:11 AM - July 17, 2020 10:10 AM)

corona_tweets_121.csv: 4,014,323 tweets (July 17, 2020 10:10 AM - July 18, 2020 10:25 AM)

corona_tweets_122.csv: 3,639,620 tweets (July 18, 2020 10:25 AM - July 19, 2020 10:30 AM)

corona_tweets_123.csv: 3,600,404 tweets (July 19, 2020 10:30 AM - July 20, 2020 10:10 AM)

corona_tweets_124.csv: 3,777,908 tweets (July 20, 2020 10:11 AM - July 21, 2020 10:10 AM)

corona_tweets_125.csv: 3,771,150 tweets (July 21, 2020 10:11 AM - July 22, 2020 10:10 AM)

corona_tweets_126.csv: 3,691,852 tweets (July 22, 2020 10:10 AM - July 23, 2020 10:10 AM)

corona_tweets_127.csv: 3,661,885 tweets (July 23, 2020 10:10 AM - July 24, 2020 10:10 AM)

corona_tweets_128.csv: 3,621,819 tweets (July 24, 2020 10:10 AM - July 25, 2020 10:20 AM)

corona_tweets_129.csv: 3,512,553 tweets (July 25, 2020 10:20 AM - July 26, 2020 10:10 AM)

corona_tweets_130.csv: 3,399,349 tweets (July 26, 2020 10:11 AM - July 27, 2020 10:10 AM)

corona_tweets_131.csv: 3,889,978 tweets (July 27, 2020 10:10 AM - July 28, 2020 10:10 AM)

corona_tweets_132.csv: 4,167,168 tweets (July 28, 2020 10:10 AM - July 29, 2020 10:10 AM)

corona_tweets_133.csv: 4,007,131 tweets (July 29, 2020 10:10 AM - July 30, 2020 10:10 AM)

corona_tweets_134.csv: 3,968,762 tweets (July 30, 2020 10:10 AM - July 31, 2020 10:10 AM)

corona_tweets_135.csv: 3,867,434 tweets (July 31, 2020 10:10 AM - August 01, 2020 10:12 AM)

corona_tweets_136.csv: 3,533,863 tweets (August 01, 2020 10:12 AM - August 02, 2020 10:10 AM)

corona_tweets_137.csv: 3,748,433 tweets (August 02, 2020 10:10 AM - August 03, 2020 10:10 AM)

corona_tweets_138.csv: 3,810,246 tweets (August 03, 2020 10:10 AM - August 04, 2020 10:12 AM)

corona_tweets_139.csv: 3,726,039 tweets (August 04, 2020 10:12 AM - August 05, 2020 10:10 AM)

corona_tweets_140.csv: 3,770,597 tweets (August 05, 2020 10:10 AM - August 06, 2020 10:10 AM)

corona_tweets_141.csv: 3,839,194 tweets (August 06, 2020 10:10 AM - August 07, 2020 10:10 AM)

corona_tweets_142.csv: 3,702,517 tweets (August 07, 2020 10:11 AM - August 08, 2020 10:10 AM)

corona_tweets_143.csv: 3,482,091 tweets (August 08, 2020 10:11 AM - August 09, 2020 10:10 AM)

corona_tweets_144.csv: 3,822,854 tweets (August 09, 2020 10:10 AM - August 10, 2020 10:10 AM)

corona_tweets_145.csv: 3,911,443 tweets (August 10, 2020 10:10 AM - August 11, 2020 10:10 AM)

corona_tweets_146.csv: 3,838,286 tweets (August 11, 2020 10:10 AM - August 12, 2020 10:10 AM)

corona_tweets_147.csv: 3,624,028 tweets (August 12, 2020 10:10 AM - August 13, 2020 10:10 AM)

corona_tweets_148.csv: 3,749,980 tweets (August 13, 2020 10:10 AM - August 14, 2020 10:10 AM)

corona_tweets_149.csv: 3,683,305 tweets (August 14, 2020 10:10 AM - August 15, 2020 10:10 AM)

corona_tweets_150.csv: 3,187,087 tweets (August 15, 2020 10:10 AM - August 16, 2020 10:10 AM)

corona_tweets_151.csv: 3,181,939 tweets (August 16, 2020 10:10 AM - August 17, 2020 10:10 AM)

corona_tweets_152.csv: 3,680,958 tweets (August 17, 2020 10:10 AM - August 18, 2020 10:10 AM)

corona_tweets_153.csv: 3,610,316 tweets (August 18, 2020 10:10 AM - August 19, 2020 10:10 AM)

corona_tweets_154.csv: 3,534,349 tweets (August 19, 2020 10:10 AM - August 20, 2020 10:10 AM)

corona_tweets_155.csv: 3,609,804 tweets (August 20, 2020 10:10 AM - August 21, 2020 10:10 AM)

corona_tweets_156.csv: 3,962,927 tweets (August 21, 2020 10:10 AM - August 22, 2020 10:10 AM)

corona_tweets_157.csv: 3,583,818 tweets (August 22, 2020 10:10 AM - August 23, 2020 10:10 AM)

corona_tweets_158.csv: 4,045,201 tweets (August 23, 2020 10:10 AM - August 24, 2020 10:10 AM)

corona_tweets_159.csv: 3,982,835 tweets (August 24, 2020 10:10 AM - August 25, 2020 10:20 AM)

corona_tweets_160.csv: 3,896,212 tweets (August 25, 2020 10:20 AM - August 26, 2020 10:10 AM)

corona_tweets_161.csv: 3,965,851 tweets (August 26, 2020 10:10 AM - August 27, 2020 10:10 AM)

corona_tweets_162.csv: 3,913,091 tweets (August 27, 2020 10:10 AM - August 28, 2020 10:10 AM)

corona_tweets_163.csv: 3,850,248 tweets (August 28, 2020 10:10 AM - August 29, 2020 10:10 AM)

corona_tweets_164.csv: 3,282,065 tweets (August 29, 2020 10:10 AM - August 30, 2020 10:10 AM)

corona_tweets_165.csv: 3,494,658 tweets (August 30, 2020 10:11 AM - August 31, 2020 10:10 AM)

corona_tweets_166.csv: 3,725,303 tweets (August 31, 2020 10:10 AM - September 01, 2020 10:10 AM)

corona_tweets_167.csv: 3,665,464 tweets (September 01, 2020 10:10 AM - September 02, 2020 10:10 AM)

corona_tweets_168.csv: 3,742,416 tweets (September 02, 2020 10:10 AM - September 03, 2020 10:10 AM)

corona_tweets_169.csv: 3,833,791 tweets (September 03, 2020 10:10 AM - September 04, 2020 10:10 AM)

corona_tweets_170.csv: 3,189,110 tweets (September 04, 2020 10:10 AM - September 05, 2020 10:15 AM)

corona_tweets_171.csv: 2,736,116 tweets (September 05, 2020 10:15 AM - September 06, 2020 10:10 AM)

corona_tweets_172.csv: 2,742,674 tweets (September 06, 2020 10:10 AM - September 07, 2020 10:10 AM)

corona_tweets_173.csv: 3,428,867 tweets (September 07, 2020 10:10 AM - September 08, 2020 10:10 AM)

corona_tweets_174.csv: 3,596,199 tweets (September 08, 2020 10:10 AM - September 09, 2020 10:10 AM)

corona_tweets_175.csv: 3,983,190 tweets (September 09, 2020 10:11 AM - September 10, 2020 10:10 AM)

corona_tweets_176.csv: 4,032,447 tweets (September 10, 2020 10:10 AM - September 11, 2020 10:10 AM)

corona_tweets_177.csv: 3,499,620 tweets (September 11, 2020 10:10 AM - September 12, 2020 10:10 AM)

corona_tweets_178.csv: 3,165,691 tweets (September 12, 2020 10:10 AM - September 13, 2020 10:10 AM)

corona_tweets_179.csv: 3,172,727 tweets (September 13, 2020 10:10 AM - September 14, 2020 10:10 AM)

corona_tweets_180.csv: 3,590,356 tweets (September 14, 2020 10:10 AM - September 15, 2020 10:10 AM)

corona_tweets_181.csv: 3,638,935 tweets (September 15, 2020 10:10 AM - September 16, 2020 10:10 AM)

corona_tweets_182.csv: 3,839,131 tweets (September 16, 2020 10:10 AM - September 17, 2020 10:10 AM)

corona_tweets_183.csv: 3,661,202 tweets (September 17, 2020 10:10 AM - September 18, 2020 10:10 AM)

corona_tweets_184.csv: 3,328,710 tweets (September 18, 2020 10:10 AM - September 19, 2020 10:10 AM)

corona_tweets_185.csv: 2,373,557 tweets (September 19, 2020 10:10 AM - September 20, 2020 10:10 AM)

corona_tweets_186.csv: 2,717,488 tweets (September 20, 2020 10:10 AM - September 21, 2020 10:10 AM)

corona_tweets_187.csv: 3,647,346 tweets (September 21, 2020 10:10 AM - September 22, 2020 10:10 AM)

corona_tweets_188.csv: 3,773,214 tweets (September 22, 2020 10:10 AM - September 23, 2020 10:10 AM)

corona_tweets_189.csv: 3,675,631 tweets (September 23, 2020 10:10 AM - September 24, 2020 10:10 AM)

corona_tweets_190.csv: 3,469,386 tweets (September 24, 2020 10:10 AM - September 25, 2020 10:10 AM)

corona_tweets_191.csv: 2,943,510 tweets (September 25, 2020 10:10 AM - September 26, 2020 10:10 AM)

corona_tweets_192.csv: 2,750,782 tweets (September 26, 2020 10:10 AM - September 27, 2020 10:10 AM)

corona_tweets_193.csv: 2,349,966 tweets (September 27, 2020 10:10 AM - September 28, 2020 10:10 AM)

corona_tweets_194.csv: 2,828,438 tweets (September 28, 2020 10:10 AM - September 29, 2020 10:10 AM)

corona_tweets_195.csv: 3,154,852 tweets (September 29, 2020 10:10 AM - September 30, 2020 10:10 AM)

corona_tweets_196.csv: 2,652,574 tweets (September 30, 2020 10:10 AM - October 01, 2020 10:10 AM)

corona_tweets_197.csv: 3,149,122 tweets (October 01, 2020 10:10 AM - October 02, 2020 10:10 AM)

corona_tweets_198.csv: 4,312,864 tweets (October 02, 2020 10:10 AM - October 03, 2020 10:10 AM)

corona_tweets_199.csv: 4,263,630 tweets (October 03, 2020 10:10 AM - October 04, 2020 10:10 AM)

corona_tweets_200.csv: 4,029,468 tweets (October 04, 2020 10:10 AM - October 05, 2020 10:10 AM)

corona_tweets_201.csv: 4,120,708 tweets (October 05, 2020 10:10 AM - October 06, 2020 10:10 AM)

corona_tweets_202.csv: 4,250,382 tweets (October 06, 2020 10:10 AM - October 07, 2020 10:12 AM)

corona_tweets_203.csv: 4,004,487 tweets (October 07, 2020 10:12 AM - October 08, 2020 10:10 AM)

corona_tweets_204.csv: 3,963,002 tweets (October 08, 2020 10:10 AM - October 09, 2020 10:10 AM)

corona_tweets_205.csv: 3,876,621 tweets (October 09, 2020 10:10 AM - October 10, 2020 10:10 AM)

corona_tweets_206.csv: 3,725,364 tweets (October 10, 2020 10:10 AM - October 11, 2020 10:10 AM)

corona_tweets_207.csv: 3,555,115 tweets (October 11, 2020 10:10 AM - October 12, 2020 10:10 AM)

corona_tweets_208.csv: 3,836,485 tweets (October 12, 2020 10:10 AM - October 13, 2020 10:10 AM)

corona_tweets_209.csv: 3,773,010 tweets (October 13, 2020 10:10 AM - October 14, 2020 10:10 AM)

corona_tweets_210.csv: 3,530,338 tweets (October 14, 2020 10:10 AM - October 15, 2020 10:10 AM)

corona_tweets_211.csv: 3,288,065 tweets (October 15, 2020 10:10 AM - October 16, 2020 10:10 AM)

corona_tweets_212.csv: 3,111,052 tweets (October 16, 2020 10:10 AM - October 17, 2020 10:10 AM)

corona_tweets_213.csv: 2,936,275 tweets (October 17, 2020 10:10 AM - October 18, 2020 10:10 AM)

corona_tweets_214.csv: 3,114,902 tweets (October 18, 2020 10:10 AM - October 19, 2020 10:10 AM)

corona_tweets_215.csv: 3,495,131 tweets (October 19, 2020 10:10 AM - October 20, 2020 10:10 AM)

corona_tweets_216.csv: 3,474,414 tweets (October 20, 2020 10:10 AM - October 21, 2020 10:10 AM)

corona_tweets_217.csv: 3,211,180 tweets (October 21, 2020 10:10 AM - October 22, 2020 10:10 AM)

corona_tweets_218.csv: 3,342,926 tweets (October 22, 2020 10:10 AM - October 23, 2020 10:10 AM)

corona_tweets_219.csv: 3,269,122 tweets (October 23, 2020 10:10 AM - October 24, 2020 10:10 AM)

corona_tweets_220.csv: 3,193,236 tweets (October 24, 2020 10:10 AM - October 25, 2020 10:10 AM)

corona_tweets_221.csv: 3,382,538 tweets (October 25, 2020 10:10 AM - October 26, 2020 10:10 AM)

corona_tweets_222.csv: 3,339,344 tweets (October 26, 2020 10:10 AM - October 27, 2020 10:10 AM)

corona_tweets_223.csv: 2,553,632 tweets (October 27, 2020 10:10 AM - October 28, 2020 03:35 PM)

corona_tweets_224.csv: 2,865,014 tweets (October 28, 2020 03:35 PM - October 29, 2020 10:10 AM)

corona_tweets_225.csv: 3,256,945 tweets (October 29, 2020 10:10 AM - October 30, 2020 10:10 AM)

corona_tweets_226.csv: 3,428,783 tweets (October 30, 2020 10:10 AM - October 31, 2020 10:10 AM)

corona_tweets_227.csv: 3,500,835 tweets (October 31, 2020 10:10 AM - November 01, 2020 10:10 AM)

corona_tweets_228.csv: 3,371,046 tweets (November 01, 2020 10:10 AM - November 02, 2020 10:10 AM)

corona_tweets_229.csv: 3,299,450 tweets (November 02, 2020 10:10 AM - November 03, 2020 10:15 AM)

corona_tweets_230.csv: 2,601,575 tweets (November 03, 2020 10:15 AM - November 04, 2020 10:10 AM)

corona_tweets_231.csv: 2,415,739 tweets (November 04, 2020 10:10 AM - November 05, 2020 10:10 AM)

corona_tweets_232.csv: 2,781,918 tweets (November 05, 2020 10:10 AM - November 06, 2020 10:10 AM)

corona_tweets_233.csv: 2,514,116 tweets (November 06, 2020 10:10 AM - November 07, 2020 10:10 AM)

corona_tweets_234.csv: 2,998,457 tweets (November 07, 2020 10:10 AM - November 08, 2020 10:10 AM)

corona_tweets_235.csv: 2,339,099 tweets (November 08, 2020 10:10 AM - November 09, 2020 10:10 AM)

corona_tweets_236.csv: 3,524,228 tweets (November 09, 2020 10:10 AM - November 10, 2020 10:35 AM)

corona_tweets_237.csv: 3,296,368 tweets (November 10, 2020 10:35 AM - November 11, 2020 10:10 AM)

corona_tweets_238.csv: 3,156,729 tweets (November 11, 2020 10:10 AM - November 12, 2020 10:10 AM)

corona_tweets_239.csv: 3,664,120 tweets (November 12, 2020 10:10 AM - November 13, 2020 10:10 AM)

corona_tweets_240.csv: 3,850,005 tweets (November 13, 2020 10:10 AM - November 14, 2020 10:10 AM)

corona_tweets_241.csv: 3,341,609 tweets (November 14, 2020 10:10 AM - November 15, 2020 10:10 AM)

corona_tweets_242.csv: 3,424,942 tweets (November 15, 2020 10:10 AM - November 16, 2020 10:10 AM)

corona_tweets_243.csv: 3,629,902 tweets (November 16, 2020 10:10 AM - November 17, 2020 10:10 AM)

corona_tweets_244.csv: 3,566,213 tweets (November 17, 2020 10:10 AM - November 18, 2020 10:10 AM)

corona_tweets_245.csv: 3,649,808 tweets (November 18, 2020 10:10 AM - November 19, 2020 10:10 AM)

corona_tweets_246.csv: 3,593,689 tweets (November 19, 2020 10:10 AM - November 20, 2020 10:10 AM)

corona_tweets_247.csv: 3,467,627 tweets (November 20, 2020 10:10 AM - November 21, 2020 10:10 AM)

corona_tweets_248.csv: 3,302,059 tweets (November 21, 2020 10:10 AM - November 22, 2020 10:10 AM)

corona_tweets_249.csv: 3,072,344 tweets (November 22, 2020 10:10 AM - November 23, 2020 10:10 AM)

corona_tweets_250.csv: 3,405,513 tweets (November 23, 2020 10:10 AM - November 24, 2020 10:10 AM)

corona_tweets_251.csv: 3,172,136 tweets (November 24, 2020 10:10 AM - November 25, 2020 10:10 AM)

corona_tweets_252.csv: 3,084,475 tweets (November 25, 2020 10:10 AM - November 26, 2020 10:10 AM)

corona_tweets_253.csv: 2,853,608 tweets (November 26, 2020 10:10 AM - November 27, 2020 10:10 AM)

corona_tweets_254.csv: 2,463,578 tweets (November 27, 2020 10:10 AM - November 28, 2020 10:10 AM)

corona_tweets_255.csv: 2,437,566 tweets (November 28, 2020 10:10 AM - November 29, 2020 10:10 AM)

corona_tweets_256.csv: 2,570,601 tweets (November 29, 2020 10:10 AM - November 30, 2020 10:10 AM)

corona_tweets_257.csv: 3,144,302 tweets (November 30, 2020 10:10 AM - December 01, 2020 10:10 AM)

corona_tweets_258.csv: 2,892,121 tweets (December 01, 2020 10:10 AM - December 02, 2020 10:11 AM)

corona_tweets_259.csv: 3,255,342 tweets (December 02, 2020 10:11 AM - December 03, 2020 10:10 AM)

corona_tweets_260.csv: 3,398,579 tweets (December 03, 2020 10:10 AM - December 04, 2020 10:10 AM)

corona_tweets_261.csv: 3,107,132 tweets (December 04, 2020 10:10 AM - December 05, 2020 10:10 AM)

Notes on the data:

> March 29, 2020 04:02 PM - March 30, 2020 02:00 PM -- A technical fault occurred during this session. Preventive measures have been taken. Tweets for this session are not available.

> Sentiment scores are not available (only the tweet IDs are available) for the tweets collected between October 27, 2020 10:10 AM - October 28, 2020 03:35 PM (corona_tweets_223.csv). 

Why are only tweet IDs being shared?

Twitter's content redistribution policy restricts the sharing of tweet information other than tweet IDs and/or user IDs. Twitter wants researchers to always pull fresh data, because a user might delete a tweet or make their profile protected.

Playing around with 'tweet ID' and 'date & time'

In this dataset, tweet IDs are listed in different CSV files based on their creation dates. However, if you are interested in hydrating the IDs that fall within a specific time interval, you can convert a tweet ID to date & time (tweet ID > time epoch > human-readable date & time). Yes, you read that correctly; Twitter embeds a timestamp in the tweet ID. You can also convert a time epoch to a tweet ID. This way, you can easily select the tweet IDs from this dataset that fall within your preferred time frame.

> Converting tweet ID to date & time

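import time

tweet_id = ...  # the tweet ID to convert (an integer)

shifted_id = tweet_id >> 22  # right shift drops the lower 22 bits, leaving a millisecond offset

timestamp = shifted_id + 1288834974657  # add the offset to get a Unix epoch in milliseconds

date_time = time.ctime(timestamp / 1000)  # human-readable date & time in the local time zone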

> Converting date & time to tweet ID

millisecond_epoch = ...  # convert the preferred date & time to a Unix epoch in milliseconds

epoch = millisecond_epoch - 1288834974657

tweet_id = epoch << 22  # left shift by 22 bits
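The two conversions above can be wrapped into helper functions and used to pick out the IDs that fall inside a time window. The following is a minimal sketch, not part of the dataset's tooling: the function names are hypothetical, and only the constant 1288834974657 and the 22-bit shift are taken from the snippets above. It assumes timezone-aware datetime objects, and it uses the GMT+5:45 offset stated in the file listing to express the collection window of corona_tweets_10.csv.

```python
from datetime import datetime, timedelta, timezone

TWITTER_EPOCH_MS = 1288834974657  # offset used in the snippets above
TIMESTAMP_SHIFT = 22              # the lower 22 bits are not part of the timestamp


def tweet_id_to_datetime(tweet_id: int) -> datetime:
    """Return the creation time encoded in a tweet ID, as a UTC datetime."""
    ms = (tweet_id >> TIMESTAMP_SHIFT) + TWITTER_EPOCH_MS
    return datetime.fromtimestamp(ms / 1000, tz=timezone.utc)


def datetime_to_tweet_id(moment: datetime) -> int:
    """Return the smallest tweet ID whose timestamp equals `moment`.

    `moment` must be timezone-aware; a naive datetime would be
    interpreted in the machine's local time zone.
    """
    ms = int(moment.timestamp() * 1000)
    return (ms - TWITTER_EPOCH_MS) << TIMESTAMP_SHIFT


def ids_in_window(tweet_ids, start: datetime, end: datetime):
    """Keep only the IDs whose embedded timestamp falls in [start, end)."""
    low = datetime_to_tweet_id(start)
    high = datetime_to_tweet_id(end)
    return [tid for tid in tweet_ids if low <= tid < high]


# The file listing uses local time at GMT+5:45, so build windows in that zone.
LOCAL_TZ = timezone(timedelta(hours=5, minutes=45))
start = datetime(2020, 3, 27, 11, 56, tzinfo=LOCAL_TZ)
end = datetime(2020, 3, 28, 13, 59, tzinfo=LOCAL_TZ)
low_id, high_id = datetime_to_tweet_id(start), datetime_to_tweet_id(end)
```

Comparing integer IDs against low_id and high_id avoids converting every ID to a date, which matters for files that hold millions of rows.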

What about the tweets collected before March 20, 2020?

Unfortunately, I had to unpublish more than 20 million tweets collected between Jan 27, 2020, and March 20, 2020, because the collection did not include tweet IDs. "Why?" you might ask. Initially, the primary objective of the deployed model was not just to collect the tweets; it was more like an optimization project aiming to study how much information can be processed with minimal computing resources at hand in a near-real-time scenario. However, when the COVID-19 outbreak started becoming a global emergency, I decided to release the collected tweets rather than just keeping them with me. But the collection did not have tweet IDs, which is why a fresh collection was started after March 20, 2020.

Instructions: 

Each CSV file contains a list of tweet IDs. You can use these tweet IDs to download fresh data from Twitter (hydrating the tweet IDs). To make it easy for NLP researchers to get access to the sentiment analysis of each collected tweet, the sentiment score computed by TextBlob has been appended as the second column. To hydrate the tweet IDs, you can use applications such as Hydrator (available for OS X, Windows and Linux) or twarc (Python library).
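As an illustration of the two-column layout described above, the sketch below reads rows with Python's standard csv module and attaches a coarse label to each sentiment score. It is only a sketch under stated assumptions: the function names, the three-way labelling, the zero threshold, and the sample rows are all hypothetical and not part of this dataset. Rows with no second column (as in corona_tweets_223.csv, which has tweet IDs only) get a score of None.

```python
import csv
import io


def read_tweet_rows(handle):
    """Yield (tweet_id, sentiment) pairs; sentiment is None if the column is missing."""
    for row in csv.reader(handle):
        if not row or not row[0].strip():
            continue  # skip blank lines
        tweet_id = int(row[0])
        has_score = len(row) > 1 and row[1].strip() != ""
        yield tweet_id, (float(row[1]) if has_score else None)


def label(score, threshold=0.0):
    """Hypothetical three-way labelling of a sentiment score."""
    if score is None:
        return "unknown"
    if score > threshold:
        return "positive"
    if score < -threshold:
        return "negative"
    return "neutral"


# Made-up sample rows standing in for a real file from this dataset.
sample = io.StringIO(
    "1240727808080531456,0.25\n"
    "1240727808080531457,-0.5\n"
    "1240727808080531458,0.0\n"
    "1240727808080531459\n"
)
labels = [label(score) for _, score in read_tweet_rows(sample)]
```

To run this on a real file, pass an open file handle instead of the sample, for example open("corona_tweets_10.csv", newline="").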

Getting the CSV files of this dataset ready for hydrating the tweet IDs:

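import pandas as pd

# Read the original file; it has no header row (column 0: tweet ID, column 1: sentiment score)
dataframe = pd.read_csv("corona_tweets_10.csv", header=None)

# Keep only the tweet ID column
dataframe = dataframe[0]

# Write the IDs, one per line, without an index or a header row
dataframe.to_csv("ready_corona_tweets_10.csv", index=False, header=False)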

The above example code takes in the original CSV file (i.e., corona_tweets_10.csv) from this dataset and exports just the tweet ID column to a new CSV file (i.e., ready_corona_tweets_10.csv). The newly created CSV file can now be consumed by the Hydrator application for hydrating the tweet IDs. To export the tweet ID column into a TXT file, just replace ".csv" with ".txt" in the to_csv function (last line) of the above example code.

If you are not comfortable with Python and pandas, you can upload these CSV files to your Google Drive and use Google Sheets to delete the second column. Once finished with the deletion, download the edited CSV files: File > Download > Comma-separated values (.csv, current sheet). These downloaded CSV files are now ready to be used with the Hydrator app for hydrating the tweet IDs.
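For the zipped files (corona_tweets_31.csv onward), the ID column can also be extracted without unpacking the archive by hand and without pandas. The following is a rough sketch using only the Python standard library. The function name and the file names in the usage note are hypothetical, and it assumes that each archive holds one or more UTF-8 CSV files with the tweet ID in the first column.

```python
import csv
import io
import zipfile


def extract_ids(zip_path, out_path):
    """Write the first column of every CSV in the archive to out_path, one ID per line.

    Returns the number of IDs written.
    """
    count = 0
    with zipfile.ZipFile(zip_path) as archive, open(out_path, "w") as out:
        for name in archive.namelist():
            if not name.lower().endswith(".csv"):
                continue  # ignore anything that is not a CSV file
            with archive.open(name) as raw:
                # Decode the compressed byte stream as text for the csv module.
                text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
                for row in csv.reader(text):
                    if row and row[0].strip():
                        out.write(row[0].strip() + "\n")
                        count += 1
    return count
```

A call such as extract_ids("corona_tweets_31.zip", "ready_corona_tweets_31.txt") would then produce a plain list of IDs; the archive name here is a guess, so substitute the name of the file you actually downloaded.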

Comments

I am getting this error:

DatabaseError: database disk image is malformed

Submitted by Junaid khan on Mon, 03/16/2020 - 15:23

Can you tell me the name of the file you're experiencing this error with? I recommend first opening the downloaded file in an SQLite DB viewer to check whether it is corrupted.

Submitted by Rabindra Lamsal on Wed, 04/29/2020 - 00:41

Hi! Could you mention which filters you are using to get the tweets? Thanks

Submitted by Victor Tavares on Tue, 03/17/2020 - 00:21

keyword: corona, language: en

A significant number of tweets used the word 'corona' without the word 'virus', so I had to track tweets using the most generic word: just 'corona'. As a result, a few tweets relating to 'corona beer' might also be present in the databases.

Submitted by Rabindra Lamsal on Tue, 03/17/2020 - 00:45

Hi ! I cannot access the LSTM Model.

Submitted by islam sadat on Wed, 03/18/2020 - 10:41

Try refreshing. Maybe the server was busy while you were trying to access the site. I just can't believe that more than 338,500 requests have been made to the model within the last 24 hours. That number of requests is more than my model can handle. Sorry for the inconvenience!

Submitted by Rabindra Lamsal on Wed, 03/18/2020 - 11:09

Please fix these two datasets:

1. corona_tweets_2M.db.zip

2. corona_tweets_2M_2.zip

They show this error: DatabaseError: database disk image is malformed

Submitted by imran khan on Thu, 03/19/2020 - 08:19

I downloaded the very same compressed files from this page and loaded both the databases on an SQLite DB viewer. The databases work just fine. See the screenshot here: https://i.ibb.co/SyQ7ff1/Screen-Shot-2020-03-19-at-8-21-46-PM.png

I recommend opening the databases that are generating the "image is malformed" error in any DB viewer and re-saving them on your machine, or exporting them to SQL or any tabular file format of your preference.

Submitted by Rabindra Lamsal on Fri, 05/29/2020 - 01:31

Hi, thanks for providing these datasets to the public. I have one question: do all these files have the same structure? I wish they had the other fields Twitter provides with tweets, so we could do our research directly.

I wonder if the other files all have three columns only, unix, text and sentiment.

Submitted by ali ALdulaimi on Thu, 03/19/2020 - 11:54

Hello there! Yes, all the files have the same structure (unix, text, sentiment score). However, starting March 20 the collected tweets will also have one additional column, viz. tweet ID.

This is because, initially, the purpose of the deployed web app was not just to collect the tweets; it was more like an optimization project. However, when the corona outbreak started in China, I decided to release the collected tweets rather than just keeping them with me.

Submitted by Rabindra Lamsal on Thu, 03/19/2020 - 14:05

Hi Rabindra,

Have the SQLite DBs been replaced with CSVs containing only time and sentiment score? Thanks

Submitted by Bevan Ward on Sat, 03/21/2020 - 23:50

Hello Bevan. No, the first column in the CSV files is the tweet ID. You'll have to automate the extraction of tweets using the list of tweet IDs. Because of Twitter's policy, I had to remove everything except the tweet ID and sentiment score.

Submitted by Rabindra Lamsal on Sun, 03/22/2020 - 02:25

Thanks Rabindra for the reply - take care Bevan

Submitted by bevan ward on Sun, 03/29/2020 - 18:24

Hi, can you please upload the tweet IDs and sentiment scores of the old files from February and early March?

Thank you

Submitted by Rabia batool on Tue, 03/24/2020 - 05:17

Hello Rabia! Unfortunately, I had to take down all the tweets collected between Feb 1, 2020, and Mar 19, 2020, because the old DB files didn't have tweet IDs. This was because, initially, the purpose of the deployed web app was not just to collect the tweets; it was more like an optimization project. However, when the corona outbreak started in China, I decided to release the collected tweets rather than just keeping them with me. Because of Twitter's data sharing policies, I am not authorized to share the old files. Sorry for the inconvenience.

Submitted by Rabindra Lamsal on Tue, 03/24/2020 - 10:42

Thank you for your response. I completely understand this. 

 

Submitted by Rabia batool on Wed, 03/25/2020 - 03:50

Hi, I'm trying to view a particular tweet using the tweet IDs that you provided, with the piece of Python code that you provided above, after adding my credentials (CONSUMER_KEY, CONSUMER_SECRET, OAUTH_TOKEN, OAUTH_TOKEN_SECRET). However, it always gives me the following error message:

tweepy.error.TweepError: [{'code': 144, 'message': 'No status found with that ID.'}]

Have you hashed the tweet IDs that you uploaded? Any advice is appreciated.

Best regards,

Submitted by Basheer Qolomany on Mon, 03/30/2020 - 18:26

Maybe the particular tweet which you're trying to view has been either removed or hidden by the user.

Submitted by Rabindra Lamsal on Mon, 03/30/2020 - 19:56

Thanks for replying. Actually, I don't think those tweets have been removed or hidden by the users, because I tried hundreds of different tweet IDs in a for loop and all of them gave me the same error message, while some tweet IDs I got from another source worked just fine.

Here are some of the tweet IDs that I used, from file number 10 for example:

1243420522592910000

1243420476824640000

1243420477235660000

1243420477646720000

1243420477894190000

1243420478238150000

1243420478535890000

1243420478829510000

1243420478951180000

1243420479706150000

1243420479844530000

1243420479982990000

1243420479924250000

1243420478837900000

1243420480205280000

1243420481744560000

1243420482075930000

1243420482201770000

1243420482222730000

1243420482084270000

1243420482814100000

1243420482935760000

1243420482629590000

 

 

Thanks, 

Submitted by Basheer Qolomany on Mon, 03/30/2020 - 20:42

I double-checked corona_tweets_10.csv, but I could not find any of these IDs in the file. However, I can see one pattern in the tweet IDs you've listed above: they all end with a run of zeros. Use Sublime Text or a simple text editor to open the CSV files. It looks like the application you're using to open these files is truncating the trailing digits and replacing them with zeros.

For example, the last ID you've listed 1243420482629590000 should have been 1243420482629591040. See that the last 4 digits are zeroes at your end. Same is the case with all other IDs you've mentioned above.

Submitted by Rabindra Lamsal on Tue, 03/31/2020 - 02:13

Yes, that's right. Reading the CSV files with R fixed the numbers.

Also, if you have the tweet IDs for March 13 to March 19, it would be great if you could upload them here.

Thanks;

Submitted by Basheer Qolomany on Tue, 03/31/2020 - 17:35

The model has been collecting the corona-related tweets since Jan 27, 2020. However, the model was designed as a part of an optimization project and therefore it was made to only extract the tweets but not the tweet IDs. And because of Twitter's data sharing policy, I am not allowed to share them. Therefore, I started extracting and uploading the tweet IDs since March 20, 2020, only.

Submitted by Rabindra Lamsal on Tue, 03/31/2020 - 21:59

Thank you,

Submitted by Basheer Qolomany on Wed, 04/01/2020 - 18:30

I'm having the exact same issue, i.e. all IDs end with four zeros while the zeros should in fact be other digits. I was just opening it as a CSV file.

Could you please let me know how to fix it? Thank you very much!

Submitted by Mandy Huang on Wed, 04/08/2020 - 04:02

Are you trying to write a script to hydrate the tweet IDs or something else? Please see the instruction given in the dataset description field.

Submitted by Rabindra Lamsal on Wed, 04/08/2020 - 11:26

Thank you for the reply! I've tried using QCRI's Tweets Downloader to hydrate the tweet IDs, but as with the tweepy API, the first step is to get a list of correct tweet IDs, which I don't have because of the zeros at the end of the tweet_id column in the original dataset.

I saw in the previous discussion you mentioned "For example, the last ID you've listed 1243420482629590000 should have been 1243420482629591040". Could you please let me know how you got the correct tweet ID that ends with 1040? Many thanks!

Submitted by Mandy Huang on Wed, 04/08/2020 - 15:21

Hello Mandy. Please do not use MS Excel to open the CSV files. Excel shows numbers with 15 digits of precision. I would suggest you load the CSV file as a pandas data frame, then drop the sentiment column and export the final data frame as a CSV file (for the Hydrator app) or as a text file (for QCRI's Tweets Downloader tool). I hope this helps.

Submitted by Rabindra Lamsal on Sat, 06/06/2020 - 00:28
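The 15-digit precision issue described in the reply above can be illustrated with a short sketch. It only simulates what a spreadsheet appears to do (keep the leading 15 digits and zero-fill the rest); the helper name excel_like is hypothetical and this is not Excel's actual implementation. The first ID is the one quoted in the thread; the second is an illustrative value, not taken from the dataset.

```python
def excel_like(tweet_id: str, digits: int = 15) -> str:
    """Simulate a tool that keeps only `digits` leading digits and zero-fills the rest."""
    if len(tweet_id) <= digits:
        return tweet_id
    return tweet_id[:digits] + "0" * (len(tweet_id) - digits)


original = "1243420482629591040"  # ID quoted in the thread
mangled = excel_like(original)
print(mangled)              # 1243420482629590000
print(mangled == original)  # False

# A 64-bit float cannot represent odd integers of this size exactly,
# which is why IDs should be kept as text (or as 64-bit integers).
hypothetical_id = 1243420482629591041  # illustrative odd value, not from the dataset
print(int(float(hypothetical_id)) == hypothetical_id)  # False
```

Keeping the ID column as text from the moment the file is read avoids both failure modes.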

Hi,

I am trying to download all the data from Twitter using user id, but the Hydrator app always stops downloading.

Does that mean the download has reached the rate limit?

Thanks

Submitted by JINGLI SHI on Fri, 04/03/2020 - 00:09

Can you please elaborate? Also, I recommend writing to the app's author regarding the issue.

Submitted by Rabindra Lamsal on Sun, 04/05/2020 - 22:50

Congratulations for this work!

Submitted by Thiago Aparecid... on Thu, 04/09/2020 - 20:55

Thank you, Thiago.

Submitted by Rabindra Lamsal on Thu, 04/09/2020 - 22:11

Can someone share the code snippet to get the tweet text from tweet id.

Submitted by Haider Akram on Fri, 04/10/2020 - 12:11

Use Hydrator (https://github.com/DocNow/hydrator) or QCRI's Tweet Downloader tool (https://crisisnlp.qcri.org/data/tools/TweetsRetrievalTool-v2.0.zip) for downloading the tweets.

Submitted by Rabindra Lamsal on Fri, 04/10/2020 - 15:07

Can someone please help me with how to fetch the tweets? 

I am just able to see the 'Tweet IDs' and 'sentiment score'. Where and how can I download the tweets ? 

Thanks in advance. 

Submitted by Navya Shiva on Sun, 05/03/2020 - 05:48

Please refer to my reply to your comment below.

Submitted by Rabindra Lamsal on Sun, 05/03/2020 - 08:34

Hi, 

I am able to only see two columns ( 'Tweet ID' and 'sentiment score'). Could you please tell me if the tweets column is removed?

Submitted by Navya Shiva on Sun, 05/03/2020 - 06:11

Hello Navya. Because of Twitter's data sharing policy, we are not allowed to share anything except the tweet IDs and/or user IDs. Therefore, this dataset contains only the tweet IDs. In order to download the tweets, you'll need to hydrate these IDs using applications such as DocNow's Hydrator (available for OS X, Windows and Linux) or QCRI's Tweets Downloader (Java-based).

Submitted by Rabindra Lamsal on Sun, 05/03/2020 - 06:35

Hi Rabindra,

Thank you for your reply. I have downloaded Hydrator and tried downloading the tweets. However, the CSV file isn't getting downloaded (I chose only 85,000 rows of tweet IDs as a sample). It is throwing an error. Could you please help me fix it?

I am unable to post the picture here. The error is displayed as "A Javascript error occured in the main proces...", the error has so many lines with this heading. Please let me know how to go about this. 

Thank you.

Submitted by Navya Shiva on Mon, 05/04/2020 - 05:20

[updated on October 26] Maybe you did not remove the second column. The Hydrator app consumes only a list of IDs. The CSV files here have an extra column, i.e. the sentiment score. You'll need to remove the sentiment score column first and only then use Hydrator.

Submitted by Rabindra Lamsal on Mon, 10/26/2020 - 00:43

Dear

Can you mention the library used to find the sentiment?

ThankYou

Submitted by Furqan Rustam on Mon, 06/01/2020 - 22:05

Hello Furqan. If you do not want to create a Machine learning model of your own to have the sentiment scores computed, you can make use of the TextBlob NLP toolkit.

Submitted by Rabindra Lamsal on Tue, 06/02/2020 - 00:49

Dear can you mention which toolkit you used to find the sentiment score?

Submitted by Furqan Rustam on Fri, 06/05/2020 - 06:50

It is clearly mentioned in the abstract field. It's TextBlob.

Submitted by Rabindra Lamsal on Fri, 06/05/2020 - 13:40

Hello, I am working on an NLP project and want to extract tweets only from the USA. Can someone tell me how I can do it?

What range of sentiment scores is considered positive, negative and neutral?

Thank you

Submitted by Anil Kumar on Tue, 06/09/2020 - 10:48

Hello Anil. You'll have to add a condition to the country ('United States') or country_code ('US') Twitter object while hydrating the IDs. For this purpose, you can use the twarc python library.

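for tweet in t.hydrate(open('id_file.txt')):  # t is an authenticated twarc.Twarc instance
    place = tweet.get('place')  # None unless the tweet carries place information
    if place and place.get('country_code') == 'US':
        print(tweet['id_str'])  # store the Twitter object values (whichever you need)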

Visit https://developer.twitter.com/en/docs/tweets/data-dictionary/overview/tweet-object for more info regarding Twitter objects.

About the sentiment score: [-1,0) is considered negative sentiment, 0 is considered neutral and (0,1] is considered positive. Please go through TextBlob's documentation for more information.

Submitted by Rabindra Lamsal on Tue, 06/09/2020 - 12:12
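The score ranges given in the reply above can be written as a small helper. This is a minimal sketch that follows those ranges exactly ([-1,0) negative, 0 neutral, (0,1] positive); the function name sentiment_label is hypothetical and is not part of TextBlob.

```python
def sentiment_label(score: float) -> str:
    """Map a polarity score in [-1, 1] to 'negative', 'neutral' or 'positive'."""
    if not -1.0 <= score <= 1.0:
        raise ValueError(f"score must lie in [-1, 1], got {score}")
    if score < 0:
        return "negative"
    if score > 0:
        return "positive"
    return "neutral"


for s in (-0.4, 0.0, 0.75):
    print(s, sentiment_label(s))
```

Treating only an exact 0 as neutral mirrors the thresholds stated in the thread; a study may reasonably choose a wider neutral band instead.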

Hello, I am looking for some ideas for an NLP project using this dataset. I have the tweet ID and sentiment score. What could be the NLP applications for this dataset?

Thanks 

Submitted by Anil Kumar on Fri, 06/12/2020 - 10:40

Hello Anil. I think you should be the one to decide which specific task you want to use this dataset for. You'll find multiple blogs on the topic and, most importantly, you can go through the recently written papers at arxiv.org. These resources will give you hints about the specific domains in which people are working with the COVID-19 tweets datasets that are currently available on the web for non-commercial research.

I am restricted by Twitter's data sharing policy, which is why I am allowed to share only the tweet IDs. You'll have to hydrate the IDs to get other information relating to a tweet. I recommend first learning which Twitter Objects are available. Then you can really understand where and how you can use a particular tweet dataset.

Submitted by Rabindra Lamsal on Fri, 06/12/2020 - 12:33

When I download the .csv file of the first 3000 tweets using Hydrator, I obtain around 34 columns but I'm not able to view the sentiment score as one of the columns. Can you please help me with this?

Submitted by Pranav Saihgal on Mon, 06/15/2020 - 07:58

Twitter Objects do not have such a thing as a "sentiment score". You'll have to work with the original CSV file to append the second column of this dataset (sentiment) to the hydrated CSV.

You can refer to the example code given in the Instructions section to get a gist of how you can (i) load a CSV file as a pandas dataframe, (ii) play around with columns, and (iii) export the final dataframe as a CSV file ready for your study.

Submitted by Rabindra Lamsal on Mon, 06/15/2020 - 11:14
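Appending the sentiment column to a hydrated file, as suggested in the reply above, amounts to a lookup by tweet ID. The following is a minimal sketch using only the standard library. It assumes the hydrated CSV has a header row with an ID column named "id" (an assumption; adjust id_column to match your export), and the function name attach_sentiment is hypothetical.

```python
import csv


def attach_sentiment(dataset_csv, hydrated_csv, out_csv, id_column="id"):
    """Write hydrated_csv to out_csv with an extra 'sentiment' column.

    Scores are looked up by tweet ID (compared as text) in the two-column
    dataset file. Returns the number of hydrated rows that found a score.
    """
    # Build a tweet ID -> sentiment score lookup from the headerless dataset file.
    with open(dataset_csv, newline="") as f:
        scores = {row[0].strip(): row[1].strip()
                  for row in csv.reader(f) if len(row) >= 2}

    matched = 0
    with open(hydrated_csv, newline="", encoding="utf-8") as src, \
            open(out_csv, "w", newline="", encoding="utf-8") as dst:
        reader = csv.DictReader(src)
        fieldnames = list(reader.fieldnames or []) + ["sentiment"]
        writer = csv.DictWriter(dst, fieldnames=fieldnames, extrasaction="ignore")
        writer.writeheader()
        for row in reader:
            score = scores.get(row[id_column].strip())
            if score is not None:
                matched += 1
            row["sentiment"] = "" if score is None else score
            writer.writerow(row)
    return matched
```

Because deleted or protected tweets cannot be hydrated, the hydrated file will usually have fewer rows than the dataset file; looking scores up per hydrated row handles that naturally.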

Has the sentiment score been obtained after preprocessing text (tweets)  or before?

Submitted by Pranav Saihgal on Mon, 06/15/2020 - 16:28
