COVID-19 Tweets

We present GeoCoV19, a large-scale Twitter dataset related to the ongoing COVID-19 pandemic. The dataset was collected over a period of 90 days, from February 1 to May 1, 2020, and consists of more than 524 million multilingual tweets. As geolocation information is essential for many tasks such as disease tracking and surveillance, we employed a gazetteer-based approach to extract toponyms from user location and tweet content, and derived their geolocation information from Nominatim (OpenStreetMap) data at different levels of geolocation granularity. In terms of geographical coverage, the dataset spans over 218 countries and 47K cities in the world. The tweets in the dataset are from more than 43 million Twitter users, including around 209K verified accounts. These users posted tweets in 62 different languages.

719 views
  • Artificial Intelligence
  • Last Updated On: Wed, 06/24/2020 - 15:39
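
The gazetteer-based toponym extraction described for GeoCoV19 can be illustrated with a minimal sketch. This is not the authors' pipeline: the tiny in-memory `GAZETTEER`, the `extract_toponyms` function, and the greedy longest-match strategy are all assumptions made for illustration, and a real system would look names up in Nominatim (OpenStreetMap) data rather than a hard-coded dictionary.

```python
import re

# Hypothetical toy gazetteer: lowercase place name -> (granularity, country code).
# A real pipeline would query Nominatim/OpenStreetMap data instead.
GAZETTEER = {
    "new york": ("city", "US"),
    "york": ("city", "GB"),
    "italy": ("country", "IT"),
    "milan": ("city", "IT"),
}

MAX_NGRAM = 3  # longest place name, in tokens, that we try to match


def extract_toponyms(text, gazetteer=GAZETTEER, max_ngram=MAX_NGRAM):
    """Return (name, granularity, country) tuples found by greedy longest match."""
    tokens = re.findall(r"\w+", text.lower())
    found = []
    i = 0
    while i < len(tokens):
        # Try the longest candidate first so "new york" wins over "york".
        for n in range(min(max_ngram, len(tokens) - i), 0, -1):
            candidate = " ".join(tokens[i:i + n])
            if candidate in gazetteer:
                granularity, country = gazetteer[candidate]
                found.append((candidate, granularity, country))
                i += n
                break
        else:
            i += 1  # no place name starts at this token
    return found
```

Matching the longest n-gram first is one simple way to keep a multi-word name such as "New York" from being resolved as the shorter "York"; the same function could be applied separately to the user location field and to the tweet text.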

    This dataset contains Bengali tweets related to COVID-19. It comprises 36,117 unique tweet IDs covering December 2019 to May 2020. The keywords used to crawl the tweets are 'corona', , 'covid', 'sarscov2', 'covid19', and 'coronavirus'. To obtain the other 33 data fields, send an email to "avishekgarain@gmail.com". A code snippet is provided in the documentation file. Publicly sharing Twitter data other than tweet IDs violates Twitter's policies.

    185 views
  • Artificial Intelligence
  • Last Updated On: Thu, 06/11/2020 - 09:08

    This dataset contains Spanish tweets related to COVID-19. It comprises 18,958 unique tweet IDs covering December 2019 to May 2020. The keywords used to crawl the tweets are 'corona', , 'covid', 'sarscov2', 'covid19', and 'coronavirus'. To obtain the other 33 data fields, send an email to "avishekgarain@gmail.com". A code snippet is provided in the documentation file. Publicly sharing Twitter data other than tweet IDs violates Twitter's policies.

    103 views
  • Artificial Intelligence
  • Last Updated On: Tue, 06/30/2020 - 12:10

    This dataset contains the IDs of geo-tagged tweets. The tweets are captured by an ongoing project deployed at https://live.rlamsal.com.np. The model monitors the real-time Twitter feed for the following keywords: "corona", "coronavirus", "covid", "pandemic", "lockdown", "quarantine", "hand sanitizer", "ppe", "n95", different possible variants of "sarscov2", "nCov", "covid-19", "ncov2019", "2019ncov", "flatten(ing) the curve", "social distancing", "work(ing) from home", and the respective hashtags of all these keywords.

    12932 views
  • COVID-19
  • Last Updated On: Tue, 07/14/2020 - 07:41
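
Keyword monitoring of the kind described for the geo-tagged dataset can be sketched as a case-insensitive filter that accepts a keyword either as a plain phrase or as its hashtag form. The sketch below is an assumption about how such a filter might look, not the project's actual code: `KEYWORDS` is only a subset of the listed terms, and `build_pattern` and `matches` are hypothetical helper names.

```python
import re

# Subset of the keywords listed above, used purely for illustration.
KEYWORDS = [
    "corona", "coronavirus", "covid", "pandemic", "lockdown",
    "quarantine", "hand sanitizer", "ppe", "n95", "social distancing",
]


def build_pattern(keywords):
    """Compile one regex matching each keyword as a phrase or as a hashtag."""
    alternatives = set()
    for keyword in keywords:
        words = [re.escape(word) for word in keyword.split()]
        alternatives.add(r"\s+".join(words))  # "hand sanitizer"
        alternatives.add("".join(words))      # "handsanitizer", as in a hashtag
    # Longest alternatives first so "coronavirus" is tried before "corona".
    ordered = sorted(alternatives, key=len, reverse=True)
    return re.compile(
        r"(?<!\w)#?(?:" + "|".join(ordered) + r")(?!\w)",
        re.IGNORECASE,
    )


PATTERN = build_pattern(KEYWORDS)


def matches(text):
    """Return True if the text contains any keyword or its hashtag."""
    return PATTERN.search(text) is not None
```

The word-boundary lookarounds are a design choice of this sketch: they stop "corona" from matching inside "coronation", but they also mean that a variant such as "covid19" has to be listed explicitly as its own keyword.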

    This dataset includes CSV files that contain tweet IDs. The tweets have been collected by an ongoing project deployed at https://live.rlamsal.com.np.

    60630 views
  • COVID-19
  • Last Updated On: Tue, 07/14/2020 - 07:21
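
Because these datasets distribute only tweet IDs, a typical first step is to read the IDs from a CSV file and group them into batches before requesting the full tweets. The sketch below assumes the tweet ID sits in the first column and that non-numeric rows (such as a header) should be skipped; the column layout, the helper names `read_tweet_ids` and `batched`, and the batch size are all hypothetical and should be checked against the actual files and the API being used.

```python
import csv


def read_tweet_ids(handle, column=0):
    """Read tweet IDs from an open CSV file, skipping blanks and headers.

    IDs are kept as strings so that large values are never rounded by
    tools that would treat them as floating-point numbers.
    """
    ids = []
    for row in csv.reader(handle):
        if not row:
            continue  # blank line
        value = row[column].strip()
        if value.isdigit():  # ignores a header row or malformed entries
            ids.append(value)
    return ids


def batched(items, size):
    """Split a list into consecutive chunks of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]
```

A possible usage, with a hypothetical file name, would be `with open("tweet_ids.csv", newline="") as f: ids = read_tweet_ids(f)`, followed by `batched(ids, 100)` to prepare groups of IDs for a lookup request.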