The Winners of the 2020 IEEE DataPort Dataset Contest

IEEE DataPort is excited to announce the 2020 Dataset Upload competition winners, selected by a panel of IEEE volunteers based on technical merit and level of engagement among the IEEE DataPort global technical community. 

  • 1st Place: Rabindra Lamsal, Twitter Sentiment Analysis

  • 2nd Place: Avishek Garain, Hotel Reviews from Around the World with Sentiment Values and Review Ratings in Different Categories for Natural Language Processing

  • 3rd Place: Ibrahim Sabuncu, Corona Virus (COVID-19) Turkish Tweets Dataset

Researchers around the globe rely on IEEE DataPort to safely and easily store, share, and manage their research data. Lamsal, a researcher at Jawaharlal Nehru University in New Delhi, uses this research data platform because of its high storage capacity (up to 2TB) and the ability to connect directly with dataset owners. Garain, a student at Jadavpur University in Kolkata, India, uses IEEE DataPort because it is a stable and secure platform for making his research available to the biomedical community. And Sabuncu, a researcher at Yalova University in Yalova, Turkey, uses IEEE DataPort because of its ease of access and dataset availability.

Check out these winning datasets below and upload your own research data to IEEE DataPort.

Keep watching for announcements about the next IEEE DataPort Dataset Upload Contest, coming soon!

To learn more about how to upload your own datasets visit


1st Place: Rabindra Lamsal


This dataset includes CSV files that contain tweet IDs. The tweets have been collected by an ongoing project deployed at The model monitors the real-time Twitter feed for coronavirus-related tweets, using the following filters: language "english", and keywords "corona", "coronavirus", "covid", "pandemic", "lockdown", "quarantine", "hand sanitizer", "ppe", "n95", different possible variants of "sarscov2", "nCov", "covid-19", "ncov2019", "2019ncov", "flatten(ing) the curve", "social distancing", "work(ing) from home", and the respective hashtags of all these keywords. The dataset was completely redesigned on March 20, 2020, to comply with the content redistribution policy set by Twitter.

Submitted On: Fri, 03/13/2020 - 01:25
Last Updated On: Tue, 06/30/2020 - 05:42
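
The keyword and hashtag filter described for the first-place dataset can be illustrated with a minimal Python sketch. This is not the project's actual code: the function names, the `lang` argument, the hand-expanded variants of the parenthesised keywords, and the word-boundary matching rule are all assumptions made for illustration.

```python
import re

# Keywords quoted in the dataset description. The parenthesised forms
# ("flatten(ing)", "work(ing)") are expanded by hand here (assumption).
KEYWORDS = [
    "corona", "coronavirus", "covid", "pandemic", "lockdown", "quarantine",
    "hand sanitizer", "ppe", "n95", "sarscov2", "ncov", "covid-19",
    "ncov2019", "2019ncov", "flatten the curve", "flattening the curve",
    "social distancing", "work from home", "working from home",
]


def build_pattern(keywords):
    """Compile one case-insensitive regex matching any keyword or its hashtag."""
    alternatives = []
    for keyword in keywords:
        # Let spaces and hyphens be optional, so "hand sanitizer" also
        # matches the hashtag form "#handsanitizer".
        parts = [re.escape(part) for part in re.split(r"[\s-]+", keyword)]
        alternatives.append(r"[\s_-]*".join(parts))
    body = "|".join(alternatives)
    # Require non-alphanumeric neighbours so "ppe" does not match "slipper".
    return re.compile(
        r"(?<![a-z0-9])#?(?:" + body + r")(?![a-z0-9])", re.IGNORECASE
    )


PATTERN = build_pattern(KEYWORDS)


def is_relevant(text, lang="en"):
    """Return True for an English tweet that mentions at least one keyword."""
    return lang == "en" and PATTERN.search(text) is not None
```

For example, `is_relevant("#HandSanitizer shortage")` is accepted under these assumptions, while a tweet tagged with any other language code is rejected regardless of its text.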


2nd Place: Avishek Garain


The dataset consists of reviews for various hotels throughout the world. Its data columns range from Location and Trip Type to various review parameters, each with an individual review score. The data can be preprocessed and used for purposes such as review categorization, topic extraction, sentiment analysis, and location-based quality calculation. Trustworthy real-world data is valuable nowadays and hard to come by, so this dataset should be a good contribution for the research community as well as for professionals.

Submitted On: Wed, 04/22/2020 - 20:28
Last Updated On: Thu, 06/11/2020 - 00:43


3rd Place: Ibrahim Sabuncu


This dataset includes Covid-19 related tweets written in Turkish that contain at least one of four keywords (Covid, Kovid, Corona, Korona). These keywords are used in Turkey to refer to the Covid-19 virus. Tweet collection started on 11 March 2020, the date the first Covid-19 case was seen in Turkey.

Submitted On: Tue, 05/04/2020 - 07:48
Last Updated On: Tue, 05/19/2020 - 16:41
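
The selection rule described for the third-place dataset (at least one of four keywords, collected from 11 March 2020 onward) might be sketched in Python as follows. The function names, the substring matching, and the handling of the Turkish dotted and dotless "i" are assumptions for illustration, not the author's collection code.

```python
from datetime import date

# The four keywords and the start date come from the dataset description.
TURKISH_KEYWORDS = ("covid", "kovid", "corona", "korona")
COLLECTION_START = date(2020, 3, 11)


def fold_turkish_i(text):
    """Lowercase text, folding Turkish dotted/dotless i variants to plain 'i'.

    Python's str.lower() is locale-independent, so "İ" would otherwise become
    "i" plus a combining dot and fail to match the ASCII keywords.
    """
    return text.replace("İ", "i").replace("ı", "i").lower()


def keep_tweet(text, posted_on):
    """Return True if the tweet falls in the collection window and
    contains at least one keyword (substring match, an assumption)."""
    if posted_on < COLLECTION_START:
        return False
    folded = fold_turkish_i(text)
    return any(keyword in folded for keyword in TURKISH_KEYWORDS)
```

Substring matching also accepts compound words such as "Koronavirüs"; a stricter whole-word rule would be an equally plausible reading of the description.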