Instructions for Initiating a Data Competition

Prerequisites

 

IEEE DataPort Components for Data Competition Initiation

To initiate a Data Competition on IEEE DataPort, provide the following required components:

  • Dataset files: Data Competition initiators may upload one or more dataset files to IEEE DataPort in the Data Competitions module; these files will be the focus of the Data Competition. Dataset files may be up to 2 TB each, and the following formats are currently supported: zip, gz, gzip, csv, json, txt, sql, xml, tsv, ebs, avro, orc, parquet, hdf5, 7z, tbz2, iso, tar, bz2, z, xls, xlsx.
  • TIP: For large datasets, we recommend uploading a series of files that are 100 GB or less. This will make the upload process easier to manage, and will also enable users to more easily work with your files.
  • TIP: For datasets with a large number of files (>100), we recommend compressing your upload(s) using the ZIP or GZIP format. GZIP may work best with some big data solutions.
  • Title: A descriptive title that will be used to reference your Data Competition on IEEE DataPort and for citation.
  • Citation Author(s): One or more dataset authors/owners that will be used for citation and DOI generation.
  • Start & End Dates: This date range defines the beginning and end of the submission period for your competition.
  • Competition Image: Please upload an image that can be used to help identify your competition on the IEEE DataPort site.
  • Abstract: A detailed overview of the competition, including details about the associated datasets, scripts, and files.
  • Instructions: A detailed set of instructions indicating the rules of your Data Competition as well as all technical details on how to work with associated datasets, scripts, and files.
  • Access Restrictions: Indicate if access to your Data Competition will be restricted by a list of email addresses.
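
Before uploading, it can help to check a planned set of files against the limits described above. The following Python sketch is illustrative only and is not part of IEEE DataPort: the function names `check_file` and `check_upload` are hypothetical, the format list and the 2 TB, 100 GB, and 100-file figures are taken from the component list above, and decimal units (1 GB = 10^9 bytes) are an assumption because the text does not specify them.

```python
from pathlib import Path

# Formats listed as currently supported in the component list above.
SUPPORTED_EXTENSIONS = {
    "zip", "gz", "gzip", "csv", "json", "txt", "sql", "xml", "tsv", "ebs",
    "avro", "orc", "parquet", "hdf5", "7z", "tbz2", "iso", "tar", "bz2",
    "z", "xls", "xlsx",
}

# Assumption: decimal units; the source text does not say which are meant.
MAX_FILE_BYTES = 2 * 10**12            # 2 TB limit per file
RECOMMENDED_FILE_BYTES = 100 * 10**9   # 100 GB recommended ceiling (TIP)
MANY_FILES_THRESHOLD = 100             # more files than this: compress (TIP)


def check_file(name: str, size_bytes: int) -> list[str]:
    """Return a list of issues found for one candidate upload file."""
    issues = []
    # Only the final extension is inspected, so "data.tar.gz" counts as "gz".
    ext = Path(name).suffix.lstrip(".").lower()
    if ext not in SUPPORTED_EXTENSIONS:
        issues.append(f"{name}: extension '{ext}' is not in the supported list")
    if size_bytes > MAX_FILE_BYTES:
        issues.append(f"{name}: larger than the 2 TB per-file limit")
    elif size_bytes > RECOMMENDED_FILE_BYTES:
        issues.append(f"{name}: larger than the recommended 100 GB; consider splitting")
    return issues


def check_upload(files: dict[str, int]) -> list[str]:
    """Check a mapping of file name -> size in bytes and collect all issues."""
    issues = []
    for name, size in files.items():
        issues.extend(check_file(name, size))
    if len(files) > MANY_FILES_THRESHOLD:
        issues.append("more than 100 files: consider a ZIP or GZIP archive")
    return issues
```

For example, `check_upload({"train.csv": 5 * 10**9})` returns an empty list, while a 150 GB archive produces a single advisory message about the recommended size.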

 

Instructions

1.       Log in to IEEE DataPort (the login link is located in the upper right of the website).

2.       Click on the COMPETITIONS tab on the IEEE DataPort banner.

3.       Click on SUBMIT A DATA COMPETITION on the lower left-hand side of the screen.

4.       Complete the required inputs on the Meta Data, Abstract, Instructions, and Access tabs (all required inputs are marked with a red asterisk).

A.       Meta Data: 

i. Required Fields: 

1.       Title of Dataset

2.       Citation Authors = name(s) of data author(s)

3.       Competition Start and End Dates (the date range that defines the submission period for the competition)

ii. Other Fields:

1.       Keywords – enter keywords so that the dataset is searchable

2.       Data Format – enter format of dataset file (e.g. CSV, TXT, ORC, Avro, XML, SQL, JSON, XLSX); for ZIP and GZIP files, indicate the format of the included files.

3.       Related Dataset – if your dataset is related to another IEEE DataPort dataset, please enter the title of the related dataset in this field (autocomplete available)

4.       Links – enter links to external documentation, data sources, project pages, author homepage, etc. Multiple links are allowed; to add more than one link, click the “Add Another Link” button.

B.      Abstract

i. Required Fields

1.       Abstract – enter the abstract for the dataset in the required Abstract field.

2.       Dataset Image – choose an image for your dataset to improve its appearance and identification on IEEE DataPort; select the image file and click the Upload button.

C.       Instructions

i. Required Fields

1.       Instructions – enter detailed instructions so that participants can understand the Data Competition and participate in it effectively.

ii. Other Fields

1.       Documentation – the Data Competition administrator may upload documents that contain additional instructional information for the Data Competition; choose a file and click the Upload button. Allowed formats are pdf, txt, docx, doc, and md.

D.      Access

i.      Required Fields

1.       The Data Competition administrator must indicate whether the Data Competition has access restrictions. The administrator must check either “Restricted Access via Whitelist” or “No Access Restrictions”.

a.       If the Competition is “Restricted Access via Whitelist”, the Data Competition administrator should enter the email addresses (if known) of the individuals allowed to participate in the Data Competition, separated by commas, in the “Whitelist” box on the Access page. If participants are not yet known, the “Whitelist” box can be left blank and email addresses can be added by the administrator over time as participants are identified.

b.      If the Competition has “No Access Restrictions”, no entry needs to be made in the “Whitelist” box, and all logged-in users of IEEE DataPort will have access to the Data Competition.

2.       Select the template for the Request to Participate in Data Competition form (currently there is only one standard template for the request form).

3.       Review the Terms of Use using the link provided and indicate that you agree to the IEEE DataPort Terms of Use by checking the box at the bottom of the Meta Data, Abstract, Instructions, or Access page; your Data Competition dataset cannot be uploaded until you agree to the IEEE DataPort Terms of Use.

4.       Click SUBMIT DATA COMPETITION at the bottom of the page after all required Meta Data, Abstract, Instructions, and Access inputs have been provided.

5.       A DOI will be generated and will appear in the right-hand section of the page titled DATASET DETAILS. On this page, click the UPLOAD YOUR COMPETITION DATASET button to open a screen where you can select and upload the actual dataset file. Choose your competition dataset file and then click the Upload button; files may be zip, gz, gzip, csv, json, sql, xml, tsv, ebs, avro, orc, parquet, hdf5, or 7z. The name of the competition dataset file will appear on the screen after it loads. You will then see a box in which you can add a short descriptive title for each dataset file loaded.

6.       If your competition dataset consists of multiple files, you can again select ‘choose’ and ‘upload’ to upload additional files.

7.       At the bottom of the page, click SUBMIT COMPETITION DATASET to load the dataset into IEEE DataPort.

8.       The dataset will be assigned an S3 URI and will be downloadable after it is uploaded to IEEE DataPort.

9.       Congratulations – you have now uploaded your data competition to IEEE DataPort!  You can verify your competition dataset by going to the home page, searching for your competition dataset and opening it! 

10.       As an administrator, you can now manage your competition by approving participation request forms and/or updating the Whitelist for the Data Competition, updating your Data Competition (e.g. adding instructions), and viewing analysis results that participants have submitted on IEEE DataPort. All participants should submit their Data Competition results through IEEE DataPort to make it easy for you as the administrator to evaluate results.

11.   If you have restricted access to your Data Competition, you will see two additional buttons at the bottom of your Data Competition page:

a.       EDIT ACCESS FORM:  Use this button to customize the ‘Request Access’ form; options include:

 i.      Add, delete or modify the form fields as needed.

ii.      Enter one or more email addresses where submissions will be sent (your email address will be included automatically, but you can add other email addresses).

iii.      Enter additional instructions that users who submit the ‘Request Access’ form will receive.

iv.      Set up an email notification to be sent to users who submit the ‘Request Access’ form.

b.      ACCESS FORM SUBMISSIONS:  View and/or download the ‘Request Access’ form submissions.

12.   If you are restricting access to your Data Competition, click the EDIT DATA COMPETITION button to manage your Data Competition whitelist. On the ACCESS tab you may add to or edit your email whitelist. Please also use the ‘Request Access’ submission to inform your users once they have been granted access to the Data Competition.
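
When access is restricted, the Access step above asks for participant email addresses separated by commas. The following Python sketch shows one way to prepare such a string from a list kept elsewhere. It is illustrative only: `build_whitelist` is a hypothetical helper, not an IEEE DataPort function, and joining with a bare comma (no space) as well as treating addresses as case-insensitive duplicates are assumptions, since the instructions say only "separated by commas".

```python
def build_whitelist(emails):
    """Return a comma-separated string of unique, trimmed email addresses."""
    seen = set()
    cleaned = []
    for raw in emails:
        addr = raw.strip()
        if not addr:
            continue  # skip blank entries
        # Loose sanity check only; this is not full address validation.
        if addr.count("@") != 1 or addr.startswith("@") or addr.endswith("@"):
            raise ValueError(f"does not look like an email address: {addr!r}")
        key = addr.lower()  # assumption: compare addresses case-insensitively
        if key not in seen:
            seen.add(key)
            cleaned.append(addr)
    return ",".join(cleaned)
```

The resulting string can be pasted into the “Whitelist” box; the first spelling of each address is kept and later duplicates are dropped.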

 

Background

IEEE DataPort™ is designed to help all members of the global technical community host and easily administer Data Competitions. A Data Competition is a time-limited challenge in which a dataset is provided and members of the global technical community are invited to provide a specific analysis or make predictions based on the dataset. Participation in the Data Competition is managed by the Data Competition initiator and can be open to all members of the global technical community or limited to a specific set of participants.

IEEE DataPort supports efficient hosting of Data Competitions and enables:

  • easy initiation and administration of the Data Competition, including control of the Data Competition duration,
  • acceptance of analysis results in IEEE DataPort so Data Competition results are consolidated and easily evaluated,
  • effective management of Data Competition participation.

 

IEEE DataPort Data Competition Features

  • IEEE DataPort Data Competition datasets will be provided with a DOI (Digital Object Identifier) and stored in Amazon Simple Storage Service (S3) file storage.
  • Customizable Access Request Form: The Data Competition initiator may attach and customize a ‘Request Access’ form that participants may be required to submit before they can gain access to dataset files. The form and resulting notifications are fully customizable by competition administrators.
  • Citation Generator: Citations are generated in popular citation formats.
  • Comments: Registered users may submit comments and questions on the Data Competition page(s).
  • Private Messages: Registered users may submit private messages to competition leads.
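
Because competition datasets are stored in Amazon S3 and each uploaded dataset receives an S3 URI, participants who script their downloads often need the bucket name and object key separately. The following Python sketch splits a URI into those two parts using only the standard library. It is illustrative only: `split_s3_uri` is a hypothetical helper, and the URI in the usage note is a made-up placeholder, not a real IEEE DataPort location.

```python
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> tuple[str, str]:
    """Split 's3://bucket/key/path' into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The bucket is the network-location part; the key is the path
    # without its leading slash.
    return parsed.netloc, parsed.path.lstrip("/")
```

For instance, `split_s3_uri("s3://example-bucket/competitions/train.csv")` yields the bucket `example-bucket` and the key `competitions/train.csv`.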