BHI 2017: Big Data Analytics Competition

Submission Dates:
11/28/2016 to 02/19/2017
Citation Author(s):
United States Patent and Trademark Office
Submitted by:
Chris Franzino
Last updated:
Tue, 08/08/2017 - 10:52
Data Format:
Creative Commons Attribution


Data Science is all about the processes and methods used to access and analyze data to gain insights for informed decision making. To promote awareness of Big Data and its analytic technology, IEEE EMBS and the IEEE Big Data Initiative are organizing a Data Analytics Competition. The competition will be held during the International Conference on Biomedical and Health Informatics (IEEE BHI2017), 16-19 February 2017 in Orlando, Florida, and is open to all participants of the conference.

Data is all around us. In particular, biomedical data is relevant to the themes of the 4th International Conference on Biomedical and Health Informatics (BHI2017). The challenge is to analyze a curated dataset and determine what can be learned from it.


The data to be used for the competition is a curated dataset of US patent information [1] based upon BHI keywords. Your challenge is to analyze the dataset and present an interesting finding based on your analysis. Download the dataset using the link on the right.

[1] PAIR Bulk Data. United States Patent and Trademark Office, Accessed 10 October 2016.

Keywords: Smart medicine, precision medicine, preventive medicine, bioinformatics, digital imaging, artificial intelligence, internet of things, mobile health, electronic health, telemedicine
Status: Patented Case



One consideration, for example: better-informed practitioners, clinicians, and consumers will enable stronger patient care and health management. A study of the data may reveal predictive trends and yield recommendations for a better-informed consumer. What insights can you provide by analyzing the dataset? Your insights and suggestions are expected to be creative. Questions to consider include:

  • Based on this dataset, what are the most common medical and health applications where patent development is occurring?
  • How frequently are patents being filed with the same title?
  • How would you improve this dataset to better distinguish unique patents with duplicate titles?
  • What additional data or metadata would you include in this dataset to help researchers locate relevant medical and health patents more efficiently?
  • What conclusions can you draw from this data?
  • What trends, if any, have formed over the past decade, and where are they moving? Consider both health industry and patent filing perspectives.
  • What anomalies can you find in this data? Is there anything that affects the integrity of the data?
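As a starting point for the questions about duplicate titles and filing trends, here is a minimal Python sketch. It runs on a few made-up records rather than the real file, and the field names `title` and `filing_date` are assumptions; the actual column names in HealthPatentData.xlsx may differ, and the title normalization shown is only one possible choice.

```python
from collections import Counter

# Toy records standing in for rows of the dataset. The field names
# "title" and "filing_date" are hypothetical, not the real column names.
SAMPLE_ROWS = [
    {"title": "Telemedicine System", "filing_date": "2009-03-14"},
    {"title": "telemedicine system ", "filing_date": "2011-07-02"},
    {"title": "Mobile Health Monitor", "filing_date": "2011-11-20"},
    {"title": "Digital Imaging Apparatus", "filing_date": "2014-01-05"},
]


def normalize_title(title):
    """Lowercase and collapse whitespace so trivially different titles match."""
    return " ".join(title.lower().split())


def duplicate_title_counts(rows):
    """Return {normalized title: count} for titles that appear more than once."""
    counts = Counter(normalize_title(row["title"]) for row in rows)
    return {title: n for title, n in counts.items() if n > 1}


def filings_per_year(rows):
    """Return {year: number of filings}, assuming ISO dates (YYYY-MM-DD)."""
    counts = Counter(int(row["filing_date"][:4]) for row in rows)
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(duplicate_title_counts(SAMPLE_ROWS))
    print(filings_per_year(SAMPLE_ROWS))
```

To apply this to the competition data, the spreadsheet rows would first need to be loaded into a list of dictionaries (for example with a spreadsheet library), with the keys adjusted to match the real column headers.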

Participation is open to conference attendees only. To participate, please send an email to Kathy Grise and Theresa Cavrak. You may participate individually or as part of a team.

Evaluation and Awards
You are expected to present your findings to a panel of judges. Presentations will be evaluated based on the following criteria:

  • Selection criteria of the dataset
  • Clarity and relevance of analysis
  • Methodology
  • Creative use of data
  • Significance of findings and recommendations
  • Delivery of findings and recommendations


Two awards will be given: one to the winning student or student team, and one to the winning professional or professional team. A judging panel of practitioners will evaluate the presentations, and each winning team (or individual) will be awarded $1,000.

The presentations should be formatted as a PowerPoint document that can be uploaded via IEEE DataPort using the “Submit an Analysis” button below. Please be prepared to discuss your presentation for roughly 8-10 minutes.


  • 15 February 2017 – Final date for Presentation to be uploaded via IEEE DataPort
  • Friday–Saturday, 17-18 February 2017 – Presentations
  • Saturday, 18 February 2017 – Awards


I was running a larger clustering and finally got the image showing the results, but the site was closed.  Oh well.  My ppt has a tiny one in it, but 77x77 is more impressive than 16x16!

Competition Dataset Files

File: HealthPatentData.xlsx (289.51 KB)