
Example datasets for AtyImo (Spark version)

Abstract: 

## FEDERAL UNIVERSITY OF BAHIA (UFBA)
## ATYIMOLAB (www.atyimolab.ufba.br)
## University College London (UCL)
## Denaxas Lab (www.denaxaslab.org)
## Robespierre Pita and Clicia Pinto and Marcos Barreto and Spiros Denaxas
 
/*
@(#)File:           $atyimo_dataset_info.txt$
@(#)Version:        $v1$
@(#)Last changed:   $Date: 2017/12/04 12:00:00 $
@(#)Purpose:        Example data sets for the AtyImo data linkage tool
@(#)Author:         Robespierre Pita and Clicia Pinto and Marcos Barreto and Spiros Denaxas
 
@(#)Usage:
 
@(#)Comments:
 
 (*) These are synthetic data sets; names and dates of birth were generated by a random routine.
 (*) Their structure closely matches that of the real (identifiable) data sets linked by AtyImo.
 
*/
 
## Linkage attributes (present in most Brazilian government databases)
 
code : Unique key for each record
municipality_residence : IBGE (Brazilian Institute of Geography and Statistics) code of the municipality of residence.
This code was drawn at random from the codes of Brazilian municipalities.
name : Person name
mother_name : Mother's name
birth_date : Date of birth (yyyy-mm-dd)
gender : Gender ("1" or "M" = MALE, "3" or "F" = FEMALE)
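
The attribute list above can be turned into a small parser. The following is a minimal sketch in Python, not part of the AtyImo tool. It assumes the files have no header row and that the columns appear in the order listed above; neither is stated in the description, so check a real file first. The sample rows and the names `FIELDS`, `GENDER_MAP` and `parse_records` are made up for illustration.

```python
import csv
import io
from datetime import date, datetime

# ASSUMPTION: column order follows the attribute list; the description
# does not say whether the files carry a header row.
FIELDS = ["code", "municipality_residence", "name",
          "mother_name", "birth_date", "gender"]

# Both documented encodings are mapped onto one label per gender.
GENDER_MAP = {"1": "MALE", "M": "MALE", "3": "FEMALE", "F": "FEMALE"}


def parse_records(text_stream):
    """Yield one dict per semicolon-separated record."""
    reader = csv.DictReader(text_stream, fieldnames=FIELDS, delimiter=";")
    for row in reader:
        yield {
            "code": row["code"],
            "municipality_residence": row["municipality_residence"],
            "name": row["name"].strip(),
            "mother_name": row["mother_name"].strip(),
            # birth_date is documented as yyyy-mm-dd
            "birth_date": datetime.strptime(row["birth_date"], "%Y-%m-%d").date(),
            # Unknown gender codes become None instead of raising
            "gender": GENDER_MAP.get(row["gender"].strip().upper()),
        }


# Made-up rows for illustration only (not taken from the data sets)
sample = io.StringIO(
    "1;2927408;MARIA SILVA;ANA SILVA;1990-05-17;3\n"
    "2;3550308;JOAO SOUZA;RITA SOUZA;1985-11-02;M\n"
)
records = list(parse_records(sample))
```

To read an actual file, pass an open file object instead of the `StringIO` sample, for example `open(path, newline="", encoding="utf-8")`; the encoding is also an assumption.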
 
## Files (semicolon ";" separated values)
##  AtyImo - Spark
 
small/
- DATASET_1_5K_records.csv: 500,000 records
- DATASET_2_1M_records.csv: 1,000,000 records
 
large/
- DATASET_1_5M_records.csv: 5,000,000 records
- DATASET_2_5M_records.csv: 5,000,000 records
 
## Files (Bloom coded)
##  AtyImo - Hybrid (OpenMP, CUDA)
 
bloom_hybrid/
- input_1000.bloom, input_10000.bloom, input_500000.bloom
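
The description does not document the layout of the .bloom files or the encoding parameters used to produce them. For readers unfamiliar with Bloom coding in record linkage, the sketch below shows one common generic scheme: each character bigram of a field sets a few bits in a fixed-size bit array, and two arrays are compared with the Dice coefficient. The filter size, the number of hash functions, the SHA-256 based hashing and all function names are hypothetical choices for illustration, not AtyImo's actual settings or file format.

```python
import hashlib

FILTER_BITS = 128  # hypothetical filter length
NUM_HASHES = 2     # hypothetical number of hash functions per bigram


def bigrams(value):
    """Return the set of character bigrams of a padded, upper-cased string."""
    padded = f"_{value.upper()}_"
    return {padded[i:i + 2] for i in range(len(padded) - 1)}


def bloom_encode(value, size=FILTER_BITS, num_hashes=NUM_HASHES):
    """Encode a string as a list of 0/1 bits (illustrative scheme only)."""
    bits = [0] * size
    for gram in bigrams(value):
        for seed in range(num_hashes):
            # Seeded SHA-256 stands in for a family of hash functions
            digest = hashlib.sha256(f"{seed}:{gram}".encode("utf-8")).digest()
            bits[int.from_bytes(digest[:8], "big") % size] = 1
    return bits


def dice_similarity(a, b):
    """Dice coefficient: 2 * |a AND b| / (|a| + |b|)."""
    common = sum(x & y for x, y in zip(a, b))
    total = sum(a) + sum(b)
    return 2 * common / total if total else 0.0


# Made-up names for illustration only
same = dice_similarity(bloom_encode("MARIA SILVA"), bloom_encode("MARIA SILVA"))
close = dice_similarity(bloom_encode("MARIA SILVA"), bloom_encode("MARIA SILVIA"))
far = dice_similarity(bloom_encode("MARIA SILVA"), bloom_encode("JOAO SOUZA"))
```

Identical strings score 1.0, and strings that share most of their bigrams score higher than unrelated ones, which is what allows approximate matching on encoded fields without comparing the raw names.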
 


Dataset Details

Citation Author(s): Robespierre Pita, Clicia Pinto, Marcos Barreto, Spiros Denaxas
Submitted by: Marcos Barreto
Last updated: Tue, 12/12/2017 - 17:53
DOI: 10.21227/H2K92G

Documentation

Attachment: atyimo_dataset_info.txt (plain text, 1.58 KB)


[1] Robespierre Pita, Clicia Pinto, Marcos Barreto, Spiros Denaxas, "Example datasets for AtyImo (Spark version)", IEEE Dataport, 2017. [Online]. Available: http://dx.doi.org/10.21227/H2K92G. Accessed: Jan. 18, 2018.
@data{h2k92g-17,
doi = {10.21227/H2K92G},
url = {http://dx.doi.org/10.21227/H2K92G},
author = {Robespierre Pita and Clicia Pinto and Marcos Barreto and Spiros Denaxas},
publisher = {IEEE Dataport},
title = {Example datasets for AtyImo (Spark version)},
year = {2017} }