TIPD: Taiwan Indigenous Peoples Open Research Data
- Submission Dates:
- 01/25/2022 to 01/31/2022
- Citation Author(s):
- Submitted by:
- Ji-Ping Lin
- Last updated:
- Thu, 01/27/2022 - 18:19
- DOI:
- 10.21227/4ggg-7c17
- Data Format:
- Links:
- License:
- Creative Commons Attribution
- Categories:
- Keywords:
Abstract
Abstract (for details, see https://osf.io/e4rvz/)
-
Why the Research and Importance: Taiwan Indigenous Peoples (TIPs) are a branch of the Polynesian-Malaysian (or Austronesian) ethnic groups in genetic and linguistic terms. From the early 17th century, TIPs played a crucial role in East Asian trade during the Great Marine Times. A rich body of ethnographic, official, and academic records on TIPs exists for the period before 1940. However, the period 1940-2000 was a data “Dark Age” for TIPs, owing to the 1941-45 Pacific War and the 1946-1990 authoritarian rule, which was driven by fears of communism and communist infiltration. The persistent lack of data left TIPs isolated, marginalized, and thus underdeveloped. Taiwan resumed the TIPs population census in 2000 and has recorded TIPs individual records in the household registration system since 2003. This research program is based on a four-year Joint Research Agreement between Academia Sinica and the Council of Indigenous Peoples that started in 2013. One important aim of the research is to construct large-scale anonymized TIPs open research data (TIPD) based on contemporary census and household registration data sets. TIPD uses state-of-the-art data science, record linkage, geocoding, and high-performance in-memory computing technology to construct various dimensions of TIPs demographics and development. Major outputs of TIPD applications include cross-sectional categorical data, longitudinally constructed population dynamics data, life tables, household statistics, micro genealogy data, intra- and inter-ethnic marriage data, ethnic integration data, ethnic patriarchy and matriarchy identity data, etc. They reflect the progress and efforts of Taiwan academics working to document various developments of contemporary TIPs. The research program is expected not only to unveil contemporary TIPs demographics and developments, but also to help overcome research barriers and unleash social creativity in TIPs studies.
These outputs will help shed light on the contemporary population, human dynamics, and development of TIPs, which have been “invisible” to the world for seven decades.
-
Types of TIPD: The major outputs of TIPD that are open to the public amount to 99,184 files totaling around 204 GB. TIPD is bilingually documented, and its content, context, and volume are growing steadily. TIPD now consists of three categories of open research data: (1) categorical data, (2) household structure and characteristics data, and (3) population dynamics data. Categorical data cover four broad dimensions. The first is contingency tables, which are available in PDF, HTML, RTF, and XLS formats; the second is multi-dimensional data, which are offered in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, and R data formats; the third is open data on urban indigenous peoples; the fourth is open data on indigenous traditional tribes/communities. Household structure and characteristics data cover three broad dimensions of information: (1) household head information, (2) household member composition information, and (3) household geographical information. They are also available in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, and R data formats. Population dynamics data consist of three categories: (1) population increase within a given period, which divides into increase due to birth and increase due to immigration; (2) population decrease within a given period, which divides into decrease due to death and decrease due to emigration; (3) the intact population within a given period, which divides into those who migrate internally and those who stay put. For the intact population who migrate internally, internal migration processes such as in-, out-, net, and gross migration can be analyzed. Every type of population dynamics data is available in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, and R data formats.
Potential applications of TIPD in research on TIPs include studies on birth, death, migration, residential mobility, life tables, marriage, ageing, education, medical care, labor, family, community, etc. TIPD can also serve as background data for survey studies, including population analysis, sampling design, and sampling planning. TIPD also offers additional open data for specific research purposes, including (1) indigenous tribal data in traditional territories, (2) indigenous communities in urbanized areas, and (3) aggregated population centers at various geographic units.
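The population dynamics categories described above (increase, decrease, and internal migration of the intact population) can be combined in a simple accounting sketch. The Python below is a minimal illustration only: the class, field names, and numbers are hypothetical and are not taken from the TIPD files or documentation. It applies the standard demographic balancing identity to one area over one period and derives net and gross internal migration from in- and out-migration counts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PeriodFlows:
    """Hypothetical flow counts for one area over one period.

    Field names are illustrative assumptions, not TIPD variable names.
    """
    births: int         # increase due to birth
    immigrants: int     # increase due to immigration (from abroad)
    deaths: int         # decrease due to death
    emigrants: int      # decrease due to emigration (to abroad)
    in_migrants: int    # intact population moving into the area
    out_migrants: int   # intact population moving out of the area


def net_migration(f: PeriodFlows) -> int:
    """Net internal migration: in-migration minus out-migration."""
    return f.in_migrants - f.out_migrants


def gross_migration(f: PeriodFlows) -> int:
    """Gross internal migration: in-migration plus out-migration."""
    return f.in_migrants + f.out_migrants


def end_population(start: int, f: PeriodFlows) -> int:
    """Balancing identity for one area: start + increases - decreases + net internal migration."""
    return (start + f.births + f.immigrants
            - f.deaths - f.emigrants
            + net_migration(f))


# Made-up numbers, for illustration only.
flows = PeriodFlows(births=120, immigrants=5, deaths=80,
                    emigrants=10, in_migrants=300, out_migrants=450)
print(net_migration(flows), gross_migration(flows), end_population(1000, flows))
```

Net migration shows the direction of population redistribution for an area, while gross migration shows the total volume of movement, so an area with heavy turnover can still have a net value near zero.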
-
Potential contributions: TIPD reflects the progress and efforts of Taiwan academics working to document various developments of the contemporary TIPs population. Potential applications of TIPD in research on TIPs include studies on birth processes, death processes, migration processes, residential mobility, life tables, marriage, ageing, education, medical care, labor, family, community, etc. TIPD can also serve as background data for survey studies, including population analysis, sampling design, and sampling planning. TIPD not only helps unveil contemporary TIPs demographics and developments, but also helps overcome research barriers and unleash creativity in TIPs studies. It will help shed light on the contemporary population, human dynamics, and development of TIPs, which have been “invisible” to the world for seven decades. The potential contributions of TIPD are threefold. First, on the theoretical side, by building on data science TIPD not only overcomes legal and ethical issues, but also democratizes the use of detailed information hidden in modern micro data sets. It is thus expected to promote research and unleash creativity in TIPs studies and to enhance the visibility of TIPs. Second, TIPD demonstrates empirically that value-added data enrichment and open data sharing can be accomplished with less expensive digital infrastructure and an open data repository. Third, in addition to general-purpose research, TIPD enables highly specific research on topics such as population dynamics, family life course, life tables, ethnic relationships, etc. In short, the potential contributions of TIPD are as follows. First, a contribution of moving from “closed” to “open”, in the sense that research on TIPD helps shed light on contemporary Taiwan Indigenous Peoples and human dynamics, which have been “invisible” to the world for seven decades.
Second, a contribution of moving from “the elite” to “the ordinary”, in the sense that the constructed open data sets lower technical barriers for researchers interested in indigenous population studies. Third, a contribution of moving from “local” to “global”, in the sense that the English version of TIPD is open to the international academic community to promote further value-added data enrichment through international collaboration. Fourth, a contribution of moving TIPs research from “macro and static” to “micro and dynamic” data by providing, e.g., micro social network data, genealogy, and population dynamics open data.
-
ReadMe1st.pdf: a ReadMe file that briefly introduces the theoretical foundation, methods, and techniques applied to create the TIPD repository.
-
ReadMeDirectoryStructure.pdf: a ReadMe file that uses a tree structure to show how the TIPD repository is organized.
-
ReadMeFileList.pdf: a ReadMe file that shows the list of files in the TIPD repository.
Competition Dataset Files
Attachment | Size |
---|---|
The zipped file of all archives in TIPD V4.0 (reduced version) | 3.14 GB |
Documentation
Attachment | Size |
---|---|
Brief introduction of TIPD V4.0 | 684.81 KB |
Directory structure of TIPD V4.0 repository | 55.18 KB |
File list of all archives in TIPD V4.0 repository | 711.79 KB |