TIPD: Taiwan Indigenous Peoples Open Research Data
English (for details, see https://osf.io/e4rvz/)
Why the Research and Importance: Taiwan Indigenous Peoples (TIPs) are a branch of the Malayo-Polynesian (or Austronesian) ethnic groups in genetic and linguistic terms. From the early 17th century, TIPs played a crucial role in the great maritime era of East Asian trade. A rich body of ethnographic, official, and academic records on TIPs existed before 1940. However, the period 1940-2000 was a data “Dark Ages” for TIPs, owing to the 1941-45 Pacific War and to the authoritarian rule of 1946-1990, which was driven by fears of communism and communist infiltration. The persistent lack of data left TIPs isolated, marginalized, and thus underdeveloped. Taiwan resumed the TIPs population census in 2000 and has kept individual TIPs records in the household registration system since 2003. This research program is conducted under a four-year Joint Research Agreement between Academia Sinica and the Council of Indigenous Peoples that started in 2013. One important aim of the research is to construct large, anonymized TIPs open research data (TIPD) from contemporary census and household registration data sets. TIPD uses data science, record linkage, geocoding, and high-performance in-memory computing to construct various dimensions of TIPs demographics and developments. Major outputs of TIPD applications include cross-sectional categorical data, longitudinally constructed population dynamics data, life tables, household statistics, micro genealogy data, intra- and inter-ethnic marriage data, ethnic integration data, ethnic patriarchy and matriarchy identity data, etc. These outputs reflect the progress and efforts of Taiwan's academics in reconstructing the various developments of contemporary TIPs. The research program is expected not only to unveil contemporary TIPs demographics and developments, but also to help overcome research barriers and unleash social creativity in TIPs studies. It will help shed light on the contemporary population, human dynamics, and developments of TIPs, which have been “invisible” to the world for seven decades.
Types of TIPD: The major outputs of TIPD that are open to the public amount to 99,184 files totaling around 204 GB. TIPD is documented bilingually, and its content, types, and volume are growing steadily. TIPD now consists of three categories of open research data: (1) categorical data, (2) household structure and characteristics data, and (3) population dynamics data. Categorical data cover four broad dimensions: the first is contingency tables, available in PDF, HTML, RTF, and XLS formats; the second is multi-dimensional data, offered in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, and R data formats; the third is open data on urban indigenous peoples; the fourth is open data on indigenous traditional tribes/communities. Household structure and characteristics data cover three broad dimensions of information: (1) household head information, (2) household member composition information, and (3) household geographical information. They are available in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, and R data formats. Population dynamics data consist of three categories: (1) the population added within a given period, which is further split into increase due to birth and increase due to immigration; (2) the population lost within a given period, which is split into decrease due to death and decrease due to emigration; (3) the intact population within a given period, which is split into those who migrate internally and those who stay put. For the intact population who migrate internally, internal migration processes such as in-, out-, net, and gross migration can be analyzed. Every type of population dynamics data is available in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, and R data formats.
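The population dynamics categories above can be illustrated with a minimal Python sketch. It is not TIPD's own procedure: the function names, the (origin, destination, count) record layout, and all counts are assumptions made up for illustration. It shows the standard demographic balancing equation and how in-, out-, net, and gross migration follow from origin-destination flows of the intact population.

```python
from collections import defaultdict


def end_population(start, births, immigrants, deaths, emigrants):
    """Balancing equation: end = start + (births + immigrants) - (deaths + emigrants)."""
    return start + births + immigrants - deaths - emigrants


def migration_measures(flows):
    """Summarize internal migration per area.

    flows: iterable of (origin, destination, count) tuples for the intact
    population (hypothetical layout, not a TIPD file format).
    """
    in_mig = defaultdict(int)
    out_mig = defaultdict(int)
    stayers = defaultdict(int)
    for origin, dest, count in flows:
        if origin == dest:
            stayers[origin] += count      # stayed put
        else:
            out_mig[origin] += count      # left the origin area
            in_mig[dest] += count         # arrived in the destination area
    areas = sorted(set(in_mig) | set(out_mig) | set(stayers))
    return {
        a: {
            "stay_put": stayers[a],
            "in": in_mig[a],
            "out": out_mig[a],
            "net": in_mig[a] - out_mig[a],
            "gross": in_mig[a] + out_mig[a],
        }
        for a in areas
    }


# Made-up counts for three hypothetical areas A, B, C.
example_flows = [
    ("A", "A", 900),
    ("A", "B", 60),
    ("B", "A", 40),
    ("B", "B", 500),
    ("B", "C", 10),
    ("C", "C", 300),
]
measures = migration_measures(example_flows)
for area, m in measures.items():
    print(area, m)
```

Because every internal move leaves one area and enters another, net migration summed over all areas is zero, which is a quick consistency check on any origin-destination table.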
The potential applications of TIPD for research on TIPs include studies on birth, death, migration, residential mobility, life tables, marriage, ageing, education, medical care, labor, family, community, etc. TIPD can also serve as background data for survey studies, including population analysis, sampling design, and sampling planning. In addition, TIPD offers open data for specific research purposes, including (1) indigenous tribal data in traditional territories, (2) indigenous communities in urbanized areas, and (3) aggregated population centers at various geographic units.
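As an illustration of what an aggregated population center can mean, the following Python sketch computes a population-weighted mean of unit coordinates. This is an assumption about the general technique, not a description of how TIPD derives its population centers; the function name, the planar (x, y) coordinates, and the counts are all hypothetical.

```python
def population_center(units):
    """Population-weighted mean center.

    units: iterable of (x, y, population) tuples, with x and y in a planar
    (projected) coordinate system. Averaging raw longitude/latitude in this
    way is only an approximation over small areas.
    """
    units = list(units)
    total = sum(pop for _, _, pop in units)
    if total <= 0:
        raise ValueError("total population must be positive")
    x_center = sum(x * pop for x, _, pop in units) / total
    y_center = sum(y * pop for _, y, pop in units) / total
    return (x_center, y_center)


# Made-up village coordinates (km) and populations within one township.
villages = [
    (0.0, 0.0, 100),
    (10.0, 0.0, 300),
    (0.0, 20.0, 100),
]
center = population_center(villages)
print(center)
```

The same function can be applied at any geographic level by changing which units are passed in, for example villages within a township or townships within a county.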
Potential contributions: TIPD reflects the progress and efforts of Taiwan's academics in reconstructing the various developments of the contemporary TIPs population. Not only does TIPD help unveil contemporary TIPs demographics and developments, but it also helps overcome research barriers and unleash creativity in TIPs studies. It will help shed light on the contemporary population, human dynamics, and developments of TIPs, which have been “invisible” to the world for seven decades. The potential contributions of TIPD are threefold. First, in theoretical terms, TIPD builds on data science both to overcome legal and ethical issues and to democratize the use of detailed information hidden in modern micro data sets. It is thus expected to promote research and unleash creativity in TIPs studies and to enhance the visibility of TIPs. Second, TIPD demonstrates empirically that value-added data enrichment and open data sharing can be accomplished with less expensive digital infrastructure and an open data repository. Third, in addition to general-purpose research, TIPD enables very specific research on topics such as population dynamics, family life course, life tables, and ethnic relationships. In short, these contributions can be summarized as four shifts. First, from “closed” to “open”: research on TIPD helps shed light on contemporary Taiwan Indigenous Peoples and their human dynamics, which have been “invisible” to the world for seven decades. Second, from “the elite” to “the ordinary”: the constructed open data sets lower technical barriers for researchers interested in indigenous population studies. Third, from “local” to “global”: the English version of TIPD is open to the international academic community, to promote further value-added data enrichment through international collaboration. Fourth, from “macro and static” to “micro and dynamic”: TIPD provides, for example, micro social network data, genealogy data, and population dynamics open data.
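Traditional Chinese version, translated into English (for details, see https://osf.io/e4rvz/)
Types and content of data: TIPD currently comprises 99,184 open research data files totaling about 204 GB, documented in Chinese and English; the content, types, and scale of the data will continue to grow. The main types of open research data at present fall into three categories: (1) categorical (or discrete) data, (2) household structure data, and (3) population dynamics data. Categorical data are further divided into four groups: (1) contingency tables (or cross-tabulations), in four basic formats: PDF, HTML, RTF, and XLS; (2) multi-dimensional tables, in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, R, and other formats; (3) open data on urban indigenous peoples; (4) open data on traditional tribes/communities. Household structure data cover three basic dimensions: (1) basic information on the household head, (2) household member composition, and (3) household environment variables. They are also provided in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, R, and other formats. Population dynamics data fall into three categories: (1) the “added population” between two points in time, separable into “birth” and “international immigration” components; (2) the “lost population” between two points in time, separable into “death” and “international emigration” components; (3) the “surviving population” between two points in time, separable into those who made no internal migration (stay-put) and those who migrated internally, with internal migration further analyzable in terms of in-migration and out-migration. Population dynamics data are also presented as multi-dimensional tables, in CSV, Excel, dBase, Matlab, Gauss, HTML, JMP, SAS, SPSS, Stata, Access, R, and other formats. The TIPD open research database can serve as primary or background data for qualitative or quantitative research. The main potential application areas include birth, death, migration, household mobility, life tables, marriage, ageing, education, medical care, caregiving, labor, family, and community; TIPD can also serve as a reference for population analysis, sampling design, and sampling planning. TIPD also provides open research data for specific purposes, including (1) open research data on urban indigenous peoples, (2) basic data on traditional tribes and communities, and (3) population center information at various levels (region, county/city, township/district, village, metropolitan area, and tribe).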
Both files use a tree structure to show the structure of the TIPD repository and the files under a given directory in the repository.