Treatment Effect Estimation Benchmarks

- Submitted by: Damian Machlanski
- DOI: 10.21227/0v4q-nn37
Abstract
This bundle contains four well-known, established causal inference benchmark datasets for evaluating the performance of causal (treatment) effect estimation methods: IHDP, Jobs, Twins and News. All four datasets are already publicly available; this bundle merely collects them in a single location for ease of replication.
IHDP is based on the Infant Health and Development Program (IHDP) clinical trial. Goal: predict the effect of receiving specialised childcare on the cognitive test scores of infants. Introduced by [1].
Jobs combines data from the National Supported Work (NSW) Demonstration and the Panel Study of Income Dynamics (PSID). Goal: predict the effect of job training on employment status. Introduced by [4].
Twins consists of twin births in the US between 1989 and 1991. Goal: predict the effect of higher body mass on mortality. This specific pre-processed version of the data comes from [3].
News is a collection of news articles represented as bags of words. Goal: predict the effect of the device type used to read an article on the user experience. Introduced by [2].
See respective references for more details about the datasets.
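As a rough illustration of what evaluating a treatment effect estimator involves, the sketch below computes two metrics commonly reported on benchmarks with simulated outcomes such as IHDP: the square root of the Precision in Estimation of Heterogeneous Effect (PEHE) and the absolute error of the average treatment effect (ATE). It does not load any of the bundled datasets. The synthetic data generator, the simple two-model estimator (one regression line per treatment arm) and all function and variable names are assumptions made for this example only.

```python
import math
import random


def simulate(n, seed=0):
    """Generate toy data in which both potential outcomes are known (hypothetical)."""
    rng = random.Random(seed)
    data = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)                    # single covariate
        t = 1 if rng.random() < 0.5 else 0         # randomised treatment
        y0 = 1.0 + 2.0 * x + rng.gauss(0.0, 0.1)   # outcome without treatment
        y1 = y0 + 1.0 + 0.5 * x                    # outcome with treatment
        data.append((x, t, y0, y1))
    return data


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b


data = simulate(2000)

# Only the factual outcome (matching the received treatment) is used for fitting.
treated = [(x, y1) for x, t, y0, y1 in data if t == 1]
control = [(x, y0) for x, t, y0, y1 in data if t == 0]
a1, b1 = fit_line(*zip(*treated))
a0, b0 = fit_line(*zip(*control))

# Estimated and true individual effects for every unit.
cate_hat = [(a1 + b1 * x) - (a0 + b0 * x) for x, _, _, _ in data]
cate_true = [y1 - y0 for _, _, y0, y1 in data]

n = len(data)
sqrt_pehe = math.sqrt(sum((h - c) ** 2 for h, c in zip(cate_hat, cate_true)) / n)
ate_error = abs(sum(cate_hat) / n - sum(cate_true) / n)

print(f"sqrt(PEHE) = {sqrt_pehe:.4f}, ATE error = {ate_error:.4f}")
```

Both metrics require the true effect for every unit, which is only available when the counterfactual outcome is simulated or otherwise known. Datasets without known counterfactuals are typically scored with other metrics, as described in the respective references.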
References
[1] J. L. Hill, ‘Bayesian Nonparametric Modeling for Causal Inference’, Journal of Computational and Graphical Statistics, vol. 20, no. 1, pp. 217–240, Jan. 2011, doi: 10.1198/jcgs.2010.08162.
[2] F. D. Johansson, U. Shalit, and D. Sontag, ‘Learning representations for counterfactual inference’, in Proceedings of the 33rd International Conference on Machine Learning - Volume 48, in ICML’16. New York, NY, USA: JMLR.org, Jun. 2016, pp. 3020–3029.
[3] C. Louizos, U. Shalit, J. M. Mooij, D. Sontag, R. Zemel, and M. Welling, ‘Causal Effect Inference with Deep Latent-Variable Models’, Advances in Neural Information Processing Systems, vol. 30, 2017, Accessed: May 25, 2021. [Online]. Available: https://proceedings.neurips.cc/paper/2017/hash/94b5bde6de888ddf9cde6748ad2523d1-Abstract.html
[4] J. A. Smith and P. E. Todd, ‘Does matching overcome LaLonde’s critique of nonexperimental estimators?’, Journal of Econometrics, vol. 125, no. 1–2, pp. 305–353, 2005.
Instructions:
Please visit the following GitHub repository for further instructions and examples of how to load and use the datasets in Python.
https://github.com/misoc-mml/cate-benchmark