Is Predicted Data a Viable Alternative to Real Data?

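It is costly to collect the household- and individual-level data that underlies official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data. An approach referred to as double sampling collects the expensive outcome variable for a sub-sample only while collecting the covariates used for prediction for the full sample. The objective of this study is to determine if this would indeed allow for realizing meaningful reductions in financial costs while preserving statistical precision. The study does this using analytical calculations that allow for considering a wide range of parameter values that are plausible to real applications. The benefits of using double sampling are found to be modest. There are circumstances for which the gains can be more substantial, but the study conjectures that these denote the exceptions rather than the rule. The recommendation is to rely on real data whenever there is a need for new data, and use the prediction estimator to leverage existing data.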

Bibliographic Details
Main Authors: Fujii, Tomoki, van der Weide, Roy
Format: Working Paper
Language: English
en_US
Published: World Bank, Washington, DC 2016
Subjects: double sampling; survey costs; poverty; prediction model
Online Access: http://documents.worldbank.org/curated/en/2016/09/26822026/predicted-data-viable-alternative-real-data
http://hdl.handle.net/10986/25156
id okr-10986-25156
recordtype oai_dc
spelling okr-10986-25156
2021-12-10T18:04:12Z
Is Predicted Data a Viable Alternative to Real Data?
Fujii, Tomoki
van der Weide, Roy
double sampling
survey costs
poverty
prediction model
It is costly to collect the household- and individual-level data that underlies official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data. An approach referred to as double sampling collects the expensive outcome variable for a sub-sample only while collecting the covariates used for prediction for the full sample. The objective of this study is to determine if this would indeed allow for realizing meaningful reductions in financial costs while preserving statistical precision. The study does this using analytical calculations that allow for considering a wide range of parameter values that are plausible to real applications. The benefits of using double sampling are found to be modest. There are circumstances for which the gains can be more substantial, but the study conjectures that these denote the exceptions rather than the rule. The recommendation is to rely on real data whenever there is a need for new data, and use the prediction estimator to leverage existing data.
2016-10-13T20:46:52Z
2016-10-13T20:46:52Z
2016-09
Working Paper
http://documents.worldbank.org/curated/en/2016/09/26822026/predicted-data-viable-alternative-real-data
http://hdl.handle.net/10986/25156
English
en_US
Policy Research Working Paper;No. 7841
CC BY 3.0 IGO
http://creativecommons.org/licenses/by/3.0/igo/
World Bank
World Bank, Washington, DC
Publications & Research
Publications & Research :: Policy Research Working Paper
repository_type Digital Repository
institution_category Foreign Institution
institution Digital Repositories
building World Bank Open Knowledge Repository
collection World Bank
language English
en_US
topic double sampling
survey costs
poverty
prediction model
relation Policy Research Working Paper;No. 7841
description It is costly to collect the household- and individual-level data that underlies official estimates of poverty and health. For this reason, developing countries often do not have the budget to update their estimates of poverty and health regularly, even though these estimates are most needed there. One way to reduce the financial burden is to substitute some of the real data with predicted data. An approach referred to as double sampling collects the expensive outcome variable for a sub-sample only while collecting the covariates used for prediction for the full sample. The objective of this study is to determine if this would indeed allow for realizing meaningful reductions in financial costs while preserving statistical precision. The study does this using analytical calculations that allow for considering a wide range of parameter values that are plausible to real applications. The benefits of using double sampling are found to be modest. There are circumstances for which the gains can be more substantial, but the study conjectures that these denote the exceptions rather than the rule. The recommendation is to rely on real data whenever there is a need for new data, and use the prediction estimator to leverage existing data.
format Working Paper
author Fujii, Tomoki
van der Weide, Roy
author_facet Fujii, Tomoki
van der Weide, Roy
author_sort Fujii, Tomoki
title Is Predicted Data a Viable Alternative to Real Data?
title_sort is predicted data a viable alternative to real data?
publisher World Bank, Washington, DC
publishDate 2016
url http://documents.worldbank.org/curated/en/2016/09/26822026/predicted-data-viable-alternative-real-data
http://hdl.handle.net/10986/25156
_version_ 1764458693305303040