Data cooking in government surveys
There is a dearth of institutions in Nepal capable of carrying out large-scale research.
Achyut Wagle
The first edition of the Nepal Living Standards Survey (NLSS-I) 1995-96 aimed to create “unique opportunities to assess the poverty situation in the country and carry out many other research works by providing a large database at a single reference point.” Simultaneously, the survey operation aimed to contribute to the capacity building of the then Central Bureau of Statistics (CBS) in conducting sample surveys. In a sense, it was an experiment and a training opportunity created for the next generation of sample survey researchers. It therefore came with obvious caveats about the robustness of its outcomes and the usability of its inferences.
Last month, the National Statistical Office (NSO), the rebaptised CBS, published a report of the fourth edition of the Survey (NLSS-IV) 2023-24. The original objective of the NLSS to “assess the poverty situation” does not seem to have changed. However, in nearly three decades of exercises that published three editions—1996, 2004 and 2011—this government entity seems to have barely learned to conduct such sample-based surveys, improve its inferential credibility, and maintain research integrity while working with a disproportionately small sample of a large population.
Unrepresentative sampling
No doubt, sample surveys, as against population surveys, are risky and can provide only indicative inference even when they are implemented with the utmost professionalism and precision. Otherwise, they suffer from self-selection bias and inferential misrepresentation due to underrepresentation of data variation. Besides, creating a mini sample from an already available population dataset to inquire into a topic that the population survey itself could address better is a redundant exercise and a misuse of resources.
First, the NLSS rituals on top of the National Census do not add any value to better understanding the country's socioeconomic realities. Second, the authenticity of small-sized sample surveys like the NLSS can be established if, and only if, their statistical outcomes are consistent with those of comprehensive surveys like censuses. The NLSS exercises have failed on both counts.
For the NLSS-IV, data was collected from a sample of 9,600 out of 6.761 million households, or 0.14 percent. When this is disaggregated into the 15 selected geographical domains, the average number of households surveyed in each domain is only 640. This is certainly too thin a representation from which to infer anything statistically meaningful. For example, the NLSS-IV claims the proportion of households in Nepal receiving remittance income has reached 76.8 percent. It also presents this as a dramatic increase from 23.4 percent in 1996 and 56.8 percent in 2011. However, this is not only inconsistent with the National Census 2021 data but also fails to reflect the ground reality.
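The sampling arithmetic quoted above can be checked with a minimal sketch. It uses only the figures cited in this column; the function and variable names are illustrative and do not come from the NLSS report.

```python
# Figures as quoted in the column; names are illustrative assumptions.
SAMPLE_HOUSEHOLDS = 9_600
TOTAL_HOUSEHOLDS = 6_761_000  # 6.761 million households
DOMAINS = 15


def sampling_fraction_pct(sample, population):
    """Share of the population covered by the sample, in percent."""
    return 100 * sample / population


def average_per_domain(sample, domains):
    """Average sample size if households were spread evenly across domains."""
    return sample / domains


fraction = sampling_fraction_pct(SAMPLE_HOUSEHOLDS, TOTAL_HOUSEHOLDS)
per_domain = average_per_domain(SAMPLE_HOUSEHOLDS, DOMAINS)
print(f"Sampling fraction: {fraction:.2f} percent")          # 0.14 percent
print(f"Average households per domain: {per_domain:.0f}")    # 640
```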
Inferential flaws
The Census 2021 states that the total number of individuals absent from home, possibly in a foreign country, is only 2.169 million. It is next to impossible for 5.192 million households (76.8 percent of the 6.761 million households recognised by the Census) to receive remittances sent by only 2.169 million migrating individuals, including seasonal absentees. Even if each of these individuals came from a different household and remitted earnings back, only 32 percent of households would be likely to receive any kind of remittance. The direct positive correlation between the outmigration of individuals and remittance income can in no way be undermined or ruled out.
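The comparison above reduces to two small calculations, sketched here under the column's own simplifying assumption that every absentee comes from a different household and sends money home. The names are hypothetical; the inputs are the figures quoted in the text.

```python
# Inputs in millions, as quoted in the column; names are illustrative.
TOTAL_HOUSEHOLDS_M = 6.761     # households recognised by Census 2021
NLSS_REMITTANCE_SHARE = 0.768  # NLSS-IV share of households receiving remittances
ABSENTEES_M = 2.169            # individuals absent from home (Census 2021)


def implied_recipient_households(total_households, share):
    """Households receiving remittances if the survey share held nationally."""
    return total_households * share


def max_share_one_absentee_each(absentees, total_households):
    """Upper bound on the recipient share, assuming one remitting
    absentee per household (the column's simplification)."""
    return min(1.0, absentees / total_households)


implied = implied_recipient_households(TOTAL_HOUSEHOLDS_M, NLSS_REMITTANCE_SHARE)
bound = max_share_one_absentee_each(ABSENTEES_M, TOTAL_HOUSEHOLDS_M)
print(f"Implied recipient households: {implied:.3f} million")  # 5.192 million
print(f"Upper bound from absentees: {100 * bound:.1f} percent")  # 32.1 percent
```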
There were similar inconsistencies in NLSS-2011 and Census-2011. The Census stated that the population away from the country then was only 1.921 million, while the NLSS-III claimed that nearly 57 percent of the 5.7 million households benefitted from the remittance receipts. As such, which inferences and conclusions should people believe in and use—the Census or the NLSS, both published by the same agency? There seems to be a great deal of confusion even in defining what essentially constitutes the 'remittance' for the purpose of this report.
Even in determining the poverty line and poverty-related indicators, the sole stated objective of the NLSS exercise, both the data and inferences are messy, to say the least. The survey has adopted a new poverty line using 2022-23 as the base year. The updated line marginally increases the minimum income bar. A Nepali citizen is now classified as poor if her per capita total consumption expenditure is less than NPR 72,908 a year, or about NPR 200 (USD 1.5 at 1 USD = NPR 133) per day. The report, at one point, claims that Nepal has seen a significant reduction in poverty headcount over the last 12 years, from 25.16 percent in 2010-11 to 3.57 percent (vis-a-vis the 2011 inflation-adjusted poverty line) in 2022-23. In the next paragraph, it says, “20.27 percent of the population in Nepal lives below the new poverty line”.
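The per-day figure follows from the annual line by simple division. A minimal sketch, assuming a 365-day year and the exchange rate quoted above (the names are illustrative):

```python
# Values as quoted in the column; a 365-day year is an assumption.
ANNUAL_LINE_NPR = 72_908
NPR_PER_USD = 133
DAYS_PER_YEAR = 365


def daily_line_npr(annual_npr, days=DAYS_PER_YEAR):
    """Annual poverty line expressed per day, in rupees."""
    return annual_npr / days


def daily_line_usd(annual_npr, npr_per_usd, days=DAYS_PER_YEAR):
    """Annual poverty line expressed per day, in US dollars."""
    return daily_line_npr(annual_npr, days) / npr_per_usd


print(f"NPR per day: {daily_line_npr(ANNUAL_LINE_NPR):.0f}")               # 200
print(f"USD per day: {daily_line_usd(ANNUAL_LINE_NPR, NPR_PER_USD):.2f}")  # 1.50
```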
The definitional relation in practice between the 'headcount' and the 'poverty line' is that the 'national poverty headcount ratio is the percentage of the population living below the national poverty line.' How, then, can the claim that poverty declined to 3.57 percent stand alongside the claim that 20.27 percent of the population still languishes below the poverty line, on less than USD 1.5 a day? Regardless of these data doctoring or data cooking machinations, the critical question is: did the poverty rate actually decline or not?
Extrapolations and excuses
These are only a few examples of research by a government agency that is dependent on a small set of data while resources are disproportionately used in clever tabulation and preparation of politically convenient reports to show progress in critical topics like poverty reduction. No research finding should be only a circus of data overexploitation and far-fetched extrapolation like this.
Whenever discrepancies in the research outcomes of government agencies are called out, the instant response is to point fingers at methodological differences or biased perceptions. The “we are the government” mindset also runs deep, treating official findings as sacrosanct and beyond question.
Nepal is pathetically handicapped in data-based research. There is a clear dearth of institutions in the country, academic or otherwise, that have the technical capacity and financial resources to carry out large-scale, nationwide research to counter the spuriousness of many such government surveys and research outcomes. International development partners such as the World Bank, which has supported the NLSS endeavours since the beginning in 1995, have hardly paid any attention to the fact that autonomous research institutions free of government influence can contribute better to national policymaking. A plethora of survey-based research norms and practices, specifically in government-owned agencies, must change now.