A survey is conducted to produce hundreds of estimates on different aspects of the study population. Whether the main results of two large-scale sample surveys on a given subject are comparable depends mainly on (i) the concepts, definitions and reference periods adopted for the surveys; (ii) how the sample of households is drawn; and (iii) how closely the set procedures are followed in fieldwork.
As for factor (i), there is virtually no difference between the last employment and unemployment survey, conducted in 2011-12, and the PLFS. Factor (ii) can also be disregarded for the surveys in question. For its household surveys, the NSSO has used basically the same sampling design over the years, with some fine-tuning every year aimed at improving the accuracy of important estimates. In the PLFS, the main departure from the usual practice was the repeated visits to urban households in the sample, with no basic change in the sampling procedure. These refinements are not known to have brought about any significant change in the accuracy of the estimates, and thus do not make the results of the two surveys “not comparable”. The main outcome of a labour force survey is undeniably the estimates of employment and unemployment rates. Whether these estimates are comparable depends mainly on the bias caused by factor (iii), particularly in field operations. The issues raised in the article about existing field conditions are indeed most pertinent, but the answers provided can, unfortunately, at best be described as presumptuous.
On reliability and sample size
Here the author falls into the error, common among non-statisticians, of believing that the accuracy of the estimates is determined by the sampling fraction, that is, the ratio of sample size to population size, such as the “3 out of 1,000” cited in the article. Though counterintuitive, the theoretically established fact is that accuracy depends on the sample size, with the sampling fraction playing virtually no role as long as it is small, say, under 1 per cent. This implies that a sample of 55,000 households drawn from a population of 200 million households produces results practically as reliable as a sample of the same size drawn from a population of only 2 million households. Like the other household surveys of the NSSO, the PLFS is designed to provide reliable estimates at the state level. The minimum sample size worked out for estimating a ratio reliably applies to all the states, whether as large as Uttar Pradesh or as small as Goa. The sample size on which the national-level estimates of the PLFS are based is, in fact, much larger than the minimum required to produce reliable estimates of the unemployment rate.
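The point about sample size can be checked with a few lines of arithmetic. What follows is only a sketch under simplifying assumptions: it treats the sample as a simple random sample drawn without replacement (the PLFS itself uses a more elaborate stratified design), and the 6 per cent rate is an illustrative figure, not a PLFS estimate. The function name `se_proportion` is ours.

```python
import math

def se_proportion(p, n, N):
    """Standard error of a sample proportion under simple random
    sampling without replacement, with finite population correction."""
    fpc = (N - n) / (N - 1)  # close to 1 whenever n is small relative to N
    return math.sqrt(p * (1 - p) / n * fpc)

p = 0.06      # illustrative rate (assumption, not a survey result)
n = 55_000    # sample size quoted in the text

# Same sample size, populations differing by a factor of 100
se_large = se_proportion(p, n, 200_000_000)
se_small = se_proportion(p, n, 2_000_000)

# Same sampling fraction as the large case, applied to the small population
n_same_fraction = round(2_000_000 * n / 200_000_000)  # 550 households
se_same_fraction = se_proportion(p, n_same_fraction, 2_000_000)

print(f"n = {n}, N = 200 million: SE = {se_large:.5f}")
print(f"n = {n}, N = 2 million:   SE = {se_small:.5f}")
print(f"n = {n_same_fraction}, N = 2 million:     SE = {se_same_fraction:.5f}")
```

Under these assumptions the first two standard errors both come out at roughly 0.1 percentage point and differ from each other by under 2 per cent, while holding the sampling fraction fixed instead of the sample size inflates the standard error about tenfold.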
Further, the author, noticing the higher-than-proportionate allocation made in the sample for households with members educated above a certain level, jumps to the conclusion that the data reported by these households will have a disproportionately large effect on the estimates. This is a common misapprehension among those unfamiliar with survey sampling methods. It does not happen, because when some population groups are over-represented in the sample, correspondingly low blow-up factors (multipliers) are applied in the estimation formula to households from these groups.
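How blow-up factors undo over-representation can be illustrated with a toy calculation. The two strata, their sizes and the counts below are entirely hypothetical, and the multiplier used (stratum population divided by stratum sample size) is the textbook form for stratified sampling, not the actual PLFS estimation formula.

```python
# Hypothetical strata: population size N, sample size n, and the number of
# sampled units showing the characteristic of interest ("hits").
strata = {
    "higher_educated": {"N": 200_000, "n": 600, "hits": 60},  # over-sampled
    "others":          {"N": 800_000, "n": 400, "hits": 20},  # under-sampled
}

def unweighted_rate(strata):
    """Naive rate that ignores the unequal sampling rates."""
    hits = sum(s["hits"] for s in strata.values())
    size = sum(s["n"] for s in strata.values())
    return hits / size

def weighted_rate(strata):
    """Rate using blow-up factors N/n, so each stratum counts in
    proportion to its share of the population, not of the sample."""
    num = sum(s["hits"] * s["N"] / s["n"] for s in strata.values())
    den = sum(s["n"] * s["N"] / s["n"] for s in strata.values())
    return num / den

naive = unweighted_rate(strata)
adjusted = weighted_rate(strata)
print(f"unweighted: {naive:.3f}, weighted: {adjusted:.3f}")
```

In this made-up example the over-sampled group has the higher rate (10 per cent against 5 per cent), so the unweighted figure is pulled up to 8 per cent, while the weighted figure returns 6 per cent, which is what the two strata imply for the population as a whole.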
The issue of survey estimates of the population being lower than the known population figures has been examined by experts in the past and is well known to NSSO data users. It is one of the reasons why the NSSO provides only estimates of rates and ratios, which are known to be free from this problem. In fact, experts at the Indian Statistical Institute, Kolkata, have already developed a procedure to calibrate the survey estimates with the population data that will address this problem.
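Why rates and ratios escape this problem can be seen in a small sketch. It assumes the shortfall affects all households by a common factor, and it uses the simplest possible ratio adjustment to a known total. This is not the ISI procedure, whose details are not described here; the weights, flags and population figure are invented for illustration.

```python
def weighted_rate(weights, flags):
    """Ratio estimate: weighted count with the characteristic over weighted total."""
    return sum(w * f for w, f in zip(weights, flags)) / sum(weights)

def calibrate_to_total(weights, known_total):
    """Scale every weight by one common factor so the weights add up to
    the known population total (simple ratio adjustment)."""
    factor = known_total / sum(weights)
    return [w * factor for w in weights]

# Hypothetical data: survey weights add up to 5,400 against a known 6,000
weights = [900.0, 1800.0, 900.0, 1800.0]
unemployed = [1, 0, 0, 0]
known_population = 6000.0

calibrated = calibrate_to_total(weights, known_population)
rate_before = weighted_rate(weights, unemployed)
rate_after = weighted_rate(calibrated, unemployed)

print(f"estimated total before: {sum(weights):.0f}, after: {sum(calibrated):.0f}")
print(f"rate before: {rate_before:.4f}, after: {rate_after:.4f}")
```

The common factor appears in both the numerator and the denominator of the rate and cancels out, so the rate is unchanged while the estimated total moves to the known figure.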
Having said this, the need for increasing sample sizes to produce estimates at lower levels of aggregation, such as regions, districts and population groups, cannot be overemphasised. The manpower resources of the entire national statistical system, including the NSSO, have remained the same since the eighties. These need to be augmented urgently, commensurate with the increase in population and the expansion of economic activities, and the reliance on temporary investigators minimised. While technology can address the question of data recording, transmission and processing, the actual survey data collection remains a task best performed through personal interviews with the respondents.
(Next: Autonomy of statistical agencies)
Mohanan was a member of the National Statistical Commission and resigned his position recently. Kar is a survey statistician and member of the Standing Committee for Labour Force surveys that guided the Periodic Labour Force Survey. He is currently associated with the ISI Kolkata.