India’s Data Landscape: Broken Comparability, Selective Publication Persists
As the year comes to a close, we explain how deeper problems persist, including a delayed census weakening survey estimates

Mumbai: By the end of last year, 16 major Indian datasets were delayed across sectors such as health, education, economy, environment, and demography, as IndiaSpend reported in 2024. Ten of these have since been released.
The datasets that were released this year include:
Consumer Expenditure Survey (CES), listed as pending in 2023 due to quality and sampling issues, now shows a 2023-24 release;
Sample Registration System outputs including the Statistical Report, Cause of Death, and Maternal Mortality bulletins;
Road Accidents in India (2023);
Crime in India (2023);
EnviStats (2025);
India State of Forest Report (2023);
Prison Statistics (2023).
While the release of key datasets marks progress, experts caution that deeper problems persist: methods that break comparability with past data, a delayed census that weakens survey estimates, and selective release patterns that erode trust.
IndiaSpend spoke to four experts—people who have worked to formulate, use and analyse public data—to understand the current state of data quality, periodicity and transparency, and what that means for the larger landscape of evidence-driven policymaking in India.
For one, India’s methodology for collecting data is outdated and needs to be updated to incorporate the latest developments. As a result, it has become difficult to build a clean, continuous picture of the economy and society over time, says Arun Kumar, who taught economics at Jawaharlal Nehru University and is the author, most recently, of ‘Indian Economy’s Greatest Crisis: Impact of the Coronavirus and the Road Ahead’. “The way the statistics are presented officially ends up telling its own partial story about policy making.”
“The problem is not that methods have changed—that is inevitable—but that they keep changing in ways that break the ruler we use to measure trends, while the choices about what to publish look increasingly selective,” says Parakala Prabhakar, political economist and formerly a cabinet-ranked communications advisor in Andhra Pradesh.
Fragmented and non‑comparable data
The Household Consumer Expenditure Survey (HCES), carried out every five years by the National Sample Survey Office, tracks household consumption, data that is used to estimate poverty and guide welfare policy. The 2022-23 round was the first since 2011-12, after the 2017-18 survey was withheld. Experts say that while the release of the latest HCES restores a critical data series after a long gap, changes in its methodology have made it difficult to track consumption and poverty trends over time.
The recent surveys expanded item lists, used three separate questionnaires for different types of goods, and shifted from a single visit to multiple visits for data collection, which is now done through computer-assisted interviews. Additionally, the 2022-23 survey reports data at current prices, while previous surveys are in constant prices.
Former acting chairperson of the National Statistical Commission P.C. Mohanan notes that these changes were long overdue and aimed at improving accuracy: “The methodological changes were recommended by many experts in the past,” he says, explaining that multiple household visits reduce recall errors and respondent fatigue, but adds that “unfortunately it makes the data non-comparable with the previous surveys.”
Dipa Sinha, development economist based in Delhi, argues that the core problem is the absence of bridge surveys that could have preserved continuity, noting that while “definitions, questionnaires and sampling strategies have all changed,” there were no parallel surveys to link the old and new methods, leaving analysts able to report current consumption levels but unable to clearly assess how living standards have changed over time.
Mohanan gives the example of poverty estimates: analysts have applied the Tendulkar and Rangarajan poverty lines, which were set at a lower level of expenditure, to the new consumption data. “So if you use this new data on the old poverty line, naturally your poverty estimates will go down drastically,” he explains.
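Mohanan's point about poverty lines can be illustrated with a minimal sketch. Every number below is hypothetical and chosen only for illustration. The 25% "capture factor" is an assumption standing in for a redesigned survey that records more spending for the same households; the article itself says only that the older poverty lines sat at a lower level of expenditure.

```python
def headcount_ratio(expenditures, poverty_line):
    """Share of households whose recorded spending falls below the poverty line."""
    poor = sum(1 for spend in expenditures if spend < poverty_line)
    return poor / len(expenditures)

# Hypothetical monthly per-capita spending (rupees) as an older-style survey might record it.
old_method = [800, 950, 1100, 1300, 1600, 2000, 2500, 3200, 4000, 5500]

# Assumption: the redesigned survey records 25% more spending for the same households.
CAPTURE_FACTOR = 1.25
new_method = [spend * CAPTURE_FACTOR for spend in old_method]

# Hypothetical poverty line calibrated to the older survey method.
POVERTY_LINE = 1200

hc_old = headcount_ratio(old_method, POVERTY_LINE)
hc_new = headcount_ratio(new_method, POVERTY_LINE)

print(f"Headcount ratio, old method: {hc_old:.0%}")
print(f"Headcount ratio, new method: {hc_new:.0%}")
```

In this toy example the households are identical in both runs, yet the measured poverty rate falls from 30% to 20% purely because the measuring instrument changed while the line did not.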
Census delays impact statistical architecture
The Census of India is a decennial exercise that provides village-level data on population size, age structure, sex composition, literacy and migration, forming the backbone of India’s demographic statistics. The census was last conducted in 2011; the houselisting exercise for the next one is scheduled for April-September 2026 and population enumeration for February 2027.
The exercise—India’s first fully digital census—will include caste enumeration, deploy around 3 million field functionaries, use mobile apps and real-time monitoring systems, and aims to release more granular, machine-readable data for policymaking, even as the long gap in core demographic data is set to extend until the results are released.
Kumar says the delay in the census, along with the changing of the base year of GDP and CPI, means “all large surveys now rest on a sampling frame built on 2011 demographics,” even as age structures and rural-urban patterns have changed, making “every downstream estimate (consumption, employment, poverty) less comparable over time because it is anchored to an increasingly outdated picture of the population and its characteristics.”
Sinha explains that the Census tells you “not only how many people there are, but how they are distributed across rural and urban areas and across districts”. Using old data “has direct consequences for welfare schemes, poverty lines and per‑capita allocations, which all rely on those census‑based shares to decide how many people in each place are counted as poor and how much money or grain each state ought to get.”
Mohanan highlights the practical consequences for policymaking, pointing out that “unfortunately, without the latest census data, we will be revising our national accounts” using projected population figures, leaving the government with “no option but to continue with assumptions on population size for at least another two or three years.”
“In a period of rapid change (large migration flows, urbanisation, major shocks like demonetisation and Covid), that means your sample weights, your rural–urban split, and your state shares are all increasingly misaligned with reality,” Prabhakar says. “Any estimate you turn into ‘number of people’ or ‘share of the population’ inherits that distortion.”
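The effect of outdated population shares on a survey estimate can be sketched in a few lines. This is a simplified illustration rather than the method used by any statistical agency: the rural and urban shares and the spending averages below are hypothetical, and actual survey weighting is considerably more elaborate.

```python
def weighted_mean(stratum_means, shares):
    """Combine per-stratum averages into a national figure using population shares."""
    total_share = sum(shares.values())
    return sum(stratum_means[s] * shares[s] for s in stratum_means) / total_share

# Hypothetical average monthly spending (rupees) measured by a survey in each stratum.
stratum_means = {"rural": 2000.0, "urban": 4000.0}

# Hypothetical population shares in percent: an old census frame versus the
# (unknown) current distribution after years of urbanisation.
old_frame_shares = {"rural": 70, "urban": 30}
current_shares = {"rural": 60, "urban": 40}

estimate_old_frame = weighted_mean(stratum_means, old_frame_shares)
estimate_current = weighted_mean(stratum_means, current_shares)

print(f"National average using old shares:     {estimate_old_frame:.0f}")
print(f"National average using current shares: {estimate_current:.0f}")
```

Here the survey responses are the same in both cases, but the national average moves from 2,600 to 2,800 rupees once the shares are updated. The same mechanism applies to any figure expressed as a number of people or a share of the population.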
Selective releases, shrinking transparency
“Over the last decade, what has changed most is not just the numbers, but who decides what gets seen and when,” says Mohanan.
Experts point to several instances: the non-release of the 2017-18 Consumer Expenditure Survey, the delay of jobs data showing a 45-year-high unemployment rate until after the 2019 election, and concerns raised by the International Monetary Fund over India's gross domestic product (GDP) and gross value added (GVA) estimates because of the outdated base year of 2011-12.
Across cases, the stated justifications—ranging from unverified data and methodological transitions to limitations in state reporting—have repeatedly resulted in the absence of uncomfortable or politically sensitive statistics from the public domain.
In 2025, the Union government has repeatedly told Parliament that key information sought by MPs is “not maintained,” “not compiled,” or “not available centrally,” effectively declining to provide data on issues ranging from RTI denials and competitive exam paper leaks to suicides of medical interns.
Experts argue that India’s data system has increasingly shifted towards selective release and tighter political control, eroding transparency in different but reinforcing ways. Kumar points to a clear pattern of withholding inconvenient data, noting that “when the 2017-18 consumption survey pointed to declining consumption, the results were not released, and when an unemployment survey showed a 45-year high jobless rate just before the 2019 election, its publication was delayed until after the polls,” a practice that, taken together, means the system now “produces numbers that sustain a particular story—that growth is high, poverty is falling and welfare schemes are delivering.”
“In the case of the 2017–18 consumption survey, the professional statistical commission was never even asked to review the data; the ministry simply declared there were ‘data quality issues’, without spelling out what those issues were, and blocked publication until a leaked version appeared in the press,” Mohanan explains. “That kind of unilateral veto is very different from the earlier practice, where data problems were debated openly within the statistical system and in public.”
Sinha observes that “one of the most striking changes is not just how data is collected, but how public access to them is being quietly restricted,” with systems like the Health Management Information System still collecting data even as it moves “behind logins and redesigned dashboards that show only a thin slice of the underlying information,” creating a situation where there is “a rhetoric of ‘data abundance’ but shrinking access to microdata.”
Prabhakar points to a larger pattern, citing the suppression of employment data, action against dissenting institutions, long delays in datasets like CES and NCRB, and the postponed census as evidence that “uncomfortable numbers are delayed, withdrawn or re-packaged, while more flattering series are highlighted,” rather than openly debated and corrected.
On economic data, he adds, “the problem arises when, instead of investing in independent data for the unorganised sector, growth in the organised sector is simply used as a proxy for the informal economy—even though in India the formal sector often expands at the expense of small, informal firms.
“When that happens, an 8‑plus per cent GDP growth number can sit alongside weak job creation and distress in the informal sector, yet still be treated as if it describes the whole economy. That breaks the link between the indicator and lived reality and makes it very hard to track genuine welfare changes over time.”
The way forward
Experts agree that restoring public trust in India’s official statistics will require deep institutional reform rather than piecemeal technical fixes. Mohanan stresses the need for a fully empowered, well-resourced statistical commission that works full time, and for rebuilding permanent field capacity instead of relying heavily on contract investigators, a practice that weakens continuity and quality. “There is a commission now (I was part of it) but it has very little resources and is a part‑time body, with the chairperson and members serving part‑time,” Mohanan says.
Prabhakar reinforces this, calling for statistical institutions to be kept at arm’s length from day-to-day political control, for methods and revisions to be fully documented and contestable, and for timely, open data access to be treated as a test of integrity. Unless statistics are clearly produced for public understanding and policy rather than propaganda, he warns, confidence in official numbers will not return.
Kumar argues that without strong public accountability and democratic pressure—both weak in India’s political culture—even formally independent bodies are vulnerable to political control, leading to distrust in official data.
Sinha says the starting point must be restoring an independent, expert-driven buffer between the government and the numbers, with publication as the default for all surveys and administrative data, greater transparency around methods, and open access to underlying datasets so claims on growth, poverty and welfare can be independently checked instead of hidden behind dashboards and logins.
IndiaSpend reached out to Biswajit Das of Census Planning and Ajay Baksi, Additional Director General at the Ministry of Statistics and Programme Implementation’s Household Survey Division, for comment on census delays, survey redesign, broken comparability with past data and concerns over transparency and public access. We will update this story when we receive a response.
