Chennai: About 10 months, nine million cases and over 130,000 deaths later, India does not yet know enough about why the SARS-CoV-2 virus, which causes COVID-19, affected those it did, or how.

There is wide variation in the spread of COVID-19 between states. Areas with higher population density are expected to have more cases, but varying testing rates could be affecting case detection. While the elderly and those with comorbidities are more likely to progress to severe disease and fatality, our analysis of national health data shows that there are exceptions among states. Higher testing is expected to lead to more cases, but some states buck this trend, even as evidence suggests some undercounting. Moreover, increasing use of the rapid antigen test, which is known to return false negatives in up to 50% of cases, further complicates case detection and reported numbers.

As would be expected with an outbreak that originated in China, India's first cases were of imported origin. As a direct result, the early Indian outbreaks could be traced back to well-known international pathways into Indian states and cities. When travel resumed after the lockdown, some outbreaks could be traced to migrant movement patterns; in states such as Karnataka that disclosed the origin of their new cases, the majority were linked to travellers and contacts of positive persons, as IndiaSpend reported in August. By the third week of November, over half of all cases were of unknown origin. Further, most other states do not disclose the source of their cases--while doctors on the ground call it community transmission, governments remain loath to use the phrase.

If cases are spreading through community transmission, what explains why some states are seeing much bigger outbreaks than others? What explains variations in the outcomes of the cases in these states? And do we know enough to be certain we are seeing the full picture?

A few broad trends have emerged thus far in India and globally; for one, the Indian Council of Medical Research's seroprevalence surveys conducted in May and August found higher prevalence in urban than in rural areas.

What characteristic of cities makes them seemingly so much more prone to outbreaks? The key defining feature seems to be population density; in Mumbai, for example, living in a slum and using a shared toilet was a strong predictor of sero-positivity, a sero-survey conducted by the city's municipal corporation with the Tata Institute of Fundamental Research found. Within cities, densely populated slums have so far seen far more cases than high-rises.

Advanced age is strongly correlated with worse outcomes in COVID-19 patients in India and worldwide, health ministry data show. As a result, states with a lower median age could be expected to have fewer deaths, but many outliers exist (see chart below).

States with a higher share of people with non-communicable diseases--which become comorbidities in the case of COVID-19--could also expect to see higher mortality, but again, exceptions abound. Our analysis used the prevalence of high blood sugar and high blood pressure only among men, as men are at higher risk for both (see here and here), and for COVID-19 mortality.

Yet, these explanations do not complete the picture.

"What we have right now are parts of the explanation, but there is a lot that we still do not know," Gautam Menon, professor of physics and biology at Ashoka University, and an infectious disease modeller, told IndiaSpend. "We just do not know for example why states like Uttar Pradesh and Bihar have seen such few cases."

Part of the problem is that very little peer-reviewed published work on the nature of transmission has come out of India thus far; one recent paper published in Science magazine analysed the nature of transmission using contact-tracing datasets from Tamil Nadu and Andhra Pradesh: 5% of infected individuals accounted for 80% of cases, children and young adults accounted for a third of cases, and deaths were concentrated among 50-to-64-year-olds, the study found. But it is difficult to generalise the findings given the two southern states have better developed health infrastructure and older populations (see chart above).

Bihar's health secretary Pratyaya Amrit attributes the state's low numbers (1,878 cases per million population) to its testing regimen combined with an unlikely natural experiment. "As you know, Bihar was hit with massive flooding this year which affected more than 8.4 million people," Amrit said. "We had to evacuate more than 550,000 people. So what we decided to do was we would rescue the person, and then before dropping them off at the relief camps, check their symptoms and conduct a test. In that way we were able to test a huge number of people and contain the spread." News reports said the chief minister had instructed officials to ensure social distancing in relief camps and community kitchens. The state also conducted antigen testing at bus-stops and other crowded areas, which helped identify asymptomatic people, Amrit said.

Yet, states which have tested more widely than Bihar have not been able to keep cases as low as the state has (see chart below); in fact, higher testing drives up the numbers, Munish Moudgil, the IAS officer who chairs Karnataka's COVID-19 war-room, points out.

Also, antigen tests have low sensitivity, leading to false negatives. The declining numbers of new cases are "encouraging", but because of the problem we have about the kind of test being employed, "there can be a fair amount of confusion about whether the actual case numbers really reflect day-to-day trends accurately", K. Srinath Reddy, president of the Public Health Foundation of India (PHFI), told IndiaSpend in October.
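To make the effect of low sensitivity concrete, here is a minimal Python sketch of the arithmetic. It assumes a hypothetical group of 1,000 infected people who all get tested, and uses the worst-case figure cited earlier in this article (false negatives in up to 50% of cases). The 95% comparison value and the function name are illustrative assumptions, not figures from any state's data.

```python
def expected_detected(infected_tested: int, sensitivity: float) -> float:
    """Expected number of truly infected people whose test comes back positive."""
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must be between 0 and 1")
    return infected_tested * sensitivity


# Hypothetical cohort: 1,000 infected people who all get tested.
infected_tested = 1_000

# Worst case cited in this article: the antigen test misses up to 50% of cases.
antigen_detected = expected_detected(infected_tested, sensitivity=0.50)

# Illustrative assumption only: a more sensitive test that catches 95% of cases.
reference_detected = expected_detected(infected_tested, sensitivity=0.95)

print(f"Antigen test finds about {antigen_detected:.0f} of {infected_tested}")
print(f"More sensitive test finds about {reference_detected:.0f} of {infected_tested}")
```

Under these assumptions the same outbreak would show up as roughly 500 reported cases with one test and 950 with the other, which is why a shift in the mix of tests can move reported numbers even if the underlying spread has not changed.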

Then, there is the case of Kerala, which has recently seen a spurt in cases, but has low mortality (55 deaths per million population), about a third the rate in neighbouring Tamil Nadu. "Kerala has a good, high-quality and well spread out healthcare network and its people have a high rate of accessing healthcare," Rajeev Sadanandan, Kerala's former health secretary, and now the special advisor to the chief minister on COVID-19, told IndiaSpend. "During the pandemic, Kerala was the first to establish a four-tier hospital system, and every single case is followed up."

But there is some evidence that Kerala is undercounting its COVID-19 dead. Arun N. Madhavan is a doctor of internal medicine who runs a clinic in Palakkad. He leads a team of volunteers who scan the state's media for news reports and obituaries that mention a death from COVID-19, and cross-reference them against the state's official list. So far, the group says, 45% of the deaths they tallied have not shown up in the government's list.
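The kind of cross-referencing described above can be sketched in a few lines of Python. This is not the volunteers' actual method, which the group has not detailed here; it assumes, purely for illustration, that a death can be matched on district, date, age and sex, and every record below is invented.

```python
from typing import Iterable, NamedTuple


class DeathRecord(NamedTuple):
    """Hypothetical matching key; the real group's criteria are not known."""
    district: str
    date: str  # ISO date of death
    age: int
    sex: str


def share_missing(media_reports: Iterable[DeathRecord],
                  official_list: Iterable[DeathRecord]) -> float:
    """Fraction of media-reported deaths that do not appear in the official list."""
    media = set(media_reports)
    if not media:
        raise ValueError("no media-reported deaths to compare")
    missing = media - set(official_list)  # set difference: reported but not listed
    return len(missing) / len(media)


# Invented records, for illustration only.
media_reports = [
    DeathRecord("District A", "2020-10-01", 72, "M"),
    DeathRecord("District A", "2020-10-03", 65, "F"),
    DeathRecord("District B", "2020-10-04", 58, "M"),
    DeathRecord("District C", "2020-10-05", 80, "F"),
]
official_list = [
    DeathRecord("District A", "2020-10-01", 72, "M"),
    DeathRecord("District A", "2020-10-03", 65, "F"),
    DeathRecord("District D", "2020-10-06", 69, "M"),
]

print(f"Share of media-reported deaths missing: {share_missing(media_reports, official_list):.0%}")
```

In practice, exact matching of this kind would be fragile: a misreported age or a date off by a day would count a death as missing, so any real tally would need manual checks on near matches.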

"Kerala is a state that has always taken pride in its data transparency. There were earlier many deaths that were not being counted as COVID deaths; however now this is only taking place in case there were very extenuating circumstances, like if the person had a terminal illness but also happened to be COVID positive," Sadanandan admitted, adding that he hoped that most or all deaths of COVID-positive people would be counted in the state.

The issue here cuts to the heart of a broader problem with inter-state comparisons--the wide variations between states in their ability and willingness to report data. Many states are, to a greater or lesser extent, attempting to under-report data so as not to attract negative media and political attention, an IndiaSpend investigation showed. Assam's mortality from COVID-19 is lower than Bihar's, but reporting from the state has shown that its death audit committees reclassified 60% of deaths of COVID-19 positive people as being "deaths from other causes".

"Comparing states on the basis of their reported numbers and then criticising the states reporting higher numbers is actually not a fair comparison," Giridhar Babu, an epidemiologist with the Public Health Foundation of India, said.

If these are states with inexplicably low numbers and under-reporting problems, equally little is yet known about why mortality in particular is higher than normal in some places. Gujarat had India's highest case fatality rate (CFR)--the proportion of lab-confirmed cases that resulted in deaths--until the first week of September, when it was overtaken by Punjab.
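As a rough illustration of the CFR definition above, the following Python sketch applies it to the rounded national totals quoted at the top of this article (about nine million cases and 130,000 deaths). The function name is ours, and because the inputs are rounded, the result is indicative only.

```python
def case_fatality_rate(deaths: int, confirmed_cases: int) -> float:
    """Return CFR: deaths as a percentage of lab-confirmed cases."""
    if confirmed_cases <= 0:
        raise ValueError("confirmed_cases must be positive")
    return 100.0 * deaths / confirmed_cases


# Rounded national figures from the opening paragraph (approximate, not exact counts).
national_cfr = case_fatality_rate(deaths=130_000, confirmed_cases=9_000_000)
print(f"National CFR: {national_cfr:.2f}%")  # prints "National CFR: 1.44%"
```

Both inputs are sensitive to reporting practices: infections that go undetected shrink the denominator and push CFR up, while deaths reclassified as being from other causes shrink the numerator and pull it down. That is one reason CFR comparisons between states need caution.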

Both Gujarat (median age 29.28 years) and Punjab (31.96 years) are older than the Indian average (28.4 years), have a higher prevalence of comorbidities, and are more urban; however, once again, the explanation is not complete. Tamil Nadu, for instance, meets these criteria as well, but has low case fatality.

State-level explanations are largely based on anecdotes; Punjab's nodal officer for COVID-19, Rajesh Bhaskar, attributed the state's high CFR to delayed testing. "There were rumours spreading on social media that this is like a common cold and no one needs to get tested," Bhaskar told IndiaSpend. "Once we realised this, we did a big clampdown on disinformation on social media, and also involved panchayats and local bodies in encouraging people to get tested. We reduced the waiting time for tests to 15 minutes, and now daily deaths have come down from 70 per day to 10-14 per day." Why a misinformation-driven outbreak should be unique to Punjab, however, is unclear.

Indeed, a state's declining numbers could be argued to be the result of either government success or government failure. "The decline that we are seeing can be as a result of two factors: either the containment measures have been so successful that cases are declining, or the virus has spread so widely that some kind of herd immunity has been achieved," Manoj Murhekar, director of the Indian Council of Medical Research's National Institute of Epidemiology, said. "I personally think it is the latter--from what we have seen in the ICMR sero-survey as well as other published sero-surveys in metros, it appears that the virus has spread widely and that is the cause of the decline." Why it should have spread more widely in some states than others, however, needs more study, he said.

While Indian scholarship on the spread of the virus might still have some distance to go, we may never find all the answers we are looking for. "One thing to keep in mind while looking at infectious diseases is that some of it is just luck," Menon said. "Even if you do everything right, it just takes one person who happens to be hyper-infectious, coming out and infecting a bunch of people, then moving on--these are the eventualities that you cannot plan for," he said.

(Rukmini S. is an independent data journalist based in Chennai.)
