Youth Ki Awaaz

Do Data And Statistics Paint The Whole Picture Of An Issue?

The uses of statistics are well known: data reveal how variables change over time and across communities, depending on various aspects of society. Data are also used to correlate different factors and to shape welfare policies.

However, statistics are also frequently abused. For example, a positive change in some indicator might be credited directly to the policy aimed at it, while side effects and external factors that could have driven the change are ignored. Once these factors are accounted for, it may turn out that the policy, by itself, had minimal impact.

To discuss the fallacies and misinterpretation of statistics, Genalpha DC at IMPRI Impact and Policy Research Institute, New Delhi, organised an IMPRI Special Lecture: The State of Statistics — #DataDiscourses on Uses and Abuses of Statistics by Prof Manoranjan Pal, a former Professor at the Economic Research Unit, and former Director of International Statistical Education Centre, Indian Statistical Institute (ISI), Kolkata.

Prof Pal’s research work is currently focused on measurement of poverty, inequality and segregation; applied econometrics; measurement of the status of health and nutrition; gender bias and empowerment of women, among others. He is also the co-author of Basic Econometrics (6th Edn) with D N Gujarati and D C Porter.

Prof Utpal K De, professor at North-Eastern Hill University (NEHU), Shillong, moderated the session. He noted that data collection leaves room for many mistakes and errors, which hinder the analysis of the data. Knowingly or unknowingly, mistakes can be made in the analytical methods too.

If we use data to reach a conclusion without subject knowledge or common sense, we may well misinterpret it. This, in turn, leads to wrong conclusions and flawed policy formulation.

“Growth of two rupees to four rupees exhibits a 100% growth and zero to one shows an infinite rate of growth. This shows that one has to be very meticulous, methodical and understand the actual relationship among variables.”
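The arithmetic behind this remark can be sketched in a few lines of Python. The function name `growth_percent` and the choice to report growth from a zero base as infinite are illustrative assumptions, not something presented in the lecture.

```python
def growth_percent(old: float, new: float) -> float:
    """Percentage growth from old to new (hypothetical helper)."""
    if old == 0:
        # No finite rate exists from a zero base; treating any rise as
        # infinite growth is an assumption that mirrors the quote.
        return float("inf") if new > 0 else float("nan")
    return (new - old) / old * 100


print(growth_percent(2, 4))  # two rupees to four rupees
print(growth_percent(0, 1))  # zero to one rupee
```

Both changes are tiny in absolute terms (two rupees and one rupee), which is why a growth rate reported on its own can mislead.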

To quote a line often attributed to Benjamin Disraeli, “There are three kinds of lies: lies, damned lies and statistics.”

Prof Pal didn’t completely agree with this famous quote because a statistical result depends on interpreting and applying the data. If one fails to apply the data correctly, they may come up with false answers. He described uses and abuses with appropriate reasoning and logical arguments.

Deceptive Statistics

According to Prof Pal, how a question is worded matters a lot. It is difficult to say no to certain questions, such as whether children should be given more open space to play freely, or whether harassment of women is increasing daily. These types of questions are called directed questions.

He notes that one should avoid directed questions while collecting data.

The investigator’s approach to the respondent is also crucial. The background and the presence of other persons during the interview also matter. Responses also depend on many other factors, such as how much time the respondent has, religious beliefs and social dictums.

It is less well known that the style of a question, such as “should” vs. “should not” or “is” vs. “is not”, also matters significantly.

The Experiment

He also gave examples of experiments showing how the outcome changes with the sample size, the pattern of the questions, the mode of analysis, etc.

One experiment was conducted with these styles of questions. The researchers set nine questions on family-related views and 14 questions on social views about different aspects of gender violence, put to adult males and females. Each question came in two styles with opposite meanings.

For example, the Type A, or affirmative, statement was “Women have the right to express their opinion if they disagree with their partner”. The Type B, or negative, statement was “Women do not have the right to express their opinion if they disagree with their partner”.

The two versions should not be posed to the same person: Type A questions should be put to some people and Type B questions to others. This rule was followed in the experiment, and the results were presented.

Results

Out of 51 males, 49 agreed with the Type A question and two disagreed. For the Type B question, 36 out of 52 males disagreed and 16 agreed. Disagreement with Type B is equivalent to agreement with Type A, so if 49 out of 51 (about 96%) agreed with Type A, then ideally a similar share, about 49 or 50 of the 52 males, should have disagreed with Type B.

But that was not the case: only 36 (about 69%) did. This type of pattern was seen in almost all the questions.
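One way to check whether a gap like this could be down to chance is a pooled two-proportion z-test. The sketch below applies it to the counts reported above; the choice of test and the function name `two_proportion_z` are assumptions made here for illustration, and the article does not say which test, if any, was used in the lecture.

```python
import math


def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Return (z, two-sided p-value) for a pooled two-proportion z-test."""
    p_a = success_a / n_a
    p_b = success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# "Pro-rights" answers: agreeing with Type A, disagreeing with Type B.
z, p_value = two_proportion_z(49, 51, 36, 52)
print(f"Type A: {49 / 51:.1%}, Type B: {36 / 52:.1%}")
print(f"z = {z:.2f}, two-sided p = {p_value:.4f}")
```

With only two disagreements in the Type A group, the normal approximation is rough, so the result is best read as an indication that wording plausibly shifted the answers rather than as a precise p-value.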

Coming back to deceptive statistics, another major source of deception is the use of inappropriate methods of drawing samples and collecting data. Such methods tend to paint a false picture of the phenomenon being studied.

Random Sample

To explain what a random sample really is, Prof Pal gave the example of a survey conducted among the readers of a particular magazine on whether they wanted a certain political party to be in power. The readers were instructed simply to fill in the form and post it without disclosing their identity.

Suppose more than 60% say yes. Does the survey result reflect the correct percentage in the population? The answer is no. By random, we mean that every individual in the population has a positive probability of being included in the sample, and that this probability is known beforehand. Suppose this magazine caters only to the higher-income group. Then people outside that group have no chance of being sampled, so this is not a random sample from the whole population.
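The magazine example can be imitated with a small simulation. Every number below (group sizes, support rates, sample size) is invented for illustration and does not come from the lecture; the sketch only shows how sampling from one income group can miss the population figure.

```python
import random

# Hypothetical population: 1 = supports the party, 0 = does not.
# All counts below are invented for illustration.
high_income = [1] * 13_000 + [0] * 7_000    # 65% support, magazine readers
other_income = [1] * 28_000 + [0] * 52_000  # 35% support, never surveyed
population = high_income + other_income

rng = random.Random(0)  # fixed seed so the sketch is repeatable
SAMPLE_SIZE = 2_000

true_share = sum(population) / len(population)
magazine_share = sum(rng.sample(high_income, SAMPLE_SIZE)) / SAMPLE_SIZE
random_share = sum(rng.sample(population, SAMPLE_SIZE)) / SAMPLE_SIZE

print(f"true support in population: {true_share:.1%}")
print(f"magazine-reader sample:     {magazine_share:.1%}")
print(f"simple random sample:       {random_share:.1%}")
```

The sketch even flatters the magazine survey, because it assumes readers are picked at random. In the real survey only readers who chose to post the form were counted, which adds self-selection on top of the coverage problem.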

Sample Size

Another important point to keep in mind is that the sample must be large enough to support a conclusion. For example, if a survey to gauge the likeability of a product has a sample size of 10 and seven respondents rate the product positively, it doesn’t mean that this is the popular opinion.

Even though 70% of the sample rated the product positively, we cannot confidently ascertain the general opinion of that product, since the sample is too small.
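How little a sample of 10 pins down can be made concrete with a confidence interval. The sketch below uses the Wilson score interval at roughly 95% confidence; that method, and the comparison sample of 1,000 respondents, are choices made here for illustration and were not part of the lecture.

```python
import math


def wilson_interval(successes: int, n: int, z: float = 1.96):
    """Approximate 95% Wilson score interval for a proportion."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    centre = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return centre - half_width, centre + half_width


for successes, n in [(7, 10), (700, 1000)]:
    low, high = wilson_interval(successes, n)
    print(f"{successes}/{n}: roughly {low:.0%} to {high:.0%}")
```

For seven out of 10, the interval stretches from well below half to almost nine in ten, so the data are compatible with the product being disliked by a majority. With the same 70% share in a sample of 1,000, the interval narrows to a few percentage points on either side.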

Simple Solutions

It is not true that we always need sophisticated statistical techniques to solve a problem. An example could be the salt case. Just after Independence, many refugee camps were set up, especially in the border states such as West Bengal. The government of West Bengal gave the refugees rice and pulses.

However, the government suspected that the number of people actually in the camps was lower than the number reported by the contractors of those camps. Expert help was therefore requested to ascertain the real size of the population.

By using the simple unitary method of dividing the total salt consumption of the group by the average amount of salt consumed by each person, the experts were able to figure out the actual number of refugees in the camps. Thus, sometimes solutions can be found through simple methods.
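The unitary method described here is a single division. In the sketch below the function name `estimate_population` and all figures are hypothetical, chosen only to show the calculation; the article gives no actual salt quantities or head counts.

```python
def estimate_population(total_salt_grams: float, salt_per_person_grams: float) -> int:
    """Unitary method: total consumption divided by per-person consumption."""
    return round(total_salt_grams / salt_per_person_grams)


# Hypothetical monthly figures, invented for illustration only.
reported_by_contractors = 5_000
estimated = estimate_population(total_salt_grams=1_200_000, salt_per_person_grams=300)
print(f"estimated: {estimated}, reported: {reported_by_contractors}")
```

The estimate is only as good as the assumed per-person consumption, so in practice that average would itself need to come from a reliable source.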

Interpretation Of Results

Hypothetically, if the mean depth of a river is 3 ft, one might assume that it is safe to cross. However, factors such as how much the depth varies from point to point and the velocity of the flow also need to be considered. Therefore, while interpreting a result, one must take the associated factors into account.
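A few invented depth readings show how a mean of 3 ft can hide a dangerous stretch. The values are hypothetical and chosen only so that the average works out to 3 ft.

```python
from statistics import mean

# Hypothetical depth soundings in feet across the river, invented for illustration.
depths_ft = [1, 1, 2, 2, 9]

average_depth = mean(depths_ft)
deepest_point = max(depths_ft)
print(f"mean depth: {average_depth} ft, deepest point: {deepest_point} ft")
```

Reporting the maximum or the spread alongside the mean gives a far more useful picture for anyone deciding whether to cross.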

Another, more common, example is inflation. When the inflation rate goes down, many assume that prices are falling. In reality, prices are still increasing, only at a lower rate. Prices begin falling only when the inflation rate becomes negative.
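The difference between a falling inflation rate and falling prices is easy to trace numerically. The yearly rates and the starting price level of 100 below are invented for illustration.

```python
# Hypothetical yearly inflation rates (as fractions), invented for illustration.
inflation_rates = [0.06, 0.04, 0.02, -0.01]

price = 100.0  # starting price level
price_levels = []
for rate in inflation_rates:
    price *= 1 + rate
    price_levels.append(price)
    print(f"inflation {rate:+.0%} -> price level {price:.2f}")
```

The price level keeps climbing through the first three years even though the rate drops each year, and it declines only in the final year, when the rate turns negative.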

Correlation Coefficients

A high correlation between two variables doesn’t imply a direct relationship between them. Both variables might be responding to a common factor. For example, the production of paddy in Assam may be highly correlated with the number of road accidents in Kolkata, but there is no logic supporting a link between the two. Thus, we shouldn’t read relationships into such correlations.
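A short simulation shows how a shared driver can produce a high correlation between unrelated series. The common factor assumed here is a steady upward trend over time, which is this sketch's assumption and not something stated in the lecture; the series names and all numbers are invented.

```python
import math
import random


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


rng = random.Random(42)  # fixed seed so the sketch is repeatable
years = range(30)
# Two invented series that share nothing except an upward trend over time.
paddy_output = [100 + 5 * t + rng.gauss(0, 5) for t in years]
road_accidents = [20 + 2 * t + rng.gauss(0, 2) for t in years]

r = pearson(paddy_output, road_accidents)
print(f"correlation: {r:.2f}")
```

The two series are generated independently apart from the trend, yet the coefficient comes out close to 1, which is why a high correlation needs a plausible mechanism before it is treated as meaningful.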

After a comprehensive presentation by Prof Pal on the uses and abuses of statistics, the discussant, Dr Gour Gobinda Goswami, Professor at the Department of Economics, School of Business and Economics, North-South University, Dhaka, Bangladesh, raised a few more important points.

Dr Goswami spoke about the importance of the setup in a survey. He illustrated the point with a survey of a group of children in which most of the responses he received were unreliable, because the children were talking to each other and not answering the questions independently.

While studying the uses of statistics, it is equally important to study the abuses. If one studies only the uses and not the abuses, one might be misled when conducting research or interpreting data. Information must be used appropriately in order to arrive at the most accurate possible conclusion.

Acknowledgement: Chhavi Kapoor is a research intern at IMPRI and is pursuing a bachelor’s degree in Political Science, Literature, and Economics from St. Xavier’s College, Mumbai.

Utpal K De, Arjun Kumar
