Social science and government aims
Proposed standards for public goals and research aggregating statistics on individuals
Matt Berkley
Draft, 10 January 2006
Contents
Summary
Purpose of this document
Background
Preliminary notes
Proposed standards for goals and research
1. Estimate reliability of the data.
2. Estimate reliability of conclusions.
3. Distinguish data on samples from inferences on populations.
4. Distinguish reports from history.
5. Distinguish population trends from trends for people.
6. Distinguish spending from income.
7. Distinguish spending from consumption.
8. Distinguish consumption from adequacy.
9. Distinguish income from profit.
10. Distinguish prices from relevant prices.
11. Distinguish prices from cost of living.
12. Distinguish consumption gains from material gains.
13. Distinguish incidence from prevalence.
14. Distinguish prevalence from degree.
15. Distinguish material conditions from judgements on well-being.
16. Imagine real people.
17. Look at meaningful groups of people.
18. Distinguish “statistical significance” from importance.
General notes on the standards
Suggested examination questions
This document proposes minimum standards in the use of language for
a) some kinds of public policy goals and
b) some kinds of reporting in social science.
Some of the standards may be considered ethical standards.
The proposals are for the attention of researchers, funders,
policy makers and the general public.
The document may also be a guide to asking some kinds of
questions on past claims about poverty and prosperity.
The document mentions potential, and some actual, errors in
social science.
The impetus was the author’s interest in current methods of
economic analysis. In the process he
developed an urge to understand the relationships between
a) current theory in social science
b) predominant practice in social science and government and
c) what he thought he might like as government aims and
progress indicators if he were among those called “extremely poor”.
Thinking about the extreme case sometimes reveals the
principles.
The aim of this document is to help bring clarity to both
documents and debates on some aspects of government policy, and some aspects of
social science reporting. Some of the
standards apply in particular to economics.
Many people die of malnutrition in a world where resources are
adequate. At the same time, people differ
as to what they mean when they say “poverty has got worse” or “poverty has got
better”.
An element of reasoning behind this document is: One part of a solution to malnutrition may be
for organisations to adopt common language.
The organisations would include professional organisations, institutions
and funders, including government agencies.
The reasoning stems from the apparent fact that not only
consumers of economic information but sometimes also its producers have been
unaware of the precise nature of the information.
Some questions related to social science are fundamentally
subjective. That is one reason why the
document is aimed at a wider readership than academics.
The document may help clarify which are scientific matters,
which are matters of opinion and which are moral matters.
The distinctions in this document may help clarify the
evidential position on some claims concerning human poverty and prosperity.
One argument for adopting such standards may be: Looking at what lies behind the language of
social scientists may help solve some puzzles in international statistics.
If public institutions adopt standards for clearer language, the
public may be in a better position to choose between policies.
In 2000 the present author raised a fundamental flaw with
professors of economics: if the
“poorest” die, the figures appear to show they did better. Since then, academics have begun
discussing that issue, and some have written that this is a significant conceptual
advance. But in this area, as in others, there are no rules for economists - no boundaries for acceptable
professional conduct. What is needed
are rules forbidding social scientists from making statements for which they
clearly do not have evidence and which may have important social impacts. It seems wrong that social scientists can
make elementary errors and face no sanction from either employers or
professional organisations.
For instance, if a doctor recommended a treatment without
looking at survival rates, that might be classed as a serious mistake. If a social scientist recommended a policy
without looking at survival rates, that might be classed as a far more serious
mistake. There is less point in
spending money training social scientists if governments can employ them to say
whatever they like.
At present, the author knows of no political party which
endorses such rules in relation to the use of economic data.
In 2000 it struck me that the debate over global poverty needed
better statistics. I came to realise it
needed clarity about existing statistics.
I was a trustee of a family trust. It occurred to me that the trust might help
provide statistics for the debate on global poverty. I also thought it might help bring academics
together with campaigners.
I then read an economic policy document taken seriously by
newspapers and the UK government, which contained elementary errors. This situation seemed to indicate widespread
errors of reasoning in economics.
Around the same time, the heads of several campaigning and
research organisations told me that it would be a good thing if there were a
think tank on global poverty.
Perhaps I was aiming to clarify two things:
1) how existing policy and research methods related to what I
might want for myself in the situation of the “poorest”;
2) the evidence for the most influential claims about policies
and poverty.
Some of the issues go back a long way.
For example, the tradition in macroeconomics (large-scale
economics) has been to assume that if incomes of the poorest rose 1% they had a
1% benefit.
In 1776 Adam Smith noted a difference between the inflation rate for
working-class people and the national rate.
He also wrote about needs being a factor in prosperity: he did not advocate the idea that resources
measured either prosperity or poverty without reference to need.
Another example relates to adding up the numbers. Economists often write about utility,
meaning the consequences for people.
The idea of calculating “utility” or gains in well-being to people goes back
to the philosopher Jeremy Bentham. His
form of “utilitarianism” was about “the greatest good to the greatest
number”. He included duration of
pleasure as a factor in his “calculus” of consequences. But almost all economists, for some reason,
ignored this in making claims about poverty.
It appears to have been common in macroeconomics to claim to
know economic benefits to the poorest people without thinking about prices,
needs or survival rates.
There is disagreement over what prosperity is; what might constitute evidence for
prosperity; what might be accurately described
as past trends in prosperity or its inverse, poverty; and what are better policies for increasing
prosperity.
It may be that clearer language would help resolve some aspects
of disagreement.
It is important to bear in mind that the standards are not meant
to override common sense.
What counts as common sense is a subjective matter. There are inevitably areas related to these
standards where judgement is involved.
The aim is to clarify descriptions of the evidence and the arguments
used for the conclusion.
Science is about better approximating to the truth.
The document aims to further that enterprise.
In the real world, it is worth remembering that there may be
temptations for social scientists, employers, governments, state institutions,
media organisations, political organisations, and/or people in general to
believe what they want to believe, and/or try to make others believe things
without really having the evidence they think or say they have.
This document aims to engage with that aspect of the real world.
Suppose a social scientist says they have data on a social
trend. How do you decide whether to believe them?
One type of question you can ask is about definitions. It’s often a good idea to ask what they
mean.
Another type of question is about reliability.
An example of a question to ask is:
Do the numbers come from samples of the population, or
everyone?
Usually, numbers come from samples.
This kind of thing may seem at first a complex area, but if
you apply imagination you can think of some relevant aspects you might want to
ask about. After all, the scientist
has obtained the data from somewhere, and scientists are only people.
Another example of an area where questioning may be useful
is this:
A scientist might say they have measured something, when the
reality is that they have added up answers people gave.
For example, suppose a researcher says “people are 2%
happier in country X than in Y”.
A description of the procedure might be something like
this:
“In each country, workers asked one in ten thousand people
how happy they were.”
In a research project there may be many details. But the principles are often easy to
understand. In this case, without
knowing anything about the research, we can do some thinking. Here are some initial thoughts that we might
have:
1) You can’t really measure happiness - you can’t really know what people are experiencing.
2) Costs are likely to limit the sample, which may be one in 1000 people or fewer.
3) Some people may not tell the truth.
4) People may feel different at different times.
5) The questions might seem different in different languages.
6) The researchers might have been forced to leave out people they couldn’t find, or who wouldn’t answer.
Some of these factors might have skewed the results.
This example is not meant to reflect the reality of research
on what people say about happiness. It
is to illustrate the principle that you can think about what social scientists
tell you, and the principle that it’s not magic - it’s just people finding out
about things by methods that are possible in real life.
My personal belief is that in the case of what some people
call “extreme poverty” a strange kind of psychological process has happened
whereby the “non-poor” can have a kind of short-circuit of imagination. Perhaps this is common in other areas of
social science research as well. What
I have consistently observed is that some people paid highly to be experts in
“poverty” often fail to show evidence of having thought about some of the most
basic aspects of real life. Some of
these aspects are mentioned below.
They include questions about the reliability of data.
Proposed standard for researchers:
Estimate reliability of data for the specific purpose,
giving reasons.
Estimates should be in the context of the specific
statistical tests.
Data may be reliable enough for a simple test but not for a
complex test.
This may sound complex, but the principles are simple: think about
a) what you are comparing and
b) how your confidence in the data might stand these
comparisons.
Example:
“We estimate that
in the context of
comparing outcomes under policy X with outcomes under policy
Y,
taking into account
the number of countries in each category or being correlated,
the sample sizes,
the survey coverage,
the possible gaps in data,
the number of time periods,
[...and any other relevant factors...]
the likelihood of our figures being right to within a% is
b.”
Note that unreliability compounds across a series of inferences, because the
probabilities of the individual steps are multiplied (see below).
Researchers must consider the reliability of inferences,
giving reasons where the inferences are not otherwise clear.
The unreliability of a series of inferences compounds: the probabilities of the individual steps are multiplied.
For example:
If
there are two steps in my argument
and
each has a 70% probability of being right, independently of the other,
then
I am probably wrong.
70/100 x 70/100 = 4900/10000 = 49%.
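The multiplication above can be checked with a short Python sketch. The helper name `chain_reliability` is hypothetical and introduced only for illustration, and the calculation assumes that each step is right or wrong independently of the others, which a real argument may not satisfy.

```python
from math import prod


def chain_reliability(step_probabilities):
    """Probability that every step in a chain of inferences is right.

    Assumes the steps are independent of each other (an assumption
    made for this sketch, not a claim about any real study).
    """
    return prod(step_probabilities)


# Two steps, each with a 70% probability of being right.
two_steps = chain_reliability([0.7, 0.7])
print(f"Probability the whole argument is right: {two_steps:.0%}")
```

The result is below one half, which is the sense in which the argument as a whole is "probably wrong" even though each step on its own is probably right.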
Suppose survey data are on answers about spending. This is mostly the case in real life.
If an economist is asked by a politician to say something
about how well or badly poor people did, the following would be some of the necessary
inferences.
From
i) “what poor people sampled in 1990 and 2000 said
they spent”
to
ii) “what representative samples of poor people in
1990 and 2000 said they spent”
to
iii) “what poor people spent in 1990 and 2000”
to
iv) “trends in spending for real poor people over
time
(taking demographic change into account)”
to
v) “trends in income for poor people”
or “consumption trends for poor people”
to
vi) “consumption adequacy for real poor people”
to
vii) “material gains for real poor people”
to
viii) “gains in well-being for real poor people”.
Necessary assumptions might concern at least, not
necessarily in this order:
a) sample adequacy
b) truthfulness of respondents
c) memory of respondents
d) demographic change (which determines food needs)
e) workload (which determines food needs)
f) changing food quality
g) food prices
h) survival rates
i) a theory of how well-being relates to spending
j) the relative value to people of assets and consumption.
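To give a rough sense of what the chain from i) to viii) demands, the following Python sketch asks how reliable each of the seven steps would need to be for the final statement to be more likely right than wrong. The function name, the 50% target and the 80% figure are all hypothetical choices made for illustration, and the steps are assumed to be independent and equally reliable, which is a simplification.

```python
def required_step_probability(n_steps, target=0.5):
    """Per-step probability needed for an n-step chain to reach `target`.

    Assumes independent, equally reliable steps (illustrative only).
    """
    return target ** (1.0 / n_steps)


n_inferences = 7  # the seven "to" steps between statements i) and viii)
needed = required_step_probability(n_inferences)
print(f"Each step must be right with probability of about {needed:.3f}")

# Illustrative only: seven steps that are each 80% likely to be right.
overall = 0.8 ** n_inferences
print(f"Seven steps at 80% each: about {overall:.2f} overall")
```

Under these assumptions each step would need to be right about nine times in ten for the conclusion to be better than an even bet.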
In order to understand the inferences it is perhaps
important to be clear about different kinds of statements.
The statement
“People in our sample in country P had rises in X”
is not the same statement as
“In country P, X rose”.
Researchers should note non-responses and difficulties in
obtaining random samples. The aim here
is to exclude the risk of error through sample bias.
Note: The “rich” and the “destitute” may not be reachable in surveys. If the destitute are unreachable, it is not
clear how an economist can have data on the poorest.
Note on units chosen:
If the units chosen for comparison are countries, reasons should be
given as to why these countries might constitute representative samples of all
relevant countries.
Similar considerations apply to time periods.
What people said does not tell you what they did.
Economic data on most people are on their answers to
questions about spending.
In many areas of life, answers people give may not be
true.
These areas of life include what they spent, earned, ate,
drank, used, or acquired.
Reasons relate to
a) honesty
b) self-deception
c) memory
d) mathematical ability.
It is a mistake to describe answers as measured quantities
without good reason.
Reason:
To minimise risk of damage to people from confusion of:
a) trends for people
and
b) changes in aggregates for populations
Example:
“the average rise [for people] was x%”
versus
“the [population] average rose by y%”.
Demography axiom
It is not possible to aggregate outcomes from statistics
solely on the living.
Note
Statistics on survivors are selective. Aggregate outcome statistics include those
who die.
Statistics on survivors do not yield data on what happened to
people during a period.
Nor do they yield data on what happened to the survivors
over the period.
The statement
“The average was x% higher in 2000 than in 1990”
does not tell a researcher either that
“people had rises of x%”
or that
“survivors had rises of x%”.
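A small numerical sketch in Python may make the distinction concrete. The four people and their spending figures are invented for illustration and are not taken from any survey.

```python
# Hypothetical spending figures for four people in 1990 (made-up data).
spending_1990 = {"A": 1.0, "B": 2.0, "C": 3.0, "D": 4.0}

# By 2000 the poorest person, A, has died. Nobody else's spending changed.
spending_2000 = {"B": 2.0, "C": 3.0, "D": 4.0}


def mean(values):
    values = list(values)
    return sum(values) / len(values)


# Population statistic: average of whoever is alive in each year.
population_change = mean(spending_2000.values()) / mean(spending_1990.values()) - 1

# Statistic on people: average of each survivor's own change.
survivor_change = mean(
    spending_2000[name] / spending_1990[name] - 1 for name in spending_2000
)

print(f"Population average changed by {population_change:+.0%}")
print(f"Average change for survivors: {survivor_change:+.0%}")
```

In this invented case the population average is a fifth higher in 2000 than in 1990, yet no survivor had any rise and one person died.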
Examples of what are strictly speaking errors:
a) Any inferences as to aggregate trends for people from
United Nations Millennium Goal indicators for hunger, poverty, education, AIDS
and water.
These indicators were in terms of population
proportions. They are therefore not
indicators of aggregate progress for people.
Numbers of living people depend on births, deaths and
migration as well as trends for individuals.
Where an indicator is described in terms of proportions of people, the proportions
of living people on either side of the line depend on births, deaths and migration of
people on each side of the line as well as on individual trends.
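The dependence on births, deaths and migration can be illustrated with a minimal Python sketch. All the numbers are invented, and by construction no individual crosses the line in either direction.

```python
def proportion_below(below, above):
    """Share of the living population that is below the line."""
    return below / (below + above)


# Hypothetical starting population (made-up numbers).
below_start, above_start = 50, 50

# Demographic change only: nobody crosses the line in either direction.
deaths_below, deaths_above = 10, 2
births_below, births_above = 4, 8
net_migration_below, net_migration_above = -4, 0

below_end = below_start - deaths_below + births_below + net_migration_below
above_end = above_start - deaths_above + births_above + net_migration_above

start_share = proportion_below(below_start, above_start)
end_share = proportion_below(below_end, above_end)

print(f"Share below the line at start: {start_share:.1%}")
print(f"Share below the line at end:   {end_share:.1%}")
```

The proportion below the line falls in this sketch although no person's own position improved, which is why a proportion on its own is not an indicator of aggregate progress for people.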
b) Policy assessments from macroeconomic statistics.
Treatment of population statistics as statistics on people
has been the dominant tradition in macroeconomics.
Exceptions to the standard:
where there is
a) specific relevant information on survival rates, age
structure, birth rates and migration
or
b) clear reason to infer that survival rates, age structure,
birth rates and migration were, within reason and in all material respects,
constant, proportional or irrelevant.
Survival axiom
Where survival rates are not known, within reason, to be
constant, proportional or irrelevant, the notion of an average or other
single-statistic aggregate outcome is not applicable.
Note
The standard is to cover cases where
either
a) a direct statement is made concerning aggregate outcomes
or
b) a reader might reasonably understand the claim to refer
to aggregate outcomes.
An example of type (b) would be where the speaker refers to
“reduction” of a condition considered undesirable and the context is of
alleviating the condition for sufferers.
Reason
In the period 1945-2005 most economic data on individuals
related to spending; yet in 2005 the
tradition in macroeconomics in summaries for the public was to describe the
data as “income”.
Risk: The public might assume economists counted both savings and spending.
Globally, the data mostly related to spending. A minority of data were on income (notably
in Latin America). Some data were on the
money value of items eaten or used.
Most of the numbers were on what a sample of people said
they spent.
Examples of what are strictly speaking errors:
a) Any reference to Millennium Goal indicators as on
“income”.
Millennium Goal Indicator 1 is mostly concerned with reducing
the proportion of low spenders.
b) Any reference, in policy assessments, to the economic data
as “income”.
For example, even if there were no other problems, a claim
by economists that “poor people’s incomes rose at the same rate as policy X”
would still be misleading. More
accurate would be “poor people’s spending rose with policy X”.
Terminological issue:
The fact that the international data are a mixture poses a linguistic
problem for researchers: how to describe
the data accurately and concisely.
“Income” is misleading.
It would be more accurate to describe the data as on spending. In order not to mislead, it might be better
for economists to use the term “the economic data”.
Purpose of standard
To ensure the phrase “consumption expenditure” is shortened
to “spending” rather than “consumption”.
Note
The word “consumption” would perhaps most naturally be taken
by a non-economist to refer to “items or services received or used”.
In this sense, consumption would be “things received”, not
“money spent”. An economist who had
data on spending and who then wrote “consumption has risen” would not be misleading
the public in the same way as one who wrote “poverty has fallen” or “incomes
have risen” but would still be misleading them.
The definition of what counts as consumption is in any case
perhaps in some ways arbitrary.
Example: Food consumption.
Food consumption adequacy for daily tasks depends on at
least:
size,
age,
economies of scale,
workload type,
workload amount,
weather,
food balance,
food quality.
Examples of error
a) Macroeconomic claims on global poverty.
Per-person statistics used by the World Bank, the UN and others up
until 2005 failed to take into account that the proportion of children is
falling.
b) Policy assessments from macroeconomists based on per capita
statistics.
Proportions of children are not constant across countries,
or within countries by spending level.
In both cases inferences from consumption to adequacy were
made without reference to needs.
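The effect of a falling proportion of children on per capita statistics can be sketched in Python. The population figures are invented, and the weight of 0.5 for a child's needs relative to an adult's is a purely hypothetical assumption; as noted elsewhere in this document, what counts as a need is a matter of opinion.

```python
CHILD_NEED_WEIGHT = 0.5  # hypothetical: a child's needs relative to an adult's


def spending_per_need_unit(total_spending, adults, children):
    """Spending divided by needs, counting each child as a fraction of an adult."""
    need_units = adults + CHILD_NEED_WEIGHT * children
    return total_spending / need_units


def per_capita(year):
    """Spending divided by the number of people, ignoring differences in need."""
    return year["total_spending"] / (year["adults"] + year["children"])


# Made-up population: 100 people in both years, but fewer children later.
year_1 = {"adults": 50, "children": 50, "total_spending": 10_000.0}
year_2 = {"adults": 60, "children": 40, "total_spending": 11_000.0}

per_capita_rise = per_capita(year_2) / per_capita(year_1) - 1
need_adjusted_rise = (
    spending_per_need_unit(**year_2) / spending_per_need_unit(**year_1) - 1
)

print(f"Rise in spending per person:    {per_capita_rise:.1%}")
print(f"Rise in spending per need unit: {need_adjusted_rise:.1%}")
```

With these invented figures the per capita rise is about 10% while the rise relative to needs is roughly 3%, so the per capita statistic overstates the gain in adequacy.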
It is difficult to see how a researcher who has not
estimated needs for fuel, food, water, medicine, rent or other basic items
could have estimated poverty.
Real-world puzzles potentially partially explained by this
error:
i) Discrepancy between the hunger trend reported by the Food and Agriculture
Organisation and the poverty trend reported by the World Bank for the Millennium
Goals.
ii) Discrepancy between protesters’ and economists’ views of
progress of global poorest.
iii) Discrepancy between World Bank and other reports of
progress on Millennium Goals.
Note: In some periods of history, falls in child-adult ratios (which make cross-sectional per
capita statistics, other things being equal, overestimate progress) may
coincide with rises in longevity (which make cross-sectional statistics, other things
being equal, underestimate progress).
Note on the concept of adequacy
Consumption adequacy is a concept rather than a scientific
variable.
For example, it might be argued that people in a country
where family members live further apart need more money for transport.
How much fuel is needed in a cold place to get to the same
standard of living as in a hot place?
Such needs, as with many things described as needs, are
matters of opinion, not science.
See below on inflation and cost of living for a related
distinction.
Reason
Risk of inferring gains without estimating needs.
Example
Inferences from “income” to “income poverty”.
Profit axiom
(Income) - (necessary outgoings) =
(profit).
Parallel
Businesses. The axiom applies to a household as it does to a business.
For both businesses and households, the following are
true:
Revenue, income or turnover are not profit.
1% more turnover does not indicate 1% more profit.
An income rise of 1% does not measure an income gain of
1%.
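The profit axiom can be worked through with a short Python sketch. The figures are invented for illustration, and the function names are hypothetical.

```python
def profit(income, necessary_outgoings):
    """Profit axiom: (income) - (necessary outgoings) = (profit)."""
    return income - necessary_outgoings


def percent_change(old, new):
    return (new - old) / old * 100


# Made-up figures for a household (or a business).
before = profit(income=100.0, necessary_outgoings=90.0)

# Income rises 1% and necessary outgoings stay the same.
same_costs = profit(income=101.0, necessary_outgoings=90.0)

# Income rises 1% but necessary outgoings (say rent or food) rise too.
higher_costs = profit(income=101.0, necessary_outgoings=92.0)

print(f"Profit change, costs unchanged: {percent_change(before, same_costs):+.0f}%")
print(f"Profit change, costs higher:    {percent_change(before, higher_costs):+.0f}%")
```

In both invented cases income rose by 1%, but what was left after necessary outgoings rose by about a tenth in one case and fell by about a tenth in the other.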
Note
It is not clear what philosophical argument might be
advanced that income is an indicator of welfare.
It does not measure the cost of rent, childcare, transport,
fuel, food, water or medical services in any country. It is a measure of money going round the
system.
Income is a social indicator.