

Choose a journal article, book chapter, or book from one of the core readings from any of the modules you have taken which applies one of the methods listed below, or in the case of “notions of causation” or “ethics and positionality”, that touches directly upon either issue.

Write a critical review (see word limits below) of the way in which this method has been applied or how questions of causality or of ethics have been navigated. As part of building this critique, your review can also consider the limitations or shortcomings of relying on a single or given method. Equally, if the source you have chosen uses more than one method (e.g. a case study and discourse analysis), then your review can critique both parts of this.

As part of building your critique, you are strongly encouraged to refer to the literature from the module reading list and to add to this by searching for additional relevant references.

(Ensure that you provide a full reference for the article/chapter/book you have chosen, as well as the module reading list to which it belongs)

List of topics:

· Notions of causation 

· Ethics and Positionality 

· Interviews, ethnography and participatory methods 

· Case studies and comparative methods 

· Discourse analysis 

· Archival methods

· Intro to uses of quant data 

· Visualising data 1 

· Surveys and sampling 

Word limits:

1. For those who began their course in 2021: 3000 words.


Causes and Conditions

Author(s): J. L. Mackie

Source: American Philosophical Quarterly, Vol. 2, No. 4 (Oct. 1965), pp. 245-

Published by: University of Illinois Press on behalf of the North American
Philosophical Publications

Stable URL: https://www.jstor.org/stable/20009173

JSTOR is a not-for-profit service that helps scholars, researchers, and students discover, use, and build upon a wide
range of content in a trusted digital archive. We use information technology and tools to increase productivity and
facilitate new forms of scholarship. For more information about JSTOR, please contact support@jstor.org.

Your use of the JSTOR archive indicates your acceptance of the Terms & Conditions of Use, available at https://about.jstor.org/terms.

North American Philosophical Publications and University of Illinois Press are collaborating with JSTOR to digitize, preserve and extend access to American Philosophical Quarterly.

This content downloaded on Wed, 27 Apr 2022 13:55:35 UTC
All use subject to https://about.jstor.org/terms

American Philosophical Quarterly
Volume 2, Number 4, October 1965


Asked what a cause is, we may be tempted to say that it is an event which precedes the event of which it is the cause, and is both necessary and sufficient for the latter's occurrence; briefly, that a cause is a necessary and sufficient preceding condition. There are, however, many difficulties in this account. I shall try to show that what we often speak of as a cause is a condition not of this sort, but of a sort related to this. That is to say, this account needs modification, and can be modified, and when it is modified we can explain much more satisfactorily how we can arrive at much of what we ordinarily take to be causal knowledge; the claims implicit within our causal assertions can be related to the forms of the evidence on which we are often relying when we assert a causal statement.

§ 1. Singular Causal Statements

Suppose that a fire has broken out in a certain
house, but has been extinguished before the house
has been completely destroyed. Experts investigate
the cause of the fire, and they conclude that it was
caused by an electrical short-circuit at a certain
place. What is the exact force of their statement
that this short-circuit caused this fire? Clearly the
experts are not saying that the short-circuit was a
necessary condition for this house’s catching fire
at this time; they know perfectly well that a short
circuit somewhere else, or the overturning of a
lighted oil stove, or any one of a number of other
things might, if it had occurred, have set the house
on fire. Equally, they are not saying that the
short-circuit was a sufficient condition for this
house’s catching fire; for if the short-circuit had
occurred, but there had been no inflammable
material nearby, the fire would not have broken
out, and even given both the short-circuit and the
inflammable material, the fire would not have
occurred if, say, there had been an efficient automatic sprinkler at just the right spot. Far from
being a condition both necessary and sufficient for

the fire, the short-circuit was, and is known to the
experts to have been, neither necessary nor
sufficient for it. In what sense, then, is it said to
have caused the fire?

At least part of the answer is that there is a set of conditions (of which some are positive and some are negative), including the presence of inflammable material, the absence of a suitably placed sprinkler, and no doubt quite a number of others, which combined with the short-circuit constituted a complex condition that was sufficient for the house's catching fire: sufficient, but not necessary, for the fire could have started in other ways. Also, of this complex condition, the short-circuit was an indispensable part: the other parts of this condition, conjoined with one another in the absence of the short-circuit, would not have produced the fire. The short-circuit which is said to have caused the fire is thus an indispensable part of a complex sufficient (but not necessary) condition of the fire. In this case, then, the so-called cause is, and is known to be, an insufficient but necessary part of a condition which is itself unnecessary but sufficient for the result. The experts are saying, in effect, that the short-circuit is a condition of this sort, that it occurred, that the other conditions which conjoined with it form a sufficient condition were also present, and that no other sufficient condition of the house's catching fire was present on this occasion. I suggest that when we speak of the cause of some particular event, it is often a condition of this sort that we have in mind. In view of the importance of conditions of this sort in our knowledge of and talk about causation, it will be convenient to have a short name for them: let us call such a condition (from the initial letters of the words italicized above), an inus condition.1

This account of the force of the experts’ state?
ment about the cause of the fire may be confirmed
by reflecting on the way in which they will have
reached this conclusion, and the way in which
anyone who disagreed with it would have to

1 This term was suggested by D. G. Stove who has also given me a great deal of help by criticizing earlier versions of this




challenge it. An important part of the investigation
will have consisted in tracing the actual course of
the fire; the experts will have ascertained that no
other condition sufficient for a fire’s breaking out
and taking this course was present, but that the
short-circuit did occur and that conditions were
present which in conjunction with it were sufficient
for the fire’s breaking out and taking the course
that it did. Provided that there is some necessary and sufficient condition of the fire (and this is an assumption that we commonly make in such contexts) anyone who wanted to deny the experts' conclusion would have to challenge one or another of these points.

We can give a more formal analysis of the statement that something is an inus condition. Let 'A' stand for the inus condition (in our example, the occurrence of a short-circuit at that place) and let 'B' and 'C̄' (that is, 'not-C', or the absence of C) stand for the other conditions, positive and negative, which were needed along with A to form a sufficient condition of the fire: in our example, B might be the presence of inflammable material, C̄ the absence of a suitably placed sprinkler. Then the conjunction 'ABC̄' represents a sufficient condition of the fire, and one that contains no redundant factors; that is, ABC̄ is a minimal sufficient condition for the fire.2 Similarly, let DEF, GHI, etc., be all the other minimal sufficient conditions of this result. Now provided that there is some necessary and sufficient condition for this result, the disjunction of all the minimal sufficient conditions for it constitutes a necessary and sufficient condition.3 That is, the formula "ABC̄ or DEF or GHI or …" represents a necessary and sufficient condition for the fire, each of its disjuncts, such as 'ABC̄', represents a minimal sufficient condition, and each conjunct in each minimal sufficient condition, such as 'A', represents an inus condition. To simplify and generalize this, we can replace the conjunction of terms conjoined with 'A' (here 'BC̄') by the single term 'X', and the formula representing the disjunction of all the other minimal sufficient conditions (here "DEF or GHI or …") by the single term 'Y'. Then an inus condition is defined as follows:

A is an inus condition of a result P if and only if, for some X and for some Y, (AX or Y) is a necessary and sufficient condition of P, but A is not a sufficient condition of P and X is not a sufficient condition of P.

We can indicate this type of relation more briefly if we take the provisos for granted and replace the existentially quantified variables 'X' and 'Y' by dots. That is, we can say that A is an inus condition of P when (A … or …) is a necessary and sufficient condition of P.
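This definition can be checked mechanically. The sketch below is my own illustration, not Mackie's: the factor names for the fire example are invented, and P is stipulated to have exactly two minimal sufficient conditions. Enumerating every truth-value assignment confirms the two clauses of the definition: (AX or Y) is necessary and sufficient for P, while neither A nor X is sufficient on its own.

```python
from itertools import product

# Invented factors for the fire example:
# a: a short-circuit occurs             b: inflammable material is present
# c: a suitably placed sprinkler works  d, e, f: some alternative fire route
FACTORS = "abcdef"

def fire(v):
    """P: the house catches fire. Stipulated minimal sufficient
    conditions: (a and b and not c) or (d and e and f)."""
    return (v["a"] and v["b"] and not v["c"]) or (v["d"] and v["e"] and v["f"])

def assignments():
    """Every truth-value assignment to the factors."""
    for bits in product([False, True], repeat=len(FACTORS)):
        yield dict(zip(FACTORS, bits))

def sufficient(cond):
    """cond guarantees the fire wherever cond holds."""
    return all(fire(v) for v in assignments() if cond(v))

def necessary(cond):
    """The fire never occurs without cond."""
    return all(cond(v) for v in assignments() if fire(v))

A = lambda v: v["a"]                         # candidate inus condition
X = lambda v: v["b"] and not v["c"]          # the rest of A's conjunction
Y = lambda v: v["d"] and v["e"] and v["f"]   # the other minimal sufficient condition
AX_or_Y = lambda v: (A(v) and X(v)) or Y(v)

print(sufficient(AX_or_Y) and necessary(AX_or_Y))  # True
print(sufficient(A), sufficient(X))                # False False
```

The short-circuit alone fails to guarantee the fire (no fuel, no fire), which is exactly what makes it an insufficient but non-redundant part of the sufficient condition.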

(To forestall possible misunderstandings, I would fill out this definition as follows.4 First, there could be a set of minimal sufficient conditions of P, but no necessary conditions, not even a complex one; in such a case, A might be what Marc-Wogau calls a moment in a minimal sufficient condition, but I shall not call it an inus condition. I shall speak of an inus condition only where the disjunction of all the minimal sufficient conditions is also a necessary condition. Secondly, the definition leaves it open that the inus condition A might be a conjunct in each of the minimal sufficient conditions. If so, A would be itself a necessary condition of the result. I shall still call A an inus condition in these circumstances: it is not part of the definition of an inus condition that it should not be necessary, although in the standard cases, such as that
2 The phrase "minimal sufficient condition" is borrowed from Konrad Marc-Wogau, "On Historical Explanation," Theoria, vol. 28 (1962), pp. 213-233. This article gives an analysis of singular causal statements, with special reference to their use by historians, which is substantially equivalent to the account I am suggesting. Many further references are made to this article, especially in n. 9 below.

3 Cf. n. 8 on p. 227 of Marc-Wogau's article, where it is pointed out that in order to infer that the disjunction of all the minimal sufficient conditions will be a necessary condition, "it is necessary to presuppose that an arbitrary event C, if it occurs, must have sufficient reason to occur." This presupposition is equivalent to the presupposition that there is some (possibly complex) condition that is both necessary and sufficient for C.

It is of some interest that some common turns of speech embody this presupposition. To say "Nothing but X will do," or "Either X or Y will do, but nothing else will," is a natural way of saying that X, or the disjunction (X or Y), is a necessary condition for whatever result we have in mind. But taken literally these remarks say only that there is no sufficient condition for this result other than X, or other than (X or Y). That is, we use to mean "a necessary condition" phrases whose literal meanings would be "the only sufficient condition," or "the disjunction of all sufficient conditions." Similarly, to say that Z is "all that's needed" is a natural way of saying that Z is a sufficient condition, but taken literally this remark says that Z is the only necessary condition. But, once again, that the only necessary condition will also be a sufficient one follows only if we presuppose that some condition is both necessary and sufficient.

4 I am indebted to the referees for the suggestion that these points should be clarified.



sketched above, it is not in fact necessary.5 Thirdly, the requirement that X by itself should not be sufficient for P insures that A is a nonredundant part of the sufficient condition AX; but there is a sense in which it may not be strictly necessary or indispensable even as a part of this condition, for it may be replaceable: for example KX might be another minimal sufficient condition of P.6 Fourthly, it is part of the definition that the minimal sufficient condition, AX, of which A is a nonredundant part, is not also a necessary condition, that there is another sufficient condition Y (which may itself be a disjunction of sufficient conditions). Fifthly, and similarly, it is part of the definition that A is not by itself sufficient for P. The fourth and fifth of these points amount to this: I shall call A an inus condition only if there are terms which actually occupy the places occupied by 'X' and 'Y' in the formula for the necessary and sufficient condition. However, there may be cases where there is only one minimal sufficient condition, say AX. Again, there may be cases where A is itself a minimal sufficient condition, the disjunction of all minimal sufficient conditions being (A or Y); again, there may be cases where A itself is the only minimal sufficient condition, and is itself both necessary and sufficient for P. In any of these cases, as well as in cases where A is an inus condition, I shall say that A is at least an inus condition. As we shall see, we often have evidence which supports the conclusion that something is at least an inus condition; we may or may not have other evidence which shows that it is no more than an inus condition.

I suggest that a statement which asserts a singular causal sequence, of such a form as "A caused P," often makes, implicitly, the following claims:

(i) A is at least an inus condition of P; that is, there is a necessary and sufficient condition of P which has one of these forms: (AX or Y), (A or Y), AX, A.

(ii) A was present on the occasion in question.

(iii) The factors represented by the 'X', if any, in the formula for the necessary and sufficient condition were present on the occasion in question.

(iv) Every disjunct in 'Y' which does not contain 'A' as a conjunct was absent on the occasion in question. (As a rule, this means that whatever 'Y' represents was absent on this occasion. If 'Y' represents a single conjunction of factors, then it was absent if at least one of its conjuncts was absent; if it represents a disjunction, then it was absent if each of its disjuncts was absent. But we do not wish to exclude the possibility that 'Y' should be, or contain as a disjunct, a conjunction one of whose conjuncts is A, or to require that this conjunction should have been absent.7)

I do not suggest that this is the whole of what is meant by "A caused P" on any occasion, or even that it is a part of what is meant on every occasion: some additional and alternative parts of the meaning of such statements are indicated below.8 But I am suggesting that this is an important part of the concept of causation; the proof of this suggestion would be that in many cases the falsifying of any one of the above-mentioned claims would rebut the assertion that A caused P.

This account is in fairly close agreement, in substance if not in terminology, with at least two accounts recently offered of the cause of a single event.

Konrad Marc-Wogau sums up his account thus:

when historians in singular causal statements speak of a cause or the cause of a certain individual event β, then what they are referring to is another individual event α which is a moment in a minimal sufficient and at the same time necessary condition post factum for β.9

5 Special cases where an inus condition is also a necessary one are mentioned at the end of § 3.

6 This point, and the term "nonredundant," are taken from Michael Scriven's review of Nagel's The Structure of Science, in Review of Metaphysics, 1964. See especially the passage on p. 408 quoted below.

7 See the example of the wicket-keeper discussed below.

8 See §§ 7, 8.

9 See pp. 226-227 of the article referred to in n. 2 above. Marc-Wogau's full formulation is as follows:

"Let 'msc' stand for minimal sufficient condition and 'nc' for necessary condition. Then suppose we have a class K of individual events a₁, a₂, …, aₙ. (It seems reasonable to assume that K is finite; however even if K were infinite the reasoning below would not be affected.) My analysis of the singular causal statement: α is the cause of β, where α and β stand for individual events, can be summarily expressed in the following statements:

(1) (EK) (K = {a₁, a₂, …, aₙ});
(2) (x) (x ∈ K ≡ x msc β);
(3) (a₁ ∨ a₂ ∨ … ∨ aₙ) nc β;
(4) (x) ((x ∈ K ∧ x ≠ a₁) ⊃ x is not fulfilled when α occurs);
(5) α is a moment in a₁.

(3) and (4) say that a₁ is a necessary condition post factum for β. If a₁ is a necessary condition post factum for β, then every moment in a₁ is a necessary condition post factum for β, and therefore also α. As has been mentioned before (note 6) there is assumed to be a temporal sequence between α and β; β is not itself an element in K."



He explained his phrase "necessary condition post factum" by saying that he will call an event a₁ a necessary condition post factum for x if the disjunction "a₁ or a₂ or a₃ … or aₙ" represents a necessary condition for x, and of these disjuncts only a₁ was present on the particular occasion when x occurred.

Similarly Michael Scriven has said:

Causes are not necessary, even contingently so, they are not sufficient, but they are, to talk that language, contingently sufficient. … They are part of a set of conditions that does guarantee the outcome, and they are non-redundant in that the rest of this set (which does not include all the other conditions present) is not alone sufficient for the outcome. It is not even true that they are relatively necessary, i.e., necessary with regard to that set of conditions rather than the total circumstances of their occurrence, for there may be several possible replacements for them which happen not to be present. There remains a ghost of necessity; a cause is a factor from a set of possible factors the presence of one of which (any one) is necessary in order that a set of conditions actually present be sufficient for the effect.10

There are only slight differences between these two accounts, or between each of them and that offered above. Scriven seems to speak too strongly when he says that causes are not necessary: it is, indeed, not part of the definition of a cause of this sort that it should be necessary, but, as noted above, a cause, or an inus condition, may be necessary, either because there is only one minimal sufficient condition or because the cause is a moment in each of the minimal sufficient conditions. On the other hand, Marc-Wogau's account of a minimal sufficient condition seems too strong. He says that a minimal sufficient condition contains "only those moments relevant to the effect" and that a moment is relevant to an effect if "it is a necessary condition for β: β would not have occurred if this moment had not been present." This is less accurate than Scriven's statement that the cause only needs to be nonredundant.11 Also, Marc-Wogau's requirement, in his account of a necessary condition post factum, that only one minimal sufficient condition (the one containing α) should be present on the particular occasion, seems a little too strong. If two or more minimal sufficient conditions (say a₁ and a₂) were present, but α was a moment in each of them, then though neither a₁ nor a₂ was necessary post factum, α would be so. I shall use this phrase "necessary post factum" to include cases of this sort: that is, α is a necessary condition post factum if it is a moment in every minimal sufficient condition that was present. For example, in a cricket team the wicket-keeper is also a good batsman. He is injured during a match, and does not bat in the second innings, and the substitute wicket-keeper drops a vital catch that the original wicket-keeper would have taken. The team loses the match, but it would have won if the wicket-keeper had both batted and taken that catch. His injury was a moment in two minimal sufficient conditions for the loss of the match; either his not batting, or the catch's not being taken, would on its own have insured the loss of the match. But we can certainly say that his injury caused the loss of the match, and that it was a necessary condition post factum.

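The wicket-keeper case can be stated as a small sketch (my own illustration; the condition names are invented). On the widened usage proposed here, a moment is necessary post factum when it figures in every minimal sufficient condition that was actually present:

```python
# Each minimal sufficient condition for losing the match is modeled as
# the set of its moments; both conditions were present on this occasion.
present_mscs = [
    {"injury", "keeper_does_not_bat"},
    {"injury", "catch_dropped"},
]

def necessary_post_factum(moment, mscs):
    """True iff the moment figures in every minimal sufficient
    condition that was actually present on the occasion."""
    return bool(mscs) and all(moment in msc for msc in mscs)

print(necessary_post_factum("injury", present_mscs))         # True
print(necessary_post_factum("catch_dropped", present_mscs))  # False
```

Neither minimal sufficient condition was itself necessary post factum (the other would still have insured the loss), but the injury, as a moment in both, was.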
This account may be summed up, briefly and approximately, by saying that the statement "A caused P" often claims that A was necessary and sufficient for P in the circumstances. This description applies in the standard cases, but we have already noted that a cause is nonredundant rather than necessary even in the circumstances, and we shall see that there are special cases in which it may be neither necessary nor nonredundant.

§ 2. Difficulties and Refinements12

Both Scriven and Marc-Wogau are concerned not only with this basic account, but with certain difficulties and with the refinements and complications that are needed to overcome them. Before dealing with these I shall introduce, as a refinement of my own account, the notion of a causal field.13
10 Op. cit., p. 408.

11 However, in n. 7 on pp. 222-223, Marc-Wogau draws attention to the difficulty of giving an accurate definition of "a moment in a sufficient condition." Further complications are involved in the account given in § 5 below of "clusters" of factors and the progressive localization of a cause. A condition which is minimally sufficient in relation to one degree of analysis of factors may not be so in relation to another degree of analysis.

12 This section is something of an aside: the main argument is resumed in § 3.

13 This notion of a causal field was introduced by John Anderson. He used it, e.g., in "The Problem of Causality," first published in the Australasian Journal of Psychology and Philosophy, vol. 16 (1938), and reprinted in Studies in Empirical Philosophy (Sydney, 1962), pp. 126-136, to overcome certain difficulties and paradoxes in Mill's account of causation. I have also used this notion to deal with problems of legal and moral responsibility, in "Responsibility and Language," Australasian Journal of Philosophy, vol. 33 (1955), pp. 143-159.



This notion is most easily explained if we leave, for a time, singular causal statements and consider general ones. The question "What causes influenza?" is incomplete and partially indeterminate. It may mean "What causes influenza in human beings in general?" If so, the (full) cause that is being sought is a difference that will mark off cases in which human beings contract influenza from cases in which they do not; the causal field is then the region that is to be thus divided, human beings in general. But the question may mean, "Given that influenza viruses are present, what makes some people contract the disease whereas others do not?" Here the causal field is human beings in conditions where influenza viruses are present. In all such cases, the cause is required to differentiate, within a wider region in which the effect sometimes occurs and sometimes does not, the sub-region in which it occurs: this wider region is the causal field. This notion can now be applied to singular causal questions and statements. "What caused this man's skin cancer?"14 may mean "Why did this man develop skin cancer now when he did not develop it before?" Here the causal field is the career of this man: it is within this that we are seeking a difference between the time when skin cancer developed and times when it did not. But the same question may mean "Why did this man develop skin cancer, whereas other men who were also exposed to radiation did not?" Here the causal field is the class of men thus exposed to radiation. And what is the cause in relation to one field may not be the cause in relation to another. Exposure to a certain dose of radiation may be the cause in relation to the former field: it cannot be the cause in relation to the latter field since it is part of the description of that field, and being present throughout that field it cannot differentiate one sub-region of it from another. In relation to the latter field, the cause may be, in Scriven's terms, "some as-yet-unidentified constitutional factor."
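The relativity of cause to field in the skin-cancer example can be sketched as follows (an illustration of mine, with invented case data). A factor that is constant throughout a field cannot mark off the sub-region where the effect occurs, so it drops out of the candidate causes once the field is narrowed to the exposed men:

```python
# Invented cases: whether each man was exposed to radiation, some
# (otherwise unspecified) constitutional factor, and the effect.
cases = [
    {"exposed": True,  "constitution": "x", "cancer": True},
    {"exposed": True,  "constitution": "y", "cancer": False},
    {"exposed": False, "constitution": "x", "cancer": False},
]

def candidate_causes(field, factor_names):
    """Factors that vary within the field and so can differentiate
    the sub-region where the effect occurs from the rest of it."""
    return [f for f in factor_names if len({case[f] for case in field}) > 1]

everyone = cases
exposed_only = [c for c in cases if c["exposed"]]

print(candidate_causes(everyone, ["exposed", "constitution"]))
# ['exposed', 'constitution']
print(candidate_causes(exposed_only, ["exposed", "constitution"]))
# ['constitution']
```

Exposure is a candidate cause relative to the wider field, but relative to the field of exposed men it is part of the field's description and cannot differentiate anything within it.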

In our first example of the house which caught fire, the history of this house is the field in relation to which the experts were looking for the cause of the fire: their question was "Why did this house catch fire on this occasion, and not on others?" However, there may still be some indeterminacy in this choice of a causal field. Does this house, considered as the causal field, include all its features, or all its relatively permanent features, or only some of these? If we take all its features, or even all of its relatively permanent ones, as constituting the field, then some of the things that we have treated as conditions (for example the presence of inflammable material near the place where the short-circuit occurred) would have to be regarded as parts of the field, and we could not then take them also as conditions which in relation to this field, as additions to it or intrusions into it, are necessary or sufficient for something else. We must therefore take the house, in so far as it constitutes the causal field, as determined only in a fairly general way, by only some of its relatively permanent features, and we shall then be free to treat its other features as conditions which do not constitute the field, and are not parts of it, but which may occur within it or be added to it. It is in general an arbitrary matter whether a particular feature is regarded as a condition (that is, as a possible causal factor) or as part of the field, but it cannot be treated in both ways at once. If we are to say that something happened to this house because of, or partly because of, a certain feature, we are implying that it would still have been this house, the house in relation to which we are seeking the cause of this happening, even if it had not had this particular feature.

I now propose to modify the account given above of the claims often made by singular causal statements. A statement of such a form as "A caused P" is usually elliptical, and is to be expanded into "A caused P in relation to the field F." And then in place of the claim stated in (i) above, we require this:

(ia) A is at least an inus condition of P in the field F; that is, there is a condition which, given the presence of whatever features characterize F throughout, is necessary and sufficient for P, and which is of one of these forms: (AX or Y), (A or Y), AX, A.

In analyzing our ordinary causal statements, we must admit that the field is often taken for granted or only roughly indicated, rather than specified precisely. Nevertheless, the field in relation to which we are looking for a cause of this effect, or saying that such-and-such is a cause, may be definite enough for us to be able to say

14 These examples are borrowed from Scriven, op. cit., pp. 409-410. Scriven discusses them with reference to what he calls
a “contrast class,” the class of cases where the effect did not occur with which the case where it did occur is being contrasted.

What I call the causal field is the logical sum of the case (or cases) in which the effect is being said to be caused with what
Scriven calls the contrast class.



that certain facts or possibilities are irrelevant to the particular causal problem under consideration, because they would constitute a shift from the intended field to a different one. Thus if we are looking for the cause, or causes, of influenza, meaning its cause(s) in relation to the field human beings, we may dismiss, as not directly relevant, evidence which shows that some proposed cause fails to produce influenza in rats. If we are looking for the cause of the fire in this house, we may similarly dismiss as irrelevant the fact that a proposed cause would not have produced a fire if the house had been radically different, or had been set in a radically different environment.

This modification enables us to deal with the well-known difficulty that it is impossible, without including in the cause the whole environment, the whole prior state of the universe (and so excluding any likelihood of repetition), to find a genuinely sufficient condition, one which is "by itself, adequate to secure the effect."15 It may be hard to find even a complex condition which was absolutely sufficient for this fire because we should have to include, as one of the negative conjuncts, such an item as the earth's not being destroyed by a nuclear explosion just after the occurrence of the suggested inus condition; but it is easy and reasonable to say simply that such an explosion would, in more senses than one, take us outside the field in which we are considering this effect. That is to say, it may be not so difficult to find a condition which is sufficient in relation to the intended field. No doubt this means that causal statements may be vague, in so far as the specification of the field is vague, but this is not a serious obstacle to establishing or using them, either in science or in everyday contexts.16

It is a vital feature of the account I am suggesting that we can say that A caused P, in the sense described, without being able to specify exactly the terms represented by 'X' and 'Y' in our formula. In saying that A is at least an inus condition for P in F, one is not saying what other factors, along with A, were both present and nonredundant, and one is not saying what other minimal sufficient conditions there may be for P in F. One is not even claiming to be able to say what they are. This is in no way a difficulty: it is a readily recognizable fact about our ordinary causal statements, and one which this account explicitly and correctly reflects.17 It will be shown (in § 5 below) that this elliptical or indeterminate character of our causal statements is closely connected with some of our characteristic ways of discovering and confirming causal relationships: it is precisely for statements that are thus "gappy" or indeterminate that we can obtain fairly direct evidence from quite modest ranges of observation. On this analysis, causal statements implicitly contain existential quantifications; one can assert an existentially quantified statement without asserting any instantiation of it, and one can also have good reason for asserting an existentially quantified statement without having the information needed to support any precise instantiation of it. I can know that there is someone at the door even if the question "Who is he?" would floor me.

Marc-Wogau is concerned especially with cases where "there are two events, each of which independently of the other is a sufficient condition for another event." There are, that is to say, two minimal sufficient conditions, both of which actually occurred. For example, lightning strikes a barn in which straw is stored, and a tramp throws a burning cigarette butt into the straw


Durham Research Online

Deposited in DRO:

13 April 2016

Version of attached file:

Accepted Version

Peer-review status of attached file:


Citation for published item:

Cartwright, N. (2011) ‘Predicting ‘it will work for us’: (way) beyond statistics.’, in Causality in the sciences.
New York: Oxford University Press.

Publisher’s copyright statement:

This is a draft of a chapter that was accepted for publication by Oxford University Press in the book ‘Causality in the Sciences’, edited by Phyllis McKay Illari, Federica Russo, and Jon Williamson, and published in 2011.




Predicting ‘It Will Work for Us’: (Way) Beyond Statistics
Nancy Cartwright


1. Introduction

The topic of this paper is ‘external validity’ and its problems. The discussion will be
confined to a special class of conclusions: causal conclusions drawn from statistical
studies whose fundamental logic depends on J. S. Mill’s method of difference. These
include randomized control trials (RCTs), case control studies and cohort studies.

These kinds of studies aim to establish conclusions of the form ‘Treatment T causes
outcome O’ by finding a difference in the probability (or mean value) of O between
two groups, commonly called the ‘treatment’ and the ‘control’ groups.1 Given the
method-of-difference idea, in order for the causal conclusion to be justified the two
groups must have the same distribution of causal factors for O except T itself and its
downstream effects. The underlying supposition is that differences in probabilities
require a causal explanation; if the distribution of causes in the two groups is the
same but for T yet the probability of O differs between them, the only possible
explanation is that T causes O. The studies differ by how they go about trying to
ensure as best possible that the two study groups do have the same distribution for
causal factors other than T. There are, as we know, heated debates about the
importance of randomization in this regard, but these debates are tangential to my
concerns here.

I want to separate issues in order to focus on a question of use. Suppose, contrary to
realistic fact, that we could be completely satisfied that the two groups had identical
distributions for the other factors causally relevant to O. I shall call this an ideal Mill’s
method-of-difference study. What is the form of the conclusion that can be drawn
from that and of what use is it? In particular of what use is it in predicting whether T
will cause O, or produce an improvement in the probability or mean of O, ‘for us’ – in
a population we are concerned with, implemented as it may be implemented there?

The basic problem is that the kinds of conclusions that are properly warranted by the
method-of-difference design are conclusions confined to the population in the study.
That is seldom, indeed almost never, the population that we want to know about.2 A
difference in the probability of the outcome in this kind of study can at best establish
what I call ‘it-works-somewhere’ claims and the somewhere is never where we aim to
make further predictions. We want to know, ‘Will it work for us in our target population
as it would be implemented there?’ This question often goes under the label of an
‘effectiveness’ claim. I call it more perspicuously an ‘it-will-work-for-us’ claim. The
problem of how to move from an it-works-somewhere claim to an it-will-work-for-us
claim usually goes under the label ‘external validity’ and is loosely expressed as the
question ‘Under what conditions can the conclusion established in a study be applied
to other populations?’

In this paper I shall argue for two claims: a negative claim that external validity is the
wrong idea and a positive claim that what I call ‘capacities’ and Mill called

1 Naturally only a difference in frequency is observed. There is thus a preliminary question of statistical
inference: what probabilities to infer from the observed frequencies. I set this question aside here
because I want to focus on the issue of causal inference.
2 Even if the entire target population were enrolled in the study, predictions will be about future
effectiveness where there may be no guarantee that this population stays the same over time with
respect to the causally relevant factors.


‘tendencies’ are almost always the only right idea. The currently popular solution to
the problem of external validity from philosophers and statisticians alike is to study
the ‘invariance’ characteristics of the probability distribution that describes the
population in the study. I shall argue that external validity is the wrong way to express
the problem and invariance is a poor strategy for fixing it. Probabilistic results are
invariant under only the narrowest conditions, almost never met. What’s useful is to
establish not the invariance of the probabilistic result but the invariance of the
contribution the cause produces, where the concept of ‘contribution’ only applies
where a ‘tendency claim’ is valid. Tendencies, I shall argue, are the primary conduit
by which ‘it-works-somewhere’ claims can support that it will work for us.

This raises a serious problem that I want to stress: Reasoning involving
capacities/tendencies requires a lot more evidence and evidence of far different kinds
than we are generally instructed to consider and we lack good systematic accounts
of what this evidence can or should look like.3

In particular I shall argue:

1. We need lots more than statistics to establish tendency claims.

2. The very way tendencies operate means that building a good model to predict
effectiveness is a delicate, creative enterprise requiring a large variety of information,
at different levels of generality, from different fields and of different types.

3. Correlatively we need a large amount of varied evidence to back up the
information that informs the model.

2. What can Mill’s method of difference establish, even in the ideal?

I should begin with a couple of caveats. My discussion takes ‘ideal’ seriously. What
can be done in the real world is far from the ideal and I will not discuss how to handle
that obvious fact. I want to stress problems that we have even where some
reasonable adjustment for departures from the ideal is possible. The second caveat
is that I discuss only inferences of a narrow kind, from ‘T causes O somewhere’ to ‘T,
as T will be implemented by us, will cause O for us’. For most practical policy
purposes, inferences that start from ‘T causes O somewhere’ need to end up with
conclusions of a different form, often at best ‘T′ will cause O′ for us’,
where T′ and O′ bear some usually not very well understood relation to T and O. I
suppose here that the inferences made assume at least that T and O are fixed from
premise to conclusion, though other causal factors may be changed as a result of our
methods of implementation.4 With these caveats in place, let us turn now to the meat
of what I want to discuss.

If the conclusion that we look for in answer to the question in the title of this section is
to be a causal claim (as opposed to a merely probabilistic claim) about T and O, then
here is at least one valid conclusion that can be drawn using Mill’s methods,

3 Consider as a smattering of examples the evidence use guidelines from the U.S. Dept of Education
(2003), the Scottish Intercollegiate Guideline Network (2008), Sackett et al. (2000), Atkins et al.
(2004) or the Cabinet Office (2000).
4 Exactly what counts as changing T versus changing additional factors that were in place in the study
but are not in place in the target implementation is a little arbitrary. But drawing a rough distinction
helps make clear what additional problems still face us even if T and O are entirely fixed. (Thanks to
John Worrall for urging me to make these two caveats explicit. For more on both issues, I suggest
looking at Worrall’s many papers on these subjects. Cf. Worrall (2007) and references therein.)


supposing them applied ideally (which of course we can only hope to do
approximately and even then, we seldom are in a strong position to know whether we
have succeeded):

The treatment, T, administered as it is in the study, causes the outcome, O, in
some individuals in the study population, X.

This conclusion depends on the assumption that if there are more cases of O in the
subpopulation of X where T obtains (the ‘treatment group’) than in the subpopulation
in which it does not (the ‘control group’), then at least some individuals in the
treatment group have been caused to be O by T.

Since this conclusion depends on taking causal notions seriously and in particular on
taking the notion of singular causation5 as already given, those who are suspicious
about causation tend instead to look for mere probabilistic conclusions. The usual
one to cite is mean effect size: the mean of O in the treatment group minus the mean
of O in the control group (<O>T − <O>C).
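The quantity just defined can be computed directly. Here is a minimal sketch with invented outcome data; the function name `effect_size` is mine, not from the text:

```python
# Mean effect size: mean of O in the treatment group minus mean of O in
# the control group. All data below are invented for illustration only.

def effect_size(treated, control):
    return sum(treated) / len(treated) - sum(control) / len(control)

treated = [1, 1, 0, 1, 0, 1]  # e.g. recovery indicators with T
control = [0, 1, 0, 0, 1, 0]  # recovery indicators without T

print(effect_size(treated, control))  # about 0.33 here
```

With binary outcomes, as here, the mean effect size is just the difference in the proportion of individuals showing the outcome in each group.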

What about the external validity of this conclusion?

ESEV (effect size external validity): When will the mean difference be the
same between the study population X and a target population θ?

• ESEV Answer 1: If T makes the same difference in O for every member of
X and θ.

This however is a situation that we can expect to be very rare. Usually the effect of a
cause will be relational, depending in particular on characteristics of the systems
affected. Consider an uncontroversial case, well-known and well-understood. The
effect of gravity or of electromagnetic attraction and repulsion on the force an object
is subject to depends, for gravity, on the mass of that object, and for
electromagnetism, on the magnetic or electric charge of the affected object.

A more widely applicable answer than ESEV Answer 1 is available wherever the
probabilistic theory of causation holds. This theory supposes that the probability (in
the sense of objective chance) of an effect O is the same for any population that has
all the same causes of O and for which the causes of O all take the same value; i.e.
the probability is the same for all members of a causally homogeneous subclass.6
Loosely, ‘The probability of an effect is set once the values of all its causes are fixed’.
The set of causes of O that are supposed fixed in this assumption are those
characteristics that appear in the antecedent of a complete and correct causal law for
O.7 The probabilistic theory of causation then provides a second sufficient condition
for effect size external validity.

• ESEV Answer 2: When X and θ are the same with respect to
a) the causal laws affecting O AND
b) each ‘causally homogeneous’ subclass has the same probability in
θ as in X.

5 That is, that ‘T causes O in individual i’ is already understood. Alternatively, one could presuppose
the probabilistic theory of causality in which T causes O in a population φ that is causally
homogeneous but for T and its downstream effects just in case in φ, Prob(O/T) > Prob(O/−T). Then if
Prob(O) in the experimental population with T > Prob(O) in the experimental population with −T, we
can be assured that there is a subpopulation of X in which ‘T causes O’. (But note that if the two
probabilities are equal, we have no reason to judge that T causes O in no subpopulation rather than
that its positive effects in some cancel its negative effects in others.)
6 These probabilities will be zero or one where determinism holds but not in cases where causality can
be purely probabilistic.
7 What counts as ‘complete’ and correct here requires some care in defining; delving into this issue
takes us too far from the main topic of this paper.

Sufficiency follows from the probabilistic theory of causation. In addition, these two
are also almost necessary. When they do not hold then ESEV is an accident of the
numbers. This can be seen by constructing cases with different causal laws (hence
different subclasses that are causally homogenous) or with different probabilities for
the causally homogeneous subclasses (e.g., shifting weights between those
subclasses in which T is causally positive for O and those for which it is causally
negative or less strongly positive).8
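The kind of construction mentioned here can be made concrete in a small sketch. All subclass probabilities and weights below are invented for illustration; the point is only that with the very same within-subclass causal facts, shifting weight between the subclasses in which T is causally positive and those in which it is causally negative reverses the population-level effect size:

```python
# Two causally homogeneous subclasses, A and B, share the same causal laws:
# within each, P(O|T) and P(O|not-T) are fixed. Only the mix of A and B
# differs between study population X and target θ. Numbers are invented.

p_T   = {"A": 0.9, "B": 0.2}  # P(O | T) within each subclass
p_noT = {"A": 0.5, "B": 0.4}  # P(O | not-T) within each subclass

def overall_effect(weights):
    """Population-level effect size: weighted sum of subclass differences."""
    return sum(w * (p_T[s] - p_noT[s]) for s, w in weights.items())

study  = {"A": 0.8, "B": 0.2}  # X is mostly subclass A (T positive there)
target = {"A": 0.2, "B": 0.8}  # θ is mostly subclass B (T negative there)

print(overall_effect(study))   # positive: T raises P(O) in X overall
print(overall_effect(target))  # negative: same laws, opposite sign in θ
```

This is exactly the Simpson's-paradox-style construction the footnote points to: nothing about the causal laws changes between X and θ, only the weights of the causally homogeneous subclasses.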

These are strong conditions, and they are recognized as such by many scholars who
try to be careful about external validity. One good example appears in a debate about
the legitimacy of reanalyzing the results from RCTs on the effects on families from
disadvantaged neighbourhoods of moving to socioeconomically better
neighbourhoods. In ‘What Can We Learn about Neighborhood Effects from the
Moving to Opportunity Experiment?’9 Ludwig et al. take the purist position: They
oppose taking away lessons that the study was not designed to teach. In a section
titled ‘Internal versus External Validity’, these authors further caution —

…MTO defined its eligible sample as…[see below]. Thus MTO data…are strictly
informative only about this population subset – people residing in high-rise public
housing in the mid-1990’s, who were at least somewhat interested in moving and
sufficiently organized to take note of the opportunity and complete an
application. The MTO results should only be extrapolated to other populations if
the other families, their residential environments, and their motivations for moving
are similar to those of the MTO population.

The trouble here is that RCTs are urged in the first place because we do not know
what the other causes of the outcome are, let alone knowing that they have the same
distribution in the study population as in possible target populations. This is a fact the
authors themselves make much of in insisting that only conclusions based on the full
RCT design can be drawn. For instance, they explain –

The key problem facing nonexperimental approaches is classic omitted-variable bias.


A second problem … is our lack of knowledge of which neighborhood
characteristics matter…Suppose it is the poverty rate in a person’s apartment
building, and not in the rest of the census tract…[BUT an experimental]
mobility intervention changes an entire bundle of neighborhood
characteristics, and the total impact of changing this entire bundle…can be
estimated even if the researcher does not know which neighborhood
variables matter.

The overall lesson I want to urge from this is that effect size will seldom travel from
the study population to target populations and even when it does, we seldom have
enough background knowledge to be justified in assuming so.

8 The constructions resemble those illustrating Simpson’s paradox. Cf. Cartwright (1979); Salmon
9 Ludwig et al. (2008)


Effect size is a very precise result, however. Perhaps we would be happy with
something weaker, for instance the direction of the effect. So we should ask:

Effect direction external validity (EDEV): When will an increase (resp. decrease
or no difference) in the probability or mean of O given T in a study population X
be sufficient for an increase (resp. decrease or no difference) in a target
population θ?

There are a variety of answers that can supply sufficient conditions, including –

• EDEV Answer 1: If X and θ
a) have the same causal laws AND
b) Unanimity: T acts in the same direction with respect to O in all
causally homogeneous subpopulations.

• EDEV Answer 2: If θ has ‘the right’ subpopulations in the ‘right’
proportions.
Both these answers are still very demanding. Clearly they require a great deal of
background knowledge before we are warranted in assuming that they hold. In the
end I shall argue that there is no substitute for knowing a lot, though there will be
different kinds of things we need to know to follow the alternative route I propose –
that of exporting facts about the contributions of stable tendencies. The tendency
route is often no more epistemically demanding10 than what these answers require
for exporting effect direction or effect size, and tendencies are a far more powerful
and more widely applicable tool: tendencies can hold and be of use across a wide range
of circumstances where ESEV Answer 1 fails; they underwrite condition EDEV
1b where it holds, yet can be of use even where it fails; and they do not depend, as
EDEV 2 and ESEV 2 do, on getting the weights of various subpopulations right in
order to be a reliable tool for predicting the direction of changes in the outcome.

Let us turn then to this alternative route, which involves exporting not probabilistic
facts but causal facts. Doing so requires that we be careful in how we formulate
causal claims. In particular it is important for this purpose to distinguish three different
kinds of causal claim.

3. Three kinds of causal claim

The distinctions that matter for our discussion are those among —

1. It-works-somewhere claims: T causes O somewhere under some conditions
(e.g. in study population X administered by method M).

2. Tendency claims: T has a (relatively) stable tendency to promote O.
3. It-will-work-for-us claims: T would cause O in ‘our’ population θ administered
as it would be administered.

3.1 T causes O somewhere

10 Nor, sadly, do I think we can hope for answers that are less demanding epistemically if we want
sound and valid arguments. And that’s the point: we need to know what the premises are for a valid
argument; only then can we get on with the serious job of seeing to what degree they can be warranted.


This is just the kind of claim that method-of-difference studies can provide evidence
for; and it is important information to have. In saying this I follow, for instance, Curtis
Meinert11 when he says: ‘There is no point in worrying whether a treatment works
the same or differently in men and women until it has been shown to work in

It-works-somewhere claims are the kind of claim that medical and social sciences
work hard to establish with a reasonably high degree of certainty. But what makes
these claims evidence for effectiveness claims: T will cause O for us? I have
reviewed the standard answer: external validity. My alternative is tendency claims: T
has a (relatively) stable tendency to promote O.

3.2 T has a stable tendency to promote O

3.2.1 What are tendencies?

I have written a lot about the metaphysics, epistemology and methodology of
tendencies already.13 Here I hope to convey a sense of what they are and what they
can do with a couple of canonical examples. For instance,

• Masses have a stable tendency to attract other masses.
• Aspirins have a relatively stable tendency to relieve headaches.

The driving concept in the logic of tendencies is that of a stable contribution. A
feature, like having a mass, has a stable tendency when there is a fixed contribution
that it can be relied on to make whenever14 it is present (or properly triggered), where
contributions do not always (indeed in many areas seldom) result in the naturally
associated behaviours. The contribution from one cause can be – and often is –
offset by contributions from other features as well as by unsystematic interferences. The mass
of the earth is always pulling the pin towards it even if the pin lifts into the air because
the magnet contributes a pull upwards. What actually happens on a given occasion
will be some kind of resultant of all the contributions combining together plus any
unsystematic interferences that may occur.
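The way contributions combine into a resultant can be sketched numerically, assuming (as in the text's physics examples) that contributions along a single dimension simply add. The numbers and names below are illustrative only:

```python
# Tendency reasoning as additive contributions. Assumed rule of
# combination: simple addition, as for forces along one axis.
# Negative = pull downwards (gravity), positive = pull upwards (magnet).
# All values are invented for illustration.

def net_result(contributions, interference=0.0):
    """Resultant of all stable contributions plus any unsystematic interference."""
    return sum(contributions.values()) + interference

pin = {"gravity": -1.0, "magnet": +2.5}

print(net_result(pin))                 # 1.5: the pin rises...
print(net_result({"gravity": -1.0}))   # -1.0: ...yet gravity still pulls down
```

The point of the sketch is that gravity's fixed contribution (−1.0) is present in both cases; what changes between them is only what else is contributing to the resultant.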

Reasoning in terms of contributions is common throughout the natural and social
sciences and in daily life. Consider the California class-size reduction failure.15 Here
is a stripped down version of the widely accepted account of what went wrong.

There were well conducted RCTs in Tennessee showing that small class sizes
improved reading scores there (that is, providing evidence for an it-works-somewhere
claim). But when California cut its class sizes almost in half, little improvement in
scores resulted. That is not because there was a kind of holistic effect in Tennessee

11 Meinert is a prominent expert on clinical trial methodology and outspoken opponent of the US NIH
diversity act demanding studies of subgroups because they generally cannot be based on proper RCT
design. I agree with him about the importance of knowing it works somewhere. But my point in
this paper is that that knowledge is a tiny part of the body of evidence necessary to make reasonable
predictions about what will work for us.
12 Quoted from Epstein (2007).
13 Cf. Cartwright (1989) and (2007a).
14 Though note that some tendencies can be purely probabilistic and also the range of application can be
15 Bohrnstedt et al. (2002)


where the result depended on the special interaction among all the local factors
there. Rather, so the story goes, the positive contribution of small class size was
offset by the negative contributions of reduced teacher quality and inadequate
classroom and backup support. These latter resulted because the programme was
rolled out statewide over the course of a year. This created a demand for twice as
many teachers and twice as many classrooms that couldn’t be met without a
dramatic reduction in quality. The positive contribution of small class size was not
impugned by these results but possibly even borne out: The presumption seems to
be that scores would have been even worse had the poorer quality teaching and
accommodation been introduced without reducing class sizes as well. The reasoning
is just like that with a magnet and gravity acting together on a pin.
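The widely accepted account just summarized can be mimicked with invented numbers, again assuming simple additive combination of contributions (an assumption of this sketch, not something the text asserts for the education case):

```python
# A stripped-down, invented-numbers version of the California story:
# the positive contribution of smaller classes is offset by negative
# contributions from reduced teacher quality and classroom support.

contributions_california = {
    "smaller_classes": +4.0,    # stable positive tendency (from Tennessee)
    "teacher_quality": -2.5,    # rushed statewide rollout
    "classroom_support": -1.5,  # classroom and backup shortfalls
}

net = sum(contributions_california.values())
print(net)  # 0.0: little observed improvement, tendency not impugned

# Counterfactual: the same degraded conditions without the class-size cut
worse = net - contributions_california["smaller_classes"]
print(worse)  # -4.0: scores would have been even worse
```

On this reading, a flat observed result is compatible with the class-size tendency still making its full positive contribution, which is just the magnet-and-gravity point restated.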

Tendency claims are thus a natural conduit by which it-works-somewhere claims
come to count as evidence for it-will-work-for-us claims. It should be noted however
that a stable tendency to contribute a given result is not in any way universally
indicated by the fact that a feature like class size participates in causing that result
somewhere. Nevertheless, if a result is to be exported from a study to help predict
what happens in a new situation, it can seldom be done by any other route.

3.2.2 The big problem for tendency logic

The central problem for reasoning involving tendencies is that we do not have good
systematic accounts of what it takes to establish such claims. We have nice histories
of establishing particular claims, especially in physics, but little explicit methodology.
This contrasts, for instance, with it-works-somewhere claims. We have a variety of
well-known well-studied methods for establishing these, methods for which we have
strong principled accounts of how they are supposed to work to provide warrant for
their conclusions and of where we must be cautious about their application. Recently,
for instance, there has been a great deal of attention and debate devoted to Mill’s-
method-of-difference studies and to the advantages and disadvantages of various
methods for ensuring that the requisite conditions are met that allow them to deliver
valid conclusions. But if I am right that tendencies are the chief conduit by which
it-works-somewhere claims come to support it-will-work-for-us claims, this attention focuses
on only a very small part of the problem. For an it-works-somewhere claim is at best
a single rock in the kind of foundation needed to support a tendency claim.

So I want to plead for more systematic work to lay out the kinds of studies and types
of evidence that best support tendency claims. As best I can tell, ultimately we need a
theory to establish tendency claims, though admittedly often we will have to settle for
our best stab at the important relevant features of such a theory. That’s because
contributions come in bundles and are characterized relative to each other. We only
have good evidence that gravity is still working when the pin soars into the air
because we can ‘subtract away’ the contribution of the magnet and thus calculate
that gravity is still exerting its pull. To do that we need to have an idea both about
what other factors make what other contributions and what the appropriate rule of
composition for them is.16

Of course we most often have to proceed to make it-will-work-for-us predictions
without a well-developed theory. In that case we make our bets. My point is that we
must be clear what we are betting on and what evidence is available to back up the

16 Note though the tension here: Most advocates of RCTs like them because, they claim, no substantive
theory is required to do what they purport to do – i.e. establish an ‘it-works-somewhere’ claim.


bet, even what kind of further evidence we should be setting out to learn. Are we
betting on, and using the logic of, stable tendencies, and if so, to what extent does
our evidence back us up in this? Or are we betting on facts about identical causal
laws and correct distributions of other causal factors between study and target
populations, and if so, to what extent does our evidence support that?

3.2.3 Tendencies versus external validity

My overall message is that sometimes there are tendencies to be learned about.
Where there is a stable tendency, this provides a strong predictive tool for a very
great range of different kinds of target populations. It naturally does not tell us what
the observed result will be unless we know there are no unsystematic interferences
at work, we have good knowledge of the contributions that will be made by the other
causal factors present and we can estimate how these contributions combine, which
is very seldom the case outside the controlled environment of a physics laboratory.
But when we know a tendency claim we can make a prediction about the direction of
change. Whatever the result would have been, if the cause is added the new result
will differ by just the amount predictable from the contribution. But beware. The
comparison we can make is with what the result would have been post
implementation, just subtracting the effect of T itself. So, even restricting ourselves
just to claims about direction of change, we still have not arrived at an ‘it will work for
us’ claim, as I have characterized that.

Let us return to a comparison of tendencies versus external validity – predicting that
‘the same’ effect, either effect size or effect direction, will hold in the target as in the
study population.

• Neither can be taken for granted.
• Both require a great deal of evidence to warrant them, though of different kinds.
• With respect to effect direction:

• Stable tendencies: Post-implementation effect direction can be predicted
from knowledge that T has a stable tendency to promote O (that it makes,
say, a known contribution) without requiring knowledge of the distribution of
other causal factors in the target.

• External validity:
• Recall by contrast that under EDEV 2 the distribution of causally
homogeneous subpopulations must be ‘right’ in order for the effect
direction to be the same in the target as in the study population; and of
course, for cases in which some set of right conditions holds, it takes
considerable background knowledge of what the other causal factors are
and what the target situation is like to be warranted in assuming they do.

• ‘T has a stable tendency to promote O’ implies EDEV 1b.
• What about EDEV 1a? I have not gone into the issue of the range
across which a cause must make the same contribution in order to be
labelled as a tendency. Obviously there is no firm answer. What matters is
that there should be good reasons to back up whatever range is
presupposed in a given application. Many well-known tendencies,
however, can survive a change in the other causes that affect the same
outcome. Philosophers keen on modularity as a mark of genuine
causation often insist that this is a widespread feature, and it is often
supposed in science as well. For instance, most of us are familiar from
elementary economics with exercises to calculate what happens if the
demand laws change while the contribution to exchange from the supply
side stays fixed, and vice versa. When that’s the case, tendency
reasoning can provide predictions of effect direction that EDEV 1 cannot,
though of course the assumption that a tendency is stable across
changes in other causal laws needs good arguments to back it up.

• With respect to effect size:
• Stable tendencies. Effect size can be calculated when the contributions of all
major tendencies present in a situation are known, or reasonably
approximated, along with the appropriate rule of combination. This is typically
what we demand from an engineering design but can surely never be
supposed for social and economic policies for effects on crime, education or
public health. Various narrow medical cases are generally thought of as lying
in between these extremes.

• External validity. It is seldom the case that the target and study populations
have the same causal laws and same distribution of causal factors, and even
more rare that we should be warranted in supposing so. So if the external
validity of effect size is our primary method for learning something about
target populations from Mill’s method-of-difference studies, these studies will
be of very little use to us.

• Use of the logic of tendencies is epistemically demanding. But so is external
validity, only in different ways. Tendency knowledge, where available, can do
more than traditional external validity reasoning and is far more widely applicable.
Moreover, tendency logic is well established to work well in a variety of domains.
So it is wasteful and capricious to refuse to use this logic when evidence is
available for it. Of course often some evidence will be available


A few questions on writing essay introductions

How do you start?

Background info; meaning of terms; structure of the essay; introduce the debate related to
the question; state the aims of the essay.

Do not give conclusions or a full summary of the essay; do not give details of studies; do not say
you are going to attempt to answer!

Gap? More usually for a research study or dissertation rather than an essay.

What is the purpose of an essay introduction?

To set the scene for the reader, and let them know how you will answer the essay question –

where you are going to take them in the essay; how you will develop it. Menu or road map.

But not where they are going to get to eventually.

What are the key elements of an introduction? Explain your answer

Key – initial information for the reader – setting the scene and letting them know what will

be covered and in what order. In other words, how you plan to answer the question.

How long should it be? (eg number of paragraphs)

One or two paragraphs normally in a 3000 word essay. More than a page or so and you are

using up precious space that could be given to the main body of the assignment.

How much detail should you include? Explain your answer

Very little unless it seems crucial to the reader’s understanding of later sections.

The reader should be able to guess the title of your essay from the Introduction. If they

cannot do that, you have not been clear enough.

Key to a good essay: apply your own Theory of Mind ability. Always think about what your reader needs to know, and in what order. Do not write for the lecturer who taught you. Write for a ‘naïve reader’.


Writing Essays in Child Psychology

These are general points that any good academic research essay should follow.

1. Structure: essays should make an argument: your essay should have a point and reach a conclusion, even if tentative, and you should try to convince the reader that your point is correct. This is the single most important point in writing a good essay. It will help you make the essay well organized and well written. Clarity of thought and argument provides the necessary basis for a clear writing style. Thus, just as when making a legal case in the courtroom, you follow a logical progression, using data or evidence to support each step of your argument, until you reach a logical conclusion.

The Introduction should introduce the topic so that the reader is clear about it and where you are going to take them. Avoid going into detail about the content, but give a general indication of it and the elements that will make it up. Readers should be able to guess an essay’s title from its Introduction.

The main body of the essay should follow a logical progression so that the reader has a sense of being led through your reasoning. Avoid writing about different aspects in a random order. Write out the different aspects or sub-topics that will make up the main body on different pieces of paper. Then arrange them in the most logical order to guide you in creating a coherent line of reasoning, linking your paragraphs. Write as though for a ‘naïve’ reader – that is, someone who is not familiar with the topic. Do not write for the tutor, because you will tend to assume they already know what you are writing about! This tends to result in a very incoherent essay.

What counts as a good argument, or a solid conclusion? There is room for considerable creativity here, depending on the topic. It is easier to say what does not count as a good conclusion. For example, you should never just review a study or studies, and conclude that “more work is necessary”. More work is always necessary, and YOUR work in this essay is to reach a more substantive conclusion than that. A description of some research studies with no substantive conclusions does not make a good essay, however accurate.

2. Evidence: It is important to back up each main point that you make with evidence. Don’t just cite the study, but briefly describe its key result(s) in a sentence or two, and explain, explicitly, why this supports your point. Any statements of non-obvious fact (e.g. “toddlers engage in pretend play”, or “teachers can be attachment figures”) should be followed by a reference (even if only to a textbook).

Each claim should be cited at the appropriate place in the argument (and not repeated excessively in other less appropriate places). Avoid unnecessary study information. Don’t describe all the details of any study, only those that are relevant to the question you are addressing. But do give enough for the key studies to persuade the reader of the robustness of your point.

On the other hand, ALWAYS cite relevant data. If something is relevant to your point, you must cite it (even if it goes against your argument!). Finally, don’t say “This intervention appeared to have an effect” – either there was a recognisable effect or there was not, and the study’s discussion or conclusion will make that clear.

3. Proof: Do not say “the researchers proved that …”. Quantitative research uses statistical analysis to estimate the probability or likelihood that an observed effect was produced by the experiment or study (e.g., a reading intervention), rather than by other factors such as skewed sampling or varying test conditions. The likelihood may be very high but it is never certain. So, in general, we say “results indicated/showed that …”.

4. Critical analysis: Being critical means carefully examining the factual basis for a statement. Just criticizing a study (e.g. for only looking at one age group or using a particular methodology) does not demonstrate a critical attitude. Part of psychology as a science is separating the crucial from the incidental factors, and criticizing incidentals is worse than saying nothing at all. Being overcritical about irrelevant issues is as bad as (or maybe worse than) being uncritical.

So for example, using a technique that is not perfect, or using a single test age group, etc., are not valid targets of strong criticism. They may be points to be brought up in considering the implications of a study, but they are not “flaws”. Flaws include critical things left uncontrolled, poorly-described methods, incorrect hypotheses, vague research questions, elements overlooked that suggest the results presented do not show what the researchers claim they show. Thinking of alternatives that can explain the observed data in a different way is one of the most creative aspects of psychological research, and can be challenging and fun.

Avoid such tropes as “it is believed that” or “scientists believe”. Or (perhaps worse) “the general consensus seems to be…” (Science is not a popularity contest!) Say instead, “Frith (1996) {claimed (particularly if you don’t agree)/showed (particularly if you do)/suggested (if it is a hypothesis)} that x”. If it is you making the argument, just say “I suggest” (don’t say “this paper suggests”). Other useful words are ‘maintained that’, ‘proposed that’, ‘posited that’, ‘considered that’, ‘speculated that’.

5. Presentation: It makes a difference to the marker’s reading if your essay is well presented.

Use 12 point font and some spacing (1.5 is fine). Make it clear where paragraphs begin and end. A space is the clearest way rather than an indent.

In a 4000 word essay it is really helpful to the reader if there are sub-headings. It helps to make the overall structure clearer. It should also help you while you are writing.

Give the full essay title. As the marker is reading, they are checking back to see if you are answering the question so having the full title on the script helps. It should also help you while you are writing.

Keep sentence structure simple and clear (avoid run-on sentences). Avoid colloquial or informal expressions “a lot of”, “to try and see what is best”, or hyperbole (“this fabulous and amazing study”). Do not use technical terms unknown to a general reader without definition/explanation. Terms like “metarepresentation”, “phoneme” or “intentionality” have technical uses that need to be concisely defined.

6. References: The first time you refer to a set of authors, give all names unless there are six or more. E.g. “Baron-Cohen, Leslie and Frith (1983) conducted a study …”. If you refer to them again, you should then give just the first author’s name and et al.: “Baron-Cohen et al. further found that …”. If there are six or more authors, you can just say “Baron-Cohen et al.” from the first time you refer to their paper.

Don’t cite something you haven’t read yourself as though you had read it directly (“Sherif (1966) found that …”). Instead say: “Sherif (1966), as cited in Hogg et al.”. In the References list at the end, if you are citing something that you have not read yourself, give the entire reference, followed by (cited in Jones, 1978), and then give the full Jones (1978) reference just once in its alphabetical place in the References list:

Nesdale, D., & Flesser, D. (2001). Social identity and the development of children’s group attitudes. Child Development, 72, 506-517. (cited in Bennett & Sani, 2004).

And then, at the appropriate place in the list:

Bennett, M., & Sani, F. (2004). The Development of the Social Self. East Sussex: Psychology Press.

A few common mistakes with scientific terms/words/abbreviations:

A. The abbreviation for et alia (which means “and others” in Latin) is et al. So you write (Jones et al., 1978). Do this only if there are two or more co-authors in addition to the primary author (so Jones and Smith (1978) is never Jones et al. (1978)).

B. E.g. means “for example”. I.e. means “that is”.

C. “Data” are plural. “The data are consistent with…”. The singular term, “datum”, is rarely used – instead we discuss a single “data point”.

D. “Criterion” is singular, “criteria” are plural (thus “the criteria are”). The same holds for phenomenon (singular) vs phenomena (plural).

E. Hyperbolic terms like “huge”, “amazing”, “vast”, and “incredible”, and punctuation like “!”, very rarely have a place in scientific writing.

F. The “work” of some scientists generally refers to their life’s work or some subset of it, and thus more than a single paper. If you want to discuss one paper, call it a “study”.

G. Replication: a study is an “attempt at replication” if it (1) uses the same methods, and “a replication” if it also (2) gets the same results. If it doesn’t get the same results, it is a “failure to replicate”. If it doesn’t use the same methods (in broad outline), it is neither an attempt to replicate nor a failure to replicate – it’s just a different study.

Do not have too many quotes. Do not have overly long quotes. If you do quote, give the reference and date + page number.

Singular possessives are formed by adding ’s (a single girl’s book). The possessive of a regular plural (ending in -s) is formed by adding the apostrophe after the s: “all the girls’ books”. But irregular plurals (like person/people) just take ’s: “The People’s Party”; the children’s books.

Example References:

Journal Article: Fitch, W. T., & Hauser, M. D. (2004). Computational constraints on syntactic processing in a nonhuman primate. Science, 303, 377-380.

Book: Byrne, R. W. (1995). The Thinking Ape: Evolutionary Origins of Intelligence. Oxford: Oxford University Press.

Book Chapter: Fitch, W. T. (2005). Computation and Cognition: Four distinctions and their implications. In A. Cutler (Ed.), Twenty-First Century Psycholinguistics: Four Cornerstones (pp. 381-400). Mahwah, New Jersey: Lawrence Erlbaum.

PLEASE NOTE – References in the text must match the REFERENCES list exactly!


Evaluate the evidence that some children experience difficulties when learning, and discuss this in relation to two different developmental disorders (e.g., ADHD, Reading Disorder).