Volume 90, Number 4, 2012
Who and What Is a "Population"? Historical Debates, Current Controversies, and Implications for Understanding "Population Health" and Rectifying Health Inequities
Nancy Krieger

Harvard School of Public Health


Context: The idea of "population" is core to the population sciences but is rarely defined except in statistical terms. Yet who and what defines and makes a population has everything to do with whether population means are meaningful or meaningless, with profound implications for work on population health and health inequities.

Methods: In this article, I review the current conventional definitions of, and historical debates over, the meaning(s) of "population," trace back the contemporary emphasis on populations as statistical rather than substantive entities to Adolphe Quetelet's powerful astronomical metaphor, conceived in the 1830s, of l'homme moyen (the average man), and argue for an alternative definition of populations as relational beings. As informed by the ecosocial theory of disease distribution, I then analyze several case examples to explore the utility of critical population-informed thinking for research, knowledge, and policy involving population health and health inequities.

Findings: Four propositions emerge: (1) the meaningfulness of means depends on how meaningfully the populations are defined in relation to the inherent intrinsic and extrinsic dynamic generative relationships by which they are constituted; (2) structured chance drives population distributions of health and entails conceptualizing health and disease, including biomarkers, as embodied phenotype and health inequities as historically contingent; (3) persons included in population health research are study participants, and the casual equation of this term with "study population" should be avoided; and (4) the conventional cleavage of "internal validity" and "generalizability" is misleading, since a meaningful choice of study participants must be in relation to the range of exposures experienced (or not) in the real-world societies, that is, meaningful populations, of which they are a part.

Conclusions: To improve conceptual clarity, causal inference, and action to promote health equity, population sciences need to expand and deepen their theorizing about who and what makes populations and their means.

Keywords: epidemiology, health inequities, history, population health.

Population sciences, whether focused on people or the plenitude of other species with which we inhabit this world, rely on a remarkable, almost alchemical, feat that nevertheless now passes as commonplace: creating causal and actionable knowledge via the transmutation of data from unique individuals into population distributions, dynamics, and rates. In the case of public health, a comparison of population data—especially rates and averages of traits—sets the basis for not only elucidating etiology but also identifying and addressing health, health care, and health policy inequities manifested in differential outcomes caused by social injustice (Davis and Rowland 1983; Irwin et al. 2006; Krieger 2001, 2011; Svensson 1990; Whitehead 1992; WHO 2008, 2011).

But who are these "populations," and why should their means be meaningful? Might some instead be meaningless, the equivalent of fool's gold or, worse, dangerously misleading?

Because "population" is such a fundamental term for so many sciences that analyze population data—for example, epidemiology, demography, sociology, ecology, and population biology and population genetics, not to mention statistics and biostatistics (see, e.g., Desrosiéres 1998; Gaziano 2010; Greenhalgh 1996; Hey 2011; Kunitz 2007; Mayr 1988; Pearce 1999; Porter 1986; Ramsden 2002; Stigler 1986; Weiss and Long 2009)—presumably it would be reasonable to posit that the meaning of "population" is clear-cut and needs no further discussion.

As I document in this article, the surprise instead is that although the idea of "population" is core to the population sciences, it is rarely defined, especially in sciences dealing with people, except in abstract statistical terms. Granted, the "fuzziness" of concepts sometimes can be useful, especially when their empirical content is still being worked out, as illustrated by the well-documented contested history of the meanings of the "gene" as variously an abstract, functional, or physical entity, extending from before and still continuing well after the mid-twentieth-century discovery of DNA (Burian and Zallen 2009; Falk 2000; Keller 2000; Morange 2001). Nevertheless, such fuzziness can also be a major problem, especially if the lack of clear definition or a conflation of meanings distorts causal analysis and accountability.

In this article, I accordingly call for expanding and deepening what I term "critical population-informed thinking." Such thinking is needed to reckon with, among other things, claims of "population-based" evidence, principles for comparing results across "populations" (and their "subpopulations"), terminology regarding "study participants" (vs. "study population"), and assessing the validity (and not just the generalizability) of results. Addressing these issues requires clearly differentiating between (1) the dominant view that populations are (statistical) entities composed of component parts defined by innate attributes and (2) the alternative that I describe, in which populations are dynamic beings constituted by intrinsic relationships both among their members and with the other populations that together produce their existence and make meaningful causal inference possible.

To make my case, I review current conventional definitions of, and historical debates over, the meaning(s) of "population" and then offer case examples involving population health and health inequities. Informing my argument is the ecosocial theory of disease distribution and its focus on how people literally biologically embody their societal and ecological context, at multiple levels, across the life course and historical generations (Krieger 1994, 2001, 2011), thereby producing population patterns of health, disease, and well-being.

Who and What Is a Population?

Conventional Definitions

Who and what determines who and what counts as a "population"? Table 1 lists conventional definitions culled from several contemporary scholarly reference texts. As quickly becomes apparent, the meaning of this term has expanded over time to embrace a variety of concepts. Tracing its etymology to the word's Latin roots, the Oxford English Dictionary (OED 2010), for example, notes that "population" originally referred to the people living in (i.e., populating) a particular place, and this remains its primary meaning. Even so, as the OED's definitions also make clear, "population" has come to acquire a technical meaning. In statistics, it refers to "a (real or hypothetical) totality of objects or individuals under consideration, of which the statistical attributes may be estimated by the study of a sample or samples drawn from it." In genetics (or, really, biology more broadly), the OED defines "population" as "a group of animals, plants, or humans, within which breeding occurs." Likewise, atoms, subatomic particles, stars, and other "celestial objects" are stated as sharing certain properties allowing them to be classed together in "populations" (even though the study of inanimate objects typically falls outside the purview of the "population sciences").

TABLE 1. Definitions of "Population" from Scholarly Reference Texts

Oxford English Dictionary (OED 2010):
post-classical Latin population-, populatio population, multitude (5th cent.), colonization, settlement (11th cent.), rural settlement (13th cent.), populousness (13th cent. in a British source) < populat-, past participial stem of populare POPULATE v. 2 + classical Latin -iō -ION suffix 1.

I. General uses.

2. a. The extent to which a place is populated or inhabited; the collective inhabitants of a country, town, or other area; a body of inhabitants.
b. In extended use (chiefly applied to animals).
d. A group of people, esp. regarded as a class or subset within a larger group. Freq. with modifying word.

II. Technical uses.

4. Statistics. A (real or hypothetical) totality of objects or individuals under consideration, of which the statistical attributes may be estimated by the study of a sample or samples drawn from it.

5. Genetics. A group of animals, plants, or humans, within which breeding occurs.

6. Physics. The (number of) atoms or subatomic particles that occupy any particular energy state.

7. Astron. Any of several groups, originally two in number, into which stars and other celestial objects are categorized on the basis of where in the galaxy they were formed. Chiefly in population I n., population II n., population III n. at Compounds 2.

population biology n. the branch of biology that deals with the patterns and causes of diversity within and among populations, esp. as regards their ecology, demography, epidemiology, etc.

population genetics n. the branch of genetics that deals mathematically with the distribution of and change in gene frequencies in populations from one generation to another.

Oxford: A Dictionary of Science (Daintith and Martin 2005, 651):
population (in ecology). 1. A group of individuals of the same species within a community. The nature of a population is determined by such factors as density, sex ratio, birth and death rates, emigration, and immigration. 2. The total number of individuals of a given species or other class of organisms in a defined area, e.g., the population of rodents in Britain.

Oxford: A Dictionary of Epidemiology (Porta 2008, 187):
POPULATION. 1. All the inhabitants of a given country or area considered together; the number of inhabitants of a given country or area. 2. In sampling, the whole collection of units (the "UNIVERSE") from which a sample may be drawn; not necessarily a population of persons—the units may be institutions, records, or events. The sample is intended to give results that are representative of the whole population; it may deviate from that goal owing to random and systematic errors. See also GENERAL POPULATION.

Oxford: A Dictionary of Sociology (Scott and Marshall 2005, 504–5):
population. In its most general sense, a population comprises the totality of the people living in a particular territory (see DEMOGRAPHY), but it has a more specific meaning in statistics. In statistical terms, a population refers to the aggregate of the individuals or units from which a sample is drawn, and to which the results of any analysis are to apply—in other words the aggregate of persons or objects under investigation. It is conventional to distinguish the target population (for which the results are required) from the survey population (those actually included in the sampling frame from which the sample is drawn). For practical reasons the two are rarely identical. Even the most complete sampling frames—electoral registers, lists of addresses, or (in the United States), lists of telephone numbers—exclude sizeable categories of the population (who fail to register to vote, are homeless, or do not own a telephone). Researchers may sometimes deliberately exclude members of the target population from the survey population. For example, it is standard practice to exclude the area north of the Caledonian Canal from the sampling frame for national sample surveys in Great Britain, on the grounds that the Northern Highlands are so thinly populated that interviews in this area would be unacceptably expensive to obtain. However, for most sociological purposes, this particular gap between the target and survey populations is not deemed to be significant—although, in a survey of 'attitudes to public transportation in thinly populated areas,' it would clearly be problematic. See also STATISTICAL INFERENCE.

International Encyclopedia of the Social & Behavioral Sciences: entry on "Human evolutionary genetics" (Mountain 2001, 6985):
Essential to the practice of human evolutionary genetics are definitions of the terms 'population' and 'group.' The less precisely defined term, 'group,' is used here to mean any collection of individuals. In a theoretical framework, the term 'population' is defined very precisely, as a set of individuals constituting a mating pool. All individuals of the appropriate sex in the population are considered to be equally available as potential mates. Groups of humans rarely, if ever, fit this definition of a population. The boundary between one population and another is obscure. In practice, therefore, human evolutionary geneticists delineate populations along linguistic, geographic, sociopolitical, and/or cultural boundaries. A population might include, for example, all speakers of a particular Bantu language, all inhabitants of a river valley in Italy, or all members of a caste group in India.

International Encyclopedia of the Social & Behavioral Sciences: entry on "Generalization: conceptions in the social sciences" (Cook 2001, 6038):
...many social scientists use universe (and population) differently from construct.... In statistics, populations are ostensive; their elements are real and can be pointed to. But constructs are hypothetical and more obviously theory dependent. Moreover, the formal methods statisticians prefer when sampling elements from a population cannot be used with constructs, because the necessary enumeration and sampling of elements cannot be readily achieved with measures of abstract constructs. This is why Cook and Campbell (1979) used external validity to refer to people and settings and construct validity to refer to instances of more hypothetical causes and effects. However, the distinction is partly arbitrary. Constructs have constitutive elements theoretically specified as their components, and instances of any one construct vary in which components they incorporate. Moreover, human populations are not totally ostensive per se. Despite official definitions, there is still room to disagree about what being an Australian means: what about someone with an Australian passport who has always lived abroad, or the illegal immigrant who has always lived in Australia without a passport?


Mirroring the OED's definitions are those provided in diverse "population sciences" dictionaries and encyclopedias. Four such texts, whose definitions are echoed in key works in population health (Evans, Barer, and Marmor 1994; Rose 1992, 2008; Rothman, Greenland, and Lash 2008; Young 2005), are worth noting: A Dictionary of Epidemiology (Porta 2008), A Dictionary of Sociology (Scott and Marshall 2005), and the two entries from the International Encyclopedia of the Social & Behavioral Sciences that offer a definition of "population," one focused on "human evolutionary genetics" (Mountain 2001) and the other on "generalization: conceptions in the social sciences" (Cook 2001). A fifth resource, the Encyclopedia of Life Sciences, interestingly does not include any articles specifically on defining "population." However, of the 396 entries located with the search term "population" and sorted by "relevance," the first 25 focus on populations principally in relation to genetics, reproduction, and natural selection (Clarke et al. 2000–2011).

Among these four texts, all germane to population sciences that study people, the first two briefly define "population" in relation to inhabitants of an area but notably remain mum on the myriad populations appearing in the public health literature not linked to geographic locale (e.g., the "elderly population," the "white population," or the "lesbian/gay/bisexual/transgender population"). Most of their text is instead devoted to the idea of "population" in relation to statistical sampling (Porta 2008; Scott and Marshall 2005). By contrast, the third text invokes biology (with no mention of statistics) and defines a "population" to be a "mating pool" (Mountain 2001, 6985), albeit observing that "groups of humans rarely, if ever, fit this definition," so that "in practice ... human evolutionary geneticists delineate populations along linguistic, geographic, sociopolitical, and/or cultural boundaries. A population might include, for example, all speakers of a particular Bantu language, all inhabitants of a river valley in Italy, or all members of a caste group in India."

The fourth text avers that in the social sciences, "population" has two meanings: as a theory-dependent hypothetical "construct" (whose basis is not defined) and as an empirically defined "universe" (used as a sampling frame) (Cook 2001). A telling example illustrates that for people, geographical location, nationality, and ancestry need not neatly match, as in the case of an illegal immigrant or a legal citizen of one country legally residing in a different country (table 1). Consequently, apart from specifying that entities comprising a population individually possess some attribute qualifying them to be a member of that population, none of the conventional definitions offers systematic criteria by which to decide, in theoretical or practical terms, who and what is a population, let alone whether and, if so, why their mean value or rate (or any statistical parameter) might have any substantive meaning.

Meet the "Average Man": Quetelet's 1830s Astronomical Metaphor Amalgamating "Population" and "Statistics"

The overarching emphasis on "populations" as technical statistical entities and the limited discussion as to what defines them, especially for human populations, is at once remarkable and unsurprising. It is remarkable because "population" stands at the core, conceptually and empirically, of any and all population sciences. It is unsurprising, given the history and politics of how, in the case of people, "population" and "sample" first were joined (Krieger 2011).

In brief, and as recounted by numerous historians of statistics (Daston 1987; Desrosiéres 1998; Hacking 1975, 1990; Porter 1981, 1986, 1995, 2002, 2003; Stigler 1986, 2002; Yeo 2003), during the early 1800s the application of quantitative methods and laws of probability to the study of people in Europe took off, a feat that required reckoning with such profound issues as free will, God's will, and human fate. To express the mind shift involved, a particularly powerful metaphor took root: that of "l'homme moyen" (the average man), which, in the convention of the day, included women (figure 1). First used in 1831 in an address given by Adolphe Quetelet (1796–1874), the Belgian astronomer-turned-statistician-turned-sociologist-turned-nosologist (Hankins 1968; Stigler 2002), the metaphor gained prominence following the publication in 1835 of Quetelet's enormously influential opus, Sur l'homme et le développement de ses facultés, ou essai de physique sociale (Quetelet 1835). Melding the ideas of essential types, external influences, and random errors, the image of the "average man" solidified a view of populations, particularly human populations, as innately defined by their intrinsic qualities. Revealing these innate qualities, according to Quetelet, were a population's on-average traits, whether pertaining to height and weight, birth and death rates, intellectual faculties, moral properties, or even propensity to commit crime (Quetelet 1835, 1844).

FIGURE 1. What is the meaning of means and errors?—Adolphe Quetelet (1796–1874) and the astronomical metaphor animating his 1830s "l'homme moyen" ("the average man").

Source: Illustration of normal curve from Quetelet 1844.

The metaphor animating Quetelet's "average man" was inspired by his background in astronomy and meteorology. Shifting his gaze from the heavens to the earth, Quetelet arrived at his idea of "the average man" by inverting the standard approach his colleagues used to fix the location of stars, in which the results of observations from multiple observatories (each with some degree of error) were combined to determine a star's most likely celestial coordinates (Porter 1981; Stigler 1986, 2002). Reasoning by analogy, Quetelet ingeniously, if erroneously, argued that the distribution of a population's characteristics served as a guide to its true (inherent) value (Quetelet 1835, 1844). From this standpoint, the observed "deviations" or "errors" arose from the imperfect variations of individuals, each counting as an "observation-with-error" akin to the data produced by each observatory. The impact of these "errors" was effectively washed out by the law of large numbers. Attesting to the power of metaphor in science and more generally (Krieger 1994, 2011; Martin and Harré 1982; Ziman 2000), Quetelet's astronomical "average man" simultaneously enabled a new way to see and study population variation even as it erased a crucial distinction. For a star, the location of the mean referred to the location of a singular real object, whereas for a population, the location of its population mean depended on how the population was defined.
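The astronomical half of this analogy can be sketched numerically. The minimal Python simulation below is an illustration only, not Quetelet's own procedure: it assumes a hypothetical star whose true coordinate is 42.0 (arbitrary units) and observatories whose readings carry independent Gaussian error with a standard deviation of 1.0. All names and values are invented for the example.

```python
import random
import statistics

TRUE_POSITION = 42.0   # hypothetical "true" coordinate of a single star (assumed)
ERROR_SD = 1.0         # assumed spread of each observatory's measurement error


def observe_star(n_observations, seed=0):
    """Simulate n noisy readings of the same, single real object."""
    rng = random.Random(seed)
    return [rng.gauss(TRUE_POSITION, ERROR_SD) for _ in range(n_observations)]


def mean_error(n_observations, seed=0):
    """Absolute gap between the average reading and the true position."""
    readings = observe_star(n_observations, seed)
    return abs(statistics.fmean(readings) - TRUE_POSITION)


# The gap tends to shrink as readings accumulate (law of large numbers).
for n in (10, 1_000, 100_000):
    print(n, round(mean_error(n), 4))
```

Averaging works here because every reading targets the same fixed object, whose position is set in advance by `TRUE_POSITION`. A human population supplies no comparable constant: the quantity being "estimated" changes with the decision about who is counted as belonging to the population.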

To Quetelet, this new conception of population meant that population means, based on sufficiently large samples, could be meaningfully compared to determine if the populations' essential characteristics truly differed. The contingent causal inference was that if the specified populations differed in their means, this would mean that they either differed in their essence (if subject to the same external forces) or else were subject to different external forces (assuming the same internal essence). Reflecting, however, the growing pressure for nascent social scientists to be seen as "objective," Quetelet's discussion of external forces steered clear of politics. Concretely, this translated to not challenging mainstream religious or economic beliefs, including the increasingly widespread individualistic philosophies then linked to the rapid ascendance of the liberal free-market economy (Desrosiéres 1998; Hacking 1990; Heilbron, Magnusson, and Wittrock 1998; Porter 1981, 1986, 1995, 2003; Ross 2003). For example, although Quetelet conceded that "the laws and principles of religion and morality" could act as "influencing causes" (Quetelet 1844, xvii), in his analyses he treated education, occupation, and the propensity to commit crime as individual attributes no different from height and weight. The net result was that a population's essence—crucial to its success or failure—was conceptualized as an intrinsic property of the individuals who comprised the population; the corollary was that population means and rates were a result and an expression of innate individual characteristics.

Or so the argument went. At the time, others were not convinced and contended that Quetelet's means were simply arbitrary arithmetic contrivances resulting from declaring certain groups to be populations (Cole 2000; Desrosiéres 1998; Porter 1981; Stigler 1986, 2002). As Quetelet himself acknowledged, the national averages and rates defining a country's "average man" coexisted with substantial regional and local variation. Hence, data for one region of France would yield one mean, and for another region it would be something else. If the two were combined, a third mean would result—and who was to say which, if any, of these means was meaningful, let alone reflective of an intrinsic essence (or, for that matter, external influences)?
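The arithmetic behind this objection is simple to sketch. In the minimal Python example below, the two regions, their sample sizes, and their mean heights are hypothetical numbers chosen only for illustration; they are not Quetelet's data.

```python
def pooled_mean(groups):
    """Mean of the combined group: each group's mean weighted by its size.

    `groups` is a list of (n, mean) pairs.
    """
    total_n = sum(n for n, _ in groups)
    return sum(n * mean for n, mean in groups) / total_n


# Hypothetical regions: (number of people measured, mean height in cm)
region_a = (1_000, 165.0)
region_b = (3_000, 169.0)

combined = pooled_mean([region_a, region_b])
print(combined)  # 168.0

# Same regional means, but with the sample sizes swapped
swapped = pooled_mean([(3_000, 165.0), (1_000, 169.0)])
print(swapped)  # 166.0
```

The combined mean matches neither regional mean, and it moves when the proportions change even though nothing about the people measured has changed. Its value therefore reflects the decision about which groups to combine, and in what proportions, at least as much as any property of the individuals themselves.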

Quetelet's tautological answer was to differentiate between what he termed "true means" and mere "arithmetical averages" (Porter 1981; Quetelet 1844). The former could be derived only from "true" populations, whose distribution by definition expressed the "law of errors" (e.g., the normal curve). In such cases, Quetelet argued, the mean reflected the population's true essence. By contrast, any disparate lot of objects measured by a common metric could yield a simple "average" (e.g., average height of books or of buildings), but the meaningless nature of this parameter, that is, its inability to be informative about any innate "essence," would be revealed by the lack of a normal distribution.

And so the argument continued until the terms were changed in a radically different way by Darwin's theory of evolution, presented in Origin of Species, published in 1859 (Darwin [1859] 2004). The central conceptual shift was from "errors" to "variation" (Eldredge 2005; Hey 2011; Hodge 2009; Mayr 1988). This variation, thought to reflect inheritable characteristics passed on from parent to progeny, was in effect a consequence of who survived to reproduce, courtesy of "natural selection." No longer were species, that is, the evolving biological populations to which these individuals belonged, either arbitrary or constant. Instead, they were produced by reproducing organisms and their broader ecosystem. Far from being either Platonic "ideal types" (Hey 2011; Hodge 2009; Mayr 1988; Weiss and Long 2009), per Quetelet's notion of fixed essence plus error, or artificially assembled aggregates capable of yielding only what Quetelet would term meaningless mere "averages," "populations" were newly morphed into temporally dynamic and mutable entities arising by biological descent. From this standpoint, variation was vital, and variants that were rare at one point in time could become the new norm at another.

Nevertheless, even though the essence of biological populations was now impermanent, what substantively defined "populations" remained framed as fundamentally endogenous. In the case of biological organisms, this essence resided in whatever material substances were transmitted by biological reproduction. Left intact was an understanding of population, population traits, and their variability as innately defined, with this variation rendered visible through a statistical analysis of appropriate population samples. The enduring result was to (1) collapse the distinctions between populations as substantive beings versus statistical objects and (2) imply that population characteristics reflect and are determined by the intrinsic essence of their component parts. Current conventional definitions of "population" say as much and no more (table 1).

Conceptual Criteria for Defining Meaningful Populations for Public Health

Framing and Contesting "Population" through an Epidemiologic Lens. In the 150 years since these initial features of populations were propounded, they have become deeply entrenched, although not entirely uncontested. Figure 2 is a schematic encapsulation of mid-nineteenth to early twentieth-century notions of populations, with the entries emphasizing population statistics and population genetics because of their enduring influence, even now, on conceptions of populations in epidemiology and other population sciences. During this period, myriad disciplines in the life, social, and physical sciences embraced a statistical understanding of "population" (Desrosiéres 1998; Hey 2011; Porter 1981, 1986, 2002, 2003; Ross 2003; Schank and Twardy 2009; Yeo 2003). Eugenic thinking likewise became ascendant, espoused by leading scientists and statisticians, especially the newly named "biometricians," who held that individuals and populations were determined and defined by their heredity, with the role of the "environment" being negligible or nil (Carlson 2001; Davenport 1911; Galton 1904; Kevles 1985; Mackenzie 1982; Porter 2003; Tabery 2008).

It was also during the early twentieth century that the nascent academic discipline of epidemiology advanced its claims about being a population science, as part of distinguishing both the knowledge it generated and its methods from those used in the clinical and basic sciences (Krieger 2000, 2011; Lilienfeld 1980; Rosen [1958] 1993; Susser and Stein 2009; Winslow et al. 1952). In 1927 and in 1935, for example, the first professors of epidemiology in the United States and the United Kingdom—Wade Hampton Frost (1880–1938) at the Johns Hopkins School of Hygiene and Public Health in 1921 (Daniel 2004; Fee 1987), and Major Greenwood (1880–1949) at the London School of Hygiene and Tropical Medicine in 1928 (Butler 1949; Hogben 1950)—urged that epidemiology clearly define itself as the science of the "mass phenomena" of disease, Frost in his landmark essay "Epidemiology" (Frost [1927] 1941, 439) and Greenwood in his discipline-defining book Epidemics and Crowd Diseases: An Introduction to the Study of Epidemiology (Greenwood 1935, 125). Neither Frost nor Greenwood, however, articulated what constituted a "population," other than the large numbers required to make a "mass."

FIGURE 2. A schematic cross-disciplinary genealogy of mid-nineteenth to early twentieth-century "population" thinking and current impact.

Sources: Carver 2003; Crow 1990, 1994; Dale and Katz 2011; Darwin 1859; Daston 1987; Desrosiéres 1998; Eldredge 2005; Galton 1889, 1904; Hacking 1975, 1990; Hey 2011; Hodge 2009; Hogben 1933; Keller 2010; Mackenzie 1982; Marx 1845; Mayr 1988; Porter 1981, 1986, 2002, 2003; Quetelet 1835, 1844; Sarkar 1996; Schank and Twardy 2009; Stigler 1986, 1997; Tabery 2008; Yeo 2003.

Also during the 1920s and 1930s, two small strands of epidemiologic work—each addressing different aspects of the inherent dual engagement of epidemiology with biological and societal phenomena (Krieger 1994, 2001, 2011)—began to challenge empirically and conceptually the dominant view of population characteristics as arising solely from individuals' intrinsic properties. The first thread was metaphorically inspired by chemistry's law of "mass action," which holds that the likelihood of two chemicals meeting and interacting in, say, a beaker is proportional to the product of their spatial densities (Heesterbeek 2005; Mendelsohn 1998). Applied to epidemiology, the law of "mass action" spurred novel efforts to model infectious disease dynamics arising from interactions between what were termed the "host" and the "microbial" populations, taking into account changes in the host's characteristics (e.g., from susceptible to either immune or dead) and also the population size, density, and migration patterns (Frost [1928] 1976; Heesterbeek 2005; Hogben 1950; Kermack and McKendrick 1927; Mendelsohn 1998).
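The mass-action idea can be illustrated with a short simulation. The Python sketch below is a discrete-time simplification in the spirit of the models cited, not a reproduction of any of them. It assumes a closed population with no migration, lets infected hosts stand in for the microbial population, and uses hypothetical parameter values (`beta` for the contact-and-transmission rate, `gamma` for the removal rate).

```python
def simulate_sir(susceptible, infected, removed, beta, gamma, days, dt=0.1):
    """Step a basic susceptible-infected-removed model forward in time.

    New infections follow mass action: beta * S * I / N, that is,
    proportional to the product of the two groups' densities.
    Returns a list of (S, I, R) tuples, one per time step.
    """
    n = susceptible + infected + removed
    s, i, r = float(susceptible), float(infected), float(removed)
    history = [(s, i, r)]
    for _ in range(round(days / dt)):
        new_infections = beta * s * i / n * dt   # mass-action term
        new_removals = gamma * i * dt            # infected become immune or die
        s -= new_infections
        i += new_infections - new_removals
        r += new_removals
        history.append((s, i, r))
    return history


# Hypothetical outbreak: 1 infected person among 1,000, illustrative rates
history = simulate_sir(999, 1, 0, beta=0.3, gamma=0.1, days=160)
peak_infected = max(i for _, i, _ in history)
print(round(peak_infected), round(history[-1][0]))
```

The relevant feature is that the number of new infections at each step depends on the state of the whole population (how many are susceptible and how many are infected), not on any fixed attribute of an individual. Setting `beta` to zero removes the interaction and leaves the susceptible count unchanged.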

The second thread was articulated in debates concerning eugenics and also in response to the social crises and economic depression precipitated by the 1929 stock market crash. Its focus concerned how societal conditions could drive disease rates, not only by changing individuals' economic position, but also through competing interests. Explicitly stating this latter point was the 1933 monograph Health and Environment (Sydenstricker 1933), prepared for the U.S. President's Research Committee on Social Trends by Edgar Sydenstricker (1881–1936), a leading health researcher and the first statistician to serve in the U.S. Public Health Service (Krieger 2011; Krieger and Fee 1996; Wiehl 1974). In this landmark text, which explicitly delineated diverse aspects of what he termed the "social environment" alongside the physical environment, Sydenstricker argued (1933, 16, italics in original):

Economic factors in the conservation or waste of health, for example, are not merely the rate of wages; the hours of labor; the hazard of accident, of poisonous substances, or of deleterious dusts; they include also the attitude consciously taken with respect to the question of the relative importance of large capitalistic profits versus maintenance of the workers' welfare.

In other words, social relations, not just individual traits, shape population distributions of health.

Influenced by and building on both Greenwood's and Sydenstricker's work, in 1957 Jeremy Morris (1910–2009) published his highly influential and pathbreaking book Uses of Epidemiology (Morris 1957), which remains a classic to this day (Davey Smith and Morris 2004; Krieger 2007a; Smith 2001). Going beyond Frost and Greenwood, Morris emphasized that "the unit of study in epidemiology is the population or group, not the individual" (Morris 1957, 3, italics in original) and also went further by newly defining epidemiology in relational terms, as "the study of health and disease of populations and of groups in relation to their environment and ways of living" (Morris 1957, 16, italics in original). As a step toward defining "population," Morris noted that "the 'population' may be of a whole country or any particular and defined sector of it" (Morris 1957, 3), as delimited by people's "environment, their living conditions, and special ways of life" (Morris 1957, 61). He also, however, recognized that better theorizing about populations was needed and hence called for a greater "understanding of the properties of individuals which they have in virtue of their group membership" (Morris 1957, 120, italics in original). But this appeal went largely unheeded, as it directly contradicted the era's prevailing framework of methodological individualism (Isaac 2007; Krieger 2011; Ross 2003).

Morris's insights notwithstanding, the dominant view has remained what is presented in table 1. Even the recent influential work of Geoffrey Rose (1926–1993), crucial to reframing individual risk in population terms, theorized populations primarily in relation to their distributional, not substantive, properties (Rose 1985, 1992, 2008). Rose's illuminating analyses thus emphasized that (1) within a population, most cases arise from the proportionately greater number of persons at relatively low risk, as opposed to the much smaller number of persons at high risk; (2) determinants of risk within populations may not be the same as determinants of risk between populations; and (3) population norms shape where both the tails and the mean of a distribution occur. Rose thus cogently clarified that to change populations is to change individuals, and vice versa, implying that the two are mutually constitutive, but he left unspecified who and what makes meaningful populations and when they can be meaningfully compared.
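Rose's first point can be made concrete with simple arithmetic. The sketch below uses entirely hypothetical numbers (a population of 10,000 persons, of whom 9,000 are at a low risk of 20 per 1,000 and 1,000 are at a high risk of 100 per 1,000); these figures are assumptions chosen for illustration and are not drawn from Rose's analyses. Under these assumptions, the low-risk group yields 180 expected cases and the high-risk group 100, so most cases arise among persons at relatively low risk.

```python
# Hypothetical illustration of Rose's first point; every number here is invented.

def expected_cases(group_sizes, risk_per_1000):
    """Return expected cases per group, given integer risks per 1,000 persons."""
    return {
        group: group_sizes[group] * risk_per_1000[group] // 1000
        for group in group_sizes
    }

# Assumed population: many persons at low risk, few at high risk.
group_sizes = {"low_risk": 9000, "high_risk": 1000}
risk_per_1000 = {"low_risk": 20, "high_risk": 100}

cases = expected_cases(group_sizes, risk_per_1000)
total_cases = sum(cases.values())
share_from_low_risk = cases["low_risk"] / total_cases

print(cases)
print(f"Share of cases arising in the low-risk group: {share_from_low_risk:.0%}")
```

The calculation says nothing about who and what makes the population in question, which is precisely the gap noted above.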

Current Challenges to Conventional Views of "Population." A new wave of work contesting the still reigning idea of "the average man" can currently be found in recent theoretical and empirical work in the social and biological sciences attempting to analyze population phenomena in relation to dynamic causal processes that encompass multiple levels and scales, from macro to micro (Biersack and Greenberg 2006; Eldredge 1999; Eldredge and Grene 1992; Gilbert and Epel 2009; Grene and Depew 2004; Harraway 2008; Illari, Russo, and Williamson 2011; Krieger 2011; Lewontin 2000; Turner 2005). Also germane is research on system properties in the physical and information sciences (Kuhlmann 2011; Mitchell 2009; Strevens 2003).

Applicable to the question of who and what makes a population, one major focus of this alternative thinking is on processes that generate, maintain, transform, and lead to the demise of complex entities. This perspective builds on and extends a long history of critiques of reductionism (Grene and Depew 2004; Harré 2001; Illari, Russo, and Williamson 2011; Lewontin 2000; Turner 2005; Ziman 2000), which together aver that properties of a complex "whole" cannot be reduced to, and explained solely by, the properties of its component "parts." The basic two-part argument is that (a) new (emergent) properties can arise out of the interaction of the "parts" and (b) properties of the "whole" can transform the properties of their parts. Thus, to use one well-known example, a brain can think in ways that a neuron cannot. Taking this further in regard to the generative causal processes at play, what a brain thinks can affect neuron connections within the brain, and it also is affected by the ecological context and experiences of the organism, of which the brain is a part (Fox, Levitt, and Nelson 2010; Gibson 1986; Harré 2001; Stanley, Phelps, and Banaji 2008). The larger claim is that the causal processes that give rise to complex entities can both structure and transform the characteristics of both the whole and its parts.

What might it look like for public health to bring this alternative perspective to the question of defining, substantively, who and what makes a population? Let me start with a conceptual answer, followed by some concrete public health propositions and examples.

Populations as Relational Beings: An Alternative Causal Conceptualization

In brief, I argue that a working definition of "populations" for public health (or any field concerned with living organisms) would, in line with Sydenstricker (1933) and Morris (1957) and the other contemporary theorists just cited, stipulate that populations are first and foremost relational beings, not "things." They are active agents, not simply statistical aggregates characterized by distributions.

Specifically, as tables 2 and 3 show, the substantive populations that populate our planet

  1. Are animate, self-replicating, and bounded complex entities, generated by systemic causal processes.
  2. Arise from and are constituted by relationships of varying strengths, both externally (with and as bounded by other populations) and internally (among their component beings).
  3. Are inherently constituted by, and simultaneously influence the characteristics of, the varied individuals who are their members and their population-defined and -defining relationships.

It is these relationships and their underlying causal processes (both deterministic and probabilistic), not simply random samples derived from large numbers, that make it possible to make meaningful substantive and statistical inferences about population characteristics, as well as meaningful causal inferences about observed associations.

Accordingly, as summarized by Richard A. Richards, a philosopher of biology (who was writing about species, one type of population), populations have "well-defined beginnings and endings, and cohesion and causal integration" (Richards 2001). They likewise necessarily exhibit historically contingent distributions in time and space, by virtue of the dynamic interactions intrinsically occurring between (and within) their unique individuals and with other equally dynamic codefining populations and also their changing abiotic environs. Underscoring this point, even a population of organisms cloned from a single source organism will exhibit variation and distributions as illustrated by the phenomenon of developmental "noise," an idea presaged by early twentieth-century observations of chance differences in coat color among litter mates of pure-bred populations raised in identical circumstances (Davey Smith 2011; Lewontin 2000; Wright 1920).

TABLE 2. Conceptual Criteria for Defining Meaningful Populations for Population Sciences, Guided by the Ecosocial Theory of Disease Distribution

Source: Krieger 1994, 2001, and 2011, 214–15.

As for the inherent relationships characterizing populations, both internally and externally, I suggest that four key types stand out, as informed by the ecosocial theory of disease distribution (Krieger 1994, 2001, 2011); the collaborative writing of Niles Eldredge, an evolutionary biologist, and Marjorie Grene, a philosopher of biology (Eldredge and Grene 1992); as well as works from political sociology, political ecology, and political geography (Biersack and Greenberg 2006; Harvey 1996; Nash and Scott 2001). As tables 2 and 3 summarize, these four kinds of relationships are (1) genealogical, that is, relationships by biological descent; (2) internal and economical, in the original sense of the term, referring to relationships essential to the daily activities of whatever is involved in maintaining life (in ancient Greece, oikos, the root of the "eco" in both "ecology" and "economics," referred to a "household," conceptualized in relation to the activities and interactions required for its existence [OED 2010]); (3) external and ecological, referring to relationships between populations and with the environs they coinhabit; and (4) in the case of people (and likely other species as well), teleological, that is, by design, with some conscious purpose in mind (e.g., citizenship criteria). Spanning from mutually beneficial (e.g., symbiotic) to exploitative (benefiting one population at the expense of the other), these relationships together causally shape the characteristics of populations and their members.

What are some concrete examples of animate populations that exemplify these points? Table 3 provides four examples. Two pertain to human populations: the "U.S. population" (Foner 1997; Zinn 2003) and "social classes" (Giddens and Held 1982; Wright 2005). The third considers microbial populations within humans (Dominguez-Bello and Blaser 2011; Pflughoeft and Versalovic 2012; Walter and Ley 2011), and the fourth concerns a plant population, a species of tree, the poplar, whose genus name (Populus) derives from the same Latin root as "population" (Braatne, Rood, and Heillman 1996; Fergus 2005; Frost et al. 2007; Jansson and Douglas 2007). Together, these examples clarify what binds—as well as distinguishes—each of these dynamic populations and their component individuals. They likewise underscore that contrary to common usage, "population" and "individual" are not antonyms. Instead, they hark back to the original meaning of "individual"—that is, "individuum," or what is indivisible, referring to the smallest unit that retained the properties of the whole to which it intrinsically belonged (OED 2010; Williams 1985). Thus, although it is analytically possible to distinguish between "populations" and "individuals," in reality these phenomena occur and are lived simultaneously. A person is not an individual on one day and a member of a population on another. Rather, we are both, simultaneously. This joint fact is fundamental and is essential to keep in mind if analysis of either individual or population phenomena is to be valid.

TABLE 3. Defining Features of Populations of Living Beings, Including Humans, Relevant to Public Health and Population Sciences

Columns: Example; Boundaries; Individuals; and four Intrinsic (Constitutive) Relationships (Internal and External): Genealogical; Internal and Economical (relationships among individuals in the population); External and Ecological (with other populations); and Teleological (for humans and possibly some other species).

Example: Human beings: U.S. population

  Boundaries: Political and geographic, i.e., nation-state with citizenship criteria established by politics and territory; although "cultural" boundaries also exist, they are predicated on nationality.

  Individuals: Individual persons, in legally defined groups demarcated by historically contingent citizenship status: (a) U.S. free nonindigenous citizens[a]; (b) U.S. indigenous citizens[b] (who may have legally defined dual citizenship with sovereign tribal nations); and (c) noncitizens: legal "permanent residents" (and "permanent aliens"[c]), legally defined refugees, and undocumented persons.

  Genealogical: Direct genealogy: U.S. citizenship by being born to U.S. citizens (jus sanguinis[d]); citizenship by place of birth (jus soli, for persons not otherwise born to U.S. citizens) can become genealogical citizenship for subsequent generations.

  Internal and economical: As in any polity (political-geographic entity), the economic, legal, political, and social relationships in the United States between individuals that produce, reproduce, and transform the daily conditions of life (e.g., involving work, commerce, property, and the production, exchange, and consumption of material goods; establishing and maintaining family life from birth to death); which individuals are legally permitted to engage in these relationships is historically contingent (e.g., banning of child labor in the early 20th century; legal racial discrimination in employment and housing until the mid-1960s; current legal restriction of marriage to heterosexual couples in most U.S. states).

  External and ecological: U.S. foreign and domestic policy, along with international treaties the United States has signed, shape political, territorial, legal, social, economic, cultural, and ecosystem relationships both (a) between the U.S. population and populations elsewhere in the world (including who is and is not allowed to immigrate, cf. the 1882 Chinese Exclusion Act and the 1924 immigration restriction act)[c] and (b) within the United States.

  Teleological: U.S. domestic and foreign policy sets parameters of who counts as the U.S. population and the conditions in which the U.S. population (and its component groups) lives.

Example: Human beings: Social classes

  Boundaries: Economic, political, and legal, set by rules and relationships involving property and labor (within and across boundaries of nation-states).

  Individuals: Individual persons and/or individuals in households and/or family structures that live as an economic unit.

  Genealogical: Direct genealogy: class origins at birth; political system and legal rules determine if class position is solely hereditary or if class mobility is allowed.

  Internal and economical: Social classes are established and maintained through their intrinsic relationships to one another as established by the prevailing political system and its legal rules involving property and labor (e.g., cannot have employer without employee); individuals within particular classes can form groups to advance their class interest (whether in conflict or cooperation with the other classes).

  External and ecological: Political, legal, and economic relationships among social classes generated by underlying political economy, shaping ways of living, and rights of each social class.

  Teleological: Political philosophies and economic interests shape how individuals view social classes and act to maintain or alter the political and economic systems that give rise to them.

Example: Populations within human beings: human cells and the microbiome

  Boundaries: Biological: cell surfaces (and surfaces of cells as organized in tissues, and of tissues as organized in organs).

  Individuals: Human cells (~10% of cells within a human) and microbial cells (~90% of the cells within and on a human).

  Genealogical: Human cells: from fertilized ovum. Microbiome: initiated by exposure to mother’s microbial ecology via birth (vaginal if vaginal delivery, epidermal if Cesarean section); bacteria then primarily reproduce asexually and new bacteria may be introduced (e.g., by fecal-oral transmission).

  Internal and economical: Example of gut microbiome: symbiotic (mutualistic) extension of human gut cell faculties, in which diverse types of bacteria (represented by different phyla and their species in the oral cavity, stomach, small intestine, and large intestine) receive (and compete for) nourishment, aid with digestion, produce vitamins, and modulate inflammatory response.

  External and ecological: Relationships within and on body: among bacteria (intraspecies, interspecies, and gene transfer) and with human cells. Relationships across body boundary: exposure to exogenous bacteria.

  Teleological: Deliberate alteration of microbiome composition by use of antibiotics, probiotics, changes in diet, and changes in water supply and sanitation.

Example: Nonhuman population: the eastern cottonwood (Populus deltoides), a hardwood tree native to North America that grows best near streams, and one of 35+ tree species that are poplars.

  Boundaries: Biological: a tree species, one that has the ability to produce hybrids with other species in the same genus, including Populus trichocarpa, whose genome was sequenced in 2006, thereby establishing it as the first tree model system for plant biology.

  Individuals: Individual tree (dioecious, i.e., tree is typically female or male).

  Genealogical: Sexual reproduction: via wind-driven pollination of flowers on female tree by pollen from flowers on male tree (whereby a female tree may annually produce millions of seeds fertilized by pollen from thousands of male trees), and the seeds (which have long wispy tufts, resembling cotton) are dispersed by both wind and water. Asexual reproduction: via broken branches (e.g., due to storms and floods); people can also propagate via unrooted cuttings.

  Internal and economical: Typically grows in pure stands, with dominant trees determining spacing between trees (since the trees are very intolerant of shade). Communication to counter predation: self-signaling and between-tree communication via plant volatiles (airborne chemicals) released by herbivore-damaged leaves (e.g., eaten by gypsy moth larvae) that prime defenses (e.g., to attract parasitoids that prey on the larvae) in other leaves (within tree and, if close enough, those of adjacent trees).

  External and ecological: In ecosystem context of growing in riverine environment (flood plains with alluvial soil), relationships with insect predators; fungal pathogens; herbivores (e.g., rabbits, deer, and livestock, which both browse and trample the seedlings and saplings); other animals (e.g., beavers, which build dams out of the saplings; cavities in living cottonwoods are used for nesting and winter shelter by wood ducks, woodpeckers, owls, opossums, and raccoons); and other tree species (competition with willows, which grow in the same areas).

  Teleological: Nonteleological (on part of trees) but can be affected by purpose-driven animal behavior (e.g., beavers fell poplars for dams) and by human activity (e.g., human damming and diversions of river waters).

Notes:
[a] Before Emancipation, neither U.S. slaves nor their children were granted citizenship rights, and they became citizens only after passage of the Civil Rights Act of 1866 and, in 1868, the Fourteenth Amendment (Steinman 2011).
[b] It was not until 1924 that the U.S. government extended citizenship to all American Indians born within the territorial limits of the United States; reflecting this change, in the 1930 census the terminology shifted from Indians "in" the USA to Indians "of" the USA. Before 1924, the status of "citizen" was applied only to those American Indians granted citizenship by specific treaties, naturalization proceedings, and military service in World War I (Steinman 2011).
[c] The 1882 Chinese Exclusion Act, which banned Chinese immigration for 10 years and also imposed new restrictions on reentry (including reassignment from citizen to "permanent alien"), was renewed repeatedly and reversed only in 1943. The 1924 Immigration Act, designed to control "undesirable immigration" (especially by Jews and also by Asians), set quotas and restrictions (in relation to the U.S. composition, by national origins, in 1890) that were in effect until 1965 (Foner 1997; Zinn 2003).
[d] According to the U.S. government, the criterion "to become a citizen at birth" is that the person must "have been born in the United States or certain territories or outlying possessions of the United States, and subject to the jurisdiction of the United States; OR had a parent or parents who were citizens at the time of your birth (if you were born abroad) and meet other requirements"; people can also become citizens after birth if they "apply for 'derived' or 'acquired' citizenship through parents" or "apply for naturalization" (U.S. Citizenship and Immigration Services 2012). For discussion of the changing complexities of conceptualizing and defining nation-states and who counts as belonging to them, see Wimmer and Schiller 2002.

The importance of considering the intrinsic relationships—both internal and external—that are the integuments of living populations, themselves active agents and composed of active agents, is further illuminated through contrast to the classic case of a hypothetical population: the proverbial jar of variously colored marbles, used in many classes to illustrate the principles of probability and sampling. Apart from having been manufactured to be of a specific size, density, and color, there are no intrinsic relationships between the marbles as such. Spill such a jar, and see what happens.

As this thought experiment makes clear, the marbles will not reconstitute themselves into any meaningful relationships in space or time. They will just roll to wherever they do, and that will be the end of it, unless someone with both energy and a plan scoops them up and puts them back in the jar. Nor will a sealed jar of marbles change its color composition (i.e., the proportion of marbles of a certain color), or an individual marble change its color, unless someone opens the jar and replaces, adds, or removes some marbles or treats them with a color-changing agent. Hence, a purely statistical understanding of "populations," however necessary for sharpening ideas about causal inference, study design, and empirical estimation, is by itself insufficient for defining and analyzing real-life populations, including "population health."

That said, marbles do have their uses. In particular, they can help us visualize how causal determinants can structure population distributions of the risks of random individuals via what I term "structured chances."

Populations and Structured Chances

One long-standing conundrum in population sciences is their ability to identify and use data on population regularities to elucidate causal pathways, even though they cannot predict which individuals in the population will experience the outcome in question (Daston 1987; Desrosiéres 1998; Hacking 1990; Illari, Russo, and Williamson 2011; Porter 1981, 2002, 2003; Quetelet 1835; Stigler 1986; Strevens 2003). This incommensurability of population and individual data has been a persistent source of tension between epidemiology and medicine (Frost [1927] 1941; Greenwood 1935; Morris 1957; Rose 1992, 2008). Epidemiologic research, for example, routinely uses aggregated data obtained from individuals to gain insight into both disease etiology and why population rates vary, and does so with the understanding that such research cannot predict which individual will get the disease in question (Coggon and Martyn 2005). By contrast, medical research remains bent on using just these sorts of data to predict an individual's risk, as exemplified in its increasingly molecularized quest for "personalized medicine" (Davey Smith 2011).

Marbles enter the picture because, in a physical model, they can demonstrate how population distributions are simultaneously shaped by both structure (arising from causal processes) and randomness (including truly stochastic events, not just "randomness" as a stand-in for "ignorance" of myriad deterministic events too complex to model). As Stigler has recounted (1997), perhaps the first person to propose using physical models to understand probability was Sir Francis Galton (1822–1911), a highly influential British scientist and eugenicist (figure 2), who himself coined the term "eugenics" and who held that heredity fundamentally trumped "environment" for traits influencing the capacity to thrive, whether physical, like health status, or mental, like "intelligence" (Carlson 2001; Cowan 2004; Galton 1889, 1904; Keller 2010; Kevels 1985; Stigler 1997). In his 1889 opus Natural Inheritance (Galton 1889), Galton sketched (figure 3) "an apparatus ...that mimics in a very pretty way the conditions on which Deviation depends" (Galton 1889, 63), whereby gun shot (i.e., marble equivalents) would be poured through a funnel down a board whose surface was studded with carefully placed pins, off which each pellet would ricochet, to be collected in evenly spaced bins at the bottom.

Galton termed his apparatus, which he apparently never built (Stigler 1997), the "Quincunx" because the pattern of the pins used to deflect the shot was like a tree-planting arrangement of that name, which at the time was popular among the English aristocracy (Stigler 1997). The essential point was that although each presumably identical ball had the same starting point, depending on the chance interplay of which pins it hit during its descent at which angle, it would end up in one or another bin. The accumulation of balls in any bin in turn would reflect the number of possible pathways (i.e., likelihood) leading to its ending up in that bin. Galton designed the pin pattern to yield a normal distribution. He concluded that his device revealed (Galton 1889, 66)

a wonderful form of cosmic order expressed by the "Law of Frequency of Error." The law would have been personified by the Greeks and deified, if they had known of it. It reigns with serenity and in complete self-effacement amidst the wildest confusion. The huger the mob, and the greater the apparent anarchy the more perfect is its sway ...each element, as it is sorted into place, finds, as it were, a pre-ordained niche, accurately adapted to fit it.

FIGURE 3. Producing population distributions: structured chances as represented by physical models.

Sources: Galton's Quincunx, Galton 1889, 63; physical models, Limpert, Stahel, and Abbt 2001 (reproduced with permission).

In other words, in accord with Quetelet's view of "l'homme moyen," Galton saw the order produced as the property of each "element," in this case, the gun shot.
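The link between the number of possible pathways and the accumulation of shot in each bin can be sketched numerically. The example below assumes an idealized board, which is not a specification of Galton's own design: each pellet meets one pin per row and is deflected left or right with equal chance, so that the bin it reaches depends only on how many times it went right. The function names are hypothetical and introduced only for this illustration.

```python
from math import comb

def paths_per_bin(rows):
    """Count the distinct left/right pathways ending in each bin.

    Idealized assumption: one pin encounter per row, so a board with
    `rows` rows has rows + 1 bins, and bin k is reached by every pathway
    containing exactly k deflections to the right.
    """
    return [comb(rows, k) for k in range(rows + 1)]

def bin_probabilities(rows):
    """Convert pathway counts to probabilities, assuming equal-chance deflections."""
    total_paths = 2 ** rows
    return [count / total_paths for count in paths_per_bin(rows)]

if __name__ == "__main__":
    for rows in (4, 10):
        print(rows, paths_per_bin(rows))
```

With four rows, the counts are 1, 4, 6, 4, and 1: the central bin fills fastest simply because more pathways lead to it, not because of any property of the individual pellets.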

However, a little more than a century later, some physicists not only built Galton's "Quincunx," as others have done (Stigler 1997), but went one further (Limpert, Stahel, and Abbt 2001): they built two, one designed to generate the normal distribution and the other to generate the log normal distribution (a type of distribution skewed on the normal scale, but for which the natural logarithm of the values displays a normal distribution) (figure 3). As their devices clearly show, what structures the distribution is not the innate qualities of the "elements" themselves but the features of both the funnel and the pins—both their shape and placement. Together, these structural features determine which pellets can (or cannot) pass through the pins and, for those that do, their possible pathways.

The lesson is clear: altering the structure can change outcome probabilities, even for identical objects, thereby creating different population distributions. For the population sciences, this insight permits understanding how there can simultaneously be both chance variation within populations (individual risk) and patterned differences between population distributions (rates). Such an understanding of "structured chances" rejects explanations of population difference premised solely on determinism or chance and also brings Quetelet's astronomical "l'homme moyen" and its celestial certainties of fixed stars back down to earth, grounding the study of populations instead in real-life, historically contingent causal processes, including those structured by human agency.
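A small simulation can illustrate this lesson. The sketch below is a loose numerical analogue of the two boards, not a reproduction of the devices built by Limpert, Stahel, and Abbt: it assumes that on one board each pin shifts a pellet by a fixed amount (additive structure) and that on the other each pin scales the pellet's position by a fixed factor (multiplicative structure). The step sizes, the number of rows, and the function names are all assumptions made for illustration. The pellets are identical and the chance of going left or right is the same on both boards; only the structure differs.

```python
import math
import random
import statistics

def drop_pellet(rng, rows, structure):
    """Follow one pellet down a board with the given (assumed) structure."""
    if structure == "additive":
        position = 0.0
        for _ in range(rows):
            position += rng.choice((-1.0, 1.0))      # fixed shift left or right
        return position
    if structure == "multiplicative":
        position = 1.0
        for _ in range(rows):
            position *= rng.choice((1 / 1.1, 1.1))   # fixed scaling down or up
        return position
    raise ValueError(f"unknown structure: {structure}")

def simulate(structure, pellets=10_000, rows=50, seed=0):
    """Drop many identical pellets and return their final positions."""
    rng = random.Random(seed)
    return [drop_pellet(rng, rows, structure) for _ in range(pellets)]

def summarize(values):
    """Return (mean, median); a mean well above the median signals right skew."""
    return statistics.mean(values), statistics.median(values)

if __name__ == "__main__":
    for structure in ("additive", "multiplicative"):
        mean, median = summarize(simulate(structure))
        print(f"{structure}: mean={mean:.3f}, median={median:.3f}")
```

Under these assumptions, the additive board yields a roughly symmetric distribution, whereas the multiplicative board yields a right-skewed one whose logarithms are roughly symmetric. The difference arises entirely from the structure imposed on otherwise identical chance events.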

Rethinking the Meaning and Making of Means: The Utility of Critical Population-Informed Thinking

How might a more critical understanding of the substantive nature of real-life populations benefit research on, knowledge about, and policies regarding population health and health inequities? Drawing on table 2's conceptual criteria for defining who and what makes populations, table 4 offers four sets of critical public health propositions about "populations" and "study populations," whose salience I assess using examples of breast cancer, a disease increasingly recognized as a major cause of morbidity and mortality in both the global South and the global North (Althuis et al. 2005; Bray, McCarron, and Parkin 2004; Parkin and Fernandez 2006) and one readily revealing that the problem of meaningful means is as vexing for "the average woman" as for "the average man."

Propositions 1 and 2: Critically Parsing Population Rates and Their Comparisons

Consider, first, three illustrative cases pertaining to analyses of population rates of breast cancer:

  1. A recent high-profile analysis of the global burden of breast cancer (Briggs 2011; Forouzanafar et al. 2011; IHME 2011; Jaslow 2011), which estimated and compared rates across countries, accompanied by interpretative text, with the article stating, for example, that Colombia and Venezuela "...have very different trends, despite sharing many of the same lifestyle and demographic factors," followed by the inference that the "explanation of these divergent trends may lie in the interaction between genes and individual risk factors." (IHME 2011, 24)
  2. Typical reviews of the global epidemiology of breast cancer, which contain such statements as "Population-based statistics show that globally, when compared to whites, women of African ancestry (AA) tend to have more aggressive breast cancers that present more frequently as estrogen receptor negative (ERneg) tumors" (Dunn et al. 2010, 281); and "early onset ER negative tumors also develop more frequently in Asian Indian and Pakistani women and in women from other parts of Asia, although not as prevalent as it is in West Africa." (Wallace, Martin, and Ambs 2011, 1113)
  3. The headline-making news that the U.S. breast cancer incidence rate in 2003 unexpectedly dropped by 10 percent, a huge decrease (Kolata 2006, 2007; Ravdin et al. 2006, 2007).

TABLE 4. Four Propositions to Improve Population Health Research, Premised on Critical Population-Informed Thinking

Proposition 1. Stating what should be obvious: the meaningfulness of means to provide insights into health-related population characteristics and their generative causal processes depends on how meaningfully the populations are defined in relation to the inherent intrinsic and extrinsic dynamic generative relationships by which they are constituted.
Corollary 1.1. A critical appraisal of the validity and meaning of estimated "population rates" of health-related phenomena (whether based on registry, survey, or administrative data or generated by mathematical models) requires an explicit recognition of populations as inherently relational beings.
Corollary 1.2. A critical comparison of population rates of health-related phenomena (at a given point in time or over time), and a formulation of hypotheses to explain observed differences and similarities, likewise requires an explicit recognition of populations as inherently relational beings.

Proposition 2. Structured chances—structured by a population's constitutive intrinsic and extrinsic dynamic relationships—drive population distributions of health, disease, and well-being, including (a) on-average rates, (b) the magnitude of health inequities, and (c) their change or persistence over time.
Corollary 2.1. Health inequities, arising out of population dynamics, are historically contingent, so that the risks associated with variables intended to serve as markers for structural determinants of health should be expected to vary by time and place.
Corollary 2.2. The manifestation of health, disability, and disease, at both the population level and the individual level, should be conceptualized as embodied phenotypes, not decontextualized genotypes.

Proposition 3. To improve scientific accuracy and promote critical thinking, persons used in population health studies should be referred to as "study participants," not the "study population," and whether they meet criteria for being a meaningful "population" should be explained, not presumed.
Corollary 3.1. Texts describing the study participants should—in addition to explaining the methods used to identify and include them—explicitly situate them in relation to the inherent intrinsic and extrinsic dynamic relationships constituting the society (or societies) in which they are based.
Corollary 3.2. If study participants are identified by methods using probability samples, the defining characteristics of the sampled populations must be explicated in relation to the intrinsic and extrinsic dynamic relationships constituting the population(s) at issue.

Proposition 4. The conventional cleavage of "internal validity" and "generalizability" is misleading, since a meaningful choice of study participants must be in relation to the range of exposures experienced (or not) in the real-world societies, that is, meaningful populations, of which they are a part.
Corollary 4.1. Although studies do not need to be "representative" to generate valid results regarding exposure-outcome associations, a critical appraisal of the observed associations requires situating the observed distribution (on-average level and range) of exposures and outcomes in relation to distributions observed among populations defined by the intrinsic and extrinsic dynamic relationships in the society (or societies) in which the study participants are based.
Corollary 4.2. The restriction of studies to "easy-to-reach" populations can, owing to selection bias, produce biased estimates of risk, lead to invalid causal inferences, and hamper the discovery of needed etiologic and policy-relevant knowledge.


What these three commonplace examples have in common is an uncritical approach to presenting and interpreting population data, premised on the dominant assumption that population rates are statistical phenomena driven by innate individual characteristics. Cautioning against accepting these claims at face value are propositions 1 and 2, with their emphases, respectively, on (1) critically appraising who constitutes the populations whose means are at issue and (2) critically considering the dynamic relationships that give rise to population patterns of health, including health inequities.

From the standpoint of proposition 1, the first relevant fact is that as a consequence of global disparities in resources (Klassen and Smith 2011) arising from complex histories of colonialism and underdevelopment (Birn, Pillay, and Holtz 2009), only 16 percent of the world's population is covered by cancer registries, with coverage of less than 10 percent within the world's most populous regions (Africa, Asia [other than Japan], Latin America, and the Caribbean), versus 99 percent in North America (Parkin and Fernandez 2006). Put in national terms, among the 184 countries for which the International Agency for Research on Cancer (IARC) reports estimated rates, only 33 percent, almost all located in the global North, have reliable national incidence data (GLOBOCAN 2012). These data limitations are candidly acknowledged both by IARC (GLOBOCAN 2012) and in the scientific literature, including that on breast cancer (Althuis et al. 2005; Bray, McCarron, and Parkin 2004; Ferlay et al. 2012; Krieger, Bassett, and Gomez 2012; Parkin and Fernandez 2006). To generate estimates of incidence in countries lacking national cancer registry data, the IARC transparently employs several modeling approaches, based on, for example, a country's national mortality data combined with city-specific or regional cancer registry data (if they do exist, albeit typically not including the rural poor) or, when no credible national data are available, estimating rates based on data from neighboring countries (GLOBOCAN 2012).

A critical analysis of the population claims asserted in examples 1 and 2 starts by questioning whether the means at issue can bear the weight of meaningful comparisons and inference. Thus, relevant to example 1, Colombia has only one city-based cancer registry (in Cali), and Venezuela has no cancer registries at all (GLOBOCAN 2012). Moreover, the rates compared (Forouzanfar et al. 2011; IHME 2011) were generated by nontransparent modeling methods (Krieger, Bassett, and Gomez 2012) that have been shown empirically not to reproduce accurately the rates actually observed in the "gold-standard" Nordic countries, known for their excellent cancer registration data (Ferlay et al. 2012). Second, relevant to the countries and geographic regions listed in example 2, the cancer incidence rates estimated by IARC are based (a) for Pakistan, solely on the weighted average for observed rates in south Karachi, (b) for India, on a complex estimation scheme for urban and rural rates in different Indian states and data from cancer registries in several cities, and (c) for western Africa, on the weighted average of data for sixteen countries, of which ten have incidence rates estimated based on those of neighboring countries, another five rely on data extrapolated from cancer registry data from one city (or else city-based cancer registries in neighboring countries), and only one has a national cancer registry (GLOBOCAN 2012). Critical thinking about who and what makes a population thus prompts questions about whether the data presented in examples 1 and 2 can provide insight into either alleged individual innate characteristics or into what the true on-average rate would be if everyone were counted (let alone what the variability in rates might be across social groups and regions). There is nothing mundane about a mean.
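
The western Africa case can be made concrete with a small tally. The Python sketch below is illustrative only: it takes the counts of countries by data source from the description above and assumes, hypothetically and unlike any real population-weighted scheme, that each country contributes equally to the regional figure. The category names and the equal-weight assumption are introduced here for illustration and are not GLOBOCAN's.

```python
# Counts of western African countries by data source, as described in the text.
# The category labels are hypothetical names chosen for this sketch.
SOURCE_COUNTS = {
    "rates_borrowed_from_neighboring_countries": 10,
    "extrapolated_from_city_registry": 5,
    "national_registry": 1,
}


def provenance_shares(source_counts):
    """Return each source's share of the inputs.

    Assumption (hypothetical): every country carries equal weight,
    which a real population-weighted average would not do.
    """
    total = sum(source_counts.values())
    return {name: count / total for name, count in source_counts.items()}


shares = provenance_shares(SOURCE_COUNTS)
for name, share in shares.items():
    print(f"{name}: {share:.1%}")
```

Under this equal-weight simplification, only one of the sixteen inputs to the regional mean comes from a national registry; an actual weighting by population size would change the shares but not the underlying dependence on borrowed and extrapolated data.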

Proposition 2 in turn calls attention to structured chance in relation to the dynamic intrinsic and extrinsic relationships constituting national populations, with table 2 illustrating what types of relationships are at play using the example of the United States. It thus spurs critical queries as to whether observed national and racial/ethnic differences (if real, and not an artifact of inaccurate data) arise from innate (i.e., genetic) differences between "populations," as posed by examples 1 and 2. Two lines of evidence alternatively suggest these population differences could instead be embodied inequalities (Krieger 1994, 2000, 2005, 2011; Krieger and Davey Smith 2004) that arise from structured chances. The first line pertains to well-documented links among national, racial/ethnic, and socioeconomic inequalities in breast cancer incidence, survival, and mortality (Klassen and Smith 2011; Krieger 2002; Vona-Davis and Rose 2009). The second line stems from research that evaluates claims of intrinsic biological difference by examining their dynamics, as illustrated by the first investigation to test statistically for temporal trends in the white/black odds ratio for ER positive breast cancer between 1992 and 2005, which revealed that in the United States, the age-adjusted odds ratio rose between 1992 and 2002 and then leveled off (and actually fell among women aged fifty to sixty-nine) (Krieger, Chen, and Waterman 2011).

Relevant to example 3, these findings of dynamic, not fixed, black/white risk differences for breast cancer ER status likely reflect the socially patterned abrupt decline in hormone therapy use following the July 2002 release of results from the U.S. Women's Health Initiative (WHI) (Rossouw et al. 2002). This was the first large randomized clinical trial of hormone therapy, despite its having been widely prescribed since the mid-1960s (Krieger 2008). The WHI found that contrary to what was expected, hormone therapy did not decrease (and may have raised) the risk of cardiovascular disease, and at the same time, the WHI confirmed prior evidence that long-term use of hormone therapy increased the risk of breast cancer (especially ER+). Before the initiative's results were released, hormone therapy use in the United States was highest among white women with health insurance who could afford, and were healthy enough, to take the medication without any contraindications (Brett and Madans 1997; Friedman-Koss et al. 2002). Population-informed thinking would thus predict that any drops in breast cancer incidence would occur chiefly among those sectors of women most likely to have used hormone therapy. Subsequent global research has borne out these predictions (Zbuk and Anand 2012), including the sole U.S. study that systematically explored socioeconomic differentials both within and across racial/ethnic groups, which found that the observed breast cancer decline was restricted to white non-Hispanic women with ER+ tumors residing in more affluent counties (Krieger, Chen, and Waterman 2010). These results counter the widely disseminated and falsely reassuring impression that breast cancer risk was declining for everyone (Kolata 2006, 2007). They accordingly provide better guidance to public health agencies, clinical providers, and breast cancer advocacy groups regarding trends in breast cancer occurrence among the real-life populations they serve.

Together, these examples illuminate why proposition 2's corollary 2.2 proposes conceptualizing the jointly lived experience of population rates and individual manifestations of health, disease, and well-being as what I would term "embodied phenotype." Inherently dynamic and relational, this proposed construct meaningfully links the macro and micro, and populations and individuals, through the play of structured chance. It also is consonant with new insights emerging from the fast-growing field of ecological evolutionary developmental biology ("eco-evo-devo") into the profound and dynamic links among environmental exposures, gene expression, development, speciation, and the flexibility of organisms' phenotypes across the life span (Gilbert and Epel 2009; Piersma and van Gils 2011; West-Eberhard 2003). Only just beginning to be integrated into epidemiologic theorizing and research (Bateson and Gluckman 2012; Davey Smith 2011, 2012; Gilbert and Epel 2009; Kuzawa 2012; Relton and Davey Smith 2012), eco-evo-devo's historical and relational approach to biological expression affirms the need for critical population-informed thinking.

Propositions 3 and 4: Study Participants, Study Populations, and Causal Inference

Finally, a population-informed approach helps clarify, in accordance with propositions 3 and 4, why improving our understanding of "study populations," and thus study participants, matters for causal inference. Consider, for example, the 1926 pathbreaking epidemiologic study of breast cancer conducted by the British physician and epidemiologist Janet Elizabeth Lane-Claypon (1877–1967) (Lane-Claypon 1926), the first study to identify systematically what were then called "antecedents" of breast cancer (today termed "risk factors") and now also widely acknowledged to be the first epidemiologic case-control study, as well as the first epidemiologic study to publish its questionnaire (Press and Pharoah 2010; Winkelstein 2004). The study was quickly replicated in the United States in 1931 (Wainwright 1931), and the two studies have recently been reanalyzed using current statistical methods. The results show that their estimates of risk associated with major reproductive risk factors (e.g., early age at first birth, parity, lactation, and early age at menopause) are consistent with the current evidence (Press and Pharoah 2010).

Not addressed in the reanalysis, however, are the two studies' different results for occupational class, defined in relation to the women's employment before marriage. When these occupational data are recoded into the meaningful categories of professional, working-class nonmanual, and working-class manual (Krieger, Williams, and Moss 1997; Rose and Pevalin 2003), the data quickly reveal why the studies had discrepant results. Thus, Lane-Claypon concluded there was no "appreciable difference" in breast cancer risk by social class (Lane-Claypon 1926, 12) (χ² = 1.833; p = 0.4), whereas in the U.S. study risk was lower among the working-class manual women (χ² = 9.305; p = 0.01). Why? In brief, a far higher proportion of the British women were working-class manual (78.7% cases, 84.2% controls vs. the U.S. women: 48.8% cases, 62.5% controls), and a far lower proportion were professionals (6.5% cases, 4.2% controls, vs. the U.S. women: 23.8% cases, 20.7% controls). Just as Rose famously observed that if everyone smoked, smoking would not be identified as a cause of lung cancer (Rose 1985, 1992), when most study participants are from only one social class, socioeconomic inequalities in health cannot and will not be detected (Krieger 2007b). The net result is erroneous causal inferences about the relevance of social class to structuring the risk of disease, thereby distorting the evidence base informing efforts to address health inequities.
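
The statistical logic of this point can be sketched in a few lines of Python. The example below is a hypothetical illustration, not a reanalysis of either study: it assumes the same class-specific odds ratios in two case-control designs of equal size (200 cases and 200 controls, an invented figure), varies only how the controls are distributed across the three occupational classes, and computes the Pearson chi-square statistic on the expected counts. The odds ratios, sample sizes, and the nonmanual proportions are assumptions; the professional and manual control proportions loosely echo, in rounded form, the percentages quoted above.

```python
def expected_table(n_cases, n_controls, control_props, odds_ratios):
    """Build a 2 x k table of expected counts (cases row, controls row).

    Cases are distributed in proportion to control_prop * odds_ratio,
    which reproduces the assumed odds ratios relative to the reference class.
    """
    weights = [p * r for p, r in zip(control_props, odds_ratios)]
    total = sum(weights)
    cases = [n_cases * w / total for w in weights]
    controls = [n_controls * p for p in control_props]
    return [cases, controls]


def pearson_chi2(table):
    """Pearson chi-square statistic for a two-way table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / grand_total
            chi2 += (observed - expected) ** 2 / expected
    return chi2


# Hypothetical inputs; order is professional, nonmanual, manual.
ODDS_RATIOS = [2.0, 1.5, 1.0]           # assumed, manual class as reference
CONCENTRATED = [0.04, 0.12, 0.84]       # most controls in one class
SPREAD = [0.21, 0.17, 0.62]             # wider spread across classes
CRITICAL_5PCT_DF2 = 5.991               # chi-square critical value, 2 df

chi2_concentrated = pearson_chi2(expected_table(200, 200, CONCENTRATED, ODDS_RATIOS))
chi2_spread = pearson_chi2(expected_table(200, 200, SPREAD, ODDS_RATIOS))

print(f"concentrated design: chi2 = {chi2_concentrated:.2f}")
print(f"spread design:       chi2 = {chi2_spread:.2f}")
```

With these assumed inputs, the identical underlying association falls below the conventional 5 percent threshold when participants are concentrated in one class and exceeds it when they are spread more widely across classes. The use of expected rather than sampled counts is a simplification that isolates the effect of the exposure distribution from sampling variability.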

Critical population-informed thinking therefore would question the dominant conventional cleavage, in both the population health and the social sciences, between "internal validity" and "generalizability" (or "external validity") and the related endemic language of "study population" (routinely and casually equated with study participants) and "general population" (Broadbent 2011; Cartwright 2011; Cook 2001; Kincaid 2011; Kukull and Ganguli 2012; Porta 2008; Rothman, Greenland, and Lash 2008). One critical determinant of a study's ability to provide valid tests of exposure-outcome hypotheses is the range of exposure encompassed (Chen and Rossi 1987; Schlesselman and Stadel 1987); another is the extent to which participants' selection into a study is associated with important unmeasured determinants of the outcome (Pizzi et al. 2011). Given the social structuring of the vast majority of exposures, as evidenced by the virtually ubiquitous and dynamic societal patternings of disease (Birn, Pillay, and Holtz 2009; Davey Smith 2003; Krieger 1994, 2011; WHO 2008), meaningful research requires that the range of exposures experienced (or not) by study participants capture the etiologically relevant range experienced in the real-world societies, that is, meaningful populations, of which they are a part. The point is not that ideal study participants should be a random sample of some "general population"; instead, it is that their location in the intrinsic and extrinsic relationships creating their population membership cannot be ignored.

Highlighting the need for critical population-informed thinking is advice provided in the widely used and highly influential textbook Modern Epidemiology (Rothman, Greenland, and Lash 2008). Although the text correctly states that "the pursuit of representativeness can defeat the goal of validly identifying causal relations," it further asserts that "one would want to select study groups for homogeneity with respect to important confounders, for highly cooperative behavior, and for availability of accurate information, rather than attempt to be representative of a natural population" (p. 146). "Classic examples" of the populations fulfilling these criteria are stated to be "the British Physicians' Study of smoking and health and the Nurses' Health Study, neither of which were remotely representative of the general population with respect to sociodemographic factors" (Rothman, Greenland, and Lash 2008, 146–47).

Of course, studies need accurate data, but the advice here raises more questions than it answers. First, just who and what is a "natural population," and, relatedly, who is that "general population"? Second, might there be drawbacks to, not just benefits from, preferentially studying predominantly white health professionals and others with the resources to be "highly cooperative" and possess "accurate information"? Stated another way, what might be the adverse consequences on scientific knowledge and policymaking of discounting people that mainstream research already routinely and problematically calls "hard-to-reach" populations (Crosby et al. 2010; Shaghaghi, Bhopal, and Sheikh 2011)? These populations include the disempowered and dispossessed, whose adverse social and physical circumstances mean that their range of exposures almost invariably differs, in both level and type, from those encountered by the effectively "easy-to-reach." Might it not also be critical for researchers to develop more inclusive approaches that could yield accurate etiologic and policy-relevant data on the distributions and determinants of disease among those who bear the brunt of health inequities (Smylie et al. 2012)? This is a scientific task that necessarily requires contrasts in both exposures and outcomes between the social groups defined by the inequitable societal relationships at issue, whether involving social class, racism, gender, or other forms of social inequality (Krieger 2007b).

Reflecting on how who is studied determines what can be learned, the eminent British biologist Lancelot Hogben (1895–1975) (figure 2; Bud 2004; Werskey 1988), in his lucid and prescient 1933 book titled Nature and Nurture (Hogben 1933, 106), cogently observed:

Differences to which members of the same family or different families living at one and the same social level are exposed may be very much less than differences to which individuals belonging to families taken from different social levels are exposed. Experiment shows that ultraviolet light has a considerable influence on growth in mammals. In Great Britain, some families live continuously in the sooty atmosphere of an industrial area. Others spend their winters on the Riviera.

In other words, critical population-informed thinking is vital to good science.

Conclusion: Meaningful Means, Embodied Phenotypes, and the Structural Determinants of Populations and the People's Health

In conclusion, to improve causal inference and policies and action based on this knowledge, the population sciences need to expand and deepen theorizing about who and what makes populations and their means. At a time when the topic of causality in the sciences remains hotly debated by philosophers and researchers alike, all parties nevertheless agree that "the question of how probabilistic accounts of causality can mesh with mechanistic accounts of causality desperately needs answering" (Illari, Russo, and Williamson 2011, 20). As my article makes clear, the idea and reality of "population" reside at the nexus of this question. Clarifying the substantive defining features of populations, including who and what structures the dynamic and emergent distributions of their characteristics and components, is thus crucial to both analyzing and altering causal processes. For public health, this means sharpening our thinking about how structured chances, structured by the political and economic relationships constituting the societal determinants of health (Birn, Pillay, and Holtz 2009; Irwin et al. 2006; Krieger 1994, 2011), generate the embodied phenotypes that are the people's health.

As should be evident, the challenges to developing critical population-informed thinking are not purely conceptual; they are also political, because these ideas necessarily engage with issues involving not only the distribution of people but also the distribution of power and property and the societal relationships that bind individuals and populations, for good and for bad (Krieger 2011). Nearly two hundred years after Quetelet introduced his "l'homme moyen," the countervailing call for routinely measuring and tracking population health inequities, and not just on-average population rates of health, is only now gaining traction globally (WHO 2008, 2011). This is coincident with the ever-accelerating aforementioned genomic quest for "personalized medicine" (Davey Smith 2011), as well as the continued economic, social, political, and public health reverberations of the 2008 global economic crash (Benatar, Gill, and Bakker 2011; Stiglitz 2010). In such a context, clarity regarding who and what populations are, and the making and meaning of their means, is vital to population sciences, population health, and the promotion of health equity.


References

Althuis, M.D., J.M. Dozier, W.F. Anderson, S.S. Devesa, and L.A. Brinton. 2005. Global Trends in Breast Cancer Incidence and Mortality 1973–1999. International Journal of Epidemiology 34:405–12.

Bateson, P., and P. Gluckman. 2012. Plasticity and Robustness in Development and Evolution. International Journal of Epidemiology 41:219–23.

Benatar, S.R., S. Gill, and I. Bakker. 2011. Global Health and the Global Economic Crisis. American Journal of Public Health 101:646–53.

Biersack, A., and J.B. Greenberg, eds. 2006. Reimagining Political Ecology. Durham, NC: Duke University Press.

Birn, A.E., Y. Pillay, and T.M. Holtz. 2009. Textbook of International Health: Global Health in a Dynamic World. 3rd ed. New York: Oxford University Press.

Braatne, J.H., S.B. Rood, and P.E. Heilman. 1996. Life History, Ecology, and Conservation of Riparian Cottonwoods in North America. In Biology of Populus and Its Implications for Management and Conservation, ed. R.F. Stettler, H.D. Bradshaw Jr., P.E. Heilman, and T.M. Hinckley, 57–85. Ottawa: National Research Council of Canada, NRC Research Press.

Bray, F., P. McCarron, and D.M. Parkin. 2004. The Changing Global Patterns of Female Breast Cancer Incidence and Mortality. Breast Cancer Research 6:229–39.

Brett, K.M., and J.H. Madans. 1997. Difference in Use of Postmenopausal Hormone Replacement Therapy by Black and White Women. Menopause 4:66–70.

Briggs, H. 2011. Women's Cancers Reach Two Million. BBC News Health, September 14. Available at (accessed June 17, 2012).

Broadbent, A. 2011. Inferring Causation in Epidemiology: Mechanisms, Black Boxes, and Contrasts. In Causality in the Sciences, ed. P.M. Illari, F. Russo, and J. Williamson, 45–69. Oxford: Oxford University Press.

Bud, R. 2004. Hogben, Lancelot Thomas (1895–1975). Oxford Dictionary of National Biography. Oxford: Oxford University Press. Available at (accessed June 17, 2012).

Burian, R.M., and D.T. Zallen. 2009. Genes. In The Modern Biological and Earth Sciences, ed. P.J. Bowler and J.V. Pickstone. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521572019.024.

Butler, A.H.B. 1949. Obituary: Major Greenwood. Journal of the Royal Statistical Society: Series A (General) 112:487–89.

Carlson, E.A. 2001. The Unfit: A History of a Bad Idea. Cold Spring Harbor, NY: Cold Spring Harbor Press.

Cartwright, N. 2011. Predicting "It Will Work for Us": (Way) beyond Statistics. In Causality in the Sciences, ed. P.M. Illari, F. Russo, and J. Williamson, 750–68. Oxford: Oxford University Press.

Carver, T. 2003. Marx and Marxism. In The Modern Social Sciences, ed. T.M. Porter and D. Ross. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521594424.013.

Chen, H.-T., and P.H. Rossi. 1987. The Theory-Driven Approach to Validity. Evaluation and Program Planning 10:95–103.

Clarke, A., A.F. Agrò, Y. Zheng, C. Tickle, R. Jansson, H. Kehrer-Sawatzki, D.N. Cooper, P. Delves, J. Battista, G. Melino, D.J. Perkel, A.M. Hetherington, W.F. Bynum, J.M. Valpuesta, and D. Harper, eds. 2000–2011. Encyclopedia of Life Sciences. Chichester: Wiley. Available at (accessed September 6, 2011).

Coggon, D.I.W., and C.N. Martyn. 2005. Time and Chance: The Stochastic Nature of Disease Causation. The Lancet 365:1434–37.

Cole, J. 2000. The Power of Large Numbers: Populations, Politics, and Gender in Nineteenth-Century France. Ithaca, NY: Cornell University Press.

Cook, T.D. 2001. Generalization: Conceptions in the Social Sciences. In International Encyclopedia of the Social & Behavioral Sciences, ed. N.J. Smelser and P.B. Baltes, 6037–43. Oxford: Pergamon. DOI:10.1016/B0-08-043076-7/00698-7.

Cowan, R.S. 2004. Galton, Sir Francis (1822–1911). Oxford Dictionary of National Biography. Oxford: Oxford University Press. Available at (accessed June 17, 2012).

Crosby, R.A., L.F. Salazar, R.J. DiClemente, and D.L. Lang. 2010. Balancing Rigor against the Inherent Limitations of Investigating Hard-to-Reach Populations. Health Education Research 25:1–5.

Crow, J.F. 1990. R.A. Fisher: A Centennial View. Genetics 124:204–11.

Crow, J.F. 1994. Sewall Wright (1889–1988): A Biographical Memoir. Washington, DC: National Academy of Sciences.

Daintith, J., and E. Martin, eds. 2005. A Dictionary of Science. 5th ed. Oxford: Oxford University Press.

Dale, A.I., and S. Katz. 2011. Arthur L. Bowley: A Pioneer in Modern Statistics and Economics. London: World Scientific Publishing.

Daniel, T.M. 2004. Wade Hampton Frost: Pioneer Epidemiologist 1880–1938. Rochester, NY: University of Rochester Press.

Darwin, C. (1859) 2004. Origin of Species. Edison, NJ: Castle Books.

Daston, L.J. 1987. Rational Individuals versus Laws of Society: From Probability to Statistics. In The Probabilistic Revolution. Vol. 1, Ideas in History, ed. L. Krüger, L.J. Daston, and M. Heidelberger, 295–304. Cambridge, MA: MIT Press.

Davenport, C.B. 1911. Heredity in Relation to Eugenics. New York: Henry Holt.

Davey Smith, G. 2003. Health Inequalities: Lifecourse Approaches. Bristol: Policy Press.

Davey Smith, G. 2011. Epidemiology, Epigenetics and the "Gloomy Prospect": Embracing Randomness in Population Health Research and Practice. International Journal of Epidemiology 40:537–62.

Davey Smith, G. 2012. Epigenesis for Epidemiologists: Does Evo-Devo Have Implications for Population Health Research and Practice? International Journal of Epidemiology 41:236–47.

Davey Smith, G., and J. Morris. 2004. A Conversation with Jerry Morris. Epidemiology 15:770–73.

Davis, K., and D. Rowland. 1983. Uninsured and Underserved: Inequities in Health Care in the United States. The Milbank Quarterly 61:149–76.

Desrosières, A. 1998. The Politics of Large Numbers: A History of Statistical Reasoning. Trans. Camille Naish. Cambridge, MA: Harvard University Press.

Dominguez-Bello, M.G., and M.J. Blaser. 2011. The Human Microbiota as a Marker for Migrations of Individuals and Populations. Annual Review of Anthropology 40:451–74.

Dunn, B.K., T. Agurs-Collins, D. Browne, R. Lubet, and K.A. Johnson. 2010. Health Disparities in Breast Cancer: Biology Meets Socioeconomic Status. Breast Cancer Research and Treatment 121:281–92.

Eldredge, N. 1999. The Pattern of Evolution. New York: Freeman.

Eldredge, N. 2005. Darwin: Discovering the Tree of Life. New York: Norton.

Eldredge, N., and M. Grene. 1992. Interactions: The Biological Context of Social Systems. New York: Columbia University Press.

Evans, R.G., M.L. Barer, and T.R. Marmor. 1994. Why Are Some People Healthy and Others Not? The Determinants of Health of Populations. New York: De Gruyter.

Falk, R. 2000. The Gene – A Concept in Tension: A Critical Overview. In The Concept of the Gene in Development and Evolution: Historical and Epistemological Perspectives, ed. P.J. Beurton, R. Falk, and H.-J. Rheinberger, 317–49. Cambridge: Cambridge University Press.

Fee, E. 1987. Disease and Discovery: A History of the Johns Hopkins School of Hygiene and Public Health, 1916–1939. Baltimore: Johns Hopkins University Press.

Fergus, C. 2005. Trees of New England: A Natural History. Guilford, CT: FalconGuide.

Ferlay, J., D. Forman, C.D. Mathers, and F. Bray. 2012. Re: "Breast and Cervical Cancer in 187 Countries between 1980 and 2010." The Lancet 379:1390–91.

Foner, E., ed. 1997. The New American History. Rev. and expanded ed. Philadelphia: Temple University Press.

Forouzanfar, M.H., K.J. Foreman, A.M. Delossantos, R. Lozano, A.D. Lopez, C.J. Murray, and M. Naghavi. 2011. Breast and Cervical Cancer in 187 Countries between 1980 and 2010: A Systematic Analysis. The Lancet 378:1461–84.

Fox, S.E., P. Levitt, and C.A. Nelson III. 2010. How the Timing and Quality of Early Experiences Influence the Development of Brain Architecture. Child Development 81:28–40.

Friedman-Koss, D., C.J. Crespo, M.F. Bellantoni, and R.E. Andersen. 2002. The Relationship of Race/Ethnicity and Social Class to Hormone Replacement Therapy: Results from the Third National Health and Nutrition Examination Survey 1988–1994. Menopause 9:264–72.

Frost, C., H. Appel, J. Carlson, C.M. De Moraes, M. Mescher, and J.C. Schultz. 2007. Within-Plant Signaling by Volatiles Overcomes Vascular Constraints on Systemic Signaling and Primes Responses against Herbivores. Ecology Letters 10:490–98.

Frost, W.H. (1927) 1941. Epidemiology. In Papers of Wade Hampton Frost, M.D., ed. K.F. Maxcy, 439–52. New York: Commonwealth Fund.

Frost, W.H. (1928) 1976. Some Conceptions of Epidemics in General. American Journal of Epidemiology 103:141–51.

Galton, F. 1889. Natural Inheritance. London: Macmillan.

Galton, F. 1904. Eugenics: Its Definition, Scope, and Aims. Nature 70:82.

Gaziano, J.M. 2010. The Evolution of Population Science: Advent of the Mega Cohort. JAMA 304:2288–89.

Gibson, J.J. 1986. The Ecological Approach to Visual Perception. Hillsdale, NJ: Erlbaum.

Giddens, A., and D. Held, eds. 1982. Classes, Power, and Conflict: Classical and Contemporary Debates. Berkeley: University of California Press.

Gilbert, S.F., and D. Epel. 2009. Ecological Developmental Biology: Integrating Epigenetics, Medicine, and Evolution. Sunderland, MA: Sinauer Associates.

GLOBOCAN. 2012. Data Sources and Methods. International Agency for Research on Cancer, World Health Organization. Available at (accessed June 17, 2012).

Greenhalgh, S. 1996. The Social Construction of Population Science: An Intellectual, Institutional, and Political History of Twentieth-Century Demography. Comparative Studies in Society and History 38:26–66.

Greenwood, M. 1935. Epidemics and Crowd Diseases: An Introduction to the Study of Epidemiology. London: Williams & Norgate.

Grene, M., and D. Depew. 2004. The Philosophy of Biology. Cambridge: Cambridge University Press.

Hacking, I. 1975. The Emergence of Probability. Cambridge: Cambridge University Press.

Hacking, I. 1990. The Taming of Chance. Cambridge: Cambridge University Press.

Hankins, F.H. 1968. Adolphe Quetelet as Statistician. New York: Arno Press.

Haraway, D.J. 2008. When Species Meet. Minneapolis: University of Minnesota Press.

Harré, R. 2001. Individual/Society: History of the Concept. In International Encyclopedia of the Social & Behavioral Sciences, ed. N.J. Smelser and P.B. Baltes, 7306–10. Oxford: Pergamon. DOI:10.1016/B008-043076-7/00125-X.

Harvey, D. 1996. Justice, Nature, and the Geography of Difference. Cambridge, MA: Blackwell.

Heesterbeek, H. 2005. The Law of Mass-Action in Epidemiology: A Historical Perspective. In Ecological Paradigms Lost: Routes of Theory Change, ed. K. Cuddington and B.E. Beisner, 81–106. Burlington, MA: Elsevier Academic Press.

Heilbron, J., L. Magnusson, and B. Wittrock, eds. 1998. The Rise of the Social Sciences and the Formation of Modernity: Conceptual Change in Context, 1750–1850. Dordrecht: Kluwer Academic Publishers.

Hey, J. 2011. Regarding the Confusion between the Population Concept and Mayr's "Population Thinking." Quarterly Review of Biology 86:253–64.

Hodge, J. 2009. Evolution. In The Modern Biological and Earth Sciences, ed. P.J. Bowler and J.V. Pickstone. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521572019.015.

Hogben, L. 1933. Nature and Nurture. London: Williams & Norgate.

Hogben, L. 1950. Major Greenwood: 1880–1949. Obituary Notices of Fellows of the Royal Society 7:138–54.

IHME (Institute for Health Metrics and Evaluation). 2011. The Challenge Ahead: Progress and Setbacks in Breast and Cervical Cancer. Seattle.

Illari, P.M., F. Russo, and J. Williamson. 2011. Why Look at Causality in the Sciences? A Manifesto. In Causality in the Sciences, ed. P.M. Illari, F. Russo, and J. Williamson, 3–22. Oxford: Oxford University Press.

Irwin, A., N. Valentine, C. Brown, R. Loewenson, O. Solar, H. Brown, T. Koller, and J. Vega. 2006. The Commission on the Social Determinants of Health: Tackling the Social Roots of Health Inequities. PLoS Medicine 3(6):e106.

Isaac, J. 2007. The Human Sciences in Cold War America. Historical Journal 50:725–46.

Jansson, S., and C.J. Douglas. 2007. Populus: A Model System for Plant Biology. Annual Review of Plant Biology 58:435–58.

Jaslow, R. 2011. Breast, Cervical Cancer Rates Rising around World: Why? CBS News, September 15, 2011. Available at (accessed June 17, 2012).

Keller, E.F. 2000. The Century of the Gene. Cambridge, MA: Harvard University Press.

Keller, E.F. 2010. The Mirage of a Space between Nature and Nurture. Durham, NC: Duke University Press.

Kermack, W.O., and A.G. McKendrick. 1927. Contributions to the Mathematical Theory of Epidemics, Part I. Proceedings of the Royal Society Series A 115:700–721.

Kevles, D. 1985. In the Name of Eugenics: Genetics and the Uses of Human Heredity. New York: Knopf.

Kincaid, H. 2011. Causal Modeling, Mechanisms, and Probability in Epidemiology. In Causality in the Sciences, ed. P.M. Illari, F. Russo, and J. Williamson, 70–90. Oxford: Oxford University Press.

Klassen, A.C., and K.C. Smith. 2011. The Enduring and Evolving Relationship between Social Class and Breast Cancer Burden: A Review of the Literature. Cancer Epidemiology 35:217–34.

Kolata, G. 2006. Reversing Trend, Big Drop Is Seen in Breast Cancer. New York Times, December 15. Available at (accessed June 17, 2012).

Kolata, G. 2007. Sharp Drop in Rates of Breast Cancer Holds. New York Times, April 19. Available at (accessed June 17, 2012).

Krieger, N. 1994. Epidemiology and the Web of Causation: Has Anyone Seen the Spider? Social Science & Medicine 39:887–903.

Krieger, N. 2000. Epidemiology and Social Sciences: Towards a Critical Reengagement in the 21st Century. Epidemiologic Reviews 11:155–63.

Krieger, N. 2001. Theories for Social Epidemiology in the 21st Century: An Ecosocial Perspective. International Journal of Epidemiology 30:668–77.

Krieger, N. 2002. Breast Cancer: A Disease of Affluence, Poverty, or Both?—The Case of African American Women. American Journal of Public Health 92:611–13.

Krieger, N. 2005. Embodiment: A Conceptual Glossary for Epidemiology. Journal of Epidemiology & Community Health 59:350–55.

Krieger, N. 2007a. Ways of Asking and Ways of Living: Reflections on the 50th Anniversary of Morris' Ever-Useful Uses of Epidemiology. International Journal of Epidemiology 36:1173–80.

Krieger, N. 2007b. Why Epidemiologists Cannot Afford to Ignore Poverty. Epidemiology 18:658–63.

Krieger, N. 2008. Hormone Therapy and the Rise and Perhaps Fall of US Breast Cancer Incidence Rates: Critical Reflections. International Journal of Epidemiology 37:627–37.

Krieger, N. 2011. Epidemiology and the People's Health: Theory and Context. New York: Oxford University Press.

Krieger, N., M. Bassett, and S. Gomez. 2012. Re: "Breast and Cervical Cancer in 187 Countries between 1980 and 2010." The Lancet 379:1391–92.

Krieger, N., J.T. Chen, and P.D. Waterman. 2010. Decline in US Breast Cancer Rates after the Women's Health Initiative: Socioeconomic and Racial/Ethnic Differentials. American Journal of Public Health 100:S132–S139; erratum, 972.

Krieger, N., J.T. Chen, and P.D. Waterman. 2011. Temporal Trends in the Black/White Breast Cancer Case Ratio for Estrogen Receptor Status: Disparities Are Historically Contingent, Not Innate. Cancer Causes and Control 22:511–14.

Krieger, N., and G. Davey Smith. 2004. Bodies Count & Body Counts: Social Epidemiology & Embodying Inequality. Epidemiologic Reviews 26:92–103.

Krieger, N., and E. Fee. 1996. Measuring Social Inequalities in Health in the United States: An Historical Review, 1900–1950. International Journal of Health Services 26:391–418.

Krieger, N., D. Williams, and N. Moss. 1997. Measuring Social Class in US Public Health Research: Concepts, Methodologies and Guidelines. Annual Review of Public Health 18:341–78.

Kuhlmann, M. 2011. Mechanisms in Dynamically Complex Systems. In Causality in the Sciences, ed. P.M. Illari, F. Russo, and J. Williamson, 880–906. Oxford: Oxford University Press.

Kukull, W.A., and M. Ganguli. 2012. Generalizability: The Trees, the Forest, and the Low-Hanging Fruit. Neurology 78:1886–91.

Kunitz, S.J. 2007. The Health of Populations: General Theories and Particular Realities. New York: Oxford University Press.

Kuzawa, C. 2012. Why Evolution Needs Development, and Medicine Needs Evolution. International Journal of Epidemiology 41:223–29.

Lane-Claypon, J.E. 1926. A Further Report on Cancer of the Breast with Special Reference to Its Associated Antecedent Conditions. Reports on Public Health and Medical Subjects no. 32. London: HMSO.

Lewontin, R. 2000. The Triple Helix: Gene, Organism, and Environment. Cambridge, MA: Harvard University Press.

Lilienfeld, A.M., ed. 1980. Times, Places, and Persons: Aspects of the History of Epidemiology. Baltimore: Johns Hopkins University Press.

Limpert, E., W.A. Stahel, and M. Abbt. 2001. Log-Normal Distributions across the Sciences: Keys and Clues. BioScience 51:341–52.

Mackenzie, D. 1982. Statistics in Britain, 1865–1930: The Social Construction of Scientific Knowledge. Edinburgh: Edinburgh University Press.

Martin, J., and R. Harré. 1982. Metaphor in Science. In Metaphor: Problems and Perspectives, ed. D.S. Miall, 89–105. Sussex, NJ: Harvester Press.

Marx, K. (1845) 1888. Theses on Feuerbach. First published, in an edited version, as an appendix to F. Engels, Ludwig Feuerbach und der Ausgang der klassischen deutschen Philosophie. Mit Anhang: Karl Marx über Feuerbach vom Jahre 1845. Stuttgart: J.H.W. Dietz. Available at (2002 trans. by Cyril Smith) (accessed June 17, 2012).

Mayr, E. 1988. Towards a New Philosophy of Biology: Observations of an Evolutionist. Cambridge, MA: Harvard University Press.

Mendelsohn, J.A. 1998. From Eradication to Equilibrium: How Epidemics Became Complex after World War I. In Greater Than the Parts: Holism in Biomedicine, 1920–1950, ed. C. Lawrence and G. Weisz, 303–31. New York: Oxford University Press.

Mitchell, M. 2009. Complexity: A Guided Tour. Oxford: Oxford University Press.

Morange, M. 2001. The Misunderstood Gene. Cambridge, MA: Harvard University Press.

Morris, J.N. 1957. Uses of Epidemiology. Edinburgh: E. & S. Livingstone.

Mountain, J.L. 2001. Human Evolutionary Genetics. In International Encyclopedia of the Social & Behavioral Sciences, ed. N.J. Smelser and P.B. Baltes, 6984–91. Oxford: Pergamon. DOI:10.1016/B0-08043076-7/03088-6.

Nash, K., and A. Scott, eds. 2001. The Blackwell Companion to Political Sociology. Malden, MA: Blackwell.

OED (Oxford English Dictionary) online. 2010. Draft revision June. Available at (accessed June 17, 2012).

Parkin, D.M., and L.M.G. Fernández. 2006. Use of Statistics to Assess the Global Burden of Breast Cancer. Breast Journal 12(suppl. 1):S70–S80.

Pearce, N. 1999. Epidemiology as a Population Science. International Journal of Epidemiology 28:S1015–S18.

Pflughoeft, K.J., and J. Versalovic. 2012. Human Microbiome in Health and Disease. Annual Review of Pathology: Mechanisms of Disease 7:99–122.

Piersma, T., and J.A. van Gils. 2011. The Flexible Phenotype: A Body-Centered Integration of Ecology, Physiology, and Behavior. New York: Oxford University Press.

Pizzi, C., B. De Stavola, F. Merletti, R. Bellocco, I. dos Santos Silva, N. Pearce, and L. Richiardi. 2011. Sample Selection and Validity of Exposure-Disease Association Estimates in Cohort Studies. Journal of Epidemiology & Community Health 65:407–11.

Porta, M., ed. 2008. A Dictionary of Epidemiology. 5th ed. Oxford: Oxford University Press.

Porter, T.M. 1981. A Statistical Survey of Gases: Maxwell's Social Physics. Historical Studies in the Physical Sciences 12:77–116.

Porter, T.M. 1986. The Rise of Statistical Thinking, 1820–1900. Princeton, NJ: Princeton University Press.

Porter, T.M. 1995. Trust in Numbers: The Pursuit of Objectivity in Science and Public Life. Princeton, NJ: Princeton University Press.

Porter, T.M. 2002. Statistics and Physical Theories. In The Modern Physical and Mathematical Sciences, ed. M.J. Nye. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521571999.027.

Porter, T.M. 2003. Statistics and Statistical Methods. In The Modern Social Sciences, ed. T.M. Porter and D. Ross. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521594424.015.

Press, D.J., and P. Pharoah. 2010. Risk Factors for Breast Cancer: A Reanalysis of Two Case-Control Studies from 1926 and 1931. Epidemiology 21:566–72.

Quetelet, A. 1835. Sur l'homme et le développement de ses facultés, ou essai de physique sociale. Paris. For a translation, see Quetelet, A. (1842) 1968. A Treatise on Man and the Development of His Faculties. Trans. R. Knox. Reprint, New York: Burt Franklin.

Quetelet, A. 1844. Recherches statistiques. Brussels: M. Hayez (Imprimeur de la Commission centrale de statistique).

Ramsden, E. 2002. Carving Up Population Science: Eugenics, Demography and the Controversy over the "Biological Law" of Population Growth. Social Studies of Science 32:857–99.

Ravdin, P.M., K.A. Cronin, N. Howlader, C.D. Berg, R.T. Chlebowski, E.J. Feuer, B.K. Edwards, and D.A. Berry. 2007. The Decrease in Breast-Cancer Incidence in 2003 in the United States. New England Journal of Medicine 356:1670–74.

Ravdin, P.M., K.A. Cronin, N. Howlader, R.T. Chlebowski, and D.A. Berry. 2006. A Sharp Decrease in Breast Cancer Incidence in the United States in 2003. Breast Cancer Research and Treatment 100(suppl.):S2 (abstract).

Relton, C.L., and G. Davey Smith. 2012. Is Epidemiology Ready for Epigenetics? International Journal of Epidemiology 41:5–9.

Richards, R.A. 2001 (online 2007). Species Problem—A Philosophical Analysis. In Encyclopedia of Life Sciences. New York: Wiley. DOI: 10.1002/9780470015902.a0003456.

Rose, D., and D.J. Pevalin, eds. 2003. A Researcher's Guide to the National Statistics Socio-economic Classification. London: Sage.

Rose, G.A. 1985. Sick Individuals and Sick Populations. International Journal of Epidemiology 14:32–38.

Rose, G.A. 1992. The Strategy of Preventive Medicine. Oxford: Oxford University Press.

Rose, G.A. 2008. Rose's Strategy of Preventive Medicine: The Complete Original Text, with a Commentary by Kay-Tee Khaw and Michael Marmot. Oxford: Oxford University Press.

Rosen, G. (1958) 1993. A History of Public Health. Expanded ed. Introduction by E. Fee; biographical essay and new bibliography by E.T. Morman. Baltimore: Johns Hopkins University Press.

Ross, D. 2003. Changing Contours of the Social Science Disciplines. In The Modern Social Sciences, ed. T.M. Porter and D. Ross, 275–305. Cambridge: Cambridge University Press.

Rossouw, J.E., G.L. Anderson, R.L. Prentice, A.Z. LaCroix, C. Kooperberg, M.L. Stefanick, R.D. Jackson, S.A. Beresford, B.V. Howard, K.C. Johnson, J.M. Kotchen, J. Ockene, and Writing Group for the Women's Health Initiative Investigators. 2002. Risks and Benefits of Estrogen plus Progestin in Healthy Postmenopausal Women: Principal Results from the Women's Health Initiative Randomized Controlled Trial. JAMA 288:321–33.

Rothman, K.J., S. Greenland, and T.L. Lash. 2008. Modern Epidemiology. 3rd ed. Philadelphia: Lippincott Williams & Wilkins.

Sarkar, S. 1996. Lancelot Hogben, 1895–1975. Genetics 142:655–60.

Schank, J.C., and C. Twardy. 2009. Mathematical Models. In The Modern Biological and Earth Sciences, ed. P.J. Bowler and J.V. Pickstone. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521572019.023.

Schlesselman, J.J., and B.V. Stadel. 1987. Exposure Opportunity in Epidemiologic Studies. American Journal of Epidemiology 125:174–78.

Scott, J., and G. Marshall, eds. 2005. A Dictionary of Sociology. 3rd ed. Oxford: Oxford University Press.

Shaghaghi, A., R.J. Bhopal, and A. Sheikh. 2011. Approaches to Recruiting "Hard-to-Reach" Populations in Research: Review of the Literature. Health Promotion Perspectives 1(2):1–9.

Smith, G.D. 2001. The Uses of "Uses of Epidemiology." International Journal of Epidemiology 30:1146–55.

Smylie, J., A. Lofters, M. Firestone, and P. O'Campo. 2012. Population-Based Data and Community Empowerment. In Rethinking Social Epidemiology: Towards a Science of Change, ed. P. O'Campo and J.R. Dunn, 68–92. Dordrecht: Springer Science + Business Media B.V.

Stanley, D., A.E. Phelps, and M.R. Banaji. 2008. The Neural Basis of Implicit Attitudes. Current Directions in Psychological Science 17:165–70.

Steinman, E. 2011. Sovereigns and Citizens? The Contested Status of American Indian Tribal Nations and Their Members. Citizenship Studies 15:57–74.

Stigler, S.M. 1986. The History of Statistics: The Measurement of Uncertainty before 1900. Cambridge, MA: Belknap Press / Harvard University Press.

Stigler, S.M. 1997. Regression towards the Mean, Historically Considered. Statistical Methods in Medical Research 6:103–14.

Stigler, S.M. 2002. The Average Man Is 168 Years Old. In Statistics on the Table: The History of Statistical Concepts and Methods, by S.M. Stigler, 51–65. Cambridge, MA: Harvard University Press.

Stiglitz, J. 2010. Freefall: America, Free Markets, and the Sinking World Economy. New York: Norton.

Strevens, M. 2003. Bigger Than Chaos: Understanding Complexity through Probability. Cambridge, MA: Harvard University Press.

Susser, M., and Z. Stein. 2009. Eras in Epidemiology: The Evolution of Ideas. New York: Oxford University Press.

Svensson, P.-G. 1990. Special Issue: Health Inequities in Europe. Social Science & Medicine 31:225–27.

Sydenstricker, E. 1933. Health and Environment. New York: McGraw-Hill.

Tabery, J. 2008. R.A. Fisher, Lancelot Hogben, and the Origin(s) of Genotype-Environment Interaction. Journal of the History of Biology 41:717–61.

Turner, J.H. 2005. A New Approach for Theoretically Integrating Micro and Macro Analyses. In The Sage Handbook of Sociology, ed. C. Calhoun, C. Rojek, and B. Turner, 405–22. Thousand Oaks, CA: Sage.

U.S. Citizenship and Immigration Services. 2012. Citizenship. Available at (accessed June 17, 2012).

Vona-Davis, L., and D.P. Rose. 2009. The Influence of Socioeconomic Disparities on Breast Cancer Tumor Biology and Prognosis: A Review. Journal of Women's Health 18:883–93.

Wainwright, J.M. 1931. A Comparison of Conditions Associated with Breast Cancer in Great Britain and America. American Journal of Cancer 15:2610–45.

Wallace, T.A., D.N. Martin, and S. Ambs. 2011. Interactions among Genes, Tumor Biology and the Environment in Cancer Health Disparities: Examining the Evidence on a National and Global Scale. Carcinogenesis 32:1107–21.

Walter, J., and R. Ley. 2011. The Human Gut Microbiome: Ecology and Recent Evolutionary Changes. Annual Review of Microbiology 65:411–29.

Weiss, K.M., and J.C. Long. 2009. Non-Darwinian Estimation: My Ancestors, My Genes' Ancestors. Genome Research 19:703–10.

Werskey, G. 1988. The Visible College: A Collective Biography of British Scientists and Socialists of the 1930s. Foreword by R.M. Young. London: Free Association Books.

West-Eberhard, M.T. 2003. Developmental Plasticity and Evolution. New York: Oxford University Press.

Whitehead, M. 1992. The Concepts and Principles of Equity and Health. International Journal of Health Services 22:429–45.

WHO (World Health Organization). 2008. Closing the Gap in a Generation: Health Equity through Action on the Social Determinants of Health. Commission on the Social Determinants of Health—Final Report. Geneva. Available at (accessed June 17, 2012).

WHO (World Health Organization). 2011. Rio Political Declaration on Social Determinants of Health. Rio de Janeiro, October 21. Available at (accessed June 17, 2012).

Wiehl, D.G. 1974. Edgar Sydenstricker: A Memoir. In The Challenge of the Facts: Selected Public Health Papers of Edgar Sydenstricker, ed. R.V. Kasius, 1–17. New York: Prodist, for the Milbank Memorial Fund.

Williams, R. 1985. Keywords: A Vocabulary of Culture and Society. Rev. ed. New York: Oxford University Press.

Wimmer, A., and N.G. Schiller. 2002. Methodological Nationalism and Beyond: Nation-State, Migration, and the Social Sciences. Global Networks 4:301–34.

Winkelstein, W., Jr. 2004. Claypon, Janet Elizabeth Lane- [married name Janet Elizabeth Forber, Lady Forber] (1877–1967). Oxford Dictionary of National Biography. Oxford: Oxford University Press. Available at (accessed June 17, 2012).

Winslow, C.-E.A., W.G. Smillie, J.A. Doull, and J.E. Gordon. 1952. The History of American Epidemiology, ed. F.H. Top. Sponsored by the Epidemiology Section, American Public Health Association. St. Louis: Mosby.

Wright, E.O., ed. 2005. Approaches to Class Analysis. Cambridge: Cambridge University Press.

Wright, S. 1920. The Relative Importance of Heredity and Environment in Determining the Pie-Bald Pattern of Guinea-Pigs. Proceedings of the National Academy of Sciences 6:320–32.

Yeo, E.J. 2003. Social Surveys in the Eighteenth and Nineteenth Centuries. In The Modern Social Sciences, ed. T.M. Porter and D. Ross. Cambridge: Cambridge University Press, Cambridge Histories Online. DOI:10.1017/CHOL9780521594424.007.

Young, T.K. 2005. Population Health: Concepts and Methods. 2nd ed. New York: Oxford University Press.

Zbuk, K., and S.S. Anand. 2012. Declining Incidence of Breast Cancer after Decreased Use of Hormone-Replacement Therapy: Magnitude and Time Lags in Different Countries. Journal of Epidemiology & Community Health 66:1–7.

Ziman, J. 2000. Real Science: What It Is and What It Means. Cambridge: Cambridge University Press.

Zinn, H. 2003. A People's History of the United States: 1492–Present. New York: HarperCollins.





Acknowledgments: No funding supported this work.

Address correspondence to: Nancy Krieger, Department of Society, Human Development and Health, Harvard School of Public Health, Kresge 717, 677 Huntington Avenue, Boston, MA 02115.



The Milbank Memorial Fund is an endowed operating foundation that engages in nonpartisan analysis, study, research, and communication on significant issues in health policy. In the Fund's own publications, in reports, films, or books it publishes with other organizations, and in articles it commissions for publication by other organizations, the Fund endeavors to maintain the highest standards for accuracy and fairness. Statements by individual authors, however, do not necessarily reflect opinions or factual determinations of the Fund.

©2012 Milbank Memorial Fund. All rights reserved. This publication may be redistributed electronically, digitally, or in print for noncommercial purposes only as long as it remains wholly intact, including this copyright notice and disclaimer.