The Pennsylvania State University
The Graduate School
Department of Philosophy
THE ETHICS OF CHANCE
A Thesis in
Philosophy
by
Ernst-Jan C. Wit
Submitted in Partial Fulfillment
of the Requirements
for the Degree of
Doctor of Philosophy
May 1997
Table of Contents
List of Figures
List of Tables
Acknowledgments
Introduction
Chapter 1. Faces of Chance
A. Synopsis
B. The Appearance of Chance
Etymology of chance
Unavoidability of luck
History of probability
C. Coincidence and Chance
Life is full of surprises
Coincidence and explanation
Inference from coincidence
Coincidence: a problem for Bayesians and frequentists
Deliberative rationality: beating the coincidence
D. Theories of Probability
Personal theory
Frequency theory
Chapter 2. Matters of Principle
A. Synopsis
B. Explanation and Responsibility
Who is to blame? --- theory of explanation
Responsibility and chance
C. Chance and Morality
What is Moral Luck?
Rejection of Moral Luck
The drunken driver
Regret --- the ordinary and the extraordinary
D. Justice and Luck: Fair Opportunity
Introduction
Principles of justice
History of the lottery
Personified luck and its dangers
The Lottery in Babylon
Lottery of life
Fair Chance --- a principle of distributive justice?
Principle of fair opportunity
Principle of redress
Chapter 3. Gaming and Deciding
A. Synopsis
B. Morality and Game Theory
A contractarian enterprise?
From distributive to allocative justice
Optimal games
Expectation, a guide to action?
C. Competing Risks
Choice between two evils --- the best safety effort
Cognitive vs. health risks --- competing hypotheses
Chapter 4. Safe shall be my going...
A. Synopsis
B. Risk
History of risk
Some empirical facts on risk-behavior
`Risky' systems
C. Risk, a Free Ride
Risk, tradable good or externality?
Free rider problem of risk
D. Intervention
Insurance
Government intervention
Distributive justice and regulation
Appendix A. Bayesian Calculations
Appendix B. Sample Size and Type II Error in Carcinogen Studies
Bibliography
Figure 1. How drawing a face card of Hearts is a coincidence
Figure 2. Models of moral attribution
Figure 3. Events without explanations
Figure 4. Regret's eroding effect on deliberative rationality
Figure 5. Fairness and optimality
Figure 6. Mixed strategies always have an equilibrium
Figure 7. Relationship between the utility of (C) 1 out of 1 or (D) 1 out of 4 removed
Figure 8. Risk-behavior: risk versus benefit*
Figure 9. Explanatory lotteries
Figure 10. Value of the future
Figure 11. Clear identifiability of hypotheses if enough observations
Table 1. Coin-flip data (I)
Table 2. Coin-flip data (II)
Table 3. Coin-flip data (III)
Table 4. The Sunrise Problem
Table 5. Sunrise data
Table 6. Forms of responsibility
Table 7. Possible moral attributions in the presence of uncertainty
Table 8. Possible Gauguin lotteries
Table 9. `Counter-example' to Rawls' maximin
Table 10. `To go or not to go?' That's the question.
Table 11. Prisoner's Dilemma
Table 12. Binmore and the Prisoner's Dilemma*
Table 13. Possible outcomes in the two bullet games
Table 14. Probabilities of possible outcomes in the two bullet games
Table 15. Survival probabilities
Table 16. Possible results of hypothesis testing
Table 17. Probabilities of the possible results of hypothesis testing
Table 18. Different risk systems
Table 19. Unknown risks are positively selected
Acknowledgments
I would like to thank my advisor, Professor J. J. Kockelmans, for his academic and personal support. I am also grateful to the other members of my Ph.D. committee, Professor K. Chatterjee, Professor S. Kemal and Professor B. G. Lindsay. Without the generous support of the Department of Statistics at the Pennsylvania State University, in particular the head, Professor J. L. Rosenberger, completion of this dissertation would have been very difficult. A final word of gratitude and love goes to my parents for always being there.
The following text is divided into four chapters. Each chapter has its own unique place in the argumentative structure of this thesis. Chapter 1 is the engine of the thesis: it provides the necessary background information and develops the notions and concepts that will be of importance in this dissertation. In order to make the argument lift off, two wings are needed. Chapter 2 is the morally normative wing, whereas Chapter 3 constitutes the technically normative wing of the thesis. The first three chapters together carry Chapter 4, in which the implications of the theory are discussed.
Chance complicates the apparent control that human beings have over their life-world. Although the scientific community has come a long way in taming probabilities, coincidence will always remain unavoidable and is principally irreducible. It comes, therefore, as no surprise that chance affects a human life deeply --- as deeply as a person's convictions, beliefs, and other matters central to her life-plan. If she had not missed her train, she would never have met her spouse. She could have, by chance, grown up in the slums of Brooklyn instead of in a Californian suburb. Had she lived in other times, she might have been a black slave in Georgia instead of an American citizen with equal rights. Had her father been someone else, she would not have been born as a Latino immigrant's daughter, but maybe as the son of Governor Pete Wilson.
Despite the taming of probability during the last three centuries, coincidence can never be confined completely. In the first chapter we shall show that coincidence --- like luck --- differs from chance and probability in its temporal constitution, and is thereby outside human reach. A coincidence is an a posteriori recognition of a synchronic occurrence of two independent --- or partly conflicting --- events, whereas a chance or probability is a matter of a priori anticipation. We shall show in what way the conceptual confusion between coincidence and probability is responsible for incorrect inference. Although we may be able to put probability to work for us, coincidence is a lazy laborer. The only way to beat coincidence is by accepting its laziness and trying to make one's judgments and actions immune to it. How can this be done? How can we beat coincidence and restore some kind of autonomous agency for ourselves? This is one of the central questions of this thesis. In brief, the answer runs as follows: an agent's action is immune to coincidence if she acts in such a way that she will not have to regret her action --- no matter how it finally works out. We shall call this acting in accordance with deliberative rationality.
Sometimes things just occur to a person. How should that affect our moral judgment of her? What does it mean that things just occur to someone? It seems natural to hold her responsible to the extent that she is the cause of what happens. Yet the totality of causes for any event is overwhelming, and it seems impossible to single out the cause. The skepticism that follows may endanger moral judgment. The problem is that things are not as they seem: one drunken driver comes home safely, another kills a pedestrian. In Chapter 2 we shall develop a probabilistic concept of explanation that rescues the concept of responsibility in the presence of uncertainty.
Our theory of explanation is based on frequentist ideas of probability, specifically the concept of homogeneity. We shall show that the homogeneity requirement of the explanans assures that the explanation is relevant even in the presence of uncertainty. Modern theories of moral luck have eroded a meaningful application of the concept of responsibility. These ideas are grounded in an epistemic skepticism about the possibility of attributing events to agents in the presence of chance and luck.
Yet the aim of making morality immune to luck is bound to be disappointed. The form of this point which is most familiar, from discussions of freewill, is that the dispositions of morality, however far back they are placed in the direction of motive and intention, are as `conditioned' as anything else.
Theories of moral luck therefore argue that a person should be responsible for any event that occurs to her, even though the event may be beyond her control --- the concept of control is illusory anyway.
We shall not only show that the concept of moral luck is inconsistent, but also demonstrate that a meaningful concept of responsibility is possible. From our analysis of moral luck we shall derive rules of action that bear a strong similarity to the way to beat coincidence. Human beings should act in such a way as to minimize the maximum anticipated regret of their actions, as the sketch below illustrates. This means that a person should ask herself whether she can still identify with her choice of action, even if the course of events happens to turn out for the worse. In dealing with uncertainty one should be more concerned with the extraordinary than with the ordinary.
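To make the minimax-regret rule concrete, here is a minimal sketch in Python; the actions, states, and payoffs are purely illustrative assumptions and do not come from the text.

# Illustrative sketch of the rule `minimize the maximum anticipated regret'.
# The actions, states and payoffs below are hypothetical.

payoffs = {                    # utility of each action in each state
    "drive": {"fine": 10, "storm": -100},
    "stay":  {"fine":  0, "storm":    0},
}
states = ["fine", "storm"]

# Regret of an action in a state: shortfall from the best action there.
best = {s: max(p[s] for p in payoffs.values()) for s in states}
regret = {a: {s: best[s] - payoffs[a][s] for s in states} for a in payoffs}

# Pick the action whose worst-case regret is smallest.
choice = min(payoffs, key=lambda a: max(regret[a].values()))
print(choice)  # "stay": its maximum regret (10) beats "drive"'s (100)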
The contingencies of human life affect our social-economic place in society. Life is a giant lottery, in which the prizes are handed out at birth. Can this chance distribution be called fair? Did everyone have a fair chance to get the best starting position in life, or not? What does it mean to have a fair chance? Is the fairness of chance a moral principle? We shall argue that the concept `a fair chance' is misleading because chance is morally void. So, the only way the lottery of life can be called fair is when the random distribution at birth is made subject to additional moral rules of fairness; these rules are called the principles of distributive justice. The most important principle of distributive justice is the principle of fair opportunity, which neutralizes the undeserved advantages in the distribution of goods and evils by the redistribution of starting positions. The principle of redress is an active application of the principle of fair opportunity. It redistributes some of the actual advantages of one group to those who are disadvantaged. We shall propose rules for the proper application of the principle of redress.
Theories of Social Choice have proposed game-theoretic and decision-theoretic methods to divide society's pie optimally. Do these considerations of optimality have moral significance? In Chapter 3 we shall study the connection between morality and formal distributive theories. It shall be argued that formal distributive principles --- such as maximization of expected utility --- are empty abstractions and that it is in the interest of the game-theorist and the decision-theorist not to give them any normative value. Economic theory, in general, is at best a set of tautologies that can function as a descriptive model, but never as a set of prescriptive rules for distributive justice.
Game theory, decision theory and statistics are tools for the moral philosopher, and we shall study the implications of these formal theories for moral decision making. Life is a choice-situation under various forms of uncertainty. The decision maker risks having misunderstood the situation, or the situation itself may pose risks. In the second part of Chapter 3 we shall discuss these matters. First, we shall show, with decision-theoretic means, that it is most efficient to reduce the highest risks. Efforts by society to implement safety measures should be coordinated to guarantee that society's money is spent most efficiently. Secondly, we shall reveal risky statistical practices in the Environmental Protection Agency, in which health risks are routinely ignored for the sake of minimizing `cognitive risks', i.e., for the sake of obeying stringent and traditional scientific criteria of statistical hypothesis testing. It is our claim that these health risks deserve serious consideration on the grounds of deliberative rationality; even though there may not be enough statistical evidence to conclude with a high level of significance that a certain substance is a carcinogen, in the face of uncertainty it is our moral duty to be cautious --- a position for which we argue in Chapter 2.
We shall discuss some implications of the first three chapters in the final chapter on risk and safety. On the one hand, the twentieth century has shown a growing awareness of the predominance of risk in human society; on the other hand, the solutions to risk issues have been characterized by technocratic approaches, regulatory failures, and little awareness of moral implications. We could have market forces regulate the risks in society. Libertarians argue for the efficiency and moral superiority of this solution. However, we shall show that risk provides ideal opportunities for free rides that upset the market outcome. Moreover, because risks are as blind as any other chance, there is a need for governmental intervention to neutralize the adverse consequences of risks and to provide a sufficient level of safety for its people and future generations.
Chapter 1. Faces of Chance
Chance has had a tumultuous conceptual history, in which people, more often than not, have not understood what was at stake. However, the pressure of our times in which risk has become a constitutive factor of our lives demands a conceptual analysis of chance. It is an inborn struggle to overcome what is uncertain and out of our control. We attempt to control the circumstances in which we are involved, the actions that we perform, etcetera. Our tales and stories attempt a similar thing. Martha Nussbaum says, "It occurred to me to ask myself whether the act of writing about the beauty of human vulnerability is not, paradoxically, a way of rendering oneself less vulnerable and more in control of the uncontrolled elements in life." Over the course of the last three centuries developments in the theory of probability have indeed increased our control over chance.
However, not all forms of chance have been tamed by technological-scientific progress. Certain forms of `chance' will be impossible to tame by merely technical means. It is necessary in our analysis to address the confusion originating from the conceptual history of chance and from life itself. Nussbaum did not feel the need to explain what she means by luck. "I shall use the word `luck' in a not strictly defined but, I hope, perfectly intelligible way, closely related to the way in which the Greeks themselves spoke of tuche." For our purposes this does not suffice. On the one hand we shall classify concepts such as chance, uncertainty, risk and probability together. They are a priori, quantified or quantifiable measures of randomness. On the other hand, luck, fortune, and, most importantly, coincidence have a distinctly different temporal structure from the first group of concepts. They are a posteriori in nature, and often not quantifiable. Confusion of these two basic sets of categories introduces a great potential for biased inference.
Both Bayesian and frequentist theories of probability are affected if they neglect to make this distinction. In several examples we shall show that both theories need `genuine' randomness to make meaningful inferences. We shall briefly look at what the two theories have to offer, and we shall take important ideas from both theories for subsequent chapters. Bayesian inference will return in the fourth chapter, whereas the condition of homogeneity from the frequentist theory of probability is an important element in the theory of explanation in the second chapter.
The accidental is an ontological concept, defining a condition of a world, and of a lawful world. Only in this lawful infinity can freedom be found or the secular become sacred.
O fortune, fortune! All men call thee fickle.
William Shakespeare ---
Romeo and Juliet (Act 3, scene 5)
The vocabularies of many European languages have the ability to express the accidental occurrence of certain events, having either a positive or negative influence on those who are subject to it. In English these are chance, coincidence, hazard, risk, probability, randomness, luck, and fortune, with activities corresponding to them, such as betting, gambling, and lotteries. We shall look into the etymology of some of these concepts for a hint of a proper application.
In English the word luck expresses a fortunate and, at least partially, unanticipated occurrence. It has been derived from the fifteenth century Middle High German gelücke. Gelücke or its modern German counterpart, Glück, means both happiness and luck. Sie ist glücklich means that she is happy or that she is lucky, and often both. The English word `luck' has the analytic advantage of having only one generic meaning. However, the German Glück is a mark of a pervasive intuition that the conditions of morality are partially the same as the conditions of good luck. Proverbs in several languages celebrate the inner connection between the conditions of fortune and happiness. The English language has for instance: `Lucky men need no counsel.' `Luck is better than wisdom.' `Good luck beats early rising.' This intuition has been explored in several modern ethical treatises on the issue of moral luck, arguing that morality is indeed rooted in the accidental or nowhere at all. Is having Glück (luck) sufficient for Glück (being happy)? Is being happy, i.e., being morally successful, identical with being fortuitously successful? In other words, can we claim moral appraisal for being lucky? We shall answer this question in the second chapter when we deal with the issue of moral luck.
There is another aspect to luck. In Dutch, which is closely related to Middle High German, the verb lukken or gelukken means "to succeed with a certain help of luck." Normally this verb takes events, not persons, as its subject. In Dutch, one says that the revolution lukte (lucked out), rather than that the rebels did. Apparently, luck acts as an impersonal force. It refers to certain essential features of chance, as for instance, being out of the subject's control. Moreover, luck is always attributed a posteriori.
Chance and coincidence both have their etymological roots in the Latin verb cadere, to fall, and its present participle, cadens, falling. This reflects the idea that chance is something that falls down from the heavens onto the people. Chance has often been considered to be authored by God(s) in heaven and thus came --- quite inappropriately, as we shall argue --- to be connected with fate and necessity.
Hazard's etymological derivation lies in the Arabic word for a die, az-zahr. Gambling is an old practice. It goes back at least as far as ancient Egypt, where people used four-sided astragali made from animal heel-bones. The first known tract on gambling was written by the Roman emperor Claudius (10 BC-54 AD). Gambling, from the Old English gamen (to play), has always and everywhere been pervasive among all social classes. Interestingly, the word wedding finds its origins in gambling. The verb `to wed' is related to the Middle High German word for betting. In Dutch wedden still means to bet. According to the historian Johan Huizinga (1872-1945), in Medieval Europe a wedding used to be a contract between two persons who were gambling: the contract specified that one player promised to give away his daughter if he lost the gamble.
In an attempt to produce some income, Florence organized a lottery in the early sixteenth century, which it called La Lotto. Although not the first lottery, it became a model for all subsequent lotteries and its name became a noun. In a lottery the prizes are specified in advance and the random activity lies in the distribution of these prizes. When we discuss the issue of distributive justice, we shall use the model of a lottery to present the problem of whether fairness exists in the presence of chance.
The concept of risk was first employed in the fifteenth-century merchant vocabulary of Central and Western Europe, where it stood for financial speculation. The concept developed in the Italian city states, probably from the verb rischiare, meaning to endanger or to wager. The verb itself originates, via risicare, from the Greek riza (ῥίζα), which means cliff. Risicare came to mean to go around the cliff, much in the same sense as the noun clipper is intended, and consequently it acquired its contemporary meaning, rischiare, to speculate. Until the nineteenth century the use of risk was exclusively in the domain of economics. During the same period there was an important shift of meaning. Whereas previously the idea of risk had been connected with the hope and expectation that the one who financed the operation placed in the individual who executed it, in the course of the seventeenth and eighteenth centuries the newly gained understanding of probabilities and mathematical expectations transformed the concept of risk into long-run average wins and losses, detached from individuality.
Aristotle's definition of a human being as a rational animal describes a particular human tension constitutive of its being. The tension stands for the complex interaction between spirit and matter. Reason, by its very nature, is colonial, imperial. It aims to take hold of whatever is presented to it. Also, reason is reflexive, bends back on itself, and in this move it can recognize its boundaries; these boundaries do not exist to make life more interesting, but constitute a fundamental separation of the mind from the beyond, i.e., that which fundamentally transcends reason. One such transcendent aspect is luck. Our laws of chance go only so far. The cognitive separation of the present from the future, and that of knowledge from ignorance, constitute two domains of luck, which present themselves as transcending aspects to reason, which in reflection recognizes them as its boundary.
That luck transcends the imperial grasp of reason, i.e., the absence of predictive power, produces for human psychology some sort of `suspenseful interest.' Suspense can be seated in the unexpected --- at least from the point of view of someone who undergoes it. We can find pleasure in the unexpected, at least if it is at some spatial or temporal distance. Unless a person is suicidal, she would not consider it pleasurable to be caught by surprise in a hurricane at sea. If she identifies herself with a character in a movie that is caught in a similar hurricane, she might find it suspenseful. This is Aristotle's catharsis, the purification of the emotions, at its best. Or, if she re-experiences the event while telling it to friends, after all has been said and done, there could very well be a pleasurable aspect to it. Otherwise, there is something stressful about luck and the unexpected. Both confront us with our rational limit.
Rescher misconstrues the psychological dimension of luck as its ontological essence.
We need (and apparently do actually have) a balance --- a world that is predictable enough to make the conduct of life manageable and --- by and large --- convenient, but unpredictable enough to make room for an element of suspenseful interest.
The emotional aspect of luck should be distinguished from the nature of luck. It is rather a platitude to assert that "we would find it horrible to live in a luckless world," as Rescher does. He confuses the psychological realm with the epistemic realm. It might sound appealing to assert that we would not want to eliminate luck from our lives. However, in a luckless world there would be no we. Luck is the cutting edge of the time line on which human beings experience themselves; luck is the expression of human finitude, both in time and in rational capabilities. To eliminate luck, by means of a thought-experiment, is to eliminate, ontologically and epistemically, humankind --- consequently, there would be no sense in talking about whether we would like such a situation or not. It is a mere phantom to think that we need unpredictability or otherwise we would bore ourselves to death. Unpredictability does not exist to suit human needs. It exists because otherwise there would be no finitude and certainly no humankind.
History of probability
Classical probability theory arrived when luck was banished.
When de Méré asked Pascal how the stakes should be divided between two gamblers if a game of chance is interrupted, he unknowingly broke new theoretical ground. First of all, is it possible to subject what is a matter of mere chance to solid calculation? And how much does one stand to lose, or stand a chance to win, in the face of uncertainty? In answering those questions, Pascal developed the notion of expectation as a type of fair exchange or contract: a certain amount for which it is fair to forego the gamble. During the same period, seventeenth-century Europe faced a spiritual crisis. The ideal of certain knowledge was undermined by reformation and skepticism, and religious belief was under attack. Pascal made a strong attempt to resist this charge with the new means that he had created for the problem of dividing a stake. Pascal introduced a wager betting on the existence of God. Pascal showed that it was in the gambler's selfish advantage to act as if he believed in God, because the expected gain of believing in God was much higher than that of not believing.
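Pascal's expectation reasoning can be made concrete in a small sketch; the probability and the payoffs below are hypothetical stand-ins, since Pascal's own argument turns on an infinite reward that dominates every finite stake for any positive probability.

# Hedged sketch of Pascal's wager as an expectation calculation.
# The probability p and the payoffs are illustrative assumptions.

p = 0.01                      # assumed probability that God exists
reward, cost = 10**6, 1       # hypothetical gain of belief, cost of piety

ev_believe = p * reward - (1 - p) * cost
ev_doubt = 0.0                # by assumption: neither gain nor cost

print(ev_believe > ev_doubt)  # True: believing has the higher expectation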
The development of the concept of probability has shown two essential features that were already present in its seventeenth-century origins: the connection between probability and the magnitude of harm or profit --- the issue of risk --- and the interrelatedness of the problems and solutions of probability. Pascal was confronted with two distinct issues that he tried to solve in a similar way using revolutionary new concepts. These origins of probability as a scientific concept foreshadowed the way in which probability would conquer, or rather, colonize the world in the course of the following centuries. Pascal showed that a concept fruitfully applied in one realm can be used to solve problems in a completely different realm. Probability theory hopped from one field to another, sometimes bluntly applying the results it had found in one to problems of the next. Probability acted like a parasite, settling there where it could gain most.
From Pascal's time on the dice were, literally and figuratively, rolled. Probability came to be applied in a growing number of areas: it assisted gamblers, it modeled uncertain evidence in a courtroom and it described the life-times of people, which enabled insurance companies to be more competitive. In the beginning the theory was thought to describe the reasonable intuitions of an impartial judge or a canny merchant. In the eighteenth century we can see a shift in the direction of prescription rather than description. People such as Laplace showed that probability theory often superseded intuitions, and by the beginning of the nineteenth century it had become a tool rather than a model for enlightenment. In the course of the nineteenth century, two branches started to grow out of a common root: the calculus of probabilities. On the one hand, people such as De Morgan tried to encapsulate probability `finally' within the safe realm of mathematics, whereas, on the other hand, sociologists like Quetelet applied the new theoretical findings --- the law of large numbers and the central limit theorem --- to social phenomena, creating a statistical-empirical theory. This movement resulted in a complete separation in the second quarter of the twentieth century, when Kolmogorov wrote his Foundations of the Theory of Probability and when around the same time Fisher established the first sound theory of statistics. In the same period, Von Neumann and Morgenstern applied the newly gained insights in the development of Game Theory, which explicitly dealt with making decisions under uncertainty, thereby completing the circle back to Pascal's problem of the wager.
Where observation is concerned, chance favours only the prepared mind.
Louis Pasteur (1822-1895)
This keen observation by Pasteur will be the central theme of this section. Chance only exists by virtue of anticipation. To think that chance is `out-there-to-be-discovered' is a misconception. Coincidence will be distinguished from chance and it will be shown that chance --- not coincidence --- should be the concern of the statistical sciences.
Another candidate for expressing the randomness of life is the concept of coincidence. At first sight it seems to be an appropriate term for describing the essential features of chance. "It is such a coincidence," an older student told me one day, "that our daughter has her birthday one day before my husband, and my son-in-law has his birthday a day before mine." Indeed, it is a coincidence, I nodded. "But," I told her, "although it is definitely a coincidence, it is not chancy in a strict way." She shook her head and explained to me that this event was very unlikely to happen, so how could I say that it was not chancy at all? Well, understand me well: it was indeed unlikely to happen prior to the facts, but after the facts the concept of probability loses its validity.
Coincidence functions in a distinctively different temporal-semantic framework from chance. Chance is only meaningful without information about the actual occurrence or non-occurrence of the event, typically because the event lies in the future. Coincidence, on the other hand, pertains to either present or past. Coincidence is derived from the Latin con incidere, to fall together. The contemporary use of coincidence has preserved that meaning. A circumstance is called a coincidence when two events --- unrelated rather than unlikely --- fall together, i.e., happen at the same time, coincide. I returned to my hometown, and met my favorite Latin teacher in a shop. That was a coincidence. He and I happened to be in the same store at the same time; to be sure, it would have been a similar coincidence if I had met an old school friend there. It would have aroused in me the same kind of surprise and would have stimulated me to utter the same expression: "What a coincidence!" In the case of my student, she would have been equally surprised if her daughter and son-in-law had had their birthdays switched, or if her daughter had had her birthday exactly half a year before hers and her son-in-law exactly half a year before her husband's birthday, or some other pattern.
The semantic structure of coincidence is different from that of chance. Coincidence is essentially a posteriori, while chance is an a priori property. For sure, it is possible to anticipate a coincidence in saying that it would be a coincidence to meet an old friend in a bar in New York. And, similarly, we can talk about chance after the fact, in saying that it was unlikely that that specific high school teacher would be in that store at that time. In doing so, we can easily create the misconceptions that are so prevalent. Often people think of a coincidence as something unlikely, but we claim that this is not so. Is it unlikely that I would meet an old friend in a bar in Soho if every afternoon I have a coffee and often in the evening I have a beer there? A little probability theory suffices to show that over the length of my lifetime I am, in fact, very likely to meet an old friend there. Nonetheless, it is completely justified to call this a coincidence, because if I meet John there three years from now, then my presence and his presence will coincide, will happen to fall together.
The character of something odd pertains to the nature of a coincidence, but that is not equivalent to being unlikely. It is the rarity or uncommonness of the combination of two events that yields a surprise, and that surprise makes us exclaim: "What a coincidence!" In the next section we shall formally clarify how a coincidence does not have to be unlikely, but should have a certain aspect of oddity.
Surprise requires the upsetting of anticipation, and thus assumes both form and content. It is the mark of an experience for which data are presumed. It does not challenge them in principle, nor demand an explanation of them in principle.
The determinist model of causality has often been taken as a model for explanation. In Chapter 2 we shall undermine such an attempt and put forward a more modest, probabilistic model. Here we shall give a formal formulation of this probabilistic model of explanation in order to show that coincidence can be understood in terms of explanation, and vice versa. The complementarity of explanation and coincidence foreshadows the main idea of the next section, namely, that coincidence cannot be the basis of statistical inference. Instead, genuine randomness is required.
Before we get to a formal definition of coincidence, we should address the question of how the idea of rarity pertains to a coincidence. The argument that a too incoherent coincidence or surprise is "the antithesis of the objective" relies on a deterministic view of the world. This idea itself is, however, incoherent. Instead, the world in which we live is full of uncertainty, chaos and choice. If one is committed to a frequentist definition of probability, as we are, does rarity not automatically imply a low probability? In the next section we shall argue that what instigates a surprise has a problematic relationship with the concept of probability. Here we shall show that a coincidence need not be connected with a low probability.
Any attempt to specify a proper level of unlikeliness for a coincidence is doomed to fail, just as the attempt to specify a proper level of likeliness for what can count as a fact has failed. If one draws a card out of a deck consisting of the four cards shown in Figure 1, then it would be a coincidence if the card is a face card of Hearts (a face card of Spades would have been more in line with the expectation), whereas it would not be a coincidence if the card were a number card of Hearts (because the deck contains no number card of Spades). However, both events have the same probability, one fourth, of happening. Apparently, it is impossible to specify an absolute level of unlikeliness for something to be a coincidence.
Figure 1. How drawing a face card of Hearts is a coincidence
We take a hint from Owens' approach, in which a coincidence is defined in terms of independence between the constituents of the coincidence-event.
An event is a coincidence, if and only if, it can be naturally divided into parts which are such that the (temporally prior) conditions necessary and sufficient for the occurrence of one part are independent of those necessary and sufficient for the occurrence of the other.
This definition seems to grasp at least one aspect of a coincidence. The fact that my birthday and my uncle's birthday happen to fall on the same day is a coincidence. The constituent conditions of the two components of this event are clearly independent of one another. In our definition of coincidence we shall go one step further. If two events that are `negatively correlated' both occur at the same time, then we would also like to speak of a coincidence. It is a matter of coincidence that Brenda comes home safely if she drives drunk, because she comes home safely despite her drunk driving. Similarly, if an unqualified doctor performs an operation, the patient's recovery will be a coincidence because the patient recovers despite the fact that an unqualified doctor operated on her.
Therefore, the definition of a coincidence should include, besides the notion of independence, the idea that the components of the event could occur despite one another. We define a coincidence as an event that can be analyzed into two or more components, such that the occurrence of one component does not raise the probability of the others, i.e.,

an event E = E1E2 is a coincidence, if and only if,

(i) P(E1 | E2) ≤ P(E1), and (ii) P(E2 | E1) ≤ P(E2).
If E1 and E2 are non-vanishing events, then condition (i) implies condition (ii), and vice versa. In the above examples these conditions are fulfilled. If two events, E1 and E2, are independent, then P(E1 | E2) = P(E1), and so E1E2 is a coincidence. In the four-card example, the event E1E2, where E1 = {Drawn card is a face card} and E2 = {Drawn card is of Hearts}, is a coincidence, as

1/2 = P(E1 | E2) ≤ P(E1) = 3/4.

The event E2E3, where E3 = {Drawn card is a number card}, is not a coincidence, since

1/2 = P(E3 | E2) > P(E3) = 1/4.

Also, if E1 = {Brenda comes home safely} and E2 = {Brenda drives drunk}, then obviously P(E1 | E2) ≤ P(E1). Thus it can be called a coincidence that one comes home safely while driving drunk. The same relation holds if E1 = {Patient recovers} and E2 = {An unqualified doctor performs the operation}. The probabilistic definition grasps the salient features of coincidence, and it can easily be extended to the multi-component case.
With the newly gained insight into the nature of coincidence it is possible to make a connection to the concept of explanation. In the next chapter we shall study the theory of explanation in greater detail. Here we shall provide its basic idea. An event E1 explains another event E2 if it is more likely for E2 to happen in the presence of E1 than by itself alone, i.e., P(E2 | E1) > P(E2). So we say that driving drunk explains having an accident, and driving drunk does not explain arriving home safely. The latter is said to be a coincidence. Therefore, two events are a coincidence, if and only if, the events do not explain each other. This follows immediately from the definitions of coincidence and explanation. The idea of complementarity of explanation and coincidence will be the central theme in the following section on statistical inference.
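These two definitions can be checked mechanically. The following sketch, offered purely as an illustration, encodes the four-card deck of Figure 1 in Python and verifies the inequalities computed above.

# Sketch of the probabilistic definitions of coincidence and explanation,
# applied to the four-card deck of Figure 1: three face cards (one of
# Hearts, two of Spades) and one number card (of Hearts).

deck = [("face", "hearts"), ("face", "spades"),
        ("face", "spades"), ("number", "hearts")]

def prob(event):
    """Probability that a card drawn from the deck satisfies `event`."""
    return sum(1 for card in deck if event(card)) / len(deck)

def cond(a, b):
    """P(A | B), assuming P(B) > 0."""
    return prob(lambda c: a(c) and b(c)) / prob(b)

face = lambda c: c[0] == "face"
number = lambda c: c[0] == "number"
hearts = lambda c: c[1] == "hearts"

def coincidence(a, b):  # definition: P(A | B) <= P(A)
    return cond(a, b) <= prob(a)

def explains(a, b):     # A explains B iff P(B | A) > P(B)
    return cond(b, a) > prob(b)

print(coincidence(face, hearts))    # True:  1/2 <= 3/4
print(coincidence(number, hearts))  # False: 1/2 >  1/4
print(explains(hearts, number))     # True: the two explain each other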
"What a coincidence!" is an expression that is not exclusively limited to the domain of the life-world, but also plays an important role in scientific discovery. "Eureka," Archimedes exclaimed when his answer coincided with the true explanation of the problem he was presented. In science there are many instances in which coincidence has played a constitutive role in discovering scientific explanations that furthered the progress of science. Antoine-Henri Becquerel (1852-1908), a member of a famous French family of physicists, happen to discover certain rays (or radioactivity as we, with Marie Curie, would be calling it) when he was working with his photographic plates. He published several articles on what was called "Becquerel rays" in 1896 and 1897, but he left the field because his rays did not seem as interesting as previously discovered X-rays. He himself did not see the importance of upon what he had stumbled. It had just been a coincidence. A similar coincidence occurred to Alexander Fleming (1881-1955), when he discovered antibiotics while experimenting with yeast molds. "One sometimes finds what one is not looking for," he said later.
In the twentieth century statistics has been developed as a scientific method in order to systematize fortuitous discoveries, to find hidden `trends' in observations. The history of statistics began in the nineteenth century in many branches of science. The Belgian sociologist Adolphe Quetelet (1796-1874) applied probabilistic reasoning to social phenomena. Discussions in biology about the nature of human beings and animals used a vast array of concepts, such as `chance', `design' and `spontaneity', in unspecified ways. This discussion was subsumed in Darwin's theory of evolution, from which the statistically oriented Biometric School and the more probabilistic Mendelians sprang. Even in physics statistics was introduced to describe the behavior of masses of atoms. In those last decades of the nineteenth century statistics made its first clumsy baby steps in an attempt to rid science of coincidence and expand the region of explanation. When Fisher in the twenties and thirties of this century developed a consistent statistical theory in the field of agriculture, statistics became a separate science. In many experimental sciences statistics has become an irreplaceable tool. Statistics has in many cases even changed the nature of the methodology in those fields. Statistics became a normative criterion of the maturity of several experimental sciences. Psychology is an example of a science that has changed radically over the course of the last sixty years under the influence of statistics.
It has been the promise of statistics to reveal genuine explanation in the region where there used to be mere coincidences. We concentrate on the issue of statistical inference to illustrate in what way coincidence enters the field of science in modern times and to argue that it is essential to make a distinction between coincidence --- such as the scientist might perceive a particular result of an experiment --- and chance --- in the way that the statistician should perceive the same result. For the scientist a regularity seems meaningful, but for a statistician regularity should be compared carefully with a chance result --- it may be a mere coincidence.
In statistical inference conclusions are drawn on the basis of data that are available to the scientist. The data that are presented to the scientist allow her to find regularities and check whether these regularities are significant or could be attributed "to chance." In regression, for instance, one tries to find a statistical relationship between two (or more) variables and check whether this relationship is statistically significant. In hypothesis testing the procedure is a little more complex. The statistician has to define a certain operational feature of the characteristic in which she is interested. This is called the test-statistic, whose value can be empirically determined on the basis of the data (for instance, in order to see whether a coin is somehow manipulated, she could count the number of heads vs. the number of tails, or test the number of switches from heads to tails in the sequence). Then the statistician determines whether this number indicates enough evidence to reject the belief in the presence of the characteristic under scrutiny.
Statistical inference by means of hypothesis testing decides whether the "conservative" hypothesis can be rejected on the basis of the pre-specified level of significance and the evidence as contained in the data. In order to avoid the confusion of true probabilities with what can be coined "surprising coincidence," it is essential to specify the levels of significance and the hypotheses prior to the collection of data. Neglect of this procedure may make the scientist susceptible to a hindsight bias.
It is not uncommon in the practice of science that hypotheses are specified only after the experiments have been done and after the experimenter has gathered a sense of the responses. Historically, this is the way in which science progressed to its current height. From Francis Bacon's tables full of data from his experiments, to Isaac Newton, who saw the apple fall from the tree, science has been involved in a form of reasoning that fitted its conclusions to what it saw happening. This form of reasoning has generally been called inductive reasoning. C. R. Rao observes that for a long time "inductive reasoning remained more as an art with a degree of success depending on an individual's skill, experience and intuition."
Statistics has generally been considered the continuation and perfection of inductive reasoning --- statistics is sometimes called inductive logic. There is, however, also an important distinction between the old and the new form of induction. Coincidence could be a constitutive element of the old form of induction, as the examples above showed. However, coincidence in the new model principally biases the stochastic models, and as a result warps the scientific conclusions. Once a statistician has seen the data, subsequent inference carries an inherent danger of bias. For instance, the hypothesis to be tested and the test statistic should be chosen under a veil of ignorance, i.e., randomly with respect to the data. If this is not the case, then --- although no warning shows up in the quantitative analysis --- the numbers lose their probabilistic significance. One could try to incorporate the bias explicitly in one's analysis by formulating the hypothesis as a joint hypothesis stating the instances in which one's surprise would instigate further statistical testing. What counts as a surprise depends on human psychology, and would probably be impossible to explicate. To be sure, the joint hypothesis will bring down the significance --- maybe to a level of insignificance.
We shall show that basing either the formulation of the hypothesis or the choice of the test statistic on the experimental data will, in unpredictable ways, decrease the significance of one's test without becoming quantitatively visible in one's calculations. In the following simple example we shall test whether a certain coin is fair. We state the conservative hypothesis H0 as "The coin is fair." We neglect to specify the test statistic and we start collecting our data without delay; the results are recorded in Table 1.
Table 1. Coin-flip data (I)
data
HHT
HHT
THT
HTT
HTT
An experimenter looks at this sequence and decides to take the number of tails that finish a sequence of three as the test statistic. It may seem silly, and it is definitely not the most powerful test, but it is a perfectly fine test statistic. Under the null hypothesis the test statistic has a binomial distribution with n = 5 and p = 1/2. The experimenter observes that 5 tails occurred at the end of the 5 sequences. The so-called p-value is (1/2)^5 = .031, which might persuade an uncritical reader to believe that the coin is biased. However, in reality, it was the scientist who was biased. In the twenties and thirties the British statistician Sir Ronald Aylmer Fisher (1890-1962) wrote about the importance of a proper design of statistical experiments. Even thirty years before that, the American philosopher Charles Sanders Peirce (1839-1914) made these observations:
If the major premiss, that the proportion r of the M's [e.g., flips] are P's [e.g., heads] be laid down first, before the instances of M's are drawn, we really draw our inference concerning those instances (that the proportion r of them will be P's) in advance of the drawing, and therefore before we know whether they are P's or not. But if we draw the instances of the M's first, and after the examination of them decide what we will select for the predicate of our major premiss, the inference will generally be completely fallacious.
Proper probabilistic inference can only result from an intentional, a priori mental process. It is essential that, in Peirce's terms, the predicate of the major premise is determined before information is gathered.
Still, people might be confused and might be wondering whether it was a coincidence that the last of all sequences of three was a tail. Indeed it is a coincidence; we do not deny that. It is surprising to see that five tails constitute the end of each sequence. It might motivate further experimenting, specifically aimed at determining whether there is significant evidence that the third toss of a sequence with this coin tends to be a tail. From the moment that we specify our hypothesis and test-statistic, the semantic structure of the future chain of events changes; the sequence of coin tosses loses the semantic openness that allowed it to yield surprises. Surprise is an essentially a posteriori concept that goes hand in hand with coincidence. The expression goes that life is full of surprises, and that is true in a very literal sense of the term. Retrospectively we can always describe a certain sequence of events as being exceptional in some sense. The above sequence of fifteen coin tosses already allows 2^15 = 32,768 different descriptions, whereas a sequence of 130 coin tosses generates a number of descriptions that is larger than the number of particles in the universe. Certainly, several of these descriptions are considered rare from a prior probabilistic point of view, but that a rare description of the actual sequence of events is possible is almost certain.
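The size of this hindsight effect can be checked by simulation. In the following sketch, offered as an illustration only, a fair coin generates the five triples of Table 1 over and over; an opportunistic experimenter then looks for any position whose outcomes happen to be unanimous and reports the seemingly significant p-value of (1/2)^5 = .031.

# Simulation sketch of the hindsight bias in the Table 1 experiment.
# A fair coin is flipped in five triples; the experimenter looks at the
# data first and then picks whichever position (first, second or third
# flip of each triple) happens to be unanimous in either face.

import random

random.seed(1)
trials = 100_000
lucky = 0
for _ in range(trials):
    triples = [[random.choice("HT") for _ in range(3)] for _ in range(5)]
    # Is some position unanimous across the five triples?
    if any(len(set(column)) == 1 for column in zip(*triples)):
        lucky += 1  # a nominally significant `p = .031' can be reported

print(lucky / trials)  # about 0.18, almost six times the nominal .031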
The conceptual confusion between chance and coincidence has played a disruptive role in the acceptance policy of publications in scientific journals. After statistics became the `official' methodology of several fields of science in the forties and fifties, it became the policy of the Journal of Experimental Psychology and other journals that no article would be published if it did not reject the null hypothesis at a significance level of five percent or, preferably, of one percent. This policy led to some rampant examples of fraud with test-statistics and hypotheses, not always out of ill will and, in fact, often due to striking misunderstandings of statistics.
However, even if no hindsight bias were present in the construction of the null hypothesis or the choice of the test statistic, this policy would still introduce another bias. Think of a test of a certain null hypothesis that is actually true --- although the statistician does not know this fact because it is epistemically hidden from her. Assume that a generation of scientists attempts to disprove the null hypothesis and that they all use the same test-statistic, which rejects the null hypothesis with probability 0.01 given that the null hypothesis is true. Thus, each experiment can be considered a flip of a coin, which with probability 0.99 falls on `justly accepting the null hypothesis' and with probability 0.01 on `unjustly rejecting the null hypothesis.' Then, if a journal accepts only publications that significantly reject the null hypothesis, it is only a matter of time before the false statement, "the null hypothesis is false," will be `proven' without the opportunity for other scientists to present counter-evidence. Although most journals have abandoned this bias-inducing policy, it is still not uncommon that articles that do not reject the null hypothesis fail to be published. These articles are dropped somewhere in the process of experimenting, writing, and reviewing.
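This publication filter is easy to sketch; the numbers below are arbitrary assumptions.

# Sketch of the publication filter described above. A true null
# hypothesis is tested independently by many scientists; the journal
# publishes only the experiments that reject it at the 0.01 level.

import random

random.seed(2)
experiments = 1_000
published = sum(random.random() < 0.01 for _ in range(experiments))

print(published)  # roughly 10 spurious rejections reach the journal,
                  # while the ~990 correct non-rejections remain unseen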
Probability is often misunderstood. The origin of the difficulty is the asymmetry of probability with respect to time. A probability is a property of an event that changes when it is observed. A probability depends on the state of mind of the observer, or, more carefully, the conclusion we can draw from the occurrence of a certain random event depends on the antecedent state of mind of the observer. Someone is carelessly tossing a coin, and suddenly she notices that heads showed up ten times in a row. She is surprised. Is her surprise justified? Yes, because surprise is a mark of coincidence. However, surprise is untouched by chance, because chance possesses the aspect of anticipation, which is exactly what a surprise lacks. Any sequence was equally likely to occur, and a sudden sequence of heads is to be expected if one keeps tossing a coin. On the other hand, if she had formulated a hypothesis prior to the tossing, say, "heads are more likely to occur than tails," then the occurrence of ten heads would have allowed her to draw a statistical conclusion with a level of significance, p = 0.00098.
She would commit a hindsight fallacy if she --- without prior formulation of a hypothesis --- believed that the occurrence of ten heads could lead to the same significant conclusion as under a prior hypothesis. Is it not a common practice in science that the scientist attempts retrospectively to find regularities in the data? Our point is relatively simple and can be expressed in straightforward mathematical language. If a scientist formulates her hypothesis on the basis of the data and then tries to test whether this pattern could be attributed to chance, she does not calculate the probability that this specific pattern of data occurred, P(pattern), which might be small. She calculates the probability that this pattern would occur given the data, P(pattern | data), which could be as high as one.
We shall formulate our objection in a more subtle way: the method that makes use of a retrospective study of the data cannot reach the same significance level as a prior formulation of the hypothesis. Kant recognized that the human mind is teleologically organized. Human beings tend to interpret the world around them as purposeful. The scientific mind performs a similar action when observing data: it tries to find regularities. This is an important observation, as it makes clear that when we study the sequence of coin tosses we do not look only for a sequence of heads, but for any kind of regularity. Our retrospective hypotheses should therefore be formulated as follows:
H0 : no true regularity
H1 : true regularity
And the alternative hypothesis thus covers a wider region, which has as a necessary consequence that the significance of the test cannot be as sharp as before. It is a matter of psychology to determine what our minds count as a surprise and what would induce us to perform a statistical test. We perform, in fact, a test with a joint hypothesis, for instance:
H1 : more heads than tails, or, more tails than heads, or, more switches from heads to tails and back, or, more groups of two, or ..., etc., than usual.
Hardly any scientist, however, will find it interesting to test the fairness of a coin, and therefore the relevance of our previous remarks may seem questionable. We would like to clear ourselves of this accusation by showing that similar methods are still widespread in applied statistics. Look at the following excerpt from an introduction to applied statistics. When the author wants to indicate the importance of numerical and graphical data-description prior to statistical inference, he gives two examples. After the first example he continues,
Similarly, in developing an economic forecast of new housing start for the next year, it is necessary to use sample data from various economic indicators in order to make such a prediction (inference). In both of these examples involving an inference, description of the sample data is an important step leading toward the inference that we make. Thus no matter what our objective, statistical inference or data description, we must first describe the set of measurements at our disposal.
As we have seen in the case of the coin, any prior knowledge about our actual data-set might affect our choice of test-statistic, or it could lead to the inclusion of a certain variable in forecasting. To what extent this happens is unclear. When the scientist observes the data in some form, a possible bias slips into her head.
Take the following example of two data sets of lengths of hospital stays at two different hospitals. When we view the data in a graph, it seems that hospital A generally keeps its patients longer for observation than hospital B does. To see if we can make a significant inference, we perform a one-sided test in which we compare the two mean hospital stays:
H0: μA = μB,
H1: μA > μB.
Clearly, the choice of a one-sided test was inspired by the data themselves (or a graphical description thereof), which means that we used the data more than once for the same inference. The significance level of this inference, commonly called the p-value, will therefore misrepresent the true significance of the data. The inference becomes a black box where data come in and meaningless numbers come out. The general drawback of seeing a description of the data before making inferences is that it affects significance levels in an unknown fashion.
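The sketch below simulates this double use of the data under a true null hypothesis; the sample sizes and the distribution of hospital stays are illustrative assumptions. An analyst who first looks at which sample mean is larger, and then tests one-sided in that direction at the 5 percent level, rejects about twice as often as the nominal level promises.

# Simulation sketch of the hospital example. Both hospitals draw stays
# from the same distribution, so the null hypothesis is true; `peeking'
# always points the one-sided test in the lucky direction.

import random
from math import sqrt
from statistics import mean, stdev

def z_stat(xs, ys):
    """Two-sample z statistic for the difference of the means."""
    se = sqrt(stdev(xs)**2 / len(xs) + stdev(ys)**2 / len(ys))
    return (mean(xs) - mean(ys)) / se

random.seed(3)
crit = 1.645                   # one-sided 5% critical value (normal)
trials, rejections = 2_000, 0
for _ in range(trials):
    a = [random.gauss(7, 2) for _ in range(50)]  # hospital A stays (days)
    b = [random.gauss(7, 2) for _ in range(50)]  # hospital B stays (days)
    if abs(z_stat(a, b)) > crit:  # peeking makes the test two-sided
        rejections += 1

print(rejections / trials)     # about 0.10: double the nominal 0.05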
This discussion of statistical hypothesis testing pinpoints a more general point. Chance and coincidence are different entities. Something is not a coincidence because it is chancy. Chance has a sense of anticipation, whereas coincidence is a retrospective notion. This observation is not a philosopher's phantom; it is part of our everyday language. We say that it was a coincidence that Becquerel discovered radioactivity, or that I met a friend in a cafe in New York. From our use of language it is clear that coincidence is retrospectively attributed to a certain event that is considered to be rare in some sense. Even when we say, "It would be a coincidence if I meet my friend in a cafe in New York," the identification of the coincidence can clearly only occur after the event has actually happened. Chance, on the other hand, is an inherent aspect of a situation prior to any development.
Coincidence: a problem for Bayesians and frequentists
In this section we shall show, by means of a simple example, that coincidence is a problem for both Bayesians and frequentists. Whereas frequentists may go wrong in hypothesis testing, Bayesians face an even greater danger. Bayesian theory argues that specifying a subjective numerical probability of an event is always possible, permissible and even inherently part of the furniture and functionality of the human mind. In what follows we shall argue that certain probabilities are fundamentally meaningless because the Bayesian fails to distinguish between a probability and a coincidence. The argument here is aimed at the universal ambitions of the Bayesian theory --- an attitude that has been coined Bayesianism.
Bayesians have no conceptual problem, for instance, in constructing the probability that Gauguin becomes a successful painter. Williams argues that any such attempt is absurd. "What is reasonable conviction supposed to be in such a case? Should Gauguin consult professors of art?" This is not a mere rhetorical argument. Any subjective approach to chance may result in probabilistic bias if no distinction has been made between prior probabilities and posterior results.
We shall quickly recapitulate the problem of hindsight in the case of hypothesis testing, by means of an example. Bias is introduced into a statistical conclusion if one fails to construct a prior mental model of the results. We shall then show that the situation is analogous for Bayesian estimation and that the universal aspirations of Bayesian theory put the validity of its results in peril. We shall give two examples of the dangers of Bayesian theory, which rest essentially on the confusion of coincidence with chance.
Suppose Andy and Brenda are told that a coin is going to be flipped eight times. Andy decides to check whether the coin is biased toward changing sides. Brenda does not give it a second thought and just observes the experiment. They both receive the data as recorded in Table 2.
Table 2. Coin-flip data (II)
flip   1  2  3  4  5  6  7  8
data   H  T  H  T  H  T  H  T
Andy starts testing his hypothesis. He wanted to check whether there was a bias toward changing sides, thus adopting the null-hypothesis:
H0: the coin is not biased toward changing sides.
According to conventional statistical theory, he should then evaluate the probability under the null-hypothesis that the observed sequence would happen. The so-called test-statistic is:
T = # of changes of sides.
Note that T has the value seven, as the coin changed sides seven times. Andy calculates the probability that such an event could have occurred by chance, i.e., P(T ≥ 7 | H0) = 1/128. Thus only once in 128 times would one `expect' such a result to occur with an unbiased coin. With an ordinary significance level --- normally chosen around α = 0.01 --- Andy will conclude that the null-hypothesis should be rejected, i.e., that the coin is biased toward changing sides. Brenda, on the other hand, did not make any hypothesis about the outcome. She can be surprised about the outcome, but she cannot make any probabilistic inference based on her surprise. Any outcome is unique, and any outcome had the same chance of 1/128 to happen. She could now formulate the hypothesis that the coin is biased toward changing sides, but she cannot, on the basis of these same data, draw the same conclusion as Andy did.
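Andy's calculation can be verified directly. The following sketch (a modern illustration in Python, not part of the original argument) enumerates all 2^8 equally likely sequences of eight fair-coin flips and counts those with seven or more changes of sides.

from itertools import product

# Count the sequences of 8 fair-coin flips with T >= 7 changes of sides.
favorable = 0
for seq in product("HT", repeat=8):
    changes = sum(a != b for a, b in zip(seq, seq[1:]))
    if changes >= 7:
        favorable += 1

print(favorable, "out of", 2 ** 8)   # 2 out of 256, i.e., 1/128

Only the two perfectly alternating sequences qualify, which yields Andy's p-value of 2/256 = 1/128.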
The situation of Andy and Brenda is not uniquely a problem of hypothesis testing. The same issue features in the case of Bayesian inference. Let us assume that Andy and Brenda are now two Bayesians. Again, Andy is interested to see whether the coin is biased towards changing sides. He has no knowledge of the coin that is used for this experiment, so he considers the parameter of interest, pchanging sides, as a uniform(0, 1) distribution. Having observed that 7 changes of sides take place, Andy updates his parameter, which becomes a beta(8,1) distribution. His best, Bayesian guess will be that the coin has a 8/9 (or more) chance to chance sides, each time it is tossed.
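Andy's update can be sketched numerically; the code below is an illustration under the stated assumptions (a uniform prior and seven observed changes in seven transitions), not part of the original text.

# A uniform(0,1) prior is a beta(1,1) distribution; observing 7 changes
# of sides in 7 transitions yields a beta(1+7, 1+0) = beta(8,1) posterior.
prior_a, prior_b = 1, 1
changes, non_changes = 7, 0
post_a = prior_a + changes
post_b = prior_b + non_changes
posterior_mean = post_a / (post_a + post_b)
print(post_a, post_b, posterior_mean)   # 8, 1, 0.888... = 8/9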
For Brenda the situation is essentially different. She did not specify a prior distribution and thereby forfeited the opportunity to make a statistical inference. She cannot conclude, as Andy did, that the coin is biased toward changing sides. This conclusion may seem strange, but it becomes intuitive after grasping the following example, which is essentially the same as the coin-flip example. Seven numbers have been recorded in Table 3. Each entry is either a one, heads, or a two, tails.
Table 3. Coin-flip data (III)
no.    1  2  3  4  5  6  7  8
data   1  1  2  1  2  2  2  ?
A Bayesian is asked to make an estimate about the next number in the sequence. Through these seven points there are two unique polynomials of the seventh degree, g1(x) and g2(x), such that g1 goes through 1 and g2 goes through 2 at the point x = 8. If we then retrospectively adopt both g1 and g2 as priors, a paradoxical situation arises: the prior distributions on both p(according to g1) and p(according to g2) are uniform(0,1) distributions. Having observed that seven numbers occur according to both g1 and g2, the Bayesian statistician would believe that the posterior distributions on both p(according to g1) and p(according to g2) are beta(8,1) distributions. Paradoxically, the predictions of the eighth number conflict, although each has the same high probability, i.e., p(according to g1) = p(according to g2) ≈ 8/9.
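The two polynomials can be exhibited concretely. The following sketch (illustrative only; it places the data of Table 3 at positions x = 1, ..., 7) fits the unique degree-seven polynomials through the seven recorded points plus either (8, 1) or (8, 2).

import numpy as np

x = np.arange(1, 8)                      # positions 1..7
y = np.array([1, 1, 2, 1, 2, 2, 2])      # the recorded data of Table 3

# The unique degree-7 polynomial through the 7 points and (8, 1),
# and the one through the 7 points and (8, 2).
g1 = np.polyfit(np.append(x, 8), np.append(y, 1), 7)
g2 = np.polyfit(np.append(x, 8), np.append(y, 2), 7)

print(np.polyval(g1, 8), np.polyval(g2, 8))   # approx. 1.0 and 2.0

Both polynomials reproduce the seven observations exactly, yet they predict opposite eighth numbers --- which is precisely the paradox.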
The conclusion of these paradoxical calculations is that coincidence does not have a legitimate place in statistics, neither in Bayesian nor in classical statistics. Moreover, probabilities do not belong to the furniture of the human mind. The mind can lease probabilities, but the rent to pay is prior attentiveness. Pasteur was more in the right than he might have known when he said, almost a century and a half ago, that chance favors only the attentive mind.
Deliberative rationality: beating the coincidence
At the end of this section on coincidence we shall indicate how the concept of coincidence will function in the dynamics of the ethics of chance. Coincidence upsets the cognitive continuity of the world, because a coincidence indicates an absence of explanation. Cognitive expectations and explanations do not always coincide with the causality --- in a broad sense --- of the world. In the chasm between the two realms coincidence lurks. However, coincidence does not necessarily indicate a lack of rationality. Quite the contrary: it points us to an important aspect of rationality. In appropriate Kantian terms, coincidence compels reason (Vernunft) to acknowledge the principal limitations of the understanding (Verstand). Rationality itself, as we shall argue in this dissertation, should admit the possibility of the unexpected and the improbable, and should make its judgments resistant, to the largest possible extent, to the occurrence of both. That is, rationality should act in such a way that, no matter what turns out, it feels the least possible regret about its decisions.
The following two examples make precisely this point. In the seventies and eighties the media reported cases of people who spent their entire incomes on building nuclear bomb shelters. Was it worth spending so many resources on preventing what never happened? The answer to this question is not as obvious as it may seem; it is not an unambiguous no. With hindsight people often feel justified in qualifying those people as too cautious, bordering on the irrational. However, hindsight is a bad judge when one is confronted with uncertainties. We would say that a person who refuses to give her wallet to a robber with a fake gun is lucky. However, if she is prepared to give him her wallet, then calling her overcautious or irrational afterwards is certainly mistaken. Similarly, with the information we have now, those people who took unnecessary precautions against the possibility of a nuclear war are, in a way, `unlucky,' but not necessarily overcautious or irrational. At the height of the Cold War in the eighties, experts estimated, given the false alarm rates and decision structure of that moment, that an accidental nuclear war was to be expected within three to fifteen years --- that is, within one generation. Those preparing for this possibility were not unreasonable in their assumption that the probability of an accidental nuclear war was considerable. The coincidence that the nuclear threat diminished does not diminish the deliberative rationality of their decision.
These examples foreshadow the general issue of deliberative rationality in later chapters, in which the concept of coincidence will play a key role. The tension between explanation and coincidence will result in the relevance of the latter for the issue of responsibility. In Chapter 2 we shall show the unjustified popularity of the contemporary notion of moral coincidence or moral luck with the tools that we have developed here. In the final chapter of this dissertation the concept of coincidence or unknown risk will be important, and we shall prove its relevance for safety regulations and intervention.
D. Theories of Probability
Probabilities are not readily available in the world around us. Expressing uncertainty, probability represents precisely what is epistemically unavailable to us. The concepts of chaos and free choice also indicate a lack of predictability in the world. Probability is distinct from chaos and free will in that it presupposes some type of long run regularity. In this section we shall deal with questions such as how probabilities can be assessed and evaluated, and to what extent long run regularities are relevant to this issue. Is every long run relative frequency a probability? What is the probability of a single event?
There are several methods of assessing a probability, and they can be broadly distinguished into four theories, i.e., the frequency theory, the propensity theory, the logical theory and the personal theory. Each of these has its own criteria of assessment and evaluation. We shall advance an eclectic mixture of all four theories. It seems to us that the wide semantic range of chance should be reflected in an equally rich interpretative approach to probability. The quantum mechanical behavior of subatomic wave-particles is generally given a propensity interpretation, whereas a die, if not suspected of being biased, exhibits logical probabilistic behavior. The traffic in New York City has been modeled with a probabilistic, frequentist model, in which decisions by drivers are replaced by impersonal, randomized events. A doctor interprets the posterior probability of having breast-cancer given a positive mammogram result as the level of epistemic certainty she has on the basis of the test alone.
In the following sections we shall touch upon three separate issues of probability: its interpretations, its methodologies and its structure. Besides an eclectic interpretation of probability, it is our claim that any probability, whatever interpretation suits it best, possesses a frequency structure, i.e., it is always possible to express a probability in terms of long-run relative frequencies. One important methodological aspect of probability follows from our observations concerning coincidence. Any Platonic image of probability, namely of the world as a list of probabilities out there, is misguided. This idea is defended by some personalists, logicists and frequentists, but with respect to the issue of a uniform prior it is most frequently found among personalists and logicists. In the following sections we shall examine the personal theory of probability and the frequency theory of probability.
Personal theory
Some things are thought to be more probable than others. It is more probable that the sun will rise tomorrow than that it will not. One holds this belief quite strongly --- and legitimately so. Probability captures, in some sense, the strength of belief, i.e., of the confidence that the event will be repeated. Personalist theory postulates that probability not only expresses the strength of belief, but that it is actually defined as such. Personalists associate probability with the subjective magnitude of seeming probable. The sunrise example is particularly interesting, because it has been a focus of controversy between personalists and defenders of other theories. We shall return to this example shortly.
Personalists make a distinction between the construction and the evaluation of a probability. According to personalists, "the question when a probability statement is correctly made has two different meanings: (1) How should we make or construct well justified probability judgments? For example, when is a surgeon justified in saying that an operation will succeed with probability 0.90? (2) How should we evaluate such judgments after we know the truth of the matter? For example, how would we evaluate the surgeon's probability judgment if the operation succeeds, or if it fails?" In what follows, we shall concentrate on the construction issue of a personal probability. Methods of evaluation, such as calibration, are discussed only briefly.
Personalists come in several flavors. After the ground-breaking work of de Finetti and Savage, different schools of personalist thought have developed. All personalists have in common that they define probability as a numerical measure of the strength of belief in a certain event. Their picture of belief bears strong affinity with that of the British empiricists. John Locke argued that every belief is held with a certain "strength" in the human mind. The personalist theory of probability interpreted this strength as the individual's idea of the likelihood of the event. The current mainstream personal theory is the Bayesian theory, and we shall use the terms interchangeably. Bayesians believe that it is possible to make probability assessments even in the absence of frequency information. However, a personal probability is not a mere opinion; it is an orderly opinion. The personal theory specifies consistency rules. One of the concerns of the theory is the revision of a probability in the light of new evidence. Bayesian theory developed a calculus of beliefs that specifically deals with this issue: the original degree of belief is replaced by a new degree of belief when new evidence is obtained.
The personal theory of probability raises a number of issues. The great advantage of the theory, viz., the completeness of probability judgments over the set of all events, is also its great weakness. The initial prior probability, i.e., the initial strength of belief, is based on a mixture of the information available to the individual (which includes feelings, knowledge or pressure from some authority). Prior beliefs are subject to the individual's bias. Personalists defend a Peircean stance toward truth, i.e., they believe that the true probability is the point of convergence of the scientific process. The possibility of wide variation in prior personal probability assessments has been recognized by the personalists as their Achilles' heel, and it has become the aim of serious theory to show mathematically that in the light of new evidence different personalist probability inferences will converge to the same numerical value. Several convergence theorems have been proven.
In the following sections we shall focus on some important aspects of Bayesian theory. Bayesians have recognized that acting under uncertainty is essentially the same as making a bet. In constructing probabilities, Bayesians have historically invoked several additional assumptions that are not directly related to any frequency ideas: the idea of risk neutrality and the principle of insufficient reason. We shall briefly discuss some Bayesian calculations; further mathematical details are placed in the appendix. We shall then return to the Sunrise example and show that the world cannot be considered as a list of probabilities. This will call upon the personalist --- but not only on her --- to show moderation in `finding' probabilities.
Making a bet --- constructing a probability
Some personalists make an explicit connection between probability and a specific behavioral attitude: "The probability of a proposition should be one of the determinants of our willingness to act as though the proposition were true --- to `bet on it.'" From American pragmatism stems the definition of belief as the willingness to act. Instead of speculating about what is going on `inside the head,' the pragmatist program proposed to consider only those beliefs that affect behavior. Pragmatic personalists operationalized probability, the measure of belief, as a measure of willingness to bet. The concept of a bet can also include non-monetary rewards and punishments. Any action under uncertainty can be interpreted as a gamble in a wider sense of the term, and thus gambling is unavoidable. For instance, if Sarah goes out for a walk and does not bring her umbrella despite a negative weather forecast, she is, in fact, gambling: the stake is the nuisance of carrying her umbrella, whereas the uncertain pay-off is the event of getting wet in the rain. In hypothesis testing, similarly, probabilities stand for the rate of accepting faulty inferences.
According to these personalists, the identification of probability with the willingness to bet singles out an identifiable numerical value for a probability:
Thus, if I say that the probability of an event is one-third, I will be just willing to accept a bet in which I gain 20 cents if the event occurs and lose 10 cents if it does not, that is, a bet at odds of 2:1. I shall be very happy to accept a bet on this event at more favorable odds but unwilling to accept a bet at less favorable odds than 2:1.
This definition of probability makes an important assumption: risk-neutrality of the agent. It makes an explicit connection between probability and the willingness to be involved in a bet. That means that the gambling situation as such is assumed not to have any influence on the preference-structure and that the measure of preference of an event is defined by the expected benefit of the event.
This definition indeed singles out a measure of probability. Intuitively, the procedure functions as follows. A certain event yields twenty cents, but it is uncertain whether it is going to happen or not. Andrea tries to buy Brenda out of gambling on the event. With any amount less than a little over six cents, Brenda feels more inclined to take the risk, whereas if Andrea offers her more than seven cents, she prefers to take that rather than be involved in the gamble. Apparently at six and two thirds of a cent Brenda is indifferent between the gamble and the buy-out. Assuming that Brenda is risk-neutral, the expected utility of betting and the expected value of taking the certain stake are equal:
E[gain in betting] = 20 cents × P(E) = 6 2/3 cents × 1 = E[gain in taking stake]
and, thus
P(E) = 1/3.
Assuming risk neutrality, the true, subjective probability can be defined as the fraction of the indifference utility and the utility of the event. Thus,

           Stake needed to buy out of the gamble for E
    P(E) = -------------------------------------------
                     Value of the prize, E
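In this notation Brenda's example works out as follows (a trivial computation, included only for concreteness):

prize = 20.0            # value of the prize, in cents
buy_out = 20.0 / 3      # the indifference stake of 6 2/3 cents
print(buy_out / prize)  # 0.333... = 1/3 = P(E)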
However, not all personalists rely on this identification of rationality and risk-neutrality. In Savage's axioms, for instance, this idea is completely absent. He defines probability as a subjective level of confidence under the conditions of transitivity, substitutability and monotonicity.
Principle of insufficient reason and ideas of randomness
One of the oldest controversies in the theory of probability, going back as far as the early nineteenth century, is whether probabilistic homogeneity can be the result of ignorance. "Personalists take seriously the idea that probability judgments are always possible. ... In some cases, personalists must appeal to the principle of insufficient reason: If there is no reason to expect one event to be more likely than another, then we should consider the two events to be equally likely." This principle has been proposed by logical probabilists and is generally embraced by personalists as well. The discussion of this principle will draw us into an analysis of the nature of randomness, which was one of the concerns of Richard von Mises (1883-1953).
Historically, the principle of insufficient reason is the result of a belief in determinism and of the definition of probability as the calculus of ignorance. The French mathematician Marquis de Laplace (1749-1827) combined his determinism with his groundbreaking work in probability theory. According to Laplace, the world is a deterministic system whose ultimate laws are only epistemically hidden from human beings. Probabilities do not possess ontological validity or reality; they are only useful because of the limited rationality of human beings. If one assumes that a probability is a measure of ignorance, then complete ignorance is quite naturally equivalent to a uniform probability distribution.
The principle of insufficient reason is sometimes called Bayes' postulate, as it can be found for the first time in Bayes' famous two treatises in which he presented Bayes' Theorem. Rev. Thomas Bayes (1701?-1761) tried to justify his use of a uniform prior distribution to model ignorance by the following, negative argument. If one does not know anything about the probabilities of the parameters θ ∈ Θ, then one does not have reason to assume that any θ is more likely than any other. Ergo, one should assume that all θ ∈ Θ are equally likely. However, Bayes was probably so at odds with this postulate that he did not publish his famous two tracts during his lifetime.
The first thorough attack on the principle of insufficient reason came from the American philosopher Peirce (1839-1914), who, in turn, called it the conceptualist principle:
there is an even chance of a totally unknown event.
The essential point is that there is a difference between ignorance and uncertainty. We shall present several arguments in this section to unsettle a naive application of the principle of insufficient reason. If one does not have access to information to determine the probability of some events, considering them to be equally likely yields a false image of knowledge. Instead, one simply does not know the probability. One should take seriously the idea that probability judgments are not always possible.
In this section we shall present an argument that has more than mere academic importance and that will prove worthwhile in the rest of this thesis. When a uniform prior is justified on the basis of a hypothetical fraction of possible worlds, the temporal asymmetry of probability is ignored and probability is treated as a logical property of the world. In the concept of the world there is no information about probability. Nor can probabilities be found in the world itself; probability is not an ontological property of the world. It requires an attentive mind combined with proper inference to get to a probability.
We could try to define probability as a hypothetical fraction of possible worlds. If a certain situation has five different continuations, in two of which the event A happens, then according to this definition the probability would be 2/5.
           # of Possible Worlds in which A Occurs
    P(A) = --------------------------------------
                 # of All Possible Worlds
If we are ignorant about event A except that we know that there are n-1 alternatives to A, say A1, ..., An-1, then the probability of A becomes

                  # {A}
    P(A) = ---------------------- = 1/n.
           # {A, A1, ..., An-1}
This definition of a probability of ignorance makes two implicit assumptions. First, a probability is a function of the imperfect information about the world. Secondly, there is a uniquely best way to translate a certain level of ignorance into a probability. Peirce spent great effort pointing out the shortcomings of these assumptions. His attack aims at the very intelligibility of such a ratio. According to Peirce, (i) the notion of probability as a fraction of possible worlds is unintelligible, and if it were intelligible, then it would (ii) be unobtainable and (iii) make further inference superfluous. First of all, Peirce argues that the notion of "possible worlds" is absurd.
Why should we want to know the probability that the fact will accord with our conclusion? That implies that we are interested in all possible worlds, and not merely the one in which we find ourselves placed. ... It is only an absurd attempt to reduce synthetic to analytic reason, and no definite solution is possible.
The absurdity of the possible-worlds conception of probability is that it confuses two forms of reasoning. Almost all forms of probabilistic inference are synthetic, i.e., statistical. Probabilities cannot be found in the world; it requires inference to get to a probability. There is no information about probability contained in the concept of the world, yet the definition of probability as the ratio of possible worlds presupposes just that. It aspires to cut synthetic, or statistical, reasoning down to analytic, or logical, reasoning, as if there existed somewhere a list of relative frequencies.
Secondly, the notion of probability as a fraction of possible worlds is a phantom, not only because it is a corrupted form of inference, but also because it is principally inaccessible. The principle relies on a mental enumeration of possible worlds. However, such an enumeration is not unique; different orders of enumeration create different fractions. Consider, for instance, the following two enumerations of all positive integers:
- the sequence: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, ...
- the sequence: 1, 3, 2, 6, 4, 9, 5, 12, 7, 15, 8,...
In the former sequence, numbers divisible by three appear in 1/3 of the cases throughout the sequence, while in the latter sequence this fraction is 1/2, even though each number is counted only once. The method of fractions of possible worlds would thus lead to inconsistent answers, and Peirce concludes that "this principle is absurd. There is an indefinite variety of ways of enumerating the different possibilities, which, on application of this principle, would give different results." Peirce objects that a ratio over an infinite group cannot be given without an order of enumeration. The definition of probability as the ratio of all possible worlds falls short on this matter.
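Peirce's point is easy to reproduce numerically. The sketch below (an illustration, not in the original) computes the fraction of numbers divisible by three among the first 100,000 terms of each enumeration.

def standard(n):
    # 1, 2, 3, 4, 5, ...
    return list(range(1, n + 1))

def alternating(n):
    # 1, 3, 2, 6, 4, 9, ...: non-multiples of three interleaved
    # with multiples of three.
    non_mult = (k for k in range(1, 10 ** 9) if k % 3 != 0)
    mult = (k for k in range(3, 10 ** 9, 3))
    seq = []
    while len(seq) < n:
        seq.append(next(non_mult))
        seq.append(next(mult))
    return seq[:n]

for enum in (standard, alternating):
    seq = enum(100000)
    frac = sum(1 for k in seq if k % 3 == 0) / len(seq)
    print(enum.__name__, round(frac, 3))   # 0.333 versus 0.5

The same numbers, enumerated in a different order, yield a different limiting fraction.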
Long run ratios are not necessarily probabilities. If that is true, then what can count as a probability? What are the necessary conditions for a sequence to be a random sequence? Richard von Mises analyzed this issue in detail and suggested that a proper definition of probability depends on obtaining a proper definition of a random sequence. A random sequence is a sequence with a certain frequency property. However, how can we meaningfully define this frequency property? The frequency theory interprets a probability p as a ratio that with probability one tends to p as the number of observations goes to infinity. But how can we make sense of the reverse question: how many observations should we make in order to expect convergence to happen? For any finite sequence we cannot, of course, be certain that a certain convergence will occur. We can at best say that `with a certain probability the convergence will lie between p ± ε.' But then the definition of probability would be circular. Von Mises proposed to eliminate the problem by defining what can count as a random sequence, which he called a "collective." A collective satisfies two conditions: first, the limit of the relative frequency exists; secondly, there exists no gambling strategy (such as one allowed by an admissible place-selection) that has different odds than a strategy selected by means of a coin-flip. These are strong conditions, and they can never be tested, because they make assumptions about the existence of limits. In the section on the frequency theory we shall expand on these ideas by introducing the concept of homogeneity.
Construction of posterior probabilities
Bayes' theorem broke new ground. It allowed one to model causation without causes. The model of causation, established in the seventeenth century by the work of Newton and others, had been that of mechanical causation: action equals reaction. The new theory of probability challenged the absoluteness of this model. It suggested that sometimes the connection between cause and effect is not one of iron necessity. The same action can lead to several different outcomes. A roll of a die can result in any number between one and six. The relationship between the cause and the effect is under-determined. Order appears only in the repetition of the experiment. Bayes' theorem is a description of this second tier order.
Jacob Bernoulli proved in his Ars Conjectandi a theorem that was the first description of a higher order stability of probabilistic causation. He used the example of an urn in which there is a proportion p of white balls and 1-p of black balls. Each time one draws a ball, one does not know whether it will be a white or a black ball. The action of drawing does not in itself determine the outcome. Bernoulli proved, however, that as N, the number of drawings, goes to infinity, one can be more and more certain that the relative fraction of white balls, m/N, gets arbitrarily close to p. Formally, Bernoulli's theorem states that for every ε > 0,

    P(|m/N - p| < ε) → 1  as N → ∞.
Bernoulli said that even the stupidest man knows this fact. Nevertheless, this result was profound because it proved a regularity coming forth out of irregularities. Most of the ancient thinkers had considered that to be impossible.
Bernoulli's theorem linked degrees of subjective certainty with objective frequency information. The fact that heads show up in seventy-five out of a hundred coin flips is more informative than in three out of four. The limit in Bernoulli's result indicates that we can be more certain in the former case that the true head-frequency is close to 0.75 than in the latter case. How much more certain can we be? That was the unanswered, inverse question. Bernoulli's theorem states that, given the true fraction, we can be certain that in the long run the observed frequency will approximate the known fraction to any given degree. Bayes' theorem answered the other question: given the observed frequency, how certain can we be that it is the true fraction?
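Bernoulli's theorem itself is easily illustrated by simulation (a sketch with an assumed true proportion p = 0.75; the numbers are illustrative):

import random

random.seed(1)
p = 0.75                     # the true proportion of white balls
for N in (100, 10000, 1000000):
    m = sum(random.random() < p for _ in range(N))
    print(N, m / N)          # the relative fraction m/N drifts toward p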
Prior information about a certain unknown parameter is represented by a belief density function. When one combines this function with the observed new information, one can form an updated, posterior belief function. One of the criticisms against the Bayesian posterior estimate is that the subjectivity of the prior probability carries over to the posterior probability. However, in many practical circumstances a certain amount of prior information is available --- perhaps as previous statistical results or in some other form. It can be shown that in these circumstances Bayesian estimates perform better than their frequentist counterparts.
We present Bayes' theorem in the following form:

    p(θ0 | x) ∝ p(x | θ0) p(θ0).

The constant of proportionality, 1/p(x), does not depend on θ and can therefore be found by integrating p(x | θ) p(θ) over the range of θ, i.e., θ ∈ Θ. In this equation θ stands for the parameter of interest and x for the observation. In words: the likelihood that a possible cause θ0 out of the set of possible causes Θ is the true cause, given the observed data x, is proportional to the product of the probability of the data occurring on the assumption that θ0 is the true cause and the prior probability of θ0 before we performed our observations.
A controversial aspect of Bayesianism is that the unknown variable is conceptualized as a random variable. According to opponents the variable is not subject to chance and should therefore be considered a fixed number. Bayesians argue that, if a parameter value is unknown, and if probability is a measure of ignorance, then the parameter should be described as a random variable. In general, the Bayesian method can be summarized as follows: a prior probability distribution for a parameter of interest is specified first. Sample information is then obtained and combined through an application of Bayes' theorem to provide a posterior probability distribution for the parameter. The posterior distribution provides the basis for statistical inferences concerning the parameter.
Bayes' theorem makes frequencies relevant to the construction of a Bayesian probability. A Bayesian starts off with a certain belief about the probability of an event, the prior probability, and then observes a certain frequency --- e.g., number of heads or car accidents. On the basis of this information she calculates a new, updated probability of the event in question, the posterior probability. As such Bayes' theorem has found its way into expert systems, for instance in medical diagnoses systems.
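The mechanics of such an update can be shown in a minimal discrete sketch (the three candidate biases and the uniform prior over them are assumptions made for illustration):

# Three candidate biases for a coin, a uniform prior over them,
# and a single observed head.
priors = {0.25: 1 / 3, 0.5: 1 / 3, 0.75: 1 / 3}
likelihood = {theta: theta for theta in priors}       # P(head | theta)
unnorm = {t: likelihood[t] * priors[t] for t in priors}
p_x = sum(unnorm.values())                            # the constant p(x)
posterior = {t: unnorm[t] / p_x for t in unnorm}
print(posterior)   # posterior weights 1/6, 1/3 and 1/2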
The Sunrise problem and Coincidence
Consider the following problem:
When Adam and Eve observed the sun rise on the first day in Paradise, what was their subjective level of confidence in a sunrise on the second day?
This question has been coined the Sunrise problem and has been the subject of discussion in the personalist literature and its criticisms. If Adam and Eve were Bayesians, they would believe with probability 2/3 that the sun would rise on the second day in Paradise. In general, after observing n subsequent sunrises, the probability of the (n+1)th sunrise is (n+1)/(n+2).
The discussion has often concerned the interpretation of this numerical value. Peirce, for instance, wondered what the n+2 cases in the denominator represent. We agree with Peirce that the probability should be capable of being translated into frequency language. In particular,
in the sense in which probabilities can be evaluated and made the subjects of mathematical calculation, a probability is a ratio of frequency in the long run of experience. ... and you should constantly ask yourself `Is there going to be any such experience in any way that can give the particular probability spoken of any real utility?'
We disagree with several dogmatic logicists and frequentists that the sunrise probability has no frequency interpretation. If Adam and Eve, just before dawn on the first day, had specified that they wanted to find out whether the sun would rise in Paradise, and if they observed a sunrise on the first day, then they could indeed conclude with probability 2/3 that the sun would rise the next day, assuming the validity of the principle of insufficient reason. If the sunrise-experiment is a flip of an unfair coin whose bias we do not know, then the first sunrise gives us some confidence that the coin used has a higher bias toward showing a sunrise. For instance, if we knew that a hypothetical God used one of three coins to determine every sunrise, and the coins had a p(sunrise) of 0, 1/2 and 1 respectively, then observing a sunrise on the first day gives us the information that the p = 0 coin is not used, and that it is more likely that the p = 1 coin than the p = 1/2 coin has been used. Table 4 shows the probabilities of a sunrise on the second day if there were i different coins, with probabilities of a sunrise spread homogeneously over the 0-1 interval, and the coin picked has already shown a sunrise.
Table 4. The Sunrise Problem (Ej = {sunrise on day j})

Number of coins (i)   Prob. of heads of each coin   P(E2|E1) = P(E1E2)/P(E1)
2                     0, 1                          (1+0)/(1+0) = 1
3                     0, 1/2, 1                     (1+1/4)/(1+1/2) = 5/6
4                     0, 1/3, 2/3, 1                (1+1/9+4/9)/(1+1/3+2/3) = 7/9
...                   ...                           ...
∞                     0, 1/∞, ..., (∞-1)/∞, 1       ∫₀¹ x² dx / ∫₀¹ x dx = 2/3

What is wrong with the sunrise problem are not the calculations. The issue is more fundamental. Adam and Eve would certainly not have chosen the rising of the sun as a topic that deserves attention if the sun had not risen on the first day. Therefore, trying to determine the probability retrospectively will necessarily introduce bias. In particular, any observational fact that has been observed once will --- by this faulty Bayesian inference --- receive a posterior probability of 2/3 of being observed on a second occasion. We have coined this bias the hindsight bias. Bayesians run the risk of determining probabilities of events without specifying a proper prior model, actually prior to obtaining the data. Failing to specify a prior model is to confuse coincidence and chance.
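The entries of Table 4 can be recomputed mechanically (a sketch, assuming i coins with biases spread evenly over [0, 1]):

from fractions import Fraction

# With i coins, P(E2 | E1) is the sum of p^2 over all coins divided
# by the sum of p over all coins.
for i in (2, 3, 4, 100, 10000):
    ps = [Fraction(k, i - 1) for k in range(i)]
    print(i, sum(p * p for p in ps) / sum(ps))   # 1, 5/6, 7/9, ... -> 2/3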
A word of warning is in order at the end. We should not think that the hindsight bias is a mere epistemic bias. It is not only negligence in specifying the prior probabilities that causes one to fall into the hindsight bias, nor is the reason merely the fundamentally limited rationality of the human subject. Certainly, an inattentive mind will cripple the data for probabilistic inference, and both reasons reduce the usability of the world's events for probabilistic inference. We may have given the impression that these were the only reasons. However, the final and foremost reason is that a set of data can only be used for one probabilistic inference.
A Bayesian collects sunrise data every twelve hours over the course of one week. In accordance with the recommendations made above, she specifies, prior to collecting the data, that she is interested in the following two variables:
- the probability of a sunrise in any twelve-hour period, θ1;
- the probability that a sunrise is alternated with a non-sunrise and vice versa, θ2.
As a proper Bayesian without prior information, she gives both θ1 and θ2 a uniform(0,1) prior distribution. Over the course of the week the sequence recorded in Table 5 is observed.
Table 5. Sunrise data
day       Monday   Tuesday   Wednesday   Thursday   Friday   Saturday   Sunday
part      1st 2nd  1st 2nd   1st  2nd    1st  2nd   1st 2nd  1st  2nd   1st 2nd
sunrise    Y   N    Y   N     Y    N      Y    N     Y   N    Y    N     Y   N
What can be concluded from this sequence? If our Bayesian statistician updates both variables independently, then, first, the posterior distribution of θ1 is a beta(8,8) distribution, with 1/2 as the best estimate of the probability that the sun will rise during the next twelve hours. Secondly, the posterior of θ2 is a beta(14,1) distribution, resulting in 14/15 as the best estimate of the probability that what happens in the next twelve hours is the opposite of what happened in the previous twelve hours.
Although the two conclusions are proper Bayesian inferences, they contradict each other. If it is truly very probable that the sun will rise after it did not do so in the previous period, and vice versa, then it cannot be that there is a fifty percent chance that the sun will rise in the next period. The two inferences are inconsistent with one another because we are trying to get too much information out of the data. Separate Bayesian inferences bias the results because there is no guarantee that the conclusions will be consistent with one another. Even joint inference does not solve this problem. In general, probabilistic inference of more than one variable from the same set of data runs the risk of being inconsistent. Examples studied by Diaconis and Freedman indicate that Bayesian procedures can behave badly when applied to joint estimation.
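The two separate inferences can be reproduced from the data of Table 5 (a sketch under the assumptions of the example: uniform priors and independent updating):

# Fourteen half-day periods: Y N Y N ... (Table 5).
data = ["Y", "N"] * 7
sunrises = data.count("Y")                             # 7 out of 14
changes = sum(a != b for a, b in zip(data, data[1:]))  # 13 out of 13

a1, b1 = 1 + sunrises, 1 + len(data) - sunrises        # beta(8,8)
a2, b2 = 1 + changes, 1 + (len(data) - 1) - changes    # beta(14,1)
print(a1 / (a1 + b1), a2 / (a2 + b2))                  # 0.5 versus 14/15

Each posterior is impeccable on its own; together they cannot both be right.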
Conclusion
The first conclusion is that probability has a relative frequency structure and corresponds to long run behavior. The uniform prior can be useful in certain practical situations, but it has many theoretical problems attached to it. The definition of probability by means of a gamble depends on a long run average of gains and presupposes risk neutrality of the agent. Even the personalist evaluation of probability judgments relies on the frequency structure of a probability judgment. The second conclusion of this section is that coincidence presents itself as a problem to the Bayesian. The probabilistic completeness of Bayesianism necessarily runs into contradictions, because it cannot distinguish between coincidence and chance. In the next section we shall see what solutions the frequency theory brings to the theory of probability, and what problems the frequency theory itself faces.
Frequency theory
In the previous section it was argued that even a Bayesian probability can be specified in relative frequency terms. In this section we shall expand on the frequency theory itself. This theory defines the probability of an event as the frequency of occurrences of the specified event over the frequency of occurrences of events in a relevant reference class --- in the long run. In other words, the probability that a member of the reference class R has attribute E, P(E|R), is the limit of the fraction of `draws' of subjects from R having the attribute E.
What if no frequency data, or no reliable data, are available? This happens frequently when new technology is introduced, or when only very few instances have been recorded, e.g., 4 out of 300,000. Although the probability appears to be small, no reliable estimate of the exact value is possible, which could be essential for events with high catastrophic potential. The difference between a probability of P1 = 10⁻⁶ and P2 = 10⁻⁹ is a factor of 1000. A society might accept one nuclear disaster in a millennium, but whether it would be willing to accept one every year is questionable. In political decision making on this matter, differences in probability estimates of the order of 10³ are not uncommon.
A relative frequency is not necessarily a probability. Peirce's criticism of the principle of insufficient reason also applies to an overly easy identification of probability with a relative frequency. A long run relative frequency over possible worlds is not unique; without an ordering principle for counting, a relative frequency is meaningless. A relative frequency can only be meaningfully translated into a probability if it is accompanied by the proper prior model. For a Bayesian this prior model is a prior probability. A frequentist has to specify her null hypothesis or determine, prior to data collection, what inference she wants to make.
Homogeneity is an essential additional assumption that we have to make before a relative frequency can be interpreted as a frequentist probability. The following example will show that a relative frequency should be defined relative to a homogeneous reference class. Assume that the population of drivers, R, can be broken down into two separate risk groups. The first driver group has a relevant risk property R1, whereas the second group possesses property R2. About fifty percent of the first group, which makes up about twenty percent of the population, is involved in an accident yearly, whereas only two and a half percent of the other driver group has an accident. Clearly, the complete driver group is not homogeneous. In one year about twelve percent of the drivers will be involved in an accident. It is misleading, however, to infer that a driver has a twelve percent probability of being involved in a car accident next year.
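The twelve percent figure is simply the weighted average of the two groups (the shares and rates are those assumed in the example):

share_R1, rate_R1 = 0.20, 0.50    # 20% of drivers, 50% yearly accident rate
share_R2, rate_R2 = 0.80, 0.025   # 80% of drivers, 2.5% yearly accident rate
print(share_R1 * rate_R1 + share_R2 * rate_R2)   # 0.12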
What is misleading about this statement is that a vast majority of the people have a much slimmer chance, and a small group a much larger chance, of being in an accident. It would be more appropriate to break the driver group down in two and assign probabilities to either group. A probability is a relative frequency over a homogeneous reference class. Having said this, we shall modify this statement. In the car accident example, there are two homogeneous subclasses. This means that all persons in the same homogeneous subclass are identical with respect to the relevant risk properties: it is impossible to break the subclass down into further subclasses, according to prior characteristics R3, ..., Rk, in which the relative frequency is significantly different. We shall call this strong or ontological homogeneity. However, there is a sense of homogeneity in which it can be said that a driver has a twelve percent probability of being involved in a car accident. If a driver is picked at random out of the group of drivers and it is unknown whether she possesses characteristic R1 or R2, then the population of drivers, R, is homogeneous in a cognitive sense of the term. Weak or cognitive homogeneity of the reference class is fulfilled if the draw is randomized and the risk-specific characteristics are not revealed in the draw.
To say that a probability is defined as the relative frequency of the same event in similar circumstances opens itself up to some serious criticism. What does the notion of a homogeneous subgroup or "similar circumstances" mean? The personalist Baron renders a sharp criticism:
This theory runs into trouble right away. It would certainly make life difficult for weather forecasters. What could they possibly mean when they say that the probability of rain is 50%? They might mean "On days like today, it has rained 50% of the time in the past," but obviously they do not really mean this. Besides, if they did, they would have the problem of saying in what way the days they had considered were "like today." If these days were like today in being February 5, 1986, then there is only one such day, and the probability of rain would either be 1 or 0, and we will not know which it is until today is over or until it rains (whichever comes first). If those other days were like today in being February 5 regardless of the year, a simple record of past years would suffice for forecasting, and we could save a lot of money on satellites and weather stations. If those other days were like today in the precise configuration of air masses, then, once again, there probably were not any such days except today.
This criticism reveals the connection between the frequentist theory of probability and the theory of explanation. Elsewhere we shall expand on it in more detail. For the moment a short recapitulation will suffice to understand Baron's criticism, and it will suggest a solution. In a rudimentary form, we shall define explanation as follows: an argument R is an explanation for event E if and only if P(E|R) > P(E). This means that R constitutes a smaller reference class in which event E is more likely to occur. The determinist position claims that it is always possible to find an argument R that will necessarily entail event E, that is, P(E|R) = 1. A determinist, therefore, believes that no reference class R is really homogeneous if P(E|R) ≠ 1, because in that case there exists a mutually exclusive partition of R, R = R1 ∪ R2, such that P(E|R1) = 0 and P(E|R2) = 1. Baron, in his criticism of the frequency theory of probability, hints precisely at this determinist sentiment. He suggests that the only appropriate reference class for the event that it will rain on February 5, 1986, is that day itself, because only February 5, 1986, is completely homogeneous, i.e., has the same circumstances as itself. Necessarily, the relative frequency is then either 0 or 1.
This criticism is misguided precisely because of its underlying determinist assumptions. Finding an explanation for an event does not mean that the explanation will necessarily entail the event. Legitimate explanations are those which make the event more likely to occur --- as we shall discuss in the next chapter. A reference class, moreover, can be homogeneous without being reduced to a single case. A collection of U-238 atoms constitutes a homogeneous reference class in which each atom has an equal probability of decaying within a certain time frame. Often, the reference class is a blend of cognitive and ontological homogeneity. In that case randomization is essential. In the weather forecast example the case sampled is the next day, and nature itself is the proper randomizing instrument. When sampling voters for an election outcome, it is up to the statistician to randomize the reference class appropriately to ensure that the relative frequency is unbiased.
The idea of homogeneity is essential within the frequency theory of probability. Without proper attention to homogeneity, probability is a mere measure of ignorance. Homogeneity reintroduces a new form of causality, i.e., a probabilistic one. In the next chapter the relative frequency notion of probability will prove important within the theory of explanation, which opens the road to considerations of responsibility. The fact that probability concerns the infrequent appearance of a certain characteristic among an otherwise homogeneous group of individuals impinges on the domain of deliberative rationality, where the infrequent is the subject of consideration.
Chapter 2. Matters of Principle
The purpose of this chapter is to unravel the complexity of links between moral concepts and probabilistic ones. We aim to explicate in what sense chance with its different faces is ethically relevant.
- What `moral stuff' pertains to uncertainty and luck? Luck can prevent or bring about a morally objectionable action. With the same ease it can bring us fame. Are we responsible for the event either way? If the disaster or the fortunate event is the result of statistically known, as opposed to unexpected, possibilities, does this change the nature of accountability?
- Sometimes, a lottery is called upon to divide an indivisible share `fairly.' Other times, a situational lottery is forced upon us. Are these outcomes fair, or should the losers of these lotteries be compensated for what they did not deserve?
These are questions that will be answered in this chapter. Many other questions will be left unanswered. To understand the richness of the connection of luck and morality, we shall briefly point out Martha Nussbaum's study of the moral character of Greek consciousness in general and, in particular, of how moral praise and blame are thought to be immune to luck in one sense and vulnerable in another.
Greek tragedy shows good people being ruined because of things that just happen to them, things that they do not control. This is certainly sad; but it is an ordinary fact of human life, and no one would deny that it happens. Nor does it threaten any of our deeply held beliefs about goodness, since goodness, plainly, can persist unscathed through a change in external fortunes. Tragedy also, however, shows something more deeply disturbing: it shows good people doing bad things, things otherwise repugnant to their ethical character and commitments, because of circumstances whose origin does not lie with them. Some such cases are mitigated by the presence of direct physical constraint or excusable ignorance. In those cases we may feel satisfied that the agent has not actually acted badly --- either because he or she has not acted at all, or because (as in the case of Oedipus) the thing he intentionally did was not the same as the bad thing that he inadvertently brought about. But the tragedies also show us, and dwell upon, another more intractable sort of case --- one which has come to be called, as a result, the situation of `tragic conflict'. In such cases we see a wrong action committed without any direct physical compulsion and in full knowledge of its nature, by a person whose ethical character or commitments would otherwise dispose him to reject the act. The constraint comes from the presence of circumstances that prevent the adequate fulfillment of two valid ethical claims.
In short, Nussbaum distinguishes three principal ways that luck intersects with moral judgment in Greek tragedy.
1. Good people being ruined by bad luck, bad people being saved by good luck.
2. Good people doing bad things whose cause lies outside them, because of
a) ignorance, i.e., the bad act was not the object of the person's intentionality;
b) neglect, i.e., the person did not prevent the bad thing from happening.
3. 'tragic conflict': Good people doing bad things because they are subject to two valid, opposed ethical claims.
This chapter will concentrate on the first two cases, whereas the third will be left out of consideration. A tragic conflict, for instance the one Antigone faces when she is torn between her duty to her family and the laws of the state, involves elements of situational luck --- it is a matter of bad luck that she came to face these two opposing claims at the same time. However, in this case the issue of luck is marginal to the conflict itself. The general issue is a matter of character: phronesis is the pre-existing moral disposition and capability to deal with these kinds of conflicting situations. We shall not consider the issue of moral character in this dissertation. Instead, we shall assume that all subjects are equally deserving, unless specified otherwise.
The issue of luck is pivotal in the analysis of the first two issues. The salience of the chance-like mechanism in this chapter can be summarized in a single diagram (cf. Figure 2), which demonstrates a departure from traditional representations of morality. In this chapter we shall take into account that all that happens in the world is subject to luck and uncertainty. Chance is an intermediate stage between the individual who is subject to moral judgment and real world events.
Figure 2. Models of moral attribution
In the proposed formal model of the moral relevance of chance, the lottery is a lever that attributes the consequence to one of the players --- in this case to the first subject. This model will play a central role in this chapter, and it will be used to address several issues, depending on the nature of the lottery and the nature of the consequences. If the lottery represents the uncertainty about the consequences of one's actions, then the moral issue is that of responsibility. If the lottery stands for the lottery of life, in which `God' bestows endowments and a socio-economic status --- unequally --- over Its subjects, then the moral issue is that of distributive justice.
We shall expand on the relationship between responsibility and explanation by advancing a probabilistic definition of explanation. The probabilistic version of explanation modifies and partly rescues the traditional connection between responsibility and causality. With this newly gained insight into responsibility we shall argue that moral luck --- as we see it, a lottery-determined responsibility --- is a misguided idea that misunderstands the moral neutrality of the mechanism of luck. The same ideas can be applied to another moral issue, i.e., that of distributive justice. The lottery of life is morally irrelevant, and the principle of fair opportunity, supplemented by the principle of redress, ought to compensate for the undeserved inequalities between people.
B. Explanation and Responsibility
Who is to blame? --- theory of explanation
... Or who deserves the praise? That is the flip side of the same question. During the 1996 Super Bowl, Deion Sanders played for the Dallas Cowboys, who beat the Pittsburgh Steelers. To what extent does he deserve credit for that win? And to what extent can the drunken driver be blamed for the accident, or the operators in the Three Mile Island plant that day in March 1979 for the melt-down accident? The question of praise and blame shall be approached from the angle of responsibility. In this section we shall investigate the hypothesis that responsibility applies to the extent that an explanation can be given, i.e., that a person x is responsible for event E due to action A only if A is an explanation for E.
The responsibility of x for E can be of different kinds. The form of responsibility would be a moral responsibility if x had done A voluntarily and intentionally. If x had done A unintentionally or if the relationship between x and A is one of legal attribution, then we cannot speak of moral responsibility. It could be that a form of legal responsibility is involved. In this dissertation we shall not attempt to analyze necessary or sufficient conditions for attributing A to x, nor what forms of a lack of intentionality (e.g., neglect) qualify for legal responsibility. Notwithstanding, our account of explanation, i.e., of the relationship between action A and event E, applies to any form of responsibility. Brenda is legally responsible for breaking the window because her daughter threw a stone at the window moments before the window broke. The American system of law specifies the necessary condition for Brenda's responsibility, namely, that her daughter is legally underage.
If responsibility is defined as the extent to which an event is explained by x's action, then the difficult questions that remain are "What counts as an explanation for an event?," "When do we say that an event has been explained?," and "Is every explanation a causal explanation?" One of the first to deal with the issue of explanation was Carl G. Hempel. In his 1948 paper with Paul Oppenheim, "Studies in the Logic of Explanation," Hempel distinguishes between deductive and inductive explanations. A deductive explanation has as its major premise a general law, whereas the major premise of an inductive explanation is a statistical generalization. Hempel specifies validity criteria that should hold for both deductive and inductive explanations.
- The argument should have the proper logical form. A deductive argument should be valid, whereas an inductive argument should entail the explanandum with high probability.
- The explanans, the premises, should be true.
- For inductive arguments all (relevant) evidence should be given to avert possible premises that mark the situation as an exception case. This is the so-called requirement of total evidence.
These criteria pick out a class of explanations. In 1992 a Boeing 747 crashed in a suburb of Amsterdam. The explanandum is explained by the conjunction of a statistical generalization and a true premise: the airplane lost three of its four engines, and almost all Boeing 747s that lose three out of four engines crash.
However, we shall show that Hempel's criteria are neither necessary nor sufficient conditions for what can count as an explanation. The class of arguments that are defined as an explanation by Hempel is too broad in one sense and too narrow in another. There are some obvious counter-examples. Look at the following argument that agrees with Hempel's account of an explanation.
- The driver came home safely, for she had been driving drunk, and almost always when someone drives drunk she will come home safely.
There is something disturbing about this argument: it satisfies the conditions of an explanation, whereas we believe that the driver came home safely despite, not because of, the fact that she had driven drunk. Nonetheless, the statistical generalization holds with a high inductive probability. Probably more than ninety percent of those who drive drunk do not cause any accident.
Besides explanations that are not really explanations, Hempel's criteria neglect a class of explanations. The following argument would not count as an explanation according to Hempel, because the explanandum does not hold with high inductive probability.
- Brenda caused the accident, because she had been driving drunk.
Although we tend to think of this as a valid explanation, the statistical generalization, i.e., that people who drive drunk cause accidents, has only a low probability. Apparently, the explanandum does not have to hold with a high posterior probability for an argument to be an explanation, provided that its probability has increased in the presence of the explanans.
We conclude that Hempel's account of explanation has to be amended to exclude vacuous arguments and to include what seem to be valid explanations. Let us look at the same examples as we used above. The reason that driving drunk is a form of reckless behavior is that it increases the risk of an accident as compared to driving sober. This suggests that Hempel's first criterion for proper explanations has to be replaced with the condition that explanations have to have a probability-increasing effect. Salmon proposes the following amendment of what an explanation should establish.
An explanatory argument shows that the probability of the explanandum event relative to the explanatory facts is substantially greater than its prior probability.
An explanation should increase the likelihood of the event that actually occurred as compared to the likelihood before the explanatory facts were presented. The prior probability of the event that Brenda would hit the pedestrian with her car is quite low. But the probability of the explanandum increases significantly when we know that Brenda has been drinking.
With the ideas developed thus far, we can propose a preliminary definition of explanation. A certain event E explains event F, if and only if,
P(F|E) > P(F),
i.e., iff the conditional probability of F given E is larger than the marginal probability of F.
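The inequality can be checked mechanically. The following is a minimal numerical sketch in Python (not part of the original argument), with purely hypothetical figures for the drunk-driving example; it also illustrates the point made above that the posterior probability may itself remain small while the explanation still holds.

    # A minimal sketch of the probability-raising definition of explanation.
    # The figures below are hypothetical, chosen only for illustration.
    p_F = 0.0005          # P(F): marginal probability of causing an accident
    p_F_given_E = 0.01    # P(F|E): probability of an accident given drunk driving

    # E explains F iff P(F|E) > P(F); note that P(F|E) may itself remain small.
    print(p_F_given_E > p_F)   # True: drunk driving explains the accident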
Coincidence: a modification of the definition of explanation
The issues of coincidence and explanation have some illustrative connections. David Owens proposes that a coincidence is an event that has no explanation. This claim relies on a misguided idea of chance. Most coincidences can be explained given an abundance of explanatory facts. Instead, we proposed that an event is a coincidence if its components are not explanations of one another. Nonetheless, an investigation into the explanation structure of a coincidence will reveal some new aspects of explanation. My birthday coincides with that of my uncle. This is clearly a coincidence. Although Owens' claim that there is no explanation for this event is inappropriately absolute, it is clear that the activities of my parents nine months earlier do not explain the coincidence as such. However, according to the account of explanation developed thus far, it does.
E = {My uncle and I have our birthdays on April 5}
F = {The activities of my parents approximately 9 months prior to April 5}
Whereas the prior probability P(E) = (1/365)^2, the posterior probability P(E|F) is on the order of 1/365. Thus, P(E|F) > P(E), and --- according to the concept of explanation developed thus far --- F explains E.
However, we do not call F an explanation of event E. An explanation of the coincidence should not merely explain one of the elements of the coincidence, but it should explain each of the elements, including the conjunction of the elements.
Therefore, the definition of explanation deserves further modification. In the case that the explanandum is a conjunction of events, the explanans should increase the probability of all its constituents and the conjunction itself. In formal terms,
F explains E = E1E2, if and only if,
(i) P(E1|F) > P(E1), (ii) P(E2|F) > P(E2), and (iii) P(E1E2|F) > P(E1E2).
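As a sketch of the modified definition at work, the following Python fragment applies the three conditions to the birthday coincidence above, using the idealized probabilities from the text: the parents' activities fix my birthday but leave my uncle's untouched.

    # The three-part test for explaining a conjunction E = E1E2, applied
    # to the birthday coincidence; probabilities are the idealized ones above.
    p = 1 / 365
    P_E1, P_E1_given_F = p, 1.0     # my birthday on April 5; F makes it certain
    P_E2, P_E2_given_F = p, p       # my uncle's birthday; unaffected by F
    P_E,  P_E_given_F  = p * p, p   # the conjunction

    F_explains_E = (P_E1_given_F > P_E1 and    # (i)   holds
                    P_E2_given_F > P_E2 and    # (ii)  fails
                    P_E_given_F  > P_E)        # (iii) holds
    print(F_explains_E)   # False: F fails condition (ii), so it does not explain E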
It was a coincidence that David O. met his biggest enemy on a cruise ship. The fact that he had been working very hard and needed a break is not an explanation of the coincidence as such, although it may explain his joining the cruise and it may even increase the probability of his meeting his enemy there. However, it does not explain his enemy's joining the cruise, and thus it is unsatisfactory as an explanation of the coincidence.
There are a few more characteristics of explanation on which we want to expand before we continue. Owens writes, for instance, that
it may now be thought that while `fully explains' is a non-transitive relation, `partially explains' or `explains to some degree' is transitive. After all, the loss of the nail partially explains the loss of the shoe which partially explains ... and is it not also true that the loss of the nail partially explains the loss of the kingdom?
It is a misunderstanding that explanation, even the weaker probabilistic version of explanation that we are trying to develop, is either transitive or agglomerative. The transitivity property would mean that if G explains F and F explains E, then G explains E. However, this property does not hold. In order to appreciate this fact, consider a somewhat theoretical example. We have drawn a card at random from a deck of cards. Look at the following three events,
E = {drawing an ace}, F = {drawing a red face card or a black card} and
G = {drawing a black card less than an ace}.
P(F|G) = 1 > 34/52 = P(F), so G explains F.
P(E|F) = 4/34 > 4/52 = P(E), so F explains E.
P(E|G) = 0 < 4/52 = P(E), so G does not explain E.
This proves that `explains' is not a transitive relationship. Could it, however, be that `explains' is an agglomerative relationship? If F1 is an explanation for E1 and if F2 is an explanation for E2, the question is whether F1F2 is an explanation for E1E2. Although it may seem intuitive, in general this property does not hold. A counter-example can be constructed to disprove the agglomerative property of explanation.
Let E1 = {drawing an ace}, E2 = {drawing a black card}, F1 = {drawing a red face card} and F2 = {drawing clubs less than a king, or the ace of hearts}.
P(E1|F1) = 2/8 > 4/52 = P(E1), so F1 explains E1.
P(E2|F2) = 11/12 > 26/52 = P(E2), so F2 explains E2.
P(E1E2|F1F2) = 0 < 2/52 = P(E1E2), so F1F2 does not explain E1E2.
Thus, explanation is not agglomerative.
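Both counter-examples can be verified by brute enumeration over a deck of cards. The sketch below, in Python, assumes --- as the fractions 34/52, 4/34, 2/8 and 11/12 in the text imply --- that aces count as face cards and rank above kings.

    from itertools import product

    RANKS = [str(n) for n in range(2, 11)] + ["J", "Q", "K", "A"]
    SUITS = ["hearts", "diamonds", "clubs", "spades"]
    DECK = [(r, s) for r, s in product(RANKS, SUITS)]

    def p(event, given=None):
        # Probability of predicate `event` under a uniform draw,
        # optionally conditioned on predicate `given`.
        space = [c for c in DECK if given(c)] if given else DECK
        return sum(1 for c in space if event(c)) / len(space)

    red = lambda c: c[1] in ("hearts", "diamonds")
    face = lambda c: c[0] in ("J", "Q", "K", "A")

    # Non-transitivity: E = ace, F = red face card or black card,
    # G = black card below the ace.
    E = lambda c: c[0] == "A"
    F = lambda c: (face(c) and red(c)) or not red(c)
    G = lambda c: not red(c) and c[0] != "A"
    print(p(F, G) > p(F), p(E, F) > p(E), p(E, G) > p(E))   # True True False

    # Non-agglomerativity, with the events of the second counter-example.
    E1, E2 = E, lambda c: not red(c)
    F1 = lambda c: face(c) and red(c)
    F2 = lambda c: (c[1] == "clubs" and c[0] not in ("K", "A")) or c == ("A", "hearts")
    E1E2 = lambda c: E1(c) and E2(c)
    F1F2 = lambda c: F1(c) and F2(c)
    print(p(E1, F1) > p(E1),          # True
          p(E2, F2) > p(E2),          # True
          p(E1E2, F1F2) > p(E1E2))    # False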
Explanation and homogeneity
Every day approximately ten people in the United States die waiting for a liver transplant. Livers are not available in sufficient quantities, and selection is needed to decide who is eligible for the available livers. Liver transplants do not have the same success rate for everyone, and in November 1996 the United Network for Organ Sharing, a health review board, decided that predicted success rate should be one of the selection criteria. Those who have acute liver problems should have preference over those with a chronic ailment, because the former have a higher success rate than the latter. If those with better success prospects are treated, then overall more lives will be saved.
In the terminology that we have developed thus far, the reference class R of all liver patients can be divided into R1 = {having acute liver problems} and R2 = {having chronic liver problems}, where R1 is an explanation for event S, i.e., the success of the liver transplant, and R2 is an explanation for ~S, i.e., the failure of the operation.
P(S|R1R) > P(S|R) and P(~S|R2R) > P(~S|R).
Critics of the ruling of the health review board have argued that the explanations R1 and R2 are inferior because the subclasses R1 and R2 are not homogeneous. `Chronic liver problems' is a collective name for ailments that range from alcohol and substance abuse to hereditary problems, and success rates differ for each of them. To identify the chronic nature of a liver problem as an explanation for the failure of the liver transplant falls short of the homogeneity of the reference class that is needed to make sense of a probability. Certainly, epistemic homogeneity will secure a better survival rate once the review board's rule is implemented. Nonetheless, the ruling itself conceals some distinct ingredients of further explanation. According to its own narrow criteria the ruling could have been more precise.
We can ascribe a person's risk of getting heart disease according to her membership in a subclass, e.g., one characterized by over-consumption of salt or daily consumption of red wine. However, this subdivision cannot go on indefinitely without becoming unintelligible. The ability to ascribe a probability of a certain effect to a certain class of general characteristics depends on the presence of a homogeneous community with such characteristics from which we can infer a relative incidence frequency. When Peirce is confronted with the problem of constructing a probability (i.e., part of logic for Peirce), he correctly recognizes: "I can see but one solution of it. It seems to me that we are driven to this, that logic inexorably requires that our interests shall not be limited. They must not stop at our own fate, but must embrace the whole community." If one belongs to the class of over-consumers of salt, then one is responsible for anyone in that class who incurs a class-related ailment. According to Peirce, it is a logical requirement to identify oneself with all persons in the same homogeneity class. Imagine the situation in which a group of people, in turn, let a drop of water fall into a bucket that stands on an expensive carpet. At some moment, someone will make the bucket overflow. It is natural that everyone is responsible for what one person actually did. Responsibility is shared by all members exposing themselves to the same risk.
Events without explanations --- choice of lotteries
Before we proceed to consider the issue of responsibility, there is an important aspect that we have left aside in this section: events without explanations. The apparent structure of the argument may obscure the importance of this issue. Responsibility is attributed to a person to the extent that the agent's action or negligence serves as an explanation for what happened. If an event cannot be explained by any action, it may seem impossible to attribute responsibility to the agent. We shall analyze this issue. There are two related questions. Does responsibility correspond one-to-one with the possibility of explanation? Is it possible for an event not to have any explanation at all? The former question will be answered in the next section. Here we shall explain why the latter has to be answered affirmatively.
When one draws a card from a deck of cards, it will be one of the fifty-two cards. Given that the pack of cards has been shuffled properly and nothing more is known about the card drawn, there is no explanation for the fact that it is the Queen of Hearts. Sometimes there is no a priori partition of the reference class that makes the event more likely, despite the fact that the event did happen. Risks often appear in this form. There was nothing specific that went wrong, but the cable broke due to wear and tear, killing the worker instantaneously. Or, more poignantly, a nuclear power plant has certain risks associated with it, and the law of large numbers will guarantee that something disastrous will happen eventually --- without any specific failure.
This is not to say that there are no causes. It could be that any sequence of events is completely determined --- although we have argued that the deterministic world view has a lot of explaining to do, and that a probabilistic picture of the world has clear, pragmatic and philosophical advantages. Even a determinist could therefore agree with this analysis.
The lottery is a model for a type of reality in which sometimes things just happen. Even though Andrea had only a small probability of winning the state lottery, someone was going to win it, and it happened to be her. There is no explanation why. Although there is no subdivision of the general reference class that increases the probability of winning --- i.e., there is no explanation --- the sheer participation in the lottery is a pre-condition of having a chance at all. However, participation in the lottery is not an explanation in the formal sense that we defined. Participation in the lottery is not a subclass of the reference class; rather, it constitutes an embedding of the reference class within a larger set of alternatives --- the super reference class.
Figure 3. Events without explanations
The organizers of the lottery may have been able to choose another scheme in which Andrea had a much lower probability of winning. From a meta-perspective, therefore, the choice of lottery may affect the probabilities of the possible outcomes. The coincidental conjunction of causes may impede a true explanation of the nuclear disaster. Despite the absence of explanation of the disaster given the presence of a nuclear reactor, the conditions of the event could have been different if we had decided to play a different lottery. Whether this possibility of choice constitutes a moral imperative shall be central in the chapter on safety.
Problems: correlation and temporal direction
We shall pause briefly to consider some objections against our definition of explanation. There are many cases in which the probability-increase of the explanandum given the explanans is merely a matter of correlation and not of causality. Neighborhood police activity is correlated with crime, but does not explain it. In short, correlation endangers the effectiveness of our theory of explanation.
The temporal direction of explanation is another characteristic that complicates the definition we have attempted here. Often, the explaining event precedes what has to be explained. There are also instances in which the explanans follows the explanandum. Aristotle's final cause is a teleological explanation of an event, in which the ordinary temporal order of cause and effect is reversed. Several years ago, in a case involving a policy of Domino's Pizza, a number of pizza delivery persons caused accidents because they had to deliver the pizza within half an hour. The teleological orientation of the action was the explanation of the accident --- not so much the complex of precursory actions and decisions.
A difficulty of this definition of explanation as a probability-increasing event is not its narrowness with respect to temporal direction --- such as the one-dimensionality of physical causality --- but rather its non-specificity. Our definition of explanation is principally unable to distinguish between explanans and explanandum. If, for instance, it has been established that A, the explanandum, is explained by B, the explanans, i.e., P(A|B) > P(A), then it follows from Bayes' theorem that
P(B|A) = P(B) × P(A|B)/P(A) > P(B),
i.e.,
the explanandum, A, explains the explanans, B. Sometimes this is unproblematic. Romeo came to the cave because Juliet came there, and Juliet came to the cave because Romeo came there. However, this reciprocity does not always hold. Drunk driving may explain the death of the pedestrian, but it is not convincing that her death explains the driver's drunk driving. The trouble with our definition is not merely analytic: it will also complicate our definition of responsibility. We shall argue that because drunk driving explains the accident, the driver is responsible for the pedestrian's death. The question now is how to prevent the pedestrian from being held responsible for the driver's drunkenness.
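The reciprocity worry can be made concrete with a toy computation; the probabilities below are hypothetical and serve only to show that probability-raising runs in both directions.

    # B = drunk driving, A = the pedestrian's death; figures are hypothetical.
    P_B = 0.05            # P(B)
    P_A_given_B = 0.01    # P(A|B)
    P_A = 0.002           # P(A)

    # By Bayes' theorem, probability-raising is mutual:
    P_B_given_A = P_B * P_A_given_B / P_A
    print(P_A_given_B > P_A, P_B_given_A > P_B)   # True True: each "explains" the other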
The answer to this observation is that a definition of responsibility cannot be purely formal, and needs additional, substantive principles. These principles may differ for legal and moral responsibility. We shall argue in the next section, for instance, that moral responsibility depends on the intentionality and voluntariness of the agent. The only way that these two requirements can be overturned, so that the agent can be held morally culpable for a certain action, is if it can reasonably be shown that the lack of intentionality or of voluntariness was the result of the agent's negligence.
Responsibility and chance
Max Weber was among the first to develop an ethics of responsibility, describing the duties of a politician. He contrasted acting responsibly with acting ethically out of principled disposition or conviction. He defined acting responsibly as standing up for the consequences of one's action to the extent that these are foreseeable. It is this notion of responsibility that we have anticipated. In this section we shall explain the differences between several forms of responsibility, in their temporal constitution and their mode of attribution. First we shall make the link between explanation and responsibility explicit.
Human reason orders a course of events according to its causes. The notion of responsibility is fundamentally related to that of causality. A cigarette is said to be responsible for the explosion if the cigarette caused the fire that led to the explosion. Weber pointed out that any causal link consists of a multitude of factors. Many other conditions, such as the right oxygen level, the presence of explosive materials, etc., had to be fulfilled as well to produce the explosion. The total cause is the complex of all pre-existing factors fulfilling the necessary and sufficient conditions of the event in question. The total cause has no practical usefulness for two reasons. First, its totality is inaccessible. Secondly, it is dubious whether all events have sufficient conditions. There may not exist any sufficient reason why the U-238 atom decayed in the past minute, but it did. The notion of causality that we shall adopt for the purpose of responsibility is that of explanation. X is responsible for event E due to action A only if A is an explanation for E.
Table 6. Forms of responsibility

                 prospective                  retrospective
  legal          prohibition, regulation      liability, reward
  moral          duty, obligation             blame, praise, (moral luck)
We shall concentrate on personal responsibility as it appears at the intersection of two distinctions: prospective vs. retrospective responsibility and legal vs. moral responsibility (cf. Table 6). This analysis relies on our account of explanation. Explanation, as opposed to causality, overcomes several difficulties that generally pertain to the issue of relevance. Who was responsible for the breaking of the cup? Was it Andrea, who tried to balance it on her head, or was it her mother, who gave birth to Andrea? The difficulties that apply to a determinist account of the world are relieved by the probabilistic approach. A priori there would not be any difference in the probability of the cup breaking, whether or not Andrea was born. However, Andrea's balancing act does significantly increase the probability of the cup breaking. In the probabilistic account Andrea would be the culprit, whereas in a determinist approach to the world any event is as necessary as another.
Prospective responsibility and chance
Responsibility can belong to human beings, cigarettes, storms and the like. However, we shall narrow the scope to personal responsibility, in particular moral and legal responsibility. Personal responsibility can be divided into two main types: prospective and retrospective responsibility. To have a prospective responsibility for something is to have a responsibility, in the form of an obligation or a duty attached to the role that one fulfills, to make sure that a certain thing happens (or does not happen) or obtains (or does not obtain). For instance, doctors are responsible for the health of their patients. In these instances the action is determined by the role that one fulfills. The choice is made under full information and certainty.
On the other hand, how does prospective responsibility relate to circumstances that are uncertain and subject to chance? DNA technology has risks attached to it, but it is far from certain what these risks are, let alone their magnitude. What is the place of prospective responsibility here? Other issues concern personal choices. For instance, Brenda got drunk at the office party and wants to go home. Does she, in her condition, have a prospective responsibility to avoid driving? We shall answer these questions by looking at the issue of explanations without events.
Are those things that do not happen in need of explanation? It may seem that they are not, but we shall argue to the contrary. The drunken driver came home safely. Her drunk driving is not an explanation of her getting there without harm or harming someone. The question is "What does her driving drunk explain, if anything?" The definition of explanation as increasing the probability of the explanandum can hold both for explananda that occurred and for those that did not occur. Although she did not cause an accident, the drunken driver's drunk driving explains the accident that she actually did not cause. Therefore, the driver bears a prospective responsibility not to drive drunk, because --- notwithstanding the outcome of her driving lottery --- if she drives drunk, then her driving explains causing an accident. This prospective responsibility is moral and forces itself on the decision maker as a duty or obligation.
Prospective responsibility can also be legal --- such as the legal duty not to drive drunk in most Western countries. More broadly, in any form of planning where there is no knowledge of actual outcomes, yet different courses of action will have different outcome-lotteries, prospective responsibility is present. In those circumstances the uncertainty and chance often go beyond merely individual concern. Although individuals do have a personal responsibility to implement certain highway safety measures --- it is up to them to decide whether their cars have airbags installed --- they may not be aware of the risks to which they put themselves and others and, consequently, they may neglect safety precautions (e.g., side markers and seat-belts). Who bears the prospective responsibility for those matters that are beyond individual concern? In the case of DNA research and other risky technologies, or matters that endanger a group of the population --- e.g., occupational risks --- prospective responsibility pertains to a collective, generally the government or governmental institutions. Companies bear a prospective responsibility for product safety. Prospective responsibility does not only exist on a personal level; there are circumstances --- specifically when risk and uncertainty are involved --- in which it can also pertain to collectives and groups. This kind of responsibility cannot be moral. It is a form of legal responsibility and it generally appears in the form of regulations and prohibitions.
In the final part of this section we want to touch briefly on the connection between prospective responsibility and chance in the work of three philosophers with whom we share a great deal of affinity. For Charles S. Peirce, Hans Jonas and John Rawls the notions of probability and risk call upon us to extend our interests beyond the present moment and beyond ourselves. Peirce argues that probabilistic reasoning depends on the `logical' sentiments of "faith, hope and charity," because "[our] death makes the number of our risks, of our inferences, finite, and so makes their mean result uncertain. The very idea of probability and of reasoning rests on the assumption that this number is indefinitely great." So probability itself relies on the assumption that the individual disregards her own finitude by extending her interests as far beyond herself as possible. Although nothing will last forever, the individual should have "a hope, or a calm and cheerful wish, that the community may last beyond any assignable date." Continuity is an integral part of chance and probability. Jonas employs the connection between continuity and chance within the modern technological society. Jonas proposes an ethics of responsibility --- essentially of prospective responsibility --- because the conditions of our times, our inevitably progressing scientific-technological developments, are inherently risky, ambivalent and uncertain. Preservation, restraint and caution are essential duties, if modern society cherishes its continuity. Rawls makes an explicit distinction between two forms of prospective responsibility. First, he argues that people ought to make choices beyond their self-interest, on the ground that the lottery of life is morally neutral; for the resulting, contingent discrepancies one ought to be compensated. Second, each individual should act beyond the interests of her own present self, in accordance with the requirements of deliberative rationality. For Rawls, as for Jonas and Peirce, the conception of prospective responsibility modifies and compensates for the uncertainties of this contingent world.
Retrospective responsibility and chance
Retrospective responsibility does not necessarily come in the form of blame. It may be positive rather than negative: instead of blameworthiness, retrospective responsibility may result in praiseworthiness, as when a regulatory mechanism capable of enforcing safety standards is said to be responsible for making nuclear power a safe form of energy. Again, retrospective responsibility does not only pertain to persons but can also be attributed to collectives and systems.
To be retrospectively responsible for something is to be accountable for a certain thing in the form of moral condemnation, moral praise, punishment, or reward. Retrospective responsibility, too, can be either moral or legal. We discuss the moral case of retrospective responsibility in another part of this chapter, in a section on moral luck. The notion of moral luck, as proposed by Williams and Nagel, is a rigorous attempt to rescue and reinforce retrospective responsibility. Moral luck holds someone accountable for anything that occurred to her. We reject this form of accountability and instead propose a model of retrospective responsibility that is based on the ideas of explanation and regret. We are only morally responsible for an event if our free and voluntary action A can count as an explanation for the event that occurred and our decision for the action has not attempted to minimize the maximum anticipated regret over all available alternative actions --- our interpretation of deliberative rationality.
Because we have defined retrospective responsibility with ample regard for the possible consequences of an action, even though these consequences may not have prevailed, the prospective and the retrospective forms of moral responsibility tend to blend into one another. In both cases the moral considerations are the same, and the moral attributions are the temporal mirror images of one another. A fulfilled duty or obligation results in moral praise, whereas a broken duty or obligation entails moral culpability.
Table 7. Possible moral attributions in the presence of uncertainty
(p.r. = prospective responsibility; r.r. = retrospective responsibility)

                                 Actual outcome is
                                 undesirable                      desirable
  Acting without                 p.r.: "broken duty"              the acting subject is lucky;
  deliberative rationality       r.r.: "moral blame"              p.r.: "broken duty"
                                                                  r.r.: "moral blame"
  Acting with                    the acting subject is unlucky;   p.r.: "fulfilled duty"
  deliberative rationality       p.r.: "fulfilled duty"           r.r.: "moral praise"
                                 r.r.: "moral praise"
The important conclusion, as represented in Table 7, is that the contingent outcome is of no consequence to the moral standing of the act. The moral standing is instead determined by the deliberative rationality of the act.
The legal version of retrospective responsibility is a form of liability. A person who parks her car at a no-parking spot is retrospectively responsible for her action. She will be held liable for the amount of money that the law specifies for this offense. The extent of liability is often regulated by contract. For example, the partners of a limited partnership are liable for the firm's obligations only to the extent of their contributions to the firm's capital. Liability may also be governed by the rules of tort law, e.g., when children, insane persons, and other legally incompetent persons are not considered to be legally responsible for their actions. On other occasions legal retrospective responsibility appears as the flip-side of legal prospective responsibility. When a doctor does not care properly for her patients she may be held retrospectively accountable to the extent of her negligence.
C. Chance and Morality
What is Moral Luck?
In the previous section we illuminated the moral character of chance-like events. The probabilistic notion of explanation enabled us to come to a wider application of the notion of responsibility. Recent philosophical proposals have attempted to extend the notions of responsibility --- moral culpability or praise --- even further. We shall first examine a recent philosophical proposal that wants to give luck an independent moral status, besides intentionality and voluntariness, as a condition for moral success --- and in some cases for moral failure as well. This extension of morality into the region of luck has been coined moral luck. We shall evaluate this proposal, reject it, and propose an alternative model of minimax regret to compensate for the imperfections of deliberative rationality.
There are circumstances in which luck has a real influence on moral judgment. One group of situations we can put together as constitutive luck. We are born into this life with a certain moral sensitivity. Our moral character certainly does not depend solely on our own autonomous choice. Many contingent factors, such as parents, education and socio-economic environment, play a role in our moral behavior. Factors of circumstantial luck refer to the fact that not all people face the same moral problems. Late twentieth-century people will probably never encounter a moral dilemma concerning breathing fresh air; maybe two centuries from now people will. In densely populated and developed areas driving drunk is a moral issue, whereas an inhabitant of a deserted island does not face that problem.
There exists a long philosophical tradition, from Aristotle to Kant and beyond, of disregarding accidental events as irrelevant to a person's moral merit or fault. How could one be morally culpable for a guest's injury when a strong gust of wind blows the rain gutter from one's house? And how could a person claim moral praise for saving a baby from a burning house, if she had the intention to kidnap it? One can legitimately ask whether chance and luck have anything to do with moral judgments at all.
In the last three decades there have been serious attempts to show that a simple rejection of luck from the realm of moral judgments is not as easy as it may seem. If intentionality and voluntariness are the key concepts of normative judgment, then the concept of control expresses the supreme condition for an agent to be accountable for her actions. However, a little reflection reveals that no action is completely within the agent's control. Luck is an irreducible factor of life. If no agent is in full control of the outcomes of her actions, is it still meaningful to distribute moral praise and blame? People actually do, and --- as Bernard Williams argues --- should, attribute moral praise or blame to a person even if factors of the action are beyond her control. If the outcome of the action happens to be positive, she has moral (good) luck. She has moral bad luck if the lottery shows up with a less favorable consequence. Instead of considering the accidental and non-intentional joy-ride of events in this world as endangering morality in the back seat, Nagel's concept of moral luck enjoys this bumpy ride:
Where a significant aspect of what someone does depends on factors beyond his control, yet we continue to treat him in that respect as an object of moral judgment, it can be called moral luck.
To prevent morality from shrinking into an extensionless point, Nagel embraces uncontrollability and establishes a form of judgment that the philosophical tradition normally considers invalid: pass a moral judgment on someone with respect to consequences that were beyond her control.
Both Nagel and Williams claim that moral luck is an irreducible factor in moral judgment. At first, moral luck seems to oppose our moral, somewhat Kantian, intuitions. Kant argues that if the maxim underlying our will is universalizable, then unlucky outcomes of our actions are not morally significant. Aristotle says that any action that is not performed voluntarily or intentionally with a right amount of consideration, as is generally the case where luck is involved, is morally void. In a skeptical move Nagel argues that everything can be seen as the result of a combined influence of factors, such as education, society and character, which are not, or not fully, within the agent's control. Therefore, the area of genuine agency shrinks to an extensionless point. Nagel admits that this is a paradox, but, as he claims, it is not a contradiction.
Moral luck, according to Nagel, rescues morality from a certain death. If moral success and failure are defined by too stringent conditions, such as voluntariness and intentionality, then in almost no instance will it be possible to attribute moral praise or blame. Moral luck, on the other hand, considers the agent responsible for her actions, even when there were factors in the action that were not under her control.
In the literature on moral luck the Gauguin Case has been one of the prime examples. It is instructive to look at several replies to the Gauguin Case. In this case an imaginary man, named Gauguin, one day decides to leave his family in order to become a painter. In his essay "Moral luck" Bernard Williams describes the events and putative deliberations of this man. Gauguin is an ordinary man; he is not very happy with his life and he wants to become a painter. If he chooses to pursue his wish he has to leave his family and inflict poverty and misery on them. Whether he succeeds in becoming a successful painter or not cannot be foreseen. That will depend on all kinds of factors, which Williams puts together and calls "luck."
Williams poses the question, "What can justify or condemn Gauguin's choice to become a painter and leave his family?" Is it a priori immoral to leave one's family? Can utilitarian considerations justify the move? What is the relationship between leaving one's family and becoming a painter? Becoming a painter, on the one hand, and leaving his family, on the other, are one and the same thing for Gauguin, according to Williams. His description of the situation is that Gauguin has to leave his family in order to become a painter and that no alternatives are available. Williams comes to the conclusion "that in such a situation the only thing that will justify his choice will be success itself," because the moral issue --- leaving or not leaving one's family --- is so intertwined with the other, non-moral issue --- becoming a painter or not. So, according to Williams, only if Gauguin indeed manages to become a famous painter can we judge, retrospectively, that he has made the morally right choice.
On the other hand, failure does not necessarily condemn him. Williams introduces the distinction between "intrinsic" and "extrinsic" luck. Intrinsic luck comprises those factors that are internal to the project or action itself, such as talent and motivation in Gauguin's case, whereas extrinsic luck concerns those factors that lie outside the project as such; Gauguin may sustain an injury on the way to Tahiti, which prevents him from ever painting again. Williams claims that bad luck of the latter type can never condemn the action, and that only intrinsic bad luck is related to condemnation, i.e., only Gauguin's failure as a painter can condemn his action. Only success as a painter can morally justify his abandonment of his family.
This is the opinion of Bernard Williams. There are two presuppositions that seem to underlie this line of argument. Apparently, Williams believes that the only way that Gauguin can become a painter is by leaving his family. Becoming a painter is tied up in a one-to-one relationship with the abandonment of his wife and children and, thus, the issue of success becomes morally relevant. There are questionable aspects to this argument. First, the connection between the normative issue --- leaving one's family --- and the action --- becoming a painter --- is not absolute in any sense. Secondly, Williams employs a very broad notion of morality. He mixes the normative issue of artistic success with the normative issue of moral success. For Williams any normative judgment seems moral, in the sense that the analysis and evaluation of the success as a painter and of the abandonment of his family all culminate in one comprehensive judgment about "Was Gauguin justified in what he did?" Williams' approach to morality encompasses a complete judgment of an action.
For Kant there are three kinds of actions: moral, immoral and amoral actions. According to Kant, moral actions are ends in themselves. The only motivation for a moral action is the moral action itself. To be moral is to be consistent, and to be immoral is to be inconsistent. The last of the three categories contains, for instance, eating an apple. Under any ordinary description of this action, there is nothing inconsistent about it, nor about its opposite, i.e., not eating the apple. Actions are immoral if the maxim underlying the action leads to a contradiction. Kant would evaluate the Gauguin Case quite differently from Williams. According to Kant, the moral status of an action is not determined by its success but is rather a function of the will. Gauguin should not ask himself whether he will become a successful painter, but whether `leaving one's wife and children' is a maxim that can be universalized. It is very doubtful that it could.
Others have answered the question of morality in a different way. On a social-utilitarian interpretation, a moral action is one that aims at producing a Collective Benefit. One of its exponents is Gauthier. According to him, individuals of a society can be motivated to act in accordance with --- not necessarily on the basis of --- a model of constrained maximization of expected utility that by definition entails a moral optimum. Gauthier would advise Gauguin first to estimate properly the probabilities of success and failure as an artist, the utility of staying with wife and children, and the utilities of failing and succeeding as an artist. Subsequently, Gauthier would tell Gauguin to compare the following two lotteries on the basis of their expected utilities.
Table 8. Possible Gauguin lotteries

                 Lottery 1                      Lottery 2
                 Staying at home  Alternative   Successful artist  Failing artist
  Probability    1                0             p                  1 - p
  Utility        D                ---           W = 1              L = 0
Choosing two arbitrary values, W = 1 and L = 0, Gauguin fixes the scale and location of his utility function. Lottery One has an expectation of D, whereas Lottery Two has expectation p. If D > p, or in other words, if the relative utility of staying at home exceeds (in some sense) the probability of becoming a successful artist, then Gauguin should not leave his wife and children; otherwise he should. (Cf. Table 8)
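The comparison can be put into a few lines of Python; D and p are assumed, illustrative inputs, with W = 1 and L = 0 as in Table 8.

    # Comparing the expected utilities of the two lotteries in Table 8.
    def gauguin_choice(D, p, W=1.0, L=0.0):
        eu_stay = D                      # Lottery 1: staying at home with certainty
        eu_leave = p * W + (1 - p) * L   # Lottery 2: expectation p when W=1, L=0
        return "stay" if eu_stay > eu_leave else "leave"

    print(gauguin_choice(D=0.6, p=0.3))   # 'stay':  D > p
    print(gauguin_choice(D=0.3, p=0.6))   # 'leave': p > D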
Williams ridicules any attempt to quantify the probabilities in advance. "What is a reasonable conviction supposed to be in such a case? Should Gauguin consult professors of art?" He claims that the justification for Gauguin's action to leave his family is essentially retrospective. Nagel modifies this claim. "If the retrospective judgment were moral, it would imply the truth of a hypothetical judgment made in advance, of the form `If I leave my family and become a great painter, I will be justified by success; if I don't become a great painter, the act will be unforgivable.'" The truth of a moral judgment in any reasonable scheme of morality should, in principle, be available in advance. Luck does not determine the range of moral judgments.
We have seen that the theory of moral luck attempts to fill the vacuum that is created by a skeptical attitude towards subjectivity and free will. These skeptical aspirations make it a strictly consequentialist theory. Williams says that "the pure Kantian conception [of morality] merely represents an obsessive exaggeration." Moral luck also finds itself in conflict with other consequentialist theories of morality. In a scientific/epistemic skeptical attitude it rejects the use of a measure of luck in judging the permissibility of an action, and thereby ignores the possible use of the criterion of maximization of expected utility. Moral luck is a skeptic's response to the danger of a moral vacuum.
Although the theory of moral luck may seem at odds with standard moral theory, it does seem an attractive explanation of how social-moral decisions actually take place. The Three Mile Island nuclear accident in 1979 sparked moral indignation among the public about the use of this dangerous technology. It is often the actual success or failure of an event that gives rise to a moral judgment. Nonetheless, we shall argue that there are serious objections to this kind of reasoning. The theory of moral luck muddles moral status and social reputation together. Moreover, its scientific skepticism may cause indifference about risky technologies --- until a serious accident occurs.
The first objection against the theory of moral luck attacks specifically those cases where it is not so much the action that is subject to luck, but the judgment. The villain whose criminal actions remain secret is certainly lucky; however, her luck is not moral luck. She may still be considered to be of high moral standing, which in fact she is not. It is the skepticism about this `in fact' that causes moral status to collapse into moral standing. This skepticism is unnecessary, because there is a meaningful way to distinguish between the two. The murderer who has not been caught is socially lucky. She may be considered a virtuous person as long as she lives, but that does not change the moral culpability of her act. The structure of moral judgment is, in this respect, similar to any epistemic claim: both judgments are tentative. In the light of more information, a better judgment can be made. Luck can distort the proper information needed for a moral judgment; however, luck is not intrinsically related to the moral status of the subject that is being judged. "The difference here is not moral but merely epistemic," as Rescher says.
Secondly, Nagel's skeptical move that everything is beyond a person's control is based on the false assumption of an all-encompassing causality. It is, however, impossible to create a consistent concept of morality from this concept of causality. A comprehensive idea of causality is incomprehensible. John William Miller coins this the paradox of cause. His argument is that if causality is all-pervading, then it is impossible to say exactly what causes what, because everything causes everything. Causality shrinks to a notion without explanatory value. As a result we could say that, instead of causality, some other all-encompassing principle, e.g., God or Fate, is the constitutive maxim of all that happens. Because causality is thought to be comprehensive, it becomes indeterminable. For that reason, Miller argues, the only way we make causality intelligible is by setting limits to its application. Causality functions only in generalities.
Uncertainty about the moral status of a person is not the only way that some moralists have argued in favor of moral luck. In the Gauguin case, for instance, Gauguin's action is given. Luck enters into the picture in determining the actual outcome of the action. Luck of this kind is pervasive, and instances abound. Gauguin does not know whether or not he will become a successful painter. The Three Mile Island nuclear plant had an array of safety measures, but a disaster occurred. The woman tried to save the child from the burning building, but she failed.
For Kant and other great thinkers fortuitous events are morally irrelevant. Fortuity does not possess moral significance because there is no intentional process. What possesses moral significance is the intentional process, i.e., the will.
Even if it should happen that, owing to special disfavour of fortune, or the niggardly provision of a step-motherly nature, this will should wholly lack power to accomplish its purpose, if with its greatest efforts it should yet achieve nothing, and there should remain only the good will (not, to be sure, a mere wish, but the summoning of all means in our power), then, like a jewel, it would still shine by its own light, as a thing which has its whole value in itself.
This offers a valuable insight. Evidently a bee cannot be responsible for stinging a person, because a bee lacks an intentional thought-process. The woman, although she failed to save the child, had good intentions. Whether good intentions are enough is something that we shall discuss later. The important conclusion we can draw here is essentially negative: actions that are fundamentally risky do not attain their moral worth on the basis of their actual outcomes. Gauguin does not `become' a good moral person after he is successful as a painter, nor does the nuclear accident at Three Mile Island in itself call for moral reprehension. We agree with Rescher that "if luck alone underpins the claims your actions have to being moral, then they are, in fact, not moral at all. Morality is secure against the luck-sensitive issue of how things chance to turn out. Here luck can play no determinative role." Drunk driving is a typical case that falls under this category. In the next section we shall look at this example in more detail to determine criteria of moral significance against the background of the moral vacuity of luck.
The drunken driver
Three friends, each in her own car, drive home after a party. All three of them are drunk. Andrea causes an accident and kills a teenager on his bike. Her friend Brenda is stopped by the police and found drunk on her way home. She receives a fine. The third friend, Cheryl, arrives home safely, drops into bed, and falls asleep immediately. What is the proper distribution of moral praise and blame over these three drunken drivers? We shall give three accounts with different answers. It will become clear that each account is committed to a different concept of chance.
In Oedipus Tyrannus Sophocles portrays the intricate relationship between the concept of an agent and the concept of chance as fate. Aristotle in his Physics discusses this concept of chance as fate. He calls chance as fate a special "divine" cause, outside the four Aristotelian causes. In Sophocles' play Oedipus pierces his eyes after he realizes what he has done. Simple bad luck, in Oedipus' view, does not exist. He has been cursed by the Gods and he has to pay for it. According to this view, Andrea was predestined to kill the teenager and has to take the blame for it personally. Brenda and Cheryl, on the other hand, were not picked by the Gods to commit an immoral act, and therefore they go free of any moral blame.
Nagel defends the theory of moral luck with respect to drunk driving. He does not have a metaphysical commitment to personal fate. On the contrary, he is rather skeptical; however, the conclusions are similar.
If someone has had too much to drink and his car swerves on to the sidewalk, he can count himself morally lucky if there are no pedestrians in its path. If there were, he would be to blame for their deaths, and would probably be prosecuted for manslaughter. But if he hurts no one, although his recklessness is exactly the same, he is guilty of a far less serious legal offense and will certainly reproach himself and be reproached by others much less severely.
Andrea, the one who killed the teenager, should receive blame for driving drunk. Moreover, she should receive the most moral blame of the three drivers. For Brenda, Nagel would feel a mixture of pity and lesser blame. She was unlucky to come across a police checkpoint, and, of course, she did something wrong. She could have killed someone. However, with hindsight, she did not `really' do anything wrong. We do not blame her as much as Andrea. Cheryl is generally called lucky, and most people would not feel the need to blame her for her driving drunk. She did not kill anybody and she did not get caught. Some people may argue that the `could have' argument also applies to her. In that case she deserves as much blame as Brenda does. Nagel and Williams argue that morality cannot be made immune to luck, and, instead, they embrace luck as a constitutive factor in moral judgments.
We shall defend the third view: that luck is morally neutral. This view requires the ontological assumption that luck is essentially subject-independent, i.e., if u1 and u2 are two distinct subjects with identical relevant attributes, then u1 and u2 can expect the same amount of luck ante rem, l_ar(u1) = l_ar(u2). This assumption is in conflict with some teleological views of the universe; however, it allows the view of a universe designed by a probabilistically trained God. The blindness-assumption of luck, combined with the definition of morality as treating people as free individual agents, implies that luck should be excluded from morality, because a free individual agent is by definition independent of luck.
Getting into the car after one has been drinking is like becoming involved in a lottery because the outcome of this game is in a statistically significant sense beyond the agent's control. The moral neutrality of luck implies that the outcome of a driving-lottery is morally void. Killing someone does not influence the moral reprehensibleness of the act of driving drunk. So, if one person is judged to be morally culpable for her action, then the other two should be blamed to the same extent. From a moral point of view, Andrea, Brenda, and Cheryl are equally accountable. Rescher shares our point of view: "People who drive their cars home from an office party in a thoroughly intoxicated condition, indifferent to the danger to themselves and heedless of the risks they are creating for others, are equally guilty in the eyes of morality (as opposed to legality), whether they kill someone along the way or not."
The next question is `To what extent are they morally accountable?' Is, for instance, each one of them fully culpable for the death of the teenager that one of them killed? Or is each of them, even Andrea, culpable to a lesser extent? The answer to this question is closely related to the kind of risk-attitude one chooses to defend. Extreme risk-averseness is based on the assumption that the worst possible outcome should be avoided by all who would possibly play in the lottery. Therefore, if there are subjects who do choose to play in the lottery of driving, then they are morally responsible for the worst possible outcome. The typical decision rule associated with risk-averseness is the minimax criterion, i.e., choose your action so as to minimize the maximal expected risk. The minimax decision rule explains our intuition that driving drunk is criminal, or that nuclear plants have to be justified by a worst-case analysis.
There is a problem with this decision rule. Anyone who gets into a car assumes an inherent chance of causing a car accident. In fact, any activity in which we are involved has a probability, however infinitesimal it may be, of an inherently bad outcome. Does this mean that we are morally responsible for this worst-case scenario the moment we become involved in the activity? Philosophical rigidity would force the risk-averse person to make this strong connection between moral responsibility and the worst case. Rigid risk-averseness is therefore like an original sin trapping every human being, anytime, anywhere, anyhow.
Apparently, strict risk-averseness is untenable. But why, then, is it said to be virtuous to err on the side of caution, and a lack of virtue not to do so? Aristotle called prudence one of the intellectual virtues. "...In art he who errs willingly is preferable, but in practical wisdom, as in the virtues, he is the reverse." When a lot is at stake, human beings tend to act prudently. Against the possibility of large damage they insure themselves. The deeper reason lies in the notion of regret. Williams correctly observes that "the constitutive thought of regret is generally something like `how much better if it had been otherwise', and the feeling can in principle apply to anything of which one can form some conception of how things would then have been better." We disagree with Williams that this is essentially an a posteriori notion. Regret can be anticipated. For our purposes we shall introduce the notion of agent-regret, i.e., the regret that a person feels towards her own past actions. It is the agent-regret that we anticipate that makes us take out insurance or drive soberly.
The philosophical structure of agent-regret is important for determining its normative implications. Agent-regret is the rupture between the present self and the past self. If the present self rejects the actions of the past self, then there is evidently a violation of identity. Identity is the condition for ethical accountability. This does not mean that every violation of identity points to a moral violation. If I regret that I did not buy the shirt, then there is a conflict between my present and past self, but there is no moral issue. Having or not having the shirt is generally not a moral issue, but killing or not killing a person generally is. In order to determine the moral culpability of driving drunk, the agent-regret of a drunken driver who killed the teenager on the bike should be capable of being anticipated. The regret could indeed have been anticipated, because for the driver there were easily accessible alternative courses of action available that would likely have prevented the disastrous outcome. Andrea could have taken a cab or could have limited her alcohol consumption in the first place. The decision rule that we propose is the minimax of anticipated regret, i.e., minimizing the maximum anticipated regret. This means that one should choose to act in such a fashion that the worst possible regret over all possible alternatives is minimized.
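Schematically, the rule can be sketched as follows; the regret values are purely hypothetical and serve only to show how the minimax of anticipated regret selects among Andrea's alternatives.

    # Minimax of anticipated regret: choose the action whose worst possible
    # anticipated regret is smallest. All regret values are hypothetical.
    anticipated_regret = {
        "drive home drunk": [0.0, 100.0],   # arrive safely / kill someone
        "take a cab":       [1.0],          # cost and inconvenience
        "limit drinking":   [0.5],          # foregone pleasure at the party
    }
    best = min(anticipated_regret, key=lambda a: max(anticipated_regret[a]))
    print(best)   # 'limit drinking': its maximum anticipated regret is smallest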
Although this criterion is risk-averse, it does achieve more intuitive solutions than the strict minimax rule with respect to extreme situations. For instance, even a sober and prudent driver of a car has a chance of causing a deadly accident. However, we do not consider all sober and prudent drivers morally responsible for this inherent probability. The minimax of anticipated regret criterion can account for this fact. For a sober driver there are often no easily accessible alternatives available. Possibly one of the safest and most economical ways to get from A to B is by driving a car soberly and prudently. The agent-regret that can be anticipated as a result of a deadly accident beyond the control of the driver will not be very high. The driver may regret the accident, but she will not, or not to a great extent, regret her driving as such. This result is consistent with what Rawls calls deliberative rationality. "A rational individual is always to act so that he need never blame himself no matter how his plans finally work out... Therefore any risks he assumes must be worthwhile, so that should the worst happen that he had any reason to foresee, he can still affirm that what he did was above criticism." Unfortunately, life is too unpredictable to achieve complete deliberative rationality. For this reason regret will always remain a constitutive part of the individual's self-evaluation of her past actions.
Regret --- the ordinary and the extraordinary
The idea of regret exhibits a preoccupation with the extraordinary, rather than with the ordinary. Is it justified to pay so much attention to what only rarely occurs, and even to let moral judgment depend upon what, in these rare cases, could happen? The concept of `morality' is derived from the Latin mores, meaning habit or custom. Etymologically, then, it may seem that morality is aimed more at the ordinary than at the extraordinary. Rescher argues, "moral evaluation as we actually practice it generally reflects the ordinary course of things. Ordinarily, breaking and entering is a wicked thing to do. Ordinarily, driving drunk increases the chance of harm to others." As a result, Rescher claims that moral judgment has to take into account what can reasonably be expected. "Moral blame or credit ... hinges on what can reasonably be expected and not on actual outcomes --- or what actually chances to occur. It is this gearing to the issue of what can reasonably be expected to happen that detaches moral evaluation from the issue of actual outcomes in a way that factors luck out of the picture."
Rescher, too, attempts to "factor luck out of the picture," as we do. However, instead of being concerned with the extraordinary, with what might happen, Rescher believes that moral judgment should only take into account what is reasonably likely to happen, and he gives several examples that seem inconsistent with his definition of what can "plausibly be expected." On the one hand, Rescher gives an obvious example: "Ordinarily, mendacious people cause pain." On the other hand, to say that `alcohol ordinarily increases the probability of harming others' is a distinct use of the word `ordinarily,' because, ordinarily, a person who drives drunk does not harm anyone at all.
Although we share Rescher's conclusions, his analysis is conceptually unsatisfactory. It is not the ordinary that determines the moral worth of an uncertain action. On the contrary, in our moral deliberations we should give consideration to the extraordinary. The small probability of a nuclear accident should be taken into account when a decision is being made about whether or not to build nuclear power plants. Moreover, negligence in reducing the probability of a disaster may imply moral culpability on the part of the decision maker, even if nothing happens.
Figure 4. Regret's eroding effect on deliberative rationality
The internal functionality of moral judgment hinges on the double nature of humanity. It is our capability of doing, and of becoming aware of, radical evil, as Kant would say, that makes us susceptible to moral judgment. Human beings are to act according to deliberative rationality, which morally safeguards them from many uncertainties with respect to outcomes. However, humanity's limited rationality will always confront a person with the unexpected. Also, chaos and the free choices of others present themselves as irreducible imponderables in human life. Consequently, despite deliberative rationality a human being may always confront regret in the course of life. We propose an adjustment of Rawls' picture of deliberative rationality, which considered each point of the future equally relevant for the decisions of a moral subject (cf. Figure 4).
Morality is a function of two variables: foresight and repentance. A moral subject has a duty to foresee the potentialities of her course of action. She should minimize the possibility of unexpected harm. The driver of a car should have the deliberative rationality to avoid driving drunk, because she knows that if she drives drunk and causes a fatal accident she will feel regret. The inevitable regret indicates that driving drunk is inconsistent with deliberative rationality, for she would not have felt regret if the action had been in accordance with it. Regret is the corrosion of deliberative rationality, and it constitutes the ground of moral reprehension. If an accident is the result of a contingent brake failure, the driver may regret the unexpected, but the presumed lack of proper alternatives removes the basis for retrospectively wishing that she had altered her decision.
D. Justice and Luck: Fair Opportunity
It seems to me that a major error is made if risk is seen as something peripheral to ethical questions. What more fundamental ethical question could there be than: Who should bear what risk?
Luck intersects the workings of justice in many unfortunate ways. The criminal goes free and the innocent is convicted because of the capricious course of luck and bad luck. The tension between luck and justice has long been recognized. One of the oldest personifications of this tension is Astraea, a legendary goddess representing both luck and justice. Aratus of Soli in Cilicia, a didactic poet at the court of Antigonus II Gonatas of Macedonia (315-245 BC), describes in his Phaenomena the tale of Astraea, the Goddess of Justice. In the three stages of the world, the Golden, the Silver and the Brass Age, Astraea appears in three different shapes. In the harmonious Golden stage, where people are naturally kind to each other, Astraea is omnipresent and "Justice herself ... abundantly supplied [the people's] every need." In the Silver stage people are greedy and selfish, and Astraea, although still on earth, "with threats rebukes their evil ways." Justice has become a matter of duty and prohibition. In the Brass stage people fight wars and kill each other, and justice is absent. Astraea has fled to the sky, becoming Libra in the starry night. In astrology, Libra is the seventh sign of the zodiac, and she is represented as a blindfolded woman holding a balance scale, the symbol of justice.
Astraea also bears a relationship to risk and luck, even within her family. She is said to be the daughter of Astraeus, the father of the stars and dice. Moreover, Astraea's name is etymologically related to `astragalos', which is Greek for one of the proximal bones of the tarsus, i.e., the part of the foot between the metatarsus and the leg in the higher vertebrates. It is suspected that the anklebones of sheep, marked on four sides, are the forerunners of modern dice. One of the questions we shall ask in this section is whether dice (Astraeus) are able to give birth to justice (Astraea). The answer to this question will result from an investigation of another striking similarity between the ancient imageries of justice and luck. Both justice and luck are represented as blindfolded women. However, the difference in their blindness will be an essential issue in this section. We shall argue that luck abstracts from those attributes that are essential for justice, and vice versa. Luck abstracts from desert, whereas justice abstracts from personal favor.
The fact that Astraea's relationship to both justice and luck is dialectical instead of symbiotic is not uncommon in Greek mythology. Hermes is the protector of commerce and travel on the one hand and of theft on the other. Aratus beautifully plays out the opposition between justice on the one hand and risk and luck on the other in his description of the people of the Golden Age, when justice was omnipresent:
Greed for wealth from far away did not cause them to build ships and entrust them to the hazards of the winds. ... There were no boundary stones marking off their owners' small domains, for they were quite safe without them.
Safety, i.e., the absence of hazard, of risk, and of luck, is a precondition for the presence of justice. This matter will be explored in greater detail in the chapter on safety.
It is this tension between Libra and Fortuna, between justice and luck, that will be the subject of investigation in this section. We shall begin by discussing justice in its own right. Each of the three modern theories of justice that will be discussed exhibits its own commitment to compensation and adjustment for luck's inequalities. We shall argue that a theory of justice based on fair opportunity is the only theory that can deal with luck in a way that is consistent with the moral principles that we have developed in the first part of this chapter.
Principles of justice

In the Republic Plato argues that justice consists in a harmony that emerges when the various parts of the whole perform the function proper to them. For an individual, justice comes forth when the three components of the soul --- reason, will and appetite --- perform their appropriate tasks. For a society, justice consists in the harmony among the three groups of people in a society --- the philosophers, the police force and the citizens --- i.e., when each group does what it is supposed to do: to rule, to control or to produce. When everybody in this harmonious conglomerate functions according to her proper place, justice flows through the whole of society. In this section, we shall discuss the justice of society, rather than justice as an individual virtue. Plato's pupil Aristotle gives a further demarcation of our topic.
Aristotle's treatment of justice has been the starting point of most Western accounts. In the fifth book of the Nicomachean Ethics Aristotle asserts that "all men mean by justice that kind of state of character which makes people disposed to do what is just and makes them act justly and wish for what is just." Justice is the habit of acting justly. In other words, justice is that which comes forth out of a just procedure. Aristotle distinguishes between a just distribution of wealth or other goods and justice in reparation, as, for example, in punishing a person for the crime she committed. The former is called distributive justice, the latter corrective or rectificatory justice. Distributive justice has to do with the division of benefits and burdens, whereas corrective justice has to do with equitable transactions between human beings. Distributive justice "is that which is manifested in distributions of honour or money or the other things that fall to be divided among those who have a share in the constitution." Corrective justice requires the individual to act equitably in actions or transactions affecting the interests of her fellow citizens. There is a common core to both. The key element of justice is, according to Aristotle, to treat like cases alike and to treat distinct cases unequally. This is called the principle of formal equality. By itself this account of justice is void of any content. Central to a concept of justice, then, is to work out in which relevant respects cases should be called equal. Later thinkers have proposed several grounds that should count as the criterion of equitability. That is, they supplemented the merely formal definition of justice, treat like cases alike, with a material principle of justice. Proposals include similarity on the basis of need, effort, contribution, merit, desert or free-market status.
Cicero argued that justice consists of "giving to each his own" (`suum cuique tribuens') and Simonides was of the opinion that "to give what is owed to each is just" (`to ta opheilomena hekastoi apodidonai dikaion esti'). These definitions are reformulations of the principle of formal equality. They do not specify any criterion on the basis of which it can be decided what counts as one's own. In the distribution of goods and evils, justice specifies a rule for what should be handed to a person as her own, that is, which "fair part" is hers.
Most philosophers agree that justice does not cover the whole moral spectrum. There are moral concepts that cannot be derived from justice. A just society is not necessarily a morally superior one. In the Roman and Christian tradition justice has been contrasted with and supplemented by benevolence. In many contemporary practical decision problems this still holds true. A very expensive rescue attempt for a victim with almost no chance of survival is not justifiable from the point of view of almost any theory of medical justice; however, the principle of benevolence, as a symbolic confirmation of society's strength and compassion, may override justice and lead society to undertake such an action. Justice is a normative concept that can be overridden by other ethical claims.
Justice is the basis of a legitimate moral claim or obligation. A patient can claim not to be denied equal treatment on the basis of justice. Similarly, justice can demand that a citizen pay her taxes in order to redress unjust differences. What kinds of claims and obligations these are depends on the underlying material principles of justice. We shall discuss three theories in their pure forms: utilitarianism, libertarianism and egalitarianism. Each of them emphasizes a certain notion of justice. Beauchamp and Childress argue that "in the face of divergent and controversial appeals to justice, theories of justice are devoted to systematizing, simplifying, and ordering our diverse rules and judgments." They claim that in practical situations a mix of principles is appropriate.
Utilitarianism
Utilitarian theories argue that the distribution of goods is a distribution of utility and that distributive justice consists of maximizing that utility. The total utility of a certain situation is defined as the individual preferences aggregated into a single index. Utilitarianism defines its material principle of justice according to people's needs. Utility is seated in what is perceived as useful. Utilitarian theories can disagree among themselves whether utility is a subjective or an objective property. The founder of utilitarianism, Jeremy Bentham, defined utility purely subjectively: pursuing happiness and avoiding pain. It is by nature a purely hedonistic theory. Modern and practical accounts of utilitarianism have generally defined utility more objectively.
In devising a system of public funding for health care, the utilitarian believes we must balance public and private benefit, predicted cost savings, the probability of failure, the magnitude of risk, and so on.
One can employ values other than, or in addition to, pleasure (ideal utilitarianism), or one can value that which appears as an object of desire; the latter approach is prevalent in modern economics (preference utilitarianism). The tension between `objective' and `subjective' utility is contained in the very core of utilitarianism. In order to make comparison possible, personal utilities have to be transformed into a one-dimensional value: the welfare function. Harsanyi, for instance, adds the personal utilities to arrive at a single collective utility.
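Schematically, and in notation of our own choosing rather than Harsanyi's original formulation: if u_i(x) denotes the utility of social state x for individual i, an additive welfare function of Harsanyi's type takes the form

\[
W(x) \;=\; \sum_{i=1}^{n} u_i(x),
\]

and distributive justice, on this account, consists in selecting the state x that maximizes W(x). The sketch presupposes that the u_i are interpersonally comparable, which is precisely the transformation into a one-dimensional value mentioned above.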
Despite obvious criticisms, utilitarianism has proven to be an important tool in twentieth-century social policy. Its significance in law, politics, and economics is remarkable. The chief reason why this ethical theory has been easily applicable to social circumstances is that it defines the goodness of an action in terms of overall consequences, instead of in terms of individual intentions or abstract principles. Distributive justice, according to utilitarians, is the maximization of utility. In the modern, numerically oriented society, utilitarianism makes normative social decisions on the basis of a calculus of utility. Cost-effectiveness and cost-benefit analyses are a direct result of incorporating the principle of utility into societal decision making.
An important aspect of utilitarianism is that the calculus of utility may indicate that the actual distribution of utility is not the optimal allotment. Utilitarians generally call for political action to realize justice, including a redress of wealth through taxation in order to benefit those who are genuinely needy. The principle of utility may demand the institution of general health insurance to serve the poor. Equals should be treated equally, and utilitarians define this similarity on the basis of need.
Libertarianism
The origins of libertarianism stem from the rise of the merchant class at the end of feudal times. Libertarianism is originally an economic system specifying that economic goods are best distributed on the basis of free-market rule. The United States has largely accepted free markets for the distribution of goods. Of a more recent date is the idea that a free market produces, besides an `economically optimal' distribution, also a just allocation of goods. Proponents of this view are therefore strongly opposed to any redistribution of wealth.
The argument for a morally just free market is derived from the conviction that there are inalienable rights. Rights theories of ethics have found most acceptance in the United States. Robert Nozick, for instance, argues in his Anarchy, State, and Utopia that the rights to life, liberty, and legitimately acquired property are absolute, and that no act that violates them can be justified. But what should I do, for instance, if my right to property conflicts with someone else's right to life, for instance, because she is starving? Do I have a duty to assist her? Nozick unambiguously denies any duty of that kind. Similarly, a redistribution of wealth via taxation is unjust. People have an inalienable right to their property and no one, not even a state, can infringe on that.
Egalitarianism
Whereas utilitarianism is a consequentialist theory, defining justice as that distribution which yields the highest utility, and whereas libertarianism as a rights theory is strictly deontological, seeing justice as the application of a fair procedure, i.e., the free market, egalitarianism is a mixture of both. In its radical form, however, egalitarianism is consequentialist, defining justice as the extent to which the benefits and burdens in a society are distributed equally over its members. This radical egalitarianism has played an important rhetorical role in the emancipation of women and oppressed minorities. A justification for radical egalitarianism is, however, not so easy to give. There are obviously many differences between human beings, and the burden of proof is on the side of egalitarians to argue why those differences are not relevant.
A more carefully formulated egalitarianism is developed by John Rawls in his A Theory of Justice. His theory is deontologically justified with a slight consequentialist flavor. Rawls takes the Aristotelian formulation of justice as his point of departure. Justice consists of treating like cases alike. There are close conceptual connections between justice, on the one hand, and fairness and desert, on the other. "One acts justly toward a person when that person has been given what the person deserves." That is only fair, one could say. Rawls explicitly defines justice in terms of fairness. The Rawlsian material principle of justice, which is derived from the idea of fairness, is that to each person comes what she deserves. This idea by itself does not necessarily lead to egalitarianism. A meritocracy is based on the same principle, and not all meritocracies would opt for an egalitarian principle of justice.
There is another idea central to Rawls' theory, which taken by itself corresponds with the main premise of this dissertation.
[A free-market arrangement] permits the distribution of wealth and income to be determined by the natural distribution of abilities and talents. Within the limits allowed by the background arrangements, distributive shares are decided by the outcome of the natural lottery; and this outcome is arbitrary from a moral perspective. There is no more reason to permit the distribution of income and wealth to be settled by the distribution of natural assets than by historical and social fortune... The extent to which natural capacities develop and reach fruition is affected by all kinds of social conditions and class attitudes. Even the willingness to make an effort, to try, and so to be deserving in the ordinary sense is itself dependent upon happy family and social circumstances.
After determining that justice is defined by desert, Rawls discovers that many of our ordinary notions of desert are dubious, because many of the subject's defining qualities (intelligence, talent, motivation) are not deserved at all but instead determined by the lottery of life. We happen to be born with intelligence. Even human attitudes such as laziness and self-discipline are dependent on genetic and social factors.
The combination of justice as fairness with the recognition of the moral neutrality of life's lottery leads Rawls to an egalitarian theory of justice. All vital and economic goods should be distributed equally, unless an unequal distribution would work to everyone's advantage. Moreover, from the point of view of justice, the undeserved advantages and handicaps that result from the social and natural lotteries should be compensated for. Everyone should be given a fair opportunity.
History of the lottery

The earliest recorded occurrence of the lottery was in Babylon. The Bible, too, gives evidence of the use of lots in the distribution of goods. In the Old Testament the Lord instructs Moses to take a census of the people of Israel and to divide the land among them according to the outcome of a lottery. Some Roman emperors were known gamblers, and several held lotteries to give away goods and slaves during parties and feasts. In Europe the first lotteries appeared in the fifteenth century in Burgundy and Flanders. Towns organized lotteries in order to raise money for fortifications and aid to the poor. La Lotto de Firenze, held in Florence in 1530, was among the first lotteries that paid money as prizes. La Lotto was so successful that other Italian cities quickly followed. Queen Elizabeth I held a general lottery in 1566 to finance several public works, such as repairing harbors. James I gave permission to the Virginia Company to organize a lottery to raise money for the settlement of Jamestown in the New World. In many European countries the lotteries that were organized by the state were an important source of income that financed many public projects. Their advantage over taxation was that they were easy to organize and that participation was voluntary, even passionate. This finally led to their decline: it was argued that lotteries encouraged mass gambling, and in 1826 Britain prohibited them. Nowadays, in many countries of the world there exist state-organized lotteries, besides the many other forms that have sprung from La Lotto de Firenze, such as modern gambling games, lotto, keno, and bingo.
Personified luck and its dangers
The analogy of human life and games of chance is an old one, dating back to the Ancients. In the Roman era the prime ruler of both life and gambling was the same goddess, Fortuna. She controlled the unpredictable occurrences in a human life and at the gambling table. The `wheel of fortune' still contains both of these connotations. In Hellenic times luck was represented as an impersonal force. In the Roman era luck became personified as a goddess, which made luck seem more controllable. The Roman goddess, Felicitas (the goddess of good luck), was honored specifically by successful commanders for whom she was a special protector. Caesar had a temple built for her. Emperors hoped to control the fate of the empire by worshipping her as a symbol for the blessings of their regimes.
If one could get the favor of Fortuna, then one had `luck at one's side.' If one had angered the goddess, then it was no good to count on anything. Luck became interpreted as a matter of desert and effort. If one was unlucky, then one had failed to pay Lady Luck the proper attention. That held true for a gambling game, as well as for events in life. It is not surprising that often Fortuna (Luck) and Anagkê (Necessity) were thought to act together. This connection reveals something essential about the human mind. Rescher observes, "the linkage doubtless roots in the human penchant for seeing Reason at work everywhere." Human beings want to know why they are unlucky and they are willing to invent pseudo-reasons to soothe this curiosity.
The modern perspective is dramatically different. With respect to games of chance, there is a rational theory that describes the probabilities involved and postulates that there is no explanatory reason why in matters of chance things sometimes turn out this way, and sometimes that way. It has discredited superstitious ideas about chance but has certainly not eradicated them. Indeed, many people still believe that luck's mechanism itself can be influenced by knocking on wood, carrying a lucky stone or a rabbit's foot, etc. The personification of luck is not merely a harmless fossil of an old tradition. It is a superstition of the worst kind, because it produces a false sense of control and distracts attention from more sensible ways of dealing with luck: being prudent and well informed by the best developed theory of probability that is available.
The Lottery in Babylon

The ancient imagery of the wheel of fortune is present in an imaginative story by Borges. The story serves wonderfully to illustrate the ubiquity of luck in human reality and how luck --- when made an explicit object of awareness --- violates an intuitive sense of justice. In The Lottery in Babylon the narrator introduces a fantastic custom from his native, exotic country:
I come from a dizzy land where the lottery is the basis of all reality.
The narrator, apparently a slave on a ship, explains that he earned his fate by random assignment in a lottery. These strange events give the story at first a sparkle of fantasy. One of Borges' commentators is surprised by "the jolting ethical and political oddity of such a system." However, eventually it becomes clear that that distant, "dizzy land" is, after all is said and done, our own: some sort of lottery is the basis of all our reality.
The narrator tells the story of how the lottery became institutionalized in Babylon. In a far past, he recalls, there were lotteries that assigned winning prizes to those who had bought the winning ticket. These lotteries wore out, however, until a mysterious institution, the Company, reformed them so as also to include losing tickets. Losers would have to pay a fine. The new lottery cultivated not only hope but also terror, and became an instant success. Losers who could not pay their fine were sued by the Company, and some ended up in jail. For the rich the lottery became a favorite sport. The poor felt that they were denied access to something essential, and they rebelled.
There were disturbances, there were lamentable drawings of blood, but the masses of Babylon finally imposed their will against the opposition of the rich. The people achieved amply its generous purposes. In the first place, it caused the Company to accept total power. (That unification was necessary, given the vastness and complexity of the new operations.) In the second place, it made the lottery secret, free and general. The mercenary sale of chances was abolished. Once initiated in the mysteries of Baal, every free man automatically participated in the sacred drawings, which took place in the labyrinths of the god every sixty nights and which determined his destiny until the next drawing.
Now the lottery received a totalitarian slant. Everyone had to participate in the lottery. The drawing of the lottery was concealed from the people. The Company, which had gained total power, used the secrecy to give the people a slight hint of a certain purpose in the outcome of the drawing.
In many cases the knowledge that certain happinesses were the simple product of chance would have diminished their virtue. To avoid that obstacle, the agents of the Company made use of the power of suggestion and magic.
Although the drawings were done at random, the secretive Company employed coincidences to give the outcomes an appearance of purposefulness. The telos was a mere creation of the Company, and an even more radical reform was implemented. Some Babylonian theorists had wondered: "If the lottery is an intensification of chance, a periodical infusion of chaos in the cosmos, would it not be right for chance to intervene in all stages of the drawing and not in one alone?" The new reform proposed to make not only the drawing random, but also the prizes of the drawing. One could `win' the conviction of being the murderer of C, and subsequently chance could assign you to be rescued from the death sentence by the random acceptance of your pardon request by a randomly appointed governor. "In reality the number of drawings is infinite. No decision is final, all branch into others."
The Company, the lottery, and the secret interpretation-schemes of all outcomes became the basis of all truth in Babylon. The narrator tells us that Babylonian reality has become so full of chance, and its working so complete, that it has become difficult to distinguish between a lottery and an intentional act. The narrator explains the consequences of this extreme circumstance.
The Company, with divine modesty, avoids all publicity. Its agents, as is natural, are secret. The orders which it issues continually (perhaps incessantly) do not differ from those lavished by impostors. Moreover, who can brag about being a mere impostor? The drunkard who improvises an absurd order, the dreamer who awakens suddenly and strangles the woman who sleeps at his side, do they not execute, perhaps, a secret decision of the Company? That silent functioning, comparable to God's, gives rise to all sorts of conjectures. One abominably insinuates that the Company has not existed for centuries and that the sacred disorder of lives is purely hereditary, traditional. Another judges it eternal and teaches that it will last until the last night, when the last god annihilates the world. Another declares that the Company is omnipotent, but that it only has influence in tiny things: in a bird's call, in the shading of rust and of dust, in the half dreams of dawn. Another, in the words of masked heresiarchs, that it never existed and will not exist. Another, no less vile, reasons that it is indifferent to affirm or deny the reality of the shadowy corporation, because Babylon is nothing else than an infinite game of chance.
However extreme this situation may seem, this last, most radical stage of the institution of the lottery in Babylon reminds us of the human condition in general. In the final development, the Company has become completely anonymous. The drawings are completely secretive and cannot be distinguished from purposeful behavior. The secret agents of the Company stage purpose, or, maybe, the human mind is essentially teleological. Human evolutionary survival has depended to a large extent on the development of our brain, and our ability to make meaningful associations between the multitude of facts surrounding us. Certainly, human intelligence has been able to capture general patterns, but when confronted with a concrete, individual case, the situation is much more complex. Although we have accurate statistical predictions for the total number of murders in the United States next year, we are not able to predict the individual cases. This is the mark of randomness. In the nineteenth century the Belgian statistician, Lambert Adolphe Jacques Quetelet, discovered this idea.
According to Quetelet, statistical laws showed that order was universal, that irrationality had its reasons, that even crime was subject to law. ... In the average, everything particular or exceptional balances out, and we are left with a certain penchant for crime characteristic of a given community.
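In modern probabilistic terms this contrast between aggregate regularity and individual unpredictability can be stated simply; the notation below is a sketch of our own, not anything Quetelet wrote. If each of n individuals independently commits a crime with small probability p, then the total number of crimes S_n is sharply predictable while each individual case is not:

\[
S_n = \sum_{i=1}^{n} X_i, \quad X_i \sim \mathrm{Bernoulli}(p), \qquad
\mathrm{E}\!\left[\tfrac{S_n}{n}\right] = p, \qquad
\mathrm{Var}\!\left[\tfrac{S_n}{n}\right] = \frac{p(1-p)}{n} \;\longrightarrow\; 0,
\]

so the relative frequency S_n/n settles down to the `penchant for crime' p as n grows, even though each X_i remains as uncertain as a single coin flip.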
Borges captures this idea when he describes the man who murders the woman as if he were chosen by lot to do so, having a certain penchant for crime.
Apparently it has become impossible to distinguish between a world ruled by intention and a world ruled by chance. The list that the narrator gives of "vile" suggestions about the nature of the Company establishes the argumentative link between Borges' fiction and the human world.
- "...the Company has not existed for centuries and that the sacred disorder of lives is purely hereditary." This view corresponds most accurately with the modern scientific view of the origin of the universe. Many secular physicists believe that the universe was deterministically created out of a set of random (here: absence of design) initial conditions combined with natural laws. Borges' Company was responsible for a beginning without purpose. After that chance withdraws from reality, and the disorder is purely deterministic.
- "... [the Company] never existed and will not exist." This is the strong deterministic view combined with a believe in design. There was neither chance involved in the creation of the universe nor in the preservation of it. We can think hereby of certain religious view, including Christian, Jewish and Muslim perspectives.
- "... the Company is omnipotent, but that it only has influence in tiny things: in a bird's call, in the shading of rust and of dust, in the half dreams of dawn." This idea corresponds perfectly with what Reverend William Paley, a creationist, observed in 1802:
What does chance ever do for us? In the human body, for instance, chance, i.e., the operation of causes without design, may produce a wen, a wart, a mole, a pimple, but never an eye. Amongst inanimate substances, a clod, a pebble, a liquid drop, might be; but never was a watch, a telescope, an organized body of any kind, answering a valuable purpose by a complicated mechanism, the effect of chance. In no assignable instance hath such a thing existed without intention somewhere.
The idea that chance can only be responsible for the accidental dates back to Aristotle, and has been incorporated into Christian culture. God organizes the overall structure of life. Only the accidental is a matter of chance. In a certain way, it was Darwin who picked up this idea when he presented his idea of random variation, i.e., "cause without design," which would produce a design without a designer.
- "...[the Company is] eternal and ... it will last until the last night, when the last god annihilates the world." The workings of chance and luck are omnipresent. Reality "is nothing else than an infinite game of chance." From the human point of view many occurrences are haphazard, in the sense that they lie on unpredictable intersections of many different causal lines. This even holds for our personality to some extent. Thomas Nagel calls this "the phenomenon of constitutive luck --- the kind of person you are, where this is not just a question of what you deliberately do, but of your inclinations, capacities, and temperament."
Despite the fact that these were merely "vile" suggestions about the nature of the Company, it becomes clear that the story is an analogy and that the Company can be interpreted as a metaphor for chance in some sense.
Lottery of life

Why should any particular lottery,
whether run by Nature or not,
enjoy some special status?
We shall concentrate on the philosophical merit of the fourth point, i.e., that `essential' aspects of human reality are a matter of luck. This view encounters philosophical opposition from some religious quarters. The belief that the human faculties are simply distributed at random seems as dogmatic as the view that this is not the case.
The difficulty with all these forms of gaming is that they are built on a false premise: namely that good comes to us by chance, not by God. This premise makes good seem uncertain --- a matter of luck instead of the result of divine laws that are totally reliable. ... Man, as God's spiritual idea, always has all the good he needs.
Although this is not the position that we defend, we would like to accommodate those views in a weaker version of the fourth thesis. We withhold an opinion about whether human reality is a matter of design or not. The weaker version allows opinions that see the cosmos as the expression of divine or other kinds of laws, as well as views that see mostly chance and luck at work in human reality. It consists of the view that, although the overall distribution of abilities and inclinations may be a matter of design, it is a matter of chance which individuals are distributed to these abilities and inclinations. So instead of considering the abilities, inclinations and social background as the prizes in a lottery, we take them as our subjects and consider the individuals as the prizes distributed to them. This weaker version accommodates teleological as well as anti-teleological views. It shapes the ancient analogy of life and games of chance into a new form and provides a model of one insight into life: that many things that are central to our existence happen to come to us by mere luck. Human endowments, talents, and heritage are a matter of luck.
Nicholas Rescher believes that the lottery analogy of life is invalid. Rescher insists that the endowments that we have are not a matter of chance, but of fortune. This is not a mere play on words. He means that it is meaningless to talk about a chance-like assignment of endowments to people (or of people to endowments, for that matter). Rescher argues that an individual's individuality is intertwined with her endowments, and that it is therefore impossible to imagine endowments being randomly assigned to that individuality. Rescher believes that the idea of a lot in life is a matter of metaphorical confusion.
Only if one takes too figuratively the idea of a lot in life --- by (quite absurdly) thinking of human biographies in terms of a lottery of lifeplan allocations to preexistingly identifiable individuals --- can one conceptualize a person's overall fate or destiny in terms of luck. For only then would the sum of all the goods and evils befalling people become reduced --- comprehensively and automatically --- to a matter of chance allocation. This is obviously unrealistic... There is no antecedent, identity-bereft individual who draws the lot at issue with a particular endowment. One has to be there to be lucky.
The individual is merely fortunate if she possesses great intelligence (or unfortunate if she has poor mental qualities).
The central point in Rescher's argument is the distinction between luck and fortune. According to Rescher, "luck is a matter of having something good or bad happen that lies outside the horizon of effective foreseeability," whereas "you are fortunate if something good happens to or for you in the natural course of things." This distinction may seem clear, but it leaves considerable room for interpretation. What may seem to be in the natural course of things from one perspective may be completely unpredictable from another. Whereas the child considers her intelligence the most natural thing in the world, her parents before her birth could not tell what it was going to be. For them it was luck that she is so intelligent.
Obviously, if one conceptualizes the lottery of life as a ghost drawing a ticket to see what kind of person it will become in this world, then it sounds like a silly idea. However, although we do not draw lots to determine whether we shall be involved in a plane crash, it is still meaningful to conceptualize this event as a chance event. Rescher is mistaken in thinking that drawing a lot is the only way to conceptualize the chance-like distribution of endowments and talents to the people in society. We believe that the lottery of life is a useful metaphor, and we shall employ it in subsequent sections.
Fair Chance --- a principle of distributive justice?
Fortune, that arrant whore,
Ne'er turns the key to the poor.
William Shakespeare ---
King Lear (Act 2, Scene 4)
In this section we shall argue that chance cannot be applied in a divisory use that aims at being fair. The common English expression, a fair chance, may therefore be misleading. Instead of morally endorsing an equal random chance for the distribution of good and evil, the expression actually stands for a `reasonable opportunity.' Randomizing methods are often, instead of a source of fairness, a cause of an unfair distribution.
The importance of the distinction between luck and fortune, with which the last section ended, lies in the moral conclusions that we can draw from each of them. We say that someone is lucky when she happens to have a talent for music; definitely, she is fortunate too. Normally she can enjoy the fruits of her talents and there is nothing unethical about that. In this section we shall ask whether the fact that her endowments were handed to her without her deserving them sets some limit to what she deserves to gain from her talents. Can chance ethically serve as a principle of distributive justice? In more common terms, is it fair to be lucky? Or is this question conceptually mistaken, as a libertarian may argue: "It is simply unfortunate, not unfair, if one cannot afford to pay for health insurance." We shall explore this issue by looking at the concept of a `fair chance.'
"You can say what you will, but you had a fair chance to stay away from the cookie-jar," said the mother to her child when she spanked him for stealing a cookie. The expression `a fair chance' suggests that chance can be a principle of fairness. The child was presented with two alternatives, staying away from the cookie jar or not, and when the child ended up in the second alternative he was punished for it. "We had our chance" is often a reconciliation with the facts that we recognize as deserved. Rescher mistakenly believes this interpretation of the concept of `chance' refers to the use of a randomizing instrument. When he discusses the division of three stakes --- two equal and one much less --- among three people, he makes the following remark:
In failing to equalize the shares it should at least be assured that all concerned should have an equal chance of obtaining a satisfactory lot. Again, "equal opportunity" --- or rather, here, "equality of risk" --- provides a means of quasi-equalization in a situation in which equalization pure and simple is infeasible, or rather, is undesirable.
Does moral luck exist after all? Can a fair distribution of goods be the result of genuine chance, i.e., randomness? Can we speak of a fair chance, i.e., a morally loaded randomness? Despite Rescher's argument and common belief, the answer should be negative. The expression `a fair chance' is semantically confusing, because it suggests a relationship between randomness and merit that does not exist. When we say that the child had a chance not to steal a cookie from the jar, we mean that she had an opportunity, a choice, not to do so. Randomness had nothing to do with it. Similarly, when someone accepts her fate saying that she had her chance, she does not attribute her misfortune to chance, but to her own lack of initiative when there was an opportunity. Opportunity, in contrast to chance, presupposes the ability of active participation on the side of the subject.
It is a misunderstanding of the expression "a fair chance" to employ a random device to determine a fair distribution of goods. Fairness refers to equality of opportunity or of distribution. A random device excludes a priori any opportunity on the side of the players. When a book has more than one author, a random device is sometimes employed to determine the order of the names. This supposedly gives everyone a `fair chance' to be mentioned first on the cover and in possible references. Strictly speaking, however, this is incorrect. The misunderstanding arises from the equivocal use of the concept "chance," namely chance as opportunity and chance as randomness. A randomizing instrument cannot provide equal opportunity.
Rawls' theory of justice is based on the idea of fairness, echoing Cicero's "to be just is to give everybody her own." What is one's own for Rawls is what a person possesses minus what she does not deserve to have. The lottery of life distributes talents and abilities in a way that does not have any moral value, because the distribution, seen at large, happens at random. When the alternatives are unequal, randomness is incapable of fairness. An individual does not have a `fair chance' to receive the endowments she got, because there was no opportunity for her to obtain them. The individual's talents and biological, medical, social and economic heritage have been distributed at random by birth. Randomness, instead of being a source of fairness, is a source of unfairness. In the next section we shall address the question whether a principle of redress should be an integral part of the principles of justice, or whether such a principle, as libertarians claim, violates the inborn right to possess individual property.
We have argued that a lottery is generally incapable of fairness. A lottery in principle creates an unfair distribution, because the principle underlying the distribution is not based on desert. Does this mean that there is no place for a lottery to determine a distribution of goods and evils? Thomas Gataker, a preacher in early seventeenth-century England, wrote a treatise Of the nature and use of lots. In this treatise he discusses the divisory purpose of a lottery, i.e., its use for effecting distributions or allocations of goods or evils. The discussion of the use of lots for the distribution of goods stems from much earlier: Cicero, Augustine and Aquinas all wrote on the topic. Despite their reservations with regard to the employment of lots, they did allow a small, legitimate place for the divisory use of chance. We shall discuss this matter in the chapter on game theory, which is exclusively dedicated to the employment of genuine randomness in a distributive function. Probability does have a legitimate role in the distribution of goods that are not subject to moral considerations.
When confronted with a practical, even moral, problem, a majority rule can have unfair consequences. In the case of a political process, a fifty-one percent majority would always rule over a forty-nine percent minority. A solution could be to organize a lottery after the election in which each party has a chance to win that is proportionate to its electoral ratio. The intuition that this distribution scheme is fair probably relies on the fact that a probability is a relative frequency: in the long run, the fifty-one percent majority will rule fifty-one percent of the time, whereas the forty-nine percent minority will rule forty-nine percent of the time. However, this intuition is flawed. There is no guarantee, for instance, that simply due to the mechanisms of chance a one percent minority will not rule the country for election after election (a worked illustration follows at the end of this subsection). Philosophically, this example should clarify that a chance mechanism itself cannot be the source of fairness. Fairness is rooted in a proper allocation, i.e., in this case, according to the true ratios. There exist deterministic methods that are able to guarantee, for instance, that injustice of the kind described above will not occur. One major drawback of the deterministic solution, although it is morally superior, is its limited practical feasibility. David Heyd correctly observes that in those cases randomization can be a second-best strategy.
There are cases which cannot be decided "rationally" in [a] straightforward way. The failure of reason may occur in the following cases of theoretical impasse: the total absence of reasons, the existence of competing reasons of equal weight, or the incommensurability of the competing reasons. ... The second-order strategy of practical reason in these cases is aimed at satisfying an alternative goal to the standard one of choosing on the balance of first order reasons. The normative goal is often the respect for equality of the parties involved in the distributive choice, and when this equality cannot be achieved by dividing the benefit or the burden in question, then granting equal opportunity or chance is a second-best strategy.
Despite its legitimacy, it should be noted from the start that chance is incapable of fairness. Probability is void of moral qualifications.
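To make the earlier claim about proportional lotteries concrete, here is a minimal calculation (the numbers are ours, chosen for illustration). If each term of office is assigned by an independent lottery in which a party with a one percent electoral share wins with probability 0.01, then the probability that this minority rules n consecutive terms is

\[
P(\text{minority rules } n \text{ terms in a row}) \;=\; (0.01)^n,
\]

which is vanishingly small (one in a million already for n = 3) but strictly positive for every finite n. The relative frequency of minority rule converges to one percent only in the infinite long run; no finite sequence of drawings is guaranteed to respect the electoral ratio. The fairness, if any, resides in the allocation rule, not in the chance mechanism itself.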
Principle of fair opportunity

In the beginning of this chapter we recognized an interesting similarity between the traditional images of Fortuna, luck, and Libra, justice. Both are depicted as blindfolded women. The blindfold symbolizes abstraction. A blindfold prevents one from seeing everything in concrete detail, and so presents an abstracted picture. Justice and luck do not abstract from the same features, however. Instead, they abstract from aspects that are essentially contrary. In brief, luck abstracts from justice and justice abstracts from luck. Under otherwise identical circumstances, it is unjust to deny a better qualified Asian-American a job in favor of a white American on the basis of her race. To society at large, individual `selves' have been distributed randomly over the available races. In this section we ask which features can count as due to luck, to which justice should be blind, and under which circumstances justice should be blind to them.
One of the central probabilistic features of a human being is her gender. She could have been a man, and he could have been a woman. The natural lottery of life decided differently. Because gender is not determined on the basis of desert, it should be considered irrelevant from the point of view of justice. Is this true in all cases? When a movie director selects a person for a female role, it does not seem unjust that he will only consider women. On the other hand, if employers only want to hire female secretaries, then this would involve an unjust distribution of jobs: men would be unjustly discriminated against. A secretary's femininity is not essential to her work, and therefore employers should abstract from gender in their choice of a secretary. In another case, in 1995 the national bar chain Hooters was pressed by the Equal Employment Opportunity Commission to hire both men and women. Hooters' corporate strategy had been to hire only female waitresses of certain proportions to attract a certain clientele. Did Hooters indeed violate principles of fair opportunity? No, it did not. The women employed by Hooters were essentially employed because of their gender, just like the movie actress. Enforcing fair opportunity would not do justice, but would simply eliminate Hooters. It is important to keep matters of the head and matters of the heart separated. Although Hooters did not violate justice, it may very well have violated good taste.
In discussing the appropriate application of the principle of fair opportunity, it should be determined whether or not the luck-dependent features are relevant to the distribution criterion. If these features are relevant, then the principle of fair opportunity is not applicable --- e.g., in the selection of a movie actress, of a waitress at Hooters, or of an intelligent student for graduate school.
Taking these examples into account, we shall make a first attempt at a definition of fair opportunity. Fair opportunity is not a form of absolute equality or homogeneity, but a structural attempt to let individual forces settle in a more just fashion. The principle of fair opportunity stipulates that no person should be granted social benefits on the basis of properties distributed to her by the lottery of life, unless these properties are relevant to obtaining these benefits.
Nonetheless, there may be other considerations that overrule or amend the principle of fair opportunity. Fair opportunity is a prima facie rule that has to be weighed in the presence of other rules. An important amendment is what is called the principle of redress. It holds that although a certain criterion of distribution may depend on relevant luck-dependent features, the advantages that are enjoyed should be taxed at least partially in order to compensate those who did not have the opportunity to be recipients of these advantages. Whereas the principle of fair opportunity is shared by most, the principle of redress has encountered much opposition. We shall return to the latter in a moment.
The fair opportunity rule systematically rejects distributions based on unfavorable properties that are due to luck. A source of extensive and ongoing debate is which properties are unlucky and which are merely unfortunate. The outcome of this debate is extremely important for the scope of application of the principle of fair opportunity. The idea is that those properties that are unfortunate do not have to be considered in the application of this principle. The corresponding attitude is characterized by the statement: `It is too bad, but such is life.' Libertarians have pointed out that not all factors beyond a person's control have to be subject to the principle of fair opportunity. The unfortunate circumstances of some may at most inspire some benevolence (charity) on the side of those more fortunate, but these circumstances cannot be called unjust. The libertarian Tristram Engelhardt argues that "where one draws the line between what is unfair and unfortunate will, as a result, have great consequences as to what allocations of health care resources are just or unfair as opposed to desirable or undesirable." From a libertarian point of view, the right of possession is absolute. Situations of inequality may generally be undesirable; however, it is up to the individual whether she wants to compensate for them or not. The egalitarian Rawls draws the line very differently.
Perhaps some will think that the person with greater natural endowments deserves those assets and the superior character that made their development possible. Because he is more worthy in this sense, he deserves the greater advantages that he could achieve with them. This view, however, is surely incorrect. It seems to be one of the fixed points of our considered judgments that no one deserves his place in the distribution of native endowments, any more than one deserves one's initial starting place in society. The assertion that a man deserves the superior character that enables him to make the effort to cultivate his abilities is equally problematic; for his character depends in large part upon fortunate family and social circumstances for which he can claim no credit.
Rawls' application of the principle of fair opportunity stretches over the whole spectrum of abilities, endowments and motivations. His account is an attack on the autonomy of the human being, not very different from skeptical views such as those defended by Williams and Nagel. Whereas motivational properties are partially dependent on social environment and therefore on luck, we disagree with Rawls that they are purely accidental. If motivation is thought to be completely disconnected from personal merit, then the region of morality, including justice, is a no-man's land.
Libertarian and radically egalitarian theories are generally too inflexible to cope with practical circumstances. If a certain distribution is identified as undesirable, it is unsatisfactory from the point of view of justice that its compensation depends on contingent individual initiative. On the other hand, applying the principle of fair opportunity to all possible luck-dependent factors frustrates a sense of justice, or even eliminates morality altogether. An orthodox interpretation of Rawls denies a smoking individual the responsibility for possible health problems. Although smoking is partially dependent on factors beyond the individual's control, generally she is thought to make a free choice. Therefore, a higher insurance premium for smokers is not necessarily in conflict with justice. As Herbert Spencer put it:
`Everybody to count for one, nobody for more than one.' Does this mean that, in respect to whatever is proportioned out, each is to have the same share whatever his character, whatever his conduct? Shall he if passive have as much as if active? Shall he if useless have as much as if useful? Shall he if criminal have as much as if virtuous? If the distribution is to be made without reference to the natures and deeds of the recipients, then it must be shown that a system which equalizes, as far as it can, the treatment of good and bad, will be beneficial. If the distribution is not to be indiscriminate, then the formula disappears. The something distributed must be apportioned otherwise than by equal division. There must be adjustment of amounts of desert.
In order to come to a full interpretation of the principle of fair opportunity, some tough decisions have to be made to determine what can count as unlucky, and thus is in need of some sort of compensation. Properties such as gender, race and intelligence seem essentially beyond a person's responsibility, but religion, nationality, or social environment can be overcome. Other properties such as a smoking or alcohol addiction are to an even larger extent the result of an autonomous choice. Instead of an absolute distinction between autonomy and dependence, there exists a continuum stretching from gender as an undeserved dependence factor to eating an apple for lunch as an autonomous choice.
And, as I am an honest Puck,
If we have unearned luck
Now to 'scape the serpent's tongue,
We will make amends ere long;
William Shakespeare ---
A Midsummer Night's Dream (Act 5, Scene 1)
The natural and social lotteries have privileged people in unequal ways beyond their responsibility. It seems fair that those who are treated poorly on the basis of luck alone should receive some form of compensation. This is the principle that undeserved inequalities call for redress. The principle of redress is an active application, or a reinterpretation, of the principle of equal opportunity. It is based on the idea that one deserves compensation for damage incurred beyond one's own responsibility. The damage could be physical or economic, but also immaterial. In the social and natural lottery, the damage that some players incur is the loss of opportunity. They miss the possibility to receive, or the capability to achieve, those things that others happened to get without associated merit.
When the principle of fair opportunity is reinterpreted in the light of the principle of redress, it specifies that those who have been disadvantaged are to be compensated to the extent that their disadvantage is beyond their responsibility, and that no person should be granted social benefits solely on the basis of properties distributed to her by the lottery of life. For the past two decades the principle of redress has been applied in job-application and school-admission processes in Western democracies, in the form of affirmative action. In social-economic terms the principle of redress has entered the political and social discourse as the right to a decent minimum.
Rescher believes that there is a legitimate place for some form of redress for the undeserved negativities as a result of the unpredictable ways of fortune. "Certain particular sorts of bad luck no doubt can and should be compensated for by a sufficiently affluent society:
- loss by `acts of God': earthquakes, storms, floods, natural disasters
- loss due to unavoidable occasional malfunctions in the operation of a public infrastructure resource (the vehicular traffic system, the air transportation system)
- loss by warfare, civil riot, terrorism, criminality."
However, he is very much concerned with the "reasonable limits" of redress, and he argues that "compensation for the negativities of fate and fortune can go only so far, ... limits being set by both considerations of practicability and considerations of desirability. Social policy considerations of the usual kind are at work here. And the operative guiding principles should be: (1) that the person involved sustains an individually unforeseeable though statistically predictable loss, (2) that such loss is sustained in a way in which the victim bears no substantial personal responsibility, in that the issue lies outside the range of her effective control, and (3) that considerations of significant public advantage are involved." Rescher tries to capture the intuition that we should not all be treated completely equally: the intelligent should receive a better education than those who are not.
In recent years the opposition to affirmative action and to the right to a decent minimum as principles of redress has grown significantly. Despite its intuitive appeal, the principle of redress does conflict with some other firmly held beliefs. If the undeserved properties are relevant to the benefits received, do we believe that the fortunate owner of these potentialities should forgo the use of them? For example, should the fortunate owner of the intellectual capabilities of a privileged class forgo a university education to the advantage of a socially disadvantaged person with lesser ability? Does this not violate the idea of merit?
The answer to these questions has surprisingly little to do with the idea of merit, and much more with the idea of probability as a relative frequency with respect to a homogeneous reference class and, eventually, with responsibility. Rawls and Nagel, in their respective skeptical moods, have argued that in principle nothing can reasonably be said to be a matter of desert. Moral sensitivity, self-discipline and effort are to an extent dependent upon the environment in which one grew up or the plumbing that came with birth. Williams marks these factors as circumstantial and constitutional luck. In a move similar to our rejection of moral luck, we discount these morally skeptical considerations as the result of a determinist world-view.
Instead we postulate the existence of human agency and, consequently, of personal responsibility as we defined it in the first section of this chapter. That does not mean that everything is within the control of the acting subject, but her actions influence the probabilities over the range of outcomes. That means that with respect to a certain outcome, O, there exists a homogeneous subdivision of the full reference class into feasible actions, A1, A2, ..., An, which affect the probabilities of O. The feasibility of the actions A1, A2, ..., An removes any basis for affirmative action. Any person who wants to accomplish O is able to choose among the n feasible actions to attempt to achieve the desired outcome.
For example, the prerequisite of any consideration of affirmative action should be the existence of a homogeneous subdivision of the reference class on the basis of properties beyond the subject's control (for instance, race or gender), say R1, R2, ..., Rn, which affect the probabilities of the desired outcome, O. Let us assume that the desired outcome is admission to a certain university, so that P(O|Ri) is the fraction or quota of reference class Ri that will be admitted. On the basis of our account of distributive justice as fair opportunity, it may seem that variation among the values P(O|R1), ..., P(O|Rn) is a sufficient condition for affirmative action. However, if the subdivision R1, R2, ..., Rn consists of, e.g., n hierarchical levels of intelligence, then certainly P(O|R1) > P(O|R2) > ... > P(O|Rn). It seems dubious, nonetheless, that the principle of redress should be applied here to guarantee that P(O|R1) = P(O|R2) = ... = P(O|Rn), i.e., that the admission quota for all levels of intelligence be equal.
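To make the quota comparison concrete, the following is a minimal illustrative sketch in Python (no part of the original argument); the reference classes and admission counts are purely hypothetical.

    # Hypothetical admission records per reference class R_i: (admitted, applied).
    records = {
        "R1": (300, 1000),  # e.g., highest level of intelligence
        "R2": (200, 1000),
        "R3": (100, 1000),  # e.g., lowest level of intelligence
    }

    # P(O|R_i): the fraction (quota) of each reference class that is admitted.
    quotas = {r: admitted / applied for r, (admitted, applied) in records.items()}
    for r, q in sorted(quotas.items()):
        print(f"P(O|{r}) = {q:.2f}")

    # Variation among these quotas alone does not warrant redress: when the
    # subdivision is relevant to the outcome (as intelligence is to admission),
    # equalizing P(O|R1) = P(O|R2) = P(O|R3) would be dubious, as argued above.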
The reason that this kind of discrepancy shows up in our analysis is the fact that we have not discussed another prima facie principle of justice, one that compensates for the radical outgrowths of the principle of fair opportunity. This is the principle of self-realization. Fair opportunity avoids a situation in which people are rewarded and punished for things that they did not deserve. On the other hand, even in her very constitution a human being is subject to luck, and it is in this very constitution that each person is aware of herself. This God-given potentiality deserves (in some sense of the word) to be explored and developed.
A further specification of affirmative action is needed. The homogeneous subdivision of the reference class should not only be beyond the subject's control, but it should also be conceptually irrelevant to the desired outcome as such. It is our claim that if there exists such a subdivision, say R1, R2, ..., Rn, then there is a basis for affirmative action in order to attempt to balance the quotas P(O|R1), ..., P(O|Rn). Affirmative action redresses the undeserved and irrelevant properties of a certain group of people to ascertain that they receive equal opportunities to achieve the same outcome.
Unfortunately, this relatively simple criterion of affirmative action does not completely alleviate the tension with the principle of self-realization. In fact, it is certain that the two will conflict with each other on some level. Besides the irrelevant subdivision R1, R2, ..., Rn, there may exist another subdivision, Q1, Q2, ..., Qm, that is relevant to the desired outcome O, i.e., P(O|Q1) > P(O|Q2) > ... > P(O|Qm), and that explains some of the differences among the quotas with respect to the Ris, i.e., P(O|R1) > P(O|R2) > ... > P(O|Rn). Whenever redress is applied according to R1, R2, ..., Rn, the people in the upper levels of QiRj, for instance Q1R1, will undoubtedly feel that they cannot fully realize themselves, whereas they do possess the characteristic Q1 central to the project to be accomplished. In everyday terms, for instance, although it is true that ethnicity should be an irrelevant factor with respect to academic performance, the white segment of the population has a clear advantage over any other racial group. Applying a redress principle fulfills the demands of justice. Nonetheless, it creates moral tension, because those people who are white --- irrelevant --- and intelligent --- relevant --- would now lose their place to a person with a qualitatively lower relevant characteristic. Redressing one mishap of justice cannot but violate some other principle of justice. It is up to the actual ordering of prima facie principles in each specific case to decide which of the two should be given priority.
We shall mention one important prima facie principle in favor of the principle of redress again at the end of this section. It is called the right to a decent minimum, which calls for at least minimal access to certain basic goods, notwithstanding any kind of distribution of `relevant' contingencies Q1, ..., Qm.
If some places were not open on a basis fair to all, those kept out would be right in feeling unjustly treated even though they benefited from the greater efforts of those who were allowed to hold them. They would be justified in their complaint not only because they were excluded from certain external rewards of office such as wealth and privilege, but because they were debarred from experiencing the realization of self which comes from a skillful and devoted exercise of social duties. They would be deprived of one of the main forms of human good.
For instance, the education of a mentally weak person is expensive, and society at large, and maybe even the person herself, could gain by her forfeiting her right to a minimum of education. Could such a distribution of advantages be just? The answer is that it could not. If a distribution is not in accordance with a proper abstraction of irrelevant properties --- the mentally retarded do have the ability for minimal education --- then the distribution cannot be called just. The reason is that if selection criteria are irrelevant, they ultimately conflict with what Beauchamp and Childress call "the right to a decent minimum" or what Rawls calls "experiencing the realization of self."
Chapter 3. Gaming and Deciding
He no play-a da game.
He no make-a da rules!
Can the scientist teach us the rules of how to deal justly with situations involving uncertainty? Can the game-theorist give us a moral analysis of the game of life? Can the statistician tell us what risk we should manage? No. They no play-a da game, so they no make-a da rules. In this chapter we shall argue that formal theories are morally neutral in their dealings with uncertainty. We agree with Binmore that game theory is a set of tautologies, and that the results of the theory are never good or evil, but adequate or inadequate --- like those of any axiomatic theory.
The optimality considerations that those theories describe can be of use to society in cases in which there is no moral relevance. A donkey caught between two equally attractive bundles of hay could starve to death because it has no motivation to prefer one bundle over the other, and only motivation can bring it to act. Human beings have access to random devices that provide a choice precisely in situations that are causally under-determined. In particular, the divisory property of the lottery can be employed in optimal distribution considerations and ideas of efficiency.
Game theory, decision theory and statistics are important tools for the moral philosopher. They can help the philosopher to make informed choices. We shall study two cases in which formal theory analyzes uncertain situations. Subsequent application of moral principles, together with considerations of efficiency, favors a certain course of action. First we shall argue that under the threat of multiple risks it is best to reduce the largest risk first, rather than a less significant one, because probabilistic considerations prove that such a strategy is superior. Secondly, statistical hypothesis testing is employed to investigate the possible dangers of certain substances. The uncertainty involved puts the statistician at risk of either condemning a safe substance or letting a toxic substance off the hook. The moral question before us is: on what side do we prefer to err, on the side of caution or on the side of good science? We shall argue that --- contrary to the current official policy of the EPA --- caution deserves a place in the decision process.
One of the central ideas of the previous chapter is that luck has no moral input. On the contrary, one needs to compensate for undeserved disadvantages that are the result of bad luck, because what has no moral significance, e.g., luck, should not be used as an ordering principle for morality or justice. We thus excluded chance as a social tool for the moral architecture of society. Many modern accounts of contract theories inspired by game theory aspire to do just this. A core idea of contractarian ethics is that substantial moral principles can be considered the outcome of a morally neutral social bargain. We deny this. The argument is subtle: on the one hand, we do want to employ chance in general, and game-theoretic considerations in particular, as social tools for certain parts of the architecture of society; on the other hand, it should be noticed that the result of this employment will not be a more just society, only a more prosperous one. In this chapter we shall prepare our method of approach for the next chapter, in which we shall analyze the issue of risk and safety by game- and decision-theoretic means.
First, we wish to explore and unnerve the premise of contractarian ethics: that rational agency is sufficient for a rationally defended morality. Since Hobbes, justice has often been interpreted in contractual terms, on the basis of some minimal rationality assumptions. We shall argue that although there is a rational way to choose morality, there is no rational requirement to choose morality. The contractarian enterprise has shown a vast array of opinions on this matter. Before we reach our conclusion we shall take a slightly longer route, and show in a brief synopsis how four contractarian enterprises have dealt with the matter of morality.
In the Hobbesian state of nature everyone finds oneself confronted with a generalized prisoner's dilemma. There are no laws that restrain the actions and rights of others, and therefore each aims to defend his own interest as much as he can. However, "as long as this naturall Right of every man to every thing endureth, there can be no security to any man ... of living out the time, which Nature ordinarily alloweth men to live." When all become aware of the possibility of the mutual advantage of a pact, they will agree upon the several "Articles of Peace" that regulate the interaction between everyone. Until this moment, only "Reason," in its configuration of self-interest, dictates to the agent what to do. When the pact is proposed, it is congruent with one's selfish interests to agree to lay down some of one's rights towards others. However, as the Foole objects, the same selfish rationality that leads me to agree to this commitment will make me break it whenever it is in my self-interest to do so. In other words, making a commitment may be profitable, but compliance with this commitment is a totally different matter. Hobbes' solution is a non-moral, exogenous one: a sovereign with the full power of punishment makes the people comply with their commitments. This does not change a person's preference for breaking her commitment over compliance; rather, the fear of punishment replaces this preference with a preference for complying. For Hobbes, there is never truly moral significance to the issue of compliance.
This is essentially different in Gauthier's case. Gauthier's analysis of the initial situation shares important similarities with Hobbes', but it is Gauthier's aim to show that the outcome of the initial bargain is a moral principle, as opposed to an external constraint. Gauthier can be read "as a single piece of argument, leading to a breathtaking conclusion: that morality can be derived from rationality." Gauthier does not start with a war of all against all, but with a perfectly competitive market. In this perfect market, free competition automatically guarantees an `optimal' outcome for all people in the society, according to Gauthier. Thus, such a market constitutes a so-called moral free zone, because there is no need for morality to compensate for, or regulate, any mishaps in the distribution of goods. "Were the world such a market, morals would be unnecessary." However, the troubles start when the market is not perfect and is invaded by externalities. It is in this market with externalities that the bargaining game begins, and it looks like Hobbes' generalized prisoner's dilemma: a war of all against all. Gauthier introduces his theory of rational bargaining, which resembles the Hobbesian intention of the people in the state of nature to come to some Articles of Peace.
According to some form of bargaining the Gauthierian subjects reach a certain `optimal' agreement. However, the Foole's argument endangers this agreement in the same fashion as it did the Hobbesian agreement. Whenever it is to the advantage of a certain subject to break the agreement, straightforward utility maximization calls for breaking the commitment. If this were the only consideration, then "Covenants are in vain and but Empty words," and the prospects for rational cooperation would seem bleak. Gauthier's answer to this is not external constraint, but the necessity of a moral transformation of the bargaining subject. Gauthier's argument is that people's dispositions are not opaque to one another. Moreover, without the disposition to comply the concept of commitment is empty. As opposed to Hobbes' solution of an external constraint, "here a commitment to justice serves as an internal constraint on her choices, affecting not her beliefs about the possibilities open to her, but her way of evaluating those possibilities." According to Gauthier the idea of straightforward maximization is inconsistent when it comes to bargaining. For this reason he introduces the idea of a necessary, moral transformation of the bargaining subject, which he calls the idea of constrained maximization.
Gauthier's game theoretic analysis of the social contract is unorthodox in the eyes of many game theorists. Especially his bargaining theory and the idea of constrained maximization have been considered eccentric. An orthodox game theoretic opposition to his ideas comes from Ken Binmore, who rejects moral imports in the game theoretic tool-kit. Binmore does not wish to deny the existence of morality, but he refuses to consider it as separate from `economic considerations.' Binmore explicitly discusses this issue by means of the distinction between a homo ethicus and a homo economicus. The former is that aspect of a human being that makes moral choices, whereas the latter is that part that explicitly maximizes her own utility.
However, if homo ethicus is to be characterized in terms of a capacity to feel sympathy and to make commitments, then there is no need to see him as a separate species from homo economicus... Nothing says that homo economicus cannot be passionate about the welfare of others... Nor does anything disbar homo economicus from making some types of commitment. What kinds of commitment he can make depends on what commitment mechanisms are available to him.
Game theory is a set of tautologies, according to Binmore. Any kind of behavior, moral, immoral or amoral, is consistent with game theory, in the same way that it is consistent with the fact that 2+2 = 4. That `2+2 = 4' is true and that `2+2 = 3' is false are matters of internal mathematical considerations. Similarly, the conclusions of game theory are tautologically true, and do not affect anything in the real world (because they always hold in the real world, given that the assumptions are fulfilled), let alone any moral principle. Binmore does not necessarily object to Gauthier's idea of constrained maximization, but he denies that it constitutes a new principle within game theory, or that it logically follows from game-theoretical considerations. The only way that such a constraint can be modeled is as a move within the game. For Binmore, game theory does not teach us anything about ethics. The best the game theorist can do is to model people's moral considerations properly.
Binmore leaves us with a form of game theory stripped of its normative punch. It may be an important tool in the tool-kit of the moral philosopher; however, it does not have moral implications. Nevertheless, we believe that Gauthier does not have to be upset --- although he would do better not to interpret constrained maximization game theoretically. Instead he should move closer to Rawls' interpretation of morality in terms of a contract, and to Hobbes' interpretation of a model for morality in terms of external constraints. As Binmore suggests, Gauthier's ideas could be modeled in game theory by means of external constraints affecting the payoffs. These constraints could --- besides punishments or monetary rewards --- certainly be moral considerations, or moral transformations as Gauthier calls them. To explain where these moral transformations arise from, it would do him well to move to a Rawlsian interpretation of the social contract.
Rawls' contractarian theory of justice has been misinterpreted from its first appearance. Wolff's influential commentary elucidated the Rawlsian model as deriving substantial moral principles from minimalist assumptions about rationality.
Rawls had an idea. It was ... one of the loveliest ideas in the history of social and political theory. ... Rawls proposed to construct a formal model of a society of rationally self-interested individuals, whom he would imagine to be engaged in what the modern theory of rational choice calls a bargaining game. His intuition was that if ... he posited a group of individuals whose nature and motives were those usually assumed in contract theory --- then with a single additional quasi-formal, substantively empty constraint, he could prove, as a formal theorem in the theory of rational choice, that the solution to the bargaining game was a moral principle having the [desired] characteristics.
This interpretation of Rawls is close to what Gauthier attempts to do in his Morals by Agreement, and it is no wonder that he admits to being charmed by it.
However, the true Rawls has explicitly denied any game-theoretic interpretation of his theory, and bargaining has no place in his idea of justice. The free and rational subjects that Rawls assumes in the original position are not thought of as bargaining for their own best possible share. These hypothetical subjects are assumed to act on their sense of justice and their conception of the good, on which they agree behind a veil of ignorance about the role that they will fulfill in the future. They do not bargain for goods, but agree on principles of justice. Because the Rawlsian contract is a moral contract, there is no need to give a meta-theory of compliance. The contract determines the character of the self-motivated moral personality.
We conclude this section with the observation that the bargaining game is essentially non-moral. Game theory can help to locate optimal choices, given the set of values and morals of those bargaining. The choice for morality is essentially prior to any bargaining and is a self-motivated choice.
From distributive to allocative justice
In this section we shall study the relationship between actual distributions and distributive justice. It was the conclusion of the previous section that considerations of justice are only marginally related to numerical distributions. The cornerstone of distributive justice is the principle of equal opportunity. Equal opportunity signifies an equal possibility for every human being to take personal responsibility in accomplishing or obtaining a certain thing. Equal opportunity is distinct from equal probability in the sense that the latter does not explicitly acknowledge the role of individual responsibility as the locus of merit. Therefore, equal probability is incapable of producing any form of distributive justice.
In the last three decades game theory has come to be applied to philosophical considerations of distributive justice. Attempts have generally been made by analytically oriented philosophers with an affinity for, although not necessarily support of, utilitarianism. Numerical distributions have been considered and different optimality criteria have been compared. One of the most influential theories of justice has been proposed by John Rawls.
Members of society are asked to envisage the social contract to which they would agree if their current roles in society were concealed from them behind a veil of ignorance. From the point of view of those in such an informational state, the distribution of advantage in the planned society would be determined as though by a lottery. The idea is that a social contract agreed in such an original position would be "fair", the intuition being that "devil take the hindmost" will not be an attractive principle if you yourself might end up with the lottery ticket that assigns you to the position at the rear.
Rawls proposed that the numerical equivalent of this philosophical attitude is the maximin criterion, that is, to maximize the welfare of those who are worst-off in society.
Table 9. `Counter-example' Rawls' maximin
    Distribution I              Distribution II
    share          units        share          units
    Andrea         62           Andrea         70
    Brenda         61           Brenda         70
    Cheryl         60           Cheryl         59
Sen formulated a famous objection against Rawls' maximin criterion in the form of a `counter-example' and an appeal to common sense (cf. Table 9). If maximizing the welfare of the worst-off is taken as the principle of justice, then Distribution I is more just. However, as Sen contends, Distribution II is "obviously" better. There are utilitarian reasons to justify this claim. The total utility of the second distribution is larger than that of the first. Moreover, the second distribution possesses the quality of having "the greatest good of the greatest number," the classic formulation of the utilitarian principle of justice. However, if A(ndrea) and B(renda) are the two most sought-after criminals of the country, then it becomes dubious whether common sense still wants to qualify Distribution II as more just. Utility considerations do not take into account what an individual deserves, whereas desert is an important element of justice. Rescher concludes that "the `principle of utility' cannot be a serious candidate for a principle of distribution when its formulation does not take account of the desert ... of the individuals involved."
Nevertheless, considerations of justice are related to utility deliberations: it is unjust to satisfy non-existent needs in an economy of scarcity, much in the same way that inefficiency is unjust in wasting the goods themselves. In themselves, inefficiency and negligence of utility are not unjust. The additional premise of scarcity adds coherence and interrelatedness to distributive justice. Each share stands in relation to every other, and thus efficiency and attention to utility become involved in justice. Utility is an auxiliary variable and never the ground of justice. A distribution is in itself neither just nor unjust. Rawls correctly notices that
a distribution cannot be judged in isolation from the system of which it is the outcome or from what individuals have done in good faith in the light of established expectations. If it is asked in the abstract whether one distribution of a given stock of things to definite individuals with known desires and preferences is better than another, then there is simply no answer to this question.
It is therefore a misconception that distributions are intrinsically just or unjust. The justice of a distribution is grounded in fairness and desert, not in the numerical values of that distribution.
Most distributions of goods and evils are not subject to considerations of distributive justice. That the winner of the tennis match deserves a prize is not a matter of justice, as Rescher mistakenly assumes. She deserves a prize because the rules of the game specify so. If it were specified that after the tennis match the prize is to be awarded according to the outcome of a coin toss (e.g., the loser gets the prize if heads shows up), then the loser of the tennis match can legitimately claim the prize if she wins the coin toss. This rule, too, is not a matter of justice --- at least not in the traditionally narrow sense in which we take this word. It is well known that Aristotle defined a wider sense of justice, namely those activities that are in correspondence with laws and rules. We have excluded those situations from our previous investigations because laws and rules are not necessarily based on any principle of justice.
However, that does not mean that there are no normative considerations underlying those practices that are void of justice. In a casino there are games with higher and lower expected gains. A rational person will probably choose to play a game with the highest expected gain. Similarly, in the game of life there are better and worse means of arriving at given ends. In this chapter we shall deal with instances of hypothetical reasoning that are subject to chance and uncertainty. In means-end reasoning the aims are predetermined and the task is to find the most efficient means towards those ends.
We shall consider cases in which considerations of distributive justice do not play a role. Instead of distributive justice we shall speak of allocative justice, a term that we borrow from Rawls. "By contrast the allocative conception of justice seems naturally to apply when a given collection of goods is to be divided among definite individuals with known preferences and needs. The goods to be allotted are not produced by these individuals, nor do these individuals stand in any existing cooperative relations. Since there are no prior claims on the things to be distributed, it is natural to share them according to desires and needs, or even to maximize the net balance of satisfaction. Justice becomes a kind of efficiency, unless equality is preferred." To emphasize the difference between allocative justice and distributive justice we shall consider two possible distributive shares for Andrea, Brenda and Cheryl, such as indicated in Figure 5.
Figure 5. Fairness and optimality
Although the first distribution does represent an equal distribution among the three recipients, the situation is insufficiently specified to decide whether, from the point of view of distributive justice, this distribution is deserved or desired. If Brenda and Cheryl are two war criminals, then Distribution II has clear advantages. Or, if the distribution represents the outcome of a bingo evening at the local church, then all participants may prefer the second distribution over the first. In both instances the equal distribution of shares seems unwanted. Distribution II even maximizes the share of the worst-off.
In a situation in which Andrea, Brenda and Cheryl perform the same job and are asked to choose a reward system, they may decide to choose the second distribution and assign themselves the shares A, B and C according to the outcome of a randomizing device. Assigning everyone an equal chance optimizes the expected utility of two of the three persons; moreover, it provides all players an equal amount of expected utility. This is a form of quasi-equalization.
Before we move on to embrace this form of quasi-equalization and optimization, we want to stress that the equality of chance does not reintroduce fairness or justice into the picture, nor should it be confused with the "equality of opportunity." Everybody has roughly an equal chance of being killed in a car accident. However, after the lottery has distributed its prizes, most people are alive and a fourteen-year-old girl has died under the wheels of a truck. Calling this distribution fair or just is a category mistake. In matters of importance a lottery proves unsatisfactory. We do not want our spouse or our career to be decided by the flip of a coin. In the previous chapter we explained at length that the inequalities resulting from the lottery of life need to be compensated. Random quasi-equalization is a "counsel of despair" in the distribution of what is considered a primary good, and should only be used in less pivotal situations according to what Gataker calls the Rule of Caution. "Concerning the matter of business wherein Lots may lawfully be used, the rule of Caution in general is this, that Lots are to be used in things indifferent onely."
Thomas Gataker lived at the eve of the birth of probability theory. During his lifetime he published a then-influential treatise, Of the Nature and Use of Lots, on the philosophical character and practical use of lotteries. Although cautiously, he argued that lotteries could be used for the distribution of certain goods and evils. He identified two circumstances: either the to-be-distributed commodity is of no essential importance to any of the recipients, or the recipients are in all relevant respects of equal desert and unequal commodities are to be distributed. Gataker analyzed a hypothetical circumstance in which the Church and its ministers are persecuted. He argued that chance should decide which of the ministers should stay at their posts and which should flee. Lotteries play a similar role in modern game theory. Because they do not have any moral import, lotteries cannot be used to decide upon matters of justice. However, when the issue is of no essential importance --- for instance, when two persons find themselves at odds over whether to attend a party --- game theorists introduce "the idea of a choice as a lottery with possible actions as prizes."
Let us consider the following example. Tina is interested in Eric, but Eric is really not interested in Tina. There is a party, and Tina wants to go if Eric goes; Eric, however, does not want to go when Tina goes. If she does not go and he is there, then she would feel very bad. If she does not go and he does not go either, then she is indifferent. If she goes and he is not there, then she feels uncomfortable. On the other hand, if we put ourselves in Eric's shoes: he would feel bad if he and Tina were at the party together. He would be indifferent if she went and he stayed at home. He would feel slightly bad if he did not go to the party when she was not there either.
We permit ourselves some initial liberties with regard to translation of the utilities described in this little story. We represent them as having a meaning beyond their mere ordinality and assume that they are interpersonal. The utilities presented in this story are listed in Table 10.
Table 10. `To go or not to go?' That's the question.
                                          Eric
                           comes to the party    does not come
    Tina  comes to the party    (1, 0)             (1/4, 1/2)
          does not come         (0, 1)             (1/2, 1/4)
The fundamental assumption underlying rational choice within game theory is that agents always act so as to maximize their --- possibly enlightened --- self-interest. This definition does not categorically exclude sympathy or altruism. We can take an interest in other people's interests. Whether this interest is guided by principles, care for others, reason or duty is not important. It reveals the interests that the agent possesses at that moment. In this little story the preferences of the two players are given. For each player there are two pure strategies: to go or not to go to the party. If Tina goes to the party, then Eric will not go; but in that case Tina would rather not go either; however, then Eric would like to be at the party, which in turn will have Tina change her mind again, and she would want to come, which... etc.
We are all players in the game of life with divergent aims and aspirations that make conflict inevitable. In a healthy society, a balance between these differing aims and aspirations is achieved so that the benefits of cooperation are not entirely lost in internecine strife. Game theorists call such a balance, an equilibrium.
The pure strategies that Tina and Eric may decide upon are out of balance. In social interaction it often happens that, given the choice of someone else, we would be better off doing X instead of Y. However, if we were to do X, then the other person's choice would no longer maximize her utility, and she would choose some other action, which for us makes Y the utility-maximizing response. Unfortunately, given Y, the other actor would prefer her initial choice, etc. With pure strategies (do X, or do not do X) it can happen that choice is left pending, and our deliberation moves around in cycles. The solution to this problem thus has to fulfill two requirements. The first requirement is that the strategy optimizes one's own preference given the optimal response of the other player. Secondly, the optimizing strategy should be in balance, i.e., in equilibrium.
Mixed strategies, i.e., strategies in which the agent's action depends on the outcome of a coin toss or a roll of a die, always have an equilibrium. "A lottery over possible actions is termed a strategy... Each set of strategies, one for each interacting person, determines a lottery over outcomes. We define an expected outcome as such a lottery, as the product of the lotteries or strategies chosen by each person... An expected outcome is in equilibrium... if and only if it is the product of strategies, each of which maximizes the expected utility of the persons choosing it given the strategies chosen by other persons." In the case of Tina and Eric this is easy to imagine. Consider the mixed strategy in which Tina goes to the party with probability p, and Eric comes to the party with probability q. The expected utilities for both players become:
E[U_Tina] = p (1.25 q - 0.25) - 0.5 q + 0.5
E[U_Eric] = q (-1.25 p + 0.75) + 0.25 p + 0.25
Figure 6. Mixed strategies always have an equilibrium
Now Eric and Tina have to choose a q and a p, respectively, that maximize their expected utilities and bring the outcomes into equilibrium. Note that if q < 1/5, then Tina would choose p = 0. However, if Tina does not go to the party (p = 0), then Eric will go to the party (q = 1), and Eric and Tina would be caught up in an infinite switching of strategies. If q > 1/5, then Tina would choose p = 1, which would have Eric set his q = 0; thus this solution is unstable as well. The only solution for q that is in equilibrium is q = 1/5. Similarly, the single equilibrium solution for p is p = 3/5. The equilibrium solution is a value for the lottery that does not predetermine the outcome of the other person's lottery. When faced with this problem and these utilities, Tina and Eric could rationally agree on applying a randomization instrument to determine whether they will attend the party.
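The equilibrium values can be checked numerically. The following minimal sketch in Python (an illustration, not part of the argument) encodes the two expected-utility formulas given above and verifies that at p = 3/5 and q = 1/5 each player is indifferent between her pure strategies, so that neither has a reason to switch.

    def u_tina(p, q):
        # E[U_Tina] = p(1.25q - 0.25) - 0.5q + 0.5, as derived above.
        return p * (1.25 * q - 0.25) - 0.5 * q + 0.5

    def u_eric(p, q):
        # E[U_Eric] = q(-1.25p + 0.75) + 0.25p + 0.25, as derived above.
        return q * (-1.25 * p + 0.75) + 0.25 * p + 0.25

    p_eq, q_eq = 3 / 5, 1 / 5

    # At equilibrium each player is indifferent: changing one's own mixing
    # probability leaves one's own expected utility unchanged.
    assert abs(u_tina(0.0, q_eq) - u_tina(1.0, q_eq)) < 1e-12
    assert abs(u_eric(p_eq, 0.0) - u_eric(p_eq, 1.0)) < 1e-12

    print(u_tina(p_eq, q_eq), u_eric(p_eq, q_eq))  # both equal 0.4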
There is, however, a problem of compliance. Although this solution is in equilibrium before the coin flip, there is no guarantee that a utility maximizer will stick to the agreement. If the coin decides that Tina and Eric both go to the party, then what makes Eric stick to his agreement? For Eric, a utility maximizer, it is not rational to stick to the agreement, because he could improve his utility by defecting and not going to the party. However, if Tina realizes in turn that Eric will not stick to his agreement, she will change her action, and they are caught up again in an infinite regress.
The problem is that as soon as the lottery is performed and the outcomes are known, nothing in the view of a selfish rationalist could prevent him from adjusting his strategy accordingly. To dissolve such a regressus ad infinitum, the outcome should be known only to the agent, and information about the outcome should in principle be shielded from the other party. The necessity of privacy is contained in a fundamental property of probability: a probability dissolves in view of the actual outcome. If probability is the answer to solving the problem, agreed upon by both parties, then the agreement can hold only as long as the probability remains a probability. There are two ways in which a probability remains a probability: on the one hand, there can be a fundamental time-ontological separation of subject and outcome: the outcome has not crystallized yet because it lies temporally ahead of the subject. On the other hand, there can be an epistemic separation of subject and outcome: as long as the subject has not received any information about the outcome of the lottery, the probability still exists for her.
In this way the issue of compliance can be avoided, because two rational subjects aiming at utility maximization will separately perform their own randomized experiment and determine their actions accordingly. Nonetheless, a new bargaining situation arises if both Eric and Tina arrive at the party (or if Tina arrives and Eric is not there). Then the outcomes of the two lotteries become known to both parties and compliance becomes an issue again. If Eric is a consistent utility maximizer, he will go home again, unless the costs of going home exceed the disutility of Tina's presence. If there are no costs associated with going home, Eric will go home, and subsequently Tina will too. Both persons, being at home, might then consider the question again and perform a new lottery.
New developments in bargaining theory have attempted to overcome this stalemate. The idea is to formalize commitment and to turn commitment into a simple tautology. In Rubinstein's theory of non-cooperative bargaining (1982) there is one additional assumption: a discount factor for delay of agreement is taken into account, to model the loss of utility from deadlock.
The Prisoner's Dilemma
A famous example in the literature of game theory is the Prisoner's Dilemma. It represents a dilemma present in some uncertain choice situations. First we shall portray the classic formulation of the dilemma; then we shall show that the `best' strategies yield a solution that does not seem `optimal' --- in some sense of the word. The Prisoner's Dilemma has played on the imagination of many philosophers and game theorists. Hobbes portrayed a rudimentary form of the Prisoner's Dilemma in his description of the state of nature. It has inspired game theorists to find `solutions' for the dilemma. We shall argue that the Prisoner's Dilemma is a mathematical toy, not a model for human cooperation. To solve a situation that looks like a Prisoner's Dilemma is to change the rules, turning it into another mathematical game. It is our claim that such a solution is actually a dissolution of the problem, and that it does not answer any ethical question.
In the classical form of the Prisoner's Dilemma the two heroes of our story, Tina and Eric, are two notorious criminals who have been arrested for a minor felony. The police want to convict them for their previous crimes, but there is not sufficient evidence. The only people who could provide the evidence are Tina and Eric themselves. The police interrogate them separately. If one cooperates with the police and the other does not, then the former goes free whereas the latter is convicted to up to ten years for all previous crimes. If both confess, then each will be sentenced to five years in prison. If both deny the charges, they are each convicted to only one year for the minor felony. The negative values in Table 11 represent the negative utilities of a prison sentence.
Table 11. Prisoner's Dilemma
                                             Eric
                              cooperates with police    denies the crime
    Tina  cooperates with police    (-5, -5)                (0, -10)
          denies the crime          (-10, 0)                (-1, -1)
Again we presuppose that Tina and Eric are consistent utility maximizers. Assuming that each person attempts to maximize her own benefit, the dilemma becomes apparent. If Eric cooperates with the police, then it is better for Tina to cooperate too (a choice between -5 and -10), and if Eric does not cooperate with the police, then it is still better for Tina to cooperate (a choice between 0 and -1). So no matter what Eric does, it is better for Tina to cooperate with the police. According to the Sure Thing Principle, Tina will confess. The situation is symmetric, so the same holds for Eric. Utility-maximizing considerations thus lead both Eric and Tina to confess, and both get sentenced to jail for five years, whereas if they had both denied the crime each would have received only a one-year prison term.
Apparently a pure strategy does not work in the Prisoner's Dilemma. However, Tina and Eric could follow a probabilistic or mixed strategy. Say, Tina confesses with probability p, and Eric confesses with probability q. The expected utilities of these lotteries are:
E[U_Tina] = p (4 q + 1) - 9 q - 1
E[U_Eric] = q (4 p + 1) - 9 p - 1
Eric and Tina have to choose a q and a p that maximize their expectations and bring the outcome into equilibrium. From these equations, no matter what q is, Tina's optimizing strategy is always to choose p = 1, since the coefficient of p, namely 4q + 1, is positive for every q. Similarly for Eric: no matter what p is, it is always most profitable for him to choose q = 1. So even the mixed strategy results in the case where both confess and go to jail for five years.
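The dominance argument can be checked directly. A minimal sketch in Python (illustration only), using the expected-utility formula for Tina derived above: since 4q + 1 > 0 for every q in [0, 1], confessing with certainty is Tina's best response however Eric mixes.

    def u_tina(p, q):
        # E[U_Tina] = p(4q + 1) - 9q - 1, derived from Table 11 above.
        return p * (4 * q + 1) - 9 * q - 1

    for q in (0.0, 0.25, 0.5, 0.75, 1.0):
        # Raising p always raises Tina's expectation, whatever q may be.
        assert u_tina(1.0, q) > u_tina(0.5, q) > u_tina(0.0, q)

    print("Tina's best response is p = 1 for every q; by symmetry, q = 1 for Eric.")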
The strategy in which both players defect `wins' over any other pure or mixed strategy, because no player will use a strongly dominated strategy. Should this analysis of the Prisoner's Dilemma worry us about the possibility of human cooperation? No, it should not. Binmore is correct in pointing out that Gauthier does not `fix' game theory when he proposes constrained maximization as the method to reach the Pareto improvement, (not confess, not confess), in the Prisoner's Dilemma. Game-theoretic considerations of this pay-off scheme lead to the (defect, defect) solution, much in the same way, to use a technical analogy, that the EM algorithm in NPML estimation often leads to a local and not the global maximum: we may wish it did not, but that is simply an inherent property of the method.
The larger point that we wish to make is that there are no moral conclusions to be derived from a game-theoretic analysis of a certain situation. More specifically, when one is faced with a Prisoner's Dilemma, it does not follow that one ought to defect. Gauthier's constrained maximization is a possible and proper moral analysis of the Prisoner's Dilemma within a Kantian framework. Gauthier argues that the concept of commitment is incoherent if it does not entail a certain level of behavioral compliance, because people's dispositions are translucent to one another. The following argument by Gauthier has a Kantian flavor.
Since our argument is to be applied to ideally rational persons, we may simply add another idealizing assumption, and take persons to be transparent. Each is directly aware of the disposition of his fellows, and so is aware whether he is interacting with straightforward or constrained maximizers. Deception is impossible.
Binmore falsely accuses Gauthier of committing the fallacy of the twins. Binmore is mistaken in taking Gauthier's argument to be a game-theoretic analysis: it is a moral analysis. This moral analysis can be described game-theoretically by a different game, although the moral situation is still the same as before.
Despite his detached analysis of the Prisoner's Dilemma, Binmore falsely believes that the mathematics does suggest a morally superior solution. His description of the moral choice in the Prisoner's Dilemma analyzes what the game would numerically translate into if one employs a Rawlsian/Kantian or a utilitarian framework. For a Kantian, only those strategies in which the agents involved perform the same action are relevant, whereas for a utilitarian the total combined utility of a strategy has relevance for each individual. For the utilitarian the utilities (the utils) are assumed to be interpersonal. Table 12 is adapted from Binmore.
Binmore's suggestive names for the second and third games indicate his belief that these are dead-end strategies. Binmore's naturalism confuses him about the boundaries between evolutionary considerations and morality. Mere evolutionary superiority, however, cannot prove the moral superiority of a certain interpretation.
Table 12. Binmore and the Prisoner's Dilemma*
    Prisoner's Dilemma
                      not confess    confess
    not confess        (2, 2)        (0, 3)
    confess            (3, 0)        (1, 1)

    Rawlsian/Kantian Dodo
                      not confess    confess
    not confess        (2, 2)        (0, 0)
    confess            (0, 0)        (1, 1)

    Utilitarian Dodo
                      not confess    confess
    not confess        (4, 4)        (3, 3)
    confess            (3, 3)        (2, 2)

(Each cell lists the row player's pay-off first.)
In this section we have shown that game theory is a helpful tool, but one that should always be subject to exogenous moral considerations. A similar idea appears in the next section, in which we shall study one of the economists' holy cows: the expected value. Many generations of economists have employed the opportunity that expected utility maximization offers to reduce all the uncertainty considerations of a certain probabilistic system to the computation of a single number. We shall explore some of the computational aspects, conceptual confusions and moral implications of the concept of mathematical expectation.
Expectation, a guide to action?
A bird in the hand is worth two in the bush.
A mathematical concept that has lived a life of its own outside mathematics is mathematical expectation. In its marriage with game theory it produced a child named "the principle of expected utility maximization," which has sometimes been considered the "principle of rational bargaining" with a "context-free universality of applications." The aim of this section is first to clarify the concept of expectation and, as a natural consequence, to question the universality of this principle. An expectation is much like the two birds in the bush, and we shall argue that it is sometimes rational to settle for the one in the hand.
From its early beginnings, the attention of probability theory has never been restricted to probabilities alone. On the contrary, it took a while before the probability concept itself was developed. People like Huygens, Fermat and Pascal considered probabilities only in a derivative sense. The primary notion for all of them was some kind of `fair exchange,' or average pay-off in modern terms. Uncertainty was not studied for its own sake, but to facilitate and systematize decision making. The problem of decision making under uncertainty is the impossibility of ranking the alternatives hierarchically according to utility level. An unlikely catastrophe may be preferred over a certain injury, although one would prefer the injury over the catastrophe under certainty.
Pascal's wagerer had to weigh the two alternatives open to him: either to believe in God, with the risk that he does not exist, or not to believe in God, risking eternal condemnation if he does. In the history of ideas, Pascal's wager-argument has been reconstructed in several ways. The reconstructions are as instructive as what Pascal himself had to say. Bernard Williams reconstructs the wager-argument as a qualitative and selfish choice. Let U be the wagerer's utility function, and let the two alternatives --- God exists, or God does not exist --- be indexed by j = 1, 2.
The pagan wagerer will be condemned to eternal hell if God exists, whereas his religious twin will only spoil his Sundays in church if God does not exist. Therefore, min_j U(non-believing) is much worse than min_j U(believing).
On the other hand, if God does not exist, the pagan wagerer will not waste his Sundays and can get to his dinner a bit quicker without having to pray. However, the utility difference between a pagan and a religious wagerer in that case is not very large. Therefore, max_j U(non-believing) is not much better than min_j U(believing). Consequently, if we behave with caution and solely in our self-interest, then we would prefer believing over not believing.
In the modern, game-theoretical translation of the problem, the wagerer is thought to assign subjective probabilities to the two alternatives. The wagerer, although quite convinced that God does not exist, cannot discard the slight chance that God exists. Therefore, assume that P(God exists) = ε, a positive but very small value. The decision problem now becomes a comparison between two expected utilities.
The expected utility of believing:
E[U(believing)] = ε · U(heaven) + (1-ε) · U(wasted Sundays) = ε · ∞ + (1-ε) · (-C) = ∞.
Whereas the expected utility of a pagan life:
E[U(non-believing)] = ε · U(hell) + (1-ε) · U(enjoyable Sundays) = ε · (-∞) + (1-ε) · C = -∞.
An easy comparison between the two alternatives (∞ > -∞) clearly shows that, according to selfish rationality, leading a religious life is preferable.
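The comparison can be expressed in a few lines of Python (a sketch only; the values of ε and C are arbitrary placeholders, since only the signs of the infinities matter):

    import math

    eps = 1e-6  # P(God exists): positive but very small
    C = 1.0     # finite (dis)utility of how one spends one's Sundays

    eu_believing = eps * math.inf + (1 - eps) * (-C)      # = +infinity
    eu_non_believing = eps * (-math.inf) + (1 - eps) * C  # = -infinity

    assert eu_believing > eu_non_believing
    print(eu_believing, eu_non_believing)  # inf -inf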
Both accounts provide useful metaphors for modern decision processes under uncertainty. Whereas the first kind of approach resembles the decision processes of many federal risk agencies, the latter has been propagated since the development of modern game theory by Von Neumann and Morgenstern in the forties and Savage's subsequent, subjectivist expansion.
In this section we shall discuss the expected utility approach. In discussing Bayesian probability theory we have analyzed several of the presuppositions of the expected utility approach. An expectation generally resembles the `amount' that a gambler will `earn' on average if she keeps playing the same game indefinitely. The principle of expected utility maximization postulates the equivalence (not necessarily the identity) of an average outcome and that upon which one should act in a one-shot bet. Expected utility maximization has the advantage of being unbiased in the long run, and Viscusi is confident that "upon reflection most people ... would accept the theory's underpinnings."
Expectation is a mathematical quantity defined as the sum of all values times their probabilities. The concept of expectation has often been the subject of confusion. The misleading quality of the concept of mathematical expectation is its name. If an experiment is performed, for instance rolling a die, we do not really expect that the outcome will be 3.5, i.e., the expectation. The confusion of the everyday use of the term expectation with the mathematical concept of the same name is widespread. Novick and Jackson confuse the two different meanings when they say: "The number of colleges one would expect to need to test to find one with a mean 19.0 or greater is 1/p. To get n schools with the required score one would expect to need to test n/p. For example, if n = 4 and p = .237 ..., one would expect to test about 17 schools to get the required 4 additional schools with the required scores." Similarly, Binmore falls prey to this conceptual confusion: "Adjusting the probabilities [that are estimated by using the upper level of the 95% confidence interval rather than the mean] amounts to lying to ourselves what we expect." Another example comes from Lindley, who interprets the statement "I am 90% certain that the neutrons have been produced as a result of a thermonuclear reaction" in the following way: "The 90% quoted by the scientist was probably his assessment of a fair bet, since the statement, unlike a bookmaker's, would not be made with the expectation of gain." The concept of expectation in the second part of the statement is the everyday-life concept, whereas the concept of a "fair bet," i.e., a bet with mathematical expectation zero, is clearly not. In order to keep the two apart, one should realize that what we expect is that which has a high probability of occurring, whereas a mathematical expectation is a long-run average value.
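The die example makes the distinction computable. A minimal sketch in Python (illustration only):

    # The mathematical expectation of a fair die: sum of values times probabilities.
    faces = [1, 2, 3, 4, 5, 6]
    expectation = sum(face * (1 / 6) for face in faces)

    print(expectation)           # 3.5, the long-run average
    print(expectation in faces)  # False: 3.5 is not an outcome anyone 'expects'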
The true meaning of mathematical expectation is thus a long-run average. If the gambler had the choice between the average of an infinite number of gambles and the stake, then the choice would be, with probability one, between two identical things, and thus she would be indifferent, with probability one. However, the actual choice situation is not between the average gain of an infinite number of gambles and the stake, but between a single gamble and the stake. Risk-neutrality assumes the equivalence between gamble and stake, which actually exists only in the long run. This assumption is sometimes acceptable; however, there are instances in which it is not even approximately true. From the early days of probability theory stems the St. Petersburg Paradox: how much should a rational person be willing to pay to play the following bet?
The gambler flips a coin until she gets a tail. If i is the number of consecutive heads that showed up before the tail, then she receives 2^(i+1) dollars.
Note that, if X is defined as the amount of winnings, then her expected gain E[X] is infinite.
E[X] = Σ_a a·pX(a) = Σ_{i=0}^{∞} 2^(i+1)·(1/2)^(i+1) = 1 + 1 + 1 + ... = ∞
Therefore, a risk-neutral gambler should be willing to pay any amount of money to play this game. It is unlikely, however, that anybody would ever risk her entire fortune to play this game, only to end up with, e.g., sixty-four dollars after an already quite rare run of five consecutive heads. Expectations rely on asymptotic considerations, and the gambler of life may simply not have an opportunity to realize such a run. Using John Maynard Keynes' words freely, we can say: "Asymptotically, we shall all be dead." This should at least cast a shadow of doubt on the idea that a mathematical expectation can count as an a priori principle of decision making, as Gauthier and others have suggested.
What the St. Petersburg Paradox shows us is that there is no inherent connection between mathematical objects and rational action. The Paradox only exists as a paradox if one believes that such a connection does exist. That Daniel Bernoulli understood this is clearly demonstrated by his definition of moral expectation, an attempt to model human moral intuitions in response to the Paradox. He argued that any gain is less than proportionately related to moral enjoyment (whereas damage possesses a more than proportional relationship). According to this moral expectation, a rational person (where rational is taken to mean, by definition, acting in accordance with moral expectation) would pay at most $1 for the game.
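The slow, erratic way in which finite runs of this game approach their infinite expectation, and the contrast with Bernoulli's diminishing-utility valuation, can be made vivid with a small Monte Carlo sketch. The code below is our own illustration under the payoff schedule quoted above, not part of the thesis's formal apparatus:

    import math
    import random

    def st_petersburg_payoff(rng):
        # One play: count the consecutive heads i before the first tail;
        # the payoff is 2^(i+1) dollars, as in the bet described above.
        i = 0
        while rng.random() < 0.5:
            i += 1
        return 2 ** (i + 1)

    rng = random.Random(1997)
    for n in (100, 10_000, 1_000_000):
        payoffs = [st_petersburg_payoff(rng) for _ in range(n)]
        mean = sum(payoffs) / n
        mean_log = sum(math.log(w) for w in payoffs) / n
        print(f"n = {n:>9}: average payoff {mean:8.2f}, average log-payoff {mean_log:.3f}")

The average payoff keeps creeping upward as n grows (its theoretical value is infinite), while the average of the logarithm of the winnings settles near 2·ln 2, approximately 1.386: a diminishing-utility valuation of the bet remains finite, which is the essence of Bernoulli's response.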
The first risk considerations stem from the era of the rise of the merchant class in thirteenth century Italy. This period saw a steady increase of risks. The emergence of the banking system enabled ventures with higher stakes, and improvements in ship construction tempted sailors into more hazardous trips to places never sailed before. A risk-averse attitude emerged in the early days of risk and has stood the test of time as wise. "Sober men of affairs knew to avoid the gambling tables and to distribute their cargo among several ships." The first forms of insurance stem from that same period. It was considered wise to insure one's goods and one's health, and insurance is a risk-averse action --- if it were not risk-averse, then insurance companies would soon be bankrupt; they exist by virtue of a long run advantage, i.e., the long run disadvantage of those insured.
One danger of a critique of risk-neutrality is that it may easily misunderstand the tautological nature of modern game theory. Disowning its Benthamite origins, modern economic theory has rejected the idea that a utility function can provide a reason why a person chooses A over B. Instead it deduces that the utility of A was greater than that of B, because the person chose A when B was available. People act as if they are maximizing their expected utility. Primarily, therefore, modern game theory is a descriptive theory.
We do not wish to argue with a tautological theory, but it is our aim to unnerve any misplaced normative conclusions drawn from it. Some have argued that reliance on expected utility maximization is not only an economically superior but also a morally superior strategy. As a consequence of his concept of a homo economicus, Binmore comes close to this point. His idea of the homo economicus is that it has a set of personal and empathetic preferences, which are fixed in the short run; in the long term both personal and empathetic preferences are subject to change. Binmore shows that "it is time that corrupts societies --- and, given long enough, it corrupts them absolutely. It does so by creating a society whose forum for moral debate is the market-place." The reason is that Binmore believes that only those people with a risk-neutral meme will survive. Closed societies that live on the verge of extinction may better be guided by risk-averse principles, but in open societies risk-neutral subjects will survive. Binmore himself is disinclined to say that risk neutrality is morally superior; however, his conception of the homo ethicus as contained in his homo economicus drives him to this evolutionary naturalism.
However, the connection between normative concepts and expectation is unwarranted. The idea that a bet at the critical odds is fair is a category mistake. Lindley argues that "the scientist would not offer a higher odds because he thinks it is unfair to him, and if he made it at lower odds he would stand to win, which would be unfair to his opponent." The idea that risk-neutrality has some kind of moral superiority is postulated but never justified. Where morality is concerned, considerations of risk-neutrality are irrelevant. What such statements actually mean is that the bet is risk-neutral, i.e., that the game has a zero expected gain. Mathematical speculations and game theoretic tautologies have no moral relevance. The idea of fairness pertains only to an equitable distribution over those who are of equal merit. A bet or lottery never fulfills such a requirement. Why should any chance outcome have a special status?
When the principle of expected utility maximization is taken as a tautological axiom in a game theoretic framework, we do not wish to argue with it. However, if it is assumed to be an actual guide of action, instead of a description, then we have shown several breakdowns of expected utility theory. Expectations rely on a mistaken equivalence of a single-game decision with long-run-average considerations. Our moral reservations against expected utility theory concern its reliance on the average and the normal, and the fact that it fails to take fully into account the unexpected and the catastrophic. If there is too much at stake, the proverbial bird in the hand may be preferable to its two cousins in the bush. Despite these problematic aspects, risk-neutrality has a long run edge in frequently occurring and marginally important cases, and can be useful in managing the ordinary.
The image, thus far, of life as a continent constantly threatened by the ferocious sea of risk should be amended in one important respect. Instead of being threatened from one side, human life is subject to risks from all sides. Building a dike on one coast will inevitably lead to neglect on the other. Human life in that respect is less like The Netherlands fighting the sea behind its dikes, and more like Delos, a wandering island attacked from all sides by the sea. To pursue this metaphor further, we shall argue that if we were Delos we would do well to host our modern Leto, statistical decision theory and game theory, to get some stable ground under our feet.
In life we face risks, sometimes in a sequential and sometimes in a parallel fashion. In the last decades cancer has become an important cause of death, but only after other risk factors declined in significance. Due to the success of modern medicine many people now die from cancer, whereas before they would have died earlier of another disease. After we withstand the first wave of risk, the next one rolls toward us. This is the sequential nature of risk. The other aspect of risk is its pervasiveness in the present. Several risks face us at once. We could die in a car accident, or become ill from the toxic fumes of a malfunctioning chemical plant nearby. Our island of life could be flooded by actualizing risks from any side at any time.
The mathematical theory of competing risks is concerned with the assessment of a specific risk in the complicating presence of other risks. In this section we shall look at the decision theoretic background of the issue of competing risks. Although the actual outcome of a risk can never be just or unjust in itself, an improper choice under uncertainty can be unjust. It will be shown that the Von Neumann and Morgenstern axioms, and our own calculations, dictate spending money on reducing the largest risks, even if the reduction achieved there is, in relative and absolute terms, smaller than what could be achieved for lesser risks. Another issue that we discuss in this section is the competition between so-called knowledge risks and health risks. In statistical hypothesis testing of carcinogens, falsely attributing carcinogenicity competes with incorrectly leaving a true carcinogen unregulated.
Choice between two evils --- the best safety effort
Paradox of the Bullets --- discrete case
The social architect is confronted with a vast array of surrounding risks of different, not always easily identifiable, natures. Necessarily, she will have to make a choice about the way to spend her resources on safety measures. She could reduce the risk of a nuclear accident, but then there is less money available for airline safety. Occupational safety competes with product safety. The more money a company has to spend on its workers, the less it may be willing to spend on the safety of the products it produces.
We have to make some simplifying assumptions. We shall assume that there are only two risks competing with each other, each with transparent probabilities. The safety measures are assumed to have an equal absolute impact on the two risks. The actual situation that we shall consider is the following:
Some bullets are loaded into a revolver with six chambers. The cylinder is then spun and the gun pointed at your head. Would you now be prepared to pay more to get one bullet removed when only one bullet was loaded or when four bullets were loaded?
We shall consider several departures from this classical bullet example that will reveal some essential features of proper risk management. In the original case we look at two `competing' risks, each of which threatens us in a different possible world. We want to know for which risk we would be willing to pay more to have it reduced.
In a case like the one presented above, Binmore notes that "usually, people say they would pay more in the first case because they would buy their lives for certain. However, the Von Neumann and Morgenstern theory says that you should pay more in the second case, provided that you prefer life to death." This fact is surprising to many, because it reveals that under the stated assumptions it is more worthwhile to diminish the highest risk. This surprising result has been named the Bullet Paradox, although the paradox exists merely on the level of initial intuition. The paradox dissolves as soon as one grasps the proof by the use of the Von Neumann and Morgenstern axioms.
We shall present the example in a modified form, dropping the implicit assumption that the utility of dying is the same irrespective of having paid whatever sum of money. There are several important reasons to consider the problem without this assumption. It is questionable indeed whether the utility of death is independent of the amount of money spent. One can certainly anticipate the regret of dying with an amount $X spent on the safety measure of removing a bullet; this regret is proportionate to the value of X. The higher X, the more regret one would feel. Secondly, if we consider, instead of an impersonal, notional person, a person with regard for posterity, then the value of death depends on the amount of money spent on prevention: the higher the amount X, the less would be available for posterity, and the lower the value of death. Finally, making the utility of death dependent on the amount of money paid for the safety measure makes the solution to the problem more general and its application wider. It allows us, later on, to generalize the solution to cases where injuries or other forms of damage are substituted for death.
We return now to the classical bullet example. Table 13 contains all the possible outcomes of the two games, ordered according to their relative utility. The first game concerns the payment of $X in order to have the only bullet removed from a gun with six chambers. The second game is the game where the player pays $Y to have one bullet removed from a gun with six chambers and four bullets. Paying money to have a bullet removed stands here as the paradigm of implementing a safety measure with costs involved.
Table 13. Possible outcomes in the two bullet games

Outcomes in the two games, with or without the safety measure

                      Shot dead                                        Alive
Game                  Safety measure                No measure         Safety measure             No measure
1. (1 bullet gun)     LC = {shot dead, paying $X}   L = {shot dead}    C = {alive, paying $X}     W = {alive}
2. (4 bullets gun)    LD = {shot dead, paying $Y}   L = {shot dead}    D = {alive, paying $Y}     W = {alive}
In Table 14 we list the probabilities for each of the alternatives in the two games. Implementing the safety measure leads in both cases to a reduction of the risk of dying.
Table 14. Probabilities of possible outcomes in the two bullet games

Probabilities involved in the two games, with or without the safety measure

                      No safety measure          Safety measure
Game                  Dead        Alive          Dead        Alive
1. (1 bullet gun)     1/6         5/6            0           1
2. (4 bullets gun)    4/6         2/6            3/6         3/6

It is a typical situation that the social architect is confronted with multiple risks, each of which has to be considered for reduction by means of a safety measure. When in psychological studies people were asked whether they would spend more money on taking the one bullet out of the gun with only one bullet, or more on taking one bullet out of the gun with four, they tended significantly to be willing to pay more in the former case. People felt that buying their life for certain was worth more to them than merely improving the probability of saving their lives. On the basis of the Von Neumann and Morgenstern principles of rational action, should we be willing to pay more in the first case, or in the second case, to get one of the four bullets removed?
The answer depends on how one values death versus death having lost an amount of money. A person is willing to pay an amount of money for having a bullet removed to the extent that she feels indifferent between the loss of the money she pays and the advantage of having a bullet removed. For each game we can write:
Game 1. u(C) = 1/6 u(L) + 5/6 u(W) [1 bullet gun]
Game 2. 1/2 u(D) +1/2 u(LD) = 4/6 u(L) + 2/6 u(W) [4 bullets gun]
In a utility function the zero point and the unit can be chosen arbitrarily due to the scale and translation invariance of the Von Neumann and Morgenstern utility function; given those, the utility function is fixed. Choosing u(LD) = 0 and u(W) = 1 we can express the preference for C and D as a function of the utility of L.
u(C) = 1/6 u(L) + 5/6
u(D) = 4/3 u(L) + 2/3
As Figure 7 suggests, the meaningful range of u(L) runs from 0 to 0.25, because at u(L) = 0.25 we have u(D) = 1 = u(W). Given the assumption that we prefer more money over less, the relationship u(D) ≤ u(W) = 1 should hold.
Figure 7. Relationship between the utility of (C) 1 out 1 or (D) 1 out 4 removed
The conclusion is that for values of u(L) between 0 and 1/7, u(C) > u(D). This means that if we are `quite indifferent' between death, u(L), and death having paid a sum of money, u(LC) or u(LD), then we value being alive minus $X more than being alive minus $Y. Ergo, X < Y. Therefore, if a person is approximately indifferent between being killed by a bullet with or without her money, it is rational for her to pay more for getting one of the four bullets removed than for removing the single bullet in the one-bullet gun.
If u(L) lies between 1/7 and 1/4, then the gut feeling of most people proves right, although probably not for the right reason. Most people explain that they are willing to pay more for the certainty of being alive than for a process that remains risky anyway. The reason according to rational game theory, however, is that one considers dying with one's money gone to be so much worse than simply dying. The image is that of a caring mother who does not want to leave her children without money.
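The comparison can be checked numerically. The following sketch (ours, with u(LD) = 0 and u(W) = 1 as chosen above) tabulates u(C) and u(D) over the meaningful range of u(L):

    def u_C(u_L):
        # Alive having paid $X: u(C) = (1/6) u(L) + 5/6.
        return u_L / 6 + 5 / 6

    def u_D(u_L):
        # Alive having paid $Y: u(D) = (4/3) u(L) + 2/3.
        return 4 * u_L / 3 + 2 / 3

    for u_L in (0.0, 1 / 7, 0.2, 0.25):
        print(f"u(L) = {u_L:.3f}:  u(C) = {u_C(u_L):.3f},  u(D) = {u_D(u_L):.3f}")

The printout confirms the crossing point: u(C) > u(D) for u(L) below 1/7, equality at u(L) = 1/7, and u(D) > u(C) between 1/7 and 1/4.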
This analysis gives a first indication that reducing the highest risks is the most effective strategy when one is confronted with multiple risks. However, in the classical bullet example the risk structure is very artificial: it is assumed that there are two parallel universes, in each of which a certain discrete risk threatens the subject. In the next sections we shall formulate more realistic extensions of the bullet example.
The Best Safety Effort with a Fixed Budget
Risk prevention programs often do not determine the size of their own budget; they are normally presented with a fixed budget that the program can use for several activities. It is therefore more realistic to assume a fixed budget of size $N from which safety measures will be financed. This safety budget has to be used on safety efforts only. In decision theoretic terms this translates into the following two assumptions.
Assumption 1. The risk taker has a fixed budget $N available for reducing two competing risks.
Assumption 2. The risk taker takes no interest in the budget, which has been specified for safety measures only and can never be used for her personal enjoyment. Therefore, the following relationships hold: u(LC) = u(LD) = u(L) and u(C) = u(D) = u(W).
Another realistic improvement over the previous section is that here the two risks are assumed to be present at the same time. Combined with the previous two assumptions, this implies that the utility considerations of the previous section reduce to probability considerations. We want to improve our chances of survival by spending our budget in the best way on the best safety measures. This requires another assumption to make the problem identifiable. We shall assume that the risks we face are independent of one another.
Assumption 3. The two risks are independent from one another.
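Under these three assumptions the quantity to be maximized is simply the joint probability of survival, which by independence factors into the product of the single-risk survival probabilities (our gloss on the assumptions, using the notation of the cases below):

    P(survival) = p1 × p2,    where pi is the probability of surviving risk i alone.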
First we shall return to a bullet example where two risks compete for slaying the risk taker and the budget can be put to use only in a discrete way. Then we shall see how a proportionate risk reduction affects safety efforts, and we shall extrapolate the impact these results have on safety regulations.
Case 1. Discrete bullets. Imagine that two guns are pointed at your head, one with four bullets and the other with one bullet. Both will be fired. Before that moment, however, you can decide to bribe one sniper to take out a bullet. These circumstances increase the urge to come up with the most useful strategy. If the budget is interpreted as a bribe to have a member of the firing squad take out a bullet, then you can bribe only one of the two executioners. Whom should you bribe? Now that money has lost its importance, the overall utility function reduces to a probability of survival that one should try to maximize.
Table 15. Survival probabilities

P(survival | 1 from 1 removed)      P(survival | 1 from 4 removed)
1 × 2/6 = 12/36                     5/6 × 3/6 = 15/36

Table 15 suggests that instead of trying to reduce the risk of one component to a minimum, it is more profitable to reduce first the risk of the component with the highest risk.
Case 2. Continuous bullets. The bullet example has both the advantage and the disadvantage of simplicity. We shall make the example more general and realistic by dropping two constraints. In the new situation, instead of two guns, two abstract hazards endanger the risk taker. The first hazard has probability (1-p1) of being fatal, whereas the second strikes with probability (1-p2). Without loss of generality, p1 is taken to be smaller than p2; thus we have a higher chance of survival with respect to the second risk than with respect to the first. Secondly, the budget $N can be broken up and allotted partially to the first and partially to the second risk. We assume that a fraction x of the budget is used for safety measures with respect to risk 1, and the fraction (1-x) is devoted to reducing the second risk. We shall distinguish two cases. In the first case we assume that the budget has a proportional impact on either risk: if the full amount of the budget is spent on risk i, then this leads either to a constant p% safety improvement or to a p% risk reduction for that risk. Surprisingly, whether it concerns a safety improvement or a risk reduction makes a difference. In the second case we assume that the total risk reduction is an absolute p%, which can be subtracted from the first risk, from the second, or from a combination of both. In all cases we ask the same question: what is the best strategy?
Case 2.A1. Proportional safety improvement. When a risk manager is confronted with the choice of spending money on certain safety measures, her decision will depend heavily on the impact expected from the measures. In this subsection we shall assume that the safety budget has a proportional impact on the risk on which it is spent. So if the full budget is spent on risk 1, then the survival rate for risk 1 alone will improve by p%. The following calculations involve optimization of the survival probability.
pi is the survival probability for risk i alone.
x is the fraction of the budget spent on risk 1.
p1,new = (1 + xp)·p1,   p2,new = [1 + (1-x)p]·p2.
Consequently:
P(survival) = p1,new × p2,new = p1·p2·[-p^2·x^2 + p^2·x + (1 + p)].
Maximizing a quadratic function with a negative leading coefficient, we find:
x_max survival = -b/2a = 1/2.
The conclusion is that if the budget has a proportional, constant safety improving effect on the risks involved, then it is most efficient to divide the budget equally between the two risks. In this way we maximize the probability of survival. This outcome does not rely on the values or orders of magnitude of the risks involved.
Case 2.A2. Proportional risk reduction. It would seem that a constant proportional risk reduction should lead to exactly the same recommendation for spending our safety budget as a constant proportional safety improvement. Surprisingly, this is not the case. We shall assume here that the impact of the safety measure is a proportional reduction of the risk on which it is spent. Thus, if the full budget is spent on reduction of the first risk, then the risk of death from it will be reduced by p%. Again, without loss of generality, we assume that p1 < p2.
(1 - pi) is the probability of death from risk i alone.
x is the fraction of the budget spent on risk 1.
(1 - p1,new) = (1 - xp)(1 - p1),   (1 - p2,new) = [1 - (1-x)p](1 - p2).
Consequently:
P(survival) = p1,new × p2,new = [p1 + xp(1-p1)]·[p2 + (1-x)p(1-p2)]
            = -p^2(1-p1)(1-p2)·x^2 + p[(p2 - p1) + p(1-p1)(1-p2)]·x + p1·p2 + p·p1(1-p2).
This is again a quadratic with a negative leading coefficient, but its vertex lies at x = 1/2 + (p2 - p1)/[2p(1-p1)(1-p2)], which falls at or beyond the end of the interval [0, 1] whenever p2 - p1 ≥ p(1-p1)(1-p2). In that case we find:
x_max survival = 1.
The conclusion is, surprisingly, not the same as in the previous case. One might have expected that it would not matter whether the safety measures had a constant proportional risk reduction or a constant proportional safety improvement. However, it matters significantly. It turns out that if the safety budget reduces risk proportionally at a constant rate, and the two risks are sufficiently far apart relative to the budget's power (as they are in the bullet example), then it is most efficient to spend the whole budget on the highest risk, the one with the lowest survival rate --- p1 here. One general conclusion that does follow from case 2.A1 and case 2.A2 is that it is never profitable to reduce the lowest risk to a minimum. None of the considered cases deems this efficient. In the next subsection we shall consider another possible safety impact and its subsequent policy suggestion.
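Both analytic results can be verified by brute force. The sketch below is our own; the survival rates p1 = 2/6 and p2 = 5/6 are borrowed from the bullet guns, and the budget effect p = 0.1 is an arbitrary illustrative assumption. It searches a grid of budget splits:

    p1, p2, p = 2 / 6, 5 / 6, 0.1  # survival rates and budget effect (illustrative)

    def survival_A1(x):
        # Case 2.A1: proportional safety improvement.
        return (1 + x * p) * p1 * (1 + (1 - x) * p) * p2

    def survival_A2(x):
        # Case 2.A2: proportional risk reduction.
        return (1 - (1 - x * p) * (1 - p1)) * (1 - (1 - (1 - x) * p) * (1 - p2))

    grid = [i / 1000 for i in range(1001)]
    for name, f in (("2.A1", survival_A1), ("2.A2", survival_A2)):
        best = max(grid, key=f)
        print(f"case {name}: optimal fraction of the budget on risk 1 = {best:.3f}")

Case 2.A1 returns 0.500 (split the budget), case 2.A2 returns 1.000 (spend it all on the larger risk), in agreement with the calculations above.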
Case 2.B. Absolute safety improvement. In this subsection we shall consider the case where the budget improves safety by an absolute p%. If the risk manager decides to spend the whole safety budget on risk i, then the survival rate for risk i will increase absolutely by the rate p. Clearly this formulation is `symmetrical' with respect to risk reduction, i.e., an absolute safety improvement of rate p equals an absolute risk reduction of rate p. Therefore, we do not have to consider these two cases separately, as we had to do above in case 2.A. We assume again that p1 < p2.
pi is the survival probability for risk i alone.
x is the fraction of the budget spent on risk 1.
p1,new = p1 + xp,   p2,new = p2 + (1-x)p.
Consequently:
P(survival) = p1,new × p2,new = -p^2·x^2 + p(p2 - p1 + p)·x + p1(p2 + p).
Maximizing a quadratic function with a negative leading coefficient:
x_max survival = -b/2a = 1/2 + (p2 - p1)/2p.
This conclusion is important. What does it tell us? If the initial risk rates are equal, then it is best to improve the safety rates equally and spend half the budget on the one risk and half on the other. If, on the other hand, risk 1 exceeds risk 2 by more than the safety effort could make up for, i.e., if (p2 - p1) ≥ p, then the best thing for the risk manager to do is to spend the safety budget completely on reducing risk 1. If the risks are less than rate p apart, then a mix of the safety budget gives the best results.
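In code, the recommendation of case 2.B is a one-line rule: compute the vertex and clip it to the unit interval. This is our sketch; the numerical triples are illustrative only:

    def best_fraction_absolute(p1, p2, p):
        # Case 2.B: the optimal fraction of the budget on risk 1 is the
        # vertex 1/2 + (p2 - p1)/(2p), clipped to the interval [0, 1].
        x = 0.5 + (p2 - p1) / (2 * p)
        return min(max(x, 0.0), 1.0)

    print(best_fraction_absolute(0.5, 0.5, 0.1))    # equal risks: split, 0.5
    print(best_fraction_absolute(0.3, 0.5, 0.1))    # gap 0.2 >= p: all on risk 1, 1.0
    print(best_fraction_absolute(0.45, 0.5, 0.1))   # gap 0.05 < p: a mix, 0.75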
The Larger Picture
For the larger picture the bullet examples yield some interesting conclusions. First, the initial safety levels clearly matter for the decision how much money to spend on which safety efforts. This is often a point of individual and societal negligence. Viscusi remarks that "people often react with alarm to a risk increase but may not be willing to spend much to achieve a comparable risk reduction." People tend to accept the current risk status as a general status quo, and the absolute levels of risk do not matter to them. However, safety money can be spent more wisely if these initial safety levels are taken into account and the different risks are distinguished accordingly.
Currently, there are no coordinated efforts to tune safety efforts to rational decision making. Unfortunately, the safety measures on which most money is spent often concern exotic, small risks. A recent example involves the attention to the deaths of young children and babies as a result of forcefully inflating airbags. The fatality rate is around thirty children in the last five years. This highly publicized case takes attention away from the 40,000 other traffic deaths --- among them many children --- each year, and it induces the risk of spending money inefficiently on small risks.
The first place where risk regulation should be coordinated is within the federal regulatory agencies. These agencies have been the subject of much criticism, specifically for their uncoordinated efforts. Some of the criticism, however, has not been accurate. Some critics have argued that the best standard for regulatory agencies is the amount of dollars invested per life saved.
It is useful to think about risk-averting policy in terms of the rates of tradeoff involved, such as the cost per expected life saved. Using this lives-saved standard of value highlights the most effective means of promoting our risk reduction objective. The cost-effectiveness of existing regulations ranges widely, from $200,000 per life saved for airplane cabin fire protection to as much as $132 million per life saved for a 1979 regulation of DES cattle feed. These wide discrepancies reflect differences among agencies in their risk-cost balancing as well as differences in the character of risk-reducing opportunities.
Discrepancies of a factor 660 certainly indicate misplaced policy priorities. Nonetheless, certain differences among safety efforts are justified. Those agencies that deal with greater public risks should be allowed to make more costly regulations than those that have lesser impact on society.
Cognitive vs. health risks --- competing hypotheses
The Blind Men and the Elephant
It was six men of Indostan
To learning much inclined.
Who went to see the Elephant
(Though all of them were blind).
That each by observation
Might satisfy his mind.
The First approached the Elephant,
And happening to fall
Against his broad and sturdy side,
At once began to bawl:
"God bless! but the Elephant
Is very like a wall!"
…
The Second, feeling of the tusk,
Cried, "Ho! what have we here
So very round and smooth and sharp?
To me `tis mighty clear
This wonder of an Elephant
Is very like a spear!"
And so these men of Indostan
Disputed loud and long,
Each in his own opinion
Exceeding stiff and strong.
Though each was partly in the right
And all were in the wrong!
J. G. Saxe (1816-1887)
The world through the eyes of a statistician is like the elephant as approached by the blind men. The statistician sees the world in terms of probabilities, never certainties. A statistician gathers evidence for increasing support and accuracy, and she manipulates that evidence according to a calculus of uncertainty. Uncertainty is a frustration of reason. Retrospectively, human beings may be capable of assessing a situation accurately; prospectively, however, human reality is anything but transparent. Human decision making faces uncertainty and unpredictability.
Statistical hypothesis testing is one of the means to come to terms with the level of risk in the world. However, hypothesis testing itself is also touched by the original sin of uncertainty with which it deals. The truth of the conclusions from statistical inference can only be expressed in terms of probabilities. When statistical hypothesis testing is applied to determining the carcinogenicity of a certain substance, two risks face the statistician: falsely expanding the edifice of knowledge, or falsely keeping harmful substances unregulated. For the sake of science the statistician should attempt to keep the former --- called type I error, or false positive --- down, whereas for the sake of health she should reduce the latter --- called type II error, or false negative. These scientific and moral claims are, unfortunately, not consistent with each other, because cost considerations allow only a limited sample size. It is society's task to come to terms with these two conflicting demands.
Cancer is a major cause of death in the United States and elsewhere in the developed world. In the United States alone, over 400,000 deaths per year can be attributed to cancer. These deaths fall into two subgroups: either self-inflicted or `active' (tobacco and alcohol related, responsible for about 30-45%), or due to `involuntary' or `passive' causes (related to diet, occupational hazards and air pollution, accounting for some 45-55%). It is not exactly known to what extent chemicals in people's diet cause cancer, let alone which chemicals. Cancer causing substances are called carcinogens. The EPA has estimated that there are about 55,000 potentially toxic substances. One study reports that about 6,000-7,000 of them have been tested, and 10-16 percent were found to be carcinogenic in animals. The regulatory board, the Occupational Safety and Health Administration (OSHA), compiled a list of 2,400 suspected carcinogens, and for only 570 of them has the National Institutes of Health gathered sufficient evidence to regulate them at some level of carcinogenicity. In this section we shall be concerned with the moral issues arising from the word "sufficient."
There are several ways to assess the carcinogenicity of a certain substance. Due to the moral inadmissibility of controlled studies, in which carcinogens would be randomly assigned to the research subjects, epidemiological studies are most widely used. Such a study consists in making statistical inferences from a sample that happens to be exposed to the potential carcinogen, compared to a sample that is not exposed to the substance. Epidemiological studies do have theoretical disadvantages that may hinder achieving accurate results. In this section, however, we shall assume that no such complications occur. We assume that the data come from a perfect random sample where exposure to the potential carcinogen has, somehow, been randomized.
The only constraint that the statistician faces in this situation is a limited sample size. She attempts to identify the cause of a relatively rare disease (i.e., on the order of magnitude of 1/10,000 in the general population, such as leukemia) by means of a cohort observational epidemiological study. The epidemiologist considers two hypotheses. The first hypothesis --- usually called the null hypothesis --- states that a potential carcinogen is not associated with the incidence of cancer, while the alternative hypothesis claims that such a connection does exist.
H0: The substance is not a carcinogen,
H1: The substance is a carcinogen.
Sampling and subsequent calculations are aimed at making a single inference: either accepting the alternative hypothesis or not rejecting the null hypothesis. However, the statistician will never be able to tell for sure which one is true because the incidence rate is cognitively hidden from her. For that reason, four distinct situations can arise when making the inference:
- She classifies the substance as harmless, which indeed it is.
- She classifies the substance as a carcinogen, which it indeed is.
- She classifies the substance as a carcinogen, while it is harmless.
- She classifies the substance as harmless, while it is a carcinogen.
Table 16. Possible results of hypothesis testing

             H0 is true    H0 is false
Accept H0    +             -
Reject H0    -             +
The first two are correct inferences, whereas the latter two are mistaken. As in any form of inductive inference, there exists the inherent risk that the inference is incorrect. The third inference, where the statistician believes that she is dealing with a carcinogen while, in fact, it is harmless, is called a type I error or false positive. The fourth inference is an example of a type II error or false negative. Historically, the type I error has been taken most seriously. In many settings of statistical research this error describes a mistaken addition to the body of knowledge. For instance, in certain medical studies the type I error describes the false belief that a certain treatment is healing. Such precedents help to explain the emphasis on reducing the type I error. A type II error, or false negative, on the other hand, often indicates that the cause of a certain phenomenon cannot be detected by the data and remains unknown. The type II error typically represents the failure to expand our knowledge. In a scientific setting it is understandable that this failure is considered less serious than falsely expanding the edifice of knowledge, the type I error. The argument there is sound and runs as follows: `The world is full of still unknown principles, and if the study fails to reveal a certain principle now, it will be discovered somewhere in the future.'
The traditional emphasis on avoiding type I errors, combined with considerably less attention to type II errors, has shaped the statistical practice of hypothesis testing to a large extent. Generally, a stricter standard applies to the risk of a false positive than to the risk of a false negative: it is thought less serious to fail to expand our knowledge than to expand it falsely. A stricter standard translates into a lower acceptable probability of making a mistake. In the case of the study of carcinogens, the epidemiologist can calculate the risks associated with the four possible inferences from Table 16. They are recorded in Table 17.
Table 17. Probabilities of the possible results of hypothesis testing

             H0 is true              H0 is false
Accept H0    1 - α (correct)         β (type II error)
Reject H0    α (type I error)        1 - β (correct)
The probabilities α and β are the probabilities of a type I error and a type II error, respectively. To avoid error, α and β should be as small as possible. The epidemiologist may choose probabilities very close to zero, which effectively means that she does not accept any chance of making a mistake: she regulates a substance only when it really is a carcinogen and keeps the substance unregulated only when it is certainly harmless. Unfortunately, in order to reach these levels of certainty the data have to be really `convincing.' Intuitively it is clear that the data can hardly be convincing when the sample size is too small or when the rate of occurrence of cancer due to exposure is only marginally higher. For example, if the background rate of cancer of type X is 1/10,000 (one person in a population of 10,000), and if a certain substance really is a carcinogen that increases the risk of incidence by a factor five, making the incidence rate 5/10,000, then an unrealistically large sample --- more than 80,000 people --- is needed to prove the toxicity of this carcinogen `without' risking any error (α = β = 0.01).
For practical purposes the sample size should be considered fixed in epidemiological studies; cost considerations make sample sizes far over 2,000 human subjects unlikely. Moreover, the epidemiologist has to decide prior to her analysis what level of relative risk, δ, her test will be able to detect. The choice of δ depends on several factors. For serious, life-threatening and prevalent diseases the epidemiologist will set the detectable relative risk very low, close to one. For other, less serious afflictions higher values of δ are appropriate. Another factor is the extent to which society can justifiably put the exposed group at risk. If the group exposed to the substance is exposed on a voluntary basis, it may be tolerable to allow a higher relative risk than for involuntary exposure. The mere size of the population exposed to the substance can also be a concern: if this group consists of the whole population, a lower relative risk may be appropriate.
When the sample size n and the relative risk δ are fixed by external considerations, the statistical analysis trades off directly between the magnitude of the type I error, α, and the magnitude of the type II error, β. From early on, the scientific community has employed the 95 percent rule, i.e., α = 0.05, in hypothesis testing, and federal agencies have generally taken this rule as a symbol of scientific objectivity and employed it in statistical testing. In a scientific environment that endorses only values of α smaller than 0.05, this trade-off implies an automatic compromise on the values of β, i.e., it exhibits no concern for the risk that the general population is exposed to an unknown, hazardous substance. For example, suppose that for the relatively rare disease above (background rate 1/10,000) the epidemiologist wants to be able to detect a relative risk δ = 5 with a sample size n of 2,150 people and a pre-set type I error α = 0.05. From these conditions it follows that the type II error, β, will be 0.81. This means that when the statistician is testing a truly toxic substance with an incidence rate at least five times the background rate, the test will classify the substance as harmless 81% of the time.
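To see where a number like β = 0.81 can come from, the following sketch performs a simplified power calculation of our own: an exact one-sided Poisson test on the exposed cohort alone. This is not necessarily the design underlying Appendix B, and different test choices yield somewhat different numbers, but the order of magnitude is the same:

    import math

    def poisson_tail(k, lam):
        # P(X >= k) for X ~ Poisson(lam).
        return 1.0 - sum(math.exp(-lam) * lam ** j / math.factorial(j) for j in range(k))

    n = 2150                  # cohort size
    background = 1 / 10_000   # baseline incidence of the disease
    delta = 5                 # relative risk the test should detect
    lam0 = n * background            # expected cases under H0 (about 0.215)
    lam1 = n * background * delta    # expected cases under H1 (about 1.075)

    k = 0                     # smallest critical count with size <= 0.05
    while poisson_tail(k, lam0) > 0.05:
        k += 1
    beta = 1.0 - poisson_tail(k, lam1)
    print(f"reject H0 when cases >= {k}; type II error beta = {beta:.2f}")

Under these assumptions the test rejects only at two or more observed cases, and β comes out near 0.71, of the same order as the 0.81 quoted in the text; the gap reflects the difference in test design.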
Research and regulatory institutes such as the Environmental Protection Agency (EPA) and the National Institute of Occupational Safety and Health (NIOSH) investigate the possible carcinogenicity of suspicious substances. The criteria that these institutes apply are deeply influenced by a scientific bias with respect to levels of significance and power. The scientific atmosphere within the field of risk assessment has set standards that pay no attention to moral considerations. The Interagency Staff Group on Carcinogens, a governmental risk manager, for instance, showed little awareness of any moral issue when it wrote in 1986 that the 95 percent rule is "a reasonable statistical bench mark with a history of use in the bioassay field. It would be imprudent to disregard it as an index of significance without good reason." The risk manager pushes her responsibility away by simply adopting textbook standards. However, the emphasis on so-called `good science' does make a morally loaded choice, if only by default.
We cannot have it both ways, both strict standards of knowledge and health-sensitive (and moral) regulations. At some level in the decision process these two demands have to be weighed against one another in order to find a proper balance. Viscusi argues that "the present focus of policy on risk thresholds and potential carcinogenicity is shared by the scientific research. [It would be better that] policies are based on the overall desirability of regulation rather than on the level at which a risk can be identified." We should not by default follow a certain so-called tested rule or criterion, such as the 95 percent rule. Especially in cases where a type II error is not so innocent, such as in the study of carcinogens, purely scientific criteria have to be forgone for the sake of public health.
There has been a strikingly similar case where the burden of proof rested on the side of public health --- with disastrous consequences. One of the most controversial cases in public health history has been the disputed effects of lead in gasoline. Relatively recently it has been proven under strict statistical criteria that leaded gasoline has been responsible for a decrease in intelligence and many behavioral problems in generations of children. Only forty years after the introduction of leaded gasoline did health scientists prove the subtle increase of risk for which lead is responsible, even though suspicions had existed much longer. As early as 1924, a year after its introduction, the magazine "Public Health" described the possible hazards of leaded gasoline. In 1925 a prosecutor demanded a hearing into the matter, but the gasoline company in question successfully shielded its product from further regulation because no evidence was available that such a hazard really existed. The burden of proof rested on the prosecutor instead of on the gasoline company. The prosecutor lacked proper statistical means, because all this happened a decade before sound statistical hypothesis testing was introduced by R. A. Fisher. Nevertheless, even now, with all the statistical methods at our avail, the burden of proof still rests on the side of morality and care for health. It is time that this changes and that the two competing risks facing the statistician are balanced properly.
Chapter 4. Safe shall be my going...
Safety
War knows no power.
Safe shall be my going,
Secretly armed against all death's endeavour;
Safe though all safety's lost;
Safe where men fall;
And if these poor limbs die, safest of all.
Rupert Brooke (1887-1915)
How safe should our goings be? When society sends its poets to die on the battlefields of Europe in the First World War, it becomes clear that safety is not merely an individual decision, but a decision for society at large. In this chapter we shall study the concept of risk, its historical connotations, its present predominance and its future effects. From considerations of risk we shall move to the issue of safety, and analyze the relevance of the ideas of justice for safety with the means that we have developed in Chapter 2 and Chapter 3.
Over its six-hundred-year history the concept of risk has developed from representing an uncertain opportunity to a calculable relative frequency. This development reflects the seeming control that human society has gained over uncertainty. Human beings tend to believe that they can avoid the risks of those activities over which they exercise some kind of control. Studies have shown, for instance, that a large majority of people believe that they are better than average drivers --- an irrational belief, since no more than half of all drivers can be better than the median driver. Even the belief that all risks can be identified and estimated goes beyond human capabilities and limitations. The unexpected is an ineliminable part of human reality.
The nature of risk presents ideal opportunities for a free market to indulge in risky technologies and unsafe behavior. We shall coin this the free rider problem of risk. First, one may secretly and intentionally take a risk without paying for possible consequences, although in the long run society will be confronted with those consequences. Secondly, risk is `invisible' in the short term, which makes it difficult for risks to have an impact on the decision process; only to a certain extent can risk assessment make up for this. The risks of certain complex systems are difficult to assess. And even if risks are known, the paradox of safety or the external benefits of a certain safety measure may still disturb the market for safety.
For those reasons there is a role for intervention, in particular for government intervention. To some extent the role of the government should consist in compensating for the defects of the market. An important way of achieving this is to provide risk information about uncertain practices, risky jobs and hazardous substances. An overall concern for government is that of distributive justice. When a society has to make decisions about extremely risky circumstances, it should be guided by the principle of deliberative rationality.
The word risk appeared for the first time in the vocabulary of the Italian merchant class of the fourteenth century. In the previous two centuries the merchant class had emerged as a leading political power in the Italian cities. Italian merchants spread over Europe and beyond, especially by means of sea transport. These expeditions, as a consequence, increased the opportunities for loss or damage. In this period a concentrated financial structure of bankers had also instituted a secure money and credit system. The presence of financial institutions made it possible to secure the financial dangers of the trips by means of a contract. The contract between financier and merchant was based on a precarious balance of the dangers and financial consequences of the expeditions for both parties --- like a gamble. It was in this association of sea transport and gambling that the concept of a cliff (rhiza) could take on the meaning of risk (riesgo) during the Italian Renaissance. The medieval mind did not have to deal with economic uncertainties, because economic life was a loosely coupled and linear system in the absence of an extensive financial structure and economic expansion. The institution of a financial structure enabled the development of riskier projects during the Renaissance, and of the concept of risk itself.
Initially risk stood for the financier's gamble to warrant a safe outcome of an individual trip. Risk was an individual gamble, not to be compared with its impersonal, modern counterpart. The factors to be taken into account were the level of competence of the seafarer, the destination, the state of the ship, etcetera. The financier compared the expected (in the sense of `most likely') gain with the possible losses of this particular trip. Risk also had an exclusively economic flavor in its early days.
The transformation of the concept of risk over the course of the following centuries exhibits an evolution that began in fourteenth century Italy. From the individualized, strictly economic gamble for which it stood, the concept of risk came to stand for any chance negativity. This was not only a conceptual expansion, but also a methodological-technical progression. Previously, an individual contract dealt with the individual peculiarities of what had to be insured. It gradually became clear that the risk evaluation of a situation could make use of certain general features. With the rise of the probability concept in the seventeenth and eighteenth centuries, ideas of statistical homogeneity became familiar. This development gave insurers the opportunity to make more accurate determinations of the risks to which their clients were subject. At the same time, technological progress and the perfection of a global credit system increased the complexity and tightness of society. In this kind of world potential hazards increased exponentially. Individual risk assessments lost out against more accurate and less time consuming statistical calculations. The replacement of an individual gauge by a statistical estimate also relieved the insurer of the restriction to measure only personal, economic risks. Risk expanded conceptually to incorporate not only economic hazards, but any kind of uncertain danger, irrespective of personal or impersonal agenthood.
Unlike the old concept of risk, the modern concept incorporates the specific features of chance and probability that make it subject to some of the difficulties that we have discussed in the previous chapters, in particular the homogeneity requirement and the danger of hindsight due to the asymmetry of chance. When the concept of risk came to be defined as a probabilistic concept, it assumed that the level of hazard could be determined by a set of general features that singled out homogeneous subclasses. The disconnection between individual merit and level of risk endangered the opportunity for the insurer to encourage cautious behavior in the individual. As long as one compensates for the absence of such opportunities by a more accurate analysis of the general properties relevant to the hazard, the trade-off is favorable. In later sections of this chapter we shall argue that the absence of such relevant homogeneous subclasses --- not uncommon in a new technological world "built in seven decades" --- indeed constitutes a problem for a meaningful risk concept. On top of that, the temporal asymmetry of chance carries over to the concept of risk, and it shall be shown that this fact tends to select technologies that are riskier than is desirable.
Some empirical facts on risk-behavior
A gambler in a casino prefers to roll the die herself rather than have someone else roll it, even though, with respect to the chances of winning, it does not matter whether she or her greatest enemy rolls the die. Many gamblers exhibit behavior of fictitious control over situations that are in principle beyond their control. The idea that luck can be manipulated or controlled is a stubborn feature of human nature and lies at the root of the religious worship of Luck in many cultures. For example, the Shichi-fuku-jin ("Seven Gods of Luck") are important deities in Japanese mythology. In Italy, Fortuna was worshipped extensively from pre-Roman times onwards; in Baltic religion and mythology it was considered good luck to have a zaltys, a harmless green snake, in one's house --- and bad luck to kill one; even Niels Bohr (Danish physicist, 1885-1962) was said to have had a horseshoe on the wall. In this section we shall look at some empirical evidence for the existence of this human belief that one can make oneself immune to risk and susceptible to good luck.
Figure 8. Risk-behavior: risk versus benefit*
Studies in the last three decades have revealed interesting facts about risk behavior. Risk behavior can be defined as the ratio of risk to the benefits received for that risk; it is an empirical, economic measure of how market forces shape people's risk perception. Sometimes there are safety measures on hand that people choose to forgo, because the costs of the safety measure apparently did not balance the increased safety. The perceived benefits of a safety measure can thus be estimated by the value that the public is willing to spend on it.
Figure 8 suggests that risks can be distinguished as active risks, over which we imagine some kind of control, and passive risks, to which we are subject without possibility of control. The general public appears willing to accept active risks roughly a thousand times greater than passive risks. For instance, although the public estimates the benefits of avoiding the risk of natural disasters and train accidents to be approximately equal, the fatality rate of natural disasters is of the order of 10^-10 fatalities per person-hour of exposure, whereas train accidents account for 10^-7 fatalities per person-hour of exposure.
We fear and reject risks where we are passive recipients of harm. The plant, we feel, not unreasonably, should not blow up, the dam break, the air controller goof, the Ford executives fail to protect a gas tank from exploding; over these risks we have little control. But we are willing to take risks with driving, skiing, and parachuting.
A distinction between the human response to actively assumed risks and the reaction to risks to which humans are passively subjected does indeed appear in these empirical data.
The discrepancy seems to suggest that people either feel very threatened by circumstances beyond their control, or underestimate the risks of events over which they believe to have control. Nevertheless, luck is simply beyond anyone's control. For that reason, the discrepancy between active and passive risk behavior bears witness to a certain irrationality: rational people would on average value passive risks and active risks equally. The divergence gives some weight to our hypothesis that the free market can deal, at best, imperfectly with risks. In the next sections we shall study some of the reasons why a free market has to fail in managing risks properly.
No system is risk-free; it may therefore seem that the difference between systems of risk is merely gradual, that is, that risks differ only in magnitude. We shall show that this assumption lacks an accurate understanding of the qualitative changes of risk within the modern technological world. Two qualitative factors of risk systems shall be singled out: the level of coupling and the complexity of the system. These two factors determine the extent to which risks can be managed rationally. A loosely coupled system with linear interactions allows for manageable risk control. Reduced slack and more mutual interactions diminish the manageability of the system. In extreme cases control becomes mere fiction.
When the Kemeny Commission investigated the Three Mile Island nuclear accident, it found that in the nuclear plant many wires hung loose and many of the plant's engineers did not have a general understanding of nuclear power. One of the commissioners, Patrick Haggerty, General Director of Texas Instruments, downplayed the importance of these findings. He said that in Texas Instruments' plants wires often hang loose and engineers are often ignorant of the plant's system as a whole. Of course, a Texas Instruments plant is not a nuclear power plant, but in terms of risk the difference is merely gradual, he said. This, however, is not true. The production process of calculators may incur trouble without affecting other parts of the plant; most interactions are simple and there is generally enough slack, so an error in the process can easily be isolated and repaired. In a nuclear power plant, isolating an error can be much more difficult: a mistake disperses itself quickly through the system as a whole.
The world contains many systems, each with its own functionality. A society as a whole is a system; so is a coal-fired utility plant. An automobile, a central heating system, DNA technology, a university, a nuclear power plant and a production line are all examples of functional systems. Risk is present in all of these systems. Risk is induced by errors, of which three kinds are possible: operator errors, component errors and system errors. In any system we can distinguish the potential for these three kinds of errors. A car accident can happen due to drunk driving, due to a malfunctioning brake, or due to a design error in the car itself. The Kemeny Commission blamed the Three Mile Island accident mainly on the operators of the plant; however, certain components were defective --- there had been troubles with several indicators --- and the system interacted in unprecedented ways.
The systems themselves are characterized by different levels of complexity and coupling, allowing four types of risk systems (cf. Table 18). A central heating system is an example of a system that is linear and tightly coupled. The interactions in a central heating system are simple and direct: a drop in temperature triggers the system to be activated, and it shuts off again when the desired temperature has been reached. The trigger mechanisms are immediate, and thus the system is called tightly coupled. Other examples of linear, tightly coupled systems are dams and rail transport. Other linear systems are more loosely coupled. Assembly-line production is a linear process: it consists of several subassembly lines operating sequentially, and quite independently. A post office is also a linear, loosely coupled system. Loosely coupled but non-linear, more complex systems are institutions with multiple objectives, for instance the welfare system. Other examples are universities. A university is an institution in which interactions are complex, because they are often interdisciplinary and international. At the same time, actions within a university are not likely to trigger immediate, far-reaching consequences for the system as a whole; therefore, universities are called loosely coupled. Characteristic of loose coupling is the availability of much slack: there is time to prevent unwanted interactions. The last class of systems under consideration are complex and tightly coupled processes. These are systems with the complexity of a university, in which there is no slack and interactions are immediate.
Table 18. Different risk systems

Risk systems            Loosely coupled                    Tightly coupled
Linear interactions     assembly line, post office,        thermostat, dam, rail transport
                        single-objective service
Complex interactions    multiple-objective service,        nuclear power and weapons,
                        university, Welfare System         DNA technology
Beauchamp and Childress discuss the potential danger of complex systems. Their argument is that the effects of complex systems can never be evaluated in isolation, because of the possible synergy within the system.
There may be several elements of uncertainty, one of which may be the way technologies will combine and interact. Even though it may be possible to give reasonable estimates of the risk of a new technology by itself, society may be ignorant of how that technology may combine and interact with other technologies to produce unanticipated effects. For example, it may be uncertain which effects simultaneous exposure to several chemicals will have on individuals, because the interaction of the chemicals may produce synergistic rather than additive effects.
If a complex system does not allow for correction of unwanted synergistic effects and if the system has a catastrophic potential, the situation can become dangerous in unforeseen ways.
Nuclear energy has often been the subject of irrational fears. The truth of the matter is that in several respects nuclear power is superior to traditional forms of energy. It does not deplete the earth's natural resources of oil and gas. It is a `clean' form of energy --- if we disregard the issue of nuclear waste for a moment. It may even be possible in the future to build a nuclear reactor that is loosely coupled and has linear interactions. At present, however, nuclear reactors in the US are of two types, pressurized light water reactors (PWRs) and boiling water reactors (BWRs), and both are complex and tightly coupled systems. The experimental breeder reactors, such as the Fermi experiment, qualify as even more tightly coupled and more complex. Characteristic of such systems is that interactions are non-linear and that there is no slack to deal with unexpected situations. A few minutes into the Three Mile Island accident, three audible alarms were sounding and many of the 1,600 annunciator lights in the control room were on or blinking.
Despite its politically stabilizing effect in the sixties and seventies, the nuclear weapons arsenal has posed large threats to world safety. There have been numerous accidents involving nuclear weapons. Often these were simple "industrial failures," such as accidents during transport. In 1966 a bomb was accidentally dropped from a plane near Palomares, Spain, resulting in a very costly cleanup operation. North Carolina is today still where it was two decades ago only because one of the six safety devices on a twenty-four megaton bomb worked and prevented it from going off. These are simple failures: despite low probabilities of accidents, the sheer multitude of daily dealings with nuclear bombs results in numerous accidents. Besides transport, the storage of nuclear warheads is also subject to risk. Because warheads are kept near military bases with much air traffic, they are stored in underground silos, which have to be made resistant to earthquakes and other calamities. The safety devices surrounding nuclear weapons transform the system from an in principle linear technology into a complex process. Moreover, the clean-up operations of several antiquated nuclear weapons plants present the country with the inevitable bill of any nuclear high-risk operation. "Over the next 50 years, the Department of Energy estimates that $250 billion will have to be spent nationwide to clean up nuclear weapons production sites --- roughly the same amount that was spent over the last 50 years to produce the bombs." There are signs, however, that the American Congress is not willing to spend this amount of money, thereby taking another free ride on the risks that these crumbling facilities, with their aging pipelines and vast amounts of plutonium, pose. With Rocky Flats, built in 1951 and one of the "highest risk plutonium facilities," just fifteen miles from downtown Denver, this may be a "Disaster Waiting to Happen."
During the Cold War the two superpowers, the US and the USSR, were scrutinizing each other, "perhaps almost as narrowly as a man with a microscope might scrutinise the transient creatures that swarm and multiply in a drop of water." A comparison with H. G. Wells' book, The War of the Worlds, is particularly appropriate because both the US and the USSR feared a surprise attack from the other side, much in the same way that the Martians initially captured most of Britain. The US possesses several early-warning systems to indicate whether it is under attack by Russian missiles. Atmospheric disturbances, interfering signals, computer errors and accidents have caused inaccurate readings and mistaken action. In 1980 alone there were about 4,000 false alarms at the NORAD headquarters. In combination with a launch-on-warning system, this posed a real risk of starting an accidental nuclear war. Some critics estimated in 1986 (just before the relief of tension between East and West) that, at the false alarm rates of the day, an accidental nuclear war was to be expected within three to fifteen years. The early-warning system itself was not likely to improve its false alarm rates, because of its tight coupling and its complexity. The complexity was partly the result of the necessity to rely on computer decision making. The software can never be fully tested, and program specifications are written in ambiguous English. Moreover, an employee who overrides a computer and causes a problem is often fired, whereas an employee who follows faulty computer advice has a convenient scapegoat. There were cases in which an ambiguity in software led to a periodic false alarm. In another instance, there was a warning of a massive attack from the Russian side; it turned out that a test program running on an auxiliary computer had, somehow, come `on-line'. As it would take Russian submarines only eight to ten minutes to hit targets in the US, there was little time to correct mistakes in the decision process: within ten minutes it had to be determined whether the alarm was false or not. If it was decided that a real attack had been registered, US defense policy stipulated that B-52 bombers and missiles be on their way, irrevocably, before the enemy missiles hit targets in the US.
Risk assessment, statistical explanation, and complexity
Linear systems have linear causal connections between events. In a linear system there is no uncertainty about the causes of possible malfunctions. Repairing or preventing a mishap is a linear process without unexpected interactions. Uncertainty and risk enter the system only in predicting an accident: risk assessment consists in attaching probabilities to possible malfunctions. For instance, the belt of the manufacturing line may break once every four years, causing a temporary delay in production. The engine of the line is built to last fifteen years, after which it will be replaced, although there is a probability that it breaks down earlier. Every day of this manufacturing process can be considered as a lottery with causal explanations as prizes. If the production line stops and the motor is still running, then the belt must have broken, whereas if there is no humming sound, the engine must have died. When the event has happened, a certain causal explanation will be favored.
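The contrast can be made concrete in a few lines of Python. The following is a minimal sketch with invented failure rates: in a linear system, once the event has happened, each observable symptom maps to exactly one causal explanation, and probability attaches only to the prior question of which malfunction will occur.

    # Minimal sketch of an explanatory lottery in a LINEAR system.
    # The failure rates are invented for illustration.
    daily_probability = {
        "belt broke": 1 / (4 * 365),     # belt breaks about once every four years
        "engine died": 1 / (15 * 365),   # engine built to last fifteen years
    }

    # In a linear system each symptom points to a unique cause, so explanation
    # after the fact is a simple lookup.
    symptom_to_cause = {
        ("line stopped", "motor humming"): "belt broke",
        ("line stopped", "motor silent"): "engine died",
    }

    def explain(observation):
        """Return the unique causal explanation for what was observed."""
        return symptom_to_cause[observation]

    print(explain(("line stopped", "motor humming")))   # -> belt broke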
Figure 9. Explanatory lotteries
In complex systems the issue of explanation is more complicated than in linear systems (cf. Figure 9). Again the situation can be seen as a lottery, but now the prizes are not causal explanations; instead, the prizes are epistemic lotteries themselves. The following account of an aspect of the Three Mile Island accident shows the difficulty of explanation and the riskiness induced thereby. This additional risk, resulting from a lack of proper understanding, cannot be quantified in a meaningful way. In the control room of the Three Mile Island plant, one of the 1,600 indicators was supposed to show the temperature in the drain tank. If the temperature in that tank was up, then supposedly there had been a loss of coolant from the reactor core, a so-called LOCA, in which case the core might overheat and cause a nuclear meltdown.
[This] indicator showed the temperature of the drain tank; with hundreds of gallons of hot coolant spewing out and going to the drain tank, that temperature reading should be way up. It was indeed up. But there had been trouble with a leaky PORV [pilot-operated relief valve] for some weeks, meaning that there was always some [hot] coolant going through it, so it was usual for it to be higher than normal. It did shoot up at one point, [the operators] noted, but that was shortly after the PORV opened, and when it didn't come down fast that was comprehensible, because the pipe heats up and stays hot. "That hot?" a commissioner interrogating an operator asked, in effect. The operator replied, in effect, "Yes; if it were a LOCA [loss of coolant accident] I would expect it to be much higher." It was not the LOCA they were trained for on the simulators that are used for training sessions, since it had some coolant coming in through an emergency system, and some coming in through HPI [high pressure injection], which was only throttled back, not stopped.
The reading of the temperature was ambiguous. The reading was, in effect, an epistemic lottery in which the prizes were causal explanations for that reading. The reading most likely indicated that there was no loss of coolant, that the higher-than-usual readings were the result of a leaky valve, and that the peak was caused by the opening of the valve. It was less likely that there was a loss of coolant, because the temperature was not high enough. When given a choice, the operators acted on the most likely causal explanation, thus failing to note that there really had been a loss of coolant in the core, which created the disastrous Three Mile Island accident.
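The operators' predicament can be restated as a small Bayesian calculation (cf. Appendix A). The priors and likelihoods below are invented for illustration, not the actual Three Mile Island figures; the sketch only shows how an ambiguous reading becomes an epistemic lottery in which the most probable prize can be the wrong explanation.

    # Two competing explanations of one ambiguous temperature reading.
    # Priors and likelihoods are assumed, not the actual TMI figures.
    priors = {"leaky valve": 0.9, "LOCA": 0.1}           # prior plausibility
    likelihood = {"leaky valve": 0.7, "LOCA": 0.4}       # P(this reading | cause)

    evidence = sum(priors[c] * likelihood[c] for c in priors)
    posterior = {c: priors[c] * likelihood[c] / evidence for c in priors}

    best = max(posterior, key=posterior.get)
    print(posterior)            # {'leaky valve': ~0.94, 'LOCA': ~0.06}
    print("act on:", best)      # the most likely explanation -- here the wrong one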
We have shown in this section that the risks in complex, tightly-coupled systems are often invisible and difficult to quantify. Risk assessment often depends on assumptions of linearity and independence, which are generally not satisfied in these kinds of systems. This complicates any form of organizational risk management in complex systems. Dangerous situations do occur because explanations are epistemic lotteries and there is no opportunity to single out the true explanation.
Safety can be bought --- there is a price for everything. The question that we shall address is, "Will a free market provide a sufficient level of safety?" We shall show that there are reasons to believe it will not. Risk upsets a market and is therefore likely to produce non-optimal outcomes. Moreover, the temporal asymmetry of risk affects the distribution of consequences over future generations in ways that are not just. Consequently we shall ask, "Is government regulation of safety desirable, and if so, to what extent?" Although there are pitfalls to this kind of intervention, there are reasons of efficiency and justice that make regulation in our technological world desirable.
A perfect free market without externalities provides each citizen with a balanced bundle of goods, of which safety can be one. The individual buys some commodities and forgoes others in such a fashion that, under her present state of beliefs, her individual utility is optimal. If a person decides not to buy airbags for her car, it is because she judges that the extra cost of the airbags is not balanced by the increased utility of increased safety. A free market is sensitive to an individual's preference structure; the system creates niches for those who want to pay for additional safety measures. Safety can be traded like any other commodity, such as staplers.
In the previous chapter we have shown that ideal economic models have no moral value. The contractarian enterprise that attempts to deduce moral principles from bare rationality assumptions proved to be vain. An ideal free market cannot specify, for instance, the conditions under which the individual comes to state her preferences. Does she really know what she wants, and does she make her decisions voluntarily? These two conditions --- which bear a great resemblance to Aristotle's provisions for a moral action, to wit, being intentional (orexis) and voluntary (hekousios) --- are not relevant within the framework of a free market. Free market outcomes have no moral superiority.
In this chapter one of our aims is to show that the market for safety is not free at all. Misinformation and manipulation, such as advertising, upset a market. Here the difference between staplers and safety becomes apparent: whereas the public can hardly be misinformed about a stapler, this danger is very real for a safety measure. Because `nothing ever happened', the public may have no experience of the actual benefits of certain safety measures. In addition, our limited rationality causes misassessments of risks. Both aspects disrupt the free market process.
In the next section, we shall show that a risk is to some extent a market externality. Safety measures possess external benefits, which makes them, in general, unsuitable for free market distribution. As a result of our analysis of risk as a free ride, we shall argue in later sections that some form of governmental regulation should supplement market distribution and that this intervention should be inspired by appropriate principles of justice.
Risk, tradable good or externality?
... if its principal assumptions are satisfied, market forces will set job risk levels efficiently on a completely decentralized basis, involving no interference with the decisions by workers or firms.
Are the principal assumptions indeed satisfied? Is risk a tradable good or an externality? For risk to be a perfectly tradable good it should be completely transparent to producer and customer, to employer and employee, to the seller and the buyer of risk. In this section we shall present four arguments that risks are imperfect market goods. Briefly: risks are intransparent to the market, risk awareness is a public rather than an individual good, risks are subject to deception and distortion, and safety efforts tend to be swamped in general averages. Consequently, a market for safety will exhibit inefficiencies.
Risk can be considered a product. Risks can be sold by insuring oneself; the safer the car, the more we are willing to pay for it. On the other hand, risk is not a product like any other. First of all, a problem of risk is its epistemic and ontological intransparency. The assessment of risk information exhibits the same imperfections as that of any other uncertain event. Risk assessments are tainted by hindsight. In the face of risk and uncertainty people construct a model of anticipated reality, and new facts are interpreted in light of this model. Anticipated are those events that have a high prior probability. Conflicting information is, at first, discarded as misperceived or faulty; information is only imperfectly integrated in anticipation. The normal course of affairs obscures serious consideration of possible deviations from that path. Secondly, when we consider risk awareness as a form of useful information with economic value, it encounters the same problems as any other market for information. Information by its very nature is a public good: another party can acquire it without diminishing its productive value to the owner. "Since generating information is often a costly process, there can be a temptation to hold back from making the effort, in the hope of free riding." Thirdly, the market tends to give overly rosy estimates, because risk compensation depends on the risk taker's knowledge of the risk. Producers will try to make their products look safer than they really are, so that a higher price can be asked for them. Similarly, employers will obscure occupational hazards to cut down on risk compensation for their employees. Companies are willing to sell adverse information only for a high price. Finally, accurate risk information can generally only be collected with a reference class that is big enough. Often a whole branch of industry, rather than individual companies, is used to indicate safety levels. The risk information that becomes available to the potential customer or employee is therefore the average safety level of the product or job. Consequently, individual companies are not stimulated to provide up-to-date safety levels, because they have the possibility of a free ride on the safety measures of others. Moreover, any of their own safety efforts would be swamped in the general average.
In this section we shall argue that misinformation about risk levels and the epistemic and ontological invisibility of risk provide the opportunity for what economic theory calls a free ride. Free rides upset market interactions. Consequently, the safety standards and risk levels set by market processes are necessarily insufficient.
In economic theory the free rider problem stands for the free consumption of a certain good that has been paid for by others. Two possible consequences of the free rider problem are overconsumption and undersupply. A lighthouse at sea would benefit all ships passing by that shore. If a group of seamen decides to build a lighthouse, others will freely benefit from that service. They have a free ride. As a result lighthouses tend to be underprovided. Another example is public radio. With the reduction of governmental support of late, people are asked to make donations to enable public radio to broadcast programs. However, even those who do not contribute anything can `consume' the radio programs, if their fellow citizens make enough contributions.
In some instances the free rider problem can degenerate into a situation that Hardin called the Tragedy of the Commons. In his article of the same title, Hardin showed that commons tend to be overgrazed. If there is a piece of land on which n farmers can graze as many cattle as they wish, then a Pareto-optimal solution is for each farmer to graze N/n cattle, where N is the number of cattle that a single owner of the common would graze. However, it is for each farmer individually most profitable to graze as many cattle as possible, whatever the other farmers do. Thus the commons decline, to the disadvantage of all. Similarly, the consumption of fresh air is free, and as a result there has been an overconsumption of it in the form of air pollution, making living conditions in certain places on this earth almost impossible. For each car owner, the marginal cost of the air pollution she causes herself is less than the profit of driving the car. For a company, externalizing its costs in the form of air pollution is the cheapest way of getting rid of its waste. Consequently, the `fresh air commons' have been overgrazed.
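Hardin's argument can be sketched numerically. In the following Python fragment the yield function and all numbers are assumed for illustration only; the point is that, given what the others do, each farmer profits by exceeding the Pareto-optimal herd of N/n, yet if all reason this way the commons yield nothing.

    # Toy commons: N is the herd a sole owner would keep, n the number of farmers.
    N, n = 100, 10

    def payoff(own, others_total):
        per_cow = max(0.0, 2.0 - (own + others_total) / N)   # assumed declining yield
        return own * per_cow

    others = (n - 1) * (N // n)            # the other nine farmers graze N/n = 10 each
    fair = payoff(N // n, others)          # payoff at the Pareto-optimal point: 10.0
    best = max(range(2 * N), key=lambda c: payoff(c, others))

    print(best, payoff(best, others))      # 55 cattle yield 30.25: defection pays
    print(payoff(best, (n - 1) * best))    # but if all defect, the commons yield 0.0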
We want to discuss a variant of this phenomenon that we coin the free rider problem of risk. Taking a risk assumes the role of listening to public radio in this free rider problem, although there is one essential difference: whereas navigating at sea using a lighthouse or listening to public radio can be a truly free ride, risk will eventually demand that outstanding dues are paid. Often the negative effects --- for instance of intractable health hazards --- are paid for by all, not only by the `risk-polluter'. In that case the free rider problem of risk can turn into a tragedy of the commons.
The following example illustrates the free ride opportunities that risk provides. In 1942, the Hooker Electrochemical Corporation obtained permission to dump some of its chemical waste in Love Canal, a landfill on the bank of the Niagara River in Upstate New York. Instead of properly cleaning up its toxic waste, the Hooker Company decided to take a free ride by dumping it. "For corporations, it is intrinsic to their mode of operation to `externalize' all costs they can. Externalizing costs implies avoidance of responsibility for side effects; or, alternatively, passing the responsibility on down the chain of distribution to the customer, to the public at large, or to the government." The Hooker Corporation avoided the costs of properly disposing of its production waste and thereby took the risk that someday the chemical waste would become a problem. In 1978, after excessive rains and melting snow had penetrated the canal's cover, the rising water carried the toxic chemicals to the surface. As a consequence, disease spread among the local population of Love Canal.
There are two ways in which a free rider problem of risk can arise. First, the risk taker may intentionally take the risk and hide its magnitude. Market forces often tend to mask true levels of risk. "Much information that firms do provide is not intended to enable workers to assess risk more accurately. Rather, it is directed at lowering workers' assessments of the risk." Lower risk assessments mean lower risk premiums that firms have to pay their workers, because more workers will be willing to accept lower pay for a job that is perceived as relatively safe.
In the 1950s the US government was eager to push peaceful applications of atomic energy. The private utility companies were not very interested, as the production of atomic energy was more expensive than conventional methods. Large incentives were offered --- even threats were made. When the utility companies agreed to build nuclear reactors, the federal government had a design on hand for reactors in nuclear submarines. However, these reactors were compact, responsive, and had to be shut off to be refueled. None of these characteristics was especially useful for utility plants, which needed large reactors that generated constant power and would not need to be taken out of production. Scale adjustments and design changes were implemented without proper consideration for safety. "This unseemly haste has left us with a particularly complex and tightly coupled design, and a design that was assumed to be capable of being scaled up in size without any serious complications." The US government took a deliberate free ride, endangering its people in Harrisburg, Detroit, Diablo Canyon, Charlevoix, and many other places. Taking a risk in tightly coupled and complex systems such as nuclear power or aviation can have disastrous consequences.
Taking a risk is a refusal to pay for safety measures. It can be called a free ride if, as in many instances, a larger unit than the risk taker herself is put at risk. The clean-up operation after the core melt-down accident at the Fermi breeder reactor near Detroit in 1966 was a risky operation. The following account, based on a book by Charles Perrow, will serve as an illustration of how the risk of certain modern technologies pervades the homes of citizens today. The melt-down itself had been a close call. A report of the Atomic Energy Commission in the early sixties, which was immediately classified after publication, had predicted that in the case of a severe accident at Fermi, about 65,000 people would die and about 250,000 would receive radiation levels of over 150 rads. An accident of this magnitude indeed almost happened at Fermi in 1966. "[After the melt-down] for a month the reactor sat there while the company let it cool and planned the next step. Then the engineers very carefully removed the top and hoped that none of the fuel subassemblies were stuck together in such a way as to produce `criticality' (the conditions for fissioning). ... It took three months to learn that four were damaged, and two stuck together. It took five more months to remove them. ... During this time, and for many months afterwards, the reactor had to be constantly bathed in argon gas or nitrogen to make sure that the extremely volatile sodium coolant did not come into contact with any air or water; if it did, it would explode and could rupture the core." The clean-up operation went well, and a commentator jubilantly remarked that "much additional benefit was derived from the recovery operations ... not the least of these was the experience gained by the personnel directly involved." Perrow correctly perceives that "we may be very happy that these personnel had their experience increased, but unhappy that most of Detroit had to be at risk to secure the gain." The essential question is whether, in the light of the report by the Atomic Energy Commission published before the accident, operation of the Fermi reactor would still have been considered justified if something disastrous had indeed happened. This is the requirement of deliberative rationality as discussed in Chapter 2, and to which we shall return at the end of this chapter.
Intentional deception about risk levels is not the only way in which risks create a free ride and disrupt the proper functioning of the market. Secondly, the epistemic and ontological `invisibility' of risk creates another opportunity for the free rider problem to arise. We want to show how the nature of risk is responsible for structural safety problems by offering four cases in which the following principles are exemplified. First, certain health risks actualize only after many years; possible consequences of taking the risk can therefore have only a very limited impact on the decision process. Secondly, technologies whose risks are not well understood generally have a potential for a free ride. Thirdly, even if the risks or safety effects of certain practices are known, it is still possible that the market solution is flawed. And finally, the paradox of safety is a phenomenon that is likely to frustrate the safety effect of any safety measure introduced in the market place.
1. Postponed consequences of risk
The Occupational Safety and Health Administration (OSHA) inspects businesses for safety and health violations. Safety violations are those kinds of violations that pose immediate danger to the worker, whereas health violations are practices that potentially cause diseases, e.g., cancer, often only after ten or twenty years. Whereas safety violations are characterized by their visibility --- a bare electric cable, a loose screw, etc. --- health violations are in general much less visible. Partially for this reason OSHA has found four times fewer health violations than safety violations per inspection hour. As a consequence, "the well-known problems in monitoring health hazards may affect OSHA inspectors, who devote most of their efforts to identifying readily monitorable safety risks." The irony is that regulation is confronted with the same free ride opportunity that it tried to overcome: "The risks that market forces are perhaps least equipped to handle --- toxic and hazardous substances --- account [for only a small percentage of OSHA's list of violations]." The most invisible risks not only upset market compensation, they are equally difficult to locate and handle by means of regulation.
2. Unknown risks are positively selected
Technological risk often appears as a side-effect of a certain technology. "Technology is almost always introduced on the basis of substitution, as a new thing which does an old job better," however, "the new and the old are never truly congruent. While we pay attention at the outset to the overlapping parts of the substitution, what dominate in the long run are the unsuspected aspects of the substitution resulting from those features of the new that are different of the old." Many technologies have restructured our lives in unanticipated ways. The car was introduced as a replacement for other means of individual transport, but it also completely changed the use of space. In many cases the unforeseen aspects of a technology constituted or caused an unforeseen risk. Car accidents today are responsible for 40,000 deaths yearly in the United States.
It is our claim that there is a free ride potential in technologies whose risks are unknown, such as complex and tightly coupled technologies. "Market forces will tend to be biased toward overly risky technologies that are not well understood." We shall give a decision-theoretic foundation for the claim that unknown risks are often boosted over known ones. The conclusion is that without any government regulation market forces tend to increase risks; in particular, complex technologies with many unknown risks will be stimulated more than any of the market players would want.
For traditional energy production the risk levels are established on the basis of years of statistical data. Assume that the probability of at least one major accident per one million (mln) traditionally produced kilowatts is p = 0.5. Accident information over one period will not change the known risk levels of traditional energy production, because the relative frequency has been established over a long run. The risk levels of nuclear energy are not nearly as clear as those of the traditional energy providers. If for the first one million kilowatts a prior probability of 0.5 (a `uniform prior') is chosen, then there are two possible posterior probabilities: if a major accident happens during the first one million nuclear kilowatts, the posterior probability of a major accident during the second one million kilowatts increases to 2/3; if not, it decreases to 1/3. These results are summarized in Table 19.
Table 19. Unknown risks are positively selected

Type of energy           Probability of a major accident             Accident happens
production               during 1st mln KWs   during 2nd mln KWs     in 1st mln KWs?
traditional              1/2                  1/2 (played)           doesn't matter
nuclear (no accident)    1/2                  1/3 (played)           no
nuclear (accident)       1/2                  2/3 (avoided)          yes
This simplistic numerical example captures an important point: it is attractive to engage in unknown risks, because there exists the possibility to learn about them. If prior probabilities do not differ very much from traditional risk levels, taking a free ride on ignorance will be selected by market forces. In this example there is no difference between traditional and nuclear energy production in the first period, but choosing nuclear energy in the first period yields an informational edge in the second. If a major nuclear accident occurred in the first period, we have the option to quit in the second period and thus avoid the game with the 2/3 risk level. If nothing happened, the risk level has decreased to 1/3 and we would choose to continue with nuclear energy. As a result, society may end up with an irreversible nuclear power program, even if it later turns out not to be as safe as expected.
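The arithmetic behind Table 19 is a Laplace-style update under the uniform prior mentioned above. A minimal sketch, assuming a Beta(1,1) prior on the unknown accident probability of the new technology:

    # Posterior probability of a major accident, starting from a uniform
    # (Beta(1,1)) prior on the unknown accident rate of the new technology.
    alpha, beta = 1, 1                                   # uniform prior: mean 1/2

    def posterior_mean(accidents, periods):
        return (alpha + accidents) / (alpha + beta + periods)

    print(posterior_mean(0, 0))    # prior:                        1/2
    print(posterior_mean(0, 1))    # no accident in 1st mln KWs:   1/3 (played)
    print(posterior_mean(1, 1))    # accident in 1st mln KWs:      2/3 (avoided)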
3. Side-markers and known risks
In many instances risk levels can be estimated quite accurately. The impact of certain safety measures has been studied, and this information is available to the public. In car technology the use of headlights, side markers and seat belts has known risk-reducing effects. For each of these measures the individual can make an informed choice. Nonetheless, we shall show that the number of safety features provided by the free market is not sufficient.
A critic of governmental regulation of safety measures is Martin Wohl. His criticism stems from a belief in the normative superiority of the free market. A key concept of his moral capitalism is `marginal utility.' In a perfectly free market each individual will consume in such a fashion that the marginal utility of his consumption precisely balances the marginal disutility of having to pay for it. Indeed, if everything that an individual buys is consumed by that individual, then the free market will result in an equilibrium. Therefore, citizens will purchase side markers to the extent that the utility they provide outweighs the costs. If someone does not buy side markers, it is because their marginal utility is less than the marginal disutility of their costs, i.e., one would be less happy to have side markers than not to have them.
Any active participant in traffic knows that the side markers on other cars benefit not only the drivers of those cars, but also oneself. The safety effect of this measure is cumulative and affects the entire public. According to the economic theory of marginal utilities, each driver with side markers would be willing to put a certain value on other people having side markers too. However, in a perfect free market decisions are based only on individual marginal utilities; the positive marginal utilities of other people's side markers are externalities to the market. In the same way that we argued that lighthouses in a perfect free market would be underprovided, an argument can be made that in a free market side markers would not be provided to the extent that the public considers useful.
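A numerical sketch of this externality, with all valuations invented, shows how the market equilibrium diverges from the social optimum:

    # Side markers: private purchase decision vs. social accounting (numbers assumed).
    n = 1_000_000        # drivers on the road
    cost = 30.0          # price of a set of side markers
    v_own = 20.0         # value of one's own markers to oneself
    v_ext = 0.0001       # value each driver places on ONE other car having markers

    # Free market: each driver weighs only the private benefit against the cost.
    print(v_own > cost)                          # False: nobody buys

    # Social accounting: one car's markers also benefit the n-1 other drivers.
    print(v_own + (n - 1) * v_ext > cost)        # True (~120 > 30): underprovision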
4. The Paradox of Safety
A common argument in the risk-safety discussion is that future developments of technology will bring technological solutions to its own, current safety problems. Modern technology has indeed brought an enormous range of safety devices for almost any thinkable risk. Paradoxically, it has not brought society more safety. Deep, soothing voices in advertisements say that new model cars have been tested scientifically and that driving these cars is safer than ever; nevertheless, the death toll has remained approximately the same over the last three decades. Here the paradox of safety enters the picture. Generally, the effects of safety measures are tested under `everything equal' circumstances. However, safety devices change customers' risk attitudes, reducing, and sometimes annihilating, their net effectiveness.
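The mechanism is a matter of simple arithmetic. In the following sketch the engineering effect and the behavioral offset are invented numbers; the point is only that the net effect of a safety device can be negative once exposure adjusts:

    # Risk compensation: a device halves the hazard per mile, but the feeling
    # of safety increases the miles driven (all numbers assumed).
    hazard_per_mile = 1e-8
    miles = 10_000

    device_factor = 0.5      # engineering effect: hazard per mile halved
    exposure_factor = 2.2    # behavioral offset: 2.2 times as many miles driven

    risk_before = hazard_per_mile * miles
    risk_after = hazard_per_mile * device_factor * miles * exposure_factor
    print(risk_after > risk_before)    # True: the `safety' device increased total risk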
The paradox of safety has been greatly overlooked and is often not taken into consideration when designing safety measures. The reason is not difficult to grasp. In this technological age safety has generally been regarded as an engineering problem with economic constraints. The problem is thought to be of the type `How to make stronger, safer, economically viable products?'. What is overlooked in this formulation is the consequence of human interaction with the safety device. Again, this is no surprise: market forces react only to short-term concerns. Changes in risk attitude are mid-term effects --- only after sufficient time do they show up as a statistically significant number.
The general insensitivity of the free market to the real problem of changing risk attitudes calls for intervention. To be sure, the free market will provide safety measures to make up for our changed risk behavior, and there is no reason to stop that. On the other hand, these safety devices are generally aimed at lowering the odds and disutility of an accident, not at what is at the heart of the problem: the change in risk attitude. Actions to raise the awareness of possible dangers --- for instance, requiring companies to print nutrition information or the nicotine content of cigarettes on the package --- can in several instances be an effective safety measure; a safety measure that market forces do not select.
Concluding this section, we sum up the major results. New, complex and tightly coupled technologies with statistically inaccessible risks are ideal free rides that are positively selected by market forces. Also in less tightly coupled systems, such as car technology, free rides on unknown risks occur. Safety technology cannot be the only answer to the safety problems in society. The psychological dampening of safety due to changing risk attitudes should be taken into account in devising safety measures.
Within the market process itself there exists a method that internalizes the external aspects of risk: risk-taking is subject to self-regulation via insurance. Insurance is the leveler that translates risk-taking, i.e., future damages, into present fees. If an individual acts more riskily, this will result in a higher individual premium. That means that insurance can provide a market incentive to reduce risk and to increase safety. In the presence of insurance a free ride on risk seems impossible. However, reality deviates from this ideal; there are several reasons why the insurance market cannot completely provide the proper incentives. First, unless a mandatory insurance covers everyone, only bad risks tend to insure themselves. This increases premiums and subsequently makes it less attractive for those who are less at risk to take insurance. Secondly, misinformation about the risks that people face results in a less than optimal insurance market. If people have biased risk perceptions, they will insure themselves either insufficiently or unnecessarily. It could also be that insurers misassess risks; only if insurers have an accurate estimate of the risks involved can they determine proper premiums for individuals.
In order to determine the risks involved, insurers depend for their data on actualizations of risks. Accidents and hazards with small probabilities yield only limited data points. In the following example we shall look at a marine insurance case. For decades the number of ship accidents has been rising; even the accident rate per ton-mile has been increasing by more than five percent per year. The dollar loss of cargo, vessels, and property runs into the hundreds of billions per year, worldwide. Marine safety boards have made many recommendations to improve safety and reduce risk, but these efforts have not resulted in any decrease of accidents. Given the huge losses involved, it may seem surprising that owners have not made more of an effort to make their fleets safer. It may seem even more surprising that the premiums of ship insurance have not been an incentive to do so.
One reason why insurance is an imperfect market tool to improve safety is that premiums are often determined by average risk levels in a certain risk group, rather than by individual behavior. This situation bears a close resemblance to the Tragedy of the Commons. Marine insurers establish premiums for each individual ship. It is, however, difficult to determine a ship's performance record, because the probabilities of actual mishaps are generally very small: on average a ship goes down once in 180 years, which is six times its lifetime. Consequently, insurance rates for ships are generally not based on performance but on factors unrelated to risk behavior. If all vessels improved safety on board, the overall insurance premium would go down. However, for each individual company or owner it is not worth improving safety, because the insurance premiums are not based on it. As in the tragedy of the commons, it is more profitable for ship owners to improve productivity and to graze, so to say, the commons of risk, because there is no way that premiums can be tailor-made to the risk behavior of their ships and crews, and the increased losses of vessels and cargo are paid for by all. As a result, incidence rates have grown and losses have ballooned.
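The incentive failure can be quantified in a rough sketch; fleet size, losses and costs below are all invented. Under a pooled premium the owner who invests in safety recovers only a hundredth of the expected saving, while under ship-specific premiums the same investment would pay:

    # Pooled vs. ship-specific premiums (illustrative numbers only).
    fleet = 100
    p_loss, value = 1 / 180, 1_000_000     # one sinking per 180 ship-years
    safety_cost = 2_000                    # yearly cost of extra safety on one ship
    risk_cut = 0.5                         # extra safety halves that ship's p_loss

    def premium(safe_ships):
        expected = (safe_ships * p_loss * risk_cut
                    + (fleet - safe_ships) * p_loss) * value
        return expected / fleet            # pooled: everyone pays the fleet average

    pooled_saving = premium(0) - premium(1)          # ~ $28: swamped in the average
    individual_saving = p_loss * value * risk_cut    # ~ $2,778 with tailor-made premiums
    print(pooled_saving > safety_cost)               # False: safety does not pay
    print(individual_saving > safety_cost)           # True: it would, if premiums tracked behavior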
In other branches of society, insurance has been more successful in stimulating safety. People with dangerous hobbies generally pay higher premiums. Certain car insurance schemes charge fees according to the individual's safety record, which stimulates safer driving. On the other hand, most car insurance schemes have categories that are not determined by safety behavior but are based on variables with a statistically significant correlation to the number of accidents. The age group from 16 to 24 has a higher risk of being involved in car accidents; accordingly, drivers in this category pay higher insurance premiums. The disadvantage of this kind of premium policy is that it does not stimulate young drivers to drive more safely. They are stigmatized as a group of dangerous drivers, and no incentives are offered to drive more safely.
Two conditions need to be fulfilled for insurance to be effective in changing risky behavior. First of all, premiums should depend on the actual risk behavior of the individual, ship crew, or whoever the agent is. Secondly, adverse selection --- i.e., the tendency of only those at risk to insure themselves --- threatens the viability of any insurance program and serves as an important rationale for a form of mandatory insurance. These two conditions combined, risk-behavior-related premiums and mandatory insurance, increase the capacity of the insurance market to stimulate safety.
The conclusion is that insurance has the potential of making market forces account for risk behavior, but at the same time insurance can only be an imperfect tool. One reason is deeply rooted in the nature of probability. A probability is the infrequent appearance of a certain characteristic among an otherwise homogeneous group of individuals. If premiums are established on the basis of large averages, then they do not account for the actual safety performance of the individual, and they give an opportunity for a free ride. The insurance premium then gives risk takers no incentive to behave more safely, because the marginal `inconvenience' of safer behavior would not be compensated for by a lower premium. Secondly, the effectiveness of insurance would be greatly increased if it were obligatory and general. The issue of general and obligatory regulation brings us to the issue of the next section: government intervention.
In this section we shall deal systematically with the role of government intervention. We have seen that market forces are partially incapable of dealing effectively with safety issues. We shall also show that market forces cannot account for considerations of distributive justice. A market begs the question of what is a just distribution of risks within a society. Moreover, a market blatantly ignores the wealth and well-being of future generations. On each of these issues there is a potential role for a government to intervene. We shall review past regulations, evaluate their effectiveness, and show how they can be improved.
Some believe that government intervention is suspect in principle: regulation by government limits the supposedly inalienable freedom of the individual. This position is summed up by Viscusi:
If individuals were fully informed of the consequences of their decisions and made rational choices, then in a democratic society we should respect these choices.
We shall argue that this position is essentially flawed. If an individual commits a well-considered crime for his own benefit, then a democratic society will usually attempt to interfere and punish the individual. Viscusi's statement has to be amended in some serious ways. The converse of Viscusi's claim is less controversial: if individuals are not fully informed or do not make rational choices, then a democratic society has a rationale for governmental correction. Even libertarians agree that deficiencies of the market create a basis for government intervention. In the previous sections it has been our aim to show that such deficiencies indeed exist in the market for safety: risk possesses many characteristics that make it suitable for a free ride. However, beyond the problems of the safety market, there are other reasons that justify market adjustments. It is our claim that any market outcome is subject to considerations of justice. We shall recapitulate our considerations of distributive justice from Chapter 2 and indicate how they apply to government intervention.
The history of extensive safety intervention began three decades ago, when the Nixon and Ford administrations seriously explored the possibilities of government regulation. In the past three decades many agencies that deal with safety, such as OSHA, EPA, FAA, NRC, and NTSB, have come into existence. However, the evidence of the effectiveness of their regulation is not unambiguous.
There exists a wide variety of government intervention, ranging from providing information to zero-tolerance penalties. Libertarians are at one end of the spectrum, many special interest groups at the other. Martin Wohl, a libertarian, wishes to limit intervention to providing information in order to restore the natural free market process that has been interrupted by externalities, i.e., misconceptions among the customers about the actual risk of a certain good or activity. For libertarians the single largest externality in a free market is the limited rationality of the consumer. "Failure to account for these matters results in externalities, suggesting that government action of some sort is desirable so long as the total external and internal benefits outweigh the additional costs." To the extent that complete information can be achieved, they believe that the free market attains an optimal point.
However, we have demonstrated in previous sections that externalities go far beyond misinformation or the lack of information. We recapitulate some of them here. In complex, tightly coupled systems accurate risk assessments are simply impossible, and therefore no government could give accurate information about the risks involved. Technologies with unknown risks are positively selected by market processes, increasing the overall unsafety in society; dangerous complex and tightly coupled systems are selected precisely because the risks in such systems are unknown. In other situations safety devices are not selected, because the buyer is not the only consumer of the good: the market's quasi-equilibrium balances at a point where marginal costs do not match the overall marginal utility of the safety device. Sometimes, paradoxically, the presence of a safety device decreases overall safety, because it reduces the caution that the subject used to exhibit when she was aware of the risk. As an example we shall discuss the difficulty of government intervention in the presence of this safety paradox.
Paradox of safety: the need for caution
The safety paradox shows up in almost any instance where safety measures are implemented. The safety measure reduces the attentiveness of the people who deal with the risk in question, which in turn dampens the protective effect of the measure. As long as the net effect of the measure is positive, the investment in the safety measure may be justified; however, if the net effect becomes negative, the investment has been a waste of resources. Both governmental and free market decisions produce paradoxes of safety.
One instance involved the regulation of safety caps on medicine bottles. The Federal Government required that medicines be distributed only in bottles sealed by so-called "safety caps." The idea was to reduce the number of child poisonings by keeping dangerous medicines out of the hands of children. The safety caps indeed made it more difficult to open medicine bottles --- but not only for children. One of the immediate disadvantages of the regulation was that old people and persons with arthritis had trouble opening their medicine bottles. Consequently, some did not close the bottle after first use, which caused open-bottle poisoning. Another typical safety-paradox effect was that people who used to lock dangerous medicines safely away became more careless and gave children more opportunities to get at these medicines, because they reckoned themselves safe. Statistics suggest that the net effect of the safety cap regulation has been more casualties instead of fewer.
The safety paradox also appears in its mirror image. When Sweden switched in 1967 from driving on the left-hand side of the road to the right-hand side, experts expected the accident and fatality rates to skyrocket. However, no rise in accidents occurred; instead, the traffic fatality rate in Sweden dropped significantly in the year after the measure had been introduced. Apparently, the Swedish people recognized the imminent danger of switching this fundamental traffic rule and responded to that danger by driving more carefully, resulting in an overall reduction of risk.
When a government intends to regulate safety, it is not enough to come up with ingenious technological safety measures if it is not clear what behavioral effects they will have. Safety is not merely an engineering issue: one of the most effective safety devices is caution. Caution has psychological aspects to the extent that it is related to issues of risk awareness. We shall study this aspect in the next section, propose ways in which the regulation of risk information can be effective, and ask what the proper role of government is in matters where risks are unknown. The philosophical aspect of caution has to do with considerations of the present value of future events. This discussion, in the last section, will take up the issue of deliberative rationality again and submit both policy and economic considerations to the requirements of justice.
Providing Risk Information
In previous sections we have shown that the absence of risk information, in combination with market forces, stimulates unsafe behavior. We shall provide another example of this phenomenon in the hiring process of workers in high-risk industries. The aim of this example is to examine the need for a regulated provision of risk information. We shall then study the most efficient way to inform the public about the risks that threaten it.
There are many instances in which inaccurate risk perception leads to hazardous situations. We have considered several examples in which market forces tend to underestimate or simply ignore risks. In the Love Canal case the Hooker Electrochemical Corporation enjoyed a free ride on the risks it took: the risks the company posed to the local population were unknown and only came to the surface when the toxic substances did. In the next example we shall examine the hiring process in hazardous industries. New workers induce turnover costs for companies; these costs result from training the new employees and from the initial period of the new worker's non-productivity. Hazardous industries have the disadvantage, compared to conventional industries, that a larger fraction of workers will learn on the job that the risks are too high for them, and quit. Since industries minimize costs, hazardous industries tend to minimize these turnover costs. Sometimes they adopt technologies with unknown risks, or they paint a picture of safety that is too rosy. Both have the effect that more workers are willing to take the job and will be satisfied with a lower (initial) wage. The market tends to select misinformation, thereby deceiving the worker.
If workers know that they tend to be deceived by risk information, could a market for risk information develop? Savage has shown that information is an imperfect market good and that a market for information would face several ineliminable problems. If a future employee wanted to buy risk information about her future job, she would have no way of evaluating how valuable the information is to her, i.e., how much she should be willing to pay for it. Moreover, the company would be willing to sell unfavorable information only for a high price.
Government regulation is needed to compensate for these market defects. More risk information, however, will not necessarily help: "It may also be difficult to disclose what is known in a way that facilitates an informed choice." First of all, the information should be presented in a way that helps workers make informed decisions and, secondly, in a way that stimulates companies to implement safer technologies.
Governmental agencies provide statistics that can help people make more informed decisions. Often, safety statistics are shown as averages in a certain branch of industry: for instance, how automobile safety compares to aviation safety, or how the occupational risk of textile manufacturing compares to that of the chemical industry. With this information the traveler can make informed decisions about his plan of travel, and the worker can decide in which branch of industry it would be safer to work. However, one disadvantage of presenting the information in this way is that it does not discriminate between individual companies. In effect, it punishes the safest company within a branch with the overall higher risk rate of the other companies. Presenting risk information in averages does not reward those who act safely or punish those who do not, and thereby it stimulates an individual company to take a free ride on the safety of others.
This particular problem could be amended by presenting more accurate risk information. We propose regulation to introduce safety rankings. A job applicant could then consider the relative level of safety when applying. A limited number of categories, for instance four (far above average, above average, below average, far below average), would have clear advantages. If the applicant is presented the information in such a way, the knife cuts both ways. On the one hand, the worker can easily process this kind of risk information and can make better informed choices; free rides on unknown risks are reduced. On the other hand, the safest companies receive the reward of being able to reduce risk premiums, whereas the companies with the worst safety records will have to pay more to get new workers.
The form in which risk information is presented matters a great deal for the efficiency of that information. Simplicity is one of the most important requirements. Presenting risk information relative to a well-understood risk will likely increase its effectiveness. There are two aspects of a risk that have to be presented: a risk has a certain level of seriousness and a certain relative frequency. For instance, the magnitude and probability of the risk of smoking could be taken as an anchoring value. The risk of carcinogen X (or job Y) could then be listed as having a magnitude 1.14 times higher than that of lung cancer as a result of smoking, and a relative frequency 2.54 times lower. Anchoring new risks on old, well-understood risks is a promising way to provide risk information.
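A sketch of such an anchored presentation, with all figures hypothetical:

    # Anchoring a new risk on a familiar one (figures invented for illustration).
    smoking = {"severity": 100, "annual_rate": 1e-3}          # the anchor risk
    carcinogen_x = {"severity": 114, "annual_rate": 3.94e-4}  # the new risk

    rel_severity = carcinogen_x["severity"] / smoking["severity"]
    rel_rate = smoking["annual_rate"] / carcinogen_x["annual_rate"]
    print(f"carcinogen X: {rel_severity:.2f} times the magnitude of lung cancer "
          f"from smoking, {rel_rate:.2f} times less frequent")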
Other forms of regulation
Correcting the market process by providing risk information is not always possible. Sometimes the cost of provision is too high; sometimes risk assessment cannot assess the risks accurately. Viscusi says that "where there is a broad consensus on a rational course of action, however, and either the cost of providing information is high or individuals cannot process the information adequately, then mandatory requirements may be preferable to risk information efforts." We shall not give a detailed policy outline, but merely stipulate some principles for such a policy.
There is an important role for government intervention in systems that are tightly coupled and complex with a catastrophic potential. In these systems risks are inaccessible: the best risk assessment will do only a poor job in estimating the probability of a melt-down or an `accident' with genetically engineered material. Both nuclear energy and DNA engineering pose unknown risks to society. Modern expected utility theory is of no use, because the probabilities are practically unpredictable. If society is expected to make a decision that can count on a broad consensus, we propose the application of the rule of caution in these controversial systems. When the renaissance merchant had to decide, without proper risk information, whether to use one or several ships to transport her cargo, she reasoned that losing part of her cargo several times would hurt less than losing her whole freight at once. She applied the rule of caution and divided her cargo over several ships, because if one sank the others would probably be unharmed. When society faces a similar problem with tightly coupled, complex systems with a high catastrophic potential, it is wise not to take the risk, and to look for alternatives instead. A society may be capable of accepting the loss of a hundred persons every year, but it may not have the flexibility to deal with a sudden loss of 10,000 people once a century.
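The merchant's reasoning can be checked directly. In the sketch below the sinking probability is assumed, and the ships are taken to sink independently: dividing the cargo leaves the expected loss unchanged but makes a total loss dramatically less probable.

    # Rule of caution: one ship versus four, each sinking independently.
    p_sink = 0.1          # assumed probability that any one ship sinks
    cargo = 1.0

    for ships in (1, 4):
        # each ship carries cargo/ships, so the expected loss is the same...
        expected_loss = ships * p_sink * (cargo / ships)
        # ...but losing EVERYTHING requires all ships to sink at once
        p_total_loss = p_sink ** ships
        print(ships, expected_loss, p_total_loss)   # 1: 0.1, 0.1   4: 0.1, 0.0001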
Binmore seems to believe that this argument from caution is flawed. He argues that applying caution is a form of misplaced conservatism. In western democracies, the argument goes, risk reduction has often gone hand in hand with economic prosperity and development; if we curtail economic development, we may miss significant opportunities to reduce risks even further. Caution is, however, not the same as conservatism. Conservatism is being more pessimistic than what on average can be expected; caution is the belief that the average is not all that matters. Sometimes averages are inaccessible, sometimes they are irrelevant. The rule of caution should not be the only principle of governmental risk policy, but it is an important one.
There is a need for governmental provision of risk information in order to produce adequate risk awareness. The market for risk information is riddled with externalities, and intervention should be aimed at compensating for the deficiencies of the market. If regulatory effort is aimed at improving risk information, market processes will generally produce proper safety levels. Besides risk information there is an important role for direct risk regulation: cases of inaccessible risks or risks with a catastrophic potential demand the guidance of the rule of caution.
Distributive justice and regulation
... for the safety of our children and grandchildren.
This statement has become part of political rhetoric in the last decade. That does not guarantee, however, that the care of future generations has really been a serious concern of safety policies. Parents generally have empathic preferences relating to their direct offspring. Consequently, there may exist some genuine resistance against risks that may actualize after a delay of twenty to fifty years. Whether this resistance translates into safety measures depends on how urgent future consequences are regarded, on the role of discounting, and on the demands of rationality.
How should future generations --- or even one's future self, for that matter --- be regarded in decision-making processes? According to Rawls, the future is as relevant as the present. The requirement of deliberative rationality specifies that one should act in such a way that one will never feel regret for any possible outcome. Bernard Williams notes that, for Rawls, the temporal structure of moral relevance is like a rectangle: the future is worth the same at each instant. Deliberative rationality specifies that the present has no moral or utilitarian priority over future moments.
Parfit, inspired by game-theoretical considerations, has argued that the individual --- in so far as relevant for her actions --- exists only at the present moment. At the next moment this individual dissolves and gives way to a new individual with new preferences. The future exists only to the extent that it has been discounted in terms of present value. Concern for future generations has often been expressed in terms of the present valuation of future values. It has been argued, for instance, that, instead of cleaning up a hazardous waste site now, we can set $1 aside this year, and the future value of this $1 in two hundred years will be enough to clean up the site then. In market terms, the present and the future relate to each other as a triangle with its base in the present.
Figure 10. Value of the future
Parfit is an exponent of market thought and thereby clearly exhibits the deficiencies of the market insofar as it is used as a normative model of behavior. The discounting of future values is undesirable when it comes to health and safety issues. Viscusi explains that "if we want to leave future generations with an efficient level of safety and environmental quality, then we should place greater value on the benefits to these generations than we would on our current welfare because the increased value of health and environmental benefits that these future beneficiaries of today's policies will have." Market forces have the opposite effect: instead of attaching greater importance to the future, they show a diminished concern for it, because future values are discounted.
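The arithmetic that drives such present-value reasoning is plain compound interest; a minimal sketch, assuming a hypothetical 5 percent annual rate:

```python
# Compound-interest arithmetic behind the "set $1 aside now" argument.
# The 5% annual rate is a hypothetical assumption.
r, years = 0.05, 200

future_value = (1 + r) ** years
print(f"$1 invested at {r:.0%} for {years} years grows to ${future_value:,.0f}")
# -> about $17,293: why market reasoning makes a distant cleanup look cheap

present_value = 1 / (1 + r) ** years
print(f"$1 of harm {years} years hence is 'worth' ${present_value:.2e} today")
# -> about 5.8e-05 dollars: future health damage all but vanishes from the ledger
```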
There are several reasons why future generations have been left out of health and safety regulations. An important one is that the generations that come after our children and grandchildren are of no electoral concern. Elected politicians, even if they have recognized that market processes tend to underestimate the value of health and safety regulations for future generations, have no electoral impulse to be concerned about the remote offspring of their electorate. The same limitations that the market encounters constrain the ability of a democratic political system to act upon valid concerns for the future.
Despite the absence of immediate economic or political gain, the requirements of distributive justice demand that this generation not deprive future generations of equal opportunities. This principle of deliberative rationality treats all generations, including the present one, as equally important. In the face of uncertainty and risk, society should act like a prudent merchant: there is a lot at stake, and we do not want to lose it all. There is only one game of life, and it can be played only once. Therefore society should deal with risk in such a fashion that in the future it will not have to regret the decisions it makes right now.
Appendix A. Bayesian Calculations
In this appendix we shall show how to update prior probabilities according to a Bayesian scheme. We shall consider only estimation of the probability of a certain event $E$, $\pi = P(E)$. We assume that the prior distribution $\pi_0$ is a beta$(p, q)$ distribution.
How should a Bayesian choose the prior distribution $\pi_0$? This depends on two factors:
- What does she believe that $\pi$ is?
- How confident is she about her belief?
The answer to these two questions will yield two equations from which the Bayesian statistician is able to solve for $p$ and $q$. First, her prior belief about $\pi$ can be translated into either one of the following two statements:
- $E[\pi_0] = p/(p+q)$
- $\mathrm{mode}(\pi_0) = (p-1)/(p+q-2)$
Secondly, the Bayesian should translate her confidence in her prior estimate into the number of experiments that she believes her estimate is worth, i.e.,

$$\text{number of experiments that the prior estimate is worth} = (p-1) + (q-1).$$
In the case that nothing is known about $\pi$, Bayesians generally choose for $\pi_0$ a uniform$(0, 1)$ distribution, which is the same as a beta$(1, 1)$ distribution.
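A minimal sketch of this elicitation, using the prior-mean version of the first equation (the function name and example values are ours):

```python
# Sketch of the elicitation above, combining the prior-mean equation
# E[pi_0] = p/(p+q) with the "worth" equation (p-1) + (q-1) = w.

def beta_prior(prior_mean, worth):
    """Solve E[pi_0] = prior_mean and (p-1)+(q-1) = worth for (p, q)."""
    total = worth + 2          # since (p-1)+(q-1) = worth, p + q = worth + 2
    p = prior_mean * total     # from p/(p+q) = prior_mean
    q = total - p
    return p, q

# She believes pi is about 0.3, and holds that belief as firmly as if she
# had already seen 10 experiments:
print(beta_prior(0.3, 10))     # -> (3.6, 8.4)

# Knowing nothing: worth 0 and mean 1/2 recover the uniform beta(1, 1).
print(beta_prior(0.5, 0))      # -> (1.0, 1.0)
```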
One performs $n$ independent trials. The Bayesian statistician estimates $\pi = P(E)$ on the basis of the posterior distribution $\pi_n$, which is the updated version of $\pi_0$ given the data $X_1, X_2, \ldots, X_n$, where $X_i$ is the indicator function $1\{E\}$. Given $\pi$, the number of successes, $\sum_i X_i$, out of $n$ independent trials has a binomial$(n, \pi)$ distribution. Thus, the model density, $m_\pi(k)$, is given as:

$$m_\pi(k) = \binom{n}{k} \pi^k (1-\pi)^{n-k}.$$
After she has observed that $\sum_i X_i = k$, she proceeds to deduce the posterior density function, $b_n(\pi)$, of the distribution $\pi_n$:

$$\begin{aligned}
b_n(\pi) &= p\bigl(\pi \mid \textstyle\sum_i X_i = k\bigr) \\
&= C\, p\bigl(\textstyle\sum_i X_i = k \mid \pi\bigr)\, b_0(\pi) \\
&= C\, m_\pi(k)\, b_0(\pi) \\
&= D\, \pi^k (1-\pi)^{n-k}\, \pi^{p-1} (1-\pi)^{q-1} \\
&= D\, \pi^{k+p-1} (1-\pi)^{n-k+q-1},
\end{aligned}$$

where $C$ and $D$ are the appropriate normalizing constants and $b_0$ denotes the prior density of $\pi_0$.
Thus, the posterior distribution $\pi_n$ is a beta$(k+p, n-k+q)$ distribution.
Estimates of $\pi = P(E)$ are based on $\pi_n$, for instance,
- $\pi \approx E[\pi_n] = (k+p)/(n+p+q)$, or
- $\pi \approx \mathrm{mode}(\pi_n) = (k+p-1)/(n+p+q-2)$.
The advantage of the first estimate is that the estimator of the probability $\pi$ will share an important feature with a true probability: being a long-run average. However, if the distribution of $\pi$ is bimodal, then the average will not be a useful estimator, and the mode, i.e., a kind of maximum likelihood estimator, will serve that purpose better.
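The entire updating scheme derived above fits in a few lines; a minimal sketch (function names and example values are ours):

```python
# Sketch of the conjugate update derived above: a beta(p, q) prior combined
# with k successes in n trials yields a beta(k+p, n-k+q) posterior.

def update(p, q, k, n):
    """Posterior beta parameters after observing k successes in n trials."""
    return k + p, n - k + q

def estimates(p, q, k, n):
    """Posterior-mean and posterior-mode estimates of pi = P(E)."""
    mean = (k + p) / (n + p + q)
    mode = (k + p - 1) / (n + p + q - 2)
    return mean, mode

# Starting from the uniform beta(1, 1) prior, observe 7 successes in 10 trials:
print(update(1, 1, 7, 10))      # -> (8, 4): a beta(8, 4) posterior
print(estimates(1, 1, 7, 10))   # -> (0.666..., 0.7); note the mode equals k/n
```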
Appendix B. Sample Size and Type II Error in Carcinogen Studies
A certain disease has a background rate of $p_1$. We perform an experiment in which we expose $N$ people to a certain potential carcinogen and have a control group of $N$ people not exposed to the substance. We introduce the following two random variables:
$X$ = number of incidents of the disease among the $N$ people exposed
$Y$ = number of incidents of the disease among the $N$ people not exposed
As a measure of the toxicity of the substance we look at the number $T = X - Y$. If this number is large enough, we conclude that the substance is a carcinogen; if it is close to zero, then we accept that the substance poses no danger. In other words, the number $T$, the test statistic, is used to determine which of the following two hypotheses is true:
$H_0$: The substance is non-toxic
$H_1$: The substance is toxic
Before we can determine the truth of either of these hypotheses, we have to make some preliminary observations. First of all, we should ask ourselves what risk of making a mistake we are prepared to take. As we explained, there are two different kinds of mistakes and, therefore, two different kinds of risk levels. The first, the probability that we falsely regulate a non-toxic substance (i.e., we falsely ascribe toxicity to the substance), is usually kept low, normally at $\alpha = 0.05$. That means that we allow ourselves to be mistaken only once in twenty times when we regulate a certain substance. The second kind of error, the probability that we falsely assume that the substance is safe, is no less important --- especially for public health. The probability of this error is indicated by $\beta$. A level for $\beta$ has to be fixed prior to the experiment.
The next question is what is actually meant by toxicity. This matter is non-trivial. From a God's-eye point of view it is clear that a certain substance is toxic if the incidence rate of cancer associated with it is higher than $p_1$. With an epistemic curtain between us and the true parameter values, we humans have to fix a level of toxicity that we want to be able to discern. This level of toxicity is called the relative risk, $\delta$. It is defined as the number of times that a toxic substance increases the incidence rate of the disease:
$$\text{Relative risk} = \frac{\text{incidence rate among exposed}}{\text{incidence rate among non-exposed}}$$
The null hypothesis and the alternative hypothesis can now be translated into something statistically meaningful. As before, let the background rate of the disease be $p_1$ and let the incidence rate among those exposed to the substance be $p_2$.

$H_0$: $p_2 = p_1$
$H_1$: $p_2 > \delta p_1$
After specifying the values of $\alpha$, $\beta$ and $\delta$, the epidemiologist can determine the last variable before the beginning of the experiment: the sample size, $n$. The sample size is completely determined by the three values $\alpha$, $\beta$ and $\delta$. An asymptotic value is given in Walter, 1977:

$$n = \frac{\left( z_\alpha \sqrt{2\bar{p}\bar{q}} + z_\beta \sqrt{p_1 q_1 + p_2 q_2} \right)^2}{(p_2 - p_1)^2},$$

where $p_2 = \delta p_1$, $\bar{p} = (p_1 + p_2)/2$, $\bar{q} = 1 - \bar{p}$, $q_1 = 1 - p_1$, $q_2 = 1 - p_2$, and $z_\alpha$ and $z_\beta$ denote the upper $\alpha$ and $\beta$ quantiles of the standard normal distribution.
For relatively rare diseases, i.e., for $p_1$ small, $2\bar{p}\bar{q} \approx p_1 q_1 + p_2 q_2$ and the formula can be simplified to:

$$n = \frac{(z_\alpha + z_\beta)^2\, (p_1 q_1 + p_2 q_2)}{(p_2 - p_1)^2}.$$
If the circumstances do not allow a variable sample size $n$, and if besides a fixed relative risk $\delta$ the 95 percent rule is applied to the significance level $\alpha$, then the probability of the type II error becomes variable and follows from the previous equation:

$$\beta = 1 - \Phi\left( \sqrt{\frac{n\,(p_2 - p_1)^2}{p_1 q_1 + p_2 q_2}} - z_\alpha \right),$$

where $\Phi$ denotes the standard normal cumulative distribution function.
Example
An epidemiologist wants to find out whether a certain substance causes leukemia, a disease with a background rate of 1/10,000. The relative risk she aims to detect is at least five, i.e., $\delta = 5$. She does not want to make any concessions regarding scientific rigor and moral correctness, so she sets the probability of both possible inference errors very low: $\alpha = \beta = 0.01$. From these values she concludes that she needs a sample size of at least 81,399.
Figure 11. Clear identifiability of hypotheses if enough observations
For this sample size the two distributions (of the number of leukemia cases if the substance is and is not toxic) are distinct, and safe inference is possible (cf. Figure 11).
Considering the sample size, $n$, as a free variable is, however, unrealistic. A sample size of 2,000 is often a practical maximum for an observational study of human beings. Assume an epidemiologist fixes the sample size at $n = 2{,}150$ and, in accordance with the 95 percent rule, chooses $\alpha = 0.05$. She aims to detect relative risks of at least five, thus $\delta = 5$. From the above formula it then follows that $\beta = 0.8126$. Thus, in a practical setting where the sample size is limited, if one chooses the type I error in accordance with the traditional 95 percent rule, then in 81% of the studies of a substance that actually is a carcinogen this fact will go unnoticed.
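Both computations in this appendix can be checked directly; a minimal sketch using Python's standard library (we use the rounded table value $z_{0.01} \approx 2.33$, which reproduces the 81,399 of the first example; exact quantiles give roughly 81,144):

```python
# Sketch reproducing the two appendix computations from the simplified
# formulas; NormalDist (Python 3.8+) supplies the standard normal c.d.f.
import math
from statistics import NormalDist

Z = NormalDist()  # standard normal distribution

def sample_size(p1, delta, z_alpha, z_beta):
    """Simplified Walter (1977) sample size for detecting relative risk delta."""
    p2 = delta * p1
    variance_term = p1 * (1 - p1) + p2 * (1 - p2)
    return (z_alpha + z_beta) ** 2 * variance_term / (p2 - p1) ** 2

def type_ii_error(n, p1, delta, z_alpha):
    """Beta implied by a fixed sample size n and significance level alpha."""
    p2 = delta * p1
    variance_term = p1 * (1 - p1) + p2 * (1 - p2)
    z_beta = math.sqrt(n * (p2 - p1) ** 2 / variance_term) - z_alpha
    return 1 - Z.cdf(z_beta)

# Example 1: alpha = beta = 0.01, rounded table value z = 2.33.
print(math.ceil(sample_size(1 / 10_000, 5, 2.33, 2.33)))        # -> 81399

# Example 2: n = 2,150 and the 95 percent rule (z_0.05 = 1.645).
print(round(type_ii_error(2_150, 1 / 10_000, 5, 1.645), 4))     # -> 0.8126
```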
Bibliography

Adler, A. J., Wolff, P., ed., Philosophy of Law and Jurisprudence, Chicago: Encyclopaedia Britannica, Inc., 1961
Aratus Solensis, Callimachus --- Hymns and Epigrams. Lycophron; with an English translation by A. W. Mair. Aratus; with an English translation by G. R. Mair. Cambridge: Harvard University Press, 1970
Aratus Solensis, The Aratus ascribed to Germanicus Caesar, edited with an introduction, translation & commentary by D. B. Gain, London: The Athlone Press, 1976
Arbuthnott, J., An Argument for Divine Providence, taken from the Constant Regularity observ'd in the Births of Both Sexes, in Philosophical Transactions of the Royal Society, 27: 186-90, 1710
Aristotle, Metaphysics, translated by H. G. Apostle, Grinnell, Iowa: The Peripatetic Press, 1980
Aristotle, Nicomachean Ethics, translated by W. D. Ross, Oxford: Oxford University Press, 1992
Aristotle, Physics, translated by H. G. Apostle, Grinnell, Iowa: The Peripatetic Press, 1980
Baldwin, J. M., Dictionary of Philosophy and Psychology, Vol. I-II, Gloucester, Mass.: Peter Smith, 1960
Baron, J., Thinking and Deciding, Cambridge: Cambridge University Press, 1988
Beauchamp, T. L., Philosophical ethics, an introduction to moral philosophy, New York: McGraw-Hill, 1982
Beauchamp, T. L., Childress, J. F., Principles of Biomedical Ethics, 3rd ed., New York: Oxford University Press, 1989
Bell, C. R., Uncertain Outcomes, Lancaster: MTP Press Limited, 1979
Binmore, K., Game Theory and the Social Contract I: Playing Fair, Cambridge Mass.: MIT Press, 1994
Borges, J. L., "The Lottery In Babylon," in Labyrinths, Selected Stories & Other Writings, p. 30-5, New York: A New Directions Book, 1964
Brown, H. I., Rationality, London: Routledge, 1988
Carnap, R., Logical Foundations of Probability, Chicago: University of Chicago Press, 1962
Chronological Dictionary of Quotations, ed. E. Wright, London: Bloomsbury, 1993
Coates, J. F., "Why Government Must Make a Mess of Technological Risk Management," in Risk in the Technological Society, eds. Hohenemser and Kasperson
Cranor, C. F., "Some Moral Issues in Risk Assessment," in Ethics, Vol. 101, October 1990, No.1
Cranor, C. F., Regulating Toxic Substances, A Philosophy of Science and the Law, New York: Oxford University Press, 1993
Demchuk, A., Proceedings of the Conference on the Risk of Accidental Nuclear War, Vancouver, May 26-30, 1986.
Engelhardt, H. T., Jr., "Health Care Allocations: Responses to the Unjust, the Unfortunate, and the Undesirable", in Justice and Health Care, ed. Shelp, Boston: Kluwer, 1981
Fisher, R. A., "Student," Annals of Eugenics, 1939, 9, London: Cambridge University Press
Frankena, W. K., Ethics, 2nd ed., Englewood Cliffs, NJ: Prentice-Hall, Inc., 1973
Gataker, T., Of the nature and use of lots, second edition, London: John Haviland, 1627 [1619]
Gauthier, D., Morals by Agreement, Clarendon Press, Oxford, 1986
Gauthier, D., Sugden, R., Rationality, Justice and the Social Contract, Themes from Morals by Agreement, Ann Arbor: The University of Michigan Press, 1993
Gigerenzer et al., The Empire of Chance, how probability changed science and everyday life, Cambridge: Cambridge University Press, 1989
Gigerenzer, G., "The bounded rationality of probabilistic mental models," in Manktelow and Over, Rationality, psychological and philosophical perspectives, 1993
Harsanyi, J. C., Rational behavior and bargaining equilibrium in games and social situations, Cambridge: Cambridge University Press, 1977
Hardin, G., "The Tragedy of the Commons," in Science, 162:1243-1248, 1968
Hart, H. L. A., Punishment and Responsibility: Essays in the Philosophy of Law, Oxford: Clarendon Press, 1968
Hempel, C. G., Oppenheim, P., "Studies in the Logic of Explanation," in Philosophy of Science, XV (1948), pp. 135-75.
Hempel, C. G., Aspects of Scientific Explanation, New York: The Free Press, 1965
Heyd, D., When Practical Reason Plays Dice, pre-print, forthcoming paper.
Hohenemser, C., and Kasperson, J. X., Risk in the Technological Society, AAAS Selected Symposia Series 65, Westview Press, Inc.: Boulder, Colorado, 1982
Huizinga, J., Homo Ludens, A study of the play element in culture, New York: Harper & Row, 1970 [1944]
Johnson, G. D., Nussbaum, B. D., Patil, G. P., and Ross, N. P., "Innovative Statistical Mind Sets and Novel Observational Approaches to Meet the Challenges in the Management of Hazardous Waste Sites," in Challenges and Innovations in the Management of Hazardous Waste, R. A. Lewis and G. Subklew, eds. Air & Waste Management Association, Pittsburgh, PA, 1995, pp. 3-32.
Jonas, H., Technik, Medizin und Ethik --- Zur Praxis des Prinzips Verantwortung, Frankfurt am Main: Insel-Verlag, 1985
Kant, I., Fundamental Principles of the Metaphysics of Morals, translated by T. K. Abbott, London: Longmans Green, 1927
Kelsey et al., Methods in Observational Epidemiology, New York: Oxford University Press, 1989
Kemeny et al., The Need for Change: The Legacy of TMI, Washington D.C.: Government Printing Office, 1979
Kirschenmann, P. P., To Catch the Drunken Driver And Larger Responsibilities of Science and Technology, Technical Report Free University, Amsterdam, 1990
Le Cam, L., Yang, G. L., Asymptotics in Statistics, Some Basic Concepts, New York: Springer-Verlag, 1990
Lindley, D. V., Introduction to probability and statistics from a Bayesian viewpoint, New York: Cambridge University Press, 1965
Li, M., Vitanyi, P., An introduction to Kolmogorov complexity and its applications, New York: Springer-Verlag Inc., 1993
Manktelow, K. I. and Over, D. E., Rationality, psychological and philosophical perspectives, London: Routledge, 1993
Maugham, W. Somerset, The moon and sixpence, New York: Modern Library, 1919
Miller, J. W., Paradox of Cause and Other Essays, New York: W. W. Norton & Company, 1990
Nagel, T., Mortal questions, Cambridge: Cambridge University Press, 1979
Nussbaum, M. C., The fragility of goodness, luck and ethics in Greek tragedy and philosophy, Cambridge: Cambridge University Press, 1986
Novick, M. R., Jackson, P. H., Statistical Methods for Educational and Psychological Research, New York: McGraw-Hill, Inc., 1974
Nozick, R., Anarchy, State and Utopia, New York: Basic Books, 1974
Ott, L. R., An Introduction to Statistical Methods and Data Analysis, Belmont, CA: Duxbury Press, 1993
Owens, D. J., Causes and coincidences, Cambridge: Cambridge University Press, 1992.
Paley, W., Natural Theology, or Evidences of the Existence and Attributes of the Deity, Collected from the Appearances of Nature, London: R. Faulder and Son, 1807 [1802]
Palladino et al., Public Safety, A Growing Factor in Modern Design, Washington, D.C.: National Academy of Engineering, 1970
Peirce, C. S., Collected papers of Charles Sanders Peirce, Vol. II, ed. C. Hartshorne and P. Weiss, Cambridge, Mass.: Harvard University Press, 1933
Perrow, C., Normal Accidents, Living with High-risk Technologies, New York: Basic Books, 1984
Pratt, J. W., Raiffa, H., Schlaifer, R., Introduction to Statistical Decision Theory, Cambridge, Mass.: The MIT Press, 1995
Putnam, H., The meaning of the concept of probability in application to finite sequences, New York: Garland Pub., 1990
Rao, C. R., Statistics and Truth, Putting Chance to Work, Fairland, Maryland: International Co-operative Publishing House, 1989
Rao, C. R., Uncertainty, Statistics, and the Creation of New Knowledge, New York: Springer-Verlag, Inc., 1996
Rawls, J., A Theory of Justice, Cambridge, Massachusetts: The Belknap Press of Harvard University Press, 1971
Rescher, N., Distributive Justice, A Constructive Critique of the Utilitarian Theory of Distribution, Indianapolis: The Bobbs-Merrill Company, Inc., 1966
Rescher, N., Luck: the brilliant randomness of everyday life, New York: Farrar Straus Giroux, 1995
Richter, L. B., "Gaming: a benefit or a blight," in Christian Science Sentinel, Vol. 98, No. 13, Boston: The Christian Science Publishing Society
Ritter, J., Gründer, K., Historisches Wörterbuch der Philosophie, Band 8, Basel: Schwabe & CO AG Verlag, 1992
Ritter, J., Gründer, K., Historisches Wörterbuch der Philosophie, Band 9, Basel: Schwabe & CO AG Verlag, 1995
Rubinstein, A., "Perfect equilibrium in a bargaining model", Econometrica, vol. 50, pp. 94-100, 1982
Sagan, C., The Dragons of Eden, Speculations on the Evolution of Human Intelligence, New York: Ballantine Books, 1993 [1977]
Salmon, W. C., "Statistical Explanation," in Statistical Explanation and Statistical Relevance, ed. W. C. Salmon, London: The University of Pittsburgh Press, 1971, pp. 29-88
Salmon, W. C., Statistical Explanation and Statistical Relevance, London: The University of Pittsburgh Press, 1971
Savage, L. J., The Foundations of Statistics, New York: Wiley, 1954
Schwartländer, J., "Verantwortung", in H. Kring et al., Handbuch philosophischer Grundbegriffe, München, Bd. III, 1974, 1577-1588
Scott, R. L., Jr., "Fuel Melting Incident at the Fermi Reactor on October 5, 1966," in Nuclear Safety, 12:2 (March-April 1971): 123-34
Sen, A. K., Collective Choice and Social Welfare, San Francisco, California: Holden-Day, Inc., 1970
Simon, H., Models of Man, New York: Wiley, 1957
Smith, D. E., A Source Book in Mathematics, New York: McGraw-Hill Book Company, Inc., 1929
Sophocles, Oedipus Tyrannus, edited and translated H. Lloyd-Jones, Cambridge, Mass.: Harvard University Press, 1994
Spencer, H., The Data of Ethics, New York: D. Appleton & Co., 1879
Stegmüller, W., Probleme und Resultate der Wissenschaftstheorie und Analytischen Philosophie, Band IV, Berlin: Springer-Verlag, 1973
The New York Times, James Brooke, "Plutonium Stockpile Fosters Fears of `a Disaster Waiting to Happen'," December 11, 1996, A16
The New York Times, Gina Kolata, "Acrimony at Hearing on Revising Rules for Liver Transplants," December 11, 1996, A20
Tversky, A., Kahneman, D., "Evidential impact of base rates", in D. Kahneman, P. Slovic, and A. Tversky (eds.), Judgment under Uncertainty: Heuristics and Biases, Cambridge: Cambridge University Press, 1982
Urban, K., Osterkamp, S., Wider die Risikogläubigkeit, unpublished, 1995
U.S. Interagency Staff Group on Carcinogens, "Chemical Carcinogens: A Review of the Science and Its Associated Principles," Environmental Health Perspectives 67 [1986]: 201-82
U.S. Congress, Office of Technology Assessment (OTA), Cancer Risks: Assessing and Reducing Danger in Our Society, Boulder, CO: Westview, 1982, pp.3-31.
Viscusi, W. K., Risk by Choice, Regulating Health and Safety in the Workplace, Cambridge, Massachusetts: Harvard University Press, 1983
Viscusi, W. K., Fatal Tradeoffs, Public and Private Responsibilities for Risk, New York: Oxford University Press, 1992
Von Neumann, J., and Morgenstern, O., Theory of Games and Economic Behavior, Princeton, New Jersey: Princeton University Press, 1944
Waterstone, M., Risk and Society: The interaction of Science, Technology and Public Policy, Dordrecht: Kluwer Academic Publishers, 1992
Walter, S. D., "Determination of significant relative risks and optimal sampling procedures in prospective and retrospective comparative studies of various sizes", in American Journal of Epidemiology, 1977, Vol. 105, No. 4, The Johns Hopkins University School of Hygiene and Public Health
Webster's Collegiate Dictionary, seventh edition, 1963
Wells, H. G., The War of the Worlds, New York: The Berkeley Publishing Group, 1988
Williams, B., Moral Luck, Philosophical Papers 1973-1980, Cambridge: Cambridge University Press, 1981
Zedler, J. H., Grosses Vollständiges Universal-Lexikon, Band 37, Graz, Austria: Akademische Druck- u. Verlagsanstalt, 1962 [1743]