Two Argumentation-Based Decision Support Systems
Indeed, we now rely on computers even for certain forms of decision-making: so-called "knowledge-based expert systems", or "decision support systems", have been developed over the last thirty or so years to support, or sometimes even replace human decision-making (Alty and Coombs 1984; Buchanan and Shortliffe 1984). As the name suggests, expert systems attempt to automate (by means of a knowledge base and inference mechanisms) the knowledge and reasoning skills of experts in a given domain, such as medical diagnosis, marketing decisions, and so on. In this essay we are concerned with recent attempts to develop decision-support systems for processes of public policy argumentation. Like earlier expert systems, these argumentation-support systems incorporate both knowledge bases and inference mechanisms. Unlike the earlier systems, however, they place greater emphasis on the processes used for reasoning and inference than on the database of knowledge from which conclusions are drawn; thus we might designate them as "argumentation systems" in contrast to the earlier "knowledge systems". Dialectical approaches in particular have drawn attention in this developing area of Artificial Intelligence research.
Indeed, some commentators have recommended the employment of such systems, via the Internet, to enable democratic participation in public policy decision-making processes (e.g. Ess 1996). The kind of argumentation systems we examine here are noteworthy for their attempt to draw explicitly upon dialectical theories of argumentation as a framework for processes of public deliberation involving multiple parties and interests. The use of argumentation systems to assist public deliberation raises a number of conceptual and social-ethical questions that have yet to be fully addressed. On the one hand, argumentation systems provide a deliberative forum whose results can advise and, one hopes, improve the quality of decisions. If appropriately designed, such systems should be able to assist debate by tracking the various claims and arguments, by searching databases for relevant information, and by continually updating and assessing the overall state of the debate. They thereby help participants argue in a dialectically responsible manner, while offering them ample scope for modifying aspects of the system, such as the stock of inference rules and proof standards.
On the other hand, the highly qualitative and publicly controverted character of public policy argumentation poses serious questions about what would constitute an "appropriate design". In this paper we aim to clarify the criteria for an appropriately designed argumentation system. To focus and concretize the discussion, we start by describing, in Section 2, two current proposals for computer-mediated argumentation and decision-making: the Zeno system of Gordon and Karacapilidis and their colleagues at the German National Research Centre for Information Technology, and the Risk Agora of McBurney and Parsons. In Section 3 we elaborate the basic evaluative dimensions appropriate for the deliberative contexts in which such systems are to be used. In the subsequent sections we indicate how one might apply these deliberative criteria in assessing the two argumentation systems. As we shall see, the critical assessment of argumentation-support systems must draw upon an understanding of the discursive bases of responsible public legitimation and must acknowledge inherent limits on what formal design features can achieve for legitimation.
2. Two Argumentation-Based Decision Support Systems

(2.1) We begin by describing the two argumentation systems. The first is the Zeno system of the German National Research Centre for Information Technology (GMD) (Gordon and Karacapilidis 1997; Gordon et al. 1997; Karacapilidis et al. 1997). Zeno was developed to support decision-making in urban planning, as part of a larger European Community-funded project to develop innovative information systems infrastructure for public collaborative environmental planning.
In these domains, there are multiple interested parties, with diverse professional or private backgrounds, interests, preferences and viewpoints, and they are often geographically dispersed. Because the application domain of Zeno involves urban planning decisions, the system has to integrate information which is spatially indexed with information which is not. And because the users may be diverse and geographically distributed, the system requires intuitive and easy-to-use interfaces, provided, preferably, across an internet platform. Neither of these elements of software design was technically straightforward, but these issues will not concern us here. Our focus is on the argumentation and decision support elements of Zeno. The developers of Zeno define their system as "a mediation system": "a kind of computer-based discussion forum with particular support for argumentation.
In addition to the generic functions for viewing, browsing and responding to messages, a mediation system uses a formal model of argumentation to facilitate retrieval, to show and manage dependencies between arguments, to provide heuristic information focusing the discussion on solutions which appear most promising, and to assist human mediators in providing advice about the rights and obligations of the participants in formally regulated decision making procedures" (Gordon and Karacapilidis 1997, p. 10). The argumentation model used by Zeno is a formal version of the informal Issue-Based Information System (IBIS) model of Rittel and Webber (1973), modified for the urban-planning domain. The IBIS model identifies several atomic elements of a discourse: Issues, the topics about which a discussion is conducted; Positions, which express some statement relevant to an issue; and Arguments, which present statements in favour of or against particular Positions. Thus (using an example from Karacapilidis et al. 1997), an Issue may be: "Which site should the airport be located?"; Positions may then be statements designating different alternative sites, or groups of possible sites; and Arguments may be positive and/or negative attributes of these site alternatives and statements, such as: "Has easy access" or "Is private land". Each element of this model can be attached to each other element at any time, so that, for example, a new issue can be raised at any point in a discussion, thus creating a subsidiary discussion on the new issue. The IBIS model is well-suited for the display of a discussion as a hierarchical graph; the implementation of hypertext links enables a user to move easily around this graph from one thread of a discussion to another, or to access background data, supporting documents or contextual information, etc., associated with any element of the discussion. Among the objectives of the Zeno project is to enable snapshots of a debate: "One important goal is to provide easy access to the current state of the planning process, at any time" (Gordon et al. 1997, Section 2).

To provide for such snapshots, the designers of Zeno modified the IBIS model to permit the expression of preferences. Positions have the form of logical propositions but they do not have a context-independent truth status. Their meaning is defined by their role in a particular thread of discussion. Preferences are defined as particular types of positions with an internal structure that compares two (non-preference) positions. For example, two planning options for siting an airport, such as "public site" and "easy access", might enter into a preference that considers easy access as "more important than" a public site.
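The IBIS elements and the preference extension just described can be pictured as a small data structure. The following is only a minimal sketch under our own assumptions: the class names, the field names, and the reading of "consistent" as "the more-important-than relation contains no cycle" are illustrative choices, not taken from the Zeno publications.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Set


@dataclass
class Argument:
    text: str
    in_favour: bool  # True: supports the position; False: speaks against it


@dataclass
class Position:
    text: str
    arguments: List[Argument] = field(default_factory=list)


@dataclass
class Preference(Position):
    # A preference is itself a position, so it can be argued about;
    # internally it compares two non-preference positions.
    more_important: Optional[Position] = None
    less_important: Optional[Position] = None


@dataclass
class Issue:
    question: str
    positions: List[Position] = field(default_factory=list)


def preferences_consistent(prefs: List[Preference]) -> bool:
    """Hypothetical check: False if 'more important than' contains a cycle.

    Assumes every preference has both sides filled in.
    """
    graph: Dict[str, Set[str]] = {}
    for p in prefs:
        graph.setdefault(p.more_important.text, set()).add(p.less_important.text)

    def reaches(start: str, target: str, seen: Set[str]) -> bool:
        # Depth-first search for a path from start back to target.
        for nxt in graph.get(start, ()):
            if nxt == target:
                return True
            if nxt not in seen and reaches(nxt, target, seen | {nxt}):
                return True
        return False

    return not any(reaches(node, node, {node}) for node in graph)


# Usage with the airport example from the text.
easy = Position("easy access")
public = Position("public site")
pref = Preference("easy access is more important than a public site",
                  more_important=easy, less_important=public)
issue = Issue("Which site should the airport be located?", [easy, public, pref])
```

Because a preference is modelled as a kind of position, arguments can be attached to it in the same way as to any other position, which mirrors the point that preferences may themselves become the subject of discussion.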
This preference then constitutes a qualitative constraint, which may or may not be supported by further arguments (i.e., positions), and which may or may not be consistent with other constraints. These preferences and constraints, being positions, may themselves be the subject of discussion, via the articulation of arguments and the raising of further issues. Zeno provides users with an overview of the argumentative status of positions, preferences, and constraints. By considering the extent to which each position satisfies the articulated constraints, Zeno permits positions to be labeled as acceptable or not at any time in a discussion. These position-labels can then be aggregated in various ways to assign labels to Issues, and Zeno does this to indicate the extent to which their current argument support meets defined standards of proof. The Zeno developers argue that no set of proof-standards is applicable across all application domains, and so they adopted a set of five labels from the field of jurisprudence, namely: (1) Scintilla of Evidence; (2) Preponderance of Evidence; (3) No Better Alternative; (4) Best Choice; and (5) Beyond a Reasonable Doubt (Gordon and Karacapilidis 1997).
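To make the idea of labelling positions against a proof standard concrete, the sketch below applies four simplified standards to the alternative positions of one issue. The definitions used here (counting accepted arguments for and against, and comparing net support across alternatives) are our own hypothetical simplifications for illustration; they are not the formal definitions given by the Zeno developers, and the fifth standard is omitted.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Position:
    name: str
    pro: int  # arguments in favour that are currently accepted (assumed given)
    con: int  # arguments against that are currently accepted (assumed given)

    @property
    def net(self) -> int:
        return self.pro - self.con


# Each standard takes a position and all alternatives raised for the same issue.
Standard = Callable[[Position, List[Position]], bool]


def scintilla_of_evidence(p: Position, alternatives: List[Position]) -> bool:
    # Hypothetical reading: at least one accepted supporting argument.
    return p.pro >= 1


def preponderance_of_evidence(p: Position, alternatives: List[Position]) -> bool:
    # Hypothetical reading: support outweighs opposition.
    return p.pro > p.con


def no_better_alternative(p: Position, alternatives: List[Position]) -> bool:
    # Hypothetical reading: no other position has strictly higher net support.
    return all(other.net <= p.net for other in alternatives if other is not p)


def best_choice(p: Position, alternatives: List[Position]) -> bool:
    # Hypothetical reading: strictly higher net support than every alternative.
    return all(other.net < p.net for other in alternatives if other is not p)


def label_positions(positions: List[Position], standard: Standard) -> Dict[str, bool]:
    """Label each position as acceptable or not under the chosen standard."""
    return {p.name: standard(p, positions) for p in positions}


sites = [Position("site A", pro=2, con=1),
         Position("site B", pro=1, con=1),
         Position("site C", pro=0, con=0)]
snapshot = label_positions(sites, preponderance_of_evidence)
```

Re-running `label_positions` after each new contribution gives the kind of snapshot of the current state of a discussion that the designers aim for.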
Each label is provided with a formal definition in terms of the presence or absence of positions and arguments (although these definitions are not claimed to be instantiations of the legal definitions of these terms). With this argumentation structure, it is then straightforward for the system to present to a user the current status of a discussion and to show this as it changes. The designers of Zeno identify the generic speech acts involved in contributing to a discussion, for example, "Raise an issue", "Assert a position", "State a preference", etc. (Gordon and Karacapilidis 1997). However, they do not, in the work published to date, articulate a definitive list of such speech acts or the rules that govern their use. By contrast, our second example of an argumentation system, the Risk Agora of McBurney and Parsons (2000, 2001b, 2001c), is fully specified in this manner, and we now discuss this system.

(2.2) The Risk Agora has been proposed as a system to support deliberations over the potential health and environmental risks of new chemicals and substances, and the appropriate regulation of these substances.
Determining these issues typically involves two stages: first, a debate within the relevant scientific community over whether or not a significant correlation exists between the putative chemical cause and any observed health effects; and then, if a significant relationship is identified, a subsequent debate in the wider community over the consequences of alternative regulatory options. Initial development of the system (McBurney and Parsons 2000, 2001c) has focused on the first of these debates, the scientific debate, where the designers adopted explicit philosophies of science and of rational discourse. The philosophy of science they draw on is Pera's three-person game model of science, where progress is made through the work of, firstly, scientists undertaking experiments, whose experiments provoke reactions from, secondly, Nature, whose responses are in turn mediated through, thirdly, the scientific community (Pera 1994). Because the members of the scientific community are (presumed to be) rational and willing participants in the process, and because the assertions in these scientific discourses are all subject to contestation and defence, the authors adopt a philosophy of rational discourse for such debates. For this they draw on Habermas's theory of discourse ethics (Habermas 1983/1991), whose rules were first fully articulated by Alexy (1978/1990), and the principles of rational mutual inquiry of Hitchcock (1991). Within this framework, the authors then articulate the locutions and rules of a dialogue-game, in the style of Hamblin (1971) and MacKenzie (1979), specifying the pre-conditions necessary for the execution of each locution and the changes each locution effects.
The locutions of this game permit assertion, contestation and defence of propositions, modes of inference, assumptions and consequences, in what is essentially a persuasion dialogue (Walton and Krabbe 1995). Certain locutions incur requirements on the speaker to defend a statement if subsequently contested, and the formalism permits this to be in the form of an argument for the statement. Statements, assumptions, consequences and modes of inference may all be asserted with an attached modality, expressing the speaker's degree of confidence in the assertion. As in the Hamblin games, Commitment stores track assertions made in the course of the debate.
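The general shape of such a dialogue game (locutions with pre-conditions, effects on commitment stores, and an obligation to defend contested statements) can be sketched as follows. The locution names, the modality labels, and the specific pre-conditions below are hypothetical stand-ins chosen for illustration; the Risk Agora's actual locutions and rules are those specified by McBurney and Parsons and are not reproduced here.

```python
from typing import Dict, Iterable, Set, Tuple


class Dialogue:
    """Toy persuasion dialogue with per-speaker commitment stores."""

    def __init__(self) -> None:
        # speaker -> {statement: modality expressing degree of confidence}
        self.commitments: Dict[str, Dict[str, str]] = {}
        # (asserter, statement) pairs that still await a defence
        self.open_challenges: Set[Tuple[str, str]] = set()

    def assert_(self, speaker: str, statement: str, modality: str = "plausible") -> None:
        # Effect: the statement enters the speaker's commitment store.
        self.commitments.setdefault(speaker, {})[statement] = modality

    def contest(self, asserter: str, statement: str) -> None:
        # Pre-condition: only statements the asserter is committed to can be contested.
        if statement not in self.commitments.get(asserter, {}):
            raise ValueError("statement is not in the asserter's commitment store")
        # Effect: the asserter now owes a defence.
        self.open_challenges.add((asserter, statement))

    def defend(self, speaker: str, statement: str, premises: Iterable[str],
               modality: str = "plausible") -> None:
        # Pre-condition: a defence is only due once the statement has been contested.
        if (speaker, statement) not in self.open_challenges:
            raise ValueError("statement has not been contested")
        # Effect: the premises of the supporting argument become commitments too.
        for premise in premises:
            self.assert_(speaker, premise, modality)
        self.open_challenges.discard((speaker, statement))

    def retract(self, speaker: str, statement: str) -> None:
        # Effect: the commitment and any pending obligation to defend it are dropped.
        self.commitments.get(speaker, {}).pop(statement, None)
        self.open_challenges.discard((speaker, statement))
```

A short exchange might run: one participant asserts a claim with some modality, a second contests it, and the first either defends it with premises (which then become contestable commitments in turn) or retracts it.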
McBurney and Parsons have also begun the task of the detailed specification of a system to support discussion over regulatory options (McBurney and Parsons 2001b). In this work, they have drawn on Habermas's theory of communicative action (Habermas 1981/1984) to define types of locutions (i.e., speech acts) appropriate for such discussions. Because these discussions are about actions, their formalization as dialogue games requires models of deliberation dialogues (Walton and Krabbe 1995), which is the subject of another paper at this OSSA meeting (Hitchcock, McBurney and Parsons 2001). Thus, the work of specifying the Agora to support regulatory debates is on-going. Unlike the Zeno system, the Risk Agora is not intended to support debates in real-time, and the designers have also not yet developed the intuitive, graphical interfaces present in Zeno. Rather, the Agora is intended to formally model and represent debates in the risk domain, so as:

"1. To understand the logical implications of the scientific knowledge relating to the particular issue, and the arguments concerning the consequences and value-assignments of alternative regulatory options.
2. To consider the various arguments for and against a particular claim (including regulatory options), how these arguments relate to each other, their respective degrees of certainty, and their relative strengths and weaknesses.
3. To develop an overall case for a claim, combining all the arguments for it and against it.
4. To enable interested members of the public to gain an overview of the debate on an issue.
5. To support group deliberation on the issue, for example in Citizens' Panels.
6. To support risk assessment and regulatory determination by government regulatory agencies" (McBurney and Parsons 2001b).

Although the Risk Agora has different aims from Zeno, its designers also desire to enable snapshots of a debate to be taken at any time. This requirement has added importance in the risk domain, where regulatory decisions must be made even though a final determination of the state of scientific knowledge on a particular issue is not possible.
For this reason, and like Zeno, the Agora defines a set of labels which are attached to each claim on the basis of the arguments presented for and against the claim up to that time. Probable claims, for example, are those for which an argument has been presented in the Agora, but for which no rebutting arguments (arguments for the negation of the claim) or undercutting arguments (arguments for the negation of an assumption or an intermediate premise) have been presented. In this way, the dialectical status of a claim can be assessed at any time, thus providing a snapshot of the debate to that time. The designers of the Agora then examine the likelihood that a snapshot, taken at some finite time after commencement of a debate, is indicative of the longer-term state of the debate, assuming such a stable state is achieved. They show that the Agora has desirable properties when used for inference from finite snapshots to longer-term states in this way (McBurney and Parsons 2001c).
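The definition of a Probable claim given above can be turned into a small check over the arguments presented so far. This is a sketch under simplifying assumptions of our own: statements are plain strings, negation is marked by a "not " prefix, each argument lists its assumptions and intermediate premises together, and a claim loses the label if any one of its supporting arguments is undercut. None of these representational choices comes from the Agora's specification.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Argument:
    conclusion: str
    premises: Tuple[str, ...] = ()  # assumptions and intermediate premises


def negate(statement: str) -> str:
    # Hypothetical string encoding of negation.
    return statement[4:] if statement.startswith("not ") else "not " + statement


def is_probable(claim: str, presented: List[Argument]) -> bool:
    """True if the claim is argued for and neither rebutted nor undercut."""
    supporting = [a for a in presented if a.conclusion == claim]
    if not supporting:
        return False  # no argument has been presented for the claim
    concluded = {a.conclusion for a in presented}
    if negate(claim) in concluded:
        return False  # rebutted: someone argued for the negation of the claim
    undercut = any(negate(premise) in concluded
                   for a in supporting for premise in a.premises)
    return not undercut


def snapshot(claim: str, debate: List[Argument], time: int) -> bool:
    """Status of the claim given only the first `time` arguments of the debate."""
    return is_probable(claim, debate[:time])


debate = [
    Argument("X causes harm", ("study S is sound",)),
    Argument("not study S is sound", ("sample was too small",)),
]
```

Here `snapshot("X causes harm", debate, 1)` reports the claim as probable, while the same query at time 2 does not, because the second argument undercuts a premise of the first. This is the sense in which a label attached at a finite time may or may not reflect the longer-term state of the debate.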
3. Dimensions for Evaluating Argumentation Systems

Along what dimensions might one analyze and evaluate such systems? For software systems designed for well-defined, decomposable and measurable tasks, such as those for the production of bills for use of an electricity network, for example, assessment of system competence and quality is straightforward; standardized methods have been developed, are in widespread use, and influence good software design (e.g. Kirwan and Ainsworth 1993). However, as Parker (2000) observes, these methods are inapplicable for most decision support systems, since most decisions and decision-processes are not amenable to such reductionist task analysis. She further notes that no methods have been developed for decision support systems. Perhaps it is because decisions are not amenable to reductionist analysis that relatively little attention has been paid in the Artificial Intelligence community to the question of the quality of decision support systems.
Groothuis and Svensson, in research presented as recently as December 2000 and investigating the quality of computer-supported welfare assistance decisions in the Netherlands, say: "To our knowledge this is the first investigation into the extensive use of expert systems in the daily practice of handling a very complex administrative task" (Groothuis and Svensson 2000, p. 9). It is true that designers of what we have termed knowledge systems - expert systems that encode some body of expertise - will typically compare performance of the system against a group of human experts through a number of test cases. However, such comparisons are fraught with difficulties. In some domains, what counts as "expertise" may be open to dispute, or subject to cultural and contextual variation. In other domains, normative decision theory (e.g., Lindley 1985) does not reflect the decision methods people actually use. Although one may simply take this as an illustration of the deficiency of human decision-makers, in some cases one may not have any means of assessing the normative methods as superior.
Moreover, normative decision theories have tended to focus only on those elements that are quantifiable, and so may ignore much that is salient to good decision-making (cf. Rehg 1997a). Assessing the quality of advice-giving systems poses further specific difficulties. If advice is not taken, is it necessarily of low quality or unhelpful to the decision-maker?
And how does one assess the quality of advice if the world changes in a salient way between the giving of the advice and its execution? Still further difficulties are raised by the fact that advice is often linked with certain contextual limitations or caveats, which users sometimes ignore. How does one assess the quality of advice in such cases? Finally, how does one assess advice for extreme situations or rare events? Beneath the foregoing methods of assessment and their vicissitudes lurks a shared assumption, namely that there is a correct or right or "true" decision to be reached by following a set of context-independent (albeit domain-specific) inference rules. However, if assessments based on this straightforward model run into difficulties for knowledge systems, we should expect even greater difficulties when we attempt to evaluate argumentation systems for public deliberative domains that involve multiple interested parties and highly complex value-laden issues.
Decisions over urban planning policies, for example, may have no inherent or independently accessible truth or correctness, since any viewpoint from which we may judge them will invariably be partial and never completely disinterested; only the process used to reach the decision can tell us whether or not it deserves to stand (Forester 1999; Bohman 1996). Although the intended applications are different, both Zeno and the Risk Agora seek to support deliberative decision-making by multiple participants in a public policy domain. Given the difficulties we have just sketched, any evaluation of argumentation systems such as Zeno or the Agora must begin with a more sophisticated understanding of the relevant kind of decision-making at issue: specifically, assessment criteria must draw not only upon argumentation theory but also on an understanding of public policy formation. Our analysis thus takes as its point of departure Forester's deliberative approach (1999, chap. 6) to mediated public policy formation and dispute resolution. The key to this approach lies in its understanding of deliberation as a process that is both transformative and reasonable, and thus able to generate legitimate decisions, which is to say: rationally and publicly acceptable decisions.
There are four major criterial considerations or dimensions that issue from a deliberative approach. In this section we elucidate these dimensions, before applying them to Zeno and the Agora in subsequent sections.

(3.1) A deliberative model of policy formation contrasts with the conventional pluralist model, which conceives public policy questions as matters for negotiation and bargaining. Deliberative approaches deny that all public issues and choices reduce to bargaining (Bohman and Rehg 1997). Although decisions over the division of scarce resources, or involving the conflict of particular (non-generalizable) interests, are typically resolved by resort to negotiations or bargaining, decisions over what actions to take in some circumstance require deliberations (cf. Walton and Krabbe 1995; Habermas 1996, chap. 4). According to Forester, mediators should approach public planning and policy formation as deliberations. This approach has important implications for what one expects of both the process and its possible outcomes. As a number of democratic theorists have argued (e.g., Elster 1986; Michelman 1988), deliberative political processes, unlike negotiations, require participants to adopt a civic standpoint oriented toward the transformation of individual preferences and interests in the direction of reaching agreement on a common good or general interest.
Forester (1999, p. 184) calls this "the self-transformative condition". Whereas participants in negotiations enter the decision-making process with their preferences fully formed and aim to achieve a compromise position that balances competing preferences, participants in deliberations may learn from each other and even from the very fact of interacting. As Michelman elsewhere defined it: "Deliberation... refers to a certain attitude toward social cooperation, namely, that of openness to persuasion by reasons referring to the claims of others as well as one's own. The deliberative medium is a good faith exchange of views - including participants' reports of their own understanding of their respective vital interests -... in which a vote, if any vote is taken, represents a pooling of judgments" (Michelman 1989, p. 293). Because of this openness to change, participants in a deliberation - assuming they are rational - should be willing to share information, a strategy that may not be in the self-interest of participants in a negotiation. To the extent that participants are unwilling to share their knowledge and preferences, we may consider the decision-process to be a negotiation rather than a deliberation.
Within this deliberative approach we identify three further dimensions of analysis. To begin with, we want self-transformation to be reasonable: if participants change their views in response to others' input, then these changes should lead toward a substantively better outcome. Thus we must examine the properties of topic-specific considerations stemming from the particular question or dispute, which determine the adequacy conditions for high quality outcomes (3.2). However, as we shall see, substantive quality cannot be assessed independently of considerations bearing on the participants and their roles (3.3), which in turn point to difficult questions concerning the process itself and its "vulnerability to insinuations of power" (Forester 1999, p. 177) (3.4).

(3.2) For any particular dispute resolution or policy question, one can identify specific considerations tied to the particular issue or dispute at stake, and an adequate deliberative outcome must, presumably, take all these relevant considerations (or at the least, those which are most pressing or salient) into account. An environmental dispute, for example, will typically turn on particular scientific facts about the ecosystem in question, fallible prognostications about the impact of different alternative actions on that ecosystem, and economic assessments about the costs and benefits of different options for different affected parties.
In addition, one can also expect that the parties will disagree over quality-of-life issues, fundamental goods and values (aesthetic beauty vs. economic growth, for example), and even, at a deeper level, over basic ontologies and worldviews (Kriesberg et al. 1989; Rehg 1999). How well a deliberation addresses the range of relevant considerations determines the quality of its outcome. In effect, the reasonable processing of all relevant considerations (the pertinent questions and objections) determines what counts as an adequate outcome. The idea of relevance obviously plays a crucial role in this context, but what does it involve? Although the conception of argumentative relevance is very much an open research question (Johnson 2000, pp. 199-204), the factual dimensions of policy argument suggest, on the one hand, a process-independent conception: a consideration is relevant if failure to take it into account in a deliberation can lead to an unsuccessful policy.
On the other hand, relevance is also partly internal to the deliberative process: at least some kinds of considerations, such as interests and values, must be relevant for the participants if they are to be relevant at all. This twofold conception of relevance points toward a more precise characterization of the substantive quality of an outcome or decision. Note, to begin with, that substantive quality should probably not be reduced to "truth". Rather, given the complexity of the various relevant considerations, one does better to characterize substantive adequacy in terms of the rational legitimacy and subsequent success of a policy choice (cf. Habermas 1996; Rehg 1997b). The standard of success stems from the external aspect of relevance.
Because policy decisions partly rest on factual assumptions, policies can, in some sense, fail to be "correct", given the way the world actually is. An environmental dispute resolution may rest on false scientific assumptions, or it may depend on an economic forecast that proves mistaken. To this extent, the renewed interest of argumentation theorists in truth is not entirely misplaced (Goldman 1999, chap. 5; Johnson 2000). But in policy-making contexts, factual claims intertwine with other sorts of reasons, for which the predicates of "true" versus "false" may be less appropriate than something like "justified" (and thus reasonable) versus "arbitrary". Consequently, what matters is that, in taking all the relevant considerations into account, participants construct and reconstruct their viewpoints in a manner that is not only logically consistent but, more importantly, dialectically responsible, or "responsive" (Goldman 1994).
To put this idea in a nutshell, participants hold a dialectically responsible position insofar as they have addressed all the relevant questions and objections and thereby reached the most plausible outcome relative to the alternatives. One may, to be sure, link dialectically responsible positions with a broadened notion of truth, for example along the lines of Rescher's dialectical account of plausible reasoning or Hintikka's game-theoretic semantics (Rescher 1976; 1977; Hintikka, 1968). According to such approaches, claims that hold up in a process of dialectical reasoning as more plausible, or that have the support of a winning strategy for the given argumentation system, enjoy a (defeasible) presumption of truth, relative to the appropriate burden of proof. Arguably, this concept of "truth" is applicable to many, if not most, decisions in the public policy domain, including the application domains for both Zeno and the Risk Agora. In any case, the internal aspect of relevance is closely linked with dialectical responsibility, and thereby with rational legitimacy. For unless the relevant considerations are addressed to the satisfaction of the affected parties, policy decisions are not likely to be considered legitimate.
Inasmuch as deficits in legitimacy tend to have a negative effect on compliance with decisions, the dialectical quality of a deliberative outcome is also crucial for its successful implementation (see Lind and Tyler 1988). Thus, the substantive quality of the deliberation (its adequacy in taking all the relevant considerations into account) naturally leads into the question of participant inclusiveness: how well a process has given the various affected parties the chance to voice their concerns and affect the outcome. This takes us into the third and fourth dimensions of evaluation.

(3.3) To address the issue of inclusive participation, we must characterize the involved parties and their roles. Here, we may first ask how many people are involved in the decision-process being supported by the computer system; the number involved may influence the model of argumentation that is appropriate. As Forester's case studies (1999) make clear, public policy deliberations may involve numerous parties with quite different perspectives and interests.
Including all the affected parties and giving them voice in the deliberation is crucial to the legitimacy and consequent success of such deliberations. But evaluating argumentation systems for overt or "formal" exclusions (exclusions that are explicit in the distribution of roles and entitlements) only goes part way. More important are those subtle forms of exclusion and coercion that might be built into the design itself as a system of rules. This issue brings us to the role of the system in deliberation, in particular the power of the system in fostering or impeding the expression and reasonable processing of information and viewpoints, an issue we take up under the fourth evaluative dimension.
(3.4) A legitimate deliberative process must not only grant all the affected parties entrance but also give them effective opportunities to voice their opinions and, still further, foster the participants' capacity for learning from one another. Only in this way will the outcome represent a generally acceptable resolution of the dispute or question at issue. This dimension of evaluation thus requires us to examine how well a deliberative design, precisely as an interaction, fosters a collective, rational learning process, such that reasonable and generally acceptable, and thus legitimate, outcomes are more likely. In particular, we must scrutinize deliberation for subtle forms of exclusion and for potential obstructions in the participants' capacity for learning from one another. This dimension of analysis thus complements and goes beyond the dimensions of substantive quality and overt (or formal) participant inclusiveness. In other words, we are concerned here above all with the freedom that participants have to express their views and learn from one another in a reasonable manner.
Thus, the fourth dimension of analysis scrutinizes deliberative processes for their freedom from domination or coercion, whether the coercion issues from external pressures that inhibit the exchange of views or from internal psychological mechanisms. Forester (1999, p. 184) terms this the "non-coercive condition" on deliberative participation. We shall thus have to examine the two argumentation systems for the kinds of power they exert on the public deliberation they assist. If such systems in some sense "mediate" or at least facilitate deliberation, then they wield considerable power over the process and its outcome. As Forester (1999, p. 180) points out: "If parties to public dispute-resolution processes not only construct agreements but reconstruct themselves - in part as a result of being exposed to new information, in part as a result of the constellation of participants - then the political significance and power of the mediator-facilitator's role is more important to understand than ever before."
4. Deliberative Self-Transformation
We now turn to a brief assessment of each of the two systems, Zeno and the Risk Agora, in terms of the four categories sketched above in Section 3. Our intention in these remaining sections is not to provide an exhaustive assessment, but simply to sketch some of the more obvious moves, as an illustration of the basic idea. The first dimension of evaluation concerned the self-transformative character of deliberative processes. Here we note briefly that, at least on the surface, or in terms of their formal design features, both argumentation systems are designed to allow for deliberative self-transformation and not for bargaining or negotiation. Because the Risk Agora is not intended to support real-time debates, its "participants" may in fact be representatives of positions, rather than real persons. However, the system does allow retraction of claims previously asserted or accepted, and thus permits self-transformation.
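To make the idea of retraction concrete, the following is a minimal sketch of a per-participant record of commitments that can be asserted and later withdrawn. It is an illustration only: the class name `CommitmentStore`, its methods, and the example claim are hypothetical and are not taken from the Risk Agora or Zeno publications, whose actual dialogue rules are richer than this.

```python
from dataclasses import dataclass, field


@dataclass
class CommitmentStore:
    """Hypothetical record of one participant's current public commitments."""
    participant: str
    claims: set[str] = field(default_factory=set)
    # Every move is logged, so a retraction revises the position
    # without erasing the record of what was once asserted.
    history: list[tuple[str, str]] = field(default_factory=list)

    def assert_claim(self, claim: str) -> None:
        """Commit the participant to a claim and log the move."""
        self.claims.add(claim)
        self.history.append(("assert", claim))

    def retract(self, claim: str) -> bool:
        """Withdraw a claim; return False if it was never a commitment."""
        if claim not in self.claims:
            return False
        self.claims.remove(claim)
        self.history.append(("retract", claim))
        return True


# Illustrative use: a participant changes position during the debate.
store = CommitmentStore("participant_a")
store.assert_claim("substance X is safe at low doses")
store.retract("substance X is safe at low doses")
```

The design choice worth noting is that the current commitments and the history are kept separately: the participant is free to revise a view, while the debate as a whole remains reconstructible.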
For Zeno, the paper describing the model of decision-making used in the domain explicitly permits revision of constraints so as to eliminate any inconsistencies among them (Gordon et al. 1997). However, because precise dialogue rules have not been presented, it is not clear how this revision process is undertaken. It is also unclear whether participants may revise their statements of other preferences, positions or arguments, as there do not appear to be specific locutions for retraction of previous statements.
5. Substantive Quality
The analysis of substantive inclusiveness developed in Section 3.2 suggests that, to begin with, one should assess the substantive quality of argumentation systems for public deliberation by their capacity to include all the relevant considerations, that is, the relevant factual considerations, empirical prognostications, and values.
On the one hand, the external aspect of relevance suggests that the system should be able to incorporate all the factual information that would be relevant to a successful policy, insofar as success depends on the truth or accuracy of certain assumptions about the world and the persons affected by the policy. Concretely, this means that the system should be able to draw upon the relevant expertise and make this available to participants. On the other hand, as we have seen, such expertise is not the only source of relevant information, nor is it usually decisive for a legitimate outcome. Because relevance also has an internal (i.e., participant-relative) sense, substantive quality requires that the value-orientations, interests, and other considerations that the participants themselves consider relevant should be taken into account by the argumentation system, and thus opened up to public discussion. Again, the formal design features of both systems, Zeno and the Risk Agora, seem to allow for such substantive inclusiveness. Zeno does not seem to restrict the kind of information that can count as a "position", which can include both facts and preferences, and thus it can incorporate, at least indirectly, value-orientations.
To be sure, Zeno may prove to be limited in its ability to distinguish between mere preferences and deeper values, a limitation that may be a problem in some contexts. Here the Risk Agora goes further, allowing for assertions of various types of speech acts (factual claims, claims about what is right or valuable, and so on; see McBurney and Parsons 2001b). Secondly, substantive quality also requires that the relevant information and arguments be processed in a dialectically responsible manner. For systems that support one or more human participants in constructing, evaluating, contesting and defending arguments, we may propose the following list of evaluative questions (building on Verheij 1999, p. 43): Does the system track the issues raised? Does it track the assumptions made and the reasons adduced? Does it track the conclusions drawn and the counterarguments adduced?
Does it track the justification status of the statements made and the commitments incurred? Does it verify that users obey the pertaining rules of argument? Does it identify omissions and weaknesses in arguments? Does it identify counter-arguments?
Both Zeno and the Agora meet most of these requirements. The exceptions are: it is not clear whether and how Zeno handles retraction or revision, as was mentioned above; Zeno tracks but does not prohibit violations of its rules, so as to permit the greatest possible flexibility to participants (Gordon et al. 1997); and neither system identifies weaknesses or omissions in arguments.
6. Formal Inclusivity
The third dimension of evaluation is concerned with the extent to which all the stakeholders are represented in the world of the computer-aided decision-process. The formal design features of neither Zeno nor the Agora appear to limit participation in any overt way, so that all those people interested in the discussion topic and willing to accept the rules of the dialogue are permitted to participate (in the case of Zeno) or able to have their views represented (in the case of the Agora).
In other words, each system is formally inclusive: it gives anyone an equal right to enter the process. This criterion of formal inclusiveness, however, is quite limited. In particular, it leaves two problems unresolved, which reveal the importance of the context-of-use in assessing such systems. First, formal inclusiveness does not mean that all affected parties will in fact avail themselves of the system. One must ask whether all the important human stakeholders are in fact involved with and through the computer system. Do all the conversations between them take place through the medium of the system, or do some of them occur off-line?
The answers to these questions for any system will depend on the specifics of its implementation. We could imagine, for example, a situation where use of Zeno was mandated for an urban planning decision, with all stakeholders required to conduct their conversations in and through it. The system has been designed to support this level of use, and the technical design ensures incorporation and integration of information of heterogeneous types (maps, blueprints, reports, email messages, etc.). However, without such mandating of use, there is no guarantee that many interactions between stakeholders, and especially many important ones, would not occur away from the system. Thus, although argumentation systems may assist debate by clarifying issues, identifying areas of agreement and difference, and deriving the consequences of arguments, much of the debate may in fact occur off-line. This situation becomes problematic if not all the relevant contributions are represented in the system, for then one cannot expect it to provide an accurate record of the debate or of the status of various positions.
Second, in decision-making contexts the particular formalism itself can present a subtle form of exclusion. For example, if a particular viewpoint rests on styles of argument that are excluded (or simply not representable) in the system, then that viewpoint is excluded from the start. This issue becomes important for types of inferences or arguments whose validity is highly context-dependent (e.g., appeals to authority or to emotions; see Walton 1995). This danger is lessened in the Agora, which allows participants to challenge and revise the rules. The Zeno system, on the other hand, attempts to avoid a pre-set formalism for generating arguments, and it seems to allow participants to choose from a range of burden-of-proof standards; indeed, the system even allows users to violate system norms (Gordon and Karacapilidis 1997, pp. 16-17). Presumably these features also lessen the danger of inbuilt exclusion, albeit at the cost of efficient decision making.
With these cursory observations, however, one has only scratched the surface. Still more subtle, context-dependent forms of exclusion and coercion may arise from inequalities among participants in their familiarity with rule systems and their computer skills, which tend to favor more educated participants. Generally speaking, inequalities in "political capacity", which can arise in a variety of ways, threaten to engender an elitism of the better educated (Bohman 1996, chap. 3). Although we cannot address these issues here in much depth, we can say something about the starting point for such an analysis: what role does the system itself play in the deliberation, and how might this steer the deliberative process, for better or worse?
7. The Role of the System in Deliberative Contexts: Toward a Non-coercive Process
(7.1) Assessing the power of argumentation systems presupposes an understanding of their role(s) in deliberation. Like expert systems in general, argumentation systems can play a variety of roles in deliberation and decision-making. Three kinds of role especially interest us here: participant support, record-keeping (what we will call the "orrery" role), and forum-creation. We begin by simply clarifying how each system plays each of these roles. First, then, argumentation systems may support participants. In fact, the Zeno system provides support in a number of ways.
By tracking the deliberation as it unfolds (the changing commitments, open questions, and so on), Zeno supports all the human participants, presumably in a neutral fashion. But it can also provide a kind of partisan support insofar as it assists participants in their construction of arguments and counterarguments. Systems that provide this type of support have been called argument-assistance systems in artificial intelligence (Verheij 1999). Finally, it supports a human moderator or mediator, where this person is not him- or herself one of the decision-makers, for example by identifying common assumptions across different arguments. As we have already seen, such mediators play an important role in public policy deliberations. Urban planners, for instance, often assist community groups to reach a consensus in this manner (Forester 1999).
Indeed, Zeno is designed with just these sorts of support in mind. Although the Risk Agora has a somewhat different aim, by reconstructing arguments it not only helps all participants track a debate but can also support individual users as they construct their own positions and arguments. And when the Agora system is fully specified, and thus able to support deliberation over the consequences of regulatory options, it may also support those tasked with facilitating such decisions. Second, the system may play a record-keeping role, or what we call an orrery role, on the analogy of mechanical models of the solar system; this role is often undertaken so as to achieve an understanding of the reasoning used in the decision-process. Each of the two systems records the reasoning used to reach a decision, Zeno in real time and with complete accuracy, the Agora in a formal reconstruction of the process. Indeed, the Agora's main role is the orrery one.
Third, an argumentation system may support the entire process of decision-making by providing a forum in which to undertake dialogue, with defined protocols for this discourse. In this role the system provides something like a structured space in which participants interact. The forum-creating role is central to Zeno's design. In contrast, although the Risk Agora is designed as a forum for discussion, this is only in an ideal sense, as it provides a forum for the reconstruction of arguments rather than for real-time support. Note that neither system plays the role of participant or decision-maker. There are, to be sure, argumentation systems envisioned for this more active role, for example the StAR system designed for the automated prediction of chemical properties such as toxicity.
Such systems may replace a human decision-maker, a prospect that has raised concern on the part of philosophers and computer scientists. When the system plays the role of participant, it needs to be able to generate, evaluate, contest and defend arguments itself. This is true even for the mediation role, as a mediator may need to find common ground between different positions; for example, he or she may need to argue that two opposing positions share common assumptions, or that one implies another. How well a given system executes these tasks will be important in any evaluation. Neither Zeno nor the Agora appears to have the capability to generate, evaluate, contest or defend arguments. For the Agora, acting as an orrery, these capabilities are not required.
For Zeno, because the designers do not seek to automate the role of mediator, the system can also operate without these capabilities (Gordon and Karacapilidis 1997, p. 11). Indeed, from a computer science perspective, the generation of appropriate arguments in a dialogue, even using only the limited sets of locutions of Zeno or the Agora, is a challenging research problem, and one that is not yet solved. Rather, both Zeno and the Agora support responsibility to explain decisions reached through their use only insofar as the participants justify their statements to each forum. The Agora has a stronger claim than Zeno to this capability, as the dialogue rules require those participants asserting claims to provide arguments justifying these claims when questioned or contested.
It is not clear that similar requirements are incurred by the participants in Zeno. (7.2) With the roles of each system clarified, we are in a position to sketch the assessment issues raised by the insinuations of power in deliberative processes. Expert systems raise a number of questions in this regard: What is the institutional status of computational-deliberative results? On whose behalf is the system developed and deployed?
Who should have rights to its findings? And so on. These questions illustrate the issues that may arise from an examination of the social and institutional relationships surrounding the use of any decision-support technology. None of these questions has a straightforward answer, and just resolution may only be possible on a case-by-case basis. Exploring them, one is confronted with those who in sociology have been called "the locally powerful" (Bell 1978) - people who influence or control the decisions of others, even if only in a particular context. In the present context, one would expect the locally powerful to include users who are especially adept at employing the argumentation system to support their viewpoint, or those users whose style of reasoning and public representation is favored by the system formalism. The danger in such inequality is that the more powerful players, perhaps even despite their good intentions, would fail to give some viewpoints the consideration they deserve on the merits.
Viewpoints and arguments would be hastily dismissed, not given an adequate hearing. These non-standard approaches would thus be subtly forced out of the deliberative forum. In this scenario, the quality of the deliberation is harmed in a number of ways. Substantively, relevant considerations are not duly taken into account. Such truncation can in itself lead to policies that are less than adequately informed, even about certain factual considerations, and are thus less likely to prove successful. But truncated deliberations can also mean that the interests and values of some parties are overlooked, a deficit that is bound to damage the legitimacy of the outcome, its acceptability to all those expected to comply with the decision at issue.
Finally, by not considering some viewpoints sufficiently, possibilities for self-transformation would be missed, both by those with power and those lacking it. We are thus brought back to the issue broached in Section 3, regarding the freedom of the participants for a reasonable self-transformation. Only now we cannot define such reasonableness by appealing to the argumentative mechanisms built into the system, the formal design features treated in Sections 4-6. For now it is precisely such features (the dialectical formalism and formal inclusiveness) that are in question. We are asking how such system features perform, how they are actually used, in particular contexts. Thus one must rather have recourse to what Habermas (1990, pp. 88-89), drawing on Alexy (1978/1990), has dubbed "process" criteria.
Although Alexy's first criterion is that of formal inclusiveness, we are now concerned above all with those criteria or "rules" that define the freedom of participants to construct and reconstruct their viewpoints within discourse. The decisive critical perspective behind such criteria is this: to what extent does the deliberation, as a social interaction, allow participants rationally and freely to adopt and change their viewpoints, that is, solely on the basis of an insight into the better argument? The rules that define such freedom call for equal, coercion-free participation in the process itself. Thus one might summarize the requirement this way: each participant should have an equal freedom to express and revise his or her viewpoints, feelings, interests, questions, and so on.
Such a requirement cannot be guaranteed by any concrete set of formal rules; rather it involves an "idealization" that actual discourses and deliberation only rarely satisfy in full. However, if participants are to consider deliberative outcomes as trustworthy or legitimate, then they must suppose that their actual deliberative process has at least approximated this idealization. More precisely, the actual process of argumentation must supply participants with the sense that their positions are ones that could hold up under idealized conditions of discourse (see Habermas 1993, pp. 49-57). Applying these process criteria takes us beyond the assessment of formal design in abstracto to its performance in concrete deliberative contexts.
One can, to be sure, design the formal features of a system with something like Alexy's criteria in mind. In doing so, one sees to it that the system formalism makes provision for the various kinds of argumentative moves that participants should each be able to employ if they are to represent and change their views. Our assessment of substantive quality suggests that the two systems to some extent incorporate, as formal design features, aspects of Alexy's rules. In fact, McBurney and Parsons (2001c) are quite explicit about this: they show that the Risk Agora implements an applicable subset of Alexy's rules of discourse ethics (Alexy 1978/1990) and 15 of Hitchcock's 18 principles of rational mutual enquiry (Hitchcock 1991). Zeno seems to have a lesser capacity to represent different types of speech acts, and its dialogue-game rules are not articulated (at least, in the publications known to us), but it does allow for participants to represent and change their preferences.
But formal design mechanisms and their abstract assessment leave open the question that is decisive at the process level: how well do participants of diverse backgrounds and capacities make actual use of this formalism in concrete contexts of use? This remains an open research question. It is true that the identification and formal demonstration of the discourse properties of a system will provide confidence to its users. In the final analysis, however, such systems will be successfully implemented only insofar as parties to deliberations actually use the systems and consider computer-assisted outcomes as superior. At the process level, then, the pertinent question is this: how well does the actual use of the argumentation systems for a given deliberation lead to an outcome that is legitimate, that is, reasonable in light of argumentative idealizations? Given the different roles delineated above, specific questions such as the following become important.
Again, these questions are more illustrative than exhaustive of the possibilities. First, given the particular context of use, do the support mechanisms favor some participants over others? Here the partisan support role merits particular scrutiny. Is argument-assistance equally available to each participant? Is the system flexible enough to capture all the various styles of argument that might plausibly merit consideration? Second, given the particular context of use, is the tracking or record-keeping genuinely "neutral"?
Marxist and postmodern critics of power have cast serious doubt on ideals of neutrality (e.g., Harding 1998). The crucial question here may perhaps be formulated in a more concrete manner as follows: Can each affected party perceive that the system has represented its position, interests, values, and arguments accurately? Again, the flexibility of the system formalism becomes an important factor in system quality. The third area for questions involves the system as a forum. Here the basic question follows directly from the idealizing requirement above: given the particular context of use, is the forum one in which each party has an equal opportunity to present and argue for its viewpoint, and is equally free to reconstruct its position in the light of the counterarguments?
8. Conclusions
This paper has explored a number of evaluative issues associated with the use of computer decision support systems for public deliberation, in particular the Zeno system of Gordon and Karacapilidis and their colleagues at GMD and the Risk Agora of McBurney and Parsons. We have argued that assessments of the quality of such systems present challenges to system designers and users, and we have suggested a conceptual structure within which to undertake such assessments. This structure draws on deliberative democratic theory as developed in Forester's (1999) framework for assessment of deliberative planning processes. Our analysis indicates that the formal design features of the two argumentation systems capture many of the features of reasonable argumentation: mechanisms for ensuring substantive inclusiveness of relevant considerations, dialectically responsible argumentation, and an overtly open forum of participation. But these properties do not ensure high quality performance in the actual deliberative contexts of use. For that, all the parties affected by a policy decision must actually avail themselves of the system and, in addition, find that the system increases their understanding of the issue, their ability to voice their concerns, and their confidence that outcomes and decisions are thereby superior to what they would otherwise have been.
Acknowledgments
This work was partly funded by the British Engineering and Physical Sciences Research Council (EPSRC) through a PhD studentship. We gratefully acknowledge this support. We are also grateful for discussions on these topics with Trevor Bench-Capon, Rod Girle, Muni Haklay, David Hitchcock and Bart Verheij.