"Documenting the reinvention of text:

the importance of imperfection, doubt, and failure."

A paper presented at MIT, October 25, 1997, in the panel "Images and Texts as Digital Publications," a part of the conference on the "Transformations of the Book," an event in the Media in Transition series.

John Unsworth

University of Virginia

The title of this conference-"Transformations of the Book"-has already been called into question here, and it is an instance of the rhetorical trope that has come to characterize much of what we say, write, and read about the subject of electronic text, the World-Wide Web, and information technology in general: the trope is one of change, invention, evolution, with overtones of progress and improvement, and with undertones of inevitability and universality. We meet this trope in mass-media news and advertising about computers and communications, in the promotional literature of our educational institutions, in scholarly books and articles about hypertext and digital libraries, and in grant proposals for electronic scholarly projects which aim, or claim, to break new ground, undertake pilot projects, provide models for the future.

My focus today will be on the academic part of what is clearly a larger cultural trend, and specifically on hypertext projects and hypertext theory, as they address the subject of transformative change-but I will be holding these projects and this theory to an extrinsic standard, namely the standard of science. I think I can predict the objections to this exercise, but in spite of those, I believe this is a worthwhile experiment, and a worthwhile discussion, because it may help us to sharpen distinctions among different kinds of writing about hypertext, and because it may help us to arrive at some principles for evaluating both theoretical and applied work in this area of research. Among other conclusions, I will be arguing that if a project can't fail and doesn't produce new ignorance, then it isn't worth a damn.

I should say, at the outset, that my remarks are not intended to be a criticism of the projects whose results, or whose ruminations, you have seen and will see, before and after this talk. Indeed, I think most of these projects have succeeded to the extent that they have because they have followed some of the precepts I will discuss, though they may not have done so consciously and they may not have said so explicitly. I was cheered, in fact, to walk in (late as usual) to this conference just in time to hear Peter Robinson remark that the Cambridge edition of the Wife of Bath's Prologue, "which we once considered a great success, should now be considered a failure." This implies, and Peter's subsequent discussion demonstrates, that experiments are in fact being conducted, that evidence is being gathered and evaluated, and that lessons are being learned. Without this, we will get nowhere, though it is true that we may yet get there very fast.

I should also acknowledge that my remarks today are the direct result of being asked a question for which I didn't have a very good answer, about a year ago. At a conference at the University of Maryland, Neil Fraistat (whose Romantic Circles Web site some of you may know) asked me if there were any writing on specific humanities hypertext projects that was neither promotional nor anecdotal, but that reported and analyzed and theorized the experience of constructing such a project. I could think of a couple of examples, but only a couple, and none perfectly apt. The conversation with Neil progressed to the topic of the importance of reporting and analyzing failure in any research activity, humanistic or scientific, and to the patterns of funding that discouraged such reporting and analysis. I owe whatever illuminations emerge in the following to that conversation, and I take it as an emblematic instance of a research opportunity-namely, a question for which there should be an answer, for which one could imagine an answer, but for which no very good answer was at present to be found.

Certain Limits, Uncertain Cases

At the most basic level, the level of survival, it is a given that resources--in academia as elsewhere--are limited, and that we struggle for these resources in the form of institutional support, outside grant funding, and release time. Given these limited resources, we are obviously obliged, for practical as well as intellectual reasons, to argue for our projects and our programs. In short, there is a kind of evolutionary pressure at work in the transformation of the book: some projects will survive, others will not; some theories will flourish, others will wither. If we hope that rationality rather than sheer force might guide this process, then the only rational course, for both the proposers and the funders of such projects, is to declare and defend our evaluative criteria, particularly when we consume or allocate resources that might otherwise go elsewhere.

By the same token, and before going further in my discussion, I would say that any academic or funding activity bears the same responsibility: it is no more a justification to say "it has always been done" than to say "it has never been done." In either case, we need to know why it should be done, and we need to know how we will determine whether we succeeded or failed in the endeavor.

I'd like to begin, then, by reading (in abridged form) two theses from Sir Karl Popper, the founder of the philosophical school known as critical rationalism, a school of thought from which many of the arguments in what follows will be derived:

First Thesis: We know a great deal. And we know not only many details of doubtful intellectual interest, but also things which are of considerable practical significance and, what is even more important, which provide us with deep theoretical insight, and with a surprising understanding of the world.

Second Thesis: Our ignorance is sobering and boundless... With each step forward, with each problem which we solve, we not only discover new and unsolved problems, but we also discover that where we believed that we were standing on firm and safe ground, all things are, in truth, insecure and in a state of flux. ('The Logic of the Social Sciences' in The Positivist Dispute in German Sociology, 1976)

Popper's theses, and his writings in general, do a fine job of expressing something that I want to emphasize today, in the context of "The Transformations of the Book," namely, on the one hand, the importance-the utility-of what we do know and, on the other hand, the ephemeral, contingent, transitional character of that knowledge-and therefore, the need for experiment, the indispensability of mistakes, and the necessity of recognizing, documenting, and analyzing our failures.

Transformation as Evolution

There is no question that the book, or more properly text technology-what Jay Bolter calls "writing space"-is currently undergoing a major transformation. Inasmuch as we think of this transformation as progress, or hope that it will be, these changes are implicitly being treated as evolutionary. It has been observed that

[a]ny theory of evolution is about processes of change. An extra requirement for an evolutionary theory is that purely random and entirely time-reversible patterns are excluded; evolution concerns exclusively change that is, at least statistically, irreversible. To qualify, irreversible change must entail processes that lead to the emergence, or at least the persistence, of ordered structure in space and time. (The New Evolutionary Paradigm, ed. Ervin Laszlo, 1991, p. xxiii)

Evolution is our name for a positive, unidirectional change-an alteration in the direction of something better, where better is defined as more complex, more ordered, more useful, more adaptive, more fit to a particular purpose. The test of whether a transformation qualifies as an evolution, then, is whether or not it improves on what it changes, and does so in a way that external forces are likely to reward and reinforce.

Is Change Improvement?

We know from observation-of our own aging bodies, for example-that not all changes are improvements. So if we are advocating a change, or participating in one, we ought to be deeply concerned with evaluative questions. In the case of the transformation of the book, the question could be phrased "Does hypermedia improve on the book?" And this is a question that ought (in principle) to be answerable, with some combination of empirical evidence and rational argument. But in order to gather such evidence, or make such arguments, we would first need to establish evaluative criteria. What might such criteria look like, in the case of hypertext projects or hypertext theory?

Before attempting to answer that question, I should point out that the criteria by which evidence would be selected and on which arguments would be based will be rather different in these two cases: theory has one set of responsibilities, and craft has another. But the two are, or ought to be, connected and mutually responsive. Where experimental endeavors are concerned, theory ought to be able to explain, predict, and produce practical results, and practice ought to provide the occasion to test, implement, modify, or falsify theoretical assertions.

Evaluative Criteria in Hypertext Theory

Hypertext theory is a recent but broad and interdisciplinary field: it includes literary scholars of many different periods and specialties, philosophers and sociologists, computer scientists, user-interface and human-computer interaction experts, librarians, publishers, and practitioners. Hypertext theory is still sorting out its relationship to the even broader fields of literary theory, communications and media theory, architecture and design, and many others. In an important sense, then, the task for hypertext theory at this point is to define itself, to describe and understand its constituent parts, and (perhaps most of all) to identify clearly the object of its attention. What I have to say here about evaluative criteria is addressed to a narrowly defined "hypertext theory," and even within that, principally to the literary type, but I think it could apply as well to the broader field of media studies in which hypertext theory sometimes finds itself. In addition, I'm going to work with a much narrower meaning of the word "theory" than is usual in connection with hypertext, and especially in literary hypertext theory. In brief, "theory" here is taken to mean assertions (about the nature or function or design or impact of hypertext) that have the potential to be proven or disproven.

Can it be falsified?

The first criterion I would propose, in evaluating theoretical statements about hypertext, is borrowed directly from Popper, namely the criterion of falsification. As Popper has it, if a statement cannot possibly be proven false, then it can't be considered a scientific statement: it might be a perfectly legitimate example of some other kind of statement (metaphysical, philosophical, poetic, etc.), but it is not scientific--because, for Popper, the distinguishing feature of science is that it proceeds by making assertions that can be falsified, testing them, and preserving, modifying, or discarding its beliefs based on those tests.

Obviously, this first criterion raises the question of what we are to call writings on hypertext that don't make claims which could be falsified: "Essays" might be a good choice, in the tradition of Montaigne; appreciations, musings, metaphysics-all these are open to us as well. My point is not that all writing about hypertext should take the form of empirical assertions, only that we should have a clear way of distinguishing the genre of writing about hypertext that we are reading. If that writing calls itself "theory," then we should expect it to provide us with (dis)provable assertions. And when a theorist of hypertext does make claims of a factual nature (such as the claim that hypertext is an improvement over the state of text in printed form), the person making that claim is obliged to support those claims with empirical evidence and rational argument-not to prove the assertion true (something which can't ever be done, even in science), but to make the best case that can be made, given both what we do know and what we don't.

This first criterion, falsification, is extremely important: if we do think that we are "reinventing the text," if we suppose that we are in fact inventing or doing "research" in any sense of the word, then we must have a theory to guide that research, and it must be possible for that theory to be proven wrong by the evidence. In short, if failure isn't a possibility, neither is discovery.

It should be noted, too, that the possibility of failure is not simply a matter of the nature of our assertions, but also of the climate and terms of our funding: in the sciences and in the humanities alike, the current atmosphere is not friendly to failure-largely because of the emphasis on short-term, gainful outcomes (marketable products, if you will). The emphasis on marketable products is obviously an expression of society's desire to 'get its money's worth' out of research funding of all kinds, but I would argue that, if we really want to get our money's worth, we should make sure that we don't fund "research" that investigates problems whose solutions are already known, nor should we fund research that selects problems likely to be solved successfully in one funding cycle. Of course, we don't want to encourage failure for its own sake either, but it seems clear-to me at least-that we should favor those projects that stake out difficult territory, have a well-thought-out approach to that territory, and can at least define what failure or, in a narrower compass, falsification, would be.

Is it explanatory?

In simplest terms, the purpose of science-and of knowledge more generally-is to explain. In the sciences, as elsewhere, this is generally a matter of degree, not of absolutes, and one measure of the value of a theory is its reach: all other things being equal, the theory that explains more of the observable data associated with a particular problem area is generally considered a better theory. I see no reason why the same should not be true of hypertext theory, or of theories concerning new media more generally. In reasoning about the transformation of the book, or its disappearance, or the emergence of whatever will or will not replace it, we may proceed from isolated observations, but our conclusions on the larger topic ought to be able to explain more than the individual observations from which they are derived. In other words, theory in this realm, as in others, needs to rise above particulars to generalizations (and, as earlier proposed, those generalizations ought to be testable against evidence, and potentially falsifiable).

Is it predictive?

This is a difficult one, not only for the humanities, but for the social sciences as well. In "Replies to My Critics," Popper paid special attention to the predictive function as a means of distinguishing between scientific and non-scientific reasoning. What he concluded was that

There is a reality behind the world as it appears to us, possibly a many-layered reality, of which the appearances are the outermost layers. What the great scientist does is to boldly guess...what these inner realities are like. This is akin to myth making....[and] [t]he boldness can be gauged by the distance between the world of appearance and the conjectured reality, the explanatory hypothesis.

But there is another, a special kind of boldness-the boldness of predicting aspects of the world of appearance which so far have been overlooked but which it must possess if the conjectured reality is (more or less) right, if the explanatory hypotheses are (approximately) true....

...[I]t is this second boldness, together with the readiness to look out for tests and refutations, which distinguishes 'empirical' science from non-science, and especially from pre-scientific myths and metaphysics.

I do think that this second kind of boldness can be expected, in rare instances, from theories about the transformation of the book, about hypertext, about whatever this object of our discussion may be called: new "aspects of the world of appearance" (of information) will emerge, within our generation and the next and the next, and theory could aspire to predict those appearances. The theory that does so could also look for tests and refutations, even before they appear.

Is it productive?

A good theory should be productive in a number of ways: it should inspire argument, it should give rise to new ideas, observation, and speculation, and it should allow us to do things--things we couldn't do before, things we didn't know we wanted or needed to do, things we hadn't imagined doing. In short, it should be fertile. Again, I see no reason why this criterion should not be applicable in our domain as well as in others, and in fact I expect this quality to be valued above (and sometimes at the expense of) all others, in our domain. Whether or not we believe Marx or Freud as explainers or predictors, we in the humanities still value them highly because they have been and continue to be productive-productive of discourse, above all.

Is it persuasive?

In measuring the persuasiveness of a theory, I can think of no better metric than that proposed under the heading of "conformity" by the Principia Cybernetica project. In this remarkable Web, the Conformity node begins by noting that "the more people already agree upon or share a particular idea, the more easily a newcomer will in turn be infected by the meme." The author of the node (Heylighen) notes that "conformity pressure is mostly irrational, often rejecting knowledge that is adequate because it contradicts already established beliefs," but he goes on to point out that

Conformity pressure is an expression of "meme selfishness." As memory space is limited and cognitive dissonance tends to be avoided, it is difficult for inconsistent memes to have the same carriers. Cognitively dissonant memes are in a similar relation of competition as alleles: genes that compete for the same location in the genome. Memes that induce behavior in their carriers that tends to eliminate rival memes will be more fit, since they will have more resources for themselves.

Clearly, one would not want to privilege persuasiveness, or successful meme selfishness, above other criteria for evaluating theoretical proposals, but inasmuch as the evolution of the book is a co-evolution, proceeding in a complex relationship with ideas about the evolution of the book, we should recognize that in this case there is a material interaction between theory and its object, and that a successful theory may achieve its success-even on predictive grounds-as a result of its persuasiveness.

Evaluative Criteria in Hypertext Projects

As I noted earlier, the evaluative criteria appropriate to hypertext theory and to hypertext practice are likely to be different. Whereas the criteria I would apply to theoretical statements turn largely on the claims implied or expressed at an epistemological level, the criteria I would apply to hypertext projects have more to do with the implementation of theory, and thus with the results themselves, or with the goals expressed for the particular experiment. We should be able to say whether a particular project's goals proceed from some implicit or explicit theory or theories, and we should be able to say whether these goals seem to us to be worthy, and why, but we do not, and should not, on the whole, expect a particular project to focus its energies and resources on elaborating or defending its theoretical superstructure: it is enough, I think, that it should provide evidence for accepting or rejecting a theory, produce a useful product, and/or raise interesting new problems or solutions.

Does it declare the terms of its own success or failure?

It is fair, I think, to require new projects in the area of electronic texts, digital libraries, and hypermedia editions to declare the terms of their potential success or failure. If I can't tell you that much about what I propose to do, then I don't know what I'm doing, or why. If I do know what and why, then I know what will constitute success or failure, and I ought to articulate that. Granted, it may be difficult to provide a clear and immediate formula that will really make sense of the extrinsic measurements one could gather-hits on a web site? Citations in the scholarly literature? Acceptance at the high-school level?-but at the intrinsic level one ought to be able to establish milestones for production and functional specifications for use, at the very least. Frankly, the only metric that is likely to matter to the universities that sponsor such projects is their success in attracting outside funding, but scholars, designers, and funding agencies ought to care more than universities do about these simple intrinsic criteria. This is not to say that failure to meet these goals should be considered sufficient reason for abandoning the project-but if the initial functional and production goals of the project are not met, then that ought to be the occasion for an analysis of failure, which in some cases might be the most valuable thing to come out of the project.

Does it formulate a methodology for solving the problem it addresses?

This is a criterion that applies in rather different ways to the beginning, the middle, and the end (if any) of a project. At the beginning, a problem-solving methodology ought to be required, but it shouldn't be regarded as a failure if that methodology is revised in the process of completing the project, since we assume (if this is research) that there will be some sort of feedback loop between the problem and the solution, and as the problem is progressively analyzed and considered, the methodology for solving it will also be refined. In the middle of a project, if there has been no change at the methodological level, then I would suspect that the problem selected was not really a problem at all. If, at the end of the project (and I haven't seen the end of one of these projects yet), the methodology couldn't be formulated in general terms, then I would suspect that nothing much had been learned from the experience of tackling this problem. In fact, I think that successful hypertext projects are continually reformulating their methodology, and their only failure, on the whole, is the failure to document the stages in and reasons for their methodological evolution-a very real failure, though, since we could learn a great deal not only from their product, but also from their process.

Does it address (or generate) unsolved problems?

In Conjectures and Refutations (1960; 1968), Karl Popper notes that

Every solution of a problem raises new unsolved problems; the more so the deeper the original problem and the bolder its solution. The more we learn about the world, and the deeper our learning, the more conscious, specific, and articulate will be our knowledge of what we do not know, and our knowledge of our ignorance. For this, indeed, is the main source of our ignorance-the fact that our knowledge can only be finite, while our ignorance must necessarily be infinite.

This passage gives us, I think, a very compact, elegant, and persuasive criterion for deciding whether a real problem has been addressed, and solved-namely, the test of whether the solution of that problem has raised new problems. All of my personal and pedagogical experience strongly inclines me to agree with Popper that acquiring new knowledge means discovering new ignorance. Given that, hypertext research projects should be expected to address unsolved problems (otherwise their problems belong to the arena of production rather than that of research), and the proof of their having done so should be that they culminate in a new plateau of ignorance-a new set of unsolved problems.

Can its solutions be generalized?

Finally, on the topic of evaluative criteria for hypertext projects, I would suggest that the solutions a project does arrive at-notwithstanding the new, unsolved problems it should raise-ought to be generalizable to other work in other disciplines and other contexts. This principle is, at the applied level, very like the principle, at the theoretical level, that says a theory should be broadly explanatory. The practical experiment that produces the greatest number of tools, methods, errors, or insights that can be generalized to other projects, other disciplines, other contexts, will be the most successful experiment, at least as research (mind you, it may not be the most popular on the Web, or the most marketable). I'd go even further, and suggest that at this early stage in the evolution of our methods and this medium, we should give the highest priority to projects that clearly demonstrate a potential for generating generalizable solutions-provided, of course, that they can say why those solutions are needed and how they might be arrived at.

Conclusions

We are at an important evolutionary moment: a major transformation is taking place, and we are a part of it. Many things that we take to be trivial, or embarrassing, or simply wrong, will be of interest to our peers in the future. Our first responsibility, therefore, is to document what we do, to say why we do it, and to preserve the products of our labor-not only in their fungible, software-and-hardware-independent forms, but also in their immediate, contemporary manifestations. The greatest mistake we could make, at this point, would be to suppress, deny, or discard our errors and our failed experiments: we need to document these with obsessive care, detail, and rigor. Our successes, should we have any, will perpetuate themselves: though we may be concerned to be credited for them, we needn't worry about their survival. Our failures are likely to be far more difficult to recover, in the future, and far more valuable, for future scholarship and research, than those successes. So, if I could leave you with a single piece of advice, it would be this: be explicit about your goals and your criteria, record your every doubt and misstep, and aspire to be remembered for the ignorance which was uniquely yours, rather than for the common sense you helped to construct.