What do Scholars Need and Expect from Electronic Texts? Lessons Learned at IATH.

by

John Unsworth

University of Minnesota, October 4, 2001

I'm going to answer the question put to our panel with an assertion, a definition, and two examples.

Assertion:

Many scholars need a good deal more than they expect from electronic texts, and a small but growing number of scholars need and expect more than you can imagine from electronic texts. The first group, the majority of scholars engaged in teaching and research that has text as an object of study, do not expect much, based on their experience with electronic texts so far: they don't expect the texts they need to be available in electronic form; if a text is available, they don't expect it to be a very good edition, and they generally don't expect it to come with any kind of apparatus. They don't expect to be able to annotate the text, or really do anything with it except for reading or printing it--and sometimes, the text won't even allow itself to be printed. Finally, if the text is available, they expect it to disappear: they don't expect to be able to find it in the same place, at the same address, from one year to the next, and they don't expect it will ultimately persist at any address for more than a few years. To remedy these negative expectations, libraries and publishers would need to create better, and better-edited, electronic texts, create many more of them, and find ways to provide persistent URLs for those texts; it would also be very nice if it were possible to do more than read and print these texts--if there were tools for annotation, analysis, comparison, and other scholarly uses of electronic text.

There is a second group of scholars, a smaller group, involved in actually editing electronic texts, or otherwise creating originally digital scholarly resources. These scholars will help libraries and publishers to meet some of the needs of the first group, but their needs are even greater. They need not only good texts and tools to use with them, but also human collaboration and support from library and computer professionals who understand technology, standards, and, above all, the principles of knowledge representation and the techniques available for that activity in the context of information technology.

Definition:

Knowledge representation is, strictly speaking, a sub-discipline in the field of artificial intelligence, but it is also an interdisciplinary methodology that combines logic and ontology to produce models of human understanding that are tractable to computation. I'm interested in knowledge representation because it offers a very exact description of what we've been groping toward (with varying degrees of intentionality and varying degrees of success) in the many projects at the Institute for Advanced Technology in the Humanities, and because it is a methodology that is at the core of the University’s new MA in digital humanities, which will enroll its first class in the fall of 2002. I don’t claim expertise in this field, and I’m grappling now with things I have never mastered, or had to master, in the past—formal logic, math, philosophy—but I’m aware as I grapple that I already know some of this, that some of the premises of this method are lessons I’ve already learned, from experience.

“What is a knowledge representation?” Three artificial intelligence researchers at MIT argue (see Figure 1) “that the notion can best be understood in terms of five distinct roles it plays, each crucial to the task at hand:”

1. A knowledge representation (KR) is most fundamentally a surrogate, a substitute for the thing itself, used to enable an entity to determine consequences by thinking rather than acting, i.e., by reasoning about the world rather than taking action in it.

With respect to the texts and contexts I work in and around, knowledge representation usually takes the form of markup (in the Extensible Markup Language, XML, or the Standard Generalized Markup Language, SGML), or it takes the form of databases--flat tables or relational structures. We’ll see examples of each in a moment: both are clearly “substitutes for the thing itself.” The entity that uses those surrogates to determine consequences is, in one sense, a piece of software (a search engine, for example), but in another sense it is a human being, who comes to understand the consequences of his or her own assumptions and beliefs by making them explicit and then putting them in play with one another and with a text.
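To make that concrete, here is a minimal sketch (in Python, with made-up element and attribute names rather than any actual project’s tag set) of how a marked-up surrogate lets a piece of software, and the person behind it, determine consequences from the representation rather than from the document itself:

```python
# A minimal sketch of markup as a surrogate. The element and attribute
# names here are illustrative, not drawn from any actual IATH project.
import xml.etree.ElementTree as ET

fragment = """
<passage source="examination-record">
  <person role="accused">Bridget Bishop</person> was examined before
  <person role="magistrate">John Hathorne</person> in
  <place>Salem Village</place>.
</passage>
"""

root = ET.fromstring(fragment)

# The software reasons about the surrogate, not the physical document:
# here, it determines which persons appear in the passage, and in what roles.
for person in root.iter("person"):
    print(person.get("role"), "->", person.text)
```

The names chosen for those elements and attributes (“person,” “role,” “place”) are already ontological commitments, which is the subject of the second role below.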

2. It is a set of ontological commitments, i.e., an answer to the question: In what terms should I think about the world?

This is very important to understanding what one does when producing these surrogates. There is a logical component to an SGML structure or a database, but it is very basic and content-neutral--just some Boolean operators, some rules of logic at that level. The more complicated stuff happens when you start to name the predicates for your predicate calculus--the things of which your logical statements are true, or false. The names you use imply a perspective, a purpose, and more than that, an understanding of the constituent elements of the object of your attention, and their relations to one another in that object. We’ll see a simple example in the context of the Salem project in a moment.

3. It is a fragmentary theory of intelligent reasoning, expressed in terms of three components: (i) the representation's fundamental conception of intelligent reasoning; (ii) the set of inferences the representation sanctions; and (iii) the set of inferences it recommends.

Intelligent reasoning about a literary text, for example, may take many forms: a materialist critique, a philological study, a critical edition, a cultural history. The “fundamental conception of intelligent reasoning” differs somewhat in each case, as do the inferences sanctioned and the inferences recommended under each model.

4. It is a medium for pragmatically efficient computation, i.e., the computational environment in which thinking is accomplished. One contribution to this pragmatic efficiency is supplied by the guidance a representation provides for organizing information so as to facilitate making the recommended inferences.

This is a bit opaque, but very important--as John Sowa says in his book Knowledge Representation, the requirement of computability is what separates knowledge representation from pure philosophy, and it is also what brings me, and the projects at IATH, in contact with knowledge representation. If you want a computer to be able to process the materials you work on--whether for search and retrieval, analysis, or transformation--then those materials have to be constructed according to some explicit rules, and with an explicit model of their ontology in view.

5. It is a medium of human expression, i.e., a language in which we say things about the world.

It may not seem like it when you first look at SGML markup, for example, but that is a language that both humans and computers can understand, and it is a language in which we say things about the world.

According to Sowa, knowledge representation consists of logic, ontology, and computation. Logic disciplines the representation, but is content-neutral. Ontology expresses what one knows about the nature of the subject matter, and does so within the discipline of logic’s rules. Computability puts logic and ontology to the test, by producing a second-order representation that validates and parses the ontology and the logic of the knowledge representation.

Examples:

Let's look at the first example, and see what these three ingredients look like in practice.

Figure 2 shows a database table from Ben Ray’s Salem Witchcraft project—a “tuple.” A tuple is a simple (but powerful) data structure containing two or more components—for example, a row (with two or more fields) in a table in a database. A tuple is a structure that can be repeated any number of times, always with exactly the same parts—an ordered set with a fixed number of elements. The simplest tuple would be a matched pair—out of which, properly deployed, some very complicated things can be built.
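Here is a minimal sketch of the idea in Python, using the standard library’s namedtuple; the field names and rows are illustrative, not the Salem database’s actual schema or data:

```python
# A tuple: the same fixed parts, in the same order, repeated row after row.
# Field names and rows are hypothetical, for illustration only.
from collections import namedtuple

Accusation = namedtuple("Accusation", ["accuser", "accused", "date", "town"])

rows = [
    Accusation("Abigail Williams", "Sarah Good", "1692-02", "Salem Village"),
    Accusation("Ann Putnam Jr.", "Rebecca Nurse", "1692-03", "Salem Village"),
]

for row in rows:
    print(row.accused, "accused in", row.town, "(" + row.date + ")")
```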

Logic: The logic here is that of the tuple—and, and, and. Of course, some of these elements can be null, and some cannot be null, so it’s not the case that every row in the database will have every field filled out.

Ontology: With respect to ontology, the first thing to be said is that the database schema can enforce some data-typing (which is a level of ontology in itself, with respect to the data). But at a higher level, the table at the top left in Figure 2 expresses a certain ontological view of marriage: a demographic view, but also a Western, Judeo-Christian view. It says that marriage has a begin-date and an end-date, a place, a unique husband and a unique wife, and so on. Admittedly, these are pretty basic ontological commitments with respect to marriage, but they still express a perspective and imply a purpose.
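Here is a minimal sketch of what such a table might look like as a schema (in Python’s built-in SQLite, with hypothetical column names rather than the project’s actual ones). Note that it declares both the conjunctive logic of the row, including which fields may be null, and the ontological names that carry the perspective:

```python
# A hypothetical marriage table: column names express the ontology,
# NULL/NOT NULL constraints express part of the logic.
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE marriage (
        husband     TEXT NOT NULL,  -- exactly one husband per row ...
        wife        TEXT NOT NULL,  -- ... and exactly one wife
        begin_date  TEXT NOT NULL,  -- a marriage has a beginning,
        end_date    TEXT,           -- an end (which may be unknown, hence nullable),
        place       TEXT            -- and a place (also nullable)
    )
""")

# husband AND wife AND begin_date AND (possibly null) end_date AND place:
conn.execute(
    "INSERT INTO marriage VALUES (?, ?, ?, ?, ?)",
    ("Thomas Putnam", "Ann Carr", "1678", None, "Salem Village"),
)
```

Choosing to call the columns “husband” and “wife,” rather than, say, “spouse1” and “spouse2,” is exactly the kind of ontological commitment at issue here.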

The right-hand table in Figure 2 shows the set of tables in this database, and suggests the larger ontology of Salem that's in play here, with a focus on family relationships, corporate entities, property, sources (manifestations) and events.

Computation: The animation at the bottom shows one way—visualization—in which this knowledge representation of Salem, its individual residents and the events in which they are involved, is computed. In order for Ben Ray to provide us with this visualization of accusations played out over space and time, and across households and townships, the knowledge representation according to which his data is prepared must capture all these ontological features (“husband” and “wife,” for example) according to some pre-defined logical rules (“every husband has one and only one wife; every wife has one and only one husband,” and so on). And then software has to compute all of this—in this case, a Flash presentation of a map, with timeline, on which data from the Salem database is plotted.
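As a rough sketch of that computational step (the records and field names below are illustrative, not drawn from the Salem database), the software’s job is essentially to aggregate the represented events by time and place so that a map and timeline can plot them:

```python
# Aggregate accusation records by month and township; a map-and-timeline
# front end (Flash, in the original project) would consume a table like this.
# The records below are illustrative, not the project's actual data.
from collections import Counter

accusations = [
    {"accused": "Sarah Good",     "month": "1692-02", "town": "Salem Village"},
    {"accused": "Rebecca Nurse",  "month": "1692-03", "town": "Salem Village"},
    {"accused": "Bridget Bishop", "month": "1692-04", "town": "Salem Town"},
]

by_month_and_town = Counter((a["month"], a["town"]) for a in accusations)

for (month, town), count in sorted(by_month_and_town.items()):
    print(month, town, count)
```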

Interestingly, in talking with Ben Ray recently about this project and how it changed his research, I heard him say two things (unprompted by me) that resonate with earlier descriptions and discussions of knowledge representations:

1. Research conducted as humanities computing constantly requires you to test your particular assumptions against your general model--you continually validate the particular with respect to the general, and vice versa.

2. Research in such a framework discourages you from accepting and relying on knowledge representations produced by others—in the form of existing indices, for example. You want instead to generate those things on the basis of your data, your ontology, etc.

Figure 3 shows a representation of the data and structure of Jerome McGann’s Rossetti Archive, produced by a computer program (Graphviz)—a visualization of the actual content of the Rossetti Archive and the connections made explicit in the course of systematically applying a knowledge representation to representations of Dante Gabriel Rossetti’s writings and pictures: ergo, a representation of a representation. This visualization shows that the data structure of the Rossetti Archive is more of a lattice than a tree: it has many cross-connections, few dead ends.
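For readers curious how such a picture is produced, here is a minimal sketch of the general technique: emit a Graphviz DOT file from a table of cross-references between documents, and let Graphviz lay out the resulting graph. The node names below are hypothetical stand-ins, not actual Rossetti Archive identifiers:

```python
# Emit a Graphviz DOT file from a list of cross-references between documents.
# Node names are hypothetical stand-ins, not Rossetti Archive identifiers.
links = [
    ("poem:the-blessed-damozel", "picture:the-blessed-damozel"),
    ("poem:the-blessed-damozel", "manuscript:early-draft"),
    ("picture:the-blessed-damozel", "manuscript:early-draft"),  # a cross-connection: lattice, not tree
]

lines = ["digraph archive {"]
for source, target in links:
    lines.append(f'  "{source}" -> "{target}";')
lines.append("}")

with open("archive.dot", "w") as f:
    f.write("\n".join(lines))

# Render with, for example: dot -Tsvg archive.dot -o archive.svg
```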

This graphic also demonstrates that you can generate complexity by applying some fairly simple rules to a large amount of information, especially when the information is itself originally produced by a human being.

And, finally, Figure 3 is an example of the analytical and expressive power of knowledge representations, and at least one kind of answer to the question “why would you do this?” The computer can deal with far more information than you can, and even though it can't (yet) reason, it can show you opportunities for reasoning you would never find without it.

In short, the intellectual outcomes in all of these examples are as follows:

The process that one goes through in order to develop, apply, and compute these knowledge representations is unlike anything that humanities scholars, outside of philosophy, have ever been required to do. This method, or perhaps we should call it a heuristic, discovers a new horizon for humanities scholarship, a paradigm as powerful as New Criticism, New Historicism, or Deconstruction—indeed, very likely more powerful, because the rigor it requires will bring to our attention undocumented features of our own ideation, and coupled with enormous storage capacity and computational throughput, this method will present us with patterns and connections in the human record that we would otherwise never have found or examined.