Networked Scholarship: The Effects of Advanced Technology on Research in the Humanities

John Unsworth

University of Virginia

Unless otherwise noted, items published by the Institute for Advanced Technology in the Humanities are copyrighted by the authors and may be shared in accordance with the Fair Use provisions of U.S. copyright law. Redistribution or republication on other terms, in any medium, requires express written consent from the author(s) and advance notification of the publisher.

This morning, I read in the Boston Globe that QVC is preparing to bid on Paramount, giving up merger talks with Home Shopping Network; Bell South is discussing buying the 22% stake in QVC held by Liberty Media in order to provide cash for the Paramount bid. Liberty, which has pledged $500 million toward the $9.5 billion Paramount bid, is itself the subject of a merger proposal from TCI (Tele-Communications Inc.). Liberty holds a controlling stake in . . . Home Shopping Network. TCI and Bell Atlantic just completed one of the largest mergers in history; QVC is bidding against Viacom for Paramount--Viacom and TCI are old corporate enemies: Viacom owns MTV. Paramount owns Fox (heads of Paramount and QVC used to work together on the Fox network), also owns the Knicks and . . . incidentally . . . Simon & Schuster, the largest book publisher in the United States.

I mention this tangled tale from the business section because the largest book publisher in the United States appears in it as a footnote. If we're not fairly aggressive and organized about our role in the "emerging National Information Infrastructure," humanities scholarship will be a footnote as well.

I should begin by describing what the Institute is and does. The Institute for Advanced Technology in the Humanities is intended to advance computer-mediated research in the humanities. It tries to do this by bringing in UVa faculty fellows in residence, and by supporting a number of associate fellows, on grounds and off, each year. The fellows receive equipment, technical support, project-design consultation, and access to shared Institute resources; fellowship at the Institute allows humanists to work with technologists as partners, to participate in the design of software that will meet general needs in humanities scholarship, and to discover the tools and resources that already exist. The Institute encourages cross-fertilization not only between fellows and technologists but also among the humanists from divergent disciplines--indeed, its breadth and collegiality are its distinguishing features.

The Institute was made possible by a major grant from IBM, and by matching commitments from the University of Virginia. Most of our technical development and research efforts are based on the RS/6000 platform, but we have rigorously observed open-systems standards for tagging, image-formatting, and data-collection in general. Our efforts, from both a technical and a humanities point of view, are aimed at the research/telecommunications environment 2-3 years down the road: the technology we are working with is already mass-market, but it is a couple of years from being on everyone's desk.

Research projects currently underway at the Institute focus on Dante Gabriel Rossetti, Piers Plowman, the Pompeian Forum, and the Civil War; associate fellows are working on projects involving linguistic morphology, the Black Death, and the process of scientific invention. My own work at the Institute is focused on turning Postmodern Culture, the oldest peer-reviewed electronic journal in the humanities, into networked hypermedia.

This is the body of work, then, that I would like to use as my example of advanced technology in the humanities, for the purposes of discussing the impact of technology on research and on teaching in the humanities. In particular, I'd like to focus on the research of Jerry McGann and his Rossetti Archive.

We'll be looking at McGann's research with XMosaic, a client for the World-Wide Web client-server protocol. It's important to note that this protocol carries documents marked up in HTML (hypertext markup language), an application of SGML (standard generalized markup language), and it is also significant that the clients (for Unix, Windows, and Macs) are freely distributed on the internet. Let me note here that we are looking at a considerably less than optimal display because I wanted to do a real-world demonstration. I'm running a 486 laptop--actually a PS/2 pretending to be a DOS machine pretending to be a Unix XWindows workstation.

I must also note that I could not sign the Gateways-to-Knowledge copyright form as it stands, because I don't--Jerry doesn't--yet have copyright, or republication permission, for what I am about to show you. In fact, I have set up the server in Virginia so that it will only let people on the Institute's private subnet view these materials, and in order to show them to you, I'm logging on to that machine and sending the display back here to Harvard. Copyright, then, is (depending on your point of view) one of the major impediments to, or enablers of, networked research, especially in the humanities. If a library or museum sets prices for rights based on the book model--20 or so plates, a few thousand copies, a high per-image price--it won't scale up to projects with thousands of images. It's also not clear whether digitizing or manipulating digital images creates new copyright.
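The subnet restriction just described can be pictured with a small sketch. This is only an illustration of the idea, not the Institute's actual server configuration: the subnet shown is a reserved documentation range standing in for the real one, and the function name is hypothetical.

```python
import ipaddress

# Hypothetical subnet for illustration only (a reserved documentation range),
# not the Institute's actual network.
PRIVATE_SUBNET = ipaddress.ip_network("192.0.2.0/24")


def may_view(client_address: str) -> bool:
    """Return True if the client address falls inside the permitted subnet."""
    return ipaddress.ip_address(client_address) in PRIVATE_SUBNET


if __name__ == "__main__":
    # A machine on the permitted subnet is allowed; any other address is refused.
    print(may_view("192.0.2.17"))
    print(may_view("198.51.100.5"))
```

A server applying a rule of this kind would refuse requests from outside the subnet, which is why the demonstration has to be run from a machine inside it with only the display sent elsewhere.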

Enough technical and legal digression. Rossetti is an ideal subject for hypermedia research, not only because, as a poet, painter, and book-craftsman, Rossetti was a multimedia artist, but also because of his interest in the connections between poems and paintings, texts and other texts. Dante Gabriel Rossetti was conceptually a hypertext--a hypermedia--author, generations ahead of his time.

In the Rossetti Archive, Jerry has begun to bring together all of the materials that make up Rossetti's work, and a healthy slice of the other books and images which informed that work. It is worth noting that a work such as The House of Life--Rossetti's sonnet cycle--may exist in many versions, many editions, revisions, etc. The same is true of what may be more familiar works--Whitman's Leaves of Grass, Wordsworth's Prelude, and others.

As Stephen Brier's history text demonstrates, books are hypertexts already, and as I've argued in print, the popular opposition between print-literacy and computer literacy--and especially the dire predictions of what we will lose by moving from one to another--are based on many uninformed and unexamined assumptions.

What we have in the Rossetti Archive, then, represents a shift in paradigm, but not a radical break. It takes the notion of the book--an idea whose form has seemed to be fixed for only a couple of hundred years--and subsumes its metaphors and operations under a companion metaphor, that of the archive. If, in a critical edition, every text potentially refers to every other text, and to non-text objects in an archive, then in the electronic version, all states of the manuscript and all connections with other texts and images are potentially available.

Here is where the limitation of a Voyager-type project, at least at the moment, becomes clear. No time soon will stand-alone media overcome significant limitations in storage space, at least when considering archives that will total 20, 50, or 100 gigabytes (from dozens to well over a hundred CDs' worth), nor will it be possible, any time soon, to jump from one stand-alone database/program to another. On the other hand, networked multimedia has distinct limitations as well: MPEG movies, for example, currently have no sound track, are difficult to produce or convert from other formats, and don't look as good as QuickTime clips.
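To give a rough sense of the disc counts behind those archive sizes, here is a back-of-the-envelope calculation. It assumes a CD-ROM capacity of about 650 megabytes and decimal gigabytes (1 GB = 1000 MB); both figures are assumptions made for the sketch, not numbers taken from the talk.

```python
import math

CD_CAPACITY_MB = 650  # assumption: typical CD-ROM capacity
MB_PER_GB = 1000      # assumption: decimal gigabytes


def cds_needed(archive_gb: float) -> int:
    """Number of whole discs required to hold an archive of the given size."""
    return math.ceil(archive_gb * MB_PER_GB / CD_CAPACITY_MB)


if __name__ == "__main__":
    for size_gb in (20, 50, 100):
        # Under the assumptions above: 20 GB -> 31, 50 GB -> 77, 100 GB -> 154
        print(f"{size_gb} GB -> {cds_needed(size_gb)} CDs")
```

Whatever the exact capacity assumed, an archive of this size means shuffling a shelf of discs, which is the practical limitation of stand-alone media at issue here.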

On the net, there are layers and layers of protocols running to enable what I'm doing--TCP, LAN WorkPlace, Winsock, IP, perhaps Z39.50--each handling a different aspect of getting the information from where it is to where I am. We need, in planning academic use of the nets, to think in similar terms about the distribution of tasks. I'd argue that campus computing needs to take care of communications standards and protocols, librarians need to take care of information formatting (tagging, file formatting, access methods), cataloguing (how to find things), collection development (what to highlight for users) and archiving (as opposed to backing up), and scholars need to attend to quality, peer review, presentation (how data tagging gets used). Finally, there need to be a handful of experimental centers like the Institute, where there is a concerted effort to develop the tools that researchers will need--in our case, scholars in the humanities.

From my point of view, one of the most important, most revolutionary implications of this kind of work is the potential for the archive to be the occasion for collaboration, and through that collaboration, to grow and develop over time. In addition, when the originary scholar or scholars are finished with the archive--if ever--what will be published is not only the result of research (the book, as it were), but also the research material itself. It is as though all notes, records, and findings were published with the book--and remained there, to be expanded and used for different purposes. And, in principle, those who use such an archive could also propose the addition of new material to the editor(s) of the archive, and related archives could be linked to one another.

For students, this means not only that they will have greatly expanded access to primary documents, but also that they will have a more meaningful opportunity to apprentice in the discipline. In addition, with a few exceptions such as MPEG, where good authoring tools don't yet exist, networked hypermedia archives are easy to assemble. I taught my grad assistant, a first-year master's student in English with little computer training, to do the basic HTML markup in about an hour. This means that students can easily publish their own research results, a logical outcome of apprenticeship.
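To suggest how little is involved in basic HTML markup, here is a minimal sketch that wraps a title and a few lines of verse in the handful of tags a simple page needs. The function name and the placeholder text are invented for illustration; this is not the markup scheme used in the Rossetti Archive.

```python
from html import escape


def poem_to_html(title: str, lines: list[str]) -> str:
    """Wrap a title and lines of verse in minimal HTML markup."""
    # Escape characters such as & and < so they display as text,
    # and end each line of verse with a line break.
    body = "<br>\n".join(escape(line) for line in lines)
    return (
        "<html>\n"
        f"<head><title>{escape(title)}</title></head>\n"
        "<body>\n"
        f"<h1>{escape(title)}</h1>\n"
        f"<p>\n{body}\n</p>\n"
        "</body>\n"
        "</html>\n"
    )


if __name__ == "__main__":
    # Placeholder text only, not a transcription of any actual poem.
    page = poem_to_html("Sample Sonnet", ["First line of the poem", "Second line & so on"])
    print(page)
```

The whole vocabulary here is a title, a heading, a paragraph, and line breaks, which is roughly the scale of what a newcomer can pick up in an hour.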

I'm neither a techno-optimist nor a techno-determinist, but to some extent technologies do have a character. The telephone is a one-to-one communications technology; TV and radio are one-to-many technologies. Computer networks are an inherently many-to-many communications medium. This in itself will change research and teaching more than anything else. One immediately available consequence of that many-to-many capability is real-time conferencing facilities. Coming up on the screen you will see IATH-MOO, the Institute's real-time, text-based virtual conferencing facility. While we browse the Rossetti Archive in one window, we can be in the Rossetti Room of the IATH-MOO in another window, talking to others who are looking at the same materials. This might be compared to the opportunity one has of meeting another reader in the library stacks, and it speaks to the questioner from session #1, who associated collaborative learning space in physical library facilities with the potential for electronic collaboration. The edited online hypertext archive can, if we wish, have a common room.

In closing, I'd like to pull back to the frame, and remember that QVC and Home Shopping are also in here somewhere. The great threat is that while we argue with one another about the degenerative effect of computers on reasoning or the relative merits of print and pixels, billions of dollars are concentrated on laying down a national information infrastructure with no better idea of what to do with this fabulous, historically unprecedented, mind-blowing medium for cultural growth than to bring you the Domino's Pizza Channel. But that's also the great opportunity. Katherine Hayles uses an old-fashioned metaphor of information as a matter of conduit and content: the NII is conduit in need of content. We have a completely revolutionary opportunity to bring the content of academic research and teaching to a mass audience. If Brier's Who Built America or McGann's Rossetti Archive or Ed Ayers' Civil War project were available on our souped-up, interactive television, don't we suppose that at least some people would choose it over pizza?
