The Walt Whitman Archive – Greene 3
January 2011
3. What about the digital form – as opposed to working with the materials in analogue form, for example – works well for you, and what does not? How does this site’s digital form contribute to the archive’s strengths and weaknesses?
Mark A. Greene
Director, American Heritage Center – University of Wyoming
Past President – Society of American Archivists
The digital form highlights a question faced by every project or repository working to build online digital collections from analog originals. The digital form permits creation of extremely high density and equally high quality facsimiles,[ref]Scanning requirements are in “Technical Summary,” TWWA.[/ref] making it possible to determine physical characteristics of the original that were impossible to consider during the era of microfilm or photocopying. For example, project transcribers can apparently discern:
- ink blots, smudges, and stray pen marks;
- pin holes;
- embossing;
- variations in ink, pencil, or paper;
- distinctions between single and multiple overstrikes[ref]From “5.9.1 What Not to Encode,” TWWA.[/ref]
While these are “what not to encode,” they denote the expectation that the facsimiles supplied to the transcribers are capable of revealing such details. As an extra measure of quality, “project staff examine each digital image [when creating JPEGs from original TIFF files] cleaned and cropped for web delivery.”[ref]“Technical Summary,” emphases added.[/ref] The intricate encoding guidelines themselves are also part of making high quality facsimiles accessible.
The opportunity cost of demanding extremely high standards encompasses both a slow pace of making facsimiles accessible to users and the loss of the ability to create a complete archive of Whitman material. This latter price is due to the inevitable (though not supportable) decision by some repositories to decline participation if it requires public accessibility to such high quality scans.[ref]See “History of the Project,” TWWA, for brief discussion of some libraries’ requirement to limit user access to 72 dpi facsimiles.[/ref] There is nothing inherently logical about sacrificing a complete collection for highest-quality facsimiles. Nor is there any innate reason for choosing quality over quantity, particularly since one can structure a project whereby “inferior” facsimiles are placed online relatively quickly while higher-quality facsimiles are posted, gradually, later. The question of whether (and which) users are best served by large quantity rather than high quality is one accurately answered only by asking users themselves. It appears that surveys or focus groups for gathering user input have not been employed by The Walt Whitman Archive, but neither have they been used by the vast majority of projects and repositories creating online collections.
For a decade now I have been actively involved in projects to increase the quantity of archival material accessible to users at the expense of traditional notions of quality. What began as an effort to speed processing of large modern manuscript (hidden) collections has evolved into jeremiads on trading precision for speed in most other aspects of archives administration as well, including the creation, description, and mounting of digital facsimiles online.[ref]See, particularly, Mark A. Greene and Dennis Meissner, “More Product, Less Process: Revamping Traditional Archival Processing,” American Archivist, 68:2 (Fall/Winter 2005), 208-263 and Mark A. Greene, “MPLP: It’s Not Just for Processing Anymore,” American Archivist 73:1 (Spring/Summer 2010), 175-203.[/ref] In some cases, including processing and digitization, there is direct evidence that some users prefer “more” over “better” at a ratio of about two-to-one. Far more research is required, certainly, and we should never assume that what holds true for one project/repository and its audience holds equally true for all. But we should certainly be more anxious to determine the answers than we have been heretofore.
I don’t have any suggestions to make, but I’m really drawn to your comment that “there is direct evidence that some users prefer ‘more’ over ‘better’ at a ratio of about two-to-one.” Wow. There’s a two-to-one preference for more information rather than better information. That’s a stunning figure.
The Whitman Archive strives to get high resolution TIFF images from repositories, though we typically present JPEG derivatives on the site. When transcribing tricky manuscripts, the higher quality images are useful to our staff for enlargement. We also use the TIFFs to create JPEG images of much higher quality than usual for our Zoomify feature on selected poetry manuscripts. In some cases, we have not been able to obtain sufficiently high quality images to make the Zoomify feature worthwhile.
Throughout the site we have gone ahead and posted lower quality images when that is all we can get, perhaps because the only surviving copy of a newspaper is in microfilm form or because a repository is for whatever reason unwilling to provide a high quality image.
This paragraph raises many complicated issues, including whether our primary concern should be what most users want (still not known, of course) or our sense of what care is warranted by the material itself.
Different users’ interests in the material are going to affect whether they prefer quantity over quality.
At the Whitman Archive, we also worry that if we privileged speed of production and quantity of output over the quality of the work, we could be charged with using public money to produce slipshod work. The danger then would be that someone else would need to come along and seek additional public money to redo the work to create reliable texts. Or the work would not be redone at all, and people would be uncertain whether they could trust anything on the site.
Erm well, if you don’t know what users want, why not ask, or collaborate with people who will do so on your behalf? (Then again, I would say that, since it’s what I do.) Seriously though, Mark is right: all my experience suggests that most people tend to want more, not better. But then there is of course a conflict between preservation quality and markup and editorial standards. Most users also want to know that resources will still be there in a few years when they come back for them, which of course relies on high quality data and metadata standards in digitisation. Scholars and archivists tend to want one thing and users the other. There is, however, a difference between raw digital content and edited resources such as this. Users want lots of raw content, but scholarly users expert in a particular area do appreciate the importance of properly edited materials. So here again there is a potential conflict between your expert users and those who are making less complex use of the materials. The only way to make informed decisions about how to handle this is of course a user study!
I just want to suggest that it’s not as clear cut as scholars preferring detail and other users preferring breadth. When we asked 600 users from various H-Net and other lists whether they preferred fewer but more detailed finding aids or a greater number of less detailed finding aids, those preferring the latter outnumbered those preferring the former 2 to 1. The largest single group in the respondent pool, approximately 1/3 of the total, identified as college or university faculty members. The second highest group was “other,” unfortunately; the third highest group was graduate students (about a quarter of respondents). We haven’t yet done the necessary regressions to be certain that faculty favor breadth over depth in this instance, but given their preponderance in the pool it would be very surprising if such were not the case.