Pliny and WordHoard

In a talk at the University of Illinois entitled Pliny: 4 perspectives, I tried to answer the question "Pliny: what is it?" in four fundamentally different ways:

as a software tool to support notetaking and the development of interpretation in traditional humanities research,
as a way to think about software development for humanities tools that promotes more intimate interaction between those tools,
as a device to promote thinking about some aspects of what interpretation is about in the humanities, and
as a prototype, and as a piece of practice-led research.

When Pliny won an MATC award from the Mellon Foundation in late 2008, we were able to take on a project focusing on Pliny in the second sense. The project let us explore, both technically and in terms of the user’s experience, some new ways in which separately developed digital applications can work intimately together with Pliny to support scholarship. The article Playing together: modular tools and Pliny described some of the background to this, and explained how Pliny’s modularity grew out of and was built on top of Eclipse’s plugin architecture. My interest in Pliny’s modularity, and in the evident potential of the Eclipse/plugin framework to support tool modularity through a new way of thinking about integration between tools, grew out of my evolving understanding of the nature of traditional scholarship and the implications of this for digital resources of all kinds.

Readers of the "Playing Together" paper may well recall Vannevar Bush’s observation in his famous article As We May Think (in section 6, where he introduces the idea of the Memex machine) that the mind works by forming associations between objects. As we all know, this key insight fuelled a great deal of the research and innovation in hypertext which has, through the WWW, had a profound effect on the role of computing in our daily lives. But it is important to realise that much of this work – particularly the WWW itself – was not entirely true to Bush’s vision. One of the important differences lay in the Memex’s ability to let its users themselves create connections (described by Bush as “trails”) between objects. His Memex machine allowed material from different documents to be projected at the same time on different screens on the machine, and the scholar could at any time establish a link between them – the machine allowed one to make a personal mental association visible and permanent. At any point in the future the Memex would be able to present these two pages as linked together. Inspired, then, by the Memex, Pliny has been built in such a way that the individual researcher can link two documents together. In the digital world materials of interest come in more than one digital medium, so the scholar using Pliny is able to link a web page to a page in a PDF document, or can link a portion of an image to notes s/he has taken about a printed book, or make other combinations among the media that Pliny supports.

Although Pliny, as it came out of the box, supported web pages, PDFs and images (and notetaking for materials that are not digital), it was written in such a way that it could be extended to support other digital media as well, such as video, audio, or 3D objects. Indeed, the mechanisms for importing new Pliny-compatible components that were added to the Pliny Workbench as part of the WordHoard support are exactly the ones by which a component to play and annotate video in the Pliny fashion, were one written, could be added by any user who needed it. By providing these tools in an integrated context of notetaking and the subsequent handling of these notes outside of their original context, Pliny drew our attention away from the software development of purely annotation components added to, say, websites that support these different media, and towards the purpose of creating these personal annotations in the first place. That purpose is to support personal scholarship: by recording and presenting connections between materials as they arise in the mind of the scholar through his/her work, by recording original thoughts (as original annotations) as these objects are studied, and then by incorporating these thoughts into a structure of interpretation that may (indeed, often will) draw on personal insights that arise from the reading of a range of separate documents. Annotation tools attached to any single web-accessible resource cannot be equipped to support an individual’s research that in fact extends over many different materials from different sources.

Pliny was never going to be a website, but rather an application that ran on its user’s machine. Like, say, Word, it had to be installed on the user’s personal computer. This allowed it to be more flexible about the kinds of resources it could work with and, because it was not itself served from the web, allowed materials from different resources scattered across the Internet, as well as personal objects not served over the Internet at all, to be brought together.

My thinking about Pliny and its potential to be readily extended to support other kinds of media made it evident that Pliny’s support for annotation need not be limited to relatively static media such as web pages or PDF files or digital video, but could extend to the displays generated by potentially more complex, interactive, and independently developed applications, as long as the developer of each of these applications wrote it in such a way that it was "Pliny-aware". See the following figure, which shows schematically how this would work:

Pliny as application integrator

This figure shows different applications (shown as orange and green boxes) that present separate media objects to the Pliny user. The top three orange applications (which were already incorporated into basic Pliny) present PDF and web pages within Pliny and, by being “Pliny aware”, are able to support their annotation (shown as yellow boxes). These yellow boxes, as well as being annotations attached to a Web page or a page of a PDF file, could also be referenced, linked, and displayed, in other contexts – shown as being assembled and connected to new personal concepts in Pliny’s application area. Pliny, then, provided a kind of glue (or a kind of web) that connected references to documents of various kinds to the user’s set of ideas that are also stored in Pliny.

Note, however, the bottom two green boxes to the left and right of the Pliny box. These represent two Pliny-aware applications (an imaginary “App A”, and the real WordHoard) that, based on the work done in this project, would be able to be added by anyone to their Pliny installation. Here, instead of being applications that display static material in media files, they represented dynamic applications that the user could invoke and then use to explore other kinds of data. In this view of the integration between the application and Pliny, displays that these applications generated from their data could also be annotated, and these annotations could also be integrated into the user’s set of ideas that are represented within Pliny.
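The "glue" role described above can be pictured with a small data model. What follows is only a minimal sketch in Java: the type names (`Resource`, `Note`, `Concept`) and the locator strings are hypothetical and are not taken from Pliny's own code. It shows a personal concept gathering notes that were attached to material from two different sources, one a static PDF page and one a display generated by an application.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical model for illustration only; these are not Pliny's actual classes.

/** Something a Pliny-aware component can show: a web page, a PDF page, or an application display. */
record Resource(String source, String locator) {}

/** A personal annotation anchored to one resource (a "yellow box" in the figure). */
record Note(Resource target, String text) {}

/** A user-defined idea that gathers notes from any number of resources. */
final class Concept {
    private final String name;
    private final List<Note> notes = new ArrayList<>();

    Concept(String name) { this.name = name; }

    String name() { return name; }

    /** Connects an existing note to this concept without detaching it from its resource. */
    void link(Note note) { notes.add(note); }

    List<Note> notes() { return List.copyOf(notes); }

    /** Counts how many different kinds of source the linked notes come from. */
    long distinctSources() {
        return notes.stream().map(n -> n.target().source()).distinct().count();
    }
}

final class GlueSketch {
    /** Builds one concept linked to a note on a PDF page and a note on an application display. */
    static Concept example() {
        Resource pdfPage = new Resource("pdf", "article.pdf#page=3");          // assumed locator
        Resource appDisplay = new Resource("wordhoard", "word-search-result"); // assumed locator
        Concept concept = new Concept("A personal concept");
        concept.link(new Note(pdfPage, "Note taken on a PDF page"));
        concept.link(new Note(appDisplay, "Note taken on a generated display"));
        return concept;
    }

    public static void main(String[] args) {
        Concept c = example();
        System.out.println(c.name() + " draws on " + c.distinctSources() + " kinds of source");
    }
}
```

The point of the sketch is that a note keeps its anchor in the resource where it was made, while the concept holds only a link to it, so the same note can appear both in its original context and in the user's own structure of ideas.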

The Mellon MATC prize allowed the issues behind this idea to be explored. Was the integration of a full and complex application with Pliny, with Pliny handling notetaking within that application, really practical? How did the act of supporting annotation in the Pliny context affect how the application had to be written? What, if any, were the technical constraints under which such an application would have to be written if it was to support personal annotation of the results it generated, and how onerous were they for developers? We had already explored the development of extensions to Pliny that would allow a user to annotate a Google Map, and to work with image data from the image archive provided through the Victoria and Albert Museum’s public API (http://www.vam.ac.uk/api) – but these applications were exploratory in nature, and hence both relatively small and based on only a subset of the full potential of the mechanisms upon which they were based. Could this idea really work when the application was more complex?

I was aware of Martin Mueller and Northwestern University’s WordHoard project before the MATC award had been granted, and had wanted even then to try out the idea of developing a version of WordHoard that would integrate with Pliny. Here was software that, instead of running as a web application in a browser, ran, like Pliny, as a Java application. Its orientation towards allowing the user to browse and search documents and to perform various kinds of word-oriented searches on them, together with the host of different kinds of presentations that could arise from this word-oriented work, made it an excellent trial application. As a result, I proposed to Mellon that the money that they had awarded to Pliny would fund a developer half-time for about two years, who would take the WordHoard code and gradually adapt it so that it could run in the context in which Pliny ran and could support the Pliny-supported annotation of its displays.

What is the difference between building applications as plugins in the Eclipse manner, and building an application using conventional technologies such as Java/Swing? With conventional Java applications one can indeed include components that come from other developers. However, these separate components often disappear inside the larger packaging: the main application becomes a "Borg application". Like the Borg on Star Trek, the enveloping software project assimilates work from other developers to implement aspects of the software that it needs, takes it over to serve its own aims, and then hides it inside its own packaging. In some sense the master project is a big tent containing many different parts that it has swallowed up, and users will see primarily the enveloping application as the thing they are using. Pliny, by virtue of operating in the Eclipse/plugin framework, is not a Borg application in this sense. The different applications, shown as boxes in the figure above, continue to be visible to the user and can share screen space with each other, yet they can still interact with each other at their boundaries. This is what makes applications built this way interesting: the components can be built separately and yet interact much more closely together.

The Pliny/WordHoard project first extended the Pliny standalone software so that it incorporated the Eclipse-provided mechanisms to allow new components to be imported. Then, the WordHoard code was packaged up as a "feature" with associated plugins that could be imported using this mechanism. When this was done, Pliny showed a new menu item which opened WordHoard's startup "Table of Contents" screen. From there, the user could use WordHoard and take advantage of Pliny's notetaking tools at the same time. In the code we implemented a number of different WordHoard displays, but the screenshot below shows one of them to give the idea:
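The import step can be pictured roughly as follows. This is a plain-Java sketch and deliberately not the Eclipse extension-point or feature API, which the project actually relied on. The names `PlinyAwareComponent`, `WordHoardComponent` and `Workbench` are hypothetical stand-ins. The sketch only illustrates the behaviour described above: installing a separately built component makes a new menu item appear, and choosing it opens that component's first screen.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative stand-ins only; the real mechanism is Eclipse's own plugin/feature machinery.

/** Hypothetical contract for a separately developed component that can be added to the workbench. */
interface PlinyAwareComponent {
    /** Label of the menu item the workbench shows once the component is installed. */
    String menuLabel();

    /** Opens the component and returns the title of the first screen shown. */
    String open();
}

/** Stand-in for the WordHoard feature and its plugins. */
final class WordHoardComponent implements PlinyAwareComponent {
    public String menuLabel() { return "WordHoard"; }
    public String open() { return "Table of Contents"; }
}

/** Stand-in for the workbench: it knows components only through the contract above. */
final class Workbench {
    private final List<PlinyAwareComponent> installed = new ArrayList<>();

    /** Adds a component after the workbench itself has been built and shipped. */
    void install(PlinyAwareComponent component) { installed.add(component); }

    /** One menu item per installed component; each component stays visible as itself. */
    List<String> menu() {
        return installed.stream().map(PlinyAwareComponent::menuLabel).toList();
    }

    /** Opens the component behind a menu item and returns the title of its first screen. */
    String openFromMenu(String label) {
        return installed.stream()
                .filter(c -> c.menuLabel().equals(label))
                .findFirst()
                .map(PlinyAwareComponent::open)
                .orElseThrow(() -> new IllegalArgumentException("No such component: " + label));
    }
}

final class WorkbenchSketch {
    public static void main(String[] args) {
        Workbench workbench = new Workbench();
        workbench.install(new WordHoardComponent());
        System.out.println(workbench.menu());
        System.out.println(workbench.openFromMenu("WordHoard"));
    }
}
```

In this sketch the workbench never refers to `WordHoardComponent` by name, only to the shared contract. That is the property that lets a component be developed and distributed separately while still appearing alongside the others.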

Pliny and WordHoard together

Further information about the Pliny/WordHoard integration can be found in Bradley and Hill 2011, and Bradley 2012.
