Tagsets

In its early phase the Orlando team created a new SGML application for encoding the text it was to produce. For the Orlando history it created four main document type definitions (DTDs), or tagsets, which have since been converted to RelaxNG schemas. These schemas structure our information and embody our literary theory. Their distinctive feature is their relative emphasis on “semantic” or “content” tags. Produced through extensive analysis involving both the literary and the computing members of the Orlando Project team, the tagsets are the key connection between the literary and the computing sides of Orlando.

Content tags are associated with the knowledge domain of literary history, and the Orlando tagsets contain a high proportion of them relative to the “formal” tags associated with sections, paragraphs, and the like. While the formal tags in Orlando are based on the Text Encoding Initiative (TEI) Guidelines, the content tags were devised by the Orlando Project. In total, the Orlando schemas contain 205 tags, 114 attributes, and 635 attribute values. Those unique to Orlando are concentrated in the Life and Writing tagsets.

All of our schemas include an identical range of structural tags (to handle, for instance, short prose, scholarly notes, or embedded chronological items), and an identical set of “core” tags (for material such as names, titles, dates, places, and organization names which are crucial for hyperlinking and other processing). The heart of the project lies in the two most complex schemas, those paired for encoding discussions of writers’ lives and their texts and literary careers.

Life Schema
The Life schema has a fairly regulated, hierarchical structure. It structures documents using sixteen major semantic categories of information, each marked by a tag.

Life tags

Many of these tags have unique subtags, such as award and instructor (under education) or geographical heritage and religious denomination (under cultural formation). Many also have unique attributes, such as institution level (on school) or relation to (on location).

Writing Schema
The Writing schema has a flatter, more flexible structure. It structures documents using just three major semantic categories of information marked by tags: Production, Textual Features, and Reception. These tags occur repeatedly within a single document, encoding discussions of particular texts or groups of texts.

The Writing schema also defines additional interpretive tags nested within tags in the three major semantic categories.

Production tags

Textual features tags

Reception tags

The Writing schema makes extensive use of inclusions, so that the tags from any one of the three major semantic categories can be used within either of the others. Like the semantic tags in the Life schema, many of the Writing tags also have unique attributes (such as intertext type or gender of author on the intertextuality tag).

An Example: Intertextuality
The intertextuality tag is used to encode discussions of gestures in a text towards other texts. It has two optional attributes: Intertext Type and Gender of Author.

The Gender of Author attribute identifies the gender of the author who wrote the intertext.

The Intertext Type attribute specifies different kinds of intertextual relationships: Allusion Acknowledged (where the textual debt is explicitly flagged), Allusion Unacknowledged (where the textual parallel is left unsignalled), Quotation (words or phrases directly repeated, generally in a prominent position, such as a title or title-page), Misquotation (deliberate or accidental alteration of the words quoted), Parody (a close formal copy, designed to make fun), Satire (a broader send-up of another text), Imitation (adopting characteristics of an earlier text as a pattern), Adaptation-update (re-telling or ingenious recasting), Prequel (a text designed to come before its related text), Continuation (a sequel), and Answer (a riposte).
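The constraints described above (two optional attributes, one of them drawing on a fixed list of values) can be sketched as a small check. This is only an illustration, not the project's RelaxNG schema: the function name `check_intertextuality` is hypothetical, the upper-case spellings follow the textbase examples given in this section, and the four values that do not appear in those examples (MISQUOTATION, SATIRE, PREQUEL, CONTINUATION) are assumed spellings.

```python
# Values of INTERTEXTTYPE. Spellings marked "assumed" are not attested in the
# examples in this section and are guesses by analogy with the others.
INTERTEXT_TYPES = {
    "ALLUSIONACKNOWLEDGED",
    "ALLUSIONUNACKNOWLEDGED",
    "QUOTATION",
    "MISQUOTATION",       # assumed spelling
    "PARODY",
    "SATIRE",             # assumed spelling
    "IMITATION",
    "ADAPTATION-UPDATE",
    "PREQUEL",            # assumed spelling
    "CONTINUATION",       # assumed spelling
    "ANSWER",
}

# The two attributes the intertextuality tag may carry; both are optional.
ALLOWED_ATTRIBUTES = {"INTERTEXTTYPE", "GENDEROFAUTHOR"}


def check_intertextuality(attrs):
    """Return a list of problems found in a dict of attribute names to values.

    An empty list means the attributes are consistent with the description
    above. GENDEROFAUTHOR values are not checked, because the full list of
    permitted values is not given in this section.
    """
    problems = []
    for name in sorted(attrs):
        if name not in ALLOWED_ATTRIBUTES:
            problems.append(f"unknown attribute: {name}")
    intertext_type = attrs.get("INTERTEXTTYPE")
    if intertext_type is not None and intertext_type not in INTERTEXT_TYPES:
        problems.append(f"unknown intertext type: {intertext_type}")
    return problems
```

Because both attributes are optional, an empty dictionary passes the check; only an unrecognised attribute name or an unlisted intertext type is reported.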

Here are some examples of the Intertextuality tag drawn from the Orlando textbase. (Not all of the tagging is shown here: in the textbase, all names and titles are also tagged.)

<TINTERTEXTUALITY INTERTEXTTYPE = "ALLUSIONUNACKNOWLEDGED" GENDEROFAUTHOR = "FEMALE">Critics have not infrequently likened Margaret Oliphant’s Phoebe Junior to Jane Austen’s Emma.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "ADAPTATION-UPDATE" GENDEROFAUTHOR = "MALE">The title poem imitates the underworld journey of Virgil’s epic hero, but in a female version. Sappho is [Eavan] Boland’s guide on this journey, as Virgil was Dante’s.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "QUOTATION" GENDEROFAUTHOR = "FEMALE">Critic Rees-Jones sees in the title of Carol Ann Duffy’s Fifth Last Song: twenty-one love poems a reference to Adrienne Rich’s “Twenty-One Love Poems” in A Dream of a Common Language, 1978.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "ANSWER" GENDEROFAUTHOR = "MALE">She was replying to a number of authoritative male texts about the nature of women: by Burke, Rousseau, and behind them Milton.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "ANSWER">Henrietta Battier directed her work against the author of a recent publication entitled The Orange, whom she calls Dr Bobadil.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "ALLUSIONACKNOWLEDGED">Dorothea Primrose Campbell was one of those claiming serious status for the novel by literary allusion.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "PARODY" GENDEROFAUTHOR = "MALE">Eliza Fenwick’s character Lord Filmar, a rake who models himself on Richardson’s Lovelace, is too frivolous to pose any real threat.</TINTERTEXTUALITY>

<TINTERTEXTUALITY INTERTEXTTYPE = "IMITATION" GENDEROFAUTHOR = "FEMALE">Lady Mary Wortley Montagu’s romance is modelled on Aphra Behn’s Voyage to the Isle of Love, whose emblematic geography comes in turn from Scudéry.</TINTERTEXTUALITY>
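Tagging of this kind is what allows the textbase to be queried by intertext type or by the gender of the intertext's author. The following is a minimal sketch of such a query, not the Orlando delivery system itself: it uses Python's standard `xml.etree.ElementTree` parser on three of the examples above, retyped with straight quotation marks around the attribute values so that a standard XML parser accepts them. The `WRAPPER` root element and the `summarize` function are hypothetical names introduced only for the illustration.

```python
import xml.etree.ElementTree as ET
from collections import Counter

# Three of the examples above. WRAPPER is a hypothetical root element, added
# only because an XML parser needs a single root; it is not an Orlando tag.
SAMPLE = """<WRAPPER>
<TINTERTEXTUALITY INTERTEXTTYPE="ANSWER" GENDEROFAUTHOR="MALE">She was replying to a number of authoritative male texts about the nature of women: by Burke, Rousseau, and behind them Milton.</TINTERTEXTUALITY>
<TINTERTEXTUALITY INTERTEXTTYPE="ANSWER">Henrietta Battier directed her work against the author of a recent publication entitled The Orange, whom she calls Dr Bobadil.</TINTERTEXTUALITY>
<TINTERTEXTUALITY INTERTEXTTYPE="PARODY" GENDEROFAUTHOR="MALE">Eliza Fenwick’s character Lord Filmar, a rake who models himself on Richardson’s Lovelace, is too frivolous to pose any real threat.</TINTERTEXTUALITY>
</WRAPPER>"""


def summarize(xml_text):
    """Return (intertext type, gender of author) pairs in document order.

    Either value is None when the optional attribute is absent.
    """
    root = ET.fromstring(xml_text)
    return [
        (element.get("INTERTEXTTYPE"), element.get("GENDEROFAUTHOR"))
        for element in root.iter("TINTERTEXTUALITY")
    ]


rows = summarize(SAMPLE)

# Count how often each intertext type occurs in the sample.
type_counts = Counter(intertext_type for intertext_type, _ in rows)

# Select the passages whose intertext was written by a male author.
male_intertexts = [row for row in rows if row[1] == "MALE"]
```

Because Gender of Author is optional, the second passage yields `None` for that attribute, so it is counted among the Answer passages but is left out of any selection by gender.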