TEI and XML for Humanists

My original blog post is here.

How and why do humanists use programming? XML (Extensible Markup Language) is an accessible but robust syntax for making texts machine-readable, and the Text Encoding Initiative (TEI) has developed standards for describing texts in XML. These markup syntaxes, in conjunction with XSLT (Extensible Stylesheet Language Transformations), allow the humanist to make primary source materials accessible to new kinds of scholars who use technology to see connections in large datasets. Examples of scholarship using these technologies include the Swinburne Archives, the London Lives project, and the Electronic Enlightenment project; all three are built on encoding platforms that originate in XML and present relational data among interconnected documents, often in visual form. This presentation, a report from the 2013 DHOxSS, will contain a hands-on component to introduce faculty to the concept of TEI/XML for projects in the humanities. Resources will be available at cerosia.org (search for “innovations” or “DHOxSS”).

Today, we’ll learn a little bit about what XML and TEI are good for, look at some examples, and then I’ll ask you to try your hand at basic structural markup on a piece of literature. Keep in mind that you can create your own schemas and doctypes, so the sky really is the limit. Of course, if your markup is not standardized, your work may not be legible to others. But since we’re only going to be peeping into the abyss, we won’t worry too much about whether your documents are valid or draw accurately on a specific standard. Feel free to make up some tags!
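
To give a rough sense of what “machine-readable” means here, this is a minimal sketch, not a valid TEI document. The `<lg>` (line group) and `<l>` (line) tags borrow TEI’s names for verse structure, while `<poem>`, `<image>`, and its `kind` attribute are made up for illustration, as is the `summarize` helper. It uses only Python’s standard library to pull the structure back out of the markup.

```python
import xml.etree.ElementTree as ET

# A hand-marked stanza. <lg> and <l> follow TEI naming for verse;
# <poem>, <image>, and kind="..." are invented tags for this sketch.
SAMPLE = """
<poem>
  <title>The Tyger</title>
  <lg type="stanza" n="1">
    <l n="1">Tyger Tyger, burning <image kind="fire">bright</image>,</l>
    <l n="2">In the forests of the night;</l>
    <l n="3">What immortal hand or eye,</l>
    <l n="4">Could frame thy fearful symmetry?</l>
  </lg>
</poem>
"""


def summarize(xml_text):
    """Return the title, the plain text of each verse line, and tagged images."""
    root = ET.fromstring(xml_text)
    title = root.findtext("title")
    # itertext() joins a line's own text with the text inside nested tags.
    lines = ["".join(line.itertext()) for line in root.iter("l")]
    # Collect every made-up <image> tag as a (kind, word) pair.
    images = [(el.get("kind"), el.text) for el in root.iter("image")]
    return title, lines, images


if __name__ == "__main__":
    title, lines, images = summarize(SAMPLE)
    print(title)
    for number, text in enumerate(lines, start=1):
        print(number, text)
    print(images)
```

Once the stanza is tagged, a program can count lines, list every word marked as an image, or compare those tags across many poems. That kind of query is what the larger projects build on, with standardized tags rather than invented ones.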


A Very Gentle Introduction to TEI

Getting Started Using TEI (Oxford)

TEI Structures (Oxford)

TEI by Example

TEI Handout, Poetry Edition (UVa)

Samples that I’m working on (The Tatler No.238; Mary Hays)

Sample highly marked-up Swinburne poem, “On the Cliffs” (should open in your browser; if not, right-click, choose “save as,” and open the file with WordPad, Notepad, or the equivalent)

Sample visualization created using the encoded “On the Cliffs”, by John Walsh

Presentation on “‘Quivering web of living thought’: Conceptual Networks in Swinburne’s Songs of the Springtides.” (see slides 16-20 for thematic encoding)

TEI Template (UVa) (copy the template code into a new document in Notepad, WordPad, or an equivalent program; we’ll use this to play!)

