TEI and XML for Humanists

My original blog post is here.

How and why do humanists use programming? XML (eXtensible Markup Language) is an accessible but robust syntax for making texts machine-readable, and the Text Encoding Initiative (TEI) has developed standards for describing texts in XML. These markup syntaxes, in conjunction with XSLT (Extensible Stylesheet Language Transformations), allow the humanist to make primary source materials accessible to new kinds of scholars who use technology to see connections in large datasets. Examples of scholarship using these technologies include the Swinburne Archives, the London Lives project, and the Electronic Enlightenment project; all three are built on encoding platforms that originate in XML to provide relational data among interconnected documents, often in visual form. This presentation, a report from the 2013 DHOxSS (Digital Humanities at Oxford Summer School), will include a hands-on component to introduce faculty to the concept of TEI/XML for projects in the humanities. Resources will be available at cerosia.org (search for “innovations” or “DHOxSS”).

Today, we’ll learn a little bit about what XML and TEI are good for, look at some examples, and then I’ll ask you to try your hand at basic structural markup on a piece of literature. Keep in mind that you can create your own schemas and doctypes, so the sky really is the limit. Of course, if your markup is not standardized, your work may not be legible to others. But since we’re only going to be peeping into the abyss, we won’t worry too much about whether your documents are valid or draw accurately on a specific standard. Feel free to make up some tags!
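To give a feel for what “basic structural markup” buys you, here is a minimal sketch in Python. The fragment is a simplified, TEI-style encoding of the first stanza of Blake’s “The Tyger,” using the TEI elements lg (line group) and l (verse line). It is an assumption for illustration only: a real TEI file would also declare the TEI namespace and carry a teiHeader, and the function name verse_lines is made up for this example.

```python
import xml.etree.ElementTree as ET

# A simplified TEI-style fragment (illustrative only): <lg> marks a line
# group, here a stanza, and <l> marks a single verse line. A full TEI
# document would also declare the TEI namespace and include a <teiHeader>.
SAMPLE = """
<text>
  <body>
    <lg type="stanza" n="1">
      <l n="1">Tyger Tyger, burning bright,</l>
      <l n="2">In the forests of the night;</l>
      <l n="3">What immortal hand or eye,</l>
      <l n="4">Could frame thy fearful symmetry?</l>
    </lg>
  </body>
</text>
"""


def verse_lines(xml_text):
    """Return (stanza number, line number, text) for every <l> element."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for stanza in root.iter("lg"):
        for line in stanza.findall("l"):
            rows.append((stanza.get("n"), line.get("n"), line.text.strip()))
    return rows


if __name__ == "__main__":
    # Because the structure is explicit, a program can address any line.
    for stanza_n, line_n, text in verse_lines(SAMPLE):
        print(f"{stanza_n}.{line_n}: {text}")
```

Once stanzas and lines are tagged, the text stops being one long string: a program can count lines, pull out a single stanza, or feed the lines to a visualization. The same idea scales up to whatever tags you invent during the exercise.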

Resources

A Very Gentle Introduction to TEI

Getting Started Using TEI (Oxford)

TEI Structures (Oxford)

TEI by Example

TEI Handout, Poetry Edition (UVa)

Samples that I’m working on (The Tatler No.238; Mary Hays)

Sample highly marked-up Swinburne poem, “On the Cliffs” (should open in browser; if not, right-click, save as, and open with wordpad, notepad, or the equivalent)

Sample visualization created using the encoded “On the Cliffs”, by John Walsh

Presentation on “‘Quivering web of living thought’: Conceptual Networks in Swinburne’s Songs of the Springtides.”  (see slides 16-20 for thematic encoding)

TEI Template (UVa) (copy the template code into a new notepad, wordpad, or equivalent program; we’ll use this to play!)

Visualizing Literary Texts

My original blog post is available here.

Ever wonder how web-based tools and text-based analysis intersect? Come find out how to analyze literature (and text more broadly understood) using a variety of online tools that have minimal learning curves. I will introduce you to a manageable number of such tools (Voyant, Mandala browser, ManyEyes, and others), and then we will experiment with them as a group and independently. This demonstration and workshop will be useful for pedagogical and scholarly purposes.

I encourage you to bring a sample text or corpus you want to work with during the session; it should be a .TXT file, a .ZIP collection of texts, an .XML file, or, in some cases, a URL that points to a text you want to work with. I will also bring a selection of texts for us to draw on. Online materials will be located at http://cerosia.org (search for the keyword “innovations”). This session derives from material covered in the 2012 University of Victoria Digital Humanities Summer Institute.

Coursepack: Online tools for literary analysis

Plaintext and ZIPped corpora

Links to electronic text collections:

Bamboo DiRT (Digital Research Tools)

Bamboo DiRT is a tool, service, and collection registry of digital research tools for scholarly use. Developed by Project Bamboo, Bamboo DiRT is an evolution of Lisa Spiro’s DiRT wiki and makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mindmapping software.

TAPoR (Text Analysis Portal) [TAPoR2 Test Environment: try this link if the first doesn’t work]

TAPoR is a gateway to the tools used in sophisticated text analysis and retrieval. Browse tools by type or tag, search and use tools, read and create tool reviews, contribute and advertise your own tools.

Voyant Tools

Voyant is a web-based text analysis environment. It is designed to be user-friendly, flexible and powerful. Voyant is part of Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric. This section of the Hermeneuti.ca web site provides information and documentation for users and developers of Voyant. Note: the original name of the environment was “Voyeur,” which was recently changed because of the word’s connotations. You might see the two names used interchangeably. You can also get to Voyant Tools via TAPoR.

IBM ManyEyes

View, discuss, and create data sets and visualizations of data sets using a variety of filters including pie charts, scatterplots, bubble charts, treemaps, word clouds, phrase nets, and more.

Google n-Gram viewer

Read more about the Google n-Gram viewer here. See some sample uses of the n-Gram viewer here. Try it yourself!

Zotero timelines

Make a timeline from your Zotero collections to visualize your research.

Juxta (Collation Software for Scholars): http://juxtacommons.org

Sample visualizations:

Relative word frequencies of five gothic novels (Voyant)
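If you are curious what a relative word frequency actually is, the following Python sketch computes one by hand. It is not how Voyant itself is implemented; it only illustrates the idea that a word’s count is divided by the total number of words in its text, so that texts of different lengths can be compared. The two corpus entries are toy placeholder strings, not real novels, and the function names are made up for this example.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all word tokens in the text."""
    # Lowercase the text and keep runs of letters and apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    total = len(tokens)
    if total == 0:
        return {}
    counts = Counter(tokens)
    return {word: n / total for word, n in counts.items()}


def compare(corpus, word):
    """Relative frequency of one word in every text of the corpus."""
    return {
        name: relative_frequencies(text).get(word, 0.0)
        for name, text in corpus.items()
    }


# Toy placeholder strings standing in for full texts (assumption).
corpus = {
    "text_a": "The castle was dark and the night was cold",
    "text_b": "The abbey stood silent in the dark dark wood",
}

if __name__ == "__main__":
    for name, share in compare(corpus, "dark").items():
        print(f"{name}: {share:.3f}")
```

With real novels you would read each file into the corpus dictionary instead of typing strings, and you would probably filter out common words such as “the” before comparing.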

 

What is Digital Humanities?

That is the question! Fortunately, the Internet is a natural home for the digital humanities. Here is a list of good places to start.

Useful Websites

What Is Digital Humanities? – http://whatisdigitalhumanities.com/ 

DH on Wikipedia – http://en.wikipedia.org/wiki/Digital_humanities

DH questions and answers – http://digitalhumanities.org/answers/

A Guide to Digital Humanities, from Northwestern University Library

Key Essays

“What Is Humanities Computing, and What Is It Not?” by John Unsworth

“Information Technology and the Troubled Humanities,” by Jerome McGann

“What Is Digital Humanities and What’s It Doing in English Departments?” by Matthew Kirschenbaum

“The Productive Unease of 21st Century Digital Scholarship,” by Julia Flanders

“Digital Humanities is a Spectrum; or, We’re all Digital Humanists Now,” by Lincoln Mullen

“Who’s in and Who’s Out,” by Stephen Ramsay

“On Building,” by Stephen Ramsay

“The Digital Humanities Is Not about Building; It’s about Sharing,” by Mark Sample

Open Access Books

A Companion to Digital Humanities, Blackwell Publishing

Literary Studies in the Digital Age, Modern Language Association

Digital Humanities Tools and Resources

Get a broad overview of the tools and resources available to do digital research of all kinds with Project Bamboo’s DiRT: Digital Research Tools. “Bamboo DiRT is a registry of digital research tools for scholarly use. Developed by Project Bamboo, Bamboo DiRT makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mindmapping software.”

Want to find a tool to help you visualize data on your own? Try the tools organized on Visualizing Information for Advocacy, which “help[s] campaigners and activists around the world to use information, visual representation and digital technologies in their work.”

See what other scholars and researchers are doing with data visualization at Information Aesthetics.

IBM’s visualization tools at Many Eyes give you multiple ways of looking at your data: cluster maps, relational charts, word clouds, tables, and more.

Try the Voyant suite of visualization tools for textual analysis. Voyant (previously Voyeur) “is a web-based text analysis environment…designed to be user-friendly, flexible and powerful. Voyeur is part of the Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric.”

Another great portal for textual analysis tools is TAPoR 2, out of the University of Alberta.