Tag Archives: dh

Juxta Collation of Chaucer's Prologue to the Legend of Good Women


This Juxta Commons collation compares the B-Text and the A-Text of Geoffrey Chaucer’s 14th-century Prologue to the Legend of Good Women. The legends were composed around 1386, but the prologue was written earlier and substantially revised later. It is therefore available in two versions, the A-Text and the B-Text. The B-Text is generally considered the final version, but there are substantial differences between the prologues, both in organization and content. The A-Text exists only in manuscript (Cambridge University Library Gg. 4.27) and is sometimes designated “C.” The B-Text follows the Fairfax manuscript, and is sometimes designated “F.” The plaintext files are sourced from the Online Medieval and Classical Library, prepared and edited by Douglas B. Killings. Thanks to my colleague Katie Peebles for her inspiration!

Heatmap: B-Text as Base, showing differences from A-Text

Screenshot: Side-by-Side Comparison + Histogram

View a live side-by-side comparison of A-Text and B-Text here.
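For readers curious about what a collation does under the hood, here is a minimal sketch in Python of a word-level comparison between a base text and a second witness. It is not Juxta's own algorithm; it simply uses the standard-library difflib module, and the two sample strings are placeholders rather than lines from either version of the Prologue.

```python
import difflib


def collate(base, witness):
    """Return the word-level differences between a base text and a witness.

    Each difference is a tuple: (kind of change, words in base, words in witness).
    """
    base_words = base.split()
    witness_words = witness.split()
    matcher = difflib.SequenceMatcher(a=base_words, b=witness_words, autojunk=False)
    changes = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":  # keep only the places where the texts diverge
            changes.append(
                (tag, " ".join(base_words[i1:i2]), " ".join(witness_words[j1:j2]))
            )
    return changes


# Placeholder witnesses (hypothetical, NOT Chaucer's text).
base_text = "the quick brown fox jumps over the lazy dog"
witness_text = "the quick red fox leaps over the dog"

for kind, in_base, in_witness in collate(base_text, witness_text):
    print(f"{kind:8} base={in_base!r} witness={in_witness!r}")
```

A heatmap like the one above is, roughly speaking, a visual rendering of this kind of change list, with the base text shaded wherever a witness differs from it.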


Innovations 2012

Ever wonder how web-based tools and text-based analysis intersect? Come find out how to analyze literature (and text more broadly understood) using a variety of online tools that have minimal learning curves.  I will introduce you to a manageable number of such tools–Voyant, Mandala browser, ManyEyes, and others–and then we will experiment with  them as a group and independently. This demonstration and workshop will be useful for pedagogical and scholarly purposes.

I encourage you to bring a sample text or corpus you want to work with during the session; it should be in either .TXT format, a .ZIP collection of texts, .XML, or, in some cases, a URL that points to a text you want to work with. I will also bring a selection of texts for us to draw on. Online materials will be located at http://cerosia.org–search for keyword “innovations.” This session derives from material covered in the 2012 University of Victoria Digital Humanities Summer Institute.

Coursepack: Online tools for literary analysis

Plaintext and ZIPped corpora

Links to electronic text collections:

Bamboo DiRT (Digital Research Tools)

Bamboo DiRT is a tool, service, and collection registry of digital research tools for scholarly use. Developed by Project Bamboo, Bamboo DiRT is an evolution of Lisa Spiro’s DiRT wiki and makes it easy for digital humanists and others conducting digital research to find and compare resources ranging from content management systems to music OCR, statistical analysis packages to mindmapping software.

TAPoR (Text Analysis Portal) [TAPoR2 Test Environment–try this link if the first doesn’t work]

TAPoR is a gateway to the tools used in sophisticated text analysis and retrieval. Browse tools by type or tag, search and use tools, read and create tool reviews, contribute and advertise your own tools.

Voyant Tools

Voyant is a web-based text analysis environment. It is designed to be user-friendly, flexible and powerful. Voyant is part of Hermeneuti.ca, a collaborative project to develop and theorize text analysis tools and text analysis rhetoric. This section of the Hermeneuti.ca web site provides information and documentation for users and developers of Voyeur. Note: The original name of the environment was “Voyeur,” which was recently changed given the connotations of “voyeur.” You might see these names used interchangeably. You can also get to Voyant Tools via TAPoR.

IBM ManyEyes

View, discuss, and create data sets and visualizations of data sets using a variety of filters including pie charts, scatterplots, bubble charts, treemaps, word clouds, phrase nets, and more.

Google n-Gram viewer

Read more about the Google n-Gram viewer here. See some sample uses of the n-Gram viewer here. Try it yourself!

Zotero timelines

Make a timeline from your Zotero collections to visualize your research.

Juxta (Collation Software for Scholars): http://juxtacommons.org

Sample visualizations:

Relative word frequencies of five gothic novels (Voyant)
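A relative word frequency is just a word's count divided by the total number of words in the text, which is what makes texts of different lengths comparable. The following is a minimal sketch in Python of that calculation; it is not Voyant's implementation, the tokenizing rule (lowercase letters and apostrophes) is my own simplifying assumption, and the two sample texts are placeholders rather than passages from the novels.

```python
import re
from collections import Counter


def relative_frequencies(text):
    """Map each word to its share of all the words in the text."""
    # Assumption: a "word" is a run of letters or apostrophes, case ignored.
    words = re.findall(r"[a-z']+", text.lower())
    total = len(words)
    if total == 0:
        return {}
    counts = Counter(words)
    return {word: count / total for word, count in counts.items()}


# Placeholder corpus (hypothetical snippets, not real novel text).
corpus = {
    "text_one": "The castle and the ghost",
    "text_two": "A ghost walked the ruined abbey at night",
}

for title, text in corpus.items():
    share = relative_frequencies(text).get("ghost", 0.0)
    print(f"{title}: 'ghost' makes up {share:.1%} of the words")
```

Comparing the share of a word across texts, rather than its raw count, keeps a long novel from dominating a short one.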



ASECS 2012 Proposal: Student-Curated Web Archives and the Practice of Public Scholarship

This is the proposal for my 2012 ASECS talk; I’ll post the full (and very different) piece soon!

“The process of creating sound public knowledge shares a great deal with the knowledge-making procedures in the arts and humanities. These procedures include interpretation, judgment, imagination, and expression…. In this respect, then, the humanities scholars are natural allies for the public…. In strengthening the public sphere, they can shore up their own place in a society that sees little need for them.”

– Noëlle McAfee, “Ways of Knowing: The Humanities and the Public Sphere”

I wanted to open with this quote from Noëlle McAfee’s “Ways of Knowing” because it gets at something central to what we do, I think, as scholars and teachers of literature–and, in many ways, what we do as scholars and teachers of 18th-century literature. If we believe, as John Guillory has shown, that the cultural capital underwritten by English departments today is no longer that of a shared body of knowledge that distinguishes the educated and the elite, but instead that of a set of skills, with writing front and center, then McAfee’s point is even more well-taken. She writes that the “knowledge-making procedures in the…humanities” include “interpretation, judgment, imagination, and expression.” These are remarkably similar to what characterizes the creation of “sound public knowledge.” In both cases, it is not so much a question of what is studied as how it is studied, because the “it” is never completely distinct from the “how.”

Indeed, one of the things I routinely encounter as a teacher of everything from composition to Restoration and 18th-century theater and advanced research methodologies is the desire students have to see subject matter or content as distinct from the form and the structure through which it is represented. By turning students into knowledge-creators, especially public, self-conscious knowledge-creators, we can help overcome this shortsightedness–which is itself a product of an educational system that teaches to the test. Encouraging students to see their work as something that not only exists in and as part of the public sphere, but also itself offers a clear contribution to a scholarly conversation presents one way to transform students into self-conscious knowledge-creators. Technology may pose as many problems as it offers solutions, but with judicious choice and thorough familiarity, some tools can make this transformation less radical and more revelatory.


In “Making Connections: The Humanities, Culture and Community,” part of the findings of the ACLS’s National Task Force on Scholarship and the Public Humanities, James Quay and James Veninga explore the relationship between the humanities, institutions of higher education in the liberal arts tradition, and civic engagement. Considering the radical cultural changes shaping our world today, Quay and Veninga note that the greatest “test of…democracy” is located in “enriching public conversation and extending participation in this conversation to all Americans.” The most central challenge facing higher education today, they find, is overcoming the sense and practice of a divide between academic scholarship in the humanities and public engagement. And yet, this divide is not insurmountable; it is more accurate, and indeed more useful, “to consider scholarship and the public humanities not as two distinct spheres but as parts of a single process, the process of taking private insight, testing it, and turning it into public knowledge.”

This process is most visible when (excuse the generalization) the Ivory Tower meets Joe Public: in a crowded DC museum, in an open, collaboratively-produced web archive like The Hurricane Digital Memory Bank, in a prison performance of The Tempest organized jointly by faculty, students and the incarcerated. In “The Humanities and the Public Soul,” Julie Ellison puts it this way: “The specific importance of public scholarship in the arts and humanities is to provide purposeful social learning, spaces where individuals and groups with ‘trustworthy knowledge’ convene to pursue joint inquiry and invention that produces a concrete result.”

Works Cited

Ellison, Julie. “The Humanities and the Public Soul.” Imagining America: Reports and Resources. 27 July 2011. <http://www.imaginingamerica.org/IApdfs/Ellison.HumanitiesPublicSoul.pdf>.

McAfee, Noëlle. “Ways of Knowing: The Humanities and the Public Sphere.” Standing with the Public: The Humanities and Democratic Practice. Kettering Foundation Press, 1997.

Quay, James, and James Veninga. “Making Connections: The Humanities, Culture and Community.” American Council of Learned Societies. 1990. 27 July 2011. <http://archives.acls.org/op/op11quay.htm>.

Some Omeka Tips and Tools

What is Omeka?

Essentially, Omeka is an open-source and extensible software tool that allows you to create digital archives and collections of resources. For instance, a museum might want to create an accessible web-based repository of some of their collections in a way that makes research (or just more information) about them possible without being physically present in the museum. This archive might include, in addition to a high-quality image of the item, a descriptive essay and other detailed information about the object. A curator might even select a variety of paintings, decorative objects, sculptures, and so on to include in the web archive according to a thematic logic. Conversely, an oral history project might use Omeka to collect, maintain, and make accessible the various audio recordings, videos, and/or transcripts collected as the project continues. Omeka can make these resources into quality primary source materials for scholars, teachers, and students across the globe to work with. You might find this site, from Teaching History, informative–it includes a variety of sample uses for the tool.

Here is a very brief video introducing Omeka, put together by the folks at George Mason’s Roy Rosenzweig Center for History and New Media who created the software.

Omeka can also be a very useful tool to stimulate student collaboration and to dramatize some of the basic methods, practices, and preconditions of scholarship. For me, Omeka is most interesting as a means of teaching scholarly collaboration and research methodologies (especially how information is organized, what that means for conducting research, and how that might help us create our own knowledge more effectively). One of the challenges we often face as teachers of students at all levels is a certain taken-for-grantedness about knowledge–it just “is,” or someone (not really a person, subject to history and ideology) creates it, and I look it up so I can use it in my essay.

Omeka vs. Omeka.net?

There are two versions of Omeka that one can use. The first is less flexible but it has the benefit of not requiring much knowledge on the user’s part. You can create a free account at Omeka.net and create archives from that centralized installation of the platform.

Each site is allocated a certain amount of space, and the process of creating an archive is fairly straightforward. You can invite multiple other users to collaborate with you on the creation of your site–however, those invited users will have to sign up for a free plan with Omeka.net, and the way to move through that is not intuitive. Your student, after clicking the accept-invitation link sent via email, will be taken to a page that requires them to sign up–but nothing on that page indicates whether signing up will connect them as a user to your site.

Newly-invited users will be directed to this page when clicking the emailed link. Note the different levels of service. If possible, you should have your institution reimburse you for a more robust Omeka.net account.

New users will be taken to a page with the sites they’re contributors or creators of, but they’ll have to sign up first. (A free plan allows you to create one site, but I believe you can be a contributor to multiple sites.) Be sure to inform your students to fill out the signup fields responsibly–real names and appropriate usernames only! It may be helpful to encourage students to use their institutional usernames.

This is what your newly-invited user will see after having signed up for a user account with Omeka.net


The second is much more flexible, but it requires the user to download and install Omeka on her own server (it requires supporting resources, like MySQL, PHP, and so on). With an individual installation of Omeka, you can also activate any plugin you would like. There is a robust online community–I can particularly recommend using the hashtag #omeka on twitter–but you will be responsible for maintaining the installation, adding and updating plugins, and making any tweaks to the code that will generate just the site you want the world to see. If something on your server doesn’t work or isn’t configured properly, then your install may not exhibit full functionality. For beginning users, Omeka.net is probably a better way to go.

Here is an excellent example of a public collaborative memory bank created with Omeka, the Hurricane Digital Memory Bank, which preserves personal stories about hurricanes Rita and Katrina.

Here are two of my personal installations of Omeka–one, I used as a trial run in a research methodologies course, and one, I am currently working on as a digital face for my university’s small special collections room. In the future, I plan to revise the research methodologies course around this second project–though I will probably move to an Omeka.net account instead of hosting it on my own server (who has the time to troubleshoot–or get friends to help you out?). Students in the class will ultimately be responsible for slowly populating the archive and making these somewhat rare materials accessible to other students and scholars beyond the walls of our campus, and in the process, also contributing their own voices to an ongoing conversation. For instance, students will be required to craft a researched descriptive essay that becomes part of the resource they create. This kind of process, however, also requires that students learn about simple cataloging processes, metadata, and controlled vocabularies; how to create quality digital facsimile page images; how to create an XML version of the textual resource they’ve chosen; and how to link this resource to existing catalog entries and free-web resources (like ESTC and Google Books). Of course, all this requires time, infrastructure, and either money for hosting, money/credits for student workers, or (more) time to learn how to troubleshoot server issues, PHP weirdness, and so on. Despite my desire to run my own domain, it may be time to admit that I need more help… Hence, the recommendation that, at least in the beginning, we turn to the handy and professional Omeka.net.

Sample Omeka.net Archives

Describing Your Resources: Controlled Vocabularies and Metadata

What is a controlled vocabulary, and why is it important? More detailed information from Wikipedia

What is metadata? According to the National Information Standards Organization,

Metadata is structured information that describes, explains, locates, or otherwise makes it easier to retrieve, use, or manage an information resource. Metadata is often called data about data or information about information.

Omeka uses the Dublin Core Element Set to structure the metadata associated with each resource you create. Here is some documentation about the elements in that metadata set. Some elements have recommended controlled vocabularies associated with them, though some do not.
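To make the idea of an element set concrete, here is a minimal sketch in Python of a Dublin Core record as a simple data structure. The fifteen element names are those of the Dublin Core Element Set; everything else (the function names make_record and missing_elements, the order of the elements, and the sample values) is hypothetical and for illustration only. This is not how Omeka stores its metadata internally.

```python
# The fifteen elements of the Dublin Core Element Set.
# The order here is arbitrary; Omeka's forms may list them differently.
DUBLIN_CORE_ELEMENTS = (
    "Title", "Creator", "Subject", "Description", "Publisher",
    "Contributor", "Date", "Type", "Format", "Identifier",
    "Source", "Language", "Relation", "Coverage", "Rights",
)


def make_record(**fields):
    """Build a record with every element present, blank where no value is given."""
    unknown = set(fields) - set(DUBLIN_CORE_ELEMENTS)
    if unknown:
        raise ValueError(f"Not Dublin Core elements: {sorted(unknown)}")
    return {element: fields.get(element, "") for element in DUBLIN_CORE_ELEMENTS}


def missing_elements(record):
    """List the elements that still have no value."""
    return [e for e in DUBLIN_CORE_ELEMENTS if not record[e].strip()]


# Hypothetical sample item with placeholder values.
record = make_record(Title="Sample pamphlet (placeholder)", Type="Text", Language="en")
print("Still to describe:", ", ".join(missing_elements(record)))
```

A checklist like this can be a handy way for students to see at a glance which parts of a description remain unfinished.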

Omeka also includes the ability to further describe your items with Item Type metadata. These fields are not necessary, and they may sometimes even duplicate or (in the case of page images, for instance) confuse the information in the Dublin Core. But, depending on the nature of your resources, you may find the additional descriptive tools helpful.

Note that there is a plugin for self-hosted installs that allows you to draw on LOC subject headings to help you generate controlled metadata.

Plugins can help you create your metadata using controlled vocabularies. Try Simple Vocab and the LOC Subject Headings, in particular.
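The point of a controlled vocabulary is that a value either belongs to the agreed list or it does not. As a minimal sketch of that check in Python, the example below tests a proposed Type value against the DCMI Type Vocabulary, the list commonly recommended for the Dublin Core Type element. The function name check_term is hypothetical, and this is not how the Simple Vocab or LOC plugins work internally.

```python
# The twelve terms of the DCMI Type Vocabulary.
DCMI_TYPES = {
    "Collection", "Dataset", "Event", "Image", "InteractiveResource",
    "MovingImage", "PhysicalObject", "Service", "Software", "Sound",
    "StillImage", "Text",
}


def check_term(value, vocabulary):
    """Return the vocabulary's official spelling of value, or None if it is not a term."""
    # Compare case-insensitively and ignore stray spaces around the value.
    lookup = {term.lower(): term for term in vocabulary}
    return lookup.get(value.strip().lower())


for proposed in ["stillimage", "Photograph", " Text "]:
    official = check_term(proposed, DCMI_TYPES)
    if official is None:
        print(f"{proposed!r} is not in the vocabulary; choose a listed term")
    else:
        print(f"{proposed!r} -> record it as {official!r}")
```

The same pattern applies to any list of approved terms, such as a short set of subject headings agreed on by a class.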

Let’s build a collection!

  1. Give me your email address. I will add you as a researcher, contributor, or administrator to the test site I’ve set up–contributor status will be more in line with what your students will see and have access to, while administrator status will familiarize you with the interface that lets you build your own site.
  2. Check your email, and click the link to accept the invitation. Create a free Omeka.net account for yourself.
  3. Check your email to activate and log in. You will now have the ability to create records in my collection.
  4. Browse over to Flickr Commons, the VAM, or Vimeo and find a resource that you want to work with. Alternatively, you might choose a website to include in our collection. (If your resource has a specific license, be sure to abide by that copyright license completely.)
  5. Keep this page open in one tab, so you can work with the information it provides. Download the image or video to your desktop, or copy the URL of your website.
  6. Create a record in Omeka.net for your resource. Work through the Dublin Core element set to add metadata about your resource. You might want to take a look at the links above for sample vocabularies.
  7. Be sure to save your work periodically! Omeka.net will time out and you can lose all your work.
  8. When you have finished a draft of your resource, let me know. I will make it public, and we’ll take a look at what we’ve created together.


How might you imagine using Omeka in the classroom? For your own scholarship or research?

Electronic workshopping with google docs

In the past several years, I’ve tried many, many different workshop methodologies–the full class single-paper workshop one day, followed by small-group workshops the next; round-robin workshops; lightning critiques; the simple exchange/read/comment; send your draft to a peer through email and use Word to comment/merge; in-class electronic workshopping with a peer; in-class polishing at the computer lab, and more. The list seems to be endless, and it feels as though none of them have really worked. This term, I’m trying google docs. I don’t really know why I’d not tried it before for workshopping; I’ve used it for other kinds of collaborative writing, like class creation of course policies, group notetaking, and so on, but I’ve not tried it for workshopping per se.

Tonight, we’ll see how it works in a small class. Students were asked to bring their laptops and an electronic document, which we’ll post to a shared folder I’ve created (this could be done prior to class, and I did encourage students to do so, but I am banking on there being stragglers!) and convert to an editable format. The major questions students raised as concerns in the previous class included 1.) accuracy of theoretical understanding, and 2.) the usefulness of the through narrative they’re to construct for the assignment.

I’ll have students spend a little time crafting questions specific to their own essays at the head of their draft, and adding two or three questions at specific points throughout–this has the added benefit of allowing us to assess any difficulties with the technology. Then, students will work with three other essays in turn to address the extent to which each fulfills the goals of the assignment, in particular by making at least six positive suggestions for changing or refining the content. This can involve suggesting:

  • a quote or a paraphrase,
  • a logical connection,
  • an alternative formulation,
  • a re-organization,
  • a transition, and so on.

Students can reply to or otherwise comment on other reviewers’ comments, as well. Then, I’d like to have students spend a little time crafting a final comment regarding style–in particular, what writing habits did you notice that the author might want to examine during the revision process? What citation habits might the author want to revise? The goal with these specific tasks is to limit and structure the kind of comments peer reviewers can make. By the end of the workshop, each draft should have three reviewers’ comments–students will have to look at the comments to assess whether that draft needs another pair of eyes.

The next step will be downloading the draft as a Word document–this should retain the comments. The essay’s shape will have to be revised, as well as the content, because the upload/comment/download process will strip some of the overall formatting.

Has anyone else used a shared google docs folder for workshopping? I’d love to hear about your experiences!