All posts by thowe

Aphra Behn Society 2015: Wikipedia Workshop


I love going to (to me) new conferences–not only do I get to learn about exciting work in the field, but I also get to meet new people and, ideally, expand my collection of “regularly-attended.” What strikes me most about the Aphra Behn Society is its collegiality, its openness to and mentorship of graduate student work, and the palpable sense of feminist solidarity practically oozing from each session. Very happy-making!

Despite the fact that I am in desperate need of sleep and time to catch up on mounds of grading–and as a result not doing the friendly-joining-thing I should be doing–I am excited to be here and am most definitely planning on returning. ABS reminds me a bit of the EC-ASECS regional conference in terms of general tone, though the crowd here is rather different–I don’t see too many overlapping faces. I may have two new conferences to make a habit of!

Tomorrow, I’m co-leading a workshop with Laura Runge on using Wikipedia in the classroom. My EN340 course this term, Major Women Writers, is doing a Wikipedia project for one of the novels I’ve assigned–Charlotte Lennox’s Henrietta. It seems to have proven (surprisingly, for me!) a lot more challenging than expected–students were very confused, in general, by the way the first book is essentially Henrietta telling her story to Miss Woodby. Coupled, of course, with the fact that there is no Wikipedia entry on the novel, and my students are rather struggling!

A bit of background on the class–it’s an LT-2 Liberal Arts Core course (advanced literature), and it’s also Writing Intensive. This means I have at most one major, and this term, none–challenging, but it does free me up to do all sorts of experimental things. My goal is, at root, twofold: 1.) get at least a handful of students excited about reading something from “back in the day” that apparently has absolutely no (right?) relevance for the modern world, and 2.) hopefully instill a sense of curiosity about writing done in public. I’ve designed a project organized around Wikipedia, since I know most of my students use it as a crib-sheet of sorts–I routinely see the pages on Fantomina, “The Reformed Coquet,” and Evelina up on their laptops during discussion, and so many were frustrated by the lack of readily available information on Henrietta. What better way to instill a bit of healthy skepticism about their sources, while encouraging students to help others in their same situation, while modeling the kind of DIY practices that I believe are essential to being a well-rounded citizen of the world, while also engaging students in just the kind of real-world writing that frequently goes unnoticed as writing? Enter: Wikipedia.

I’m presenting tomorrow at 1:45. Hope you can make it! The project details–from assignment to homework to groupwork–are all available in PDF, here. But, for simplicity’s sake, I’m also posting below an overview of how I structure the project from pre-writing to submission:

Charlotte Lennox’s novel Henrietta does not have a Wikipedia page, by which I’m sure you’re all distressed! So, let’s help out future students by creating one. This is a full-class project. See the entire assignment sheet on Canvas.

Pre-project work:

  1. Homework: Complete the Wikipedia Training for Students tutorial:
    1. Create your account
    2. Explore your sandbox
    3. Create your user information page / biographical sketch
    4. Submit your User Contributions URL via Canvas
  2. In-class: Team Wikipedia Quiz / Go over assignment
    1. In-class: (Activity A) How to recognize original research, point of view problems
    2. In-class: Structure/content: Look at “Fantomina” and Evelina pages–what is included?
  3. Homework: Write 1 paragraph for each part of our hypothetical page, upload to Canvas
    1. summary
    2. key characters
    3. theme
    4. style
    5. overview
  4. Homework: Canvas discussion board research post: Find 1 scholarly or biographical source (no overlaps!), download it to your computer. Read it, and post to Canvas discussion board:
    1. bibliographic entry
    2. upload the source
    3. write 1 paragraph overview/summary of the source
  5. In-class: Wikipedia Workshop/Activity B
  6. Homework: Revise in groups
  7. In-class: Lab revision time; add a source; add an image; add a template note
  8. Homework: Revise in (different) groups; sources, content, and writing
  9. In-class: Lab revision time
  10. Homework: Revise
  11. Due: Your user contributions page URL to Canvas

New OS, new eXist install

Hankering for a change of some sort–though goodness knows why I think I need even more of that in my life, but that’s an existential question best left offline–I decided recently to ditch Windows and commit fully to Ubuntu. I had 11.04 on a computer I didn’t really use all that often, sort of an excuse to continue using Windows, but I don’t know. I guess I just wanted to dive in. Also, I had run into some (more!) problems with GitHub that involved not being able to reconcile branches–I can’t even talk about it. So, I took the coward’s way out and just started from scratch.

Truthfully, I’m always astonished at how much easier things are when you have a general level of literacy about how to find good information.

I’m happy to say, everything so far has gone smoothly–my clunky old ThinkPad T420 has been de-Fenestrated; I’ve got a nice, shiny new install of 15.04 (Vivid Vervet), and yesterday I turned my laptop into a server with XAMPP. I had to install Java 8, which was a bit dicey, but an Internet connection and mad Google skillz prevailed. My goal for today was to get eXist-db 3.0 up and running, which was pretty straightforward, especially after having received some help a couple of months ago from @joewiz. The documentation is wonderfully useful, overall.

But then I thought I’d restore from a backup I’d made, without fully reading that wonderful documentation, and ran into some troubles–notably, a namespace exception:

err:XPST0081 error found while loading module restxq: Error while loading module modules/restxq.xql: No namespace defined for prefix text [at line 271, column 40]

I tried doing the repairs using the xQuery advice given here, from the Java admin client, but then I discovered that the Java 8 security settings were too high and I couldn’t run it–so, I had to add an exception, which was a little confusing, as the address I expected, “localhost”, didn’t work. Turns out it was “http://localhost:8080”, but the address was hidden beneath another window. Anyway, that done, I could open the Java admin client, open a connection, and then TOOLS>QUERY to run the xQueries on the server that would clean it up:

xquery version "3.0";
import module namespace repair="" 
at "resource:org/exist/xquery/modules/expathrepo/repair.xql";

Of course, this didn’t work–I was still getting the error from earlier. So, instead of doing more research, which I should have done, I sent an email to the eXist-db forums; just afterwards, I found the answer to my (and other people’s) question about the start error, here and then from there, to Wolfgang’s fix. Ta-daaaa! The power of other people, I tell you.

Now that that’s done, I can see my restored app on my new localhost install, and that makes me very happy indeed. Now, to continue wrangling with GitHub, upgrading my TEI, and adding functionality to the project!

UPDATE: While I could see my app in the dashboard, and I could run the app, I was unable to use eXide or collections to access files in the app or update them–and everything I did get opened in eXide was opened read-only. It looked as if my db was empty:


The eXist documentation package was also not working; this is what appeared when I tried to open fundocs from the dashboard:

It turned out to be a problem easily fixed–installed packages all needed to be updated via the package manager, and then the dashboard and shared resources repository needed to be re-connected and updated. The xQuery to do that is here, via Wolfgang:

repo:install-and-deploy("", ""),
repo:install-and-deploy("", "")

Now, all is right with the world! Well, with the world on my laptop.


How-to guide for basic collaborating in GitHub

The summer is upon us, and I have had the great good fortune of a grant to be able to work with three IT students from Marymount on the Novels in Context project. We are all learning about how to collaborate using GitHub for version control, and I found myself in the exciting position of having to teach others what I am also teaching myself. So, I wanted to give a very clear step-by-step, which I hope others find useful. I’ll update the document with information on pull requests when that happens. For the time being, here’s the how-to!

CEA 2015 Remarks

CEA 2015, Indianapolis, IN
March 28, 2015

Novels in Context: A TEI Database for Teachers, Students, and Scholars

In Fall 2012, I was teaching my “survey” of World Literature (1500-1800… I know!), a class that I typically teach once an academic year. I use the Norton Anthology, Volumes C and D, because they have a good variety of the materials I like to include, accessible headnotes and introductions, and pretty good footnotes for some of the less familiar and more global literary representatives. Because I also, like many of us here, I expect, teach a variety of other classes with little consistency year to year, I have limited, if any, time to make changes to my syllabi, particularly when they have been working well–as this class had been. In Spring of 2012, Norton released the 3rd edition of their anthology–a new edition that had new materials included, as well as, mind-bogglingly, different translations for some of the pieces that were retained. While I understand the need to offer current materials that represent the best practices in college literature surveys, the demands of the job make this kind of planned obsolescence, to use Kathleen Fitzpatrick’s useful term, especially frustrating. What if we could do better for faculty, and in the process, open the pedagogical experience up to incorporate something of the historical materiality of the texts we’re working with? What if we could also engage students in tightly-organized projects that are of real scholarly use?

Imagine a classroom where harried faculty no longer had to plan lectures, class activities, and assignments around textbook materials that may change or even disappear, and which many students never end up purchasing at all; or, where we no longer had to scrounge around the web for free, quality resources that students may or may not print out and bring to class. Imagine going to a reputable site where you can select from a range of primary source materials that include facsimile page images, relevant and recent headnotes, and even reading questions; a scenario in which you can easily publish that subset to a personal website, or even save them to a single PDF document that can, itself, be posted to a course site or even made available for the cost of the copying in a campus bookstore. Or, imagine how a project that involves students and faculty in creating it might transform an upper-level class. These scenarios are what I imagine for the project I want to share with you today.

Novels in Context: A TEI Database for Teachers, Students, and Scholars (NiC) is a project that links my interests in the digital humanities and eighteenth-century studies. While the scenarios I drew for you earlier are based on a future iteration of the project, I hope today to show you where I’m starting and talk a little about what I want to end up with. I was lucky enough to receive a sabbatical grant to begin work on the Novels in Context project last term, and that gave me time to research a variety of open-source database applications that worked with XML files; it also gave me time to learn the basics of xQuery, the database query language that interacts with the XML files. I spent time thinking about how the documents would be marked up, and what the costs and benefits would be of choosing either interpretive or descriptive markup. I went to the Library of Congress and got in touch with some other special collections libraries to test out the process by which I could acquire page images of early sources to include in the database, and investigated how to link them into the database for web-accessible display. I wasn’t able to do more than scratch the surface of the project as a whole, but this did give me the time and space to start, as well as figure out where the project could go.

So, essentially, Novels in Context is a collaborative digital scholarly project with pedagogical significance that seeks to provide an agile and web-accessible alternative to 1.) the costly and proprietary Eighteenth-Century Collections Online, 2.) the single print anthology on the subject, and 3.) the free but scattered and often unreliable resources available on the Internet. The project will provide a free and open electronic subset of primary source materials focused on the history of the novel in English that is highly curated, extensible, fully indexed, searchable, and accessible. Here, you can see that the basic setup of the app includes a basic user interface that contextualizes the project, offers a full-text search, and displays in list form the materials currently in the database. The files in the database right now are TEI-formatted XML documents–I’ll show you what they look like in raw form a bit later and talk about my markup choices. When you click into a document, you’ll see an excerpt of the full-text, page images of the excerpt (if it is an excerpt–this essay by Johnson is the whole thing, because it’s brief) that has been selected for inclusion because of its significance, and below, some information about the material object and its various online iterations–plus links to the ESTC and 18th Century Book Tracker.

This is all pretty basic and easy enough to do, though for someone unfamiliar with XQL, there was a bit of a learning curve. I first set the project up as a local installation on my laptop, where it runs on Apache, and I used the sample database of Shakespeare’s works as a model. After I became more confident, I posted all the code online, to GitHub, an application that offers easy ways to version control, collaborate, and publish the code. I met with one of the developers of eXist for coffee one day, and he explained quite a bit of the process to me, which helped greatly (shout out to Joe Wicentowski!); the eXist online developer and user forums are also essential resources for a beginner. Anyone here today can install eXist, download my app from GitHub, and install it on their own server–I wanted to do this because I think it’s important that literature out of copyright be freely available in the most usable, useful form, and anything I can do to enhance our access is in the realm of good. I also wanted to make it available like this because I hope to work with students in IT this summer to further refine and develop the project in ways I’ll describe later.

Now I’ll talk for a little bit about the XML markup and why I chose to do it this way, as opposed to some other way. By the way: how many of you are familiar with the terms I’m using–particularly XML and TEI?

If not many, explain. If lots, skip. Probably, there’ll be few familiar with these terms?

XML stands for eXtensible Markup Language–it looks a lot like HTML, where text is “tagged” or “marked up” so that your browser can interpret it. The difference between XML and HTML is that XML is extensible–that is, you can add to it, and create new tags and markup depending on your needs; you can create your own schemas that your application can interpret however you ask it to interpret. XML can be used by businesses to keep track of inventory and display that inventory dynamically on the web–but in my experience, it’s most often used by scholars to make texts machine-readable. The Women Writers Project, for instance, uses XML–there are many examples available, and you can use Google to search for more. It is often used to describe manuscript materials, and there are ways to use it to make other media, like musical scores or videos, readable to machines, too. The idea is that, in the absence of the thing itself, this document can achieve a kind of descriptive exactitude that makes the absent object visible. Depending on the tool you use to “read” or “display” the text, you can get quite detailed. TEI refers to the Text Encoding Initiative, a group of scholars and coders who have created a standard for describing things like manuscripts, early printed matter, and so on. So, TEI is the standard of the XML I’m using–there’s a specific set of tags and syntax that all scholars doing this sort of work would use–you can imagine why that would be necessary.
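To make “extensible” a little more concrete, here is a minimal sketch in Python (not the eXist/xQuery stack the project itself runs on). The tag names and the sample text are invented purely for illustration; they are not TEI and not taken from the project. The point is only that once text is tagged, a program can ask for specific pieces by name.

```python
import xml.etree.ElementTree as ET

# Hypothetical markup: these tag names are made up for this example.
# Nothing in XML itself defines <letter> or <recipient>; we chose them.
SAMPLE = """
<letter>
  <salutation>Dear <recipient>Example Reader</recipient>,</salutation>
  <line>I have a long story to tell you.</line>
</letter>
"""

root = ET.fromstring(SAMPLE)

# Every element in document order, including the root.
tags = [element.tag for element in root.iter()]

# Because the recipient is tagged, we can retrieve it directly
# instead of guessing where it sits in the sentence.
recipient = root.findtext("salutation/recipient")

print(tags)
print(recipient)
```

A standard like TEI replaces ad hoc names such as these with an agreed vocabulary, so that different projects and tools can read one another’s files.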

If we look at this XML file, you can see what I’ve done. I’ve got a header here that describes the publication history of the object–not only the particular text I’m working with, but also the page images I’m working with and its other available electronic iterations, as well. My header also contains information about this electronic version of the text–who created it, who did what part of the creation, and so on. The pedagogical applications of this I think are clear and straightforward, but often overlooked; it is meant to locate the object in time and space, giving it a local habitation and a name, as it were. Too often, students turn without awareness of context to “the web” writ large, not knowing much of anything about the specific thing they’re looking at, much less its provenance, who edited it, and so on. This contributes to, and even embodies, the rootlessness of our students’ reading and research skills. While anthologies can seek to help craft a narrative by putting texts in chronological or thematic order, it is not a given that our students grasp that narrative clearly. If students were to, in collaboration with a faculty mentor, participate in creating such XML documents, they would be required to spend the time putting the text back into its historical and material context–and also begin to understand the relationship between the unmoored “web version” they’re so used to reading, a contemporary student edition like an Oxford edition, and the historical printings or even manuscripts that those editions draw on. This version becomes a point in a line as clearly defined as the context requires.
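For anyone reading along without the slide, a header of this kind might look roughly like the sketch below. It is an assumption-laden illustration rather than the project’s actual file: the element names (teiHeader, fileDesc, titleStmt, respStmt, sourceDesc, bibl) and the namespace are standard TEI, but every value is a placeholder, and the helper function describe_provenance is hypothetical. The Python is only there to show that the “who, what, when” of a text becomes something a program can read back out.

```python
import xml.etree.ElementTree as ET

# The TEI namespace is real; all values in SAMPLE_HEADER are placeholders.
TEI_NS = {"tei": "http://www.tei-c.org/ns/1.0"}

SAMPLE_HEADER = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Sample Essay: an electronic edition</title>
        <respStmt>
          <resp>Encoded by</resp>
          <name>Example Encoder</name>
        </respStmt>
      </titleStmt>
      <publicationStmt>
        <p>Placeholder publication statement.</p>
      </publicationStmt>
      <sourceDesc>
        <bibl>
          <author>Example Author</author>
          <title>Sample Essay</title>
          <date when="1750">1750</date>
        </bibl>
      </sourceDesc>
    </fileDesc>
  </teiHeader>
</TEI>
"""


def describe_provenance(tei_xml):
    """Hypothetical helper: summarize who made the file and what it is based on."""
    root = ET.fromstring(tei_xml)
    file_desc = root.find("tei:teiHeader/tei:fileDesc", TEI_NS)
    source = file_desc.find("tei:sourceDesc/tei:bibl", TEI_NS)
    date = source.find("tei:date", TEI_NS)
    # Each respStmt pairs a role ("Encoded by") with a name.
    responsibilities = [
        (
            stmt.findtext("tei:resp", namespaces=TEI_NS),
            stmt.findtext("tei:name", namespaces=TEI_NS),
        )
        for stmt in file_desc.findall("tei:titleStmt/tei:respStmt", TEI_NS)
    ]
    return {
        "edition_title": file_desc.findtext("tei:titleStmt/tei:title", namespaces=TEI_NS),
        "responsibilities": responsibilities,
        "source_author": source.findtext("tei:author", namespaces=TEI_NS),
        "source_title": source.findtext("tei:title", namespaces=TEI_NS),
        "source_date": date.get("when") if date is not None else None,
    }


info = describe_provenance(SAMPLE_HEADER)
for key, value in info.items():
    print(key, "=", value)
```

The electronic edition (titleStmt) and the printed source it derives from (sourceDesc) are recorded separately, which is the distinction between “web version” and historical printing that students would have to work out when building such a file.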

I won’t go into detail about a lot of the other aspects of the XML here, but I do want to describe the evolution of my thought regarding conceptual markup or annotation. When the project first began, I wanted to identify specific places in the text where a key theme or topic was evident–for instance, a reference to formal realism, or a reference to exemplary characters for imitation, or to the dangers of reading, or to the relationship between novels and the romance tradition. However, I quickly realized that one reader could very easily locate multiple topics or themes in a single span of text–this would make standardizing the database very challenging, and an illogical markup makes for problematic data. A full-text search of the database for keywords would ultimately be just as effective. And finally, I also worried about imposing a reading on the text that students should really discover for themselves. This led me to the way I’m currently conceptualizing the markup, which is more descriptive and clarifying–for instance, defining important and unfamiliar words, identifying allusions, clarifying a reference, or tagging people and dates to create the basis of a network that may become more useful for the application later.
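As a rough illustration of what “tagging people and dates” can make possible, here is a small Python sketch. It assumes, hypothetically, that people are marked with TEI’s persName element carrying a ref attribute and that dates use date with a when attribute; both are standard TEI, but the sample text, the identifiers, and the function index_people_and_dates are invented for this example and do not come from the project’s files.

```python
from collections import Counter
import xml.etree.ElementTree as ET

TEI = "{http://www.tei-c.org/ns/1.0}"  # real TEI namespace, in ElementTree form

# Placeholder text and identifiers, invented for illustration only.
SAMPLE_BODY = """
<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <text>
    <body>
      <p><persName ref="#person1">First Person</persName> wrote to
         <persName ref="#person2">Second Person</persName> in
         <date when="1750-03-31">March 1750</date>.</p>
      <p><persName ref="#person1">First Person</persName> replied.</p>
    </body>
  </text>
</TEI>
"""


def index_people_and_dates(tei_xml):
    """Hypothetical helper: count tagged people and collect tagged dates."""
    root = ET.fromstring(tei_xml)
    mentions = Counter()
    names = {}
    for element in root.iter(TEI + "persName"):
        key = element.get("ref")
        mentions[key] += 1
        # Remember the first spelling seen for each identifier.
        names.setdefault(key, "".join(element.itertext()).strip())
    dates = sorted(
        {element.get("when") for element in root.iter(TEI + "date") if element.get("when")}
    )
    return mentions, names, dates


mentions, names, dates = index_people_and_dates(SAMPLE_BODY)
for key, count in mentions.items():
    print(names[key], key, count)
print(dates)
```

Because this kind of markup records things rather than interpretations, two encoders should arrive at much the same tags, and the resulting counts could later feed the sort of network mentioned above.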

I began thinking of the data less as a collection of topics and more as a collection of things. And because of this, I realized that student contributions could very easily be of real use to the project. Many digital pedagogical projects have short life spans, or are significant purely in the doing of them. Student labor in the digital humanities is often elided, sometimes for very good reasons–the contributions are exercises in pursuit of a specific pedagogical goal, and often that goal is removed from the larger scholarly conversation. We encourage students to see themselves as a part of an ongoing conversation, but we don’t expect them–somewhat myopically–actually to contribute some new understanding of the history of the novel, for instance, or even a new interpretation of this particular text. Sometimes student work is less than polished, and it could be more informed in its research–and that’s okay, it’s the nature of the exercise to be a learning experience. However, what if students could contribute meaningfully? Students can consult library and Internet materials to identify an allusion, or the birth and death dates of a person referenced, and they can identify the structural parts of a text (title, subtitle, page, paragraph, verse paragraph, and so on). Students can go to a special collections library and take 400dpi, margin-clear photographs (often on their phones!) of the page images of a first edition, and if it’s not the first edition, they can identify which edition it is. They can work with each other and with their faculty to draft headnotes, or reading questions. And what if their XML documents of specific texts or excerpts, headnotes or reading questions, could be submitted to an editorial board that vetted them for accuracy, style, and completeness, to be re-used by you, or you, or you, the next time you’re teaching your class? A clear style sheet would be useful here, of course, but the idea is not far-fetched. Indeed, I think it has real implications for pedagogy, scholarship, and scholarly publication.

Which leads me to the final portion of my presentation, on future directions. Over the summer, I’m hoping to get a grant to work with three IT students to flesh the project out–I’d like to be able to create a user registration process, whereby individuals can submit XML documents plus page images, for instance. This will require a clearly defined workflow so that an editorial board can identify which documents can be published, which rejected, or which published with revision; I also want to provide a way to display the variety of headnotes and reading questions–perhaps organized by general theme or topic–and make them available, with the documents themselves, for exporting to another source as a coursepack or anthology. Because the project subscribes to an open culture license, these materials will be freely reproducible and distributable. For a nominal fee, faculty could have them bound and available for purchase in the campus bookstore, and/or made available in PDF.

So, these are my next steps–however, I’m in the process right now of submitting the grant application, and I’m learning a lot there, too. In particular, the IT students, I’m realizing, don’t have a clear awareness of digital humanities as a growing field of opportunity, the concept of a free and open source ethos, or important concepts like version control. Coursework in our IT departments focuses on subjects very different–this will be a challenge for the project, but also a real opportunity.

Ultimately, my goal is to make the program and platform itself available for public reuse, with any kind of content. With faculty buy-in, it can be a real alternative to pricey and frequently revised print anthologies by Norton or Bedford. Students and faculty can work together to contribute to a growing, scholarly, and free collection of primary resources that will be a real asset to students and teachers of this important development in literary history. Finally, by actively engaging students from Information Technology, I hope that this project will provide a model for future interdisciplinary collaboration, mentorship, and even publication.

We face a significant challenge, as teachers and scholars, of making distant material relevant for students raised in an environment of standardized, workforce-oriented learning. Part of our work as teachers consists in our attempts to combat this sense of irrelevance and standardization by adopting habits of active, project-based learning, many of which involve students in our own research agendas. The primary source materials in Novels in Context are not only useful for students of eighteenth-century letters, but their presentation also offers a window into the material history of novel reading and publishing. By building the resource in compliance with the Free and Open Source ethos and by incorporating student-authored markup among the scholarly contributions to the database, I hope the project will become a public site that makes vivid the active production of knowledge, both in history and into the digital realm. As such, I intend it to offer an interrogation of current modes of scholarly publication as well as textbook, anthology, and coursepack production. Current habits of scholarly publication are emphatically not open to the public, but secured behind institutional paywalls, and typically exclusionary in both content and form; similarly, the costs and methods of textbook production are, by many accounts, more burdensome than enabling for student learning. The future of publishing, the work of learning, and the demands of public discourse are changing, and as teachers and scholars, part of our charge is to ensure that these changes benefit our students’ intellectual, ethical, and civic growth.