My Chaucer concordance (or Chaucer Concordance, if you want a name for it) began a year ago as a little tool for some work I was doing on Troilus and Criseyde. I put it online and thought nothing more of it until I noticed a few months ago that it was the first result returned by a Google search for "chaucer concordance".
So as not to disappoint visitors, I have since added most of Chaucer's extant works and made major performance enhancements.
The texts were cleaned, normalized to an extent (see below), and indexed with Python. The indices are stored on this server as JSON documents, which are loaded asynchronously by the browser as they are needed. The frontend was built in JavaScript and uses jQuery to handle user interactions and AJAX requests.
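As a rough illustration of the indexing step, here is a minimal sketch of building a word-to-line-number index in Python and serializing it as JSON. It is not the tool's actual code: the function name build_index, the tokenizing pattern, the two sample lines, and the shape of the index are all assumptions made for the example.

```python
import json
import re
from collections import defaultdict

# Hypothetical tokenizer: runs of ASCII letters only. The real tool's
# cleaning and normalization rules are not shown here.
TOKEN_RE = re.compile(r"[a-z]+")


def build_index(lines):
    """Map each lowercased word to the line numbers on which it occurs."""
    index = defaultdict(list)
    for line_no, line in enumerate(lines):
        for word in TOKEN_RE.findall(line.lower()):
            postings = index[word]
            # Record each line at most once per word.
            if not postings or postings[-1] != line_no:
                postings.append(line_no)
    return dict(index)


# Two sample lines standing in for a cleaned text (illustrative only).
sample_lines = [
    "Whan that Aprille with his shoures soote",
    "The droghte of March hath perced to the roote",
]

index = build_index(sample_lines)

# Serialize to a JSON document that a browser could fetch on demand.
index_json = json.dumps(index, sort_keys=True)
```

Keeping the index as a flat mapping from word to line numbers means the serialized document can be consumed directly as a plain object once it has been parsed on the client.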
For performance, I have designed the tool to run without server-side logic; once the indices have been loaded, the tool can essentially run offline. This makes the tool perforce open-source, but it is sparsely documented and somewhat haphazardly organized, so its source code may be of limited use.
My chief hope for the tool, besides that it be of some use to students of Chaucer, is that it serve as a demonstration of the text-processing tasks that can now be accomplished by the browser. A full index of a modest corpus can be quickly transferred to a client's browser, which can then search the corpus much more quickly than it could submit queries to a server and wait for its response. The indices and texts used by this tool come in at less than five megabytes; a far larger corpus could be handled in this way without incurring any more network strain than streaming a video or downloading a photo album. Only the largest corpora really demand a central server to run search queries; a corpus containing even quite a few authors or works can be conveniently searched and analyzed client-side.
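To make the client-side idea concrete, here is a small sketch of what a lookup amounts to once an index has been transferred: parse the JSON, look the word up, and pull the matching lines. The tool itself does this in JavaScript in the browser; the sketch is in Python only for illustration, and the function name concordance, the tiny index, and the sample lines are assumptions rather than the tool's real data.

```python
import json

# Sample lines standing in for a loaded text (illustrative only).
lines = [
    "Whan that Aprille with his shoures soote",
    "The droghte of March hath perced to the roote",
]

# A tiny, hypothetical index document of the kind a client might download.
index_json = '{"roote": [1], "the": [1], "whan": [0]}'
index = json.loads(index_json)


def concordance(word, index, lines):
    """Return (line_number, line_text) pairs for every line containing word."""
    return [(n, lines[n]) for n in index.get(word.lower(), [])]


hits = concordance("Roote", index, lines)
misses = concordance("absent", index, lines)
```

Each query is a single dictionary lookup followed by a few list accesses, with no network round trip, which is where the speed advantage over a server query comes from.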
That said, this particular concordance demands that I note several caveats:
If you have any comments or questions, feel free to email me.
You might also like my search interface for the University of Michigan's online Middle English Dictionary.
© 2017 Henry Litwhiler