Computer Tools and Applications - Sommersemester 2000 - Anglistik

Working with 'corpora' without markup

Course Home Page

 

A range of tools can take straightforward text files and present differing views of them. These tools give us a better understanding of the patterns and regularities that occur in a text. They are often based on 'words': that is, we can see how many different word forms (types) occur in a text, how many running words (tokens) it contains, and so how often each type is used.
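
The type/token distinction can be illustrated with a short Python sketch. This is not how Wordsmith or Tatoe work internally; the function name `type_token_counts`, the sample sentence, and the deliberately naive tokenizer (lowercase the text, then keep runs of letters and apostrophes) are all assumptions made for illustration.

```python
import re
from collections import Counter


def type_token_counts(text):
    """Return (number of types, number of tokens, frequency table) for a plain text.

    Tokenization is an assumption for this sketch: the text is lowercased
    and every run of letters or apostrophes counts as one word.
    """
    tokens = re.findall(r"[a-z']+", text.lower())
    freq = Counter(tokens)          # maps each type to its number of tokens
    return len(freq), len(tokens), freq


# Hypothetical sample text, not taken from any course corpus.
sample = "The cat sat on the mat. The dog sat too."
types, tokens, freq = type_token_counts(sample)

print("types:", types)      # distinct word forms
print("tokens:", tokens)    # running words
print("'the' occurs", freq["the"], "times")
```

Here "the" is one type realised by three tokens. Real tools make further decisions that this sketch ignores, such as how to treat hyphens, numbers, and capitalisation.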

We will look at two such tools: one well established and almost a standard for examining corpora and text archives, and one very new and experimental.

One of the most established tools is Wordsmith. This has been under development by Mike Scott of Liverpool University for many years and is marketed by Oxford University Press. The experimental tool that we will look at is called Tatoe. Tatoe stands for "Text analysis tool with object encoding" and is under development by Melina Alexa and Lothar Rostek at the Darmstadt institute of the German Center for Information Technology. We will see that the functionality of these tools overlaps, but that they are intended for rather different uses.

Step-by-step instructions for the lab session for Wordsmith

Step-by-step instructions for a lab session for Tatoe

 
