Assignment 16th of Feb. 2012

Read the introduction to XSLT:

D. Tidwell (2001) XSLT. O’Reilly Media.
B. DuCharme (2001) XSLT Quickly. Manning Publications Co.

W3C XSLT page


Assignment 9th of Feb. 2012

The assignment is: Choose another novel from Project Gutenberg (a new one!), download it, and export it to TEI XML. For an apparent two-word collocation, idiom, or named entity, calculate the:
- log-likelihood ratio
- Mutual Information
- Chi-square (χ²) test
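
As a rough illustration of the three measures, here is a minimal Python sketch that works from four counts: the bigram frequency, the two word frequencies, and the corpus size. The function names are my own, the mutual information is computed as pointwise mutual information, and the log-likelihood ratio is computed in its G² form over the 2x2 contingency table; treat it as one possible way to set up the calculation, not as the required solution.

```python
import math

def contingency(c12, c1, c2, n):
    """Observed 2x2 table for the bigram (w1, w2).

    c12 = count of the bigram, c1 = count of w1, c2 = count of w2,
    n = corpus size in tokens (used as an approximation of the bigram total).
    """
    o11 = c12                 # w1 followed by w2
    o12 = c1 - c12            # w1 followed by something else
    o21 = c2 - c12            # w2 preceded by something else
    o22 = n - c1 - c2 + c12   # neither w1 first nor w2 second
    return o11, o12, o21, o22

def pointwise_mi(c12, c1, c2, n):
    """log2 of observed bigram probability over the probability expected under independence."""
    return math.log2(c12 * n / (c1 * c2))

def chi_square(c12, c1, c2, n):
    """Pearson's chi-square statistic for a 2x2 table (1 degree of freedom)."""
    o11, o12, o21, o22 = contingency(c12, c1, c2, n)
    numerator = n * (o11 * o22 - o12 * o21) ** 2
    denominator = (o11 + o12) * (o11 + o21) * (o12 + o22) * (o21 + o22)
    return numerator / denominator

def log_likelihood(c12, c1, c2, n):
    """Log-likelihood ratio as G^2 = 2 * sum(O * ln(O / E)) over the four cells."""
    observed = contingency(c12, c1, c2, n)
    rows = (c1, n - c1)
    cols = (c2, n - c2)
    expected = (rows[0] * cols[0] / n, rows[0] * cols[1] / n,
                rows[1] * cols[0] / n, rows[1] * cols[1] / n)
    # Cells with an observed count of 0 contribute nothing to the sum.
    return 2 * sum(o * math.log(o / e) for o, e in zip(observed, expected) if o > 0)

if __name__ == "__main__":
    # Made-up counts, for illustration only.
    c12, c1, c2, n = 10, 10, 10, 100
    print(pointwise_mi(c12, c1, c2, n))
    print(chi_square(c12, c1, c2, n))
    print(log_likelihood(c12, c1, c2, n))
```

Both the chi-square and the G² value can be compared against the chi-square distribution with one degree of freedom (critical value 3.841 at the 0.05 level).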

Assignment 2nd of Feb. 2012

Evaluate the assumption that “young king” is a collocation in the novel A House of Pomegranates.

See Manning and Schütze (1999) for some more details.
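
Any such evaluation starts from raw counts: how often the bigram occurs, how often each word occurs, and how many tokens the text has. The sketch below shows one way to get these in Python. The tokenisation (lowercasing, letters and apostrophes only) is a simplifying assumption, the function name is hypothetical, and the sample sentence is invented rather than quoted from the book.

```python
import re
from collections import Counter

def bigram_counts(text, w1, w2):
    """Return (bigram count, count of w1, count of w2, number of tokens)."""
    # Simplistic tokeniser: lowercase everything, keep runs of letters and apostrophes.
    tokens = re.findall(r"[a-z']+", text.lower())
    unigrams = Counter(tokens)
    bigrams = Counter(zip(tokens, tokens[1:]))  # adjacent token pairs
    return bigrams[(w1, w2)], unigrams[w1], unigrams[w2], len(tokens)

if __name__ == "__main__":
    # Invented sample text; for the assignment, read the plain text of the book instead.
    sample = ("The young King was alone. The young King slept, "
              "and the old man saw the king.")
    print(bigram_counts(sample, "young", "king"))
```

Strictly speaking a text of N tokens contains N - 1 bigrams; for a whole book the difference is negligible, so N is commonly used for both.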


Using AntConc: Notes 1

Here are short instructions on using AntConc for simple statistical analysis.


Assignments 19th of Jan.

The task is:

1. Select a book from the archive, and download a version with markup and layout, for example the HTML version.

2. Import this file into OpenOffice (or whichever distribution of it you selected and installed) and export it using the TEI XML export function (see the previous post on the course blog).

- Try to use the document properties menu to add meta information to the document.

3. Edit the exported TEI XML file and fill in more of the meta information in the header of the XML file, e.g. author, publisher, language, edition, information about the source, etc.

Make sure that the basic document structure up to the <p> (paragraph) level is correct, and that the section titles and sections are marked up correctly.
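
A quick way to check the header and the structure is to list what the file actually contains. The following Python sketch prints the title, the author, and each section heading with its number of paragraphs. It assumes that the export uses the TEI P5 namespace and marks sections as <div> elements with a <head> child; the embedded sample document and the function name are hypothetical, so adjust the element names to what your exported file really contains.

```python
import xml.etree.ElementTree as ET

TEI_NS = "http://www.tei-c.org/ns/1.0"   # TEI P5 namespace
NS = {"tei": TEI_NS}

# Hypothetical miniature TEI document, for illustration only.
SAMPLE = """<TEI xmlns="http://www.tei-c.org/ns/1.0">
  <teiHeader>
    <fileDesc>
      <titleStmt>
        <title>Example Title</title>
        <author>Example Author</author>
      </titleStmt>
      <publicationStmt><p>Hypothetical publication note.</p></publicationStmt>
      <sourceDesc><p>Hypothetical source description.</p></sourceDesc>
    </fileDesc>
  </teiHeader>
  <text>
    <body>
      <div>
        <head>Chapter 1</head>
        <p>First paragraph.</p>
        <p>Second paragraph.</p>
      </div>
      <div>
        <head>Chapter 2</head>
        <p>Third paragraph.</p>
      </div>
    </body>
  </text>
</TEI>"""

def summarize(xml_text):
    """Collect title, author, and (heading, paragraph count) for every <div>."""
    root = ET.fromstring(xml_text)
    title = root.findtext(".//tei:titleStmt/tei:title", default="", namespaces=NS)
    author = root.findtext(".//tei:titleStmt/tei:author", default="", namespaces=NS)
    sections = []
    body = root.find(".//tei:body", NS)
    if body is not None:
        for div in body.iter("{%s}div" % TEI_NS):
            head = div.findtext("tei:head", default="(no head)", namespaces=NS)
            paragraphs = div.findall("tei:p", NS)   # direct <p> children only
            sections.append((head, len(paragraphs)))
    return {"title": title, "author": author, "sections": sections}

if __name__ == "__main__":
    print(summarize(SAMPLE))
```

An empty title or author, or a section reported as "(no head)", points to places in the file that still need editing.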

You might also want to have a look at the TEI Lite intro here.


Here is the instruction for TEI XML export from OpenOffice again...


There is also LibreOffice, which is built on the same code base as OpenOffice, and for Macs there is also NeoOffice.

These are brief instructions on how to generate TEI XML files with OpenOffice (LibreOffice, NeoOffice).

Here are the necessary steps:

1. Install the newest release of OpenOffice (on Windows, Mac, or Linux). I tested it with OpenOffice 3.2 and the most recent NeoOffice. On Windows this may require an installed Java runtime, which can be found on the Oracle Java pages (choose either the Java SE Development Kit (JDK) or the Java SE Runtime Environment (JRE)).

2. Get the newest release of the TEI OpenOffice Package from the SourceForge pages, and select the file teioop5-2009-02-05.jar or any newer version. Important: it should be “teioop5”.

3. In OpenOffice, open an empty document and select Tools -> XML Filter Settings...

4. In the dialog box select the button “Open Package...” on the right side, locate the teioop5*.jar file you downloaded from SourceForge, and confirm with “Open”.

The TEI XML filter should now be listed among the filters and is ready for use.

If you have an example document open in OpenOffice, make sure the meta information (in the menu File -> Properties...) is set. Then select File -> Save As..., choose TEI P5 (.xml) under File types: in the file dialog, give your file a name, and save.