Introduction to Scheme/Python in Computational Linguistics

Course held at the JSSECL 2006 Summer School at the University of Zadar
- some material might be outdated -

Damir Cavar, in 2006 at University of Zadar

This course offers an introduction to computational linguistics through concrete, practical examples in the programming languages Python and Scheme.

The concepts discussed in this course include:

- Processing text files (and various code pages)
- Extracting word lists and creating frequency profiles
- Generating N-gram models from text on the basis of characters, morphemes, or words, and applying them
- Statistical methods for language identification, text similarity metrics, text classification, and clustering
- Analysis of corpora and treebanks
- Top-down, bottom-up, chunk, and chart parsing
- Syntactic parsing with Context-Free Grammars and Probabilistic Context-Free Grammars
- Syntactic parsing with Categorial Grammars

All concepts are accompanied by code examples (which will be available here soon) and are discussed in detail with respect to their implementation and formal properties.
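
As a rough illustration of the kind of example meant for the first topics (word lists, frequency profiles, character N-grams), here is a minimal Python sketch. It is not the course's own code: the function names word_frequencies, char_ngrams, and relative_profile are hypothetical, and the regular-expression tokenizer is a simplifying assumption that ignores language-specific tokenization issues.

```python
from collections import Counter
import re


def word_frequencies(text):
    """Return a Counter mapping lowercased word tokens to their counts."""
    # Assumption: a "word" is a maximal run of word characters (\w+).
    # Real tokenization is language dependent and more involved.
    tokens = re.findall(r"\w+", text.lower())
    return Counter(tokens)


def char_ngrams(text, n):
    """Return a Counter of all overlapping character n-grams in text."""
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def relative_profile(counts):
    """Convert absolute counts into relative frequencies that sum to 1."""
    total = sum(counts.values())
    return {item: count / total for item, count in counts.items()}


if __name__ == "__main__":
    sample = "the cat sat on the mat"
    # Word frequency profile, most frequent tokens first
    print(word_frequencies(sample).most_common())
    # Character bigram counts and their relative frequencies
    bigrams = char_ngrams("banana", 2)
    print(bigrams)
    print(relative_profile(bigrams))
```

Relative frequency profiles of this kind could serve as a starting point for the statistical topics in the list, for instance by comparing the character N-gram profile of an unknown text against profiles built from texts in known languages.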


Scheme - MzScheme and DrScheme
Python - and ActiveState