Code examples


  1. Hello-world example (code)
  2. Documentation example (code)


  1. Reading file content into a string variable (code)
  2. Reading from file into list of lines (code)
  3. Reading from file line by line and processing each line (code)
  4. Fast and compact file input (code)
  5. Reading file content into a string variable, embedded in a try statement (code)
  6. Writing string content to a file (overwriting existing file) (code)
  7. Appending string content to a file (code)
  8. Writing Unicode strings to a file with specific encoding (code)
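
As a rough illustration of the file input and output items listed above, here is a minimal Python sketch. It is not the linked code: the helper names (`write_text`, `append_text`, `read_text`, `read_lines`) and the file name `example.txt` are hypothetical, and UTF-8 is assumed as the encoding.

```python
import os
import tempfile


def write_text(path, text, encoding="utf-8"):
    """Write text to path, overwriting any existing file."""
    with open(path, "w", encoding=encoding) as f:
        f.write(text)


def append_text(path, text, encoding="utf-8"):
    """Append text to the end of the file at path."""
    with open(path, "a", encoding=encoding) as f:
        f.write(text)


def read_text(path, encoding="utf-8"):
    """Return the whole file as one string, or None if it cannot be read."""
    try:
        with open(path, encoding=encoding) as f:
            return f.read()
    except OSError:
        return None


def read_lines(path, encoding="utf-8"):
    """Return the file content as a list of lines without line endings."""
    with open(path, encoding=encoding) as f:
        return f.read().splitlines()


# Demo on a temporary file (hypothetical name, chosen for illustration).
path = os.path.join(tempfile.mkdtemp(), "example.txt")
write_text(path, "first line\n")
append_text(path, "second line: äöü\n")

# Line-by-line processing: iterate over the file object directly.
lengths = []
with open(path, encoding="utf-8") as f:
    for line in f:
        lengths.append(len(line.rstrip("\n")))
```

Iterating over the file object reads one line at a time, which is usually preferable to `read()` for large files.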


  1. Grammar file (code)
  2. Grammar parser (code)
  3. Top-down parser (code)
  4. Bottom-up parser (code)
  5. Chart parser


  1. Counting words 1 (code)
  2. Counting words 2 (Unicode tokens) (code)
  3. Counting words 3 (sort dictionary by value) (code)
  4. Counting words 4 (sort dictionary by value) (code)
  5. Counting words 5 (tokenize with regular expression) (code)
  6. Counting words 6 (without function words) (code)
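
The word-counting variants listed above (regular-expression tokenization, sorting a dictionary by value, dropping function words) can be combined in one short sketch. This is an illustration under assumptions, not the linked code: the `FUNCTION_WORDS` set is a tiny made-up sample, and the helper names are hypothetical.

```python
import re
from collections import Counter

# Illustrative stop list only; a real function-word list would be much longer.
FUNCTION_WORDS = frozenset({"the", "a", "of", "and", "to", "in"})


def count_words(text, stopwords=frozenset()):
    """Lowercase the text, tokenize on word characters, count the tokens."""
    tokens = re.findall(r"\w+", text.lower())
    return Counter(t for t in tokens if t not in stopwords)


def sorted_by_value(counts):
    """Return (word, count) pairs, most frequent first, ties alphabetical."""
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))


text = "The cat and the dog saw the other cat."
counts = count_words(text, FUNCTION_WORDS)
ranking = sorted_by_value(counts)
```

In Python 3, `\w` matches Unicode word characters by default, so the same pattern also tokenizes non-ASCII text.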

N-gram models

  1. General n-gram class with persistence functions (code)
  2. Language Identification (LID)
  3. Generating n-grams from the Brown corpus (code)
  4. Chi-square test for collocations (code)
  5. Calculating Mutual Information and Relative Entropy for bigrams (code)
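
To make the n-gram items above more concrete, the following sketch extracts n-grams from a token list and computes pointwise mutual information for a bigram. It is a simplified illustration, not the linked code: the function names are hypothetical, probabilities are plain relative frequencies without smoothing, and the toy token sequence stands in for a real corpus.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return all contiguous n-grams of the token list as tuples."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def pmi(bigram, unigram_counts, bigram_counts):
    """Pointwise mutual information log2(P(x,y) / (P(x) * P(y))).

    Assumes the bigram was observed at least once; unseen bigrams
    would need smoothing, which this sketch leaves out.
    """
    total_unigrams = sum(unigram_counts.values())
    total_bigrams = sum(bigram_counts.values())
    p_xy = bigram_counts[bigram] / total_bigrams
    p_x = unigram_counts[bigram[0]] / total_unigrams
    p_y = unigram_counts[bigram[1]] / total_unigrams
    return math.log2(p_xy / (p_x * p_y))


# Toy data in place of a corpus.
tokens = "a b a b a c".split()
unigram_counts = Counter(tokens)
bigram_counts = Counter(ngrams(tokens, 2))
score = pmi(("a", "b"), unigram_counts, bigram_counts)
```

A positive score means the two tokens co-occur more often than their individual frequencies would predict.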

Vector Space and Clustering

  1. Generating a vector space (code)
  2. K-means clustering (code)
  3. Expectation Maximization clustering (code)
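
As a sketch of the clustering items above, here is a small pure-Python k-means. It is an assumption-laden illustration rather than the linked code: the function names, the fixed random seed, and the six sample points are all made up for the example, and initial centroids are simply drawn from the data.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(points)
    return tuple(sum(component) / n for component in zip(*points))


def kmeans(points, k, iterations=100, seed=0):
    """Cluster points into k groups; return (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # initial centroids taken from the data
    labels = None
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        new_labels = [
            min(range(k), key=lambda j: squared_distance(p, centroids[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # assignments stopped changing, so the run has converged
        labels = new_labels
        # Update step: move each centroid to the mean of its members.
        for j in range(k):
            members = [p for p, label in zip(points, labels) if label == j]
            if members:
                centroids[j] = mean(members)
    return centroids, labels


# Two well-separated groups of made-up 2-D points.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0)]
centroids, labels = kmeans(points, k=2)
```

Which cluster receives index 0 depends on the random initialization, so only the grouping itself should be relied on.
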
(C) 2005 by Damir Ćavar, dcavar _at_ indiana _dot_ edu