Damir Cavar's Homepage

Damir Cavar is a Natural Language Processing, AI, and Knowledge Representation scientist

14 August 2016

On Ubuntu/Debian/... tools for linguists

by Damir Cavar

DRAFT – Work in Progress

The standard Ubuntu repositories include various linguistic tools. The links here refer to the packages for release 16.04(.1).

The package list contains various finite-state transducer toolkits, which are used for the development of morphological analyzers, tokenizers, and other NLP tools:

There are also ready-to-use NLP tools for various languages in the standard package list:

Other Repositories

Some third-party repositories provide additional packages that might be interesting or useful for linguistic work, be it language documentation or corpus linguistics:

I set up the SIL repository by creating the following file, either as root or using sudo:

/etc/apt/sources.list.d/sil.list

with this content for Ubuntu xenial (16.04):

deb http://packages.sil.org/ubuntu xenial main
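
The step above can be sketched as a small shell function. This is only an illustration: the function name add_sil_repo and its optional path argument are assumptions made here, not part of any SIL or Ubuntu tooling. It writes the repository line shown above into the given file, defaulting to the path used in this post.

```shell
#!/bin/sh
# Sketch only: add_sil_repo is a hypothetical helper name used for illustration.
add_sil_repo() {
    # $1: path of the sources file; defaults to the location used in this post.
    target="${1:-/etc/apt/sources.list.d/sil.list}"
    # Write the single repository line for Ubuntu xenial (16.04),
    # replacing any previous content of the file.
    printf '%s\n' 'deb http://packages.sil.org/ubuntu xenial main' > "$target"
}
```

Run in a root shell, calling add_sil_repo without arguments writes the real file, and a following apt-get update refreshes the package index. Passing a path such as ./sil.list instead allows a dry run without root privileges. Depending on the setup, apt may warn about the repository until its signing key has been imported; that step is not covered here.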
tags: Corpora NLP Speech Debian Ubuntu "Natural Language Processing" "Machine Learning" ML NLTK FLE WFST