Damir Cavar's Homepage

Damir Cavar is a Natural Language Processing and AI, and Classical and Quantum Computing scientist

14 August 2016

On Ubuntu/Debian/... tools for linguists

by Damir Cavar

DRAFT – Work in Progress

The standard Ubuntu distribution comes with various linguistic tools. The links here refer to Ubuntu 16.04(.1).

The package list includes several finite-state transducer toolkits that are used to develop morphological analyzers, tokenizers, and other NLP tools:

The standard package list also contains ready-made NLP tools for various languages:

Other Repositories

Some repositories provide additional packages that might be interesting or useful for linguistic work, be it language documentation or corpus linguistics:

I set up the SIL repository by creating the following file, as root or using sudo:

/etc/apt/sources.list.d/sil.list

with this content for Ubuntu xenial (16.04):

deb http://packages.sil.org/ubuntu xenial main
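
As a minimal sketch, the step above could be scripted roughly as follows. The function names sil_source_line and write_sil_list, and the optional directory argument, are hypothetical helpers introduced only for illustration; they are not part of any SIL or Ubuntu tooling. The sketch assumes a POSIX shell and that the target directory already exists.

```shell
#!/bin/sh
# Hypothetical helper: print the APT source line for the SIL repository.
# $1: Ubuntu release codename (defaults to xenial, i.e. 16.04)
sil_source_line() {
    printf 'deb http://packages.sil.org/ubuntu %s main\n' "${1:-xenial}"
}

# Hypothetical helper: write the source line to sil.list.
# $1: target directory (defaults to /etc/apt/sources.list.d)
# $2: Ubuntu release codename (defaults to xenial)
write_sil_list() {
    dir="${1:-/etc/apt/sources.list.d}"
    sil_source_line "${2:-xenial}" > "$dir/sil.list"
}

# Typical use, run as root (for example by running this script with sudo):
#   write_sil_list && apt-get update
```

Passing a different directory as the first argument, for example a temporary one, makes it possible to inspect the generated file before touching the system configuration. After the file is in place under /etc/apt/sources.list.d, apt-get update refreshes the package index so that the new repository is taken into account.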
tags: Corpora NLP Speech Debian Ubuntu "Natural Language Processing" "Machine Learning" ML NLTK FLE WFST