To test LID, run the following in a command-line shell:
python lid.py sample*.txt
Download the complete archive as a ZIP file.
For implementations in other languages, see the bottom of the home page.
- lid.py - The main code for the language identifier.
- lid-speech.py - A variant of the language identifier that uses the Win32 Python extensions and reports its result via speech synthesis (Windows only).
- lidtrainer.py - The main code of the LID trainer. It generates n-gram models from given texts of a specific language.
- sample1.txt - A sample text for testing with LID.
- sample2.txt - A sample text for testing with LID.
- sample3.txt - A sample text for testing with LID.
- Croatian.dat - Croatian language model (just a simple example trained on a couple of text files).
- English.dat - English language model (just a simple example trained on a couple of text files).
- German.dat - German language model (just a simple example trained on a couple of text files).
- Italian.dat - Italian language model (just a simple example trained on a couple of text files).
- Japanese.dat - Japanese language model (just a simple example trained on a couple of text files).
- Afrikaans.dat - Afrikaans language model.
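
To give a rough idea of what "generating n-gram models from given texts" and identifying a language with them can look like, here is a minimal sketch. It is not the code from lid.py or lidtrainer.py, and it does not read or write the .dat files (their format is not described here). The function names (ngram_counts, train, identify), the choice of character trigrams, and the simple frequency-overlap score are all assumptions made for illustration.

```python
from collections import Counter


def ngram_counts(text, n=3):
    """Count overlapping character n-grams in lower-cased, whitespace-normalized text."""
    text = " ".join(text.lower().split())
    return Counter(text[i:i + n] for i in range(len(text) - n + 1))


def train(text, n=3):
    """Hypothetical trainer: map each n-gram to its relative frequency in the text."""
    counts = ngram_counts(text, n)
    total = sum(counts.values())
    return {gram: c / total for gram, c in counts.items()} if total else {}


def identify(text, models, n=3):
    """Return the name of the model whose n-gram frequencies overlap most with the text."""
    counts = ngram_counts(text, n)
    scores = {
        name: sum(model.get(gram, 0.0) * c for gram, c in counts.items())
        for name, model in models.items()
    }
    return max(scores, key=scores.get)


# Toy models trained on one sentence each (illustrative data, not the shipped models).
models = {
    "English": train("the quick brown fox jumps over the lazy dog and then the dog sleeps"),
    "German": train("der schnelle braune fuchs springt mit dem faulen hund und der hund bleibt im haus"),
}

print(identify("the dog and the fox", models))     # English
print(identify("der hund und der fuchs", models))  # German
```

Models trained on a single sentence are only good enough to show the mechanics; as with the example .dat files above, more training text gives more reliable results.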