Damir Cavar is a Natural Language Processing, AI, and Knowledge Representation scientist
by Damir Cavar
If you are interested in participating in a research and reading group on prosodic and supra-segmental speech processing, we meet on Mondays at 2:30 in the 111 N Bryan Ave house.
Our goal is to understand how to extract detailed properties from the speech signal using common libraries (including ones with a Python interface), extracting:
We are interested in studying this in order to set up real-time processing linked to speech recognition services, so that we can augment ASR output (speech-recognition-based transcription) with prosodic cues. We want to relate these cues to information-theoretic processing (pragmatics and semantics), processing contrastive stress, intonation contours specific to interrogative or declarative utterances, and so on.
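As an illustration of the kind of low-level cue involved, here is a minimal sketch of estimating the fundamental frequency (F0) of a single speech frame by autocorrelation. F0 underlies intonation contours, but this is not the group's actual pipeline: the function names `estimate_f0` and `synth_tone`, the 75-400 Hz search range, and the 0.3 voicing threshold are all assumptions made for this example. It uses only the Python standard library; in practice one would more likely rely on an established speech library.

```python
import math


def synth_tone(freq_hz, sample_rate, n_samples):
    """Hypothetical helper: generate a pure sine tone as a stand-in for a voiced frame."""
    return [math.sin(2.0 * math.pi * freq_hz * i / sample_rate)
            for i in range(n_samples)]


def estimate_f0(frame, sample_rate, f0_min=75.0, f0_max=400.0, voicing_threshold=0.3):
    """Estimate F0 (in Hz) of one frame via normalized autocorrelation.

    Returns None when the frame is silent or looks unvoiced.
    The search range and threshold are illustrative assumptions.
    """
    n = len(frame)
    mean = sum(frame) / n
    x = [s - mean for s in frame]          # remove DC offset
    energy = sum(s * s for s in x)
    if energy == 0.0:
        return None                        # silent frame

    min_lag = int(sample_rate / f0_max)    # shortest period considered
    max_lag = min(int(sample_rate / f0_min), n - 1)

    best_lag, best_score = None, 0.0
    for lag in range(min_lag, max_lag + 1):
        # Autocorrelation at this lag, normalized by the frame energy.
        score = sum(x[i] * x[i + lag] for i in range(n - lag)) / energy
        if score > best_score:
            best_lag, best_score = lag, score

    if best_lag is None or best_score < voicing_threshold:
        return None                        # treat as unvoiced
    return sample_rate / best_lag
```

Calling `estimate_f0(synth_tone(200.0, 16000, 640), 16000)` on a 40 ms synthetic frame should recover a value close to 200 Hz. Running the estimator over successive frames would give a rough F0 contour that could then be aligned with the time stamps of an ASR transcript.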
There is a real data and corpus creation component to this project, as well as coding, implementation, and experimenting with ML and Deep Learning speech processing algorithms.
If you are interested in working with us on these problems, let us know and join us.
The group is managed by:
We have some undergraduate students interested in this project, as well as students from Damir’s L665 class.
For more information, consult the wiki page on GitHub, as well as the code repository.
tags: Deep-Learning GPU HPC Speech Prosody Information-Theory