Publications
BibTeX file (not up to date!)
- Muhammad S. Abdo, Yash Hatekar and Damir Cavar (2025) AMWAL: Named Entity Recognition for Arabic Financial News. In Proceedings of the FinNLP-FNP-LLMFinLegal 2025 workshop at COLING 2025.
- Günther Jikeli, Damir Cavar, Weejeong Jeong, Daniel Miehling, Pauravi Wagh, Denizhan Pak (2024) Auf dem Weg zu einer KI-Definition von Antisemitismus. In: M. Hübscher and S. von Mering (eds.) Antisemitismus in den Sozialen Medien. Verlag Barbara Budrich: Opladen, Berlin, Toronto, pp. 269-292.
- Chi Zhang, Akriti Kumari, Damir Cavar (2024) Entangled Meanings: Classification and Ambiguity Resolution in Near-Term QNLP. Paper and Poster presented at the IEEE Quantum Week 2024, Montreal, Canada, September 2024. (full paper, short paper, poster)
- Damir Cavar and Chi Zhang (2024) Semantic Similarities using Classical Embeddings in Quantum NLP. Paper and Poster presented at the IEEE Quantum Week 2024, Montreal, Canada, September 2024. (paper, poster)
- Damir Cavar, Zoran Tiganj, Ludovic Mompelat, Billy Dickson (2024) “Computing Ellipsis Constructions: Comparing Classical NLP and LLM Approaches.” Society for Computation in Linguistics 7(1), pp. 217–226. doi: https://doi.org/10.7275/scil.2147. See (SCiL).
- Damir Cavar, Ludovic V. Mompelat, Muhammad S. Abdo (2024) The Typology of Ellipsis: A Corpus for Linguistic Analysis and Machine Learning Applications. Pages 46-54 of Michael Hahn, Alexey Sorokin, Ritesh Kumar, Andreas Shcherbakov, Yulia Otmakhova, Jinrui Yang, Oleg Serikov, Priya Rani, Edoardo M. Ponti, Saliha Muradoğlu, Rena Gao, Ryan Cotterell, Ekaterina Vylomova (eds.) Proceedings of the 6th Workshop on Research in Computational Linguistic Typology and Multilingual NLP. Association for Computational Linguistics, St Julian’s, Malta. See ACL Special Interest Group on Typology (SIGTYP) 2024, colocated with the 18th Conference of the European Chapter of the Association for Computational Linguistics. (full paper)
- Damir Cavar, Ali Aljubailan, Ludovic V. Mompelat, Yuna Won, Billy Dickson, Matthew Fort, Andrew Davis and Soyoung Kim (2022) Event Sequencing Annotation with TIE-ML. In Proceedings of the 18th Joint ACL - ISO Workshop on Interoperable Semantic Annotation within LREC2022, pages 33–41, Marseille, France. European Language Resources Association. See The Eighteenth Joint ACL - ISO Workshop on Interoperable Semantic Annotation (ISA-18 2022), at LREC 2022.
- Günther Jikeli, Damir Cavar, Weejeong Jeong, Daniel Miehling, Pauravi Wagh, Denizhan Pak (2022) Toward an AI Definition of Antisemitism? Pages 193-212 in M. Hübscher and S. von Mering (eds.) Antisemitism on Social Media. Routledge, New York.
- Damir Cavar, Billy Dickson, Ali Aljubailan, Soyoung Kim (2021) “Temporal Information and Event Markup Language: TIE-ML Markup Process and Schema Version 1.0,” In Proceedings of SEMAPRO 2021, Barcelona, Spain.
- Damir Cavar (2019) Measuring Lexical Semantic Variation using Word Embeddings. Pages 61-74 in J.M.M. Brown, A. Schmidt, M. Wierzba (eds.) Of Trees and Birds, Universitätsverlag Potsdam, Germany.
- Günther Jikeli, Damir Cavar, Daniel Miehling (2019) Annotating Antisemitic Online Content. Towards an Applicable Definition of Antisemitism. arXiv:1910.01214 [cs.CY].
- Damir Cavar, Joshua Herring, Anthony Meyer (2018) Case Law Analysis using Deep NLP and Knowledge Graphs. In Proceedings of the LREC 2018, paper presented at the 1st Workshop on Language Resources and Technologies for the Legal Knowledge Graph (LegalKG), LREC 2018, in Miyazaki, Japan.
- Damir Cavar, Matt Josefy (2018) Mapping Deep NLP to Knowledge Graphs: An Enhanced Approach to Analyzing Corporate Filings with Regulators. In Proceedings of The First Financial Narrative Processing Workshop (FNP 2018), LREC 2018 in Miyazaki, Japan. (download extended abstract)
- Damir Cavar, Lwin Moe, Hai Hu, Kenneth Steimel (2016) Preliminary Results from the Free Linguistic Environment Project. Pages 161-181 in D. Arnold, M. Butt, B. Crysmann, T. Holloway-King, S. Müller (eds.) Proceedings of the Joint 2016 Conference on Head-driven Phrase Structure Grammar and Lexical Functional Grammar. CSLI Publications. See also: FLE
- Małgorzata E. Ćavar, Damir Cavar, Hilaria Cruz (2016) Endangered Language Documentation: Bootstrapping a Chatino Speech Corpus, Forced Aligner, ASR. In Proceedings of the LREC 2016, Portorož, Slovenia. See also: GORILLA
- Damir Cavar, Małgorzata E. Ćavar, Lwin Moe (2016) Global Open Resources and Information for Language and Linguistic Analysis (GORILLA). In Proceedings of the LREC 2016, Portorož, Slovenia. See also: GORILLA
- Małgorzata E. Ćavar, Damir Cavar, Dov-Ber Kerler, Anya Quilitsch (2016) Generating a Yiddish Speech Corpus, Forced Aligner and Basic ASR System for the AHEYM Project. In Proceedings of the LREC 2016, Portorož, Slovenia. See also: AHEYM, GORILLA
- Damir Cavar and Małgorzata E. Ćavar (2014) Visualization of Language Relations and Families: MultiTree. In: N. Calzolari, K. Choukri, T. Declerck, H. Loftsson, B. Maegaard, J. Mariani, A. Moreno, J. Odijk, S. Piperidis (eds.) Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC’14), May 26-31. Reykjavik, Iceland, European Language Resources Association (ELRA), ISBN 978-2-9517408-8-4. See also: MultiTree
- Anelia Belogay, Damir Cavar, Dan Cristea, Diman Karagiozov, Svetla Koeva, Roumen Nikolov, Maciej Ogrodniczuk, Adam Przepiórkowski, Polivios Raxis, and Cristina Vertan (2012) i-Publisher, i-Librarian and EUDocLib – linguistic services for the Web. In Piotr Pęzik (ed.) Corpus Data across Languages and Disciplines, volume 28 of Łódź Studies in Language, pp. 203-212. Peter Lang.
- Damir Cavar, Helen Aristar-Dry, Anthony Aristar (2012) Large Mailing List Corpora: Management, Annotation and Repository. In LREC 2012 Proceedings of the workshop on Challenges in the management of large corpora.
- Damir Cavar, Dunja Brozović Rončević (2012) Riznica: The Croatian Language Corpus. In Prace Filologiczne LXIII (63), Wydział Polonistyki Uniwersytetu Warszawskiego, Warsaw. Pages 51-66. ISSN: 0138-056.
- Damir Cavar, Melanie Seiss (2011) Clitic Placement, Syntactic Discontinuity, and Information Structure. In LFG Proceedings 2011. ISSN 1098-6782
- Damir Cavar, Tanja Gulan, Damir Kero, Franjo Pehar, Pavle Valerjev (2011) The Scheme Natural Language Toolkit (SNLTK): NLP libraries for R6RS and Racket. In Proceedings of the 4th European Lisp Symposium, Hamburg University of Technology, pp. 58-61.
- Damir Cavar (2010) On Statistical Metrics for Selection and Phrasality. In T. Hanneforth and G. Fanselow (eds.) Language and Logos. Akademie Verlag, Berlin. (ca. 25 pages), ISBN 978-3050049311
- Damir Cavar, Ivo-Pavao Jazbec, Siniša Runjaić (2009) Efficient Morphological Parsing with a Weighted Finite State Transducer. *Informatica* 33/1, pp. 107-113. Website of the journal. ISSN: 0350-5596
- Damir Cavar, Ivo-Pavao Jazbec, Tomislav Stojanov (2009) CroMo - Morphological Analysis for Standard Croatian and its Synchronic and Diachronic Dialects and Variants. In: Jakub Piskorski, Bruce W. Watson, Anssi Yli-Jyrä (eds.) Finite-State Methods and Natural Language Processing, 7th International Workshop, FSMNLP 2008, Ispra, Italy, September 11-12, 2008. Post-proceedings. Frontiers in Artificial Intelligence and Applications 19 IOS Press, pp. 183-190. ISBN 978-1-58603-975-2 (Draft PDF)
- Damir Cavar, Ivo Pavao Jazbec, Bruno Nahod (2009) Struktura i razvoj baze podataka za potrebe projekta Hrvatsko strukovno nazivlje (STRUNA) - projekt koordinacije. Pages 311-317 in: N. Ledinek, M. Žagar Karer and M. Humar (eds.) Terminologija in sodobna terminografija. Ljubljana: Založba ZRC, ZRC SAZU. ISBN: 978-961-254-158-3
- Dunja Brozović Rončević, Damir Cavar (2008) Hrvatska jezična riznica kao podloga jezičnim i jezičnopovijesnim istraživanjima hrvatskoga jezika. In: Vidjeti Ohrid, Zbornik radova XIV. međunarodnog slavističkog kongresa u Ohridu, Hrvatsko filološko društvo - Hrvatska sveučilišna naklada, Zagreb, pp. 173-186.
- Damir Cavar, Ivo-Pavao Jazbec, Siniša Runjaić (2008) Interoperability and Rapid Bootstrapping of Morphological Parsing and Annotation Automata. In Proceedings of IS-LTC 08, Ljubljana, Slovenia. ISBN 978-961-264-006-4
- Paul Rodrigues, Damir Cavar (2008) “Learning Arabic Morphology With Information Theory.” Proceedings of the 41st Annual Meeting of the Chicago Linguistic Society (CLS 41), volume 41, no. 2. Chicago, IL, USA. April 7-9, 2005. Pages 49-60. CLS Journal. ISSN: 0577-7240
- Damir Cavar, Dunja Brozović Rončević (2007) “Grammaticality judgments and language usage data.” Proceedings of the fourth Corpus Linguistics conference 2007, Corpus and Cognition Colloquium: The relation between natural and experimental language data; Birmingham, July 27-30, 2007. Abstract.
- Paul Rodrigues, Damir Cavar (2007) “Learning Arabic Morphology Using Statistical Constraint Satisfaction Models.” Pages 63-76 in: E. Benmamoun (ed.) Perspectives on Arabic Linguistics XIX. Current Issues in Linguistic Theory 289. John Benjamins, Amsterdam. Google Books. ISBN: 978-90-272-4804-6
- Damir Cavar, Joshua Herring, Toshikazu Ikuta, Paul Rodrigues, Giancarlo Schrementi (2006) On Unsupervised Grammar Induction from Untagged Corpora. In: P. Kaszubski (ed.) PSiCL: Poznań Studies in Contemporary Linguistics. 41, Adam Mickiewicz University, Poznań, Poland. pp. 57-71. ISBN: 73-7208-165-4
- Damir Cavar, Paul Rodrigues, Giancarlo Schrementi (2006) Unsupervised Morphology Induction for Part-of-Speech Tagging. Penn Working Papers in Linguistics. Volume 12.1. Philadelphia, PA, pp. 29-41.
- Damir Cavar, Paul Rodrigues, Giancarlo Schrementi (2005) Using Morphological and Distributional Cues for Inductive Part-of-Speech Tagging. Proceedings of the Midwest Computational Linguistics Colloquium (MCLC) 2005 at the Ohio State University, Columbus, OH. Online
- Damir Cavar, Paul Rodrigues, Giancarlo Schrementi (2004) Syntactic Parsing Using Mutual Information and Relative Entropy. Proceedings of the Midwest Computational Linguistics Colloquium (MCLC) 2004. Online
- Damir Cavar, Joshua Herring, Toshikazu Ikuta, Paul Rodrigues, Giancarlo Schrementi (2004) On Statistical Bootstrapping. In: William G. Sakas (ed.) Proceedings of the First Workshop on Psycho-computational Models of Human Language Acquisition, held in cooperation with COLING 2004, Geneva, pp. 9-16. ACL Anthology
- Damir Cavar, Joshua Herring, Toshikazu Ikuta, Paul Rodrigues, Giancarlo Schrementi (2004) Alignment Based Induction of Morphology Grammar and its Role for Bootstrapping. In: Gerhard Jaeger, Paola Monachesi, Gerald Penn and Shuly Wintner (eds.) Proceedings of Formal Grammar 2004, Nancy, pp. 47-62.
- Chris Wilder and Damir Cavar (2002) Verb Movement, Cliticization, and Coordination. Pages 365-375 of: P. Kosta and J. Frasek (eds.) Current Approaches to Formal Slavic Linguistics. Linguistik International Series Vol. 9, Peter Lang: Frankfurt a.M. ISBN 978-3-631-50311-9
- Sebastian Brandt, Damir Cavar and Uta Störl (2002) A Real Live Web Service using Semantic Web Technologies: Automatic Generation of Meta-Information. In: Proceedings of “On The Move Towards Meaningful Internet Systems” (DOA, ODBASE, CoopIS’02), Irvine, California.
- Damir Cavar and Richard Kauppert (2002) “Strategien für die Implementierung IT-basierter KM-Lösungen: Minimal Invasive Systeme”. In: C. Prange (ed.) Organisationales Lernen und Wissensmanagement - Fallstudien aus der Unternehmenspraxis. Gabler Verlag.
- Damir Cavar and Uta Störl (2002) Automatic Generation of Meta Tags for Intra-Semantic-Web. Pages 67-77 of: Robert Tolksdorf, Rainer Eckstein (eds.) Proceedings of XSW 2002 - XML Technologien für das Semantic Web. Lecture Notes in Informatics, Vol. 14, Gesellschaft für Informatik, Berlin, Germany, June 2002, ISBN 3-88579-343-1.
- Gisbert Fanselow and Damir Cavar (2001) “Distributed Deletion.” In: A. Alexiadou (ed.) Universals of Language: Proceedings of the 1999 GLOW Colloquium. Benjamins, Amsterdam. (Draft PDF)
- Gisbert Fanselow and Damir Cavar (2000) Remarks on the economy of pronunciation. In: Gereon Müller and Wolfgang Sternefeld (eds.) Competition in Syntax, Studies in Generative Grammar 49, ISBN 3-11-016945-2.
- Damir Cavar, Alexander Geyken, Gerald Neumann (2000) Digital Dictionary of the 20th Century German Language. In: T. Erjavec and J. Gros (eds.) Jezikoslovne Tehnologije za Slovenski Jezik. Proceedings of JS 2000, Ljubljana.
- Damir Cavar, Uwe Küssner, Dan Tidhar (2000) “From Off-line Evaluation to On-line Selection.” In: W. Wahlster (ed.) Verbmobil: Foundations of Speech to Speech Translation, Springer Verlag. (Draft PDF)
- Damir Cavar, Uwe Küssner, Dan Tidhar (2000) “From Human Evaluation to Automatic Selection of Good Translations”. In: Proceedings of the Second International Conference on Language Resources and Evaluation, LREC 2000 - Workshop on the Evaluation of Machine Translation, Athens.
- Gisbert Fanselow and Damir Cavar (2000) Remarks on the Economy of Pronunciation. Mscr. University of Potsdam/Technical University of Berlin.
- Gisbert Fanselow, Matthias Schlesewsky, Damir Cavar, Reinhold Kliegl (1999) Optimal parsing, syntactic parsing preferences, and Optimality Theory. (download Rutgers OT-Archive).
- Damir Cavar (1999) Aspects of the Syntax-Phonology Interface, Doctoral dissertation, University of Potsdam.
- Damir Cavar and Gisbert Fanselow (1998) “Discontinuous constituents in Slavic and Germanic languages”, Mscr. University of Potsdam.
- Damir Cavar and Wolfgang Menzel (1998) VERBMOBIL: A Speech-to-Speech Translation System. In: T. Erjavec and J. Gros (eds.) Jezikoslovne Tehnologije za Slovenski Jezik, Proceedings of JS ‘98. Ljubljana. ISBN: 961-6303-00-7
- Jürgen Weissenborn, Barbara Höhle, Dorothea Kiefer, Damir Cavar (1998) “Children’s sensitivity to word-order violations in German: Evidence for very early parameter-setting.” In: A. Greenhill, M. Hughes, H. Littlefield & H. Walsh (eds.) Proceedings of the 22nd Annual Boston Conference on Language Development. Somerville: Cascadilla Press. ISBN: 978-1-57473-032-6
- Damir Cavar and Chris Wilder (1997) “Auxiliaries in Serbian/Croatian and English” In: U. Junghanns and G. Zybatow (eds.) Formale Slavistik. Vervuert: Frankfurt a.M., pp. 3-12. ISBN: 3-89354-267-1
- Damir Cavar (1996) “On Clitics in Croatian: Syntax or Prosody?” Paper presented at the “Workshop on the Syntax, Morphology and Phonology of Clitics” In: ZAS-Working Papers in Linguistics 6 (Oct. 1996), pp. 51-65. ISSN: 1435-9588
- Damir Cavar (1994) Minimalist Aspects of the Syntax of Closed Class Elements, Diploma thesis, University of Potsdam.
- Damir Cavar and Chris Wilder (1994) “Clitic Third in Croatian” In: Linguistics in Potsdam No. 1, 25-63.
- Damir Cavar and Chris Wilder (1994) “Clitic Third in Croatian.” In: H. van Riemsdijk and L. Hellan (eds.) Clitics: Their Origin, Status and Position. Eurotype Working Papers, Theme Group 8, Vol. 6. Also in: H. van Riemsdijk (ed.) Eurotype volume, Mouton de Gruyter: Berlin.
- Damir Cavar and Chris Wilder (1994) “X0-Bewegung und Ökonomie” In: B. Haftka and C.M. Schmitt (eds.) Was determiniert Wortstellungsvariationen? Studien zu einem Interaktionsfeld von Grammatik, Pragmatik und Sprachtypologie. Westdeutscher Verlag: Opladen, pp. 11-32. ISBN: 9783531124902
- Chris Wilder and Damir Cavar (1994) “Word Order Variation, Verb Movement and Economy Principles.” In: Studia Linguistica 48.1, pp. 46-86. ISSN: 1467-9582, DOI
- Chris Wilder and Damir Cavar (1993) “Word Order Variation, Verb Movement and Economy Principles.” In: Sprachwissenschaft in Frankfurt 10, Frankfurt a.M.
- Damir Cavar and Chris Wilder (1994) “Long Head Movement? Verb-Movement and Cliticization in Croatian.” In: Lingua 93, pp. 1-58. ISSN: 0024-3841, DOI
- Damir Cavar and Chris Wilder (1992) “Long Head Movement? Verb-Movement and Cliticization in Croatian”. In: Sprachwissenschaft in Frankfurt 7, Frankfurt a.M.