Damir Cavar is a Natural Language Processing, AI, and Knowledge Representation scientist
This is the course page for Topics in Artificial Intelligence / Advanced Natural Language Processing (NLP) by Damir Cavar.
– August 2023 –
Meeting time: MW, 4:55-6:10 PM
Classroom: Ballantine Hall (BH) 343
Course website: Assignments, slides, and other material will be posted on Canvas.
Credits: 3
Instructor: Dr. Damir Cavar
Office: Ballantine Hall (BH) 511
Phone: (812) 856-5094
Office hours (BH 516): Thursdays, 4:15-5:15 PM and by appointment
Symbolic, statistical, and neural methods are at the core of Computational Linguistics and Natural Language Processing (NLP) in research and applications. This course introduces advanced techniques for NLP based on statistical modeling and machine learning algorithms, including neural network and Deep Learning approaches, bringing them together with symbolic and knowledge-based systems. We aim to bridge research and insights from language and linguistic disciplines and the application of NLP and linguistic technologies from the computer and information science perspective.
This course will cover fundamental notions in probability and information theory, focusing on the concepts needed for common NLP tasks. We will discuss N-gram models, exemplified by an approach to document classification or Part-of-Speech (PoS) tagging. In the next step, we will extend these models to further probabilistic methods and to sentiment analysis. We will then study advanced neural network approaches (Deep Learning) used for various speech and language processing tasks.
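As a small illustration of the kind of N-gram model mentioned above, the following sketch estimates bigram probabilities by maximum likelihood from a toy corpus. The corpus, the function names, and the boundary markers `<s>` and `</s>` are assumptions made for this example only; they are not taken from the course material, and practical models additionally need smoothing for unseen word pairs.

```python
from collections import Counter


def train_bigram_model(sentences):
    """Count contexts and bigrams over tokenized sentences with boundary markers."""
    contexts, bigrams = Counter(), Counter()
    for tokens in sentences:
        padded = ["<s>"] + tokens + ["</s>"]
        # Every token except the last one serves as a context for its successor.
        contexts.update(padded[:-1])
        bigrams.update(zip(padded, padded[1:]))
    return contexts, bigrams


def bigram_probability(contexts, bigrams, prev, word):
    """Maximum-likelihood estimate: P(word | prev) = count(prev, word) / count(prev)."""
    if contexts[prev] == 0:
        return 0.0  # unseen context; a real model would apply smoothing here
    return bigrams[(prev, word)] / contexts[prev]


# Hypothetical toy corpus, already tokenized.
corpus = [
    ["the", "dog", "barks"],
    ["the", "cat", "sleeps"],
    ["the", "dog", "sleeps"],
]
contexts, bigrams = train_bigram_model(corpus)
p_dog_given_the = bigram_probability(contexts, bigrams, "the", "dog")
print(p_dog_given_the)
```

In this toy corpus "the" occurs three times as a context and is followed by "dog" twice, so the estimate for P(dog | the) is 2/3.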
Additionally, we will cover concrete topics such as information extraction and graph-based knowledge representations used for text classification, natural language understanding, dialog systems or chatbots (so-called AIs), or information retrieval, and how to use various NLP methods in the context of such systems. There is room to focus in part on topics of interest related to concrete applications of NLP methods.
We will discuss advanced hybrid NLP methods, covering symbolic, statistical, and neural network methods in the context of particular tasks. All the methods we use apply to a range of tasks in NLP. The goal is to teach students techniques, algorithms, and existing environments that enable them to develop their own strategies for analyzing linguistic phenomena using language data, to apply NLP to information extraction from unstructured data, or to conduct research in AI, psycholinguistics, the cognitive language faculty, verbal behavior, and general speech and language technologies.
Crucial aspects of course outcomes are:
This course provides an essential platform for further work in NLP.
Students are encouraged to bring their laptops or other computational devices to class.
The readings and exercises will be accompanied by practical examples using:
This is a tentative schedule. It is subject to change. Updates and changes will be discussed in class. [JM] refers to the Jurafsky and Martin textbook.
date | topic |
---|---|
08/21/2023 | Introduction and Orientation Meeting |
08/23/2023 | Introduction, MS Ch. 1, JM Ch. 1 |
08/28/2023 | Corpora and Linguistic Annotation, JM Ch. 2, Canvas material |
08/30/2023 | Text Processing, JM Ch. 2 |
09/06/2023 | Edit Distance, JM Ch. 3 |
09/11/2023 | N-gram models, JM Ch. 3 |
09/13/2023 | N-gram models and statistical analysis, JM Ch. 3 |
09/18/2023 | NLP Technologies and Linguistic Annotation, On Canvas |
09/20/2023 | NLP Technologies and Linguistic Annotation, On Canvas |
09/25/2023 | Common NLP Pipelines, On Canvas |
09/27/2023 | Common NLP Pipelines, On Canvas |
10/02/2023 | Naïve Bayes, Text and Sentiment Classification, JM Ch. 4 |
10/04/2023 | Naïve Bayes, Text and Sentiment Classification, JM Ch. 4 |
10/09/2023 | Logistic Regression, JM Ch. 5 |
10/11/2023 | Logistic Regression, JM Ch. 5 |
10/16/2023 | Logistic Regression, JM Ch. 5 |
10/18/2023 | Vector Semantics and Embeddings, JM Ch. 6 |
10/23/2023 | Vector Semantics and Embeddings, JM Ch. 6 |
10/25/2023 | Vector Semantics and Embeddings, JM Ch. 6 |
10/30/2023 | Neural Networks, JM Ch. 7 |
11/01/2023 | Neural Networks, JM Ch. 7 |
11/06/2023 | PoS-tagging and NER, JM Ch. 8 |
11/08/2023 | RNNs and LSTMs, JM Ch. 9 |
11/13/2023 | RNNs and LSTMs, JM Ch. 9 |
11/15/2023 | Transformers and Pretrained Language Models, JM Ch. 10 |
11/20/2023 | Transformers and Pretrained Language Models, JM Ch. 10 |
11/22/2023 | Large Language Models, On Canvas |
11/27/2023 | Parsing and Semantic Processing, JM Ch. 18, 20, 21 |
11/29/2023 | Graph Models of Knowledge and Semantics, JM Ch. 18, 20, 21 |
12/04/2023 | Graph Models of Knowledge and Semantics, On Canvas |
12/06/2023 | Relation and Event Extraction, JM Ch. 21 |
12/11/2023 | Project Presentations |
12/13/2023 | Project Presentations |
We will be using the most recent 3rd edition of the textbook and additional material shared on Canvas.
Students are welcome to participate in NLP-Lab meetings and projects after consultation with the instructor. For more details, see https://nlp-lab.org/
This syllabus is subject to change and likely will change. All critical changes will be made in writing, with ample time for adjustment.
(C) 2024 by Damir Cavar