CSCI-B 659 Topics in Artificial Intelligence

LING-L 665 Applying Machine Learning Techniques in Computational Linguistics - Neural Networks, Deep Learning for CL/NLP
Spring 2018 at Indiana University

See the online document for the full syllabus.


Content


Introduction

This is a graduate course that focuses on introducing the machine learning techniques used in Computational Linguistics.

Machine learning problems in CL are atypical for machine learning because natural language exhibits a significant number of exceptions. The course will provide an overview of the most important machine learning algorithms, but it will mostly focus on how to apply machine learning to CL problems such as co-reference resolution, morphological analysis, parsing, and word sense disambiguation. Beyond these core tasks in ML for CL (and NLP) applications, we will discuss deep learning approaches and work with neural network models applied to traditional CL and NLP problems.

Among others, we will cover word vector representations, window-based neural networks, recurrent neural networks, long short-term memory models, recursive neural networks, and convolutional neural networks.
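To give a first taste of the word-vector material, here is a minimal sketch of cosine similarity over dense word vectors; the 4-dimensional vectors below are toy values invented purely for illustration, not trained embeddings:

```python
import numpy as np

# Toy 4-dimensional word vectors (invented values, purely illustrative).
vectors = {
    "king":  np.array([0.8, 0.1, 0.7, 0.2]),
    "queen": np.array([0.7, 0.2, 0.8, 0.3]),
    "apple": np.array([0.1, 0.9, 0.0, 0.8]),
}

def cosine(u, v):
    # Cosine similarity: dot product of the two vectors divided by
    # the product of their lengths.
    return float(u @ v / (np.linalg.norm(u) * np.linalg.norm(v)))

# Semantically related words should score higher than unrelated ones.
print(cosine(vectors["king"], vectors["queen"]))  # high (close to 1)
print(cosine(vectors["king"], vectors["apple"]))  # noticeably lower
```

Trained word2vec or GloVe vectors behave the same way, just in hundreds of dimensions instead of four.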

The course is a series of lectures and hands-on programming exercises.

The course uses material provided by:

These courses are accompanied by videos, slides, research papers, links to supplemental material and tutorials, and other very valuable information. Please use these resources during our course.

Prerequisites and Requirements

I expect that you are able to code the examples in Python or Go, or that you will acquire the skills to do so. If you have no programming experience, follow the links here and on the course sites mentioned above to learn Python and NumPy.

Recommended

  • Learn Python or Go; if you have never programmed before, learn Python first. I recommend using Python 3.x or the most recent release of Go.
  • Install and learn about TensorFlow and word2vec.
  • Refresh your knowledge of Calculus and Linear Algebra.
  • Update your knowledge of Probability Theory.
  • Refresh your knowledge of common Machine Learning approaches.
  • Familiarize yourself with common Linguistic concepts and theories, in particular lexical properties, syntax, semantics, speech; for basic introductions consult Jurafsky and Martin (2017, draft, 3rd ed.) or Bender (2013).
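As a quick self-check tying the calculus and NumPy recommendations together, the following sketch (assuming only NumPy is installed) compares a hand-derived gradient against a central-difference numerical approximation; the function `f` is an arbitrary example chosen for illustration:

```python
import numpy as np

def f(x):
    # A simple scalar function of a vector: f(x) = sum(x^2).
    return np.sum(x ** 2)

def numerical_gradient(f, x, eps=1e-5):
    # Central-difference approximation of the gradient of f at x:
    # grad[i] ≈ (f(x + eps*e_i) - f(x - eps*e_i)) / (2*eps)
    grad = np.zeros_like(x)
    for i in range(x.size):
        step = np.zeros_like(x)
        step[i] = eps
        grad[i] = (f(x + step) - f(x - step)) / (2 * eps)
    return grad

x = np.array([1.0, -2.0, 3.0])
analytic = 2 * x                       # d/dx sum(x^2) = 2x
numeric = numerical_gradient(f, x)
print(np.allclose(analytic, numeric))  # True
```

The same gradient-checking idea returns later in the course (see the Feb. 27 session) as a standard debugging tool for backpropagation code.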

Work through all the relevant Jupyter notebooks at: Python tutorials for NLP, ML, AI

Literature

I do not require any textbook, but I recommend the following:

  • Dan Jurafsky and James H. Martin. Speech and Language Processing (3rd ed. draft) (available online)
  • Yoav Goldberg. A Primer on Neural Network Models for Natural Language Processing. (download paper)
  • Ian Goodfellow, Yoshua Bengio, and Aaron Courville. Deep Learning. MIT Press. (read online), book

If you are new to natural language, computational linguistics, or NLP, take a look at this book:

  • Bender, Emily M. (2013) Linguistic Fundamentals for Natural Language Processing: 100 Essentials from Morphology and Syntax. Synthesis Lectures on Human Language Technologies #20. Morgan & Claypool Publishers.

We will read the following papers:

  • ...

Recommended tutorials:




Schedule

Jan. 09 Introduction, Syllabus and Schedule
Jan. 11 Introduction to NLP and CL Read: Bender 2013 for Linguistics
Manning & Schuetze: Ch. 1
Jan. 16 Probability Review Maleki & Do: Review of Probability Theory
Manning & Schuetze: Ch. 2
Goodfellow et al.: Ch. 3
Jan. 18 Linear Algebra Review Kolter (and Do) Linear Algebra Review and References
Goodfellow et al.: Ch. 2
Jan. 23 Optimization; Python, NLTK, WordNet, spaCy Kolter & Lee: Convex Optimization Review
More Optimization (SGD) Review
Jurafsky & Martin: Ch. 17
Bird et al (2009)
Pilgrim (2009)
Jan. 25 Vectors and Word2vec Jurafsky & Martin: Ch. 15 & 16
Word2Vec Tutorial - The Skip-Gram Model
Mikolov et al.: Distributed Representations of Words and Phrases and their Compositionality
Mikolov et al.: Efficient Estimation of Word Representations in Vector Space
Jan. 30 Numpy and Word2vec applied Johnson: Python Numpy tutorial
Lit., see previous session
Feb. 1 Word Window Classification and Neural Networks Jurafsky & Martin: Ch. 8
Feb. 6 Word Window Classification and Neural Networks Jurafsky & Martin: Ch. 8
Feb. 8 Advanced Word Vector Models Jurafsky & Martin: Ch. 16
Feb. 13 Advanced Word Vector Models Jurafsky & Martin: Ch. 16
Feb. 15 Neural Networks, Single Layer Networks Jurafsky & Martin: Ch. 8
Goodfellow et al.: Ch. 6
Feb. 20 Backpropagation UFLDL tutorial
Rumelhart et al.: Learning Representations by Back-propagating Errors
Feb. 22 Backpropagation, NNs, QA, Semantics Collobert et al.: Natural Language Processing (almost) from Scratch
Iyyer et al.: A Neural Network for Factoid Question Answering over Paragraphs
Socher et al.: Grounded Compositional Semantics for Finding and Describing Images with Sentences
Karpathy and Fei-Fei: Deep Visual-Semantic Alignments for Generating Image Descriptions
Socher et al.: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Feb. 27 Gradients, Overfitting, Activation Function Bengio: Practical recommendations for gradient-based training of deep architectures
UFLDL page on gradient checking
Mar. 1 Tensorflow Abadi et al.: TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems
Tensorflow tutorials
Mar. 6 Tensorflow Lit., see above
Mar. 8 Recurrent Neural Networks and Language Models Mikolov et al.: Recurrent neural network based language model
Mikolov et al.: Extensions of recurrent neural network language model
Irsoy and Cardie: Opinion Mining with Deep Recurrent Neural Networks
Mar. 20 Gated Feedback Recurrent NNs, Long Short-Term Memory for Machine Translation Hochreiter and Schmidhuber: Long Short-Term Memory
Chung et al.: Gated Feedback Recurrent Neural Networks
Chung et al.: Empirical Evaluation of Gated Recurrent Neural Networks on Sequence Modeling
Mar. 22 Gated Feedback Recurrent NNs, Long Short-Term Memory for Machine Translation Lit., see above
Mar. 27 Recursive Neural Networks, Parsing Goodfellow et al.: Ch. 10
Socher et al.: Parsing with Compositional Vector Grammars
Ratliff et al.: Subgradient Methods for Structured Prediction
Socher et al.: Parsing Natural Scenes and Natural Language with Recursive Neural Networks
Mar. 29 Recursive Neural Networks, Sentiment Analysis Socher et al.: Recursive Deep Models for Semantic Compositionality Over a Sentiment Treebank
Socher et al.: Dynamic Pooling and Unfolding Recursive Autoencoders for Paraphrase Detection
Tai et al.: Improved Semantic Representations From Tree-Structured Long Short-Term Memory Networks
Apr. 3 Convolutional Neural Networks, Sentence Classification Goodfellow et al.: Ch. 9
Kim: Convolutional Neural Networks for Sentence Classification
Apr. 5 General topics: ML, Speech Recognition Senior: Oxford DL Course: Speech Recognition chapter
Hinton et al.: Deep Neural Networks for Acoustic Modeling in Speech Recognition
Apr. 10 General topics: Dynamic Memory Networks Kumar et al.: Ask Me Anything: Dynamic Memory Networks for Natural Language Processing
Apr. 12 Discussion and Practical Experiments TBA
Apr. 17 Issues with Deep Learning and NLP Marcus 2018: Innateness, AlphaZero, and Artificial Intelligence
Apr. 19 Issues with Deep Learning and NLP Marcus 2018: Deep Learning: A Critical Appraisal
Apr. 24 Project presentations see Projects and Reports below
Apr. 26 Project presentations see Projects and Reports below


Projects and Reports

  • Taslima Akter
    Presentation: Deep Visual-Semantic Alignments for Generating Image Descriptions. Slides (pdf)
    Project: Automatic Question Detection in Speech. Slides (pdf)
  • Gleb Alexeev
    Presentation: Recursive Deep Models for Semantic Compositionality over a Sentiment Treebank. Slides (pdf)
    Project: Music Genre and Feature Classification Using ML and NLP Techniques via Visual, Audio, and Linguistic Analysis. Slides (pdf)
  • Khandokar Md. Nayem
    Presentation: Activation Function. Slides (pdf)
    Project: Automatic Question Detection in Speech. Slides (pdf)
  • Scott McCaulay
    Presentation: Backpropagation: Backward Propagation of Errors to Train Artificial Neural Networks. Slides (pdf)
    Project: Predicting Community Assessment: Voting for Questions and Answers in Stack Overflow. Slides (pdf)
  • Carlos Sathler
    Presentation: Neural Nets in NLP Competitions: Two Recent Examples from Kaggle. Slides (pdf), YouTube video/presentation
    Project: ... Slides (pdf)

Other presentations:

  • Document Modeling with Gated Recurrent Neural Networks
  • Recursive Neural Networks
  • Semantic Compositionality Using Recursive Deep Models
  • Opinion Mining with Deep RNN
  • Neural Networks and Single Layer Networks
  • Dynamic Memory Networks
  • Efficient Estimation of Word Representations in Vector Space
  • Gradient Descent

Other final projects:

  • Part-of-Speech Tagging with Recurrent Deep Neural Networks and Letter Order
  • Dialog Act Classification for Speech Assistants
  • Author Identification Using CNNs and RNNs
  • Question Answering Systems Using Memory Networks
  • Sarcasm Detection using Neural Networks
