Computational Semantics for Natural Language Processing

ETH Zürich, Spring Semester 2021: Course catalog

Course Description

This course presents an introduction to natural language processing (NLP) with an emphasis on computational semantics, i.e., the process of constructing and reasoning with meaning representations of natural language text.

The objective of the course is to learn about various topics in computational semantics and its importance in natural language processing methodology and research. Exercises and the project will be key parts of the course, so students will gain hands-on experience with state-of-the-art techniques in the field.


The final assessment will be a combination of classroom participation, graded exercises and the project. There will be 3 exercise sets, each a mix of theoretical and implementation problems. Exercises will be released roughly every 4 weeks and will together count for 30% of your grade. Classroom participation (including a research paper presentation) will account for 20% of the grade. The project will account for the rest of the grade (50%). There will be no written exams.
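
As a rough illustration of how the weights above combine, here is a minimal sketch. It assumes every component is scored on a 0-100 scale and that the three exercise sets count equally within their 30% share; neither assumption is stated on this page, and the function name `final_score` is made up for this example.

```python
# Grade weights as stated in the course description.
WEIGHTS = {"exercises": 0.30, "participation": 0.20, "project": 0.50}


def final_score(exercise_scores, participation, project):
    """Combine component scores into one weighted score.

    Assumption: all scores are on a 0-100 scale.
    Assumption: the exercise sets are weighted equally within the 30% share.
    """
    exercises = sum(exercise_scores) / len(exercise_scores)
    return (
        WEIGHTS["exercises"] * exercises
        + WEIGHTS["participation"] * participation
        + WEIGHTS["project"] * project
    )


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(final_score([80, 90, 70], participation=85, project=90))
```

Because the project carries half of the weight, a 10-point change in the project score moves the total by 5 points, while the same change in a single exercise set moves it by 1 point under the equal-weighting assumption.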

Lectures: Thu 10:15-12:00 Zoom link:

Discussion Sections: Thu 15:15-16:00 (same Zoom link)


Textbooks: We will not follow any particular textbook. We will draw material from a number of research papers. However, you might find the following textbooks useful:

  1. Introduction to Natural Language Processing by Jacob Eisenstein
  2. Speech and Language Processing by Jurafsky and Martin


18.02   Class website is online!

Course Schedule

 Lecture  Date  Description  Course Materials  Events
  1  25.02  Introduction
Diagnostic Quiz
Answers to quiz
Presentation Preference Indication
  2  04.03  The Distributional Hypothesis and Word Vectors
Suggested Readings:
     1. Word2Vec Tutorial - The Skip-Gram Model
     2. Efficient Estimation of Word Representations in Vector Space (original word2vec paper)
     3. Distributed Representations of Words and Phrases and their Compositionality (negative sampling paper)
  3  11.03  Word Vectors 2, Word Senses and Sentence Vectors

(Recursive Neural Networks)
Suggested Readings:
     1. GloVe: Global Vectors for Word Representation (original GloVe paper)
     2. Neural Word Embedding as Implicit Matrix Factorization
     3. Evaluation Methods for Unsupervised Word Embeddings
     4. Word Senses and Word Embeddings Chapter in Jurafsky and Martin
Optional Readings:
     1. A Latent Variable Model Approach to PMI-based Word Embeddings
     2. Linear Algebraic Structure of Word Senses, with Applications to Polysemy
     3. On the Dimensionality of Word Embedding
 Voluntary  11.03  Python and PyTorch review session by TAs
Suggested Readings:
     1. Review of Differential Calculus
Additional Readings:
     1. Natural Language Processing (Almost) from Scratch
 Voluntary  11.03  Matrix Calculus and Backpropagation by TAs
Suggested Readings:
     1. CS231n notes on network architectures
     2. CS231n notes on backprop
     3. Learning Representations by Backpropagating Errors
     4. Derivatives, Backpropagation, and Vectorization
     5. Yes you should understand backprop
 4  18.03  From words to sentences…

Recurrent Neural Networks for Language

Case Study: Language Modelling
Suggested Readings:
     1. N-gram Language Models (textbook chapter)
     2. The Unreasonable Effectiveness of Recurrent Neural Networks (blog post overview)
     3. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.1 and 10.2)
Optional Readings (RNNs):
     1. Sequence Modeling: Recurrent and Recursive Neural Nets (Sections 10.3, 10.5, 10.7-10.12)
     2. Learning Long-term Dependencies with Gradient Descent is Difficult (one of the original vanishing gradient papers)
     3. On the Difficulty of Training Recurrent Neural Networks (proof of vanishing gradient problem)
     4. Vanishing Gradients Jupyter Notebook (demo for feedforward networks)
     5. Understanding LSTM Networks (blog post overview)
Project group formation due

Assignment 1 out
 5  25.03  NLU beyond a sentence

Seq2Seq and Attention

Case Study: Sentence Similarity, Textual Entailment and Machine Comprehension
Suggested Readings:
     1. Sequence to Sequence Learning with Neural Networks (original seq2seq NMT paper)
     2. Sequence Transduction with Recurrent Neural Networks (early seq2seq speech recognition paper)
     3. Neural Machine Translation by Jointly Learning to Align and Translate (original seq2seq+attention paper)
Optional Readings:
     1. Attention and Augmented Recurrent Neural Networks (blog post overview)
     2. Massive Exploration of Neural Machine Translation Architectures (practical advice for hyperparameter choices)
List of TA mentored projects released
 6  01.04  Syntax

Dependency and Constituency Parsing
Suggested Readings (Dependency Parsing):
     1. Incrementality in Deterministic Dependency Parsing
     2. A Fast and Accurate Dependency Parser using Neural Networks
     3. Globally Normalized Transition-Based Neural Networks
Suggested Readings (Constituency Parsing):
     1. Parsing with Compositional Vector Grammars
     2. Constituency Parsing with a Self-Attentive Encoder
 Easter  08.04       
 7  15.04  Syntax II and Predicate Argument Structures

(Semantic Role Labelling, Frame Semantics, etc.)
Suggested Readings:
     1. Semantic Role Labelling chapter of Jurafsky and Martin
     2. Jointly Predicting Predicates and Arguments in Neural Semantic Role Labeling
Assignment 2 out

Assignment 1 due
   15.04  Discussion on Final Projects
Suggested Readings:
     1. Practical Methodology (Deep Learning book chapter)
 8  22.04  Predicate Argument Structures II

(Semantic Role Labelling, Frame Semantics, etc.)
  Project Proposal due
 9  29.04  Formal Representations of Language Meaning
Suggested Readings:
     1. Logical Representations chapter of Jurafsky and Martin
 10  06.05  Transformers and Contextual Word Representations (BERT, etc.)

Guest lecture by Manzil Zaheer (Google)
Suggested Readings:
     1. Attention Is All You Need
     2. The Illustrated Transformer
Optional Readings:
     1. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding
     2. Contextual Word Representations: A Contextual Introduction
     3. The Illustrated BERT, ELMo, and co.
Assignment 2 due
Ascension  13.05  No class due to the Ascension break    
 11  20.05  Natural Language Generation

Case Study: Summarization and Conversation Modelling
Suggested Readings:
     1. The Curious Case of Neural Text Degeneration
     2. Get To The Point: Summarization with Pointer-Generator Networks
     3. Hierarchical Neural Story Generation
     4. How NOT To Evaluate Your Dialogue System
Assignment 3 out
 12  27.05  Modelling and tracking entities: NER, coreference and information extraction (entity and relation extraction)
Suggested Readings:
     1. Coreference Resolution chapter of Jurafsky and Martin
     2. End-to-end Neural Coreference Resolution
     3. Information Extraction chapter of Jurafsky and Martin
 13  03.06  Language + {Knowledge, Vision, Action}
Suggested Readings:
     1. Language Models as Knowledge Bases?
     2. Knowledge Enhanced Contextual Word Representations
  17.06     Assignment 3 due
  28.06     Project Progress Report due
   08.08  Final project presentation (or poster session)    
   08.08  Final project report submission    

Assignment Submission Instructions




You can ask questions on Piazza. Please post questions there so that others can see them and join the discussion. If you have questions that are not of general interest, please don’t hesitate to contact us directly.

Lecturer: Mrinmaya Sachan
Teaching Assistants: Jiaoda Li, Shehzaad Dhuliawala, Yifan Hou