Subject

Mathematics

Program

Master of Science (MSc)

Location

Côte-des-Neiges

Instruction mode

On-site learning

Credits

Description

This course will cover the fundamental tools and models used for analyzing and generating natural language data in a computational setting. We will learn about the core principles behind contemporary natural language processing (NLP) methods.

The subjects covered will include the structure of natural language data and how it can be used for downstream tasks such as document search and classification, text generation, and text summarization, using both heuristic-based and neural-network-based language modeling.

Students are expected to have previous experience with the Python programming language; no previous experience with machine learning is required.

Themes covered

The nature of text data
Preprocessing: tokenization and lemmatization
Bag-of-words topic models and naive classification
N-gram language models (Markov models)
Hidden Markov models and part-of-speech tagging
Distributed representations and vector semantics
Recurrent neural language models: LSTMs and language generation
Transformers and masked language modeling
Encoder models and semantic search
Encoder-decoder models: text summarization and translation

Important notes

Course in French: MATH 60621
Course reserved for students admitted to a graduate program (or to the B.A.A. if the course is part of the specialization).

MATH 60621A
