Semester: Fall Semester, 2019
Department: MA Program of Computer Science, First Year; MA Program of Computer Science, Second Year
Course Name: Natural Language Processing
Instructor: Huang Hen Hsen
Credit: 3.0
Course Type: Elective
Prerequisite
Course Objective
Course Description
Course Schedule

Date Subject
9/11

Introduction to NLP



An overview of natural language processing



Hours: 3


9/18

Linguistic Essentials



A brief introduction to linguistics and its applications in NLP



Hours: 3


9/25

Collocation



Mining collocated words from a collection of documents



Hours: 3 + 3 (codelab)
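
As an illustration of the collocation subject above, the following is a minimal sketch that ranks adjacent word pairs by pointwise mutual information (PMI). It is not taken from the course material: the function name `pmi_collocations`, the `min_count` threshold, whitespace tokenization, and the toy documents are all assumptions made for this example, and PMI is only one of several association measures that could be used.

```python
import math
from collections import Counter


def pmi_collocations(docs, min_count=2):
    """Rank adjacent word pairs by PMI (hypothetical helper for illustration)."""
    unigrams, bigrams = Counter(), Counter()
    for doc in docs:
        tokens = doc.lower().split()  # assumption: whitespace tokenization
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))

    n_uni = sum(unigrams.values())
    n_bi = sum(bigrams.values())
    scores = {}
    for (w1, w2), count in bigrams.items():
        if count < min_count:
            continue  # rare pairs give unreliable PMI estimates
        p_pair = count / n_bi
        p_w1 = unigrams[w1] / n_uni
        p_w2 = unigrams[w2] / n_uni
        scores[(w1, w2)] = math.log2(p_pair / (p_w1 * p_w2))
    # Highest PMI first
    return sorted(scores.items(), key=lambda kv: kv[1], reverse=True)


# Toy documents invented for this example
docs = [
    "new york is a big city",
    "she moved to new york last year",
    "a big city needs a big budget",
]
ranked = pmi_collocations(docs)
for pair, score in ranked:
    print(pair, round(score, 2))
```

On this toy input the pair ("new", "york") scores highest, because the two words almost always occur together relative to how often each occurs alone.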


10/2

Language Modeling



The basic concepts of language modeling and its applications, together with smoothing algorithms.



Hours: 3 + 3 (codelab)
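
To make the role of smoothing concrete, here is a minimal sketch of a bigram language model with add-one (Laplace) smoothing. It is an assumption-laden illustration, not the course's own codelab: the class name `BigramLM`, the `<s>` and `</s>` boundary markers, and the toy sentences are invented here, and add-one is simply the easiest smoothing method to show.

```python
from collections import Counter


class BigramLM:
    """Bigram model with add-one smoothing (hypothetical class for illustration)."""

    def __init__(self, sentences):
        self.unigrams = Counter()
        self.bigrams = Counter()
        for sentence in sentences:
            # Assumption: whitespace tokenization with sentence boundary markers
            tokens = ["<s>"] + sentence.split() + ["</s>"]
            self.unigrams.update(tokens)
            self.bigrams.update(zip(tokens, tokens[1:]))
        self.vocab_size = len(self.unigrams)

    def prob(self, prev, word):
        """P(word | prev) with one pseudo-count added to every bigram."""
        return (self.bigrams[(prev, word)] + 1) / (
            self.unigrams[prev] + self.vocab_size
        )


# Toy corpus invented for this example
lm = BigramLM(["the cat sat", "the dog sat", "the cat ran"])
print(lm.prob("the", "cat"))  # seen bigram: (2 + 1) / (3 + 7) = 0.3
print(lm.prob("the", "ran"))  # unseen bigram: (0 + 1) / (3 + 7) = 0.1
```

Without smoothing, the unseen bigram "the ran" would receive probability zero, which would zero out the probability of any sentence containing it.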


10/9

Word Sense Disambiguation



Two approaches to word sense disambiguation.



Hours: 3 + 3 (codelab)


10/16

Final Project Announcement



An overview and description of the final project.



Hours: 3 


10/23

Text Classification



Basic statistical models for text classification and feature extraction. 



Hours: 3 + 3 (codelab)


10/30

POS Tagging



Introduction to sequence labeling and its important applications in NLP, including part-of-speech tagging. POS tagging in both Chinese and English will be described.



Hours: 3 + 3 (codelab)


11/6 Midterm Exam
11/13

Chinese Word Segmentation



Chinese word segmentation is a distinctive problem in NLP. This subject introduces how to perform text segmentation with sequence labeling.



Hours: 3 + 3 (codelab)
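
The idea of casting segmentation as sequence labeling can be sketched as follows: each character receives a tag, and word boundaries are read off the tag sequence. This example assumes the common B/M/E/S scheme (begin, middle, end, single-character word); the syllabus does not state which tag set the course uses, and the helper names `words_to_tags` and `tags_to_words` are invented for illustration. A trained sequence labeler would predict the tags; here they are derived from a gold segmentation.

```python
def words_to_tags(words):
    """Convert a segmented sentence into per-character B/M/E/S tags."""
    tags = []
    for word in words:
        if len(word) == 1:
            tags.append("S")  # single-character word
        else:
            tags.extend(["B"] + ["M"] * (len(word) - 2) + ["E"])
    return tags


def tags_to_words(chars, tags):
    """Rebuild words from characters and their B/M/E/S tags."""
    words, current = [], ""
    for ch, tag in zip(chars, tags):
        current += ch
        if tag in ("E", "S"):  # a word ends at this character
            words.append(current)
            current = ""
    if current:  # tolerate a tag sequence that ends mid-word
        words.append(current)
    return words


# Example sentence invented for this sketch
gold = ["自然", "語言", "處理", "很", "有趣"]
chars = "".join(gold)
tags = words_to_tags(gold)
print(tags)  # ['B', 'E', 'B', 'E', 'B', 'E', 'S', 'B', 'E']
print(tags_to_words(chars, tags))
```

Framing the task this way lets any sequence labeling model, such as those used for POS tagging, be reused for segmentation.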


11/20

Word Embeddings



Word embeddings are useful in many applications. Basic models for training word embeddings, including CBOW, skip-gram, and fastText, will be introduced.



Hours: 3 + 3 (codelab)


11/27

Neural Networks for NLP



Deep neural networks play an important role in modern NLP. This subject introduces the convolutional neural network (CNN) and recurrent neural network (RNN) in NLP. Advanced topics such as attention and pre-trained sentence embeddings are also covered.



Hours: 3 + 3 (codelab)


12/4

Parsing



Parse trees provide rich information for natural language understanding. This subject introduces two basic parsing schemes and computational models for parsing.



Hours: 3 + 3 (codelab)


12/11

Discourse Analysis



Many novel applications are based on discourse analysis. This subject introduces discourse relation recognition and discourse parsing. Other topics in discourse analysis will be briefly described. 



Hours: 3


12/18

Semi-supervised Approaches to NLP



Semi-supervised learning is extremely useful in NLP because training data is usually insufficient for novel tasks. Strategies for training models in a semi-supervised fashion will be introduced.



Hours: 3 + 3 (codelab) 


12/25

Final Project



Presentation of the final project



Hours: 3 


1/1 Off
1/8 Term Exam

Teaching Methods
Teaching Assistant
Requirement/Grading

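The midterm and term exams are written exams taken in person. Questions cover the techniques and concepts taught in class, as well as applying those techniques to solve practical problem scenarios.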



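The term project will use forward-looking, practical topics, and a development dataset will be provided. It is carried out as a team codelab over one to two months. Evaluation is based mainly on performance, ranking, and the novelty of the method.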



 



Midterm exam: 20%



Term exam: 30%



Term project: 30%



Assignments: 20%


Textbook & Reference

Christopher D. Manning and Hinrich Schütze, Foundations of Statistical Natural Language Processing, MIT Press. 1999.


URLs about Course
Attachment