
Subject Code: Natural Language Processing
L, T, P, J, C: 3, 0, 0, 4, 4
Objectives: To give an in-depth overview of techniques for processing human languages via
computational treatment of words, phrases, sentences and meanings, and to show how to use
those techniques in developing real-world applications such as machine translation and
natural language interfaces.
Expected Outcome: After successfully completing the course, the student should be able to:

1. Process human languages such as English and Indian languages.
2. Develop computational methods for real-world applications such as machine
translation.
3. Implement a language technology experiment step by step.
4. For a given language technology problem, decide which computer programs are
suitable, install them, and apply them to linguistic data.
5. Explain the interaction between rule-based and probabilistic methods in language
technology.

SLOs 2,6,7,9
Module 1: INTRODUCTION TO NLP (L Hrs: 3, SLO: 2)
Introduction to various levels of natural language processing, ambiguities and
computational challenges in processing various natural languages. Introduction to
real-life applications of NLP such as spell and grammar checkers, information extraction,
question answering, and machine translation.

Module 2: TEXT PROCESSING (L Hrs: 6, SLO: 7)
Character Encoding, Word Segmentation, Sentence Segmentation, Introduction to
Corpora, Corpora Analysis.

Module 3: MORPHOLOGY (L Hrs: 6, SLO: 7)
Inflectional and Derivational Morphology, Morphological Analysis and Generation
using finite state transducers.

Module 4: LEXICAL SYNTAX (L Hrs: 6, SLO: 7)
Introduction to word types, POS Tagging, Maximum Entropy Models for POS tagging,
Multi-word Expressions.

Module 5: LANGUAGE MODELING (L Hrs: 6, SLO: 7)
The role of language models. Simple N-gram models. Estimating parameters and
smoothing. Evaluating language models.

Module 6: SYNTAX & SEMANTICS (L Hrs: 10, SLO: 7)
Introduction to phrases, clauses and sentence structure, Shallow Parsing and Chunking,
Shallow Parsing with Conditional Random Fields (CRF), Lexical Semantics, Word
Sense Disambiguation, WordNet, Thematic Roles, Semantic Role Labelling with
CRFs.

Module 7: APPLICATIONS OF NLP (L Hrs: 6, SLO: 6,7,9)
NL Interfaces, Text Summarization, Sentiment Analysis, Machine Translation,
Question Answering.

Module 8: RECENT TRENDS (L Hrs: 2, SLO: 7,17)
Recent Trends in NLP.

Projects may be given as group projects

1. Machine Translation.
2. Sentiment Analysis on Twitter Data.
3. Article Recommendations for News Feed.
4. Distinguishing Opinion from News.
5. Predicting Sentiment from Rotten Tomatoes Movie Reviews.
6. Information Extraction from Collection of Resumes.
7. Text Summarization.
8. Question Answering System.
9. Part of Speech Tagging using a Hidden Markov Model.
10. Semantics-based Text Mining of Biomedical Concepts in Scientific Publications
Text Books
1. Daniel Jurafsky and James H. Martin, “Speech and Language Processing”, 3rd edition, Prentice
Hall, 2009.

Reference Books
1. Chris Manning and Hinrich Schütze, “Foundations of Statistical Natural Language Processing”,
2nd edition, MIT Press, Cambridge, MA, 2003.
2. Nitin Indurkhya and Fred J. Damerau, “Handbook of Natural Language Processing”, Second Edition,
CRC Press, 2010.
3. James Allen, “Natural Language Understanding”, 8th Edition, Pearson, 2012.

Natural Language Processing


Knowledge Areas that contain topics and learning outcomes covered in the course

Knowledge Area Total Hours of Coverage

CS: IS (Intelligent Systems) & HCI (Human Computer Interaction) 45

Body of Knowledge coverage


[List the Knowledge Units covered in whole or in part in the course. If in part, please indicate
which topics and/or learning outcomes are covered. For those not covered, you might want to
indicate whether they are covered in another course or not covered in your curriculum at all.
This section will likely be the most time-consuming to complete, but is the most valuable for
educators planning to adopt the CS2013 guidelines.]
KA      Knowledge Unit  Topics Covered        Hours
CS: IS  NLP             Introduction to NLP   3
CS: IS  NLP             Text Processing       6
CS: IS  NLP             Morphology            6
CS: IS  NLP             Lexical Syntax        6
CS: IS  NLP             Language Modelling    6
CS: IS  NLP             Syntax & Semantics    10
CS: IS  NLP             Applications of NLP   6
CS: IS  NLP             Recent Trends         2
CS: IS  NLP             Total hours           45

Where does the course fit in the curriculum?


[In what year do students commonly take the course? Is it compulsory? Does it have pre-
requisites, required following courses? How many students take it?]

This course is:
• An elective course.
• Suitable from the 6th semester onwards.
• Knowledge of Theory of Computation is essential.

What is covered in the course?


[A short description, and/or a concise list of topics - possibly from your course syllabus. (This is
likely to be your longest answer)]

Part I: Introduction to NLP


Introduction to various levels of natural language processing, Ambiguities and computational
challenges in processing various natural languages. Introduction to Real life applications of NLP
such as spell and grammar checkers, information extraction, question answering, and machine
translation.
Part II: Text Processing

This section deals with Character Encoding, Word Segmentation, Sentence Segmentation,
and an introduction to Corpora and Corpora Analysis.
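
As a rough illustration of the text processing topics, the following Python sketch shows naive sentence and word segmentation with regular expressions, a simple frequency count over the resulting tokens, and the difference between characters and UTF-8 bytes. The function names split_sentences and tokenize and the sample text are assumptions made for this example; practical segmenters also have to handle abbreviations, numbers, and scripts that do not mark word boundaries with spaces.

```python
import re
from collections import Counter


def split_sentences(text):
    """Naive sentence segmentation: break after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def tokenize(sentence):
    """Naive word segmentation: runs of word characters, or single punctuation marks."""
    return re.findall(r"\w+|[^\w\s]", sentence)


# Hypothetical sample text, standing in for a real corpus.
text = "NLP is fun. Is it hard? Not always!"

sentences = split_sentences(text)
tokens = [tokenize(s) for s in sentences]

# Very small "corpus analysis": case-folded token frequencies.
counts = Counter(tok.lower() for sent in tokens for tok in sent)

# Character encoding: one character can occupy more than one byte in UTF-8.
word = "caf\u00e9"              # 4 characters, the last one is e-acute
encoded = word.encode("utf-8")  # 5 bytes, because e-acute takes 2 bytes

print(sentences)
print(tokens[0])
print(counts.most_common(3))
print(len(word), len(encoded))
```

The rule "split after sentence-final punctuation" fails on inputs such as "Dr. Smith", which is one reason sentence segmentation is treated as a topic in its own right.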

Part III: Morphology


This section deals with Inflectional and Derivational Morphology, and with Morphological
Analysis and Generation using finite state transducers.
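
To make the idea of morphological analysis with a finite state transducer more concrete, here is a minimal Python sketch. It is a toy under stated assumptions: the lexicon contains only two lowercase nouns, only the regular plural suffix "s" is modelled, spelling rules (as in "foxes") are ignored, and the names build_noun_fst, analyze, and the tags +N, +SG, +PL are chosen for this example only. Each arc reads one input symbol (or nothing, for an epsilon arc) and emits an output string.

```python
from collections import defaultdict

EPS = ""  # epsilon: an arc that consumes no input symbol


def build_noun_fst(nouns):
    """Build arcs as state -> set of (input_symbol, output_string, next_state).

    States are named after the stem prefix read so far; "END" is the single
    final state (safe here because the toy lexicon is lowercase).
    """
    arcs = defaultdict(set)
    for noun in nouns:
        # Copy the stem letter by letter from input to output.
        for i, ch in enumerate(noun):
            arcs[noun[:i]].add((ch, ch, noun[: i + 1]))
        # After a complete stem: nothing more to read means singular,
        # a trailing "s" means plural.
        arcs[noun].add((EPS, "+N+SG", "END"))
        arcs[noun].add(("s", "+N+PL", "END"))
    return arcs


def analyze(word, arcs, start="", final="END"):
    """Return all analyses of word, found by exploring every matching path."""
    results = []
    stack = [(start, 0, "")]  # (state, input position, output so far)
    while stack:
        state, pos, out = stack.pop()
        if state == final and pos == len(word):
            results.append(out)
            continue
        for sym, emit, nxt in arcs.get(state, ()):
            if sym == EPS:
                stack.append((nxt, pos, out + emit))
            elif pos < len(word) and word[pos] == sym:
                stack.append((nxt, pos + 1, out + emit))
    return sorted(results)


fst = build_noun_fst(["cat", "dog"])
print(analyze("cats", fst))  # ['cat+N+PL']
print(analyze("dog", fst))   # ['dog+N+SG']
print(analyze("cow", fst))   # [] because "cow" is not in the toy lexicon
```

Generation runs the same machine in the other direction, reading the tagged form and emitting the surface word; this sketch implements only the analysis direction.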

Part IV: Lexical Syntax


This section introduces word types, POS Tagging, Maximum Entropy Models for POS tagging, and
Multi-word Expressions.

Part V: Language Modelling


This section deals with the role of language models, simple N-gram models, parameter
estimation and smoothing, and the evaluation of language models.
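
The three steps named here (estimating N-gram parameters, smoothing, and evaluation) can be sketched in a few lines of Python. The example below is one possible instance, not the course's prescribed method: it uses a bigram model, add-one (Laplace) smoothing, and perplexity as the evaluation measure. The two-sentence corpus and the function names train_bigram, prob, and perplexity are assumptions for illustration, and the model assumes every evaluated word is in the training vocabulary.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count bigrams and their histories over sentences padded with <s> and </s>."""
    history, bigrams = Counter(), Counter()
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        history.update(tokens[:-1])  # every token that is followed by another
        bigrams.update(zip(tokens[:-1], tokens[1:]))
    # Words that can be predicted: all corpus words plus the end marker.
    vocab = {w for sent in sentences for w in sent} | {"</s>"}
    return history, bigrams, vocab


def prob(w_prev, w, history, bigrams, vocab):
    """Add-one smoothed estimate of P(w | w_prev)."""
    return (bigrams[(w_prev, w)] + 1) / (history[w_prev] + len(vocab))


def perplexity(sentences, history, bigrams, vocab):
    """exp of the average negative log-probability per predicted token."""
    log_sum, n = 0.0, 0
    for sent in sentences:
        tokens = ["<s>"] + sent + ["</s>"]
        for w_prev, w in zip(tokens[:-1], tokens[1:]):
            log_sum += math.log(prob(w_prev, w, history, bigrams, vocab))
            n += 1
    return math.exp(-log_sum / n)


# Hypothetical toy corpus, already tokenized.
corpus = [["the", "cat", "sat"], ["the", "dog", "sat"]]
history, bigrams, vocab = train_bigram(corpus)

print(prob("the", "cat", history, bigrams, vocab))  # (1 + 1) / (2 + 5) = 2/7
print(prob("the", "sat", history, bigrams, vocab))  # unseen bigram: (0 + 1) / (2 + 5) = 1/7
print(perplexity(corpus, history, bigrams, vocab))
```

Without smoothing, the unseen bigram ("the", "sat") would receive probability zero and the perplexity of any sentence containing it would be undefined; add-one smoothing is the simplest way to avoid that, at the cost of taking a lot of probability mass away from observed bigrams.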

Part VI: Syntax and Semantics


This section introduces phrases, clauses and sentence structure, and covers Shallow Parsing
and Chunking, Shallow Parsing with Conditional Random Fields (CRF), Lexical Semantics,
Word Sense Disambiguation, WordNet, Thematic Roles, Semantic Role Labelling with CRFs.

Part VII: Applications of NLP


This section deals with real-world applications such as NL Interfaces, Text Summarization,
Sentiment Analysis, Machine Translation, and Question Answering.

Part VIII: Recent Trends

This section deals with recent trends in Natural Language Processing.

What is the format of the course?


[Is it face to face, online or blended? How many contact hours? Does it have lectures, lab
sessions, discussion classes?]

This course is designed with 165 minutes of in-classroom sessions per week, 60 minutes of
video/reading instructional material per week, and 200 minutes of non-contact time per week
spent on implementing a course-related project. In general, the course combines lectures,
in-class discussion, case studies, guest lectures, mandatory off-class reading material,
and quizzes.
How are students assessed?
[What type, and number, of assignments are students expected to do? (papers, problem sets,
programming projects, etc.). How long do you expect students to spend on completing assessed
work?]

• Students are assessed on a combination of group activities, classroom discussion, projects,
and continuous and final assessment tests.
• Additional weightage will be given based on their rank in crowd-sourced projects or
Kaggle-like competitions.
• Students can earn additional weightage for a certificate of completion of a related
MOOC.

Additional topics
[List notable topics covered in the course that you do not find in the CS2013 Body of
Knowledge]

Other comments [optional]

Session wise plan


Student Outcomes Covered: 2, 6, 7, 9

Class Hour   Topic Covered   Levels of mastery   Reference Book   Remarks
3 Introduction to NLP Familiarity 1
3 Character Encoding, Word and Sentence Segmentation Familiarity 1
3 Introduction to Corpora and Corpora Analysis Familiarity 1
3 Morphological Analysis Usage 1
3 Finite State Transducers. Assessment 1
2 POS Tagging Usage 1
2 Maximum Entropy Models for POS tagging Usage 1
2 Multi-word Expressions Usage 1
3 n-gram Modeling Familiarity 1
3 Smoothing Assessment 1
2 Introduction to phrases, clauses and sentence structure Familiarity 1
2 CFGs, PCFG, Earley Parsing Algorithm Familiarity 1
2 Shallow Parsing and Chunking Familiarity 1
2 Thematic Roles, Semantic Role Labelling with CRFs Assessment 1
2 Word Sense Disambiguation & WordNet Familiarity 1,2,3
2 NL Interfaces, Text Summarization Familiarity 1,3
4 Sentiment Analysis, Machine Translation, QA Familiarity 1,2,3
2 Recent Trends Familiarity 1,2,3
Total: 45 Hrs (3 credit hrs/week, 15-week schedule)
