Sangeeta
FSNLP - Introduction
Course Book
Foundations of Statistical Natural Language Processing
Computational Linguistics
The study of computer systems for understanding and generating natural languages
How sentences are generated
How people communicate with each other
Computational Linguistics
Rules
To distinguish well-formed and ill-formed utterances
All grammars leak: people bend grammar rules to meet their communication needs
Rationalist approach
Common patterns
Statistical NLP
Known as "counting things"
Empiricist approach
Rationalist Approach
1960-1985
Noam Chomsky
Chomskyan linguistics
A significant part of the knowledge in the human mind is not derived by the senses but is fixed in advance, presumably by genetic inheritance.
Key parts are hardwired in the brain
Empiricist Approach
1920-1960, 198X-
The mind is not a tabula rasa: it starts with general operations for organizing and generalizing sensory input
Pattern recognition, association and generalization
Language structure can be learned by specifying a general language model and applying statistical processing to a large amount of language use
Rationalist approach: linguistic competence
Knowledge of language structure in the mind of a native speaker
Empiricist approach: linguistic performance
Delivery of language by a speaker
Affected by many factors, such as distracting noise in the environment
not
rules)
Sometimes it is difficult even for the average human being
Any answers why?
world?
grammar
On the basis of competent grammar
say?
Sees only syntax (rule-based approach)
say?
Leads to a move toward a non-categorical view, i.e. the empiricist approach
Categorically dividing sentences into correct or wrong gives little or no information
Sometimes it is very difficult to identify
#Exercise
Identify which sentences are grammatically correct
1. John I believe Sally said Bill believed Sue saw.
2. What did Sally whisper that she had secretly read?
3. John wants very much for himself to win.
4. (Those are) the books you should read before it becomes difficult to talk about.
5. (Those are) the books you should read before talking about becomes difficult.
6. Who did Jo think said John saw him?
speech
Example:
While (noun, meaning "time"): take a while
While (complementizer): while you were out
Valid today, although it was invalid before
I am googling
(simultaneously)
say?
Example: In addition to this, she insisted that
convention
idea)
Convention changes gradually and can be
say?
The empiricist approach finds common patterns
Simple sentences are clearly acceptable or unacceptable
Non-Categorical
Meanings of words change gradually
kind of / sort of
Do not behave as a normal noun + preposition pair
Example:
He is kind of hungry
He sort of understood what's going wrong
Probabilistic
The argument for a probabilistic approach to cognition is that we live in a world filled with uncertainty and incomplete information.
Unseen events
Ambiguity
scalable
Example:
Verb: swallow
Rule: requires an animate being as subject and a physical object as object
I swallowed his story
The supernova swallowed the planet
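The brittleness of a categorical rule for "swallow" can be sketched in a few lines. This is only an illustration: the word sets `ANIMATE` and `PHYSICAL` and the function `swallow_rule_ok` are assumptions made up for the example, not part of the slides or of any library.

```python
# Toy lexicon (illustrative assumption, not taken from the slides).
ANIMATE = {"I", "dog", "child"}
PHYSICAL = {"pill", "bone", "planet"}


def swallow_rule_ok(subject: str, obj: str) -> bool:
    """Categorical rule: animate subject AND physical object."""
    return subject in ANIMATE and obj in PHYSICAL


examples = [
    ("dog", "bone"),          # literal use
    ("I", "story"),           # "I swallowed his story"
    ("supernova", "planet"),  # "The supernova swallowed the planet"
]

# Apply the hard rule to each (subject, object) pair.
results = {pair: swallow_rule_ok(*pair) for pair in examples}

for (subj, obj), ok in results.items():
    print(f"{subj} swallowed {obj}: {'accepted' if ok else 'rejected'}")
```

The rule accepts the literal use but rejects both metaphorical sentences, even though speakers find them natural. A graded or probabilistic preference would degrade more gracefully than this yes/no test.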
#Exercise
Disadvantages of the statistical approach
Preparing a database is time-consuming
Figure: alternative parse trees for "Our company is training workers"
[S [NP Our company] [VP [Aux is] [VP [V training] [NP workers]]]]
[S [NP Our company] [VP [V is] [NP [AdjP training] [NP workers]]]]
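One way to see that a single word string can carry different structures is to hold two candidate parses of "Our company is training workers" as nested tuples. The sketch below is an illustration only: the exact bracketings are one reading of the node labels (S, NP, VP, Aux, V, AdjP) in the trees above, and the helper names `leaves` and `labels` are made up for this example.

```python
# A tree is (label, child, child, ...); a leaf is a plain string.
progressive = (
    "S",
    ("NP", "Our", "company"),
    ("VP", ("Aux", "is"),
           ("VP", ("V", "training"), ("NP", "workers"))),
)

adjectival = (
    "S",
    ("NP", "Our", "company"),
    ("VP", ("V", "is"),
           ("NP", ("AdjP", "training"), ("NP", "workers"))),
)


def leaves(tree):
    """Return the words of a tree, left to right."""
    if isinstance(tree, str):
        return [tree]
    words = []
    for child in tree[1:]:
        words.extend(leaves(child))
    return words


def labels(tree):
    """Return all node labels of a tree in pre-order."""
    if isinstance(tree, str):
        return []
    found = [tree[0]]
    for child in tree[1:]:
        found.extend(labels(child))
    return found


print(" ".join(leaves(progressive)))
print(" ".join(leaves(adjectival)))
print(labels(progressive))
print(labels(adjectival))
```

Both trees yield the same sentence, but their labels differ: in one, "is" is an auxiliary and "training" a verb; in the other, "is" is the main verb and "training" modifies "workers". A parser has to choose between such analyses.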
WordNet
Electronic dictionary
Synset
Relations between words
Meronymy (part-whole relations)
Word counts
Function words
Word tokens vs. word types
Some facts
The 100 most common words account for 50.9% of the tokens
Almost half (49.8%) of the word types occur only once
12% of the text is words that occur <= 3 times
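The kinds of figures listed above can be computed for any text with a token counter. This is a minimal sketch, assuming a crude regular-expression tokenizer; the function name `word_stats`, its dictionary keys, and the sample sentence are illustrative choices, not part of the course material.

```python
import re
from collections import Counter


def word_stats(text, top_n=100):
    """Token/type statistics for a text, using a crude tokenizer."""
    # Lowercase and keep runs of letters/apostrophes as tokens.
    tokens = re.findall(r"[a-z']+", text.lower())
    if not tokens:
        raise ValueError("text contains no word tokens")
    counts = Counter(tokens)  # word type -> number of tokens
    n_tokens = len(tokens)
    n_types = len(counts)
    top_tokens = sum(freq for _, freq in counts.most_common(top_n))
    hapax_types = sum(1 for freq in counts.values() if freq == 1)
    rare_tokens = sum(freq for freq in counts.values() if freq <= 3)
    return {
        "tokens": n_tokens,
        "types": n_types,
        "top_n_token_share": top_tokens / n_tokens,
        "hapax_type_share": hapax_types / n_types,
        "rare_token_share": rare_tokens / n_tokens,
    }


sample = "the cat sat on the mat and the dog sat too"
stats = word_stats(sample, top_n=2)
print(stats)
```

On a real corpus, `top_n=100` gives the share of tokens covered by the 100 most common words, `hapax_type_share` the share of types seen only once, and `rare_token_share` the share of the text made of words seen three times or fewer.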
Zipf's laws
Principle of Least Effort: people try to minimize their work
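Zipf's laws
Principle of Least Effort
Zipf's law (language):
$f \propto \frac{1}{r}$, or equivalently $f \cdot r = k$ for a constant $k$, where $f$ is a word's frequency and $r$ is its rank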
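Zipf's law says that a word's frequency times its rank stays roughly constant. A minimal sketch of how one might check this on a token list follows; the function name `rank_frequency` and the toy token list are assumptions for illustration, and the toy counts were chosen so that the product comes out exactly constant, which real text will only approximate.

```python
from collections import Counter


def rank_frequency(tokens):
    """Return (rank, word, frequency, rank * frequency) rows,
    with rank 1 for the most frequent word."""
    ranked = Counter(tokens).most_common()
    return [
        (rank, word, freq, rank * freq)
        for rank, (word, freq) in enumerate(ranked, start=1)
    ]


# Toy data with frequencies 6, 3, 2 (made up for the example).
toy_tokens = ["the"] * 6 + ["of"] * 3 + ["and"] * 2
table = rank_frequency(toy_tokens)

for rank, word, freq, product in table:
    print(rank, word, freq, product)
```

On a real corpus the last column drifts, especially at the very highest and lowest ranks.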
Zipf's laws
Weak points
Poor fit at the highest and lowest ranks
Refined by Mandelbrot:
$f = P(r + \rho)^{-B}$
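A minimal sketch of Mandelbrot's refinement, taking its usual form f = P(r + ρ)^(-B), where P, B and ρ are parameters fitted to a text. The function name `mandelbrot_frequency` and the parameter values below are illustrative assumptions, not fitted values.

```python
def mandelbrot_frequency(rank, P, B, rho):
    """Predicted frequency at a given rank: f = P * (rank + rho) ** (-B)."""
    return P * (rank + rho) ** (-B)


# With B = 1 and rho = 0 the formula reduces to Zipf's law, f = P / r.
zipf_like = [mandelbrot_frequency(r, P=100.0, B=1.0, rho=0.0) for r in (1, 2, 4)]

# A positive rho flattens the curve at the highest ranks.
shifted_top = mandelbrot_frequency(1, P=100.0, B=1.0, rho=2.7)

print(zipf_like)
print(shifted_top)
```

The extra parameters let the curve bend where the plain law fits worst, at the top and bottom of the ranking.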
Mandelbrot: Approximation
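$m \propto \sqrt{f}$ or $m \propto \frac{1}{\sqrt{r}}$, where $m$ is the number of meanings of a word with frequency $f$ and rank $r$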
generated text?
Applications of NLP
Machine Translation
Meaning in English
six writings
E-mail:
Unbeatable package of image quality
You really need to spend some quality time going through all the settings before using the Olympus OM-D E-M1.