CS1012 NATURAL LANGUAGE PROCESSING 3 0 0 100
AIM
The aim is to expose students to the basic principles of language processing and to typical applications of natural language processing systems.
OBJECTIVE
• To provide a general introduction, including the use of finite-state automata for language processing
• To provide the fundamentals of syntax, including a basic parser
• To explain advanced features such as feature structures and realistic parsing methodologies
• To explain basic concepts of semantic processing
• To give details about typical natural language processing applications
UNIT I INTRODUCTION 6
Introduction: Knowledge in speech and language processing – Ambiguity – Models and Algorithms – Language, Thought and Understanding. Regular Expressions and Automata: Regular expressions – Finite-State automata. Morphology and Finite-State Transducers: Survey of English morphology – Finite-State Morphological parsing – Combining FST lexicon and rules – Lexicon-Free FSTs: The Porter stemmer – Human morphological processing.
UNIT II SYNTAX 10
Word Classes and Part-of-Speech Tagging: English word classes – Tagsets for English – Part-of-speech tagging – Rule-based part-of-speech tagging – Stochastic part-of-speech tagging – Transformation-based tagging – Other issues. Context-Free Grammars for English: Constituency – Context-Free rules and trees – Sentence-level constructions – The noun phrase – Coordination – Agreement – The verb phrase and subcategorization – Auxiliaries – Spoken language syntax – Grammar equivalence and normal form – Finite-State and Context-Free grammars – Grammars and human processing. Parsing with Context-Free Grammars: Parsing as search – A Basic Top-Down parser – Problems with the basic Top-Down parser – The Earley algorithm – Finite-State parsing methods.
UNIT III ADVANCED FEATURES AND SYNTAX 11
Features and Unification: Feature structures – Unification of feature structures – Feature structures in the grammar – Implementing unification – Parsing with unification constraints – Types and Inheritance. Lexicalized and Probabilistic Parsing: Probabilistic context-free grammars – Problems with PCFGs – Probabilistic lexicalized CFGs – Dependency Grammars – Human parsing.
UNIT IV SEMANTICS 10
Representing Meaning: Computational desiderata for representations – Meaning structure of language – First-order predicate calculus – Some linguistically relevant concepts – Related representational approaches – Alternative approaches to meaning. Semantic Analysis: Syntax-Driven semantic analysis – Attachments for a fragment of English – Integrating semantic analysis into the Earley parser – Idioms and compositionality – Robust semantic analysis. Lexical Semantics: Relations among lexemes and their senses – WordNet: A database of lexical relations – The internal structure of words – Creativity and the lexicon.
UNIT V APPLICATIONS 8
Word Sense Disambiguation and Information Retrieval: Selectional restriction-based disambiguation – Robust word sense disambiguation – Information retrieval – Other information retrieval tasks. Natural Language Generation: Introduction to language generation – Architecture for generation – Surface realization – Discourse planning – Other issues. Machine Translation: Language similarities and differences – The transfer metaphor – The interlingua idea: Using meaning – Direct translation – Using statistical techniques – Usability and system development.
TOTAL : 45
TEXT BOOK
1. Daniel Jurafsky & James H. Martin, “Speech and Language Processing”, Pearson Education (Singapore) Pte. Ltd., 2002.
REFERENCE
1. James Allen, “Natural Language Understanding”, Pearson Education, 2003.