Python Text Processing with NLTK 2.0 Cookbook: LITE

Ratings:
4/5 (1 rating)
Length:
252 pages
55 minutes
Released:
May 19, 2011
ISBN:
9781849516396
Format:
Book

Description

The learn-by-doing approach of this book will enable you to dive right into the heart of text processing from the very first page. Each recipe is carefully designed to fulfill your appetite for Natural Language Processing. Packed with numerous illustrative examples and code samples, it will make the task of using the NLTK for Natural Language Processing easy and straightforward. This book is for Python programmers who want to quickly get to grips with using the NLTK for Natural Language Processing. Familiarity with basic text processing concepts is required. Programmers experienced in the NLTK will also find it useful. Students of linguistics will find it invaluable.

Book preview

Python Text Processing with NLTK 2.0 Cookbook - Jacob Perkins

Table of Contents

Python Text Processing with NLTK 2.0 Cookbook: LITE

Credits

About the Author

About the Reviewers

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Errata

Piracy

Questions

1. Tokenizing Text and WordNet Basics

Introduction

Tokenizing text into sentences

Getting ready

How to do it...

How it works...

There's more...

Other languages

See also

Tokenizing sentences into words

How to do it...

How it works...

There's more...

Contractions

PunktWordTokenizer

WordPunctTokenizer

See also

Tokenizing sentences using regular expressions

Getting ready

How to do it...

How it works...

There's more...

Simple whitespace tokenizer

See also

Filtering stopwords in a tokenized sentence

Getting ready

How to do it...

How it works...

There's more...

See also

Looking up synsets for a word in WordNet

Getting ready

How to do it...

How it works...

There's more...

Hypernyms

Part-of-speech (POS)

See also

Looking up lemmas and synonyms in WordNet

How to do it...

How it works...

There's more...

All possible synonyms

Antonyms

See also

Calculating WordNet synset similarity

How to do it...

How it works...

There's more...

Comparing verbs

Path and LCH similarity

See also

Discovering word collocations

Getting ready

How to do it...

How it works...

There's more...

Scoring functions

Scoring ngrams

2. Replacing and Correcting Words

Introduction

Stemming words

How to do it...

How it works...

There's more...

LancasterStemmer

RegexpStemmer

SnowballStemmer

See also

Lemmatizing words with WordNet

Getting ready

How to do it...

How it works...

There's more...

Combining stemming with lemmatization

See also

Translating text with Babelfish

Getting ready

How to do it...

How it works...

There's more...

Available languages

Replacing words matching regular expressions

Getting ready

How to do it...

How it works...

There's more...

Replacement before tokenization

See also

Removing repeating characters

Getting ready

How to do it...

How it works...

There's more...

See also

Spelling correction with Enchant

Getting ready

How to do it...

How it works...

There's more...

en_GB dictionary

Personal word lists

See also

Replacing synonyms

Getting ready

How to do it...

How it works...

There's more...

CSV synonym replacement

YAML synonym replacement

See also

Replacing negations with antonyms

How to do it...

How it works...

There's more...

See also

3. Text Classification

Introduction

Bag of Words feature extraction

How to do it...

How it works...

There's more...

Filtering stopwords

Including significant bigrams

See also

Training a naive Bayes classifier

Getting ready

How to do it...

How it works...

There's more...

Classification probability

Most informative features

Training estimator

Manual training

See also

Training a decision tree classifier

Getting ready

How to do it...

How it works...

There's more...

Entropy cutoff

Depth cutoff

Support cutoff

See also

Training a maximum entropy classifier

Getting ready

How to do it...

How it works...

There's more...

Scipy algorithms

Megam algorithm

See also

Measuring precision and recall of a classifier

How to do it...

How it works...

There's more...

F-measure

See also

Calculating high information words

How to do it...

How it works...

There's more...

MaxentClassifier with high information words

DecisionTreeClassifier with high information words

See also

Combining classifiers with voting

Getting ready

How to do it...

How it works...

See also

Classifying with multiple binary classifiers

Getting ready

How to do it...

How it works...

There's more...

See also

Index

Python Text Processing with NLTK 2.0 Cookbook: LITE

Copyright © 2011 Packt Publishing

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, nor its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: April 2011

Production Reference: 1130411

Published by Packt Publishing Ltd. 32 Lincoln Road Olton Birmingham, B27 6PA, UK.

ISBN 978-1-849516-38-9

www.packtpub.com

Cover Image by Sujay Gawand K (<sujay0000@gmail.com>)

Credits

Author

Jacob Perkins

Reviewers

Patrick Chan

Herjend Teny

Acquisition Editor

Steven Wilding

Technical Editors

Hithesh Uchil

Indexer

Hemangini Bari

Production Coordinator

Melwyn D'sa

Cover Work

Melwyn D'sa

About the Author

Jacob Perkins has been an avid user of open source software since high school, when he first built his own computer and didn't want to pay for Windows. At one point he had five operating systems installed, including Red Hat Linux, OpenBSD, and BeOS.

While at Washington University in St. Louis, Jacob took classes in Spanish and poetry writing, and worked on an independent study project that eventually became his Master's project: WUGLE—a GUI for manipulating logical expressions. In his free time, he wrote the Gnome2 version of Seahorse (a GUI for encryption and key management), which has since been translated into over a dozen languages and is included in the default Gnome distribution.

After receiving his MS in Computer Science, Jacob tried to start a web development studio with some friends, but since no one knew anything about web development, it didn't work out as planned. Once he'd actually learned about web development, he went off and co-founded another company called Weotta, which sparked his interest in Machine Learning and Natural Language Processing.

Jacob is currently the CTO/Chief Hacker for Weotta and blogs about what he's learned along the way at http://streamhacker.com/. He is also applying this knowledge to produce text processing APIs and demos at http://text-processing.com/. This book is a synthesis of his knowledge on processing text using Python, NLTK, and more.

Thanks to my parents for all their support, even when they don't understand what I'm doing; Grant for sparking my interest in Natural Language Processing; Les
