Natural Language Processing with Python
Analyzing Text with the Natural Language Toolkit
Publisher: O'Reilly Media
Final Release Date: June 2009
Pages: 504

This book offers a highly accessible introduction to natural language processing, the field that supports a variety of language technologies, from predictive text and email filtering to automatic summarization and translation. With it, you'll learn how to write Python programs that work with large collections of unstructured text. You'll access richly annotated datasets using a comprehensive range of linguistic data structures, and you'll understand the main algorithms for analyzing the content and structure of written communication.

Packed with examples and exercises, Natural Language Processing with Python will help you:

  • Extract information from unstructured text, either to guess the topic or identify "named entities"
  • Analyze linguistic structure in text, including parsing and semantic analysis
  • Access popular linguistic databases, including WordNet and treebanks
  • Integrate techniques drawn from fields as diverse as linguistics and artificial intelligence


This book will help you gain practical skills in natural language processing using the Python programming language and the Natural Language Toolkit (NLTK) open source library. If you're interested in developing web applications, analyzing multilingual news sources, or documenting endangered languages -- or if you're simply curious to have a programmer's perspective on how human language works -- you'll find Natural Language Processing with Python both fascinating and immensely useful.
Customer Reviews

Review snapshot (by PowerReviews)

Average rating: 4.5 out of 5 (based on 2 reviews)

Ratings distribution

  • 5 stars: 1
  • 4 stars: 1
  • 3 stars: 0
  • 2 stars: 0
  • 1 star: 0


(11 of 12 customers found this review helpful)

 
5.0

Extremely good NLP and Python book

By Dr. Sukanta Ganguly

from San Jose, CA

About Me Designer, Developer, Educator

Verified Reviewer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

    Best Uses

    • Expert
    • Intermediate

    Comments about O'Reilly Media Natural Language Processing with Python:

    This book is a near-perfect blend of natural language processing and Python used to its fullest. Not only do the authors describe NLP extremely well and give clear explanations of many different conditions, but they also use Python effectively to substantiate the technical content.
    The book presents a very detailed explanation of the Python-based Natural Language Toolkit (NLTK), which is also the brainchild of the authors. NLTK is a great piece of software. I have used it off and on for the past year and a half and really like how its creators designed and developed it.
    The book builds up by explaining how to use Python as a programming language to manipulate words, phrases, and sentences. Accessing text corpora and direct text processing are very well described in the first hundred and twenty pages or so. Chapter six is an excellent chapter for technologists who would like to learn different ways to classify text. Although it is not in-depth, which did not seem to be the aim of this book, it gives readers a simple understanding.
    The concept of chunking text and its use in classification is very well explained with examples. The methods of developing context-free grammars (CFGs) and parsing with them probably needed a somewhat deeper explanation, and some more examples could have helped.
    Overall, this is an excellent book, and I must say that it has been a very long time since I have read a book that was so satisfying.
    I very strongly recommend this book to Python lovers who would like to explore the world of natural language understanding, parsing, and processing. It brings out a real strength of the Python programming language. I give this book an "A+".

    (14 of 20 customers found this review helpful)

     
    4.0

    A guide to the classic computer science analysis of natural language text

    By beachwalk

    from Undisclosed

    Comments about O'Reilly Media Natural Language Processing with Python:

    Natural Language Processing with Python is about scanning text samples of human languages like English, Persian, or Chinese with computer routines and doing tasks like counting word frequencies, parsing sentences, and further analyses that begin the difficult task of finding limited kinds of meaning in pieces of text.
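
    As a rough illustration of the word-frequency task mentioned above, here is a minimal Python sketch. It uses only the standard library rather than NLTK, and the function name `word_frequencies` and its simple regular-expression tokenizer are assumptions made for this example, not code from the book.

```python
import re
from collections import Counter


def word_frequencies(text):
    """Count how often each lowercased word occurs in `text`.

    Hypothetical helper: it treats any run of letters or apostrophes
    as a word, which is far cruder than a real tokenizer.
    """
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)


sample = "The cat sat on the mat. The mat was flat."
# Print the three most frequent words with their counts.
for word, count in word_frequencies(sample).most_common(3):
    print(word, count)
```

    A real project would likely swap the regular expression for a proper tokenizer, since this one only handles unaccented Latin letters.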

    The book has a matching website www.nltk.org.

    This book is addressed to a broad academic community:

    One audience is liberal arts students.

    The second audience is computer science students.

    The third audience is teachers and researchers worldwide.

    This book tries hard to be a high quality introduction to natural language processing.

    Natural Language Processing itself is one of the great problems of computing. One of the enjoyable things about this book is that the authors carefully outline some of the great problems in computer science that are central to natural language processing. These problems are described starting with the texts and programs provided in the toolkit. The liberal arts students are included right at the start. The discussions include further reading references to the classics of computer science, like Knuth.

    Natural Language Processing is also a field of some interest and utility to linguists, critics, historians, students of language and rhetoric and students of 20th century philosophy. This dimension is also covered with a good sequence of examples and references.

    I remember reading the philosopher Wittgenstein (his writings of 1943 vintage), where he did thought experiments of putting words in a tray. This is a provocative way of thinking about meaning that could lead to some interesting Toolkit projects.

    The fourth audience for this book might be the programmer seeking an interesting opportunity:

    Is this a book that might help me write a project-specific text analysis engine? I have been wishing for a way to clarify and reorganize the Ubuntu Forums website with a structured language query tree.

    Would the NLTK be useful if I wanted to write a search engine?

    Problem one with using the NLTK in a search engine project is the non-commercial clause in the Creative Commons license. Using the NLTK as part of a search engine processing framework would require inquiry and clarification of the license terms.

    Problem two with using the NLTK in a search engine project is that the search engine design will still require assembly of many other components. I recently did a Google search on search engines. The first hour of reading didn't really turn up a good search engine design article.

    Would the NLTK be useful if I wanted to figure out the vocabulary used by a specific group of people to talk about a specific subject? A really fascinating item in chapter 6 of this book is the "Maximum Entropy Classifier". Here is the first occurrence in print of a formula for entropy that I can understand and duplicate with a pocket calculator.
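
    The review does not reproduce the formula, so the following is a minimal sketch that assumes it is the standard Shannon entropy of a label distribution, H = -Σ p(l) · log2 p(l), where p(l) is the fraction of items carrying label l. The function name `entropy` and the sample labels are hypothetical, chosen for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy, in bits, of the empirical distribution of `labels`.

    Assumed formula: H = -sum(p * log2(p)) over the observed labels,
    where p is each label's relative frequency.
    """
    counts = Counter(labels)
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Two labels in equal proportion: each has p = 0.5, so H = 1 bit.
print(entropy(["male", "female", "male", "female"]))
# A single repeated label carries no uncertainty, so H = 0 bits.
print(entropy(["male", "male", "male", "male"]))
```

    The same arithmetic can be done by hand: for an even two-way split, -(0.5 × log2 0.5 + 0.5 × log2 0.5) = 1.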

    Entropy is a key concept discussed by Shannon in his classic information theory article. I sometimes feel very disappointed that computers are not doing much with information. That fascinating parallel between entropy in information theory and entropy in physics and thermodynamics doesn't seem to be a frontier that is leading to new developments.

    Rather, computers and the Internet are indexing words and moving data very well. But the computers are not doing much in the way of "information processing" as in changing the entropy of a block of text.

    In any case, the Natural Language Toolkit book and program suite are a guide to the classic computer-science-based approach to analyzing natural language text.

    This review is also posted on slashdot.org in my user journal under the user name beachdog.

    Buying Options
    Ebook: $37.99
    Formats:  DAISY, ePub, Mobi, PDF
    Print & Ebook: $49.49
    Print: $44.99