Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Publisher: O'Reilly Media
Final Release Date: October 2012
Pages: 466

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
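
As a rough illustration of the clean, merge, and groupby steps listed above, here is a minimal sketch using pandas. The tables, column names, and values are hypothetical and are not taken from the book; it only assumes the widely used `dropna`, `merge`, and `groupby` methods.

```python
import pandas as pd

# Hypothetical toy data, invented for this sketch (not from the book).
visits = pd.DataFrame({
    "user": ["ann", "bob", "ann", "cy", "bob"],
    "pages": [3, None, 5, 2, 4],
})
users = pd.DataFrame({
    "user": ["ann", "bob", "cy"],
    "country": ["US", "BR", "US"],
})

# Clean: drop rows where the page count is missing.
clean = visits.dropna(subset=["pages"])

# Merge: attach each user's country to the visit rows.
merged = clean.merge(users, on="user", how="left")

# Group and summarize: total pages viewed per country.
summary = merged.groupby("country")["pages"].sum()
print(summary)
```

A left merge is used here so that every cleaned visit row is kept even if a user were missing from the lookup table; other join types may suit other data.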
Customer Reviews

REVIEW SNAPSHOT®

by PowerReviews
 
3.9

(based on 27 reviews)

Ratings Distribution

  • 5 Stars (9)
  • 4 Stars (13)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (4)

81% of respondents would recommend this to a friend.
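
The 3.9 average shown in the snapshot is consistent with the ratings distribution. The short check below is only a sketch: it assumes the displayed score is a plain weighted mean of the star counts rounded to one decimal place, which the page itself does not state.

```python
# Star counts copied from the ratings distribution on this page.
counts = {5: 9, 4: 13, 3: 1, 2: 0, 1: 4}

total_reviews = sum(counts.values())
weighted_sum = sum(stars * n for stars, n in counts.items())

# Assumption: the site reports a simple weighted mean, rounded to one decimal.
average = weighted_sum / total_reviews

print(total_reviews)      # 27
print(round(average, 1))  # 3.9
```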

Pros

  • Helpful examples (22)
  • Well-written (14)
  • Easy to understand (13)
  • Accurate (5)
  • Concise (5)

Cons

  • Too many errors (3)

Best Uses

  • Intermediate (17)
  • Novice (6)
  • Expert (5)
  • Student (3)

Reviewer Profile

  • Developer (12)
  • Maker (3)

Reviewed by 27 customers

Displaying reviews 1-10


(4 of 5 customers found this review helpful)

 
1.0

Outdated and poorly written

By Tank

from CT

Verified Reviewer

Pros

    Cons

    • Not comprehensive enough
    • Too many errors

    Best Uses

      Comments about oreilly Python for Data Analysis:

      This book is severely outdated, contains numerous errors, and is poorly written. It doesn't seem like the author, Wes, did a good job on it.

      (1 of 1 customers found this review helpful)

       
      1.0

      Too many mistakes

      By Sean W

      from Troy, MI

      About Me Educator

      Pros

      • Helpful examples

      Cons

      • Too many errors

      Best Uses

      • Expert
      • Intermediate

      Comments about oreilly Python for Data Analysis:

      The book started OK, with good content and code. Toward the end, however, it is filled with mistakes. The problem is that the later chapters cover complicated topics, where even small coding mistakes can waste a great deal of the reader's time.

      (1 of 2 customers found this review helpful)

       
      1.0

      Good topic, but lack of effort

      By Pycoon

      from Toronto, Ontario

      About Me Designer, Maker

      Pros

      • Helpful examples

      Cons

      • Not comprehensive enough
      • Too many errors

      Best Uses

      • Novice

      Comments about oreilly Python for Data Analysis:

      The topics in the book cover many aspects of using Python for data analysis, and it's a good book to start with if you have no prior knowledge of Python. However, the author does not appear to have put in enough effort to go one step further into each topic, and there are many mistakes in the book.
      I would recommend the book to those who want to understand the power of Python for data analysis, but definitely not to those who want to make a profession of data mining.

       
      4.0

      Great read for pythonic Data Analysis

      By KS

      from SF, CA

      About Me Designer, Developer, Maker

      Verified Reviewer

      Pros

      • Easy to understand
      • Helpful examples
      • Well-written

      Cons

      • Too basic

      Best Uses

      • Intermediate
      • Novice
      • Student

      Comments about oreilly Python for Data Analysis:

      Good read for pythonic data analysis. It is a little light on some of the stats, but you can supplement it with Think Stats fairly easily. It loses a star for being a little light information-wise, but I suppose you can't fit absolutely everything into a single book.

      (1 of 1 customers found this review helpful)

       
      5.0

      Excellent book to learn Pandas

      By Tiberiu

      from Bucharest, Romania

      Verified Buyer

      Pros

      • Helpful examples
      • Well-written

      Cons

        Best Uses

        • Intermediate

        Comments about oreilly Python for Data Analysis:

        I bought the book because I wanted to learn how I could use the Pandas module for finance. It is an excellent book for programmers who want to learn data handling with Pandas. As for quant analysis, it has only one chapter covering this subject. I recommend the book for quants who already have some basic or intermediate knowledge of Python but want to learn more about data wrangling. The book offers examples from different domains. I also found examples of applied Python in finance that I could not find in other books. The book is focused on the Pandas module, although you can also find examples for the NumPy and Matplotlib modules.

        (4 of 4 customers found this review helpful)

         
        4.0

        Real Start to Data Analysis with Python

        By Geoff the Numbers Guy

        from Silicon Valley, CA

        Verified Buyer

        Pros

        • Easy to understand
        • Helpful examples

        Cons

          Best Uses

          • Intermediate
          • Novice

          Comments about oreilly Python for Data Analysis:

          I use R for my work and have been interested in learning Python to take my career in a slightly different direction. So I already know about data analysis, just not how to do it in Python. One of the beautiful things about Python (like R) is the wealth of libraries where other people have solved common problems and all you need do is make use of their solutions. In the case of Python, it is especially Pandas that turns it into a good tool for data analysis. And true to the title, the author does a great job of giving you the information you need to set up a Python environment with Pandas and associated packages in place so that instead of writing code to do data analysis, you can get straight to the analysis part.

          One of the best things about this book is the clues it gives to getting your tools working. The one weak point, from my perspective, is the occasional digression into how you would do something in regular Python and why Pandas is better. If Pandas is better, and it's free, why would you want to know about an inferior approach?

          If you know about programming and data analysis, but want to apply your skills using Python, this is a good book to get started.

           
          5.0

          Good book with examples and features

          By Anish Chapagain

          from Kathmandu, Nepal

          About Me Developer

          Verified Reviewer

          Pros

          • Concise
          • Easy to understand
          • Helpful examples
          • Well-written

          Cons

            Best Uses

            • Expert
            • Intermediate

            Comments about oreilly Python for Data Analysis:

            The book is a great resource for Python lovers and for data analysis. The tips, graphs, and code were really helpful for visualizing and interpreting data.

            (4 of 4 customers found this review helpful)

             
            4.0

            Good reference to deal with tabular data

            By Fábio Fortkamp

            from Florianópolis, Brazil

            About Me Master's Student

            Verified Reviewer

            Pros

            • Easy to understand
            • Helpful examples
            • Well-written

            Cons

            • Lack Of Figures

            Best Uses

            • Researchers
            • Scientists
            • Student

            Comments about oreilly Python for Data Analysis:

            This book solved a practical problem for me. I needed a way to process a hundred text files (with tens of thousands of lines each) containing experimental data (I am a student in a Master of Engineering program in Brazil) and I wanted to use Python, since I was familiar with it. After some research, I discovered a library named pandas and this book, which was written by its main developer. Disclosure: I got the book through the O'Reilly Reader Review Program.

            The book is not only about pandas, though. The title was correctly chosen: the author covers various excellent tools for using Python to analyze tabular data. For example, I had used the numerical library NumPy before, and the chapter on it is one of the best introductions I've seen. The book also has chapters on matplotlib (a package to produce 2D plots) and on IPython (an enhanced shell), and you can use them as independent references on these subjects.

            I like two main things about this book. The libraries covered are very object-oriented, and the author carefully explains the concepts behind each class, like the differences between a Figure and an Axes object in matplotlib. I also like how detailed the examples are: the author presents a new command and then discusses each option. In particular, McKinney emphasizes how to extract data from a table: by row, by column, filtering by values, etc.

            The main problem I had was the lack of figures and diagrams. Like I said, the concepts are well written, but I missed a figure to more easily understand, for instance, the merging of two databases, or the relationship between a Series and a DataFrame (the main data types of pandas).

            This is a minor problem. After I read the book, I started to write my own scripts, and I found myself constantly referring to it and the information I needed was usually easy to find. If you have to deal with tabular data of any sort, doing operations on them, extracting information and creating plots and charts, this book is a very nice companion to have.
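
The kinds of table extraction mentioned in this review (by row, by column, filtering by values) and the Series/DataFrame relationship can be sketched in pandas roughly as follows. The data and column names are hypothetical, chosen only for illustration, and are not the reviewer's or the book's.

```python
import pandas as pd

# Hypothetical measurements; names and values are illustrative only.
df = pd.DataFrame(
    {"temp": [20.5, 21.0, 19.8], "pressure": [1.01, 0.99, 1.02]},
    index=["run1", "run2", "run3"],
)

col = df["temp"]             # one column of the table: a Series
row = df.loc["run2"]         # one row selected by label: also a Series
hot = df[df["temp"] > 20.0]  # rows filtered by value: still a DataFrame

print(type(col).__name__, type(hot).__name__)
print(list(hot.index))
```

In this sketch a DataFrame behaves like a labeled collection of Series that share one index, which is one way to picture the relationship the reviewer wished had been shown in a figure.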

            (1 of 1 customers found this review helpful)

             
            5.0

            GREAT book.

            By MCP

            from Napa, CA

            About Me Getting Started, Just Learning

            Verified Buyer

            Comments about oreilly Python for Data Analysis:

            The book is very well written and organized.

            (5 of 5 customers found this review helpful)

             
            4.0

            Great for jump starting data analysis

            By Sarah Bird

            from California

            About Me Developer

            Verified Buyer

            Pros

            • Accurate
            • Concise
            • Helpful examples

            Cons

              Best Uses

              • Intermediate

              Comments about oreilly Python for Data Analysis:

              My favorite thing about this book is the second chapter "Introductory Examples," which is the only chapter I read cover-to-cover.

              I know my way around Python but did not know any pandas, numpy or matplotlib and needed to. The introductory chapter did a great job of running through a whole bunch of uses without getting stuck in the details so I could get a flavor of what I could do and how.

              I have since dipped into the other chapters when trying to find out about specific things.

              A very useful book that covers a lot of ground.


              Buying Options
              Ebook:  $33.99
              Formats:  DAISY, ePub, Mobi, PDF
              Print & Ebook:  $43.99
              Print:  $39.99