Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Publisher: O'Reilly Media
Final Release Date: October 2012
Pages: 466

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
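
As a rough illustration of the load, clean, merge, and groupby workflow listed above, here is a minimal sketch. It is not taken from the book: the inlined CSV text, the column names (`user_id`, `page`, `seconds`, `plan`), and the values are all hypothetical and chosen only to keep the example self-contained.

```python
import io

import pandas as pd

# Hypothetical web-analytics data, inlined so the example needs no files.
visits_csv = io.StringIO(
    "user_id,page,seconds\n"
    "1,home,30\n"
    "2,home,\n"
    "1,pricing,45\n"
    "3,docs,120\n"
)
users = pd.DataFrame({"user_id": [1, 2, 3], "plan": ["free", "pro", "pro"]})

# Load: read_csv parses the text into a DataFrame; the blank field becomes NaN.
visits = pd.read_csv(visits_csv)

# Clean: drop rows whose duration is missing.
visits = visits.dropna(subset=["seconds"])

# Merge: attach each user's plan with a SQL-style left join on user_id.
merged = visits.merge(users, on="user_id", how="left")

# Group and summarize: total seconds per plan.
total_by_plan = merged.groupby("plan")["seconds"].sum()
print(total_by_plan)
```

Each step returns a new object, so the intermediate tables (`visits`, `merged`) can be inspected interactively, for example in the IPython shell.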
Customer Reviews

Review Snapshot (by PowerReviews)

Python for Data Analysis: 4.2 (based on 23 reviews)

Ratings Distribution

  • 5 Stars (9)
  • 4 Stars (12)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (1)

91% of respondents would recommend this to a friend.

Pros

  • Helpful examples (19)
  • Well-written (13)
  • Easy to understand (12)
  • Accurate (5)
  • Concise (5)

Cons

    Best Uses

    • Intermediate (15)
    • Expert (4)
    • Novice (4)

    Reviewer Profile

    • Developer (11)

    Reviewed by 23 customers


    (1 of 1 customers found this review helpful)

     
    5.0

    Excellent book to learn Pandas

    By Tiberiu

    from Bucharest, Romania

    Verified Buyer

    Pros

    • Helpful examples
    • Well-written

    Cons

      Best Uses

      • Intermediate

      Comments about oreilly Python for Data Analysis:

      I bought the book because I wanted to learn how I could use the Pandas module for finance. It is an excellent book for programmers who want to learn data handling with Pandas. As for quant analysis, it has only one chapter covering the subject. I recommend the book for quants who already have some basic or intermediate knowledge of Python but want to learn more about data wrangling. The book offers examples from different domains. I also found examples of applied Python in finance that I could not find in other books. The book is focused on the Pandas module, although you can find examples regarding the NumPy and Matplotlib modules.

      (2 of 2 customers found this review helpful)

       
      4.0

      Real Start to Data Analysis with Python

      By Geoff the Numbers Guy

      from Silicon Valley, CA

      Verified Buyer

      Pros

      • Easy to understand
      • Helpful examples

      Cons

        Best Uses

        • Intermediate
        • Novice

        Comments about oreilly Python for Data Analysis:

        I use R for my work and have been interested in learning Python to take my career in a slightly different direction. So I already know about data analysis, just not how to do it in Python. One of the beautiful things about Python (like R) is the wealth of libraries where other people have solved common problems and all you need do is make use of their solutions. In the case of Python, it is especially Pandas that turns it into a good tool for data analysis. And true to the title, the author does a great job of giving you the information you need to set up a Python environment with Pandas and associated packages in place so that instead of writing code to do data analysis, you can get straight to the analysis part.

        One of the best things about this book is the clues it gives to getting your tools working. The one weak point, from my perspective, is the occasional digression into how you would do something in regular Python and why Pandas is better. If Pandas is better, and it's free, why would you want to know about an inferior approach?

        If you know about programming and data analysis, but want to apply your skills using Python, this is a good book to get started.

         
        5.0

        Good book with examples and features

        By Anish Chapagain

        from Kathmandu, Nepal

        About Me Developer

        Verified Reviewer

        Pros

        • Concise
        • Easy to understand
        • Helpful examples
        • Well-written

        Cons

          Best Uses

          • Expert
          • Intermediate

          Comments about oreilly Python for Data Analysis:

          The book is a great resource for Python lovers and for data analysis. The tips, graphs, and code were really helpful for visualizing and interpreting data.

          (4 of 4 customers found this review helpful)

           
          4.0

          Good reference to deal with tabular data

          By Fábio Fortkamp

          from Florianópolis, Brazil

          About Me Master's Student

          Verified Reviewer

          Pros

          • Easy to understand
          • Helpful examples
          • Well-written

          Cons

          • Lack Of Figures

          Best Uses

          • Researchers
          • Scientists
          • Student

          Comments about oreilly Python for Data Analysis:

          This book solved a practical problem for me. I needed a way to process a hundred text files (with tens of thousands of lines each) containing experimental data (I am a student in a Master of Engineering program in Brazil) and I wanted to use Python, since I was familiar with it. After some research, I discovered a library named pandas and this book, which was written by its main developer. Disclosure: I got the book through the O'Reilly Reader Review Program.

          The book is not only about pandas, though. The title was correctly chosen: the author covers various excellent tools for using Python to analyze tabular data. For example, I had used the numerical library NumPy before, and the chapter on it is one of the best introductions I've seen. The book also has chapters on matplotlib (a package to produce 2D plots) and on IPython (an enhanced shell), and you can use them as independent references on these subjects.

          I like two main things about this book. The libraries covered are very object-oriented, and the author carefully explains the concepts behind each class, like the differences between a Figure and an Axes object in matplotlib. I also like how detailed the examples are: the author presents a new command and then discusses each option. In particular, McKinney emphasizes how to extract data from a table: by row, by column, filtering by values, etc.

          The main problem I had was the lack of figures and diagrams. Like I said, the concepts are well written, but I missed a figure to more easily understand, for instance, the merging of two databases, or the relationship between a Series and a DataFrame (the main data types of pandas).

          This is a minor problem. After I read the book, I started to write my own scripts, and I found myself constantly referring to it and the information I needed was usually easy to find. If you have to deal with tabular data of any sort, doing operations on them, extracting information and creating plots and charts, this book is a very nice companion to have.
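
The ideas mentioned in this review (a Series versus a DataFrame, selecting by row, column, or value, and merging two tables) can be sketched in a few lines. This is an illustrative sketch only, not an excerpt from the book: the readings, the column names, and the `operator` table are hypothetical.

```python
import pandas as pd

# Hypothetical experimental readings, invented for illustration.
df = pd.DataFrame(
    {"temperature": [20.5, 21.0, 19.8], "pressure": [101.2, 100.9, 101.5]},
    index=["run1", "run2", "run3"],
)

# Each column of a DataFrame is a Series that shares the DataFrame's index.
temps = df["temperature"]
print(type(temps).__name__)

# Select by row label, and filter rows by a condition on their values.
row = df.loc["run2"]                  # one row, returned as a Series
warm = df[df["temperature"] > 20.0]   # only the rows matching the condition

# Merge two tables on a shared key column.
labels = pd.DataFrame({"run": ["run1", "run3"], "operator": ["A", "B"]})
runs = df.rename_axis("run").reset_index()   # move the index into a "run" column
merged = runs.merge(labels, on="run", how="inner")
print(merged)
```

With `how="inner"`, only the runs present in both tables survive the merge, so `run2` is dropped here; `how="left"` would keep it with a missing `operator`.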

           
          5.0

          GREAT book.

          By MCP

          from Napa, CA

          About Me Getting Started, Just Learning

          Verified Buyer

          Pros

            Cons

              Best Uses

                Comments about oreilly Python for Data Analysis:

                The book is very well written and organized.

                (4 of 4 customers found this review helpful)

                 
                4.0

                Great for jump starting data analysis

                By Sarah Bird

                from California

                About Me Developer

                Verified Buyer

                Pros

                • Accurate
                • Concise
                • Helpful examples

                Cons

                  Best Uses

                  • Intermediate

                  Comments about oreilly Python for Data Analysis:

                  My favorite thing about this book is the second chapter "Introductory Examples," which is the only chapter I read cover-to-cover.

                  I know my way around Python but did not know any pandas, numpy or matplotlib and needed to. The introductory chapter did a great job of running through a whole bunch of uses without getting stuck in the details so I could get a flavor of what I could do and how.

                  I have since dipped into the other chapters when trying to find out about specific things.

                  A very useful book that covers a lot of ground.

                   
                  4.0

                  Good start for data handling in python

                  By Myself

                  from Belgium

                  Pros

                  • Accurate
                  • Helpful examples

                  Cons

                    Best Uses

                    • Expert
                    • Intermediate

                    Comments about oreilly Python for Data Analysis:

                    Good starting point for data handling using pandas in python.

                    Basic previous knowledge of python helpful.

                    Mainly focused on the pandas module but includes some interesting information on ipython/numpy.

                     
                    5.0

                    Great book!

                    By Jure C.

                    from Ljubljana, Slovenia

                    About Me Developer

                    Verified Buyer

                    Pros

                    • Helpful examples
                    • Well-written

                    Cons

                      Best Uses

                      • Intermediate

                      Comments about oreilly Python for Data Analysis:

                      I read this book after watching a couple of Wes's tutorial videos, and it really helped me understand pandas and put my pandas project into practice. It also gave me the extra push to start using IPython Notebook.

                      I would recommend this book to anyone who is currently mangling data using self-written Python scripts.

                      (1 of 1 customers found this review helpful)

                       
                      5.0

                      very helpful to start with python

                      By panagiotis

                      from new york

                      About Me Developer

                      Verified Buyer

                      Pros

                      • Concise
                      • Easy to understand
                      • Helpful examples
                      • Well-written

                      Cons

                        Best Uses

                          Comments about oreilly Python for Data Analysis:

                          very helpful to start with python, numpy

                          (19 of 54 customers found this review helpful)

                           
                          1.0

                          Could this book be any more confusing?

                          By Jim the Runner

                          from San Jose, CA

                          Comments about oreilly Python for Data Analysis:

                          The title is very misleading, first of all. It's not about Python. It's about NumPy and Pandas. If you don't already know Python, you're probably going to struggle with this book, unless you start with the dense 50-page appendix waaaayyy in the back of the book.

                          The rest of the book consists of one random example after another without a clear roadmap. Showing 10 different ways to create a DataFrame isn't very helpful, when the author doesn't explain the concepts behind why you would use one approach over another.

                          To be honest, the problems with this book are similar to what I've found in other O'Reilly books. They read like dictionaries rather than books on how to write. I'm wondering if the problem has more to do with publishing standards than the authors.

                          Buying Options
                          Ebook: $33.99
                          Formats:  DAISY, ePub, Mobi, PDF
                          Print & Ebook: $43.99
                          Print: $39.99