Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Publisher: O'Reilly Media
Final Release Date: October 2012
Pages: 470

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
Customer Reviews

REVIEW SNAPSHOT®

by PowerReviews
O'Reilly Python for Data Analysis
 
4.2

(based on 22 reviews)

Ratings Distribution

  • 5 Stars (8)
  • 4 Stars (12)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (1)

90% of respondents would recommend this to a friend.

Pros

  • Helpful examples (18)
  • Easy to understand (12)
  • Well-written (12)
  • Accurate (5)
  • Concise (5)

Cons

    Best Uses

    • Intermediate (14)
    • Expert (4)
    • Novice (4)
    Reviewer Profile

    • Developer (11)

    Reviewed by 22 customers

    Displaying reviews 1-10

     
    4.0

    Real Start to Data Analysis with Python

    By Geoff the Numbers Guy

    from Silicon Valley, CA

    Verified Buyer

    Pros

    • Easy to understand
    • Helpful examples

    Cons

      Best Uses

      • Intermediate
      • Novice

      Comments about oreilly Python for Data Analysis:

      I use R for my work and have been interested in learning Python to take my career in a slightly different direction. So I already know about data analysis, just not how to do it in Python. One of the beautiful things about Python (like R) is the wealth of libraries where other people have solved common problems and all you need to do is make use of their solutions. In the case of Python, it is especially Pandas that turns it into a good tool for data analysis. And true to the title, the author does a great job of giving you the information you need to set up a Python environment with Pandas and associated packages in place so that instead of writing code to do data analysis, you can get straight to the analysis part.

      One of the best things about this book is the clues it gives to getting your tools working. The one weak point, from my perspective, is the occasional digression into how you would do something in regular Python and why Pandas is better. If Pandas is better, and it's free, why would you want to know about an inferior approach?

      If you know about programming and data analysis, but want to apply your skills using Python, this is a good book to get started.

       
      5.0

      Good book with examples and features

      By Anish Chapagain

      from Kathmandu, Nepal

      About Me Developer

      Verified Reviewer

      Pros

      • Concise
      • Easy to understand
      • Helpful examples
      • Well-written

      Cons

        Best Uses

        • Expert
        • Intermediate

        Comments about oreilly Python for Data Analysis:

        The book is a great resource for Python lovers and for data analysis. The tips, graphs, and code were really helpful for visualizing and interpreting data.

        (3 of 3 customers found this review helpful)

         
        4.0

        Good reference to deal with tabular data

        By Fábio Fortkamp

        from Florianópolis, Brazil

        About Me Master's Student

        Verified Reviewer

        Pros

        • Easy to understand
        • Helpful examples
        • Well-written

        Cons

        • Lack Of Figures

        Best Uses

        • Researchers
        • Scientists
        • Student

        Comments about oreilly Python for Data Analysis:

        This book solved a practical problem for me. I needed a way to process a hundred text files (with tens of thousands of lines each) containing experimental data (I am a student in a Master of Engineering program in Brazil), and I wanted to use Python, since I was familiar with it. After some research, I discovered a library named pandas and this book, which was written by its main developer. Disclosure: I got the book through the O'Reilly Reader Review Program.

        The book is not only about pandas, though. The title was correctly chosen: the author covers various excellent tools for using Python to analyze tabular data. For example, I had used the numerical library NumPy before, and the chapter on it is one of the best introductions I've seen. The book also has chapters on matplotlib (a package to produce 2D plots) and on IPython (an enhanced shell), and you can use them as independent references on these subjects.

        I like two main things about this book. The libraries covered are very object-oriented, and the author carefully explains the concepts behind each class, like the differences between a Figure and an Axes object in matplotlib. I also like how detailed the examples are: the author presents a new command and then discusses each option. In particular, McKinney emphasizes how to extract data from a table: by row, by column, filtering by values, etc.
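The three kinds of extraction mentioned above (by row, by column, and by filtering on values) can be sketched as follows. This is a minimal illustration rather than an example from the book; the table contents, column names, and row labels are all hypothetical.

```python
import pandas as pd

# Hypothetical experimental readings; names and values are illustrative only.
df = pd.DataFrame(
    {"temperature": [20.5, 21.0, 19.8], "pressure": [101.2, 100.9, 101.5]},
    index=["run_a", "run_b", "run_c"],
)

# By row: .loc selects a row by its index label and returns a Series.
row = df.loc["run_b"]

# By column: bracket indexing with a column name returns a Series.
column = df["temperature"]

# By value: a boolean mask keeps only the rows where the condition holds.
warm = df[df["temperature"] > 20.0]
```

Here `row` and `column` are both one-dimensional Series, while `warm` is a smaller DataFrame containing only the rows that passed the filter.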

        The main problem I had was the lack of figures and diagrams. Like I said, the concepts are well written, but I missed a figure to more easily understand, for instance, the merging of two databases, or the relationship between a Series and a DataFrame (the main data types of pandas).
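In place of a diagram, the two ideas named above can be shown in a few lines. This is a rough sketch under assumed data, not material from the book: the `samples` and `readings` tables and their columns are hypothetical.

```python
import pandas as pd

# Two hypothetical tables that share a "sample_id" key column.
samples = pd.DataFrame(
    {"sample_id": [1, 2, 3], "material": ["steel", "copper", "brass"]}
)
readings = pd.DataFrame(
    {"sample_id": [1, 1, 3], "value": [0.5, 0.7, 0.2]}
)

# Series vs. DataFrame: each column of a DataFrame is a Series,
# and it shares the DataFrame's row index.
material = samples["material"]
is_series = isinstance(material, pd.Series)
same_index = material.index.equals(samples.index)

# Merging: an inner merge keeps only the keys present in both tables,
# repeating the left-hand row once for every matching right-hand row.
merged = pd.merge(samples, readings, on="sample_id", how="inner")
```

In this sketch, sample 2 has no readings and so drops out of `merged`, while sample 1 appears twice because it has two readings.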

        This is a minor problem. After I read the book, I started to write my own scripts, and I found myself constantly referring to it and the information I needed was usually easy to find. If you have to deal with tabular data of any sort, doing operations on them, extracting information and creating plots and charts, this book is a very nice companion to have.

         
        5.0

        GREAT book.

        By MCP

        from Napa, CA

        About Me Getting Started, Just Learning

        Verified Buyer

              Comments about oreilly Python for Data Analysis:

              The book is very well written and organized.

              (4 of 4 customers found this review helpful)

               
              4.0

              Great for jump starting data analysis

              By Sarah Bird

              from California

              About Me Developer

              Verified Buyer

              Pros

              • Accurate
              • Concise
              • Helpful examples

              Cons

                Best Uses

                • Intermediate

                Comments about oreilly Python for Data Analysis:

                My favorite thing about this book is the second chapter "Introductory Examples," which is the only chapter I read cover-to-cover.

                I know my way around Python but did not know any pandas, numpy or matplotlib and needed to. The introductory chapter did a great job of running through a whole bunch of uses without getting stuck in the details so I could get a flavor of what I could do and how.

                I have since dipped into the other chapters when trying to find out about specific things.

                A very useful book that covers a lot of ground.

                 
                4.0

                Good start for data handling in python

                By Myself

                from Belgium

                Pros

                • Accurate
                • Helpful examples

                Cons

                  Best Uses

                  • Expert
                  • Intermediate

                  Comments about oreilly Python for Data Analysis:

                  Good starting point for data handling using pandas in Python.

                  Basic prior knowledge of Python is helpful.

                  Mainly focused on the pandas module, but includes some interesting information on IPython and NumPy.

                   
                  5.0

                  Great book!

                  By Jure C.

                  from Ljubljana, Slovenia

                  About Me Developer

                  Verified Buyer

                  Pros

                  • Helpful examples
                  • Well-written

                  Cons

                    Best Uses

                    • Intermediate

                    Comments about oreilly Python for Data Analysis:

                    I read this book after watching a couple of Wes's tutorial videos, and it really helped me understand pandas and put my pandas project into practice. It also gave me the extra push to start using IPython Notebook.

                    I would recommend this book to anyone who is currently mangling data using self-written Python scripts.

                    (1 of 1 customers found this review helpful)

                     
                    5.0

                    very helpful to start with python

                    By panagiotis

                    from new york

                    About Me Developer

                    Verified Buyer

                    Pros

                    • Concise
                    • Easy to understand
                    • Helpful examples
                    • Well-written

                    Cons

                      Best Uses

                        Comments about oreilly Python for Data Analysis:

                        very helpful to start with python, numpy

                        (18 of 51 customers found this review helpful)

                         
                        1.0

                        Could this book be any more confusing?

                        By Jim the Runner

                        from San Jose, CA

                        Comments about oreilly Python for Data Analysis:

                        The title is very misleading, first of all. It's not about Python. It's about NumPy and Pandas. If you don't already know Python, you're probably going to struggle with this book, unless you start with the dense 50-page appendix waaaayyy in the back of the book.

                        The rest of the book consists of one random example after another without a clear roadmap. Showing 10 different ways to create a DataFrame isn't very helpful, when the author doesn't explain the concepts behind why you would use one approach over another.

                        To be honest, the problems with this book are similar to what I've found in other O'Reilly books. They read like dictionaries rather than books on how to write. I'm wondering if the problem has more to do with publishing standards than the authors.

                        (3 of 7 customers found this review helpful)

                         
                        4.0

                        Right way to start to code about data

                        By rafadaguiar

                        from Recife, Pernambuco, Brazil

                        About Me Developer, Student

                        Verified Reviewer

                        Pros

                        • Easy to understand
                        • Helpful examples
                        • Well-written

                        Cons

                          Best Uses

                          • Intermediate

                          Comments about oreilly Python for Data Analysis:

                          I wouldn't say that this is the right way to start learning data science, because I think that in the beginning it is important to go through certain concepts and techniques (like machine learning and statistical analysis). However, once you know what data science is about, this book will be very helpful for quickly building coding skills in this area.

                          Buying Options
                          Ebook: $33.99
                          Formats:  DAISY, ePub, Mobi, PDF
                          Print & Ebook: $43.99
                          Print: $39.99