Python for Data Analysis
Data Wrangling with Pandas, NumPy, and IPython
Publisher: O'Reilly Media
Final Release Date: October 2012
Pages: 470

Python for Data Analysis is concerned with the nuts and bolts of manipulating, processing, cleaning, and crunching data in Python. It is also a practical, modern introduction to scientific computing in Python, tailored for data-intensive applications. This is a book about the parts of the Python language and libraries you’ll need to effectively solve a broad set of data analysis problems. This book is not an exposition on analytical methods using Python as the implementation language.

Written by Wes McKinney, the main author of the pandas library, this hands-on book is packed with practical case studies. It’s ideal for analysts new to Python and for Python programmers new to scientific computing.

  • Use the IPython interactive shell as your primary development environment
  • Learn basic and advanced NumPy (Numerical Python) features
  • Get started with data analysis tools in the pandas library
  • Use high-performance tools to load, clean, transform, merge, and reshape data
  • Create scatter plots and static or interactive visualizations with matplotlib
  • Apply the pandas groupby facility to slice, dice, and summarize datasets
  • Measure data by points in time, whether it’s specific instances, fixed periods, or intervals
  • Learn how to solve problems in web analytics, social sciences, finance, and economics, through detailed examples
Customer Reviews

REVIEW SNAPSHOT®

by PowerReviews
O'Reilly Python for Data Analysis

4.2 (based on 21 reviews)

Ratings Distribution

  • 5 Stars (8)
  • 4 Stars (11)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (1)

90% of respondents would recommend this to a friend.

Pros

  • Helpful examples (17)
  • Well-written (12)
  • Easy to understand (11)
  • Accurate (5)
  • Concise (5)

Cons

    Best Uses

    • Intermediate (13)
    • Expert (4)
    • Novice (3)

    Reviewer Profile

    • Developer (11)

    Reviewed by 21 customers

    Displaying reviews 1-10

     
    5.0

    Good book with examples and features

    By Anish Chapagain

    from Kathmandu, Nepal

    About Me Developer

    Verified Reviewer

    Pros

    • Concise
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

      Best Uses

      • Expert
      • Intermediate

      Comments about oreilly Python for Data Analysis:

      The book is a great resource for Python lovers and for data analysis. The tips, graphs, and code were really helpful for visualizing and interpreting data.

      (2 of 2 customers found this review helpful)

       
      4.0

      Good reference to deal with tabular data

      By Fábio Fortkamp

      from Florianópolis, Brazil

      About Me Master's Student

      Verified Reviewer

      Pros

      • Easy to understand
      • Helpful examples
      • Well-written

      Cons

      • Lack Of Figures

      Best Uses

      • Researchers
      • Scientists
      • Student

      Comments about oreilly Python for Data Analysis:

      This book solved a practical problem for me. I needed a way to process a hundred text files (with tens of thousands of lines each) containing experimental data (I am a student in a Master of Engineering program in Brazil), and I wanted to use Python, since I was familiar with it. After some research, I discovered a library named pandas and this book, which was written by its main developer. Disclosure: I got the book through the O'Reilly Reader Review Program.

      The book is not only about pandas, though. The title was correctly chosen: the author covers various excellent tools for using Python to analyze tabular data. For example, I had used the numerical library NumPy before, and the chapter on it is one of the best introductions I've seen. The book also has chapters on matplotlib (a package to produce 2D plots) and on IPython (an enhanced shell), and you can use them as independent references on these subjects.

      I like two main things about this book. The libraries covered are very object-oriented, and the author carefully explains the concepts behind each class, like the differences between a Figure and an Axes object in matplotlib. I also like how detailed the examples are: the author presents a new command and then discusses each option. In particular, McKinney emphasizes how to extract data from a table: by row, by column, filtering by values, etc.

      The main problem I had was the lack of figures and diagrams. As I said, the concepts are well explained, but I missed having a figure to help me understand, for instance, the merging of two databases, or the relationship between a Series and a DataFrame (the main data types of pandas).

      This is a minor problem. After I read the book, I started to write my own scripts, and I found myself constantly referring to it; the information I needed was usually easy to find. If you have to deal with tabular data of any sort, performing operations on it, extracting information, and creating plots and charts, this book is a very nice companion to have.

       
      5.0

      GREAT book.

      By MCP

      from Napa, CA

      About Me Getting Started, Just Learning

      Verified Buyer

      Pros

        Cons

          Best Uses

            Comments about oreilly Python for Data Analysis:

            The book is very well written and organized.

            (3 of 3 customers found this review helpful)

             
            4.0

            Great for jump starting data analysis

            By Sarah Bird

            from California

            About Me Developer

            Verified Buyer

            Pros

            • Accurate
            • Concise
            • Helpful examples

            Cons

              Best Uses

              • Intermediate

              Comments about oreilly Python for Data Analysis:

              My favorite thing about this book is the second chapter "Introductory Examples," which is the only chapter I read cover-to-cover.

              I know my way around Python but did not know any pandas, numpy or matplotlib and needed to. The introductory chapter did a great job of running through a whole bunch of uses without getting stuck in the details so I could get a flavor of what I could do and how.

              I have since dipped into the other chapters when trying to find out about specific things.

              A very useful book that covers a lot of ground.

               
              4.0

              Good start for data handling in python

              By Myself

              from Belgium

              Pros

              • Accurate
              • Helpful examples

              Cons

                Best Uses

                • Expert
                • Intermediate

                Comments about oreilly Python for Data Analysis:

                Good starting point for data handling using pandas in python.

                Basic previous knowledge of python helpful.

                Mainly focused on the pandas module but includes some interesting information on ipython/numpy.

                 
                5.0

                Great book!

                By Jure C.

                from Ljubljana, Slovenia

                About Me Developer

                Verified Buyer

                Pros

                • Helpful examples
                • Well-written

                Cons

                  Best Uses

                  • Intermediate

                  Comments about oreilly Python for Data Analysis:

                  I read this book after watching a couple of Wes's tutorial videos, and it really helped me understand pandas and put my pandas project into practice. It also gave me the extra push to start using IPython Notebook.

                  I would recommend this book to anyone who is currently mangling data using self-written Python scripts.

                  (1 of 1 customers found this review helpful)

                   
                  5.0

                  very helpful to start with python

                  By panagiotis

                  from new york

                  About Me Developer

                  Verified Buyer

                  Pros

                  • Concise
                  • Easy to understand
                  • Helpful examples
                  • Well-written

                  Cons

                    Best Uses

                      Comments about oreilly Python for Data Analysis:

                      very helpful to start with python, numpy

                      (17 of 46 customers found this review helpful)

                       
                      1.0

                      Could this book be any more confusing?

                      By Jim the Runner

                      from San Jose, CA

                      Comments about oreilly Python for Data Analysis:

                      The title is very misleading, first of all. It's not about Python. It's about NumPy and Pandas. If you don't already know Python, you're probably going to struggle with this book, unless you start with the dense 50 page appendix waaaayyy in the back of the book.

                      The rest of the book consists of one random example after another without a clear roadmap. Showing 10 different ways to create a DataFrame isn't very helpful, when the author doesn't explain the concepts behind why you would use one approach over another.

                      To be honest, the problems with this book are similar to what I've found in other O'Reilly books. They read like dictionaries rather than books on how to write. I'm wondering if the problem has more to do with publishing standards than the authors.

                      (2 of 6 customers found this review helpful)

                       
                      4.0

                      Right way to start to code about data

                      By rafadaguiar

                      from Recife, Pernambuco, Brazil

                      About Me Developer, Student

                      Verified Reviewer

                      Pros

                      • Easy to understand
                      • Helpful examples
                      • Well-written

                      Cons

                        Best Uses

                        • Intermediate

                        Comments about oreilly Python for Data Analysis:

                        I wouldn't say that this is the right way to start learning data science, because I think that in the beginning it is important to work through certain concepts and techniques (like machine learning and statistical analysis). However, once you know what data science is about, this book will be very helpful for quickly building coding skills in this area.

                        (2 of 10 customers found this review helpful)

                         
                        5.0

                        great book for any beginning analyst

                        By colin

                        from montreal, canada

                        Comments about oreilly Python for Data Analysis:

                        Amazing book all around.

                        Just one comment: Why is the animal on the cover not a panda? Perhaps in the next edition?

                         
                        Buying Options
                        Immediate Access - Go Digital
                        Ebook: $33.99
                        Formats:  DAISY, ePub, Mobi, PDF
                        Print & Ebook: $43.99
                        Print: $39.99