Agile Data Science
Building Data Analytics Applications with Hadoop
Publisher: O'Reilly Media
Final Release Date: October 2013
Pages: 178

Mining big data requires a deep investment in people and time. How can you be sure you’re building the right models? With this hands-on book, you’ll learn a flexible toolset and methodology for building effective analytics applications with Hadoop.

Using lightweight tools such as Python, Apache Pig, and the D3.js library, your team will create an agile environment for exploring data, starting with an example application to mine your own email inboxes. You’ll learn an iterative approach that enables you to quickly change the kind of analysis you’re doing, depending on what the data is telling you. All example code in this book is available as working Heroku apps.

  • Create analytics applications by using the agile big data development methodology
  • Build value from your data in a series of agile sprints, using the data-value stack
  • Gain insight by using several data structures to extract multiple features from a single dataset
  • Visualize data with charts, and expose different aspects through interactive reports
  • Use historical data to predict the future, and translate predictions into action
  • Get feedback from users after each sprint to keep your project on track
Customer Reviews

Review Snapshot (by PowerReviews)

Agile Data Science: 4.6 (based on 7 reviews)

Ratings Distribution

  • 5 Stars (6)
  • 4 Stars (0)
  • 3 Stars (0)
  • 2 Stars (1)
  • 1 Star (0)

86% of respondents would recommend this to a friend.

Pros

  • Helpful examples (6)
  • Concise (4)
  • Easy to understand (3)
  • Well-written (3)

Cons

No Cons

Best Uses

  • Intermediate (5)

Reviewer Profile

  • Developer (7), Designer (3)

Reviewed by 7 customers

 
5.0

Concise and incisive

By venos

from London

About Me Designer, Developer, Maker

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

      Comments about oreilly Agile Data Science:

      I needed a book to get me started quickly on developing data analytics applications. I was new to data analytics and system development, and I needed to develop a web-based big data analytics application as part of my M.Sc. coursework. I purchased this book after reading positive reviews and reviewing the table of contents. I wasn't disappointed at all. It is now on my shortlist of reference materials. The book is slightly dated now because of recent developments and the new Hadoop tech stack that has emerged since its publication. However, the agile concepts in the book are timeless and very helpful, and the examples remain relevant. I would recommend the book even in 2016.

       
      5.0

      Excellent book

      By Michel

      from Nice

      About Me Designer, Developer

      Comments about oreilly Agile Data Science:

      Very useful book for creating graphs.

      (1 of 2 customers found this review helpful)

       
      2.0

      Nice approach, but look elsewhere

      By HB Natty

      from Kansas City, KS

      About Me Developer, Hacker

      Verified Reviewer

      Pros

      • Helpful examples
      • Open-source
      • Practical Methodology

      Cons

      • No Updates
      • Outdated
      • Too many errors

      Best Uses

      • Expert
      • Intermediate

      Comments about oreilly Agile Data Science:

      Although I like the approach and methodology Mr. Jurney tries to take with this book, it's confusingly assembled, erroneous, and now out of date. Trying to write a practical book on this topic using the tools the author chooses while they're moving so fast seems like a big challenge that he has apparently abandoned. There are just too many swiftly moving parts to make this book practical. If you're not already keeping up with changes to Pig, Avro, Mongo, and Hadoop, then just getting past Chapter 3 is a major accomplishment. I wasted a lot of time either looking for the older versions Jurney cites or troubleshooting and cobbling the newer packages together.
      A big credit to the author is that he released the code, examples, and tutorials on GitHub. That seems like a much better approach for teaching this topic than trying to put it into a book. However, it looks like the project is now fairly stale.
      I wish I'd put my money elsewhere. Noble effort, but I'm sure you'll find better resources elsewhere.

      (1 of 1 customers found this review helpful)

       
      5.0

      Practical book!

      By Harish Chakravarthy

      from Bay Area, CA

      About Me Developer

      Verified Reviewer

      Pros

      • Helpful examples

      Cons

      • Too many errors

      Best Uses

      • Intermediate
      • Self-motivated

      Comments about oreilly Agile Data Science:

      Practical book that introduces numerous tools for agile data science. I cloned the GitHub repository, ran the programs, and followed the explanations in the book. I was amazed by the interesting insights generated by the programs in each chapter leading up to the real-time prediction. Russell Jurney has certainly provided excellent building blocks and working solutions to get started.

      This book is certainly not for everyone. In addition to programming skills, you need to be self-motivated to take full advantage of this book and its GitHub repository. There are numerous inconsistencies between the code on GitHub and the code in print. There are also numerous open and unanswered bugs in the GitHub repository. Chapter 4 is optional (this is not mentioned anywhere!) and could be placed at the end.

      I had numerous aha! moments reading this book. I strongly recommend this book to anyone interested in getting started with agile data science.

      (1 of 1 customers found this review helpful)

       
      5.0

      Just great

      By John G

      from Stamford, CT

      About Me Designer, Developer

      Verified Buyer

      Pros

      • Accurate
      • Concise
      • Easy to understand
      • Helpful examples
      • Well-written

        Best Uses

        • Intermediate
        • Novice

        Comments about oreilly Agile Data Science:

        Very satisfied with this book. It covers material that matters in a way you can learn it and actually use it.

        (4 of 4 customers found this review helpful)

         
        5.0

        A must read for anyone starting BigData

        By ArthurZ

        from Canada

        About Me Developer

        Verified Reviewer

        Pros

        • Concise
        • Easy to understand
        • Helpful examples
        • Well-written

          Best Uses

          • Intermediate

          Comments about oreilly Agile Data Science:

          There are at least two reasons to read this book:

          1) The author understands that a typical business today cannot wait long for a data scientist to deliver results and, as usual, demands a very quick return on investment (ROI); with this book you will be able to cope with that demand, and
          2) The book covers all the necessary, proven, modern brick-and-mortar offerings that a relative newcomer to the Big Data world needs to get the job done.

          It certainly enables such a professional to grow and expand based on the acquired knowledge, and one can truly do it very fast.

          (11 of 11 customers found this review helpful)

           
          5.0

          Great practical guide to tools and tech

          By aaron

          from Philadelphia, PA

          About Me Developer, Manager

          Verified Reviewer

          Pros

          • Concise
          • Helpful examples

            Best Uses

            • Intermediate

            Comments about oreilly Agile Data Science:

            I'm really enjoying going through this big data tutorial and learning much.

            Interestingly I've toyed with nearly all the technologies being used and thought I understood the value of big data. I even have some map-reduce analytic jobs running to provide real value.

            This book made the 'agile' part click and made me look at my analytic workflow like any other software process. Just as I focus on optimizing my tooling for automating, compiling, and testing applications, I can see how easy it could be to have a similar workflow for BI.

            I like the writing style and the pace. He calls out some common traps while not spending too much time going into installation and tool details best left to the project websites.

            I'd like to see a part II of this where these techniques are blended with SQL data and maybe data warehouses.

            Buying Options
            Ebook:  $33.99
            Formats:  DAISY, ePub, Mobi, PDF
            Print & Ebook:  $43.99
            Print:  $39.99

            Available in Multiple Languages