Hadoop Application Architectures
Designing Real-World Big Data Applications
Publisher: O'Reilly Media
Final Release Date: June 2015
Pages: 400

Get expert guidance on architecting end-to-end data management solutions with Apache Hadoop. While many sources explain how to use various components in the Hadoop ecosystem, this practical book takes you through architectural considerations necessary to tie those components together into a complete tailored application, based on your particular use case.

To reinforce those lessons, the book’s second section provides detailed examples of architectures used in some of the most commonly found Hadoop applications. Whether you’re designing a new Hadoop application, or planning to integrate Hadoop into your existing data infrastructure, Hadoop Application Architectures will skillfully guide you through the process.

This book covers:

  • Factors to consider when using Hadoop to store and model data
  • Best practices for moving data in and out of the system
  • Data processing frameworks, including MapReduce, Spark, and Hive
  • Common Hadoop processing patterns, such as removing duplicate records and using windowing analytics
  • Giraph, GraphX, and other tools for large graph processing on Hadoop
  • Using workflow orchestration and scheduling tools such as Apache Oozie
  • Near-real-time stream processing with Apache Storm, Apache Spark Streaming, and Apache Flume
  • Architecture examples for clickstream analysis, fraud detection, and data warehousing

Review Snapshot

Average rating: 4.4 (based on 10 reviews)

Ratings Distribution

  • 5 Stars (7)
  • 4 Stars (1)
  • 3 Stars (1)
  • 2 Stars (1)
  • 1 Star (0)

90% of respondents would recommend this to a friend.

Pros

  • Accurate (8)
  • Easy to understand (7)
  • Helpful examples (6)
  • Well-written (6)
  • Concise (5)

Cons

  • Not comprehensive enough (3)

Best Uses

  • Intermediate (9)
  • Expert (4)
  • Novice (3)

Reviewer Profile

  • Developer (6)
  • Designer (3)

Reviewed by 10 customers

(0 of 1 customers found this review helpful)

 
4.0

Useful reference

By Jeff

from NY, NY

Verified Buyer

Pros

  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

  • Not comprehensive enough

Best Uses

  • Intermediate
  • Novice

Comments about oreilly Hadoop Application Architectures:

Useful for validating some architectural choices in this technology stack.
Goes into some detail about HDFS file formats, compression, container formats, etc.
Provides an overview of the best-known and most often mentioned Hadoop tools.

Overall, the level of detail is good up to an intermediate user.
I can see that people more experienced in this space might find the content somewhat high level.

It does, though, serve my purpose of getting baseline architecture concepts for this technology.

 
5.0

Essential reading for any Data Engineer/Architect

By Gideon

from Israel

About Me Developer

Verified Buyer

Pros

  • Accurate
  • Concise
  • Helpful examples

Cons

    Best Uses

    • Intermediate

    Comments about oreilly Hadoop Application Architectures:

    An excellent and detailed overview of Big Data architecture patterns and best practices.
    The book covers most aspects of data ingestion and processing, including case studies which teach you how to design practical data pipelines.
    Recommended for any big data engineer/architect.

    (1 of 2 customers found this review helpful)

     
    2.0

    The book title doesn't match expectations

    By anand

    from atlanta,ga

    Verified Buyer

    Pros

      Cons

      • Difficult to understand
      • Not comprehensive enough
      • Too basic

      Best Uses

        Comments about oreilly Hadoop Application Architectures:

        In general, we would expect a lot on the design and development methodologies of an application, but this book doesn't help much with that. It says a lot about Spark, but not about real use cases.

        (1 of 1 customers found this review helpful)

         
        5.0

        Excellent book. Highly recommended!

        By pnwhitney

        from Allen, TX

        About Me Developer

        Verified Buyer

        Pros

        • Accurate
        • Easy to understand
        • Helpful examples
        • Well-written

        Cons

          Best Uses

          • Intermediate
          • Novice
          • Student

          Comments about oreilly Hadoop Application Architectures:

          Great overview of everything Hadoop and then some!

           
          5.0

          Well put together Hadoop architecture book

          By Go COLTS!

          from IN

          About Me Designer

          Verified Buyer

          Pros

          • Accurate
          • Concise
          • Easy to understand
          • Helpful examples
          • Well-written

          Cons

            Best Uses

            • Intermediate

            Comments about oreilly Hadoop Application Architectures:

            The book provides just enough depth to get a general understanding of the ingest methods and processing frameworks on Hadoop for an architect without going into too much syntax.

             
            5.0

            Essential concern-centric, not tool-centric, view of Hadoop

            By Sean

            from London, UK

            About Me Developer

            Verified Reviewer

            Pros

            • Accurate
            • Helpful examples

            Cons

              Best Uses

              • Expert
              • Intermediate

              Comments about oreilly Hadoop Application Architectures:

              Disclosure: the authors are my coworkers, so I know what they've been up to and have awaited the release of this book.

              I believe it provides a much needed guide for developers and architects working in the Hadoop ecosystem since it focuses on cross-cutting concerns, not just tools. The first half is organized around essential elements of any application architecture: formats, schemas, ingest, processing, workflow orchestration. It connects these to tools in the ecosystem and gives a survey of their use, but focuses on the issues and the general solutions, rather than just the tools that are deployed. This is refreshing.

              The second half provides end-to-end examples of architecting common use cases, like clickstream processing and fraud detection, as worked examples. A great guide for those who want to understand the "why" and "how" of Hadoop app development and not just the "what".

              (0 of 1 customers found this review helpful)

               
              5.0

              Good reference book

              By Shridhar R

              from Bangalore India

              About Me Developer

              Verified Buyer

              Pros

              • Accurate
              • Concise
              • Easy to understand

              Cons

                Best Uses

                • Expert
                • Intermediate

                Comments about oreilly Hadoop Application Architectures:

                One of the best books for Hadoop architects and developers. It is concise and explains architecture patterns in a simple way. This is a good reference book for making architectural or design decisions.

                (1 of 2 customers found this review helpful)

                 
                3.0

                Very nice introduction to Hadoop!

                By Svende

                from Copenhagen

                About Me Designer, Developer

                Verified Buyer

                Pros

                • Accurate
                • Easy to understand
                • Well-written

                Cons

                • Miss An Area Description
                • Miss Load Examples
                • Not comprehensive enough

                Best Uses

                • Expert
                • Intermediate
                • Novice
                • Student

                Comments about oreilly Hadoop Application Architectures:

                Have read 4 chapters. We have had great value reading these chapters, but we still have problems understanding the load processes.
                I miss information about the ingestion/load process, with some examples showing what can or should happen until the data is in place.

                (0 of 1 customers found this review helpful)

                 
                5.0

                I would buy again

                By Steve

                from NY

                About Me Designer

                Verified Buyer

                Pros

                • Accurate
                • Concise
                • Easy to understand
                • Helpful examples
                • Well-written

                Cons

                  Best Uses

                  • Expert
                  • Intermediate

                  Comments about oreilly Hadoop Application Architectures:

                  Contains a plethora of good information necessary to help architect a Hadoop environment. Need to blend with ongoing app updates and techniques.

                  (3 of 3 customers found this review helpful)

                   
                  5.0

                  More than just Hadoop

                  By mrtheb

                  from Montreal, Canada

                  About Me Developer

                  Verified Buyer

                  Pros

                  • Accurate
                  • Easy to understand
                  • Well-written

                  Cons

                    Best Uses

                    • Intermediate

                    Comments about oreilly Hadoop Application Architectures:

                    Haven't gone through the whole book, but so far it's the best view of all the current tools available around data pipelines. IMO, it goes further than just being limited to Hadoop by discussing tools like Kafka (which I don't really consider to be necessarily part of the Hadoop family). It doesn't pretend to be an in-depth intro to all the tech it discusses but covers just enough to provide good insights.

                    I'll certainly use the book as a reference when I have to pick tools for a great data pipeline integration, even those not listed in the book (yet).

                    Buying Options
                    Ebook:  $42.99
                    Formats:  DAISY, ePub, Mobi, PDF
                    Print & Ebook:  $54.99
                    Print:  $49.99