Talend for Big Data
By Bahaaldine Azarmi
Publisher: Packt Publishing
Final Release Date: February 2014
Pages: 96

In Detail

Talend, a successful Open Source Data Integration Solution, accelerates the adoption of new big data technologies and efficiently integrates them into your existing IT infrastructure. It is able to do this because of its intuitive graphical language, its multiple connectors to the Hadoop ecosystem, and its array of tools for data integration, quality, management, and governance.

This is a concise, pragmatic book that will guide you through designing and implementing big data transfers and performing big data analytics jobs using Hadoop technologies such as HDFS, HBase, Hive, Pig, and Sqoop. You will learn how to write complex processing job code and how to leverage the power of Hadoop projects through the design of graphical Talend jobs using the business modeler, the metadata repository, and a palette of configurable components.

Starting with understanding how to process a large amount of data using Talend big data components, you will then learn how to write job procedures in HDFS. You will then look at how to use Hadoop projects to process data and how to export the data to your favourite relational database system.

You will learn how to implement Hive ELT jobs, Pig aggregation and filtering jobs, and simple Sqoop jobs using the Talend big data component palette. You will also learn the basics of Twitter sentiment analysis and how to format data with Apache Hive.

Talend for Big Data will enable you to start working on big data projects immediately, from simple processing projects to complex projects using common big data patterns.

Approach

This book is written in a concise and easy-to-understand manner, and acts as a comprehensive guide on data analytics and integration with Talend big data processing jobs.

Who this book is for

If you are a chief information officer, enterprise architect, data architect, data scientist, software developer, software engineer, or a data analyst who is familiar with data processing projects and who wants to use Talend to get your first big data job executed in a reliable, quick, and graphical way, then Talend for Big Data is perfect for you.

Customer Reviews

Review Snapshot

4.2 (based on 5 reviews)

Ratings Distribution

  • 5 Stars (2)
  • 4 Stars (2)
  • 3 Stars (1)
  • 2 Stars (0)
  • 1 Star (0)

100% of respondents would recommend this to a friend.

Pros

  • Accurate (3)
  • Helpful examples (3)
  • Well-written (3)

Cons

    Best Uses

    • Intermediate (4)
    • Expert (3)
    • Novice (3)

    Reviewer Profile

    • Developer (3)

    Reviewed by 5 customers

    5.0

    A must for Big Data Enthusiasts!

    By Dipanjan Sarkar

    from Bangalore, India

    About Me Developer

    Verified Reviewer

    Pros

    • Accurate
    • Concise
    • Easy to understand
    • Helpful examples
    • Well-written

    Cons

      Best Uses

      • Expert
      • Intermediate
      • Novice
      • Student

      Comments about oreilly Talend for Big Data:

      I got my hands on this book recently and, being a Big Data enthusiast, dived into it immediately. Although you can find a lot of resources on Big Data on the web, they are very unstructured, just like Big Data itself! This book, however, covers how you can tap into the massive power of Talend to work with Big Data, with concrete implementations and descriptive examples.

      Talend is basically an open source software vendor that provides tools to manage and work with Big Data easily, without extensive knowledge of hard-core coding and internals. This book covers exhaustively how to use Talend Open Studio with a Hadoop distribution for working with Big Data. Some of the key takeaways from this book include:

      - Setting up and running Talend Open Studio for Big Data
      - Twitter sentiment analysis with Apache Hive
      - Aggregating Data with Apache Pig
      - Linking RDBMS like SQL with HDFS
      - Big Data Architecture and Integration patterns

      If you are a data scientist or a big data enthusiast and love exploring open source tools, this book is definitely worth buying!

       
      4.0

      Go buy it!

      By Rishi {Krishna}

      from Mumbai, India

      About Me Designer, Developer

      Verified Reviewer

      Pros

        Cons

          Best Uses

          • Expert
          • Intermediate

          Comments about oreilly Talend for Big Data:

          "More" is the word I would say after reading this book. A very interesting topic, combined with Bahaaldine's brilliant step-by-step explanation of what to do and how to get the expected result, makes for an amazing read. He takes a very cautious and systematic approach in helping readers understand and play with Talend Open Studio for Big Data. What is also interesting is that his outlook throughout the book is that of a tech guru teaching his children how to operate a new toy: very methodical and full of patience.

          What I Loved in this Book:

          • The step-by-step narration with appropriate screenshots and follow-up text
          • The links, which seem to take us to the right sources to download all the content required
          • No loose ends have been left while preparing this book, from resources to programs
          • The opening statement of who this book is for gives apt and complete information about the target audience without being vague on any point

          What I did not like:

          • The book is too short. While it gets my interest going, there could have been more to this book and topic
          • The book gives apt examples, but a few more oddball cases would have won my kudos
          • On some topics the author could have gone a little further with his explanations

          On the Whole:

          On the whole I would say definitely YES for the book. The author has put wholehearted effort into understanding the topic and conveying it to all of his readers. So go ahead and buy a copy of this wonderful book. It's definitely worth the time.

           
          3.0

          Hands-on of Talend & Hadoop Required

          By navcode

          from India

          Pros

            Cons

              Best Uses

              • Novice

              Comments about oreilly Talend for Big Data:

              > Prior hands-on experience with the Talend DI and ESB components, and a fair working knowledge of the Hadoop family of components such as Hive, HDFS, Sqoop, and Pig, would make it easier to grasp the real essence of the book

              > The book is comprehensive in explaining the steps required to get started with Talend & Hadoop components from ground zero, i.e. procurement, setup, initial configuration, etc.

              > The style of picking a single topic, e.g. Hive, and devising a fully functional working example is crisp and clear even if you are a complete beginner

              > The author has also put effort into explaining important Talend terminology such as Context, Schema, an overview of components, and Talend Modules for an easy take-off

              > The book starts with a brief explanation of Hive, Sqoop, Pig, HDFS, etc. and sets the ground before taking a deep dive into the real implementation

              > The good thing about the book is that the illustration of the implementation revolves around a single scenario: Tweet feed analysis and hashtags. This removes the need to revisit the underlying use case again and again and lets you get on quickly to Talend and Hadoop

              > Highlighting the essential regex functionality missing from Hive, and how to work around it, is a helpful tip

               
              4.0

              REVIEW OF TALEND FOR BIG DATA

              By Uchit

              from India

              About Me Developer, Educator

              Verified Reviewer

              Pros

              • Accurate
              • Helpful examples
              • Well-written

              Cons

              • Not comprehensive enough

              Best Uses

              • Expert
              • Intermediate

              Comments about oreilly Talend for Big Data:

              This handbook has kept up with the increasing focus on Big Data technology and its integration with typical open source components. There are many strengths associated with this text but, in my experience, some really good topics, such as Chapter 3: Formatting data (sentiment analysis) and Chapter 4: Processing tweets with Apache Hive (extracting hashtags and emoticons), are the greatest wonder of this book. Another strength of this book is the resource list with images and the Appendix section at the end of the book chapters. Once students discover this book's usefulness, they consult it in conjunction with every big data related task, saving me time and encouraging them to participate more fully in their own learning for big data.
              The author, Bahaaldine Azarmi, gives easy-to-understand explanations of what is possible with big data and related technologies such as databases, application servers, or web servers. I recommend this book to anyone who wants to learn more about big data and its terminology. The book is also useful because its examples are drawn from a variety of real-time levels. But be warned, you'll be left with more questions! You'll be ready to start your own search for other big data explanations and integrations about what you see happening all around you.
              This book does have a few drawbacks. First, for those not familiar with technology, or for those as yet unfamiliar with some of the main challenges raised by big data learners, some of the writing in this book may seem foreign at first. The mix of terms from both the technology and big data fields can make the reading difficult for some. Although it is apparent that the author wanted the book to be an accessible source for non-experts, it sometimes falls short of this goal with its extended use of technical terms and wordy sentences.
              Overall, this book provides many insights into how technology can help big data learners get a grip on the subject to their full capacity. For those unfamiliar with the technology, it presents basic information about the tools, so that their understanding is not only technology-specific but also grounded in real-time scenarios.

               
              5.0

              Talend for Big Data book review

              By Pcoffre

              from Paris, France

              About Me Educator, Maker

              Verified Reviewer

              Pros

              • Accurate
              • Concise
              • Easy to understand
              • Helpful examples
              • Well-written

              Cons

              • For Hadoop Talend Users

              Best Uses

              • Intermediate
              • Novice
              • Student

              Comments about oreilly Talend for Big Data:

              Talend for Big Data explains to readers how to work with Talend's Big Data solutions. In seven easy-to-read chapters, you will be ready to take on the technology.

              Hadoop and big data require some coding skills, but Talend Open Studio for Big Data, as well as all the Enterprise and Platform versions, eases access to the technology because users are able to develop their jobs graphically. Moreover, Talend provides a powerful and versatile open source big data product that makes working with big data technologies easy and helps drive and improve business performance, without the need for special knowledge or resources.

              What is also interesting in Talend's technology is that the big data product combines big data components for MapReduce 2.0 (YARN), Hadoop, HBase, Hive, HCatalog, Oozie, Sqoop and Pig into a unified open source environment so you can quickly load, extract, transform and process large and diverse data sets from disparate systems.

              Overall what is enjoyable throughout the book are the screenshots added by the author and the examples that clearly illustrate concepts that can be complex to fully understand.

              Buying Options
              Ebook: $20.99
              Formats:  ePub, Mobi, PDF