Apache Sqoop Cookbook
Unlocking Hadoop for Your Relational Database
Publisher: O'Reilly Media
Released: July 2013
Pages: 94

Integrating data from multiple sources is essential in the age of big data, but it can be a challenging and time-consuming task. This handy cookbook provides dozens of ready-to-use recipes for using Apache Sqoop, the command-line interface application that optimizes data transfers between relational databases and Hadoop.

Sqoop is both powerful and bewildering, but with this cookbook’s problem-solution-discussion format, you’ll quickly learn how to deploy and then apply Sqoop in your environment. The authors provide MySQL, Oracle, and PostgreSQL database examples on GitHub that you can easily adapt for SQL Server, Netezza, Teradata, or other relational systems.

  • Transfer data from a single database table into your Hadoop ecosystem
  • Keep table data and Hadoop in sync by importing data incrementally
  • Import data from more than one database table
  • Customize transferred data by calling various database functions
  • Export generated, processed, or backed-up data from Hadoop to your database
  • Run Sqoop within Oozie, Hadoop’s specialized workflow scheduler
  • Load data into Hadoop’s data warehouse (Hive) or database (HBase)
  • Handle installation, connection, and syntax issues common to specific database vendors
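
To give a flavor of the incremental-import task listed above, here is a minimal sketch of a saved Sqoop job. It is not taken from the book: the JDBC URL, user name, password-file path, table, and check column are all hypothetical placeholders, and the script only prints the commands unless RUN_SQOOP=1 is set.

```shell
#!/bin/sh
# Sketch only. Every value below is a hypothetical placeholder.
set -eu

CONNECT="jdbc:mysql://db.example.com/sqoop"   # hypothetical JDBC URL
TABLE="cities"                                # hypothetical source table
CHECK_COLUMN="id"                             # hypothetical increasing key column
JOB="import_${TABLE}"                         # name of the saved job

# Arguments for an append-mode incremental import of a single table.
# The values contain no spaces, so a plain string is enough here.
IMPORT_ARGS="import --connect $CONNECT --username sqoop \
--password-file /user/sqoop/.db-password --table $TABLE \
--incremental append --check-column $CHECK_COLUMN --last-value 0"

if [ "${RUN_SQOOP:-0}" = "1" ]; then
  # Save the import as a named job, then execute it. A saved job
  # records the last imported value between runs.
  # shellcheck disable=SC2086  # word splitting of IMPORT_ARGS is intended
  sqoop job --create "$JOB" -- $IMPORT_ARGS
  sqoop job --exec "$JOB"
else
  # Dry run: show what would be executed.
  echo "sqoop job --create $JOB -- $IMPORT_ARGS"
  echo "sqoop job --exec $JOB"
fi
```

Running the job again later with `sqoop job --exec import_cities` should pick up only rows whose check column exceeds the value stored from the previous run.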
Customer Reviews

Average rating: 4.0 (based on 2 reviews)

Ratings Distribution

  • 5 Stars: 0
  • 4 Stars: 2
  • 3 Stars: 0
  • 2 Stars: 0
  • 1 Star: 0

4.0

Great Overview and a Valuable Reference

By Tom Wheeler

from St. Louis, Missouri

About Me Developer, Educator

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples

Cons

  • Print Quality

Best Uses

  • Intermediate
  • Novice

Comments about O'Reilly Apache Sqoop Cookbook:

Although just 75 pages long, this book is both a great overview and a valuable reference. It focuses on what's important rather than trying to cover every possible detail of Sqoop. Both authors are involved in the development and leadership of Sqoop and their knowledge is extensive. This shines through in the explanations, which I found both helpful and technically accurate.

Sqoop is a powerful tool with lots of options. Beginners are often unaware of its capabilities and wind up doing things the hard way. Even those who have used Sqoop for years might not know about some of its newer features, such as how to use saved jobs to track incremental imports. I'd recommend this book to either group, because spending just an hour or two reading it now could save you a lot more time later.

Since Sqoop is used to get data into and out of a Hadoop cluster, it is typically the first or last step in a much larger data processing workflow. In other words, everything else depends on your ability to use Sqoop quickly and correctly. That makes the "cookbook" format of this book all the more valuable -- it lets you flip right to the page you need and read a concise explanation that shows you exactly how to get the job done.

The content of the book is great, but I am giving this book only four stars due to a problem with the printing itself. The ink used in this book is shiny and actually causes a glare that can make it difficult to read in certain lighting. I have noticed this problem with a few newer O'Reilly books and I hope it's something that they'll fix soon.

Disclosure: Both authors are co-workers of mine at Cloudera. I volunteered to serve as a technical reviewer for this book and the publisher sent me a free copy after it went to press.

 
4.0

Good examples, install instructions so-so

By Joe

from CO

About Me Developer

Verified Reviewer

Pros

  • Accurate
  • Concise
  • Easy to understand
  • Helpful examples
  • Well-written

Cons

Best Uses

  • Intermediate

Comments about O'Reilly Apache Sqoop Cookbook:

Overall, I really liked the organization and information presented in the book. I wish the installation section had a little more detail. Once I got past the install, the other recipes worked very well.

Buying Options
Ebook: $9.99
Formats: ePub, Mobi, PDF
Print & Ebook: $16.49
Print: $14.99