Getting Started with Impala
Interactive SQL for Apache Hadoop
Publisher: O'Reilly Media
Final Release Date: August 2014
Pages: 110

This practical book shows you how to write, tune, and port SQL queries and other statements for a Big Data environment using Impala, the open source, massively parallel processing (MPP) SQL query engine for Apache Hadoop. The best practices outlined inside help database developers and business analysts design schemas that interoperate with other Hadoop components, are convenient for administrators to manage and monitor, and accommodate future expansion in data size and evolution of software capabilities.

Author John Russell from Cloudera’s Impala project includes insights from consulting engagements with Cloudera customers and from the Impala development team.

  • Deploy SQL applications on Hadoop
  • Understand the strengths and limitations of Impala’s massively parallel processing model for various data-related use cases
  • Learn a mental model for understanding performance characteristics of Impala
  • Optimize queries and other statements
  • Reduce time and effort for porting SQL code to Impala
Print (pre-order): $29.99, October 2014 (est.)