HBase is a remarkable tool for indexing mass volumes of data, but getting started with this distributed database and its ecosystem can be daunting. With this hands-on guide, you’ll learn how to architect, design, and deploy your own HBase applications by examining real-world solutions. Along with HBase principles and cluster deployment guidelines, this book includes in-depth case studies that demonstrate how large companies solved specific use cases with HBase.
Authors Jean-Marc Spaggiari and Kevin O’Dell also provide draft solutions and code examples to help you implement your own versions of those use cases, from master data management (MDM) and document storage to near real-time event processing. You’ll also learn troubleshooting techniques to help you avoid common deployment mistakes.
Learn exactly what HBase does, what its ecosystem includes, and how to set up your environment
Explore how real-world HBase instances were deployed and put into production
Examine documented use cases for tracking healthcare claims, digital advertising, data management, and product quality
Understand how HBase works with tools and techniques such as Spark, Kafka, MapReduce, and the Java API
Learn how to identify the causes and understand the consequences of the most common HBase issues
Introduction to HBase
Chapter 1. What Is HBase?
Column-Oriented Versus Row-Oriented
Implementation and Use Cases
Chapter 2. HBase Principles
Internal Table Operations
Chapter 3. HBase Ecosystem
Chapter 4. HBase Sizing and Tuning Overview
Different Workload Tuning
Chapter 5. Environment Setup
HBase Standalone Installation
HBase in a VM
Local Versus VM
Pseudodistributed and Fully Distributed
Chapter 6. Use Case: HBase as a System of Record
Chapter 7. Implementation of an Underlying Storage Engine
Chapter 8. Use Case: Near Real-Time Event Processing
Near Real-Time Event Processing
Chapter 9. Implementation of Near Real-Time Event Processing
Chapter 10. Use Case: HBase as a Master Data Management Tool
Chapter 11. Implementation of HBase as a Master Data Management Tool
Jean-Marc Spaggiari, an HBase contributor since 2012, works as an HBase specialist Solutions Architect for Cloudera, supporting Hadoop and HBase through technical support and consulting work. He has worked with some of the biggest HBase users in North America.
Jean-Marc’s primary role is to support HBase users with their cluster deployments, upgrades, configuration, and optimization, as well as with HBase-related application development. He is also a very active HBase community member, testing every release for performance and stability. Prior to Cloudera, Jean-Marc worked as a Project Manager and as a Solution Architect for CGI and for insurance companies. He has almost 20 years of Java development experience. In addition to regularly attending HBaseCon, he has spoken at various Hadoop User Group meetings and many conferences in North America, usually giving HBase-related presentations and demonstrations.
Kevin O’Dell is currently a Field Engineer at Rocana, where he works with customers to architect large-scale IT Operations. Prior to Rocana, Kevin worked at Cloudera for over four years, where he interacted with numerous Fortune 500 companies across every vertical.
In addition to his day-to-day work at Rocana, Kevin works closely with the open source Apache community. He is a contributor to the Apache HBase project, has written numerous blog posts, and has presented at multiple conferences on the Hadoop ecosystem.
The animal on the cover of Architecting HBase Applications is a killer whale, or orca (Orcinus orca). Killer whales have black and white coloring, including a distinctive white patch above the eye. Males can grow up to 26 feet in length and can weigh up to 6 tons. Females are slightly smaller, growing to 23 feet in length and 4 tons in weight.
Killer whales are toothed whales, and feed on fish, sea mammals, birds, and even other whales. Within their ecosystem they are apex predators, meaning they have no natural predators. Groups of killer whales (known as pods) have been observed specializing in what they eat, so diets can vary from one pod to another. Killer whales are highly social animals, and develop complex relationships and hierarchies. They are known to pass knowledge, such as hunting techniques and vocalizations, along from generation to generation. Over time, this has the effect of creating divergent behaviors between different pods.
Killer whales are not classified as a threat to humans, and have long played a part in the mythology of several cultures. Like most species of whales, the killer whale population was drastically reduced by commercial whaling over the last several centuries. Although whaling has been banned, killer whales are still threatened by human activities, including boat collisions and fishing line entanglement. The current population is unknown, but is estimated to be around 50,000.
Many of the animals on O'Reilly covers are endangered; all of them are important to the world. To learn more about how you can help, go to animals.oreilly.com.
The cover image is from British Quadrupeds. The cover fonts are URW Typewriter and Guardian Sans. The text font is Adobe Minion Pro; the heading font is Adobe Myriad Condensed; and the code font is Dalton Maag's Ubuntu Mono.