Apache Accumulo is a sorted, distributed key/value store designed to handle large amounts of data. It is robust and scalable, and its performance makes it well suited to real-time data storage. Accumulo is based on Google's Bigtable design and is built on top of Apache Hadoop, ZooKeeper, and Thrift.
Apache Accumulo for Developers is your step-by-step guide to building an Accumulo cluster, single-node or multi-node, on-site or in the cloud. Accumulo has been proven to handle petabytes of data while providing cell-level security and real-time analysis, and this book shows you how to take full advantage of that power.
Apache Accumulo for Developers looks at the process of setting up three systems (Hadoop, ZooKeeper, and Accumulo) and then configuring, monitoring, and securing them.
You will learn to connect Accumulo to both Hadoop and ZooKeeper. You will also learn how to monitor a single-node or multi-node cluster to find performance bottlenecks, and then how to integrate with Amazon EC2, Google Cloud Platform, Rackspace, and Windows Azure. The coverage of these cloud platforms also focuses on scripting.
You will also learn to troubleshoot clusters with monitoring tools and to use Accumulo's cell-level security to secure your data.
The book takes a tutorial-based approach, showing readers how to build an Accumulo cluster from scratch, monitor the system, and implement features such as security.
Who this book is for
This book is for developers who are new to Accumulo and want a good grounding in how to use it. It’s assumed that you understand how Hadoop works, both HDFS and MapReduce. No prior knowledge of ZooKeeper is assumed.