As more corporations turn to Hadoop to store and process their most valuable data, the risk of a potential breach of those systems increases exponentially. This practical book not only shows Hadoop administrators and security architects how to protect Hadoop data from unauthorized access, it also shows how to limit the ability of an attacker to corrupt or modify data in the event of a security breach.
Authors Ben Spivey and Joey Echeverria provide in-depth information about the security features available in Hadoop, and organize them according to common computer security concepts. You’ll also get real-world examples that demonstrate how you can apply these concepts to your use cases.
Understand the challenges of securing distributed systems, particularly Hadoop
Use best practices for preparing Hadoop cluster hardware as securely as possible
Get an overview of the Kerberos network authentication protocol
Delve into authorization and accounting principles as they apply to Hadoop
Learn how to use mechanisms to protect data in a Hadoop cluster, both in transit and at rest
Integrate Hadoop data ingest into enterprise-wide security architecture
Ensure that security architecture reaches all the way to end-user access
Hadoop Security: A Brief History
Hadoop Components and Ecosystem
Chapter 2 Securing Distributed Systems
Threat and Risk Assessment
Defense in Depth
Chapter 3 System Architecture
Hadoop Roles and Separation Strategies
Operating System Security
Kerberos Workflow: A Simple Example
Authentication, Authorization, and Accounting
Chapter 5 Identity and Authentication
MapReduce and YARN Authorization
HBase and Accumulo Authorization
Chapter 7 Apache Sentry (Incubating)
The Sentry Service
Sentry Privilege Models
Sentry Policy Administration
HDFS Audit Logs
MapReduce Audit Logs
YARN Audit Logs
Hive Audit Logs
Cloudera Impala Audit Logs
HBase Audit Logs
Accumulo Audit Logs
Sentry Audit Logs
Chapter 9 Data Protection
Encrypting Data at Rest
Encrypting Data in Transit
Data Destruction and Deletion
Chapter 10 Securing Data Ingest
Integrity of Ingested Data
Data Ingest Confidentiality
Chapter 11 Data Extraction and Client Access Security
Ben is currently a Solutions Architect at Cloudera. During his time with Cloudera, he has worked in a consulting capacity to assist customers with their Hadoop deployments. Ben has worked with many Fortune 500 companies across multiple industries, including financial services, retail, and health care. His primary expertise is the planning, installation, configuration, and securing of customers' Hadoop clusters.
In addition to his consulting responsibilities, Ben writes a large share of the technical documentation delivered to customers, including Hadoop best practices, security integration, and cluster administration.
Prior to Cloudera, Ben worked for the National Security Agency and with a defense contractor as a software engineer. During this time, Ben built applications that, among other things, integrated with enterprise security infrastructure to protect sensitive information.
Ben holds a Bachelor’s degree in Computer Science and a Master’s degree in Information Technology, with a focus on Information Assurance. Ben’s final Master’s project was focused on designing an enterprise IT infrastructure with a defense-in-depth approach.
Joey Echeverria is a Software Engineer at ScalingData where he builds the next generation of IT Operations Analytics on the Apache Hadoop platform. Joey is also a committer on the Kite SDK, an Apache-licensed data API for the Hadoop ecosystem. Joey was previously a Software Engineer at Cloudera where he contributed to a number of ASF projects including Apache Flume, Apache Sqoop, Apache Hadoop, and Apache HBase.
While at Cloudera, Joey also served as the Director of Federal Field Technical Services, overseeing the public sector Professional Services and Systems Engineering teams. Joey started at Cloudera as a Solutions Architect, where he helped customers to design, develop, and deploy production Hadoop applications and clusters. When needed, he has also filled in for Cloudera’s support and training teams and has taught Cloudera’s administrator and Apache HBase courses.
Joey’s background is in building and deploying secure data processing applications, with the last 7 years focused on Hadoop-based applications. In the past, he has worked on resource-constrained data processing and a clustered implementation of the Snort intrusion detection system, and he built a distributed index system on Hadoop when he worked for the NSA.
The animal on the cover of Hadoop Security is a Japanese badger (Meles anakuma), in the same family as weasels. As its name suggests, it's endemic to Japan; it is found on Honshu, Kyushu, Shikoku, and Shodoshima.
Japanese badgers are small compared to their European counterparts. Males are about 31 inches in length and females are a little smaller at an average of 28 inches. Other than the size of their canine teeth, males and females don't differ much physically. Adults weigh about 8.8 to 17.6 pounds, and have blunt torsos with short limbs. The badger has powerful digging claws on its front feet and smaller hind feet. Though not as distinct as on the European badger, the Japanese badger has the characteristic black and white stripes on its face.
Japanese badgers are nocturnal and hibernate during the winter. Once females are two years old, they mate and give birth to litters of two or three cubs in the spring. Compared to their European counterparts, Japanese badgers are more solitary; mates don't form pair bonds.
Japanese badgers inhabit a variety of woodland and forest habitats, where they eat an omnivorous diet of worms, beetles, berries, and persimmons.
Many of the animals on O'Reilly covers are endangered; all of them are important to the world. To learn more about how you can help, go to animals.oreilly.com.
The cover image is from loose plates; the source is unknown. The cover fonts are URW Typewriter and Guardian Sans. The text font is Adobe Minion Pro; the heading font is Adobe Myriad Condensed; and the code font is Dalton Maag's Ubuntu Mono.
I was working on Hadoop Kerberos pass-through authentication when I discovered this book. This is exactly what I needed: comprehensive and complete information about Kerberos and Hadoop security, including detailed configurations. Before this book, I googled a lot and found different pieces of information scattered around the Apache, Cloudera, and Hortonworks sites, but they did not give a complete picture of how these pieces fit together. This book brings it all together. Very helpful.
One important part not covered is how to implement pass-through security programmatically. Of course, there is only a very small audience for this topic, as it is an advanced one. But it would help to complete the story.
Bottom Line: Yes, I would recommend this to a friend