The Enterprise Big Data Lake
Delivering the Promise of Big Data and Data Science
Publisher: O'Reilly Media
Release Date: March 2019
Pages: 224
Enterprises are experimenting with using Hadoop to build Big Data Lakes, but many projects are stalling or failing because the approaches that worked at Internet companies have to be adapted for the enterprise. This practical handbook guides managers and IT professionals from the initial research and decision-making process through planning, choosing products, and implementing, maintaining, and governing the modern data lake.
You'll explore various approaches to starting and growing a Data Lake, including Data Warehouse off-loading, analytical sandboxes, and "Data Puddles." Author Alex Gorelik shows you methods for setting up different tiers of data, from raw untreated landing areas to carefully managed and summarized data. You'll learn how to enable self-service to help users find, understand, and provision data; how to provide different interfaces to users with different skill levels; and how to do all of that in compliance with enterprise data governance policies.
Table of Contents
- Chapter 1 A Historical Perspective
- Chapter 2 Introduction to Big Data and Data Science
- Chapter 3 The Enterprise Data Lake
- Chapter 4 Starting a Data Lake
- Chapter 5 Architecting a Data Lake and Fitting It into the Enterprise Data Ecosystem
- Chapter 6 Optimizing for Self-Service: Giving Power to the Analysts and Removing IT Bottlenecks
- Chapter 7 Governing the Data Lake
- Chapter 8 Integrating the Data Lake into the Enterprise
- Chapter 9 Evaluating and Choosing Technology Solutions for Processing, Ingestion, Management, and Governance
- Chapter 10 Industry-Specific Examples of Successful Data Lakes