Book description
Over the last 20 years, companies have invested roughly $3-4 trillion in enterprise software. These investments have focused primarily on developing and deploying single systems, applications, functions, and geographies aimed at automating and optimizing key business processes. Companies are now investing heavily in big data analytics ($44 billion in 2014 alone) in an effort to analyze all of the data generated by their process automation systems. But companies are quickly realizing that one of their key bottlenecks is Data Variety: the siloed nature of data that naturally results from the proliferation of internal and external sources.
The problem of big data variety has crept up from the bottom, and its cost is appreciated only when companies attempt to ask simple questions across many business silos (divisions, geographies, functions, etc.). Current top-down, deterministic data unification approaches (such as ETL, ELT, and MDM) were simply not designed to scale to the variety of hundreds, thousands, or even tens of thousands of data silos.
Download this free eBook to learn about the fundamental challenges that Data Variety poses to enterprises looking to maximize the value of their existing investments, and how new approaches promise to help organizations embrace and leverage the fundamental diversity of data. Readers will also find best practices for designing bottom-up and probabilistic methods for finding and managing data; principles for doing data science at scale in the big data era; guidance on preparing and unifying data in ways that complement existing systems; advice on optimizing data warehousing; and an explanation of how to use “data ops” to automate large-scale integration.
Table of contents
- Introduction
- 1. The Solution: Data Curation at Scale
- 2. An Alternative Approach to Data Management
- 3. Pragmatic Challenges in Building Data Cleaning Systems
- 4. Understanding Data Science: An Emerging Discipline for Data-Intensive Discovery
- 5. From DevOps to DataOps
- 6. Data Unification Brings Out the Best in Installed Data Management Strategies
Product information
- Title: Getting Data Right
- Author(s):
- Release date: September 2015
- Publisher(s): O'Reilly Media, Inc.
- ISBN: 9781491935316