The single most authoritative guide on the most difficult phase of building a data warehouse
The extract, transform, and load (ETL) phase of the data warehouse development life cycle is far and away the most difficult, time-consuming, and labor-intensive phase of building a data warehouse. Done right, companies can maximize their use of data storage; if not, they can end up wasting millions of dollars storing obsolete and rarely used data. Bestselling author Ralph Kimball, along with Joe Caserta, shows you how a properly designed ETL system extracts the data from the source systems, enforces data quality and consistency standards, conforms the data so that separate sources can be used together, and finally delivers the data in a presentation-ready format.
Serving as a road map for planning, designing, building, and running the back room of a data warehouse, this book provides complete coverage of proven, timesaving ETL techniques. Beginning with a quick overview of ETL fundamentals, it then looks at ETL data structures, both relational and dimensional. The authors show how to build useful dimensional structures, providing practical examples of techniques.
Along the way you'll learn how to:
- Plan and design your ETL system
- Choose the appropriate architecture from the many possible options
- Build the development/test/production suite of ETL processes
- Build a comprehensive data cleaning subsystem
- Tune the overall ETL process for optimum performance