Book description
A hands-on guide to executing an analytics project, from conceptualization to operationalization, using Greenplum
- Explore the software components and appliance modules available in Greenplum
- Learn core Big Data Architecture concepts and master data loading and processing patterns
- Understand Big Data problems and the Data Science lifecycle
In Detail
Organizations are using data and analytics to gain a competitive advantage, and as a result they are quickly becoming more data driven. With the advent of Big Data, existing Data Warehousing and Business Intelligence solutions are becoming obsolete, and the need for new, agile platforms that cover all aspects of Big Data has become unavoidable. From loading and integrating data to presenting analytical visualizations and reports, new Big Data platforms like Greenplum do it all. What now needs to change is the user's mindset, so that these solutions can be put to work.
"Getting Started with Greenplum for Big Data Analytics" is a practical, hands-on guide to learning and implementing Big Data Analytics using the Greenplum Integrated Analytics Platform. From processing structured and unstructured data to presenting the results/insights to key business stakeholders, this book explains it all.
"Getting Started with Greenplum for Big Data Analytics" discusses the key characteristics of Big Data and its impact on current Data Warehousing platforms. It will take you through the standard Data Science project lifecycle and will lay down the key requirements for an integrated analytics platform. It then explores the various software and appliance components of Greenplum and discusses the relevance of each component at every level in the Data Science lifecycle.
You will also learn Big Data architectural patterns and recap some key advanced analytics techniques in detail. The book looks at programming with R and its integration with Greenplum for implementing analytics, and explores MADlib and advanced SQL techniques in Greenplum. It also elaborates on the physical architecture of Greenplum, with guidance on handling high availability, backup, and recovery.
Table of contents
- Getting Started with Greenplum for Big Data Analytics
- Table of Contents
- Getting Started with Greenplum for Big Data Analytics
- Credits
- Foreword
- About the Author
- Acknowledgement
- About the Reviewers
- www.PacktPub.com
- Preface
- 1. Big Data, Analytics, and Data Science Life Cycle
- 2. Greenplum Unified Analytics Platform (UAP)
  - Big Data analytics – platform requirements
  - Greenplum Unified Analytics Platform (UAP)
  - Greenplum UAP components
  - Greenplum Data Computing Appliance (DCA)
  - Greenplum Data Integration Accelerator (DIA)
  - References/Further reading
  - Summary
- 3. Advanced Analytics – Paradigms, Tools, and Techniques
- 4. Implementing Analytics with Greenplum UAP
  - Data loading for Greenplum Database and HD
  - Greenplum table distribution and partitioning
  - Data Computing Appliance (DCA)
  - Greenplum Database management
  - In-database analytics options (Greenplum-specific)
  - Using R with Greenplum
  - Using Weka with Greenplum
  - Using MADlib with Greenplum
  - Using Greenplum Chorus
  - Pivotal
  - References/Further reading
  - Summary
- Index
Product information
- Title: Getting Started with Greenplum for Big Data Analytics
- Author(s):
- Release date: October 2013
- Publisher(s): Packt Publishing
- ISBN: 9781782177043