R for Data Science
Import, Tidy, Transform, Visualize, and Model Data
By Hadley Wickham, Garrett Grolemund
Publisher: O'Reilly Media
Final Release Date: December 2016
Pages: 522

Learn how to use R to turn raw data into insight, knowledge, and understanding. This book introduces you to R, RStudio, and the tidyverse, a collection of R packages designed to work together to make data science fast, fluent, and fun. Suitable for readers with no previous programming experience, R for Data Science is designed to get you doing data science as quickly as possible.

Authors Hadley Wickham and Garrett Grolemund guide you through the steps of importing, wrangling, exploring, and modeling your data and communicating the results. You’ll get a complete, big-picture understanding of the data science cycle, along with basic tools you need to manage the details. Each section of the book is paired with exercises to help you practice what you’ve learned along the way.

You’ll learn how to:

Wrangle: transform your datasets into a form convenient for analysis

Program: learn powerful R tools for solving data problems with greater clarity and ease

Explore: examine your data, generate hypotheses, and quickly test them

Model: provide a low-dimensional summary that captures true "signals" in your dataset

Communicate: learn R Markdown for integrating prose, code, and results

Explore
Chapter 1 Data Visualization with ggplot2
    Introduction; First Steps; Aesthetic Mappings; Common Problems; Facets; Geometric Objects; Statistical Transformations; Position Adjustments; Coordinate Systems; The Layered Grammar of Graphics
Chapter 2 Workflow: Basics
    Coding Basics; What’s in a Name?; Calling Functions
Chapter 3 Data Transformation with dplyr
    Introduction; Filter Rows with filter(); Arrange Rows with arrange(); Select Columns with select(); Add New Variables with mutate(); Grouped Summaries with summarize(); Grouped Mutates (and Filters)
Chapter 4 Workflow: Scripts
    Running Code; RStudio Diagnostics
Chapter 5 Exploratory Data Analysis
    Introduction; Questions; Variation; Missing Values; Covariation; Patterns and Models; ggplot2 Calls; Learning More
Chapter 6 Workflow: Projects
    What Is Real?; Where Does Your Analysis Live?; Paths and Directories; RStudio Projects; Summary

Wrangle
Chapter 7 Tibbles with tibble
    Introduction; Creating Tibbles; Tibbles Versus data.frame; Interacting with Older Code
Chapter 8 Data Import with readr
    Introduction; Getting Started; Parsing a Vector; Parsing a File; Writing to a File; Other Types of Data
Chapter 9 Tidy Data with tidyr
    Introduction; Tidy Data; Spreading and Gathering; Separating and Pull; Missing Values; Case Study; Nontidy Data
Chapter 10 Relational Data with dplyr
    Introduction; nycflights13; Keys; Mutating Joins; Filtering Joins; Join Problems; Set Operations
Chapter 11 Strings with stringr
    Introduction; String Basics; Matching Patterns with Regular Expressions; Tools; Other Types of Pattern; Other Uses of Regular Expressions; stringi
Chapter 12 Factors with forcats
    Introduction; Creating Factors; General Social Survey; Modifying Factor Order; Modifying Factor Levels
Chapter 13 Dates and Times with lubridate
    Introduction; Creating Date/Times; Date-Time Components; Time Spans; Time Zones

Program
Chapter 14 Pipes with magrittr
    Introduction; Piping Alternatives; When Not to Use the Pipe; Other Tools from magrittr
Chapter 15 Functions
    Introduction; When Should You Write a Function?; Functions Are for Humans and Computers; Conditional Execution; Function Arguments; Return Values; Environment
Chapter 16 Vectors
    Introduction; Vector Basics; Important Types of Atomic Vector; Using Atomic Vectors; Recursive Vectors (Lists); Attributes; Augmented Vectors
Chapter 17 Iteration with purrr
    Introduction; For Loops; For Loop Variations; For Loops Versus Functionals; The Map Functions; Dealing with Failure; Mapping over Multiple Arguments; Walk; Other Patterns of For Loops

Model
Chapter 18 Model Basics with modelr
    Introduction; A Simple Model; Visualizing Models; Formulas and Model Families; Missing Values; Other Model Families
Chapter 19 Model Building
    Introduction; Why Are Low-Quality Diamonds More Expensive?; What Affects the Number of Daily Flights?; Learning More About Models
Chapter 20 Many Models with purrr and broom
    Introduction; gapminder; List-Columns; Creating List-Columns; Simplifying List-Columns; Making Tidy Data with broom

Communicate
Chapter 21 R Markdown
    Introduction; R Markdown Basics; Text Formatting with Markdown; Code Chunks; Troubleshooting; YAML Header; Learning More
Chapter 22 Graphics for Communication with ggplot2
    Introduction; Label; Annotations; Scales; Zooming; Themes; Saving Your Plots; Learning More
Chapter 23 R Markdown Formats
    Introduction; Output Options; Documents; Notebooks; Presentations; Dashboards; Interactivity; Websites; Other Formats; Learning More
Chapter 24 R Markdown Workflow

Title: R for Data Science
By: Hadley Wickham, Garrett Grolemund
Publisher: O'Reilly Media
Formats: Print

Safari Books Online
Pages: 522
Print ISBN: 978-1-4919-1039-9 | ISBN 10: 1-4919-1039-9
Ebook ISBN: 978-1-4919-1033-7 | ISBN 10: 1-4919-1033-X

Hadley Wickham

Hadley Wickham is an Assistant Professor and the Dobelman Family Junior Chair in Statistics at Rice University. He is an active member of the R community, has written and contributed to over 30 R packages, and won the John Chambers Award for Statistical Computing for his work developing tools for data reshaping and visualization. His research focuses on how to make data analysis better, faster, and easier, with a particular emphasis on the use of visualization to better understand data and models.

Garrett Grolemund

Garrett Grolemund is a statistician, teacher, and R developer who currently works for RStudio. He sees data analysis as a largely untapped fountain of value for both industry and science. Garrett received his Ph.D. at Rice University in Hadley Wickham's lab, where his research traced the origins of data analysis as a cognitive process and identified how attentional and epistemological concerns guide every data analysis.



Garrett is passionate about helping people avoid the frustration and unnecessary learning he went through while mastering data analysis. Even before he finished his dissertation, he started teaching corporate training courses in R and data analysis for Revolution Analytics. He has taught at Google, eBay, Acxiom, and many other companies, and is currently developing a training curriculum for RStudio that will make useful know-how even more accessible.



Outside of teaching, Garrett spends time doing clinical trials research, legal research, and financial analysis. He also develops R software: he co-authored the lubridate R package, which provides methods to parse, manipulate, and do arithmetic with date-times, and wrote the ggsubplot package, which extends the ggplot2 package.

Colophon

The animal on the cover of R for Data Science is the kakapo (Strigops habroptilus). Also known as the owl parrot, the kakapo is a large flightless bird native to New Zealand. Adult kakapos can grow up to 64 centimeters in height and 4 kilograms in weight. Their feathers are generally yellow and green, although there is significant variation between individuals. Kakapos are nocturnal and use their robust sense of smell to navigate at night. Although they cannot fly, kakapos have strong legs that enable them to run and climb much better than most birds.

The name kakapo comes from the language of the native Maori people of New Zealand. Kakapos were an important part of Maori culture, both as a food source and as a part of Maori mythology. Kakapo skin and feathers were also used to make cloaks and capes.

Due to the introduction of predators to New Zealand during European colonization, kakapos are now critically endangered, with fewer than 200 individuals currently living. The government of New Zealand has been actively attempting to revive the kakapo population by providing special conservation zones on three predator-free islands.