Despite the excitement around "data science," "big data," and "analytics," the ambiguity of these terms has led to poor communication between data scientists and organizations seeking their help. In this report, authors Harlan Harris, Sean Murphy, and Marck Vaisman examine their survey of several hundred data science practitioners in mid-2012, when they asked respondents how they viewed their skills, careers, and experiences with prospective employers. The results are striking.
Based on the survey data, the authors found that data scientists today can be clustered into four subgroups, each with a different mix of skillsets. Their purpose is to identify a new, more precise vocabulary for data science roles, teams, and career paths.
This report describes:
Four data scientist clusters: Data Businesspeople, Data Creatives, Data Developers, and Data Researchers
Cases in miscommunication between data scientists and organizations looking to hire
Why "T-shaped" data scientists have an advantage in breadth and depth of skills
How organizations can apply the survey results to identify, train, integrate, team up, and promote data scientists
Chapter 1 Introduction
Chapter 2 Case Studies in Miscommunication
Rock Stars and Gods
Apples and Oranges
Chapter 3 A Survey of, and About, Professionals
Clustering Data Scientists
The Variety of Data Scientists
Chapter 4 T-Shaped Data Scientists
Evidence for T-Shaped Data Scientists
Chapter 5 Data Scientists and Organizations
Where Data People Come From: Science vs. Tools Education
From Theory to Practice: Internships and Mentoring
Harlan D. Harris is a Senior Data Scientist at Kaplan Test Prep, the Co-Founder and Co-Organizer of the Data Science DC Meetup, and the Co-Founder and President of Data Community DC, Inc. He has a PhD in Computer Science (Machine Learning) from the University of Illinois at Urbana-Champaign and worked as a researcher in several Psychology departments before turning to industry.
Sean Patrick Murphy, with degrees in mathematics, electrical engineering, and biomedical engineering and an MBA from Oxford University, has served as a senior scientist at the Johns Hopkins Applied Physics Laboratory for the past ten years. Previously, he served as the Chief Data Scientist at WiserTogether, a series A funded health care analytics firm, and the Director of Research at Manhattan Prep, a boutique graduate educational company. He was also the co-founder and CEO of a big data-focused startup: CloudSpree.
Marck Vaisman is a data scientist, consultant, entrepreneur, master munger, and hacker. Marck is the Principal Data Scientist at DataXtract, LLC, helping clients from start-ups to Fortune 500 firms with all kinds of data science projects. His professional experience spans the management consulting, telecommunications, Internet, and technology industries. He is the co-founder of Data Community DC, Inc. and co-organizer of the Data Science DC and R Users DC meetup groups. He has an MBA from Vanderbilt University and a B.S. in Mechanical Engineering from Boston University. Marck is also a contributing author of The Bad Data Handbook.
This book helps data scientists, and those who want to become one, position themselves. These days everybody talks about data science and data scientists, but these concepts are rarely elaborated. If you are wondering whether you can become a data scientist, or do not know exactly what to do, this is the book for you.
The book clusters survey respondents by their skills and their views of themselves, and proposes several groups, each with its corresponding skills and the breadth and depth of those skills.
I found this book easy to read, story-based, and entertaining, so everyone can enjoy it and learn from it. The main problem the book raises is an easy-looking but often neglected one: as mentioned at the beginning of the book, organizations still struggle to put data science into action, and employees struggle to respond to different and sometimes confusing expectations.
Bottom line: Yes, I would recommend this to a friend.