Strata Conference Santa Clara 2014: Complete Video Compilation

Video description

Gain a clear perspective on the future of big data—and all the analytics, architectures, techniques, tools, and technologies you need to use data successfully. With this complete video compilation, you’ll get a front-row seat to the keynotes, workshops, and sessions at O’Reilly’s Strata Conference Santa Clara 2014. You can download these videos or stream them through our HD player.

Table of contents

  1. Tutorials
    1. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 1
    2. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 2
    3. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 3
    4. Introduction to Machine Learning with IPython and scikit-learn - Olivier Grisel - Part 4
    5. IPython In Depth - Brian Granger and Fernando Pérez - Part 1
    6. IPython In Depth - Brian Granger and Fernando Pérez - Part 2
    7. IPython In Depth - Brian Granger and Fernando Pérez - Part 3
    8. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 1
    9. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 2
    10. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 3
    11. Building a Data Platform - John Akred, Richard Williamson, and Stephen O'Sullivan - Part 4
    12. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 1
    13. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 2
    14. Design Thinking for Dummies (Data Scientists) - Michael Stringer, Dean Malmgren, and Laurie Skelly - Part 3
    15. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 1
    16. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 2
    17. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 3
    18. Dissecting Data Science Algorithms using Spreadsheets - John Foreman - Part 4
    19. Introduction to Hadoop 2.0 - Rich Raposa - Part 1
    20. Introduction to Hadoop 2.0 - Rich Raposa - Part 2
    21. Introduction to Hadoop 2.0 - Rich Raposa - Part 3
    22. Introduction to Hadoop 2.0 - Rich Raposa - Part 4
    23. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 1
    24. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 2
    25. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 3
    26. Large-scale Machine Learning Cookbook using GraphLab - Carlos Guestrin - Part 4
    27. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 1
    28. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 2
    29. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 3
    30. From Scattered to Scatterplots: An Introduction to d3.js - Scott Murray - Part 4
    31. Effective Data Science With Scalding - Vitaly Gordon - Part 1
    32. Effective Data Science With Scalding - Vitaly Gordon - Part 2
    33. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 1
    34. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 2
    35. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 3
    36. Big Data Workflows on Mesos Clusters - Florian Leibert, Paco Nathan, and Benjamin Hindman - Part 4
    37. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 1
    38. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 2
    39. Adviser: Learning How to get A Second Opinion on Your Analysis when it's Important to get it Right - Leland Wilkinson - Part 3
    40. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 1
    41. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 2
    42. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 3
    43. Building Real-Time Apps with Apache HBase - Ronan Stokes - Part 4
    44. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 1
    45. Data Transformation: Skills of the Agile Data Wrangler - Joe Hellerstein and Jeffrey Heer - Part 2
  2. Hardcore Data Science
    1. Hardcore Data Science Opening Remarks - Ben Lorica
    2. Extreme Machine Learning - Alexander Gray
    3. What the #@)*$ is Big Data? A Holistic View of Data and Algorithms - Alice Zheng
    4. Overcoming the Barriers to Production-Ready Machine-Learning Workflows - Henrik Brink and Joshua Bloom
    5. Anomaly Detection - Ted Dunning
    6. Neural Networks for Machine Perception - Ilya Sutskever
    7. The Predictive Business - Kira Radinsky
    8. Can We Make Big Data Management Easier? - Magda Balazinska
    9. Design Challenges for Real Predictive Platforms - Max Gasner
    10. Machine Learning Gremlins - Ben Hamner
    11. Algebra for Scalable Analytics - Oscar Boykin
  3. Data-Driven Business Day
    1. Introduction to Data-Driven Business Day - Alistair Croll
    2. Those Numbers Won't Measure Themselves - Farrah Bostic
    3. Social Data Intelligence: Integrating Social and Enterprise Data for Competitive Advantage - Susan Etlinger
    4. Open Data: It's Not Just for Governments - Jen van der Meer
    5. The Insight Economy - Krista Schnell
    6. 9 Levers for Converting Big Data and Analytics into Results - Christy Maver
    7. Deploying a Data Sciences Team -- The Promise and the Pitfalls - Diane Chang
    8. Sensing Best Practices - Ben Waber
    9. Leveraging Value from Open Data Through Collaboration - Peter Pirnejad
    10. Becoming a Learning Organization: From Data Teams to Corporate Influence - Pamela Peele
    11. Making Big Data Small - Baron Schwartz
    12. Big Data Meets Big Infrastructure: Going Underground in One Major European City - Narendra Mulani
    13. The Era of Data-Powered Government - Beth Blauer
    14. TripIt Uses Data to Organize Itineraries, No Matter Where You Book - Edith Harbaugh
  4. Keynotes
    1. Crossing the Chasm: What's New, What's Not - Geoffrey Moore
    2. Evolution from Apache Hadoop to the Enterprise Data Hub - Amr Awadallah
    3. Collecting Massive Data via Crowdsourcing - John Schitka
    4. Empowering Personalized Learning with Big Data - Ramona Pierson
    5. Hadoop in 5 Minutes or Less - John Schroeder
    6. People are Data Too - Farrah Bostic
    7. Bringing Big Data to One Billion People - Quentin Clark
    8. Small Data in Sports: Little Differences that Mean Big Outcomes - David Epstein
    9. The Art of Good Practice - Rodney Mullen
    10. Big Data Moonshots and Ground Control - Joe Hellerstein and Tutti Taygerly
    11. Data Science and Smart Systems: Creating the Digital Brain - Kaushik Das
    12. How Companies are Using Spark, and Where the Edge in Big Data Will Be - Matei Zaharia
    13. In-Hadoop Analytics: Bringing analytics to big data - Anjul Bhambhri
    14. Record Linkage and Other Statistical Models for Quantifying Conflict Casualties in Syria - Megan Price
    15. Ben Fry Keynote
    16. Survivorship Bias and the Psychology of Luck - David McRaney
  5. Sessions
    1. Apache Hadoop and the Emergence of the Enterprise Data Hub - Eli Collins
    2. Information Visualization for Large-Scale Data Workflows - Michael Conover
    3. Adaptive Adversaries: Building Systems to Fight Fraud and Cyber Intruders - Ari Gesher
    4. Fighting Global Cybercrime and BotNets using Big Data - Bryan Hurd and Herain Oberoi
    5. Navigating the Big Data Vendor Landscape - Edd Dumbill
    6. Best Practices for Hadoop In Production - Panel Discussion Facilitated by Forrester Analyst - Mike Gualtieri
    7. Thorn in the Side of Big Data: Too Few Artists - Chris Ré
    8. 10,000: The Most Dangerous Number in Sports - David Epstein
    9. You're Halfway There: Moving from Insight to Action - Bob Filbin
    10. Building the Next Generation Data Architecture with Hadoop, Data Warehouse Data Discovery Platform - Bill Franks
    11. Minority Report Meets Big Data: Touch and Interactive Big Data is Here - Justin Langseth and Eva Andreasson
    12. Machine Learning for Social Change - Fernand Pajot
    13. Harness Data in Real-Time with Infinite Storage - Yuvaraj Athur Raghuvir
    14. You Don't Need to Boil the Big Data Ocean with Hadoop - Ben Werther and Sanjay Mathur
    15. Predictive Modeling in the Cloud with Scikit-learn and IPython - Olivier Grisel
    16. Mining Student Notes in Real Time to Provide Study Guides - Perry Samson
    17. Thinking with Data - Max Shron
    18. Building a Data-centered Data Center for Agile Development - Justin Makeig
    19. Evolving Data Governance for the Big Data Enterprise - Scott Lee and Rachel Haines
    20. Making Big Data Cost Effective in a Bare Metal Cloud - Harold Hannon
    21. How Evernote Does Conversion Using Hadoop Analytics - Damon Cool
    22. Crowdsourcing at Locu: How I Learned to Stop Worrying and Love the Crowd - Adam Marcus
    23. Building a Lightweight Discovery Interface for Chinese Patents - Eric Pugh
    24. Superconductor: Scaling Charts with Design and GPUs - Leo Meyerovich
    25. Break Down Data Silos with Apache Accumulo - Adam Fuchs
    26. Organizing Big Data with the Crowd - Lukas Biewald
    27. Scalable PostgreSQL as your data platform - Ben Redman
    28. Unlocking the Secrets of Gertrude Stein - Ian Timourian
    29. A Different Look at Data and Security - Learning to Live with Fear - Pablos Holman
    30. Stand Back, I'm Going To Try Science! - Rachel Poulsen and John Akred
    31. Collaborative Advanced Analytics For Big Data - Bruno Aziza
    32. Network Science Made Simple: SNA for Pie Chart Makers - Marc Smith
    33. How Twitter Monitors Millions of Time-series - Yann Ramin
    34. Harvard's Clean Energy Project: Big Data Maps To Renewable Energy - Kai Trepte
    35. Working With Time Series Data Using Apache Cassandra - Patrick McFadin
    36. Friending Graph Analytics: Large-Scale Graph Processing Made Easy - Ted Willke
    37. Transforming Search Engine Marketing at Ask.com - Mohit Sati
    38. Music Videos and Gastronomification for Big Data Analysis - Brian Abelson and Thomas Levine
    39. Soylent Mean: Data Science is Made of People - Cameran Hetrick and Kimberly Stedman
    40. Big Data: Beyond Bare-Metal? - Mike Wendt
    41. Secrets of Apache Hive Queries and UDFs - Shrikanth Shankar
    42. Twitter and HP HAVEn: The Big Data Big Picture - Sanjay Goil
    43. Data Science How to Build and Deploy a Team of Data Scientists - Diane Chang, Steven Hillion, Nick Kolegraff, and Matthew Gee
    44. The Netflix Data Platform - A Recipe for High Business Impact - Kurt Brown
    45. Bedtime Stories: Learning from Sleep Data - Monica Rogati
    46. Tracking a Soccer Game with Big Data - Srinath Perera
    47. Data Transformation: A User-Centric Approach to Accessing and Analyzing Big Data - Joe Hellerstein
    48. Apache Hadoop 2.0: Migration from 1.0 to 2.0 - Vinod Kumar Vavilapalli
    49. Getting a Handle on Hadoop and its Potential to Catalyze a New Information Architecture Model - Milan Vaclavik
    50. The Sidekick Pattern: Using Small Data to Increase the Value of Big Data - Abe Gong
    51. Exascale Data Analytics @ Facebook - Sambavi Muthukrishnan
    52. Sending Millions of Surveys Around the World on Mobile Phones - Max Richman
    53. Business Data Lake: An Evolution in Data Infrastructure - Jeffrey Kelly, Steven Hirsch, Steve Jones, and Sabrina Dahlgren
    54. Expressing Yourself in R - Hadley Wickham
    55. Data Journalism - Organized Crime and Corruption Reporting - Drew Sullivan
    56. The Inflection Point - Hadoop and Big Data Analytics - Anjul Bhambhri
    57. Spreadsheets: The Dark Matter of Big Data - Felienne Hermans
    58. Scale-Invariant Intelligence - Vin Sharma
    59. Probabilistic Programming: What, Why, How, and When - Beau Cronin
    60. Beyond Hadoop MapReduce: Interactive Advertising Insights with Shark @ Yahoo! - Nandu Jayakumar and Tim Tully
    61. Machine Learning for Machine Data - David Andrzejewski - Part 1
    62. Machine Learning for Machine Data - David Andrzejewski - Part 2
    63. Lessons from the Trenches: edo Interactive Leverages Hadoop to Build Customer Loyalty - Rob Rosen and Tim Garnto
    64. The IPython Notebook: Get Close to Your Data with Python and JavaScript - Brian Granger
    65. Government Data on Both Sides of the Bridge - Moderated by: Jesse Robbins - Panelists: Shannon Spanhake and Eddie Tejeda
    66. Enabling Business Transformation with Analytics over Real-time Streaming Data - Anand Venugopal and Pranay Tonpay
    67. The Next Wave of SQL-on-Hadoop: Building a Virtual EDW on Native Hadoop Data - Marcel Kornacker
    68. How Comcast Turns Big Data into Real-Time Operational Insights - Patrick Shumate
    69. Chicago Bars, Prisoner's Dilemma, and Practical Models in Search - Chris Harland
    70. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 1
    71. Big Industrial Internet Data: Connecting and Optimizing at New Scales - Steven Gustafson and Parag Goradia - Part 2
    72. FAST and FURIOUS Big Data Analytics Meets Hadoop - Wayne Thompson and Paul Kent
    73. The Urgent Need to Appify Big Data - Ryan Cunningham
    74. Unboxing Data Startups - Michael Abbott
    75. Apache Hive Stinger: Petabyte Scale SQL, IN Hadoop - Owen O'Malley and Alan Gates
    76. Querying Petabytes of Data in Seconds - Reynold Xin and Sameer Agarwal
    77. The Need for Speed Scale: A Database for Real-Time Analytics - Eric Frenkiel
    78. Graph All The Things! 11: Graph Database Use Cases That Aren't Social - Emil Eifrem
    79. Graph Analysis with One Trillion Edges on Apache Giraph - Avery Ching
    80. Big Data for Big Power: Smart Meters does not mean Smart Grids - Brett Sargent
    81. The Last Mile: Challenges and Opportunities in Data Tools - Wes McKinney
    82. Are We Data Scientists or Data Janitors? - Nenshad Bardoliwalla
    83. Session with Ben Fry
    84. Data for Good - Moderated by: Jake Porway - Panelists: Drew Conway, Rayid Ghani, and Elena Eneva
    85. NonStop HBase - Making HBase Continuously Available for Enterprise Deployment - Jagane Sundar
    86. Apache Mesos as an SDK for Building Distributed Frameworks - Paco Nathan
    87. Agile Analytics - Neal Ford
    88. Socializing Search. Professionally. - Sriram Sankar and Daniel Tunkelang
    89. Big Data for Better Data Centers - Krishna Raj Raja and Balaji Parimi
    90. One Size Does Not Fit All: Analyzing Data at Scale with AWS - Rahul Pathak
    91. Making Choices: What Kind of Relationship are You Seeking with Your Database? - J.R. Arredondo
    92. StatusWolf: Creating Dashboards That Don't Suck Using Art and Engineering - Mark Troyer
    93. Real-Time Analytics with NewSQL: Why Hadoop is not enough - Raj Bains
    94. MLbase: Distributed Machine Learning Made Easy - Ameet Talwalkar and Evan Sparks
    95. Real-time Analytics with Open Source Technologies - Fangjin Yang and Gian Merlino

Product information

  • Title: Strata Conference Santa Clara 2014: Complete Video Compilation
  • Author(s):
  • Release date: March 2014
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 978149190031