Table of Contents

  1. Chapter 1 So Secure It’s Lost

    1. Safe Access in Secure Big Data Systems

  2. Chapter 2 The Challenge: Sharing Data Safely

    1. Surprising Outcomes with Anonymity

    2. The Netflix Prize

    3. Unexpected Results from the Netflix Contest

    4. Implications of Breaking Anonymity

    5. Be Alert to the Possibility of Cross-Reference Datasets

    6. New York Taxicabs: Threats to Privacy

    7. Sharing Data Safely

  3. Chapter 3 Data on a Need-to-Know Basis

    1. Views: A Secure Way to Limit What Is Seen

    2. Why Limit Access?

    3. Apache Drill Views for Granular Security

    4. How Views Work

    5. Summary of Need-to-Know Methods

  4. Chapter 4 Fake Data Gives Real Answers

    1. The Surprising Thing About Fake Data

    2. Keep It Simple: log-synth

    3. Log-synth Use Case 1: Broken Large-Scale Hive Query

    4. Log-synth Use Case 2: Fraud Detection Model for Common Point of Compromise

    5. Summary: Fake Data and log-synth to Safely Work with Secure Data

  5. Chapter 5 Fixing a Broken Large-Scale Query

    1. A Description of the Problem

    2. Determining What the Synthetic Data Needed to Be

    3. Schema for the Synthetic Data

    4. Generating the Synthetic Data

    5. Tips and Caveats

    6. What to Do from Here?

  6. Chapter 6 Fraud Detection

    1. What Is Really Important?

    2. The User Model

    3. Sampler for the Common Point of Compromise

    4. How the Breach Model Works

    5. Results of the Entire System Together

    6. Handy Tricks

    7. Summary

  7. Chapter 7 A Detailed Look at log-synth

    1. Goals

    2. Maintaining Simplicity: The Role of JSON in log-synth

    3. Structure

    4. Sampling Complex Values

    5. Structuring and De-structuring Samplers

    6. Extending log-synth

    7. Using log-synth with Apache Drill

    8. Choice of Data Generators

    9. R Is for Random

    10. Benchmark Systems

    11. Probabilistic Programming

    12. Differential Privacy Preserving Systems

    13. Future Directions for log-synth

  8. Chapter 8 Sharing Data Safely: Practical Lessons

  9. Appendix Additional Resources

    1. Log-synth Open Source Software

    2. Apache Drill and Drill SQL Views

    3. General Resources and References