Neo4j High Performance

Book description

Design, build, and administer scalable graph database systems for your applications using Neo4j

  • Explore the numerous components that provide abstractions for almost any functionality you need from your persistent graphs

  • Familiarize yourself with testing using the GraphAware framework, along with working in High Availability mode

  • Gain insight into the internal workings of Neo4j and learn about some useful tools, administrative configurations, and security tweaks built for it

    In Detail

    This book provides insight into working with Neo4j: deploying and configuring it, optimizing data models, and using storage for better performance.

    This book covers all aspects of working with Neo4j, including querying, indexing, modeling graph data, and testing and deploying your Neo4j applications. It also shows you the internal features of the Neo4j graph database. With a sample demonstration and an outline of community-developed tools, this book will help you develop cutting-edge, high-performance, and secure applications for complex data using the Neo4j graph database.

    Table of contents

    1. Neo4j High Performance
      1. Table of Contents
      2. Neo4j High Performance
      3. Credits
      4. About the Author
      5. About the Reviewers
      6. www.PacktPub.com
        1. Support files, eBooks, discount offers, and more
          1. Why subscribe?
          2. Free access for Packt account holders
      7. Preface
        1. What this book covers
        2. What you need for this book
        3. Who this book is for
        4. Conventions
        5. Reader feedback
        6. Customer support
          1. Downloading the example code
          2. Errata
          3. Piracy
          4. Questions
      8. 1. Getting Started with Neo4j
        1. Graphs and their utilities
          1. Introducing NoSQL databases
          2. Dynamic schemas
          3. Automatic sharding
          4. Built-in caching
          5. Replication
        2. Types of NoSQL databases
          1. Key-value stores
          2. Column family stores
          3. Document databases
          4. Graph databases
          5. Graph compute engines
        3. The Neo4j graph database
          1. ACID compliance
          2. Characteristics of Neo4j
          3. The basic CRUD operations
        4. The Neo4j setup and configurations
          1. Modes of setup – the embedded mode
          2. Modes of setup – the server mode
          3. Neo4j high availability
            1. Machine #1 – neo4j-01.local
            2. Machine #2 – neo4j-02.local
            3. Machine #3 – neo4j-03.local
        5. Configure Neo4j for Amazon clusters
        6. Cloud deployment with Azure
        7. Summary
      9. 2. Querying and Indexing in Neo4j
        1. The Neo4j interface
          1. Running Cypher queries
          2. Visualization of results
        2. Introduction to Cypher
        3. Cypher graph operations
          1. Cypher clauses
          2. More useful clauses
        4. Advanced Cypher tricks
          1. Query optimizations
          2. Graph model optimizations
        5. Gremlin – an overview
        6. Indexing in Neo4j
          1. Manual and automatic indexing
          2. Schema-based indexing
          3. Indexing benefits and trade-offs
        7. Migration techniques for SQL users
          1. Handling dual data stores
          2. Analyzing the model
          3. Initial import
          4. Keeping data in sync
          5. The result
        8. Useful code snippets
          1. Importing data to Neo4j
          2. Exporting data from Neo4j
        9. Summary
      10. 3. Efficient Data Modeling with Graphs
        1. Data models
          1. The aggregated data model
          2. Connected data models
        2. Property graphs
        3. Design constraints in Neo4j
        4. Graph modeling techniques
          1. Aggregation in graphs
          2. Graphs for adjacency lists
          3. Materialized paths
          4. Modeling with nested sets
          5. Flattening with ordered field names
        5. Schema design patterns
          1. Hyper edges
          2. Implementing linked lists
          3. Complex similarity computations
          4. Route generation algorithms
        6. Modeling across multiple domains
        7. Summary
      11. 4. Neo4j for High-volume Applications
        1. Graph processing
        2. Big data and graphs
        3. Processing with Hadoop or Neo4j
        4. Managing transactions
          1. Deadlock handling
          2. Uniqueness of entities
          3. Events for transactions
        5. The graphalgo package
        6. Introduction to Spring Data Neo4j
        7. Summary
      12. 5. Testing and Scaling Neo4j Applications
        1. Testing Neo4j applications
        2. Unit testing
          1. Using the Java API
          2. GraphUnit-based unit testing
            1. Unit testing an embedded database
            2. Unit testing a Neo4j server
        3. Performance testing
        4. Benchmarking performance with Gatling
        5. Scaling Neo4j applications
        6. Summary
      13. 6. Neo4j Internals
        1. Introduction to Neo4j internals
        2. Working of your code
          1. Node and relationship management
          2. Implementation specifics
        3. Storage for properties
          1. The storage structure
          2. Migrating to the new storage
        4. Caching internals
        5. Cache types
          1. AdaptiveCacheManager
        6. Transactions
          1. The Write Ahead log
          2. Detecting deadlocks
            1. RWLock
            2. RAGManager
            3. LockManager
          3. Commands
        7. High availability
          1. HA and the need for a master
          2. The master election
        8. Summary
      14. 7. Administering Neo4j
        1. Interfacing with the tools and frameworks
          1. Using Neo4j for PHP developers
          2. The JavaScript Neo4j adapter
          3. Neo4j with Python
        2. Admin tricks
          1. Server configuration
          2. JVM configurations
          3. Caches
        3. Memory mapped I/O configuration
          1. Traversal speed optimization example
          2. Batch insert example
        4. Neo4j server logging
          1. Server logging configurations
          2. HTTP logging configurations
          3. Garbage collection logging
          4. Logical logs
          5. Open file size limit on Linux
        5. Neo4j server security
          1. Port and remote connection security
          2. Support for HTTPS
            1. Server authorization rules
              1. Setup server authorization rules enforcement
              2. Security rules targeting with wildcards
            2. Other security options
        6. Summary
      15. 8. Use Case – Similarity-based Recommendation System
        1. The why and how of recommendations
          1. Collaborative filtering
          2. Content-based filtering
          3. The hybrid approach
        2. Building a recommendation system
        3. Recommendations on map data
        4. Visualization of graphs
        5. Summary
      16. Index

    Product information

    • Title: Neo4j High Performance
    • Author(s): Sonal Raj
    • Release date: March 2015
    • Publisher(s): Packt Publishing
    • ISBN: 9781783555154