Designing Voice User Interfaces

Book description

Voice user interfaces (VUIs) are rapidly growing in popularity. But how do you build one that people can actually converse with? Whether you’re designing a mobile app, a toy, or a device such as a home assistant, this practical book guides you through basic VUI design principles, helps you choose the right speech recognition engine, and shows you how to measure and improve your VUI’s performance.

Author Cathy Pearl also takes product managers, UX designers, and VUI designers into advanced design topics that will help make your VUI not just functional, but great.

  • Understand key VUI design concepts, including command-and-control and conversational systems
  • Decide if you should use an avatar or other visual representation with your VUI
  • Explore speech recognition technology and its impact on your design
  • Take your VUI above and beyond the basic exchange of information
  • Learn practical ways to test your VUI application with users
  • Monitor your app and learn how to quickly improve performance
  • Get real-world examples of VUIs for home assistants, smartwatches, and car systems

Table of contents

  1. Dedication
  2. Praise for Designing Voice User Interfaces
  3. Preface
    1. Why Write This Book?
    2. The Chinese Room and the Turing Test
    3. Who Should Read This Book
    4. How This Book Is Organized
    5. O’Reilly Safari
    6. How to Contact Us
    7. Acknowledgments
  4. 1. Introduction
    1. A Brief History of VUIs
      1. The Second Era of VUIs
      2. Why Voice User Interfaces?
    2. Conversational User Interfaces
      1. An Interview with Alexa
    3. What Is a VUI Designer?
    4. Chatbots
    5. Conclusion
  5. 2. Basic Voice User Interface Design Principles
    1. Designing for Mobile Devices Versus IVR Systems
    2. Conversational Design
    3. Setting User Expectations
    4. Design Tools
      1. Sample Dialogs
      2. Visual Mock-Ups
      3. Flow
      4. Prototyping Tools
    5. Confirmations
      1. Method 1: Three-Tiered Confidence
      2. Method 2: Implicit Confirmation
      3. Method 3: Nonspeech Confirmation
      4. Method 4: Generic Confirmation
      5. Method 5: Visual Confirmation
    6. Command-and-Control Versus Conversational
      1. Command-and-Control
      2. Conversational
    7. Conversational Markers
    8. Error Handling
      1. No Speech Detected
      2. Speech Detected but Nothing Recognized
      3. Recognized but Not Handled
      4. Recognized but Incorrectly
      5. Escalating Error
    9. Don’t Blame the User
    10. Novice and Expert Users
    11. Keeping Track of Context
    12. Help and Other Universals
    13. Latency
    14. Disambiguation
    15. Design Documentation
      1. Prompts
      2. Grammars/Key Phrases
    16. Accessibility
      1. Interaction Should Be Time-Efficient
      2. Keep It Short
      3. Talk Faster!
      4. Interrupt Me at Any Time
      5. Provide Context
      6. Where Am I?
      7. Text-to-Speech Personalization
    17. Conclusion
  6. 3. Personas, Avatars, Actors, and Video Games
    1. Personas
    2. Should My VUI Be Seen?
    3. Using an Avatar: What Not to Do
    4. Using an Avatar (or Recorded Video): What to Do
      1. Storytelling
      2. Teamwork
      3. Video Games
    5. When Should I Use Video in My VUI?
    6. Visual VUI—Best Practices
      1. Should My Users See Themselves?
      2. What About the GUI?
      3. Handling Errors
      4. Turn Taking and Barge-In
      5. Maintaining Engagement and the Illusion of Awareness
    7. Visual (Non-Avatar) Feedback
    8. Choosing a Voice
    9. Pros of an Avatar
    10. The Downsides of an Avatar
      1. The Uncanny Valley
    11. Conclusion
  7. 4. Speech Recognition Technology
    1. Choosing an Engine
    2. Barge-In
      1. Timeouts
        1. End-of-speech timeout
        2. No speech timeout
        3. Too much speech
    3. N-Best Lists
    4. The Challenges of Speech Recognition
      1. Noise
      2. Multiple Speakers
      3. Children
      4. Names, Spelling, and Alphanumeric
    5. Data Privacy
    6. Conclusion
  8. 5. Advanced Voice User Interface Design
    1. Branching Based on Voice Input
      1. Constrained Responses
      2. Open Speech
      3. Categorization of Input
      4. Wildcards and Logical Expressions
    2. Disambiguation
      1. Not Enough Information
      2. More Than One Piece of Information When Only One Is Expected
    3. Handling Negation
    4. Capturing Intent and Objects
    5. Dialog Management
    6. Don’t Leave Your User Hanging
    7. Should the VUI Display What It Recognized?
    8. Sentiment Analysis and Emotion Detection
    9. Text-to-Speech Versus Recorded Speech
    10. Speaker Verification
    11. “Wake” Words
    12. Context
    13. Advanced Multimodal
    14. Bootstrapping Datasets
      1. Website data
      2. Call center data
      3. Data collection
    15. Advanced NLU
    16. Conclusion
  9. 6. User Testing for Voice User Interfaces
    1. Special VUI Considerations
    2. Background Research on Users and Use Cases
      1. Don’t Reinvent the Wheel
    3. Designing a Study with Real Users
      1. Task Definition
      2. Choosing Participants
      3. Questions to Ask
        1. Open responses (to be asked verbally)
      4. Things to Look For
    4. Early-Stage Testing
      1. Sample Dialogs
      2. Mock-ups
      3. Wizard of Oz Testing
      4. Difference Between WOz and Usability Testing
    5. Usability Testing
      1. Remote Testing
        1. Moderated versus unmoderated
        2. Video recording
        3. Services for remote testing
      2. Lab Testing
      3. Guerrilla Testing
    6. Performance Measures
    7. Next Steps
    8. Testing VUIs in Cars, Devices, and Robots
      1. Cars
      2. Devices and Robots
    9. Conclusion
  10. 7. Your Voice User Interface Is Finished! Now What?
    1. Prerelease Testing
      1. Dialog Traversal Testing
      2. Recognition Testing
      3. Load Testing
    2. Measuring Performance
      1. Task Completion Rates
      2. Dropout Rate
      3. Other Items to Track
        1. Amount of time in the VUI
        2. Barge-in
        3. Speech versus GUI
        4. High no-speech timeouts, no matches
        5. Navigation
        6. Latency
        7. Whole call recording
    3. Logging
    4. Transcription
    5. Release Phases
      1. Pilot
    6. Surveys
    7. Analysis
      1. Confidence Thresholds
      2. End-of-Speech Timeouts
      3. Interim Results versus Final Results
      4. Custom Dictionaries
      5. Prompts
    8. Tools
      1. Regression Testing
    9. Conclusion
  11. 8. Voice-Enabled Devices and Cars
    1. Devices
      1. Home Assistants
      2. Watches/Bands/Earbuds
      3. Other Devices
    2. Cars and Autonomous Vehicles
      1. Challenges of Designing VUI for the Car
      2. Designing for in the Car
      3. Distracted Driving
      4. Device Shifting
      5. Interaction Mode
      6. Conclusions on Cars
    3. Conclusion
  12. A. Epilogue
  13. B. Products Mentioned in This Book
    1. Mobile Phone Assistants
    2. Home Assistants
    3. Toys/Other
    4. Apps
    5. Video Games
    6. Watches / Bands
    7. Cars
  14. C. About the Author
  15. Index
  16. About the Author
  17. Colophon
  18. Copyright

Product information

  • Title: Designing Voice User Interfaces
  • Author(s): Cathy Pearl
  • Release date: December 2016
  • Publisher(s): O'Reilly Media, Inc.
  • ISBN: 9781491955369