Voice Communication Between Humans and Machines

  • Author: Wilpon, Jay G.; Roe, David B.
  • Publisher: National Academies Press
  • ISBN: 9780309049887
  • eISBN Pdf: 9780309556255
  • Place of publication: United States
  • Year of digital publication: 1994
  • Month: January
  • Pages: 559
  • Language: English

Science fiction has long been populated with conversational computers and robots. Now, speech synthesis and recognition have matured to the point where a wide range of real-world applications, from serving people with disabilities to boosting the nation's competitiveness, are within our grasp.

Voice Communication Between Humans and Machines takes the first interdisciplinary look at what we know about voice processing, where our technologies stand, and what the future may hold for this fascinating field. The volume integrates theoretical, technical, and practical views from world-class experts at leading research centers around the world, reporting on the scientific bases behind human-machine voice communication, the state of the art in computerization, and progress in user friendliness. It offers an up-to-date treatment of technological progress in key areas: speech synthesis, speech recognition, and natural language understanding.

The book also explores the emergence of the voice processing industry and specific opportunities in telecommunications and other businesses, in military and government operations, and in assistance for the disabled. It outlines, as well, practical issues and research questions that must be resolved if machines are to become fellow problem-solvers along with humans.

Voice Communication Between Humans and Machines provides a comprehensive understanding of the field of voice processing for engineers, researchers, and business executives, as well as speech and hearing specialists, advocates for people with disabilities, faculty and students, and interested individuals.

  • VOICE COMMUNICATION BETWEEN HUMANS AND MACHINES
  • Copyright
  • Acknowledgments
  • Contents
  • Voice Communication Between Humans and Machines—An Introduction
    • ELEMENTS OF VOICE PROCESSING TECHNOLOGY
    • VOICE CODING
    • VOICE SYNTHESIS
    • SPEECH RECOGNITION
    • SPEAKER RECOGNITION
    • SPOKEN LANGUAGE TRANSLATION
    • NATURAL LANGUAGE PROCESSING
    • COLLOQUIUM THEME
  • SCIENTIFIC BASES OF HUMAN-MACHINE COMMUNICATION BY VOICE
    • Scientific Bases of Human-Machine Communication by Voice
      • SUMMARY
      • INTRODUCTION
      • DIGITAL COMPUTATION AND MICROELECTRONICS
      • SPEECH ANALYSIS AND SYNTHESIS
      • SPEECH RECOGNITION AND UNDERSTANDING
      • USABILITY ISSUES
      • CONCLUSION
      • REFERENCES
    • The Role of Voice in Human-Machine Communication
      • SUMMARY
      • INTRODUCTION
        • Background and Definitions
          • Speech Analysis
          • Speech Synthesis
      • WHEN IS SPOKEN INTERACTION WITH COMPUTERS USEFUL?
        • Voice Input
          • Hands/Eyes-Busy Tasks
          • Limited Keyboard/Screen Option
          • Disability
          • Subject Matter Is Pronunciation
        • Voice Output
        • Summary
      • COMPARISON OF SPOKEN LANGUAGE WITH OTHER COMMUNICATION MODALITIES
        • Spoken Language System Prototypes
        • Spoken Language vs. Typed Language
          • Research Methodology
          • Comparison of Language-Based Communication Modalities
        • Comparison of Natural Language Interaction with Alternative Modalities
          • Direct Manipulation
          • Natural Language Interaction
        • Summary: Circumstances Favoring Spoken Language Interaction with Machines
      • HUMAN FACTORS OBSTACLES TO SPOKEN LANGUAGE SYSTEMS
        • Spontaneous Speech
        • Natural Language
        • Interaction and Dialogue
      • MULTIMODAL SYSTEMS
      • SCIENTIFIC RESEARCH ON COMMUNICATION MODALITIES
      • ACKNOWLEDGMENTS
      • REFERENCES
    • Speech Communication—An Overview
      • SUMMARY
      • INTRODUCTION
      • FOUNDATIONS OF SPEECH TECHNOLOGY
      • INCENTIVES IN SPEECH RESEARCH
      • TECHNOLOGY STATUS
        • Coding.
        • Recognition and synthesis.
        • Talker verification.
        • Autodirective microphone arrays.
      • CRITICAL DIRECTIONS IN SPEECH RESEARCH
        • Physics of Speech Generation; Fluid-Dynamic Principles
        • Computational Models of Language
        • Information Processing in the Auditory System; Auditory Behavior
        • Coalescing Speech Coding, Synthesis, and Recognition
        • "Robust" Techniques for Speech Analysis
        • Three Dimensional Sound Capture and Projection
        • Integration of Sensory Modalities for Sight, Sound, and Touch
      • SPEECH TECHNOLOGY PROJECTIONS—2000
      • ACKNOWLEDGMENTS
      • REFERENCES
      • BIBLIOGRAPHY
  • SPEECH SYNTHESIS TECHNOLOGY
    • Computer Speech Synthesis: Its Status and Prospects
      • SUMMARY
    • Models of Speech Synthesis
      • SUMMARY
      • INTRODUCTION
        • Knowledge About Natural Speech
        • Flexibility and Technical Dimensions
        • The Sound-Generating Part
        • Simple Waveform Concatenation
        • Analysis-Synthesis Systems
        • Source Models
        • Formant-Based Terminal Analog
        • Higher-Level Parameters
        • Articulatory Models
      • THE CONTROL PART
        • Concatenation of Units
        • Rules and Notations
        • Automatic Learning
      • SPEAKING CHARACTERISTICS AND SPEAKING STYLES
      • MULTILINGUAL SYNTHESIS
        • Speech Quality
      • CONCLUDING REMARKS
      • ACKNOWLEDGMENTS
      • REFERENCES
    • Linguistic Aspects of Speech Synthesis
      • SUMMARY
      • INTRODUCTION
      • CONSTRAINTS ON SPEECH PRODUCTION
      • WORD-LEVEL ANALYSIS
      • LETTER-TO-SOUND RULES
      • MORPHOPHONEMICS AND LEXICAL STRESS
      • ORTHOGRAPHIC CONVENTIONS
      • PART-OF-SPEECH ASSIGNMENT
      • PARSING
      • PROSODIC MARKING
      • DISCOURSE-LEVEL EFFECTS
      • MULTILINGUAL SYNTHESIS
      • THE FUTURE
      • REFERENCES
  • SPEECH RECOGNITION TECHNOLOGY
    • Speech Recognition Technology: A Critique
      • SUMMARY
      • REFERENCES
    • State of the Art in Continuous Speech Recognition
      • SUMMARY
      • INTRODUCTION
      • THE SPEECH RECOGNITION PROBLEM
        • General Synthesis/Recognition Process
        • Units of Speech
      • HIDDEN MARKOV MODELS
        • Markov Chains
        • Hidden Markov Models
        • Phonetic HMMs
      • A HISTORICAL OVERVIEW
      • TRAINING AND RECOGNITION
        • Feature Extraction
        • Training
          • Phonetic HMMs and Lexicon
          • Grammar
        • Recognition
      • STATE OF THE ART
        • Improvements in Performance
          • Common Speech Corpora
          • Acoustic Modeling
          • Language Modeling
          • Research Experimentation Cycle
        • Sample Performance Figures
        • Effects of Training and Grammar
        • Speaker-Dependent vs. Speaker-Independent Recognition
        • Adaptation
        • Adding New Words
      • REAL-TIME SPEECH RECOGNITION
      • ALTERNATIVE MODELS
        • Segmental Models
        • Neural Networks
      • CONCLUDING REMARKS
      • REFERENCES
    • Training and Search Methods for Speech Recognition
      • SUMMARY
      • INTRODUCTION
      • ESTIMATION OF STATISTICAL PARAMETERS OF HMMS
      • REMARKS ON THE ESTIMATION PROCEDURE
      • FINDING THE MOST LIKELY PATH
      • DECODING: FINDING THE MOST LIKELY WORD SEQUENCE
      • REFERENCES
  • NATURAL LANGUAGE UNDERSTANDING TECHNOLOGY
    • The Roles of Language Processing in a Spoken Language Interface
      • SUMMARY
      • INTRODUCTION
        • Background: The ARPA Spoken Language Program
      • THE DUAL ROLE OF LANGUAGE PROCESSING
        • Approaches to Spoken Language Understanding
        • Interfacing Speech and Language
        • Progress in Spoken Language Understanding
      • THE ROLE OF DISCOURSE
        • Constraints on Reference
        • Constraints from Mixed Initiative
        • Order in Problem Solving and Dialogue
        • Discourse Constraints in a Spoken Language System
      • EVALUATION
      • CONCLUSIONS
      • REFERENCES
    • Models of Natural Language Understanding
      • SUMMARY
      • INTRODUCTION
      • A BRIEF HISTORY OF NLP
      • WHY IS NLP DIFFICULT?
      • WHAT IS IN AN NLP SYSTEM?
        • Syntax
        • Semantics
        • Discourse and Pragmatics
        • Reasoning, Response Planning, and Response Generation
        • Simplifying the Problem
        • Another View
      • HOW CAN NL SYSTEMS BE APPLIED AND EVALUATED?
      • CONCLUSIONS
      • REFERENCES
      • BIBLIOGRAPHY
    • Integration of Speech with Natural Language Understanding
      • SUMMARY
      • INTRODUCTION
      • COPING WITH SPONTANEOUS SPOKEN LANGUAGE
        • Language Phenomena in Spontaneous Speech
        • Strategies for Handling Spontaneous Speech Phenomena
      • ROBUSTNESS TO RECOGNITION ERRORS
      • NATURAL LANGUAGE CONSTRAINTS IN RECOGNITION
        • Models for Integration
        • Architectures for Integration
          • Word Lattice Parsing
          • Dynamic Grammar Networks
          • N-best Filtering or Rescoring
        • Integration Results
      • SPEECH CONSTRAINTS IN NATURAL LANGUAGE UNDERSTANDING
      • CONCLUSIONS
      • REFERENCES
  • APPLICATIONS OF VOICE-PROCESSING TECHNOLOGY I
    • A Perspective on Early Commercial Applications of Voice-Processing Technology for Telecommunications and Aids for the…
      • SUMMARY
      • INTRODUCTION
      • CURRENT COMMERCIAL APPLICATIONS: TELEPHONE BASED
      • CURRENT COMMERCIAL APPLICATIONS: AIDS TO THE HANDICAPPED
      • CONCLUSION
    • Applications of Voice-Processing Technology in Telecommunications
      • SUMMARY
      • INTRODUCTION
      • THE VISION
      • THE ART OF SPEECH RECOGNITION AND SYNTHESIS
      • APPLICATIONS OF SPEECH RECOGNITION AND SYNTHESIS
      • SPEECH TECHNOLOGY TELECOMMUNICATIONS MARKET
        • Cost Reduction vs. New Revenue Opportunities
        • Automation of Operator Services
        • Voice Access to Information over the Telephone Network
        • Voice Dialing
        • Voice-Interactive Phone Service
        • Directory Assistance Call Completion
        • Reverse Directory Assistance
        • Telephone Relay Service
      • FUTURE POSSIBILITIES
        • Near-Term Technical Challenges
        • Personal Communication Networks and Services
        • Predictions
      • REFERENCES
    • Speech Processing for Physical and Sensory Disabilities
      • SUMMARY
      • INTRODUCTION
      • ASSISTIVE HEARING TECHNOLOGY
        • Background
        • Hearing Aids and Assistive Listening Devices
        • Visual Sensory Aids
        • Tactile Sensory Aids
        • Direct Electrical Stimulation of the Auditory System
        • Noise Reduction
      • OTHER FORMS OF ASSISTIVE TECHNOLOGY INVOLVING VOICE COMMUNICATION
        • Speech Processing for Sightless People
        • Augmentative and Alternative Communication
        • Assistive Voice Control: Miscellaneous Applications
      • ACKNOWLEDGMENT
      • REFERENCES
  • APPLICATIONS OF VOICE-PROCESSING TECHNOLOGY II
    • Commercial Applications of Speech Interface Technology: An Industry at the Threshold
      • SUMMARY
      • INTRODUCTION
      • BACKGROUND
      • TECHNOLOGY
      • THE ADVANCED SPEECH TECHNOLOGY MARKET
      • RECENT MARKET TRENDS
      • MARKET SIZE
      • RECENT SIGNIFICANT COMMERCIAL DEVELOPMENTS
      • FUTURE APPLICATIONS
    • Military and Government Applications of Human-Machine Communication by Voice
      • SUMMARY
      • INTRODUCTION
      • TECHNOLOGY TRENDS AND NEEDS
      • SUMMARY OF VISITS AND CONTACTS
      • ARMY APPLICATIONS
      • NAVY APPLICATIONS
      • AIR FORCE APPLICATIONS
      • AIR TRAFFIC CONTROL APPLICATIONS
      • LAW ENFORCEMENT APPLICATIONS
      • SUMMARY OF USERS AND APPLICATIONS
      • TECHNOLOGY TRANSFER
      • ACKNOWLEDGMENTS
      • REFERENCES
  • TECHNOLOGY DEPLOYMENT
    • Deployment of Human-Machine Dialogue Systems
      • SUMMARY
      • INTRODUCTION
      • DEGREE OF DIFFICULTY OF A VOICE DIALOGUE APPLICATION
        • Dimensions of the Speech Recognition Task
        • Dimensions of the Language-Understanding Task
        • Dimensions of the Speech Synthesis Task
        • Additional Dimensions of Difficulty
        • Examples of Speech Applications
      • PROCEDURE FOR DEPLOYMENT OF SPEECH APPLICATIONS
        • The Art of Human-Machine Dialogues
      • CONCLUSIONS
      • REFERENCES
    • What Does Voice-Processing Technology Support Today?
      • SUMMARY
      • INTRODUCTION
      • SYSTEM TECHNOLOGIES
        • Hardware Technology
          • Microprocessors
          • Digital Signal Processors
          • Equipment and Systems
        • Application Technology Trend
          • Development Environment for DSP
          • Application Development Environment
          • Speech Input/Output Operating Systems
          • Real-Time Application Support
      • ALGORITHMS
        • Databases
          • Databases for Research
          • Databases for Application
          • Simulated Telephone Lines
        • Assessment of Algorithms
          • Assessment of Speech Recognition Algorithms
          • Assessment of Speech Synthesis Technology
        • Robust Algorithms
          • Classification of Factors in Robustness
          • Environmental Variation
          • Noise
          • Speaker Variation
      • SPEECH TECHNOLOGY AND THE MARKET
        • Illusions About Speech Recognition Technology
        • Strategy for Expanding the Market
          • Service Trials
          • Robustness Research
          • Long Term Research
      • CONCLUSION
      • REFERENCES
    • User Interfaces for Voice Applications
      • SUMMARY
      • USER INTERFACE CONSIDERATIONS
        • Task Requirements
          • Information Elements
          • Task Modalities
          • Cost of Interaction Failures
        • Technological Capabilities and Limitations
          • Voice Input
          • Voice Output
          • System Capabilities
        • User Expectations and Expertise
          • Conversational Speech Behaviors
      • USER INTERFACE DESIGN STRATEGIES
        • Dialogue Flow
        • Feedback and Confirmation
        • Instructions
        • Error Recovery
      • EVALUATING TECHNOLOGY READINESS
      • CONCLUSION
      • ACKNOWLEDGMENTS
      • REFERENCES
  • TECHNOLOGY IN 2001
    • Speech Technology in the Year 2001
      • SUMMARY
      • INTRODUCTION
      • REFERENCES
    • Toward the Ultimate Synthesis/Recognition System
      • SUMMARY
      • VISION OF THE FUTURE
      • FUTURE SPEECH SYNTHESIZERS
      • FUTURE SPEECH RECOGNIZERS
      • TOWARD ROBUST SPEECH/SPEAKER RECOGNITION UNDER ADVERSE CONDITIONS
      • SPEECH AND NATURAL LANGUAGE PROCESSING
      • USE OF ARTICULATORY AND PERCEPTUAL CONSTRAINTS
      • EVALUATION METHODS
      • CONCLUSION
      • REFERENCES
    • Speech Technology in 2001: New Research Directions
      • SUMMARY
      • INTRODUCTION
      • CURRENT CAPABILITIES
      • CHALLENGING ISSUES IN SPEECH RESEARCH
      • THE ROBUSTNESS ISSUE
      • SPEECH ANALYSIS
        • Temporal Decomposition
      • TRAINING AND PATTERN-MATCHING ISSUES
      • ADDITIONAL ISSUES IN SPEECH SYNTHESIS
      • CONCLUSIONS
      • REFERENCES
    • New Trends in Natural Language Processing: Statistical Natural Language Processing
      • SUMMARY
      • SOME LIMITATIONS OF RULE-BASED NLP
      • STATISTICAL TECHNIQUES: FIRST APPEARANCE
      • THE ARCHITECTURE OF AN NLU SYSTEM
      • PART-OF-SPEECH TAGGING
        • The Problem of Unknown Words
      • STOCHASTIC PARSING
        • Constraining the Inside/Outside Algorithm
        • Conditioning PCFG Rules on Linguistic Context
        • Annotated Corpora
      • LEXICAL SEMANTICS AND BEYOND
      • A QUESTION FOR TOMORROW
      • ACKNOWLEDGMENTS
      • REFERENCES
    • The Future of Voice-Processing Technology in the World of Computers and Communications
      • SUMMARY
      • EXPECTATIONS FOR VOICE INTERFACE
      • VOICE INTERFACE IN THE C&C INFORMATION SOCIETY
      • A FRIENDLY, SMART INTERFACE
      • VOICE INTERFACE AND VLSI TECHNOLOGY
      • FUTURE RESEARCH AND DEVELOPMENT ISSUES
      • TOWARD AUTOMATIC INTERPRETATION
    • Author Biographies
  • Index
