Massive Data Sets

Proceedings of a Workshop

  • Publisher: National Academies Press
  • ISBN: 9780309056946
  • eISBN Pdf: 9780309556866
  • Place of publication: United States
  • Year of digital publication: 1997
  • Month: January
  • Pages: 219
  • DDC: 510
  • Language: English
  • Massive Data Sets
  • Copyright
  • PREFACE
  • Contents
  • OPENING REMARKS
  • PART I PARTICIPANTS' EXPECTATIONS FOR THE WORKSHOP
  • PART II APPLICATIONS PAPERS
    • Earth Observation Systems: What Shall We Do with the Data We Are Expecting in 1998?
      • ABSTRACT
      • 1 INTRODUCTION
      • 2 DATA CLASSIFICATION SCHEME
      • 3 GRIDDING AND BINNING TO CREATE LEVEL 3 DATA
      • 4 GENERATING LEVEL 2 DATA
        • 4.1 Sensitivity Studies
        • 4.2 Climatologies
      • 5 SUMMARY OF ISSUES
      • ACKNOWLEDGEMENTS
      • References
    • Information Retrieval: Finding Needles in Massive Haystacks
      • 1.0 INFORMATION RETRIEVAL: THE PROMISE AND PROBLEMS
      • 2.0 A SMALL EXAMPLE
      • 3.0 LATENT SEMANTIC INDEXING (LSI)
      • 4.0 USING LSI FOR INFORMATION RETRIEVAL
      • 5.0 SOME OPEN STATISTICAL ISSUES
      • 6.0 CONCLUSIONS
      • 7.0 References
    • Statistics and Massive Data Sets: One View from the Social Sciences
      • ABSTRACT
      • 1 INTRODUCTION
      • 2 THE PROBLEM
      • 3 THE OPPORTUNITY
      • 4 ONE PATH TO AN ANSWER
    • The Challenge of Functional Magnetic Resonance Imaging
      • 1 INTRODUCTION
      • 2 FUNCTIONAL MAGNETIC RESONANCE IMAGING (FMRI)
      • 3 A TYPICAL EXPERIMENT
      • 4 DATA PROCESSING
      • 5 STATISTICAL CHALLENGES
      • 6 COMPUTATIONAL CHALLENGES
      • 7 DISCUSSION
      • 8 ACKNOWLEDGEMENTS
      • References
    • Marketing
      • DISCUSSION
    • Massive Data Sets: Guidelines And Practical Experience From Health Care
      • 1 INTRODUCTION
      • 2 COREPLUS AND SAFS: CASE STUDIES IN MDA
      • 3 PRESENTATION OF DATA ANALYSIS
      • 4 STATISTICAL CHALLENGES
      • 5 ORGANIZATION OF COMPUTATIONS
        • 5.1 Example: Job Scheduling
      • References
    • Massive Data Sets in Semiconductor Manufacturing
      • 1 INTRODUCTION
      • 2 BACKGROUND
      • 3 DATA OVERVIEW
      • 4 DIE FABRICATION DATA
      • 5 WAFER ELECTRICAL DATA
      • 6 SORT ELECTRICAL TEST DATA
      • 7 FINAL ELECTRICAL TEST DATA
      • 8 STATISTICAL ISSUES
      • 9 CONCLUSIONS
      • 10 RECOMMENDATIONS
      • References
    • Management Issues In The Analysis Of Large-Scale Crime Data Sets
      • 1 THE INFORMATION GLUT
      • 2 NIBRS
      • 3 NCVS
      • 4 DATA UTILIZATION
      • 5 FUTURE ISSUES
    • Analyzing Telephone Network Data
      • ABSTRACT
      • ACKNOWLEDGEMENTS
      • REFERENCES
    • Massive Data Assimilation/Fusion in Atmospheric Models and Analysis: Statistical, Physical, and Computational Challenges
      • ABSTRACT
      • 1 INTRODUCTION
      • 2 THE SATELLITE DATA
      • 3 EXISTING METHODS AND TOOLS: PROSPECTS AND LIMITATIONS
      • 4 SUMMARY AND CONCLUDING REMARKS
      • ACKNOWLEDGMENTS
      • References
  • PART III ADDITIONAL INVITED PAPERS
    • Massive Data Sets and Artificial Intelligence Planning
      • ABSTRACT
      • 1 THE INFORMATION GLUT
      • 2 A PLANNING PERSPECTIVE ON DATA ANALYSIS
      • 3 A PLANNING SYSTEM FOR ANALYSIS OF SMALL DATASETS
        • 3.1 Representation
        • 3.2 Controlling Plan Execution
      • 4 DISCUSSION
      • ACKNOWLEDGMENTS
      • References
    • Massive Data Sets: Problems and Possibilities, with Application to Environmental Monitoring
      • ABSTRACT
      • INTRODUCTION
      • TYPES OF MASSIVE DATA SETS
      • STATISTICAL DATA ANALYSIS
      • APPLICATION TO THE ENVIRONMENTAL SCIENCES
      • CONCLUSIONS
      • ACKNOWLEDGMENT
      • References
    • Visualizing Large Data Sets
      • ABSTRACT
      • 1 INTRODUCTION
      • 2 DOMAIN-SPECIFIC REPRESENTATION
      • 3 HIGH INFORMATION DENSITY
      • 4 INTERACTIVE FILTERS
      • 5 MULTIPLE LINKED VIEWS
      • 6 SYSTEMS
      • 7 SOFTWARE AND TECHNOLOGY
      • 8 CONCLUSION
      • ACKNOWLEDGMENTS
      • References
    • From Massive Data Sets to Science Catalogs: Applications and Challenges
      • ABSTRACT
      • 1 INTRODUCTION
        • 1.1 Background
        • 1.2 Developing Science Catalogs from Data
      • 2 SCIENCE CATALOGING APPLICATIONS AT JPL
        • 2.1 The SKICAT Project
          • 2.1.1 Classifying Sky Objects
          • 2.1.2 Classifying Faint Sky Objects
          • 2.1.3 SKICAT Classification Results
          • 2.1.4 Why was SKICAT Successful?
        • 2.2 Cataloging Volcanoes in Magellan-SAR Images
          • 2.2.1 Background
          • 2.2.2 Automated Detection of Volcanoes
        • 2.3 Other Science Cataloging Projects at JPL
      • 3 GENERAL IMPLICATIONS FOR THE ANALYSIS OF MASSIVE DATA SETS
        • 3.1 Complexity Issues for Classification Problems
          • 3.1.1 Supervised Classification can be Tractable
          • 3.1.2 Unsupervised Classification can be Intractable
        • 3.2 Human Factors: The Interactive Process of Data Analysis
        • 3.3 Subjective Human Annotation of Data Sets for Classification Purposes
        • 3.4 Effective Use of Prior Knowledge
        • 3.5 Dealing with High Dimensionality
        • 3.6 How Does the Data Grow?
      • 4 CONCLUSION
      • ACKNOWLEDGEMENTS
      • References
    • Information Retrieval and the Statistics of Large Data Sets
      • ABSTRACT
      • 1 IR AND STATISTICS TODAY
      • 2 THE FUTURE
      • References
    • Some Ideas About the Exploratory Spatial Analysis Technology Required for Massive Databases
      • ABSTRACT
      • 1 A GLOBAL SPATIAL DATA EXPLOSION
      • 2 A GLOBAL DATA SWAMP
      • 3 NEW TOOLS ARE REQUIRED
        • 3.1 Automated map pattern detectors
        • 3.2 Autonomous database explorer
        • 3.3 Geographic object recognition
        • 3.4 Very large spatial data classification tools
        • 3.5 Neurofuzzy and hybrid spatial modelling systems for very large data bases
      • 4 DISCOVERING HOW TO EXPLOIT THE GEOCYBERSPACE
      • References
    • Massive Data Sets in Navy Problems
      • I. ABSTRACT
      • II. INTRODUCTION
      • III. DISCUSSIONS
      • IV. PROPOSED NEEDS
      • V. CONCLUSIONS
      • VI. ACKNOWLEDGMENTS
      • VII. References
    • Massive Data Sets Workshop: The Morning After
      • ABSTRACT
      • 1 INTRODUCTION
      • 2 DISCLOSURE: PERSONAL EXPERIENCES
      • 3 WHAT IS MASSIVE? A CLASSIFICATION OF SIZE
      • 4 OBSTACLES TO SCALING
        • 4.1 Human limitations: visualization
        • 4.2 Human-machine interactions
        • 4.3 Storage requirements
        • 4.4 Computational complexity
      • 5 ON THE STRUCTURE OF LARGE DATA SETS
        • 5.1 Types of data
        • 5.2 How do data sets grow?
        • 5.3 On data organization
        • 5.4 Derived data sets
      • 6 DATA BASE MANAGEMENT AND RELATED ISSUES
        • 6.1 Data base management and data analysis systems
        • 6.2 Problems and challenges in the data base area
      • 7 THE STAGES OF A DATA ANALYSIS
        • 7.1 Planning the data collection
        • 7.2 Actual collection
        • 7.3 Data access
        • 7.4 Initial data checking
        • 7.5 Data analysis proper
        • 7.6 The final product: presentation of arguments and conclusions
      • 8 EXAMPLES AND SOME THOUGHTS ON STRATEGY
      • 9 VOLUME REDUCTION
      • 10 SUPERCOMPUTERS AND SOFTWARE CHALLENGES
        • 10.1 When do we need a Concorde?
        • 10.2 General Purpose Data Analysis and Supercomputers
        • 10.3 Languages, Programming Environments and Data-Based Prototyping
      • 11 SUMMARY OF CONCLUSIONS
      • 12 References
  • PART IV FUNDAMENTAL ISSUES AND GRAND CHALLENGES
    • Panel Discussion
      • QUESTION 1: WHAT ARE SOME ALTERNATIVES FOR CLASSIFYING STATISTICAL PROBLEMS?
      • QUESTION 2: SHOULD STATISTICS BE DATA-BASED OR MODEL-BASED?
      • QUESTION 3: CAN COMPUTER-INTENSIVE METHODS COEXIST WITH MASSIVE DATA SETS?
      • QUESTION 4: IS THERE A MODEL-FREE NOTION OF SUFFICIENCY?
    • Items for Ongoing Consideration
      • DATA PREPARATION
      • MODELS AND DATA PRESENTATION RESEARCH ISSUES
    • Closing Remarks
  • APPENDIX WORKSHOP PARTICIPANTS