Statistical Analysis of Massive Data Streams

Proceedings of a Workshop

  • Publisher: National Academies Press
  • eISBN Pdf: 9780309593021
  • Place of publication:  United States
  • Year of digital publication: 2004
  • Month: September
  • Pages: 396
  • DDC: 510
  • Language: English

Massive data streams, large quantities of data that arrive continuously, are becoming increasingly commonplace in many areas of science and technology. Consequently, the development of analytical methods for such streams is of growing importance. To address this issue, the National Security Agency asked the National Research Council (NRC) to hold a workshop to explore methods for analyzing streams of data so as to stimulate progress in the field. This report presents the results of that workshop. It contains presentations focused on five research areas where massive data streams are present: atmospheric and meteorological data; high-energy physics; integrated data systems; network traffic; and mining commercial data streams. The goals of the report are to improve communication among researchers in the field and to increase relevant statistical science activity.

  • STATISTICAL ANALYSIS OF MASSIVE DATA STREAMS
  • Copyright
  • ACKNOWLEDGEMENT OF REVIEWERS
  • Preface and Workshop Rationale
  • Sallie Keller-McNulty Welcome and Overview of Sessions
    • TRANSCRIPT OF PRESENTATION
  • James Schatz Welcome and Overview of Sessions
    • TRANSCRIPT OF PRESENTATION
  • Douglas Nychka, Chair of Session on Atmospheric and Meteorological Data Introduction by Session Chair
    • TRANSCRIPT OF PRESENTATION
  • John Bates Exploratory Climate Analysis Tools for Environmental Satellite and Weather Radar Data
    • ABSTRACT OF PRESENTATION
      • 1. Introduction
      • 2. Philosophy of the use of remote sensing data for climate monitoring
    • TRANSCRIPT OF PRESENTATION
  • Amy Braverman Statistical Challenges in the Production and Analysis of Remote Sensing Earth Science Data at the Jet…
    • TRANSCRIPT OF PRESENTATION
  • Ralph Milliff Global and Regional Surface Wind Field Inferences from Spaceborne Scatterometer Data
    • TRANSCRIPT OF PRESENTATION
    • Global and Regional Surface Wind Field Inferences Given Spaceborne Scatterometer Data Ralph F. Milliff
    • GLOBAL AND REGIONAL SURFACE WIND FIELD INFERENCES FROM SPACE-BORNE SCATTEROMETER DATA
      • Blending QSCAT and Weather-Center Analysis Winds
      • Bayesian Inference for Surface Winds in the Labrador Sea
      • Bayesian Hierarchical Model for Surface Winds in the Tropics
      • A Bayesian Hierarchical Air-Sea Interaction Model
      • References
      • Figure Captions
      • Summary
  • Report from Breakout Group
  • David Scott, Chair of Session on High-Energy Physics Introduction by Session Chair
    • TRANSCRIPT OF PRESENTATION
  • Robert Jacobsen Statistical Analysis of High Energy Physics Data
    • TRANSCRIPT OF PRESENTATION
  • Paul Padley Some Challenges in Experimental Particle Physics Data Streams
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Miron Livny Data Grids (or, A Distributed Computing View of High Energy Physics)
    • TRANSCRIPT OF PRESENTATION
  • Report from Breakout Group
  • Daryl Pregibon Keynote Address: Graph Mining—Discovery in Large Networks
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Sallie Keller-McNulty, Chair of Session on Integrated Data Systems Introduction by Session Chair
    • TRANSCRIPT OF PRESENTATION
  • J. Douglas Beason Global Situational Awareness
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Kevin Vixie Incorporating Invariants in Mahalanobis Distance-Based Classifiers: Applications to Face Recognition
    • TRANSCRIPT OF PRESENTATION
    • INCORPORATING INVARIANTS IN MAHALANOBIS DISTANCE BASED CLASSIFIERS: APPLICATION TO FACE RECOGNITION
      • I. INTRODUCTION
      • II. COMBINING WITHIN CLASS COVARIANCES AND LINEAR APPROXIMATIONS TO INVARIANCES
      • III. FACE RECOGNITION RESULTS
      • IV. CONCLUSIONS
      • V. ACKNOWLEDGMENT
      • REFERENCES
  • John Elder Ensembles of Models: Simplicity (of Function) Through Complexity (of Form)
    • TRANSCRIPT OF PRESENTATION
  • Report from Breakout Group
  • Mark Hansen Untitled Presentation
    • TRANSCRIPT OF PRESENTATION
  • Wendy Martinez, Chair of Session on Network Traffic Introduction by Session Chair
    • TRANSCRIPT OF PRESENTATION
  • William Cleveland FSD Models for Open-Loop Generation of Internet Packet Traffic
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Johannes Gehrke Processing Aggregate Queries over Continuous Data Streams
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Edward Wegman Visualization of Internet Packet Headers
    • ABSTRACT OF PRESENTATION
    • TRANSCRIPT OF PRESENTATION
  • Paul Whitney Toward the Routine Analysis of Moderate to Large-Size Data
    • TRANSCRIPT OF PRESENTATION
  • Leland Wilkinson, Chair of Session on Mining Commercial Streams of Data Introduction by Session Chair
    • TRANSCRIPT OF PRESENTATION
  • Lee Rhodes A Stream Processor for Extracting Usage Intelligence from High-Momentum Internet Data
    • TRANSCRIPT OF PRESENTATION
    • A STREAM PROCESSOR FOR EXTRACTING USAGE INTELLIGENCE FROM HIGH-MOMENTUM INTERNET DATA
      • 1. INTRODUCTION
      • 2. BUSINESS CHALLENGES FOR THE NSPs
      • 3. SOURCES AND TYPES OF DATA
        • 3.1 USAGE MEs
        • 3.2 SESSION MEs
        • 3.3 REFERENCE DATA
      • 4. DATA STREAMS AND RIVERS
      • 5. IUM HIGH-LEVEL ARCHITECTURE
      • 6. STREAM COLLECTION AND NORMALIZATION
      • 7. STREAM RULE PROCESSING
      • 8. RULE CHAINS AND ASSOCIATED DATA STRUCTURES
      • 9. STATISTICS FROM STREAMS
        • 9.1 CAPTURE MODELS
        • 9.2 CAPTURE MODEL AGGREGATION
        • 9.3 DRILL FORWARD
        • 9.4 USER INTERACTION WITH STREAMING MODELS
      • 10. SUMMARY
      • ACKNOWLEDGMENTS
      • REFERENCES
  • Pedro Domingos A General Framework for Mining Massive Data Streams
    • TRANSCRIPT OF PRESENTATION
    • A GENERAL FRAMEWORK FOR MINING MASSIVE DATA STREAMS
      • Abstract
        • 1 The Problem
        • 2 The Framework
        • 3 Time-Changing Data
        • 4 Conclusion
        • Reference
    • TRANSCRIPT OF PRESENTATION
  • Andrew Moore kd-, R-, Ball-, and Ad-Trees: Scalable Massive Science Data Analysis
    • TRANSCRIPT OF PRESENTATION
  • Concluding Comments
