Refining the Concept of Scientific Inference When Working with Big Data

Proceedings of a Workshop

  • Author: Wender, Ben A.
  • Publisher: National Academies Press
  • ISBN: 9780309454445
  • eISBN PDF: 9780309454452
  • eISBN EPUB: 9780309454476
  • Place of publication: United States
  • Year of digital publication: 2017
  • Month: February
  • Pages: 115
  • DDC: 510
  • Language: English

The concept of utilizing big data to enable scientific discovery has generated tremendous excitement and investment from both private and public sectors over the past decade, and expectations continue to grow. Using big data analytics to identify complex patterns hidden inside volumes of data that have never been combined could accelerate the rate of scientific discovery and lead to the development of beneficial technologies and products. However, producing actionable scientific knowledge from such large, complex data sets requires statistical models that produce reliable inferences (NRC, 2013). Without careful consideration of the suitability of both available data and the statistical models applied, analysis of big data may result in misleading correlations and false discoveries, which can potentially undermine confidence in scientific research if the results are not reproducible. In June 2016 the National Academies of Sciences, Engineering, and Medicine convened a workshop to examine critical challenges and opportunities in performing scientific inference reliably when working with big data. Participants explored new methodologic developments that hold significant promise and potential research program areas for the future. This publication summarizes the presentations and discussions from the workshop.

  • FrontMatter
  • Acknowledgment of Reviewers
  • Contents
  • 1 Introduction
  • 2 Framing the Workshop
  • 3 Inference About Discoveries Based on Integration of Diverse Data Sets
  • 4 Inference About Causal Discoveries Driven by Large Observational Data
  • 5 Inference When Regularization Is Used to Simplify Fitting of High-Dimensional Models
  • 6 Panel Discussion
  • References
  • Appendixes
  • Appendix A: Registered Workshop Participants
  • Appendix B: Workshop Agenda
  • Appendix C: Acronyms
