Workflows and Provenance: Toward Information Science Solutions for the Natural Sciences
The era of big data and ubiquitous computation has brought with it concerns about ensuring reproducibility in this new research environment. It is easy to assume computational methods self-document by their very nature of being exact, deterministic processes. However, similar to laboratory experiments, ensuring reproducibility in the computational realm requires the documentation of both the protocols used (workflows) as well as a detailed description of the computational environment: algorithms, implementations, software environments as well as the data ingested and execution logs of the computation. These two aspects of computational reproducibility (workflows and execution details) are discussed in the context of biomolecular Nuclear Magnetic Resonance spectroscopy (bioNMR) as well as the PRIMAD model for computational reproducibility.