Software Practices for Better Science: Testing, Reproducibility, and Documentation
a tutorial presented at
Exascale Computing Project Tutorial Days
on Monday 6 February 2023, 3:00 pm - 6:15 pm EST (UTC-5)
Presenters: David E. Bernholdt (Oak Ridge National Laboratory), David M. Rogers (Oak Ridge National Laboratory), and Gregory R. Watson (Oak Ridge National Laboratory)
This page provides detailed information specific to the tutorial event above. Expect updates to this page up to, and perhaps shortly after, the date of the tutorial. Pages for other tutorial events can be accessed from the main page of this site.
Quick Links
- Recording (YouTube)
- Presentation Slides (FigShare)
On this Page
- Description
- Agenda
- Presentation Slides
- How to Participate
- Related ECP Events
- Stay in Touch
- Resources from Presentations
- Requested Citation
- Acknowledgements
Description
As many ECP projects begin their transition from major development towards production science, this tutorial will offer key strategies to help projects improve their science. The tutorial will focus on testing strategies (design and selection in different contexts), reproducibility concerns, and the creation of “lab notebook”-style documentation. These practices will give you and your team detailed information about what to do and why. We’ll offer practical strategies, based on experience in a broad range of projects, that can help you move more effectively from software to science.
Agenda
Time (EST) | Title | Presenter |
---|---|---|
3:00 PM | Introduction and Setup | David E. Bernholdt (ORNL) |
3:05 PM | Testing Strategies | David M. Rogers (ORNL) |
4:00 PM | Break | |
4:15 PM | Improving Reproducibility Through Better Software Practices | Gregory R. Watson (ORNL) |
5:15 PM | Lab Notebooks for Computational Mathematics, Sciences, & Engineering | David E. Bernholdt (ORNL) |
6:15 PM | Adjourn | |
Presentation Slides
The latest version of the slides will always be available at https://doi.org/10.6084/m9.figshare.21989507.
Note that these files may include additional slides that will not be discussed during the tutorial, but questions are welcome.
How to Participate
- Please use Zoom chat or unmute to ask questions at any time. We will respond in chat or verbally as opportunities permit.
Related ECP Events
If you’re interested in this tutorial, you might be interested in these Birds of a Feather sessions taking place next week as part of the ECP Community BoF Days:
- Sharing Your Software Sustainability, Productivity, and Quality Experience through BSSw.io
- BSSw Fellowship
Stay in Touch
- After the tutorial please feel free to email questions or feedback to the BSSw tutorial team at bssw-tutorial@lists.mcs.anl.gov.
- To find out about future events organized by the IDEAS Productivity Project, you can subscribe to our mailing list (usually ~2 messages/month).
- For monthly updates on the Better Scientific Software site, subscribe to our monthly digest.
Resources from Presentations
Links from the tutorial presentations are listed here for convenience.
- Module 1: Introduction and Setup
- Module 2: Testing Strategies
- Useful resources on testing (formerly linked to ideas-productivity.org/resources/howtos/)
- CI/CD Introduction
- Team Experiences Implementing Continuous Integration
- Understanding the risk of script injections
- DCA++: A software framework to solve correlated electron problems with modern quantum cluster methods
- Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing
- An Automated Performance Evaluation Framework for the GINKGO Software Ecosystem
- Joint Center for Satellite Data Assimilation (JEDI) documentation
- Hints from the front lines
- Using containers
- Speeding up C++ GitHub actions using ccache
- Deploying documentation
- Related articles:
- Module 3: Improving Reproducibility Through Better Software Practices
- Motivations and Background:
- Definitions, Guidelines, and Organizations:
- National Science Foundation Data Management Plan Requirements
- Findable, Accessible, Interoperable, Re-usable
- https://gofair.us/
- SC22 Reproducibility Initiative
- ACM Transactions on Mathematical Software (TOMS)
- ACM Artifact Review and Badging
- http://fursin.net/reproducibility.html
- National Information Standards Organization (NISO) on Reproducibility and Badging
- Helpful Tools
- Floating Point Analysis Tools
- Code Ocean (Cloud platforms - publish and reproduce research code and data)
- DOIs and hosting of data, code, documents:
- Other Resources:
- The FAIR Guiding Principles for Scientific Data Management and Stewardship. Mark D. Wilkinson, et al. 2016
- FAIR4RS (previously linked to www.rd-alliance.org/groups/fair-research-software-fair4rs-wg)
- Editorial: ACM TOMS Replicated Computational Results Initiative. Michael A. Heroux. 2015
- Enhancing Reproducibility for Computational Methods
- Simple experiments in reproducibility and technical trust by Mike Heroux and students (work in progress)
- What every computer scientist should know about floating-point arithmetic. David Goldberg.
- Module 4: Lab Notebooks for Computational Mathematics, Sciences, & Engineering
- Writing the Laboratory Notebook
- HPC and the Lab Manager
- What All Codes Should Do (ATPESC 2019)
- DIKW pyramid
- How to pick an electronic notebook
- Resources – Execution Environments
- ATPESC 2022 Laboratory Environment BSSw Tutorial
- Lab Notebooks for Computational Mathematics, Sciences & Engineering (previously linked to ideas-productivity.org/events/hpc-best-practices-webinars/#webinar070)
- Popper
- FlashKit
- Code Ocean
- Weights & Biases
Requested Citation
The requested citation for the overall tutorial is:
David E. Bernholdt, David M. Rogers, and Gregory R. Watson, Software Practices for Better Science: Testing, Reproducibility, and Documentation tutorial, in Exascale Computing Project Tutorial Days, online, 2023. DOI: 10.6084/m9.figshare.21989507.
Individual modules may be cited as Speaker, Module Title, in Software Practices for Better Science: Testing, Reproducibility, and Documentation tutorial…
Acknowledgements
This tutorial is produced by the IDEAS Productivity project.
This work was supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.