Software Productivity and Sustainability
a track presented at
Argonne Training Program on Extreme-Scale Computing
on 9:30am - 5:45pm CDT Thursday 12 August 2021
Presenters: David E. Bernholdt (Oak Ridge National Laboratory), Anshu Dubey (Argonne National Laboratory), Rinku K. Gupta (Argonne National Laboratory), and David M. Rogers (Oak Ridge National Laboratory)
This page provides detailed information specific to the tutorial event above. Expect updates to this page up to, and perhaps shortly after, the date of the tutorial. Pages for other tutorial events can be accessed from the main page of this site.
- Presentations (YouTube playlist)
- Presentation Slides (FigShare)
- Hands-On Code Repository (GitHub)
On this Page
- Presentation Slides
- How to Participate
- Hands-On Exercises
- Stay in Touch
- Resources from Presentations
- Requested Citation
The BSSw tutorial focuses on issues of developer productivity, software sustainability, and reproducibility in scientific research software, particularly targeting high-performance computers.
|Time (CDT)||Module||Title||Presenter|
|9:30 AM||0||Introduction and Setup||David E. Bernholdt (ORNL)|
|9:40 AM||1||Motivation and Overview of Best Practices in HPC Software Development||David E. Bernholdt (ORNL)|
|10:00 AM||2||Agile Methodologies||Rinku K. Gupta (ANL)|
|10:30 AM||3||Git Workflows||Rinku K. Gupta (ANL)|
|11:15 AM||4||Scientific Software Design||Anshu Dubey (ANL)|
|11:45 AM||5||Improving Reproducibility Through Better Software Practices||David E. Bernholdt (ORNL)|
|12:30 PM||6||Agile Methodologies Redux||Rinku K. Gupta (ANL)|
|1:45 PM||7||Software Testing Introduction||David M. Rogers (ORNL)|
|2:05 PM||8||Testing Walkthrough||David M. Rogers (ORNL)|
|2:15 PM||9||Testing Complex Software||David M. Rogers (ORNL)|
|2:35 PM||10||Continuous Integration||David M. Rogers (ORNL)|
|3:15 PM||11||Refactoring Scientific Software||Anshu Dubey (ANL)|
|4:15 PM||12||Summary||David E. Bernholdt (ORNL)|
Presentation Slides
The latest version of the slides will always be available at https://doi.org/10.6084/m9.figshare.15130590.
Note that these files may include additional slides that will not be discussed during the tutorial, but questions are welcome.
How to Participate
Please use Slack to ask questions at any time. We will respond in chat or verbally as opportunities permit, and we’ll call for questions at transition points in the presentations.
During breaks, the instructors are happy to hold further discussions with anyone interested.
The agenda includes a Q&A session at the end of the day as well.
Hands-On Exercises
In this tutorial, we will not have time set aside to work through the hands-on activities, but you are encouraged to pursue them on your own. We’ll watch the repository for issues and pull requests, and the mailing list for other questions, and provide feedback as we can.
The hands-on exercises for this track are based around a simple numerical model using the one-dimensional heat equation. The example is described briefly in the repository’s README file, and in greater detail in the ATPESC Hands-On lesson. The ATPESC version focuses on the numerical aspects of the model. But for this track, we’re focused on how to make the software better from a quality perspective, so you don’t need to understand the math to do these exercises.
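For readers who would still like a feel for the kind of computation involved, the following is a minimal sketch of an explicit finite-difference update for the one-dimensional heat equation. It is written in Python for illustration only and is not the code in the hello-numerical-world repository. The function names (`heat_step`, `solve`), the parameter names, and the default values are all assumptions made for this sketch.

```python
def heat_step(u, alpha, dx, dt):
    """Advance the temperature profile u by one explicit time step.

    Hypothetical helper for illustration. The two end values are held
    fixed, which corresponds to fixed-temperature boundary conditions.
    """
    r = alpha * dt / (dx * dx)
    # The explicit scheme is only stable when r does not exceed 0.5.
    if r > 0.5:
        raise ValueError("unstable: alpha*dt/dx^2 must not exceed 0.5")
    new_u = list(u)
    for i in range(1, len(u) - 1):
        new_u[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new_u


def solve(n_points, n_steps, alpha=0.2, dx=0.1, dt=0.004, left=0.0, right=1.0):
    """Run n_steps of heat_step starting from a rod at temperature 0.

    All default values here are assumptions chosen to keep the scheme stable.
    """
    u = [0.0] * n_points
    u[0], u[-1] = left, right
    for _ in range(n_steps):
        u = heat_step(u, alpha, dx, dt)
    return u


if __name__ == "__main__":
    # After many steps the profile approaches a straight line between
    # the two fixed end temperatures.
    for value in solve(11, 5000):
        print(f"{value:.4f}")
```

A property such as "the long-time solution is a straight line between the boundary values" is the kind of behavior a simple automated test can check, without requiring a deep understanding of the underlying math.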
For the purposes of these hands-on exercises, you should imagine you’ve inherited an early version of the hello-numerical-world software from a colleague who’s left the project, and you’ve been assigned to get it into better shape so that it can be used in the next ATPESC summer school.
The repository you’ll be working with is on GitHub: bssw-tutorial/hello-numerical-world-2021-08-12-atpesc. Note: most of the screenshots will refer to the generic “hello-numerical-world” repository rather than the one specifically for this tutorial.
List of Hands-On Exercises
Note that the exercise numbers align with the presentation modules. Not every module has exercises (yet).
- Exercise 0: Setting up the Prerequisites. Set up the accounts needed for these exercises.
- Exercise 2: Agile Methodologies. You’ll use GitHub issues and project boards to set up a simple “personal kanban” board.
- Exercise 3: Git Workflows. You’ll fork our hello-numerical-world repository, create a feature branch, and make a pull request.
- Exercise 6: Agile Redux. You’ll create epic, story, and task issues for the refactoring task and track them on a kanban board.
- Exercise 8: Software Testing. You’ll establish a simple continuous integration workflow and then refine it, adding code coverage assessment.
- Exercise 10: Continuous Integration. You’ll establish a simple continuous integration workflow and then refine it, adding code coverage assessment.
- Exercise 11: Refactoring Scientific Software. You’ll perform a small, well-defined refactoring exercise.
Stay in Touch
After the tutorial, please feel free to email questions or feedback to the BSSw tutorial team at firstname.lastname@example.org.
If you want to do the hands-on exercises, we’re happy to provide feedback on your pull requests and issues, even after the end of the tutorial.
To find out about future events organized by the IDEAS Productivity Project, you can subscribe to our mailing list (usually ~2 messages/month).
For monthly updates on the Better Scientific Software site, subscribe to our monthly digest.
Resources from Presentations
Links from the tutorial presentations are listed here for convenience.
- Module 0: Introduction and Setup
- Module 1: Motivation and Overview of Best Practices in HPC Software Development
- Module 2: Agile Methodologies
- Agile Manifesto
- A-team tools for Agile practices
- Policies: A Code of Conduct for Open Source Projects
- Module 3: Git Workflows
- Module 4: Scientific Software Design
- Related paper: A Design Proposal for a Next Generation Scientific Software Framework
- Related webinar: Software Design for Longevity with Performance Portability
- Module 5: Improving Reproducibility Through Better Software Practices
- Toward a Compatible Reproducibility Taxonomy for Computational and Computing Sciences (updated 2022-03-31 with DOI link)
- Reproducibility and Replicability in Science (updated 2022-03-31 with DOI link)
- Many Psychology Findings Not as Strong as Claimed
- The War Over Supercooled Water
- Researchers Find Bug in Python Script May Have Affected Hundreds of Studies
- The FAIR Guiding Principles for Scientific Data Management and Stewardship. Mark D. Wilkinson, et al. 2016
- National Science Foundation Data Management Plan Requirements
- SC21 Reproducibility Initiative
- ACM Transactions on Mathematical Software (TOMS)
- ACM Artifact Review and Badging
- National Information Standards Organization (NISO) on Reproducibility and Badging
- Floating Point Analysis Tools
- Code Ocean (Cloud platforms - publish and reproduce research code and data)
- DOIs and hosting of data, code, documents:
- Editorial: ACM TOMS Replicated Computational Results Initiative. Michael A. Heroux. 2015
- Simple experiments in reproducibility and technical trust by Mike Heroux and students (work in progress)
- What every scientist should know about floating-point arithmetic. David Goldberg.
- Module 6: Agile Methodologies Redux
- Module 7: Software Testing Introduction
- In the face of uncertainties, NNSA seeks verification and validation
- Python Build and Test Framework: pyscaffold.org
- Build-Link-Test CMake Framework: llnl-blt.readthedocs.io
- Static Source Analysis (C++): clang-tidy
- Static Source Analysis (Python): flake8 and pylint (updated 2022-03-31 due to dead link)
- Code Coverage Webservices: codecov and coveralls
- Tutorials for code coverage: Online Tutorial, Another example
- Development Practices Survey Article
- Module 8: Testing Walkthrough
- CMake Tutorial
- CMake add_test command documentation
- Module 9: Testing Complex Software
- Module 10: Continuous Integration
- Module 11: Refactoring Scientific Software
- Module 12: Summary
- COVID-19 epidemiology saga
- Productivity and Sustainability Improvement Planning
- Better Scientific Software web site
Requested Citation
The requested citation for the overall tutorial is:
David E. Bernholdt, Anshu Dubey, Rinku K. Gupta, and David M. Rogers, Software Productivity and Sustainability track, in Argonne Training Program on Extreme-Scale Computing, online, 2021. DOI: 10.6084/m9.figshare.15130590.
Individual modules may be cited as Speaker, Module Title, in Software Productivity and Sustainability track…
This tutorial is produced by the IDEAS Productivity project.
This work was supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.