Better Scientific Software
a tutorial presented at
Presenters: David E. Bernholdt (Oak Ridge National Laboratory), Anshu Dubey (Argonne National Laboratory), Rinku K. Gupta (Argonne National Laboratory), and David M. Rogers (Oak Ridge National Laboratory)
This page provides detailed information specific to the tutorial event above. Expect updates to this page up to, and perhaps shortly after, the date of the tutorial. Pages for other tutorial events can be accessed from the main page of this site.
Last updated: 2021-07-26
On this Page
- Presentation Slides
- How to Participate
- Hands-On Exercises
- Stay in Touch
- Resources from Presentations
- Requested Citation
Producing scientific software is a challenge. The high-performance modeling and simulation community, in particular, is dealing with the confluence of disruptive changes in computing architectures and new opportunities (and demands) for greatly improved simulation capabilities, especially through coupling physics and scales. At the same time, computational science and engineering (CSE), as well as other areas of science, are experiencing increasing focus on scientific reproducibility and software quality.
Computer architecture changes require new software design and implementation strategies, including significant refactoring of existing code. Reproducibility demands require more rigor across the entire software endeavor. Code coupling requires aggregate team interactions including integration of software processes and practices. These challenges demand large investments in scientific software development and improved practices. Focusing on improved developer productivity and software sustainability is both urgent and essential.
This half-day tutorial distills multi-project, multi-year experience from members of the IDEAS Productivity project and the creators of the BSSw.io community website. The tutorial will provide information about software practices, processes, and tools explicitly tailored for CSE. Topics to be covered include Agile methodologies and tools, software design and refactoring, testing and continuous integration, Git workflows for teams, and reproducibility. Material will be mostly at the beginner and intermediate levels. There will also be opportunities to discuss topics raised by the audience.
|Time|Module|Title|Presenter|
|---|---|---|---|
|1:00pm-1:05pm|00|Introduction|David E. Bernholdt, ORNL|
|1:05pm-1:15pm|01|Motivation and Overview of Best Practices in HPC Software Development|David E. Bernholdt, ORNL|
|1:15pm-1:45pm|02|Agile Methodologies|Rinku Gupta, ANL|
|1:45pm-2:00pm|03|Git Workflows|Rinku Gupta, ANL|
|2:00pm-2:20pm|04|Software Testing 1|David Rogers, ORNL|
|2:20pm-2:40pm| |Break (optional Q&A)| |
|2:40pm-3:00pm|05|Software Design|Anshu Dubey, ANL|
|3:00pm-3:15pm|06|Software Testing 2|David Rogers, ORNL|
|3:15pm-3:40pm|07|Refactoring|Anshu Dubey, ANL|
|3:40pm-3:55pm|08|Reproducibility|David E. Bernholdt, ORNL|
|3:55pm-4:00pm|09|Summary|David E. Bernholdt, ORNL|
Presentation Slides
- The latest version of the slides will always be available at https://doi.org/10.6084/m9.figshare.14256257.
- Note that these files may include additional slides that will not be discussed during the tutorial, but questions are welcome.
- v2: Corrects a small misstatement about doing demos during breaks in the tutorial
- v1: Provided to ISS organizers
How to Participate
Please use the Sli.do chat to ask questions at any time. We will respond in Sli.do or verbally as opportunities permit.
The schedule includes a break in the middle. We plan to be available for additional Q&A at that time.
Hands-On Exercises
The hands-on exercises for this tutorial are based around a simple numerical model using the one-dimensional heat equation. The example is described briefly in the repository’s README file, and in greater detail in the ATPESC Hands-On lesson. The ATPESC version focuses on the numerical aspects of the model. But for this tutorial, we’re focused on how to make the software better from a quality perspective, so you don’t need to understand the math to do these exercises.
For the purposes of these hands-on exercises, you should imagine you’ve inherited an early version of the hello-numerical-world software from a colleague who’s left the project, and you’ve been assigned to get it into better shape so that it can be used in the next ATPESC summer school.
The repository you’ll be working with is on GitHub: bssw-tutorial/hello-numerical-world-2021-03-iss. Note: most of the screenshots will refer to the generic “hello-numerical-world” repository rather than the one specifically for this tutorial.
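For readers who would like a concrete picture of the kind of model involved, here is a minimal Python sketch of a one-dimensional heat equation solver. It is not the repository's code: the explicit forward-time, centered-space (FTCS) scheme, the function names `step_ftcs` and `solve_heat`, and all parameter values are assumptions chosen for illustration only.

```python
def step_ftcs(u, alpha, dx, dt):
    """Advance the temperature profile u by one explicit (FTCS) time step.

    The two end values are held fixed (Dirichlet boundary conditions).
    """
    r = alpha * dt / dx**2
    # The explicit scheme is only stable when r <= 1/2.
    if r > 0.5:
        raise ValueError(f"unstable step: alpha*dt/dx^2 = {r:.3f} exceeds 0.5")
    new_u = list(u)
    for i in range(1, len(u) - 1):
        new_u[i] = u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
    return new_u


def solve_heat(n_points=11, alpha=0.2, length=1.0, dt=0.004, n_steps=2000,
               left=0.0, right=1.0):
    """Run the model from a cold start and return the final profile.

    All default values are illustrative assumptions, not project settings.
    """
    dx = length / (n_points - 1)
    u = [0.0] * n_points
    u[0], u[-1] = left, right
    for _ in range(n_steps):
        u = step_ftcs(u, alpha, dx, dt)
    return u


if __name__ == "__main__":
    profile = solve_heat()
    print(" ".join(f"{value:.3f}" for value in profile))
```

With fixed end temperatures and enough time steps, the profile should approach a straight line between the two boundary values, which gives a simple sanity check on any implementation of this model.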
List of Hands-On Exercises
Note that the exercise numbers align with the presentation modules. Not every module has exercises (yet).
- Exercise 0: Setting up the Prerequisites. Set up the accounts needed for these exercises.
- Exercise 2: Agile Methodologies. You’ll use GitHub issues and project boards to set up a simple “personal kanban” board.
- Exercise 3: Git Workflows. You’ll fork our hello-numerical-world repository, create a feature branch, and make a pull request.
- Exercise 7a: Agile Redux. You’ll create epic, story, and task issues for the refactoring task and track them on a kanban board.
- Exercise 7b: Refactoring Part 1. You’ll perform a small, well-defined refactoring exercise.
- Exercise 7c: Refactoring Part 2. You’ll perform a more open-ended refactoring exercise.
- Additional Exercise: Continuous Integration. You’ll establish a simple continuous integration workflow and then refine it, adding code coverage assessment.
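The refactoring and continuous integration exercises both depend on tests that confirm the code still behaves the same after it changes. The sketch below shows one way such a check might look in Python: a characterization test that compares a refactored routine against the version it replaces. The functions `update_legacy` and `update_refactored` are hypothetical stand-ins invented for this example; they are not taken from the tutorial repository.

```python
def update_legacy(u, r):
    """Hypothetical inherited routine: correct, but awkward to read."""
    n = len(u)
    out = [0.0] * n
    i = 0
    while i < n:
        if i == 0 or i == n - 1:
            out[i] = u[i]  # boundary values are copied unchanged
        else:
            out[i] = u[i] + r * u[i - 1] - 2.0 * r * u[i] + r * u[i + 1]
        i += 1
    return out


def update_refactored(u, r):
    """Hypothetical refactored routine with the same intended behavior."""
    if len(u) < 2:
        return list(u)
    interior = [u[i] + r * (u[i - 1] - 2.0 * u[i] + u[i + 1])
                for i in range(1, len(u) - 1)]
    return [u[0]] + interior + [u[-1]]


def test_refactoring_preserves_behavior():
    """Characterization test: old and new code must agree on sample inputs."""
    samples = [
        [0.0, 0.0, 0.0, 1.0],
        [1.0, 0.5, 0.25, 0.125, 0.0],
        [2.0, -1.0, 3.0],
    ]
    for u in samples:
        for r in (0.1, 0.25, 0.5):
            old = update_legacy(u, r)
            new = update_refactored(u, r)
            assert len(old) == len(new)
            # Allow for floating-point rounding differences between the
            # two algebraically equivalent formulas.
            for a, b in zip(old, new):
                assert abs(a - b) <= 1e-12


if __name__ == "__main__":
    test_refactoring_preserves_behavior()
    print("refactored update matches the legacy update")
```

A test runner such as pytest would collect the `test_` function automatically, so a continuous integration workflow could run a check like this on every pull request. The comparison uses a small tolerance rather than exact equality because reordering floating-point arithmetic can change the last few bits of a result.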
Stay in Touch
After the tutorial, please feel free to email questions or feedback to the BSSw tutorial team at email@example.com.
If you want to do the hands-on exercises on your own, we’re happy to provide feedback on your pull requests.
To find out about future events organized by the IDEAS Productivity Project, you can subscribe to our mailing list (usually ~2 messages/month).
For monthly updates on the Better Scientific Software site, subscribe to our monthly digest.
Resources from Presentations
These links appear in the tutorial presentations and are collected here for easier access.
- Module 0: Introduction
- Module 1: Motivation and Overview of Best Practices in HPC Software Development
- Module 2: Agile Methodologies
  - Agile Manifesto
  - A-team tools for Agile practices
  - Policies: A Code of Conduct for Open Source Projects
- Module 3: Git Workflows
- Module 4: Software Testing 1
  - Python Build and Test Framework: pyscaffold.org
  - Build-Link-Test CMake Framework: llnl-blt.readthedocs.io
  - Static Source Analysis (C++): clang-tidy
  - Static Source Analysis (Python): flake8 and pylint
  - Code Coverage Webservices: codecov and coveralls
  - Tutorials for code coverage: Online Tutorial, Another example
  - Development Practices Survey Article
- Module 5: Software Design
- Module 6: Software Testing 2
  - Useful How-to resources on test and test suites on ideas-productivity.org
  - Related Articles: 1, 2
- Module 7: Refactoring
- Module 8: Reproducibility
  - Floating Point Analysis Tools
  - Code Ocean (Cloud platforms - publish and reproduce research code and data)
  - DOIs and hosting of data, code, documents:
  - National Science Foundation Data Management Plan Requirements
  - SC20 Transparency and Reproducibility Initiative
  - ACM Transactions on Mathematical Software (TOMS)
  - ACM Artifact Review and Badging
  - National Information Standards Organization (NISO) on Reproducibility and Badging
  - The FAIR Guiding Principles for Scientific Data Management and Stewardship. Mark D. Wilkinson, et al. 2016
  - Editorial: ACM TOMS Replicated Computational Results Initiative. Michael A. Heroux. 2015
  - Simple experiments in reproducibility and technical trust by Mike Heroux and students (work in progress)
- Module 9: Summary
  - COVID-19 epidemiology saga
  - Productivity and Sustainability Improvement Planning
  - Better Scientific Software web site
  - COVID-19 epidemiology saga
- Additional Module: Continuous Integration
Requested Citation
The requested citation for the overall tutorial is: David E. Bernholdt, Anshu Dubey, Rinku K. Gupta, and David Rogers, Better Scientific Software tutorial, in Improving Scientific Software, online, 2021. DOI: 10.6084/m9.figshare.14256257.
Individual modules may be cited as Speaker, Module Title, in Better Scientific Software tutorial…
This tutorial is produced by the IDEAS Productivity project.
This work was supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.