Better Scientific Software
a tutorial presented at
ISC High Performance
on Sunday 21 May 2023, 2:00 pm - 6:00 pm CEST (UTC+2)
Presenters: Anshu Dubey (Argonne National Laboratory) and David M. Rogers (Oak Ridge National Laboratory)
This page provides detailed information specific to the tutorial event above. Expect updates to this page up to, and perhaps shortly after, the date of the tutorial. Pages for other tutorial events can be accessed from the main page of this site.
Quick Links
- Presentation Slides (FigShare)
On this Page
- Description
- Agenda
- Presentation Slides
- Other Software-Related Events at ISC23
- Stay in Touch
- Resources from Presentations
- Requested Citation
- Acknowledgements
Description
The computational science and engineering (CSE) community is in the midst of an extremely challenging period created by the confluence of disruptive changes in computing architectures, demand for greater scientific reproducibility, and new opportunities for greatly improved simulation capabilities, especially through coupling physics and scales. Computer architecture changes require new software design and implementation strategies, including significant refactoring of existing code. Reproducibility demands require more rigor across the entire software endeavor. Code coupling requires aggregate team interactions including integration of software processes and practices. These challenges demand large investments in scientific software development and improved practices. Focusing on improved developer productivity and software sustainability is both urgent and essential.
This tutorial will provide information and hands-on experience with software practices, processes, and tools explicitly tailored for CSE. Goals are improving the productivity of those who develop CSE software and increasing the quality and sustainability of software artifacts. We discuss practices that are relevant for projects of all sizes, with emphasis on complex workflows and reproducible science. Topics include software design, effective models, tools, complex workflows, computational experiments, software testing (including automated testing, legacy code testing, and continuous integration), and software packaging.
Agenda
Time (CEST) | Title | Presenter |
---|---|---|
2:00 PM | Introduction | Anshu Dubey (ANL) |
2:05 PM | Motivation and Overview of Best Practices in HPC Software Development | Anshu Dubey (ANL) |
2:25 PM | Scientific Software Design | Anshu Dubey (ANL) |
3:00 PM | Testing and Continuous Integration | David M. Rogers (ORNL) |
4:00 PM | Break | |
4:30 PM | Software Packaging - Condensed Version | David M. Rogers (ORNL) |
5:00 PM | Collaborative Software Development | David M. Rogers (ORNL) |
5:30 PM | Lab Notebooks for Computational Mathematics, Sciences, & Engineering | Anshu Dubey (ANL) |
5:50 PM | Summary | Anshu Dubey (ANL) |
6:00 PM | Adjourn | |
Presentation Slides
The latest version of the slides will always be available at https://doi.org/10.6084/m9.figshare.22790762.
Note that these files may include additional slides that will not be discussed during the tutorial, but questions are welcome.
Other Software-Related Events at ISC23
If you’re interested in this tutorial, you may also want to see this list of other software-related events taking place at the ISC23 conference.
Stay in Touch
- After the tutorial please feel free to email questions or feedback to the BSSw tutorial team at bssw-tutorial@lists.mcs.anl.gov.
- To find out about future events organized by the IDEAS Productivity Project, you can subscribe to our mailing list (usually ~2 messages/month).
- For monthly updates on the Better Scientific Software site, subscribe to our monthly digest.
Resources from Presentations
Links from the tutorial presentations are listed here for convenience.
- Module 1: Introduction
- Module 2: Motivation and Overview of Best Practices in HPC Software Development
- COVID-19 epidemiology saga
- https://doi.org/10.25561/77482
- https://www.nicholaslewis.org/imperial-college-uk-covid-19-numbers-dont-seem-to-add-up/
- https://www.nature.com/articles/d41586-020-01003-6
- https://www.foxnews.com/world/imperial-college-britain-coronavirus-lockdown-buggy-mess-unreliable
- https://www.telegraph.co.uk/technology/2020/05/16/coding-led-lockdown-totally-unreliable-buggy-mess-say-experts/
- https://github.com/mrc-ide/covid-sim/
- https://philbull.wordpress.com/2020/05/10/why-you-can-ignore-reviews-of-scientific-code-by-commercial-software-developers/amp/
- http://doi.org/10.5281/zenodo.3865491
- Best Practices for Scientific Computing
- Good Enough Practices in Scientific Computing
- Linux Foundation Core Infrastructure Initiative (CII) Best Practices Badging Program
- Rate Your Project Assessment Tool
- Progress Tracking Card (PTC) Examples
- Productivity and Sustainability Improvement Planning
- Better Scientific Software (BSSw)
- Module 3: Scientific Software Design
- The Exascale Computing Project (ECP)
- Findings from the ECP Performance Portability Panel Series
- Performance Portability and the Exascale Computing Project
- Kokkos Lecture Series
- Related paper: A Design Proposal for a Next Generation Scientific Software Framework
- Related webinar: Software Design for Longevity with Performance Portability
- Module 4: Testing and Continuous Integration
- A Holistic Algorithmic Approach to Improving Accuracy, Robustness, and Computational Efficiency for Atmospheric Dynamics
- Useful resources on testing (formerly linked to ideas-productivity.org/resources/howtos/)
- Python Build and Test Framework: pyscaffold.org
- Build-Link-Test CMake Framework: llnl-blt.readthedocs.io
- https://github.com/bssw-tutorial/hello-numerical-world
- Tutorials for code coverage: Online Tutorial, Another example
- Lcov (formerly linked to ltp.sourceforge.net/coverage/lcov.php)
- CI/CD Introduction
- Joint Center for Satellite Data Assimilation (JEDI) documentation
- https://github.com/CompFUSE/DCA
- https://docs.docker.com/build/ci/
- https://spack.readthedocs.io/en/latest/containers.html
- https://supercontainers.github.io/sc20-tutorial/07.spack/index.html
- https://docs.github.com/en/actions/using-jobs/running-jobs-in-a-container
- Understanding the risk of script injections
- https://github.com/ECP-WarpX/WarpX
- https://cristianadam.eu/20200113/speeding-up-c-plus-plus-github-actions-using-ccache/
- Hints from the front lines
- Good ideas and idioms from across developer spaces
- DCA++: A software framework to solve correlated electron problems with modern quantum cluster methods
- Ginkgo: A Modern Linear Operator Algebra Framework for High Performance Computing
- https://warpx.readthedocs.io/
- An Automated Performance Evaluation Framework for the GINKGO Software Ecosystem
- Team Experiences Implementing Continuous Integration
- Module 5: Software Packaging - Condensed Version
- Simple Heat Equation Example
- https://github.com/frobnitzem/lib0
- https://code.ornl.gov/99R/mpi-test
- https://cmake.org/cmake/help/git-stage/manual/cmake-packages.7.html#creating-packages
- https://supercontainers.github.io/sc20-tutorial/
- https://fluid-run.readthedocs.io/en/latest/HowTo/setup_your_repo.html
- https://fastapi.tiangolo.com/deployment/docker/#build-a-docker-image-for-fastapi
- https://supercontainers.github.io/sc20-tutorial/02.docker/index.html
- https://cloud.sylabs.io/builder
- Article on CI team practices
- https://github.com/bssw-tutorial/simple-heateq
- https://pyscaffold.org/
- https://python-poetry.org/docs/pyproject
- Fortran resources
- https://spack.readthedocs.io
- https://github.com/mpbelhorn/olcf-spack-environments/blob/develop/hosts/frontier/envs/base/spack.yaml
- https://spack.readthedocs.io/en/latest/packaging_guide.html#dependency-specs
- Intermediate example: C++ with spack
- https://spack-tutorial.readthedocs.io/en/latest/tutorial_packaging.html
- https://spack.readthedocs.io/en/latest/spack.util.html#module-spack.util.prefix
- https://spack.readthedocs.io/en/latest/packaging_guide.html#accessing-dependencies
- DCA++
- https://github.com/pyscf/extension-template
- ZFP
- https://github.com/ECP-copa/Cabana
- https://cmake.org/cmake/help/latest/guide/tutorial/index.html
- https://cmake.org/cmake/help/latest/command/add_test.html
- Module 6: Collaborative Software Development
- Design Patterns for Git Workflows
- Git Flow (Driessen’s Original Blog)
- GitHub Flow (previously linked to scottchacon.com/2011/08/31/github-flow.html)
- GitLab Flow (previously linked to docs.gitlab.com/ee/topics/gitlab_flow.html)
- Agile Manifesto
- How to code review in a Pull Request
- Investing in Code Reviews for Better Research Software (previously linked to ideas-productivity.org/events/hpc-best-practices-webinars/#webinar068)
- Testing and Code Review Practices in Research Software Development (previously linked to ideas-productivity.org/events/hpc-best-practices-webinars/#webinar044)
- Open Source Initiative
- Choose an Open Source License Tool
- Introduction to Software Licensing (previously linked to ideas-productivity.org/events/hpc-best-practices-webinars/#webinar024)
- Trilinos
- Open MPI
- FleCSI
- Module 7: Lab Notebooks for Computational Mathematics, Sciences, & Engineering
- Writing the Laboratory Notebook
- HPC and the Lab Manager
- What All Codes Should Do (ATPESC 2019)
- DIKW pyramid
- How to pick an electronic notebook
- Resources – Execution Environments
- ATPESC 2022 Laboratory Environment BSSw Tutorial
- Lab Notebooks for Computational Mathematics, Sciences & Engineering (previously linked to ideas-productivity.org/events/hpc-best-practices-webinars/#webinar070)
- Popper
- FlashKit
- Code Ocean
- Weights & Biases
- Module 8: Summary
Requested Citation
The requested citation for the overall tutorial is:
Anshu Dubey and David M. Rogers, Better Scientific Software tutorial, in ISC High Performance, Hamburg, Germany, and online, 2023. DOI: 10.6084/m9.figshare.22790762.
Individual modules may be cited as Speaker, Module Title, in Better Scientific Software tutorial…
Acknowledgements
This tutorial is produced by the IDEAS Productivity project.
This work was supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.