Better Software for Reproducible Science

a tutorial presented at

The International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC25)

on 8:30 am - 12:00 pm CST Sunday 16 November 2025

Presenters: David E. Bernholdt (Oak Ridge National Laboratory) and Anshu Dubey (Argonne National Laboratory)

This page provides detailed information specific to the tutorial event above. Expect updates to this page up to, and perhaps shortly after, the date of the tutorial. Pages for other tutorial events can be accessed from the main page of this site.

Quick Links

Presentation Slides (FigShare)

Description

Producing scientific software is a challenge. The high-performance modeling and simulation community, in particular, faces the confluence of disruptive changes in computing architectures and new opportunities (and demands) for greatly improved simulation capabilities, especially through coupling physics and scales. Simultaneously, computational science and engineering (CSE), as well as other areas of science, are experiencing an increasing focus on scientific reproducibility and software quality. Large language models (LLMs), can significantly increase developer productivity through judicious off-loading of tasks. However, models can hallucinate, therefore it is important to have a good methodology to get the most benefit out of this approach.

In this tutorial, attendees will learn about practices, processes, and tools to improve the productivity of those who develop CSE software, increase the sustainability of software artifacts, and enhance trustworthiness in their use. We will focus on aspects of scientific software development that are not adequately addressed by resources developed for industrial software engineering, offering a strategy for the responsible use of LLMs to enhance developer productivity in the context of scientific software development, incorporating testing strategies for the generated code, and discussing reproducibility considerations in the development and use of scientific software.

Agenda

Time (CST)	Title	Presenter
8:30 AM	Introduction	David E. Bernholdt (ORNL)
8:45 AM	Motivation and Overview of Best Practices in HPC Software Development	David E. Bernholdt (ORNL)
9:15 AM	Improving Reproducibility Through Better Software Practices	David E. Bernholdt (ORNL)
10:00 AM	Morning break
10:30 AM	Responsible Software Development with LLMs	Anshu Dubey (ANL)
12:00 PM	Adjourn

Presentation Slides

The latest version of the slides will always be available at https://doi.org/10.6084/m9.figshare.30394186.

Note that these files may include additional slides that will not be discussed during the tutorial, but questions are welcome.

How to Participate

We want to interact with you! We find these tutorials most interesting and informative (for everyone) if you ask questions and share experiences! We learn too!
If you’re in the room with us, please raise your hand to ask a question. Please wait for the microphone so that others in the room as well as online participants can hear your question.
If you’re online, please use the Digital Experience Q&A tool to ask your question. We’ll respond verbally or in the Q&A tool or verbally as opportunities permit. Please remember that the online stream is delayed about 15 seconds, so there will be some latency in our responses.

Hands-On Activities

Introduction

The hands-on activity for this tutorial involves the use of a large language model (LLM) to generate tests and code according to specifications (prompts) you will develop. Participation in these activities is encouraged, but not required. After interested participants have been given some time to try the exercise on their own, the instructor will review their prompts and the resulting code with the class and these materials will be made available to all participants.

You can participate in the hands-on section in two modes: using the LLM’s web interface, or using CodeScribe, a tool that enables using chat-completion through the API interface of the LLM. The main objectives of the hands-on can be met by using the web interface. The advantage of using CodeScribe is to get exposure to the chat-completion technique, and getting to know a tool that can be very handy for writing code.

The instructor will work in the C language, but a surface understanding of the language will suffice to follow the explanations. For your own work, you can prompt the LLM to generate code in the language of your choice. Evaluation of the generated code (and revising the prompt accordingly) will be part of the hands-on activity. For the purposes of the tutorial, inspection of the code will suffice to gauge its appropriateness, but if you wish to more rigorously validate the generated code, you will need to be able to access an environment to build and run it either locally or remotely.

Advance preparation

If you wish to participate in the hands-on activities, we strongly encourage you to do a bit of preparation before you leave for St. Louis. This is especially true if you want to use CodeScribe, which may require some advance interaction with the tool developers to integrate new LLM APIs.

Preparation for using the LLM web interface

You will need access to an LLM chat tool. The instructor will be using ChatGPT, but any comparable LLM, including institutionally-supported tools, should work.
(OPTIONAL) If you wish to build and run the code generated by the LLM, you will need access (local or remote) to an appropriate environment. As stated above, you can use whatever language you prefer for your hands-on activity. The instructor will work in C.

Preparation for using CodeScribe

It is important that you do this preparation with enough lead time that we can assist you if necessary before SC25 starts. We will not be able to provide any support for CodeScribe setup issues during the tutorial itself.

You will need API access to an LLM tool. This is a level beyond the web interface and it incurs an extra charge on some platforms. However many institutionally-supported LLMs offer API access at no additional cost. You will be responsible for any additional costs incurred. The instructor will be using ChatGPT, but any comparable LLM should work. CodeScribe also supports several freely downloadable models which can be run locally:
The first two models are smaller and may be faster on laptops. Download instructions can be found here: https://huggingface.co/docs/hub/en/models-downloading. Note that if you’re downloading Hugging Face models using their CLI, they will be downloaded to their cache folder (~/.cache/huggingface/hub/<model-name>/snapshots/<sha1> or your operating system’s equivalent). If you use git clone you can put the models wherever you want.
CodeScribe is a Python code, so you will need a working Python installation on a system that you will be able to access (local or remote) during the tutorial.
Download and install CodeScribe from https://github.com/adubey64/CodeScribe a. Installation instructions are provided in the README file: https://github.com/adubey64/CodeScribe?tab=readme-ov-file#installation b. You are encouraged to watch the two tutorials on the installation and use of CodeScribe in this Box folder: https://anl.app.box.com/folder/336154643880?s=zv3zdbphqprdz8rjh1c84xpeqd8yg32u. They are 19 minutes and 11 minutes long, respectively. The tutorials were prepared before “generate” command was added to the tool, so there is no mention of it in either tutorial, but it works like the “translate” command.
You will need to integrate your CodeScribe installation with the API of your LLM of choice. Basic instructions are provided in the README file: https://github.com/adubey64/CodeScribe?tab=readme-ov-file#integrating-llm-of-choice. It may be necessary for you to add support for your particular model to CodeScribe. You should look at the file https://github.com/adubey64/CodeScribe/blob/development/code_scribe/lib/_llm.py, copy the class most closely resembling the target model and create a pull request in our repository. With enough lead time we will try our best to help make it work.
(OPTIONAL) If you wish to build and run the code generated by the LLM, you will need access (local or remote) to an appropriate environment. As stated above, you can use whatever language you prefer for your hands-on activity. The instructor will work in C.

During the tutorial

Prompts

We provide the instructor’s prompts as examples, but to really get the most out of this hands-on exercise, you really should develop your own set of prompts, not just paste ours into your LLM.

prompts.toml

Generated code

The full set of generated code, along with the Makefile and some test inputs and outputs can be downloaded as a ZIP file.

The individual files are:

If you’re interested in this tutorial, you might be interested in some of the other software-related events taking place in the SC25 conference.

Stay in Touch

After the tutorial please feel free to email questions or feedback to the BSSw tutorial team at bssw-tutorial@lists.mcs.anl.gov.
To find out about future events organized by the IDEAS Productivity Project, you can subscribe to our mailing list (usually ~2 messages/month).
For monthly updates on the Better Scientific Software site, subscribe to our monthly digest.

Resources from Presentations

Links from the tutorial presentations are listed here for convenience

Module 1: Introduction
Module 2: Motivation and Overview of Best Practices in HPC Software Development
Module 3: Improving Reproducibility Through Better Software Practices
- Toward a Compatible Reproducibility Taxonomy for Computational and Computing Sciences
- Reproducibility and Replicability in Science
- Many Psychology Findings Not As Strong As Claimed
- The War Over Supercooled Water
- Researchers find bug in Python Script may have affected hundreds of studies
- National Science Foundation Data Management Plan Requirements
- Introducing the FAIR Principles for research software
- FAIR Principles for Data
- ACM Reproducible Computational Results
- ACM Artifact Review and Badging
- http://fursin.net/reproducibility.html
- National Information Standards Organization (NISO) on Reproducibility and Badging
- HPC and the Lab Manager
- Writing the Laboratory Notebook
- DIKW pyramid
- Helpful Tools
  - Floating Point Analysis Tools
  - Code Ocean (Cloud platforms - publish and reproduce research code and data)
  - DOIs and hosting of data, code, documents:
    - Zenodo
    - FigShare
- Other Resources:
  - The FAIR Guiding Principles for Scientific Data Management and Stewardship. Mark D. Wilkinson, et al. 2016
  - FAIR4RS (previously linked to ~~www.rd-alliance.org/groups/fair-research-software-fair4rs-wg~~)
  - Editorial: ACM TOMS Replicated Computational Results Initiative. Michael A. Heroux. 2015
  - Enhancing Reproducibility for Computational Methods
  - Simple experiments in reproducibility and technical trust by Mike Heroux and students (work in progress)
  - What every scientist should know about floating-point arithmetic. David Goldberg.
- Jupyter4Science: Better Practices for Using Jupyter Notebooks for Science by Nicole Brewer
Module 4: Responsible Software Development with LLMs
- none

Requested Citation

The requested citation the overall tutorial is:

David E. Bernholdt and Anshu Dubey, Better Software for Reproducible Science tutorial, in The International Conference for High-Performance Computing, Networking, Storage, and Analysis (SC25), St. Louis, Missouri, 2025. DOI: 10.6084/m9.figshare.30394186.

Individual modules may be cited as Speaker, Module Title, in Better Software for Reproducible Science tutorial…

Acknowledgements

This tutorial is produced by the Consortium for the Advancement of Scientific Software (CASS).

This work was supported by the U.S. Department of Energy, Office of Science, Office of Advanced Scientific Computing Research, Next-Generation Scientific Software Technologies (NGSST) and Scientific Discovery through Advanced Computing (SciDAC) programs.

This work was supported by the U.S. Department of Energy Office of Science, Office of Advanced Scientific Computing Research (ASCR), and by the Exascale Computing Project (17-SC-20-SC), a collaborative effort of the U.S. Department of Energy Office of Science and the National Nuclear Security Administration.