This lesson is still being designed and assembled (Pre-Alpha version)

Software Development with LLMs

This lesson presents a systematic approach to the productive and reliable use of generative artificial intelligence (AI), specifically large language models (LLMs), in scientific software development. The approach rests on a robust verification methodology, which this lesson explains.

This lesson provides a “hands-on” accompaniment to several presentation modules in the Better Scientific Software tutorial collection, including:

as well as the core of the tutorial module

The material in these different episodes is designed to be as independent as possible, but you will probably be best served by at least reading through (if not doing) all of the episodes.

We’re offering two separate “worked examples” – one based on a simple heat equation solver, and the other a non-numerical example based on mirror images of a particle at the boundary of a mesh. We hope that most of you will be comfortable with one or both. Since we’re using LLMs, you can implement them in any programming language for which your LLM can generate reasonable code.

Important Note

The capabilities of generative AI-based tools are evolving rapidly. This lesson represents a practical approach to using widely available technologies and tools in the 2025 time frame. It is highly likely that this material will not stand the test of time, although we hope to update it as the technology evolves.

Prerequisites

None

Schedule

Setup
    Download files required for the lesson
00:00 1. Background and Example Problems
    FIXME
00:00 2. Scientific Software Design with LLMs
    FIXME
00:00 3. Refactoring Scientific Software with LLMs
    FIXME
00:00 4. Scientific Software Development with LLMs
    FIXME
00:00 5. Improving the Reproducibility of Scientific Software with LLMs
    FIXME
00:00 6. Terminology and Background on Intellectual Property
    What is the primary form of intellectual property typically associated with software?
    What is the purpose of a license for software?
    At what point can you assert copyright over your software?
00:10 7. Why You Should Choose a License
    What are the two basic categories of software licenses?
    What are the benefits of specifying a license for your software?
00:36 8. What is Open Source?
    What organization is considered to be the arbiter of whether or not a license is open source?
    What are the ‘four freedoms’ by which the Free Software Foundation defines free (aka open-source) software?
    What is the difference between a permissive and a copyleft license?
    Is there a licensing scheme comparable to open-source for non-software works?
01:08 9. Why Choose Open Source Licensing?
    What are some of the reasons for preferring open-source licensing over proprietary?
    Does open-source licensing prevent you from making money off of your software?
    Does open-source licensing guarantee the sustainability of your software?
01:46 10. Choosing an Open Source License
    What are some of the reasons for going with an established open-source license instead of creating a new one?
    What are some of the most popular open-source licenses?
    Name a tool that can help with a more detailed understanding of common open-source licenses?
02:14 11. Documenting Your Choice of License
    What are the two basic strategies for documenting your choice of license?
    What information should you include in each file in your software?
02:34 12. Collaboration and Licensing
    What are the concerns with accepting code from collaborators?
    What mechanisms are there to ensure collaborators agree to license terms?
    What concerns are there with using code from online forums?
    Why are LLMs challenging for copyright and licensing?
03:06 Finish

The actual schedule may vary slightly depending on the topics and exercises chosen by the instructor.