Article for bssw

betterscientificsoftware · Sep 19, 2024 · 3704154 · 3704154
1 parent 48f6aaf
commit 3704154
Showing 1 changed file with 38 additions and 0 deletions.
diff --git a/Articles/Blog/2024-11-documentation-psip.md b/Articles/Blog/2024-11-documentation-psip.md
@@ -0,0 +1,38 @@
+Enhancing the stability and impact of containerized environments through documentation improvement
+
+Scientific software requires complex sets of dependencies on a wide variety of platforms, which makes it a challenge to create a standard environment for development, testing, and deployment.  These three phases of software creation often occur on separate machines and at different times.  Testing and deployment are ideally automated processes that run on dedicated systems.  Development usually takes place on a user's machine, and that machine may need to have multiple software environments available.  Containers can help with these challenges: they provide a standard environment on all of these machines, and a developer can switch between environments easily as needed.  They therefore have the potential to make a great impact on standardized software environments, for both reproducibility and portability.  We are trying to provide turn-key environment containers that look and feel like the systems project teams are used to, so it was important for the container environment to closely match the environment on the current development, build, and test systems.  Users are accustomed to using environment modules to get into their desired environment.  The process for building typically looks something like this:
+
+* log into the target system (usually for specific OS)
+* Load the environment (source a script, load modules, set environment variables, etc.)
+* configure (run a script, call cmake, etc.)
+* build (make, ninja)
+* test
+
+With the containerized environments we want to interrupt this workflow as little as possible while still getting the reliability, reproducibility, mobility, and performance that projects need.  The idea is to provide containers that fit easily into existing workflows and require minimal changes to a project’s infrastructure.  The containers we are building are targeted at the OS/system/environment level.  For a project to use our containers the workflow looks like:
+
+* Log into any system that runs podman (instead of logging into the target system for a specific OS)
+* Launch the environment container (instead of loading the environment by sourcing a script, loading modules, setting environment variables, etc.)
+* configure (run a script, call cmake, etc.)
+* build (make, ninja)
+* test
+
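As a rough sketch of the containerized workflow above, the snippet below composes one `podman run` command per step, so that the configure, build, and test commands stay the same and only the environment setup changes. The image name, the `/work` mount point, the CMake commands, and the use of a login shell (`bash -lc`) to pick up the container's environment are all assumptions made for illustration; the project's actual images and scripts are not described here. The commands are only printed, not executed.

```python
import shlex
from pathlib import Path

# Hypothetical image name; the real registry and image names are not given in the article.
IMAGE = "registry.example.com/sems/env:latest"


def podman_command(step: str, source_dir: Path, image: str = IMAGE) -> list[str]:
    """Build a `podman run` invocation that runs one workflow step in the container."""
    src = Path(source_dir).resolve()
    return [
        "podman", "run", "--rm",
        "--volume", f"{src}:/work",  # bind-mount the project checkout (assumed mount point)
        "--workdir", "/work",
        image,
        # Assumption: a login shell is enough to set up the environment in the container.
        "bash", "-lc", step,
    ]


# Stand-ins for a project's existing configure, build, and test commands.
STEPS = [
    "cmake -S . -B build",
    "cmake --build build",
    "ctest --test-dir build",
]

for step in STEPS:
    # Print the command; passing the list to subprocess.run(...) would execute it.
    print(shlex.join(podman_command(step, Path("."))))
```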
+In this way a project does not need to change its configure and build scripts to use the containers.  The only significant change is in setting up the environment.  The containers are made to match our bare metal installations, which are deployed across systems via shared disks.  Users access the bare metal installations through environment modules.  To make the above workflows equivalent, the way the modules set the environment inside the container had to match exactly the way it is set on the existing bare metal installations that the projects are already using.  We embarked on some research to figure out how to create containers that have the required software environment for our codes and that are easy to use and easy to port to many platforms.  The initial effort was exploratory, informal, and mainly conducted by one developer.  The bare metal installations are already installed with Spack, so the obvious path to follow was to use Spack inside of containers.  We were able to use the same Spack configuration files in the containers as we use on the host machines.  To test our progress in recreating the existing environment, we needed to build a code in a container using the same scripts that were used on bare metal.  While it made sense for the developers to be the first to exercise these container capabilities, it was clear that we needed.  In the end we were able to build our entire software environments inside containers and to make them available via internal container registries.  In addition, we developed infrastructure that allows projects to build containers tailored to exactly their needs, which can reduce the size of the containers.  More details of our container environment project were presented at NLIT (National Laboratories Information Technology) in 2024, and the slides can be found here: 2024_04_NLIT_SEMS_Containers.
+
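One way to check that a container reproduces the environment set by the modules on bare metal is to capture the output of `env` in both places and compare the two. The sketch below is a hypothetical check written for illustration, not the tooling used in the project; the sample variable values and the list of ignored variables are made up, and values that span multiple lines are not handled.

```python
def parse_env(dump: str) -> dict[str, str]:
    """Parse the output of `env` (one NAME=value per line) into a dict."""
    env = {}
    for line in dump.splitlines():
        name, sep, value = line.partition("=")
        if sep:  # skip lines that are not NAME=value
            env[name] = value
    return env


# Variables expected to differ between hosts (an assumed, illustrative list).
IGNORED = frozenset({"HOSTNAME", "PWD", "SHLVL", "_"})


def diff_env(bare_metal: dict[str, str], container: dict[str, str],
             ignore: frozenset = IGNORED) -> dict[str, tuple]:
    """Return {name: (bare_metal_value, container_value)} for variables that differ."""
    names = (bare_metal.keys() | container.keys()) - ignore
    return {
        name: (bare_metal.get(name), container.get(name))
        for name in sorted(names)
        if bare_metal.get(name) != container.get(name)
    }


# Made-up sample dumps, for example from `env` after loading the same modules.
bare = parse_env("PATH=/opt/sems/bin:/usr/bin\nCC=gcc\nHOSTNAME=login1")
cont = parse_env("PATH=/opt/sems/bin:/usr/bin\nCC=clang\nHOSTNAME=abc123")

for name, (host_value, container_value) in diff_env(bare, cont).items():
    print(f"{name}: bare metal={host_value!r} container={container_value!r}")
```

An empty result would indicate that, apart from the ignored variables, the two environments agree.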
+
+In addition to containerizing scientific software environments, we were also interested in improving our processes, including integrating the generation of project documentation into the daily workflow.  The overall effort for the containerized environment project was small and relatively short-lived, with one developer working at 20% for a few months.  This made it difficult to devote a large amount of time to documentation.  While it was important to capture the state of the investigation, we could not afford to slow the development process.  To help us improve the documentation we were inspired by some of the core ideas of Productivity and Sustainability Improvement Planning (PSIP), although we did not follow PSIP in a highly rigorous way.  We created a progress tracking card (PTC) and revisited it a couple of times to keep on track.  The process essentially provides a structured but lightweight way to have conversations about improvements we wanted to make in the project.  It was a way to break an abstract goal into concrete, achievable steps.  This process helped simply by making documentation a point of conversation during project meetings.  One challenge of integrating documentation into the developer workflow is determining what level of documentation is appropriate.  The goal of having "good documentation" is both uncontroversial and too vague to be useful.  We had to think about what kinds of documentation were needed, who the audience is for each type of documentation, and how large that audience is, among other factors.  When considering this, it becomes clear that there are a lot of things that one can document about a software project.  You can imagine different documents with names like “How to use this software”, “How the software actually works”, “Why we need this software”, and “Why the developers made all the decisions that they made”.  All of these would have different audiences and would document very different things about the project.  In the case of a mature, well-used project, it seems appropriate to want all of those kinds of documentation.  For a small effort such as ours we needed to document the project, but it was not clear what was going to end up being fruitful or important.  Another challenge is that the developer on the project tends to get lost in the exploration and solving of problems and to leave documentation to an unspecified “later date”.  This means that details are lost and there is no clear record of how the project came to be in its current state, why decisions were made, and what else was tried and did not pan out.  Given this situation, it seemed reasonable to first try a lab notebook style of documentation, implemented in a very lightweight way, just to make improvements.  The practice we adopted was to have a wiki open in one window and the terminal in the other.  As problems were encountered, we could copy and paste code snippets, notes, and commands run on the CLI into the wiki to create a record of the process of investigating how to best use containers for our code teams.  Every month a new wiki page was created to give some idea of chronology to the documents.
+
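The notes described above were kept by hand in a wiki, but the same lab notebook habit can be sketched in a few lines of code. The helper below is a hypothetical illustration, not something used in the project: it appends a timestamped note, with an optional command or code snippet, to one Markdown page per month. The function name, file layout, and entry format are all assumptions.

```python
import tempfile
from datetime import datetime
from pathlib import Path

FENCE = "`" * 3  # Markdown code fence, built this way to avoid nesting fences here


def log_entry(wiki_dir, note: str, snippet: str = "", now=None) -> Path:
    """Append a timestamped note (and optional snippet) to this month's page."""
    now = now or datetime.now()
    page = Path(wiki_dir) / f"{now:%Y-%m}.md"  # one page per month, e.g. 2024-09.md
    lines = [f"## {now:%Y-%m-%d %H:%M}", "", note, ""]
    if snippet:
        lines += [FENCE, snippet.rstrip("\n"), FENCE, ""]
    page.parent.mkdir(parents=True, exist_ok=True)
    with page.open("a", encoding="utf-8") as handle:
        handle.write("\n".join(lines) + "\n")
    return page


# Demo in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as wiki:
    page = log_entry(wiki, "Build failed inside the container; retrying after a rebuild.",
                     snippet="podman images")
    print(page.name)
    print(page.read_text(encoding="utf-8"))
```

Appending rather than rewriting keeps the notes chronological, which matches the intent of starting a new page every month.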
+
+This is a very lightweight and very rough form of documentation that I suspected at first would not be very useful.  However, I almost immediately realized it was useful in several ways.
+
+1) By creating such documents, I no longer had to keep all the information and experience in my active memory.  It became a reference document nearly as soon as I created it.  I was able to come back and recall details that I otherwise would have missed.  The time cost of creating these lightweight notes was quickly recovered by the time it saved as a reference document for myself.
+
+2) These documents added depth and context to discussions about the project with other people.  I was able to show exactly what had been tried, what had failed, and what had succeeded.  This led to richer and more productive conversations about the project and how we should proceed.
+
+3) The documentation allowed for much better collaboration.  For example, we were able to point a summer intern at these documents and he was able to get up to speed enough to use our containers and infrastructure as part of his summer project.  These documents were not only useful for me but without any editing were useful for others trying to contribute.
+
+4) Even in situations where more complete, clean, and detailed documentation is eventually needed, the rough notes provide a great starting point to work from and decrease the risk of omitting details, compared to going back and creating documentation from scratch after the fact.
+
+Going into the project I knew that my process for documenting my work could be improved.  I had always thought that useful documentation would take a significant amount of effort, so I was surprised by how beneficial this level of documentation turned out to be and how quickly it became valuable to me and others.  In documentation I usually let the perfect be the enemy of the good, or even the enemy of any documentation at all.  I tend to think that creating documentation requires a lot of time and effort and that it quickly gets out of date without a sustained effort.  Without the time to commit to perfect documentation, I delay the creation of documentation until the project is "ready".  I had assumed that the threshold for useful documentation was higher than my experience with this project demonstrated.  The addition of very rough, quick, and unedited documentation to the workflow created a great benefit to the project overall.  Writing down even a little helps a lot, even if more detail is required later.
+
+
+But old habits die hard, and acquired skills are sometimes hard to integrate.  In this case the good news is that some of these lessons have stuck around.  Having realized just how low the barrier for usefulness is when it comes to documentation, and having seen how helpful it can be for someone following along, I do find myself documenting more frequently.  I don't keep the "notebook wiki" open all the time, but I do have a sharper eye for when documentation will be impactful for someone else and an understanding that it does not have to be formal and rigorous to be helpful.  I am much more likely to log my process step by step on GitLab/GitHub/Jira tickets when I know other people may come to a similar problem later.