DuraMAT/software_guide

Guidelines for open-source software development

Purposes

An open approach to knowledge and data sharing in science promotes more accurate verification of scientific results, reduces duplication of effort, and makes research more efficient through shared tools and applications. It also promotes inclusivity through more efficient knowledge creation, knowledge transfer, and public access.

This guide aims to help you share software products effectively by:

  • Sharing best practices in software dissemination

  • Saving time and effort in the dissemination process

  • Establishing some consistency across projects

  • Getting you (and DuraMAT) more credit for software products

Software products funded by DuraMAT

DuraMAT funds many projects that produce software products:

Project | Link
DuraMat data hub | https://datahub.duramat.org
PV Analytics | https://github.com/pvlib/pvanalytics
PV Degradation Tools | https://github.com/NREL/PVDegradationTools
PV Ops | https://github.com/sandialabs/pvOps
VocMax | https://github.com/toddkarin/vocmax
PV Climate Zones | https://github.com/toddkarin/pvcz
PV Vision | https://github.com/hackingmaterials/pv-vision
PV Tools | https://pvtools.lbl.gov/string-length-calculator
PV ARC thickness estimator | https://github.com/DuraMAT/pvarc
PV-terms | https://github.com/DuraMAT/pv-terms
Comparative LCOE calculator | https://www.github.com/NREL/PVLCOE
PV-Pro SDM parameter estimation | https://github.com/DuraMAT/pvpro
IV curve correction | https://github.com/DuraMAT/IVcorrection
WhatsCracking | https://datahub.duramat.org/dataset/whatscracking-application

DuraMAT supports open science and encourages software products to follow F.A.I.R. (findable, accessible, interoperable, reusable) practices whenever possible.

Overview of dissemination levels

The level of dissemination should depend on the purpose of the software

Level 1: Basic repository to support a document

Typically, a Level 1 repository is used to support and document published analyses for enhanced reproducibility, e.g., something akin to supporting information for a journal publication.

To build a Level 1 repository:

1.1 Get approval for code release

Follow your laboratory-specific guidelines for approval to release your code.

1.2 Add inline code documentation

Some notes:

  • The formatting of the docstring can depend on whether you are auto-converting the docstrings to HTML documentation

  • Common formatting examples include reST (reStructuredText), Google formatting, Epydoc, etc.

  • You can add type hints to further improve code readability and to enable static type checking tools

Example of inline code documentation:
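As a minimal sketch of the notes above, the function below combines a Google-style docstring with type hints. The function name `correct_power_for_temperature`, its parameters, and the default coefficient are hypothetical and chosen only for illustration; the other docstring styles mentioned above are equally valid.

```python
def correct_power_for_temperature(
    power_w: float,
    cell_temp_c: float,
    gamma_per_c: float = -0.004,
    ref_temp_c: float = 25.0,
) -> float:
    """Translate a measured module power to a reference temperature.

    Uses a linear temperature-coefficient model:
    ``P_ref = P / (1 + gamma * (T_cell - T_ref))``.

    Args:
        power_w: Measured power in watts.
        cell_temp_c: Cell temperature during the measurement, in deg C.
        gamma_per_c: Relative power temperature coefficient, in 1/deg C.
            The default is an illustrative value, not a recommendation.
        ref_temp_c: Reference temperature, in deg C.

    Returns:
        The power in watts corrected to ``ref_temp_c``.

    Raises:
        ValueError: If the correction factor is not positive.
    """
    factor = 1.0 + gamma_per_c * (cell_temp_c - ref_temp_c)
    if factor <= 0:
        raise ValueError("temperature correction factor must be positive")
    return power_w / factor
```

With the type hints in place, a static checker can flag a call such as `correct_power_for_temperature("90", 50.0)` before the code runs, and documentation generators can read the docstring sections directly.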

1.3 Add README

For an example, see the pv-terms README (https://github.com/DuraMAT/pv-terms/blob/master/README.md).

1.4 Add LICENSE

  • Talk to your lab’s IP / IT departments for guidance.

  • BSD/MIT licenses are examples of very “open” licenses that allow others to do what they’d like with the software. The BSD 3-Clause license typically gives somewhat more protection against others using your name to promote their product, e.g.:

    • “our commercial product uses LBL-approved software technology for its analysis”
    • “uses the same algorithms developed by the brilliant scientist <your_name_here>”
  • Be careful about choosing copyleft licenses that require all downstream code to also use the same license, e.g., GPL. (Apache 2.0 is a permissive license and does not carry this requirement, but see the note on patent language below.)

  • If you leave DuraMAT and work for a company, you may no longer be able to use your own code, as companies typically avoid any GPL code.

  • Some labs may actually discourage or ban some of these licenses because they contain patent-granting language (e.g., Apache 2.0 and GPL 3.0 for LBL).

  • If you really insist on these licenses, we suggest talking to the DuraMAT program (for impact on industry adoption) as well as your lab’s IPO.

Example of license:

BSD 3-Clause License

Copyright 2020-2023 Alliance for Sustainable Energy, LLC

Redistribution and use in source and binary forms, with or without modification, are permitted provided that the following conditions are met:

Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.

Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.

The name of the copyright holder, contributors, the United States Government, the United States Department of Energy, or any of their employees may not be used to endorse or promote products derived from this software without specific prior written permission.

THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER, CONTRIBUTORS, UNITED STATES GOVERNMENT OR UNITED STATES DEPARTMENT OF ENERGY, NOR ANY OF THEIR EMPLOYEES, BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.

(https://github.com/NREL/PV_ICE/blob/main/LICENSE.md)

Level 2: Repository to support an entire project

Typically, a Level 2 repository serves as documentation for the innovations of an entire project, e.g., across multiple publications. However, the repository may no longer be actively maintained after the project ends.

To build a Level 2 repository:

2.1 All Level 1 items

See “Level 1: Basic repository to support a document” above.

2.2 Ensure that DOE requirements are being met

In addition to lab-specific guidelines, ensure that DOE requirements are being met. For example, this likely includes:

  • A Software Record (recorded in OSTI.gov, which helps with reporting and credit)
  • Lab-specific approval to release code

2.3 Set up a public-facing GitHub repository

This could be hosted by the project organization, by your institution, or by your research lab. Examples include:

2.4 Additional README components

  • Screenshot or visual aid of the project
  • Current status of the project (testing use, production use, actively maintained, etc.)
  • Funding information and institutional branding (logo, funding acknowledgement text)

2.5 Add Contributor license agreement (CLA)

A CLA defines the terms under which intellectual property has been contributed to a company/project.

Example of CLA (the bottom of the LBL BSD-3 license):

You are under no obligation whatsoever to provide any bug fixes, patches, or upgrades to the features, functionality or performance of the source code ("Enhancements") to anyone; however, if you choose to make your Enhancements available either publicly, or directly to Lawrence Berkeley National Laboratory or its contributors, without imposing a separate written license agreement for such Enhancements, then you hereby grant the following license: a non-exclusive, royalty-free perpetual license to install, use, modify, prepare derivative works, incorporate into other computer software, distribute, and sublicense such enhancements or derivative works thereof, in binary and source code form.

(https://spdx.org/licenses/BSD-3-Clause-LBNL.html)

Another example: CLA signing instructions and a CLA file

2.6 Use a standard layout for the repository

You can look up standard project layouts for the programming language you are using. Some details of the layout may depend on the tools you are using for other tasks such as code distribution or continuous integration. An example is shown as:
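The layout example is not reproduced here, so the following is one possible sketch and not necessarily the layout this guide originally showed. It creates a common “src”-style Python layout in a temporary directory and prints the resulting tree; the package name `mypackage` and the exact file list are hypothetical.

```python
import tempfile
from pathlib import Path

# Hypothetical file list for a small Python package using a "src" layout.
LAYOUT = [
    "README.md",
    "LICENSE",
    "pyproject.toml",
    "src/mypackage/__init__.py",
    "src/mypackage/core.py",
    "tests/test_core.py",
    "docs/index.md",
]


def create_layout(root: Path, files=LAYOUT) -> list:
    """Create empty placeholder files under ``root``.

    Returns the sorted relative paths of all files that now exist there.
    """
    for relative in files:
        path = root / relative
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch()
    return sorted(
        p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()
    )


with tempfile.TemporaryDirectory() as tmp:
    created = create_layout(Path(tmp))
    print("\n".join(created))
```

Keeping the importable code under `src/`, the tests under `tests/`, and the documentation under `docs/` is a widely used convention, but the right layout depends on your packaging and CI tools.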

You can also use the cookiecutter and cruft packages to help:

  • The cookiecutter package will set up different package structures depending on your usage
  • The cruft package can help you keep things up to date as things change

You can get started with cookiecutter using two commands:

pip install cookiecutter

cookiecutter https://github.com/audreyfeldroy/cookiecutter-pypackage.git

2.7 Add a consistent versioning scheme

Examples include semantic versioning (v0.0.1) and date-based versioning (v2023.01.25); tools like versioneer may help.
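To make the two schemes concrete, the sketch below parses a version string into a tuple of integers and bumps one component. It is a hand-rolled illustration, not how versioneer works; the helper names `parse_version` and `bump` are hypothetical, and pre-release or build suffixes are deliberately not handled.

```python
import re

# Accepts "v0.0.1", "0.0.1", and date-based strings such as "v2023.01.25".
VERSION_PATTERN = re.compile(r"^v?(\d+)\.(\d+)\.(\d+)$")


def parse_version(text: str) -> tuple:
    """Parse a three-part version string into a tuple of integers."""
    match = VERSION_PATTERN.match(text)
    if match is None:
        raise ValueError(f"not a three-part version string: {text!r}")
    return tuple(int(part) for part in match.groups())


def bump(text: str, part: str) -> str:
    """Return the next semantic version after bumping ``part``."""
    major, minor, patch = parse_version(text)
    if part == "major":
        return f"v{major + 1}.0.0"
    if part == "minor":
        return f"v{major}.{minor + 1}.0"
    if part == "patch":
        return f"v{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown version part: {part!r}")


print(bump("v0.0.1", "patch"))
print(parse_version("v2023.01.25"))
```

Comparing the parsed tuples orders releases correctly under either scheme, e.g., `parse_version("v0.10.0") > parse_version("v0.9.0")`, whereas comparing the raw strings would not.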

2.8 Ensure your software is easy to install locally

Ensure your software is easy to install locally, including any necessary dependencies. For example, Python projects may include files such as setup.py or requirements.txt.

Files such as setup.py help users install the correct dependencies at the correct versions so that your software runs smoothly.
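One common pattern, sketched here under assumptions, is to keep version-constrained dependencies in requirements.txt and have setup.py read them into `install_requires`. The helper `read_requirements`, the package name, and the listed dependencies are hypothetical; the setuptools call appears only in a comment so that the snippet runs without side effects.

```python
def read_requirements(text: str) -> list:
    """Return requirement specifiers from requirements.txt-style text.

    Blank lines and ``#`` comments are skipped. This simple parser does not
    handle URL requirements that contain ``#`` fragments.
    """
    requirements = []
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()
        if line:
            requirements.append(line)
    return requirements


# Hypothetical contents of a requirements.txt file.
EXAMPLE_REQUIREMENTS = """
# numerical dependencies
numpy>=1.20
pandas>=1.3,<3   # upper bound guards against untested major releases
"""

install_requires = read_requirements(EXAMPLE_REQUIREMENTS)
print(install_requires)

# In a setup.py, this list would typically be passed on, for example:
#     setuptools.setup(name="mypackage", install_requires=install_requires)
```

Stating lower (and, where needed, upper) version bounds lets the installer resolve a working set of dependencies instead of leaving users to discover incompatibilities at run time.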

2.9 Report your software to your funding program

Report your software to your funding program so it can be included in accomplishments.

Level 3: Repository for long-term projects

A Level 3 repository is intended to be used and maintained long-term by the project team and a community of users; the project lives on even if the initial developers leave.

To build a Level 3 repository:

3.1 All Level 1 items

See “Level 1: Basic repository to support a document” above.

3.2 All Level 2 items

See “Level 2: Repository to support an entire project” above.

3.3 Implement a release system

One option is to use GitHub tags and releases. You can obtain a digital object identifier (DOI) for each release via Zenodo:

  • Link the GitHub repo to Zenodo
  • Perform the release and tag it
  • Update the “how to cite” section of the README to include the DOI that Zenodo provides

3.4 Set up continuous integration (CI) tools

Continuous integration (CI) is the practice of frequently committing code to a shared repository, with automated builds and tests run on each change. Examples of CI services include GitHub Actions, CircleCI, and Travis CI, which can run your tests against pull requests.

A code coverage tool (e.g., Coveralls) can measure how much of the codebase the tests exercise and publish the test status (pass/fail, test coverage).
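A CI service needs automated tests to run. As a minimal sketch, the block below defines a small function and two unit tests using the standard-library `unittest` module; the function `performance_ratio` and its behavior are hypothetical. The tests are run programmatically here, whereas a CI job would typically invoke a test runner on every push or pull request.

```python
import unittest


def performance_ratio(measured_kwh: float, expected_kwh: float) -> float:
    """Return measured energy divided by expected energy."""
    if expected_kwh <= 0:
        raise ValueError("expected_kwh must be positive")
    return measured_kwh / expected_kwh


class TestPerformanceRatio(unittest.TestCase):
    def test_typical_value(self):
        self.assertAlmostEqual(performance_ratio(80.0, 100.0), 0.8)

    def test_rejects_non_positive_expectation(self):
        # Covering the error branch as well as the normal path is what
        # raises the coverage figure a tool would report.
        with self.assertRaises(ValueError):
            performance_ratio(80.0, 0.0)


suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestPerformanceRatio)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

If only the first test existed, the `raise` line would never execute, and a coverage report would show it as untested.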

3.5 Check for consistent code formatting

Keeping your code clean: pre-commit hooks

  • Pre-commit hooks run a series of checks and automated fixes against your code before you commit that code to git
  • For example, pre-commit hooks can:
    • Auto-fix issues with indentation, trailing spaces, line endings, line length, etc. (e.g., via a tool like black). This frees the project from spending energy on code formatting issues
    • Warn against issues like unused imports, undefined variables, bare “except” clauses, overly high code complexity, etc. (via a tool like flake8)
  • If set up early on, they keep your code “on track” toward clean code
  • They can also be installed and run later, but then you may get a long list of previous code issues to fix
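To give a feel for what such a check does, the sketch below uses the standard-library `ast` module to find bare “except” clauses in Python source. It is a toy version of one kind of lint rule and not flake8’s implementation; the function name `find_bare_excepts` and the sample source are hypothetical.

```python
import ast


def find_bare_excepts(source: str) -> list:
    """Return the line numbers of bare ``except:`` clauses in Python source."""
    tree = ast.parse(source)
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        # A handler with no exception type is a bare "except:".
        if isinstance(node, ast.ExceptHandler) and node.type is None
    )


SAMPLE = '''
try:
    value = int("42")
except:
    value = None

try:
    value = int("x")
except ValueError:
    value = None
'''

print(find_bare_excepts(SAMPLE))  # -> [4]
```

Only the first handler is reported; the second names `ValueError` and so passes the check. A pre-commit hook would run checks like this on the staged files and block the commit if any are flagged.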

Similar checks can be run on GitHub through GitHub workflows:

3.6 Add Documentation pages

Documentation can be hosted in several places (e.g., GitHub Pages, Jupyter Book, Read the Docs). Documentation pages should provide:

  • Getting started. Provide simple instructions to install the code and run a sample problem. Link here to the tutorials.
  • Examples / Tutorials. Link here to illustrations of using the code.
  • API reference. Link here to the documentation of each public class, function, and/or method. Note that this can typically be auto-generated.
  • Release notes. Link here to logs of the changes in each tagged release.

3.7 Release large data sets with code

  • Data sets should be formally released into a separate archival repository, such as a project-specific data hub (e.g., DuraMAT Data Hub), Figshare, or Dryad.

  • Include in the code repository only the smaller files that the code needs, for example for unit tests or examples, provided they have been cleared for release and do not infringe copyright or NDAs.

  • Remember not to use links to local files on your computer!

  • Git and GitHub are generally not suitable for large files. (Git LFS (Git Large File Storage) is intended to solve this, but can be clunky.)

3.8 Auto- and Peer-checking of your repository

Auto-checking with Scientific Python’s repo-review

  • You can run a check on your repo using Scientific Python’s repo-review tool
  • The web version may not work, but the command-line version does.
  • Example of pvlib:

Peer-checking with pyOpenSci

  • It may be a good exercise for larger libraries like pvlib
  • Create an issue here; it will guide you through the process
  • If you are nervous or skeptical, one of the options is a “presubmission inquiry”

3.9 Upload to other code services

Upload to PyPI, conda, or another code distribution service that makes installation easy.

3.10 Submit to a code-centric journal (optional)

Consider submitting to a code-centric journal such as the Journal of Open Source Software (JOSS). A JOSS paper is 250 to 1000 words long, i.e., the entire paper is about the length of a couple of abstracts.

Example of pvlib on JOSS:

Self-checking your work with the JOSS review checklist:

Reviews can be done via Github repo:

Getting DuraMAT Support

If you still have questions after reading this guide, need more assistance bringing your software up to the appropriate level, or would like us to assess your repository’s current completeness, Contact Us.

Some useful links

https://michal.karzynski.pl/blog/2019/05/26/python-project-maturity-checklist/

https://dbader.org/blog/write-a-great-readme-for-your-github-project

https://www.patricksoftwareblog.com/software-development-checklist-for-python-applications/
