# Developer Guide
This guide is aimed mainly at developers, maintainers and other technical contributors, and provides more information on how to work with this repository.
## TODO: Final Steps

!!! important
    Dear project author, thank you for using fair-python-cookiecutter!

Before diving into your actual project work, please complete the following steps to finalize the configuration of your project repository:
### Inspect the generated project files
We suggest that you first familiarize yourself with the generated structure and make it "your own". The following sections of this guide provide a high-level overview, but you might want to inspect the various files to get a better understanding. A few files contain TODO items or sections; please complete and then remove them.
### Test the tools locally
After having some idea about the repository structure, we suggest that you try running some operations, such as linting, running the tests and building the documentation, locally on your computer.
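As a quick sketch, assuming Poetry is already installed, this could look as follows (the individual tasks are explained later in this guide):

```bash
# install the project together with all development dependencies
poetry install --with docs

# run the main development tasks (defined in pyproject.toml)
poetry run poe lint --all-files  # linters and formatters
poetry run poe test              # test suite
poetry run poe docs              # build the documentation website
```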
### Push the repository
If you have not created an empty repository in your git hosting service already, you should create it now. Follow the instructions of your hosting service to push an existing repository (i.e. this one), which will consist of:

- adding the remote repository locally (`git remote add ...`)
- pushing the contents to the remote (`git push`), as sketched below
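A minimal sketch of these two steps, assuming a hypothetical GitHub remote URL and a `main` branch (substitute the URL given by your hosting service):

```bash
# hypothetical remote URL -- replace with the one for your repository
git remote add origin git@github.com:your-org/your-project.git
git push -u origin main
```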
### Check the CI
Your first push should have automatically triggered the CI pipeline. Please check that it runs successfully.
### Set up Pages and Releases
For deployment of documentation pages and releases of your code, some additional configuration is required. Please consult the corresponding sections of this guide.
## Overview

### Repository Structure
Here is a non-exhaustive list of the most important files and directories in the repository.
- `AUTHORS.md`: acknowledges and lists all contributors
- `CHANGELOG.md`: summarizes the changes for each version of the software for users
- `CODE_OF_CONDUCT.md`: defines the social standards that must be followed by contributors
- `CONTRIBUTING.md`: explains how others can contribute to the project
- `README.md`: provides an overview and points to other resources

- `CITATION.cff`: metadata stating how to cite the project
- `codemeta.json`: metadata for harvesting by other tools and services
- `LICENSE`: the (main) license of the project
- `LICENSES`: copies of all licenses that apply to files in the project
- `.reuse/dep5`: granular license and copyright information for all files and directories

- `pyproject.toml`: project metadata, dependencies, and development tool configurations
- `poetry.lock`: needed for reproducible installation of the project
- `src`: actual code provided by the project
- `tests`: all tests for the code in the project
- `mkdocs.yml`: configuration of the project website
- `docs`: most contents used for the project website

- `.pre-commit-config.yaml`: quality assurance tools used in the project
- `.github/workflows`: CI scripts for GitHub (QA, documentation and package deployment)
- `.github/ISSUE_TEMPLATE`: templates for the GitHub issue tracker
- `.gitlab-ci.yml`: mostly equivalent CI scripts, but for GitLab
- `.gitlab/issue_templates`: the same issue templates, but for GitLab
!!! tip
    You might find various other files popping up which are generated by different tools.
    Most of these should not be committed into the repository, so they are excluded
    in the `.gitignore` file. Everything listed there is safe to delete.
### Used Tools
Here is a non-exhaustive list of the most important tools used in the project.
- `poetry` for dependency management and packaging
- `poethepoet` for running common tasks
- `pre-commit` for orchestrating linters, formatters and other utilities
- `mkdocs` for generating the project documentation website
- `mike` for managing the `mkdocs`-generated documentation website

- `flake8` for general linting (using various linter plugins)
- `mypy` for editor-independent type-checking
- `pytest` for unit testing
- `pytest-cov` for computing code coverage by tests
- `hypothesis` for property-based testing
- `bandit` for checking security issues in the code
- `safety` for checking security issues in the current dependencies

- `black` for source-code formatting
- `autoflake` for automatically removing unused imports
- `pydocstyle` for checking docstring conventions

- `cffconvert` to check the `CITATION.cff` (citation metadata)
- `codemetapy` to generate a `codemeta.json` (general software metadata)
- `somesy` to keep all important metadata continuously synchronized
- `reuse` to check REUSE-compliance (granular copyright and license metadata)
- `licensecheck` to scan for possible license incompatibilities in the dependencies
!!! tip
    Most tools installed and used by this project are listed in the
    `pyproject.toml` and `.pre-commit-config.yaml` files.
## Basics
The project:

- heavily uses `pyproject.toml`, which is a recommended standard
- adopts the `src` layout, to avoid common problems
- keeps the actual code (`src`) and test code (`tests`) separated
The `pyproject.toml` is the main configuration file for the project. It contains both
general information about the software as well as configuration for various tools.
In older projects, most of this information is often scattered over many small
tool-specific configuration files and a `setup.py`, `setup.cfg` and/or `requirements.txt`
file.
!!! tip
    `pyproject.toml` is the first place you should check
    when looking for the configuration of some development tool.
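For instance, to get a quick overview of which tools are configured there, you could simply list the tool-specific sections (a small illustration, assuming a POSIX shell):

```bash
# list all tool-specific configuration sections in pyproject.toml
grep '^\[tool' pyproject.toml
```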
## Configuration
The main tool needed to manage and configure the project is Poetry.
Please follow its setup documentation to install it correctly. Poetry should not
be installed with `pip` like other Python tools.
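One commonly recommended way is to install Poetry into an isolated environment via `pipx`; treat this as a sketch and check the official Poetry documentation for the current instructions:

```bash
# install Poetry in its own isolated environment
pipx install poetry

# verify that the installation works
poetry --version
```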
Poetry performs many important tasks:

- it manages the virtual environment(s) used for the project
- it manages all the dependencies needed for the code to work
- it takes care of packaging the code into a `pip`-installable package
You can find a cheatsheet with the most important commands here and consult its official documentation for detailed information.
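As a taste of everyday use, here is a small sketch of common commands (the package names are just examples):

```bash
poetry install                  # set up the virtual environment and install the project
poetry add requests             # example: add a new runtime dependency
poetry add --group dev ipython  # example: add a development-only dependency
poetry run pytest               # run a command inside the project's virtual environment
```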
Note that `poetry` is only needed for development of the repository.
End-users who just want to install and use this project
do not need to set up or know anything about poetry.
!!! tip
    If you use `poetry shell` to activate the virtual environment of the project,
    and the project is already installed with `poetry install`, you do not
    have to prepend `poetry run` to the commands shown below.
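For example, the difference looks like this sketch:

```bash
poetry shell   # spawns a subshell with the project virtualenv activated
poe test       # now works without the 'poetry run' prefix
```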
## Task Runner
It is good practice to have a common way of launching different project-related tasks. It removes the need to remember the flags for various tools, and avoids duplicating the same commands in the CI pipelines. If something in a workflow needs to change, it can be changed in just one place, thus reducing the risk of making a mistake.
Often projects use a shell script or `Makefile` for this purpose. This project uses
poethepoet, as it integrates nicely with `poetry`.

The tasks are defined in `pyproject.toml` and can be launched using:

```bash
poetry run poe TASK_NAME
```
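To explore which tasks are available, poethepoet lists the configured tasks in its help output (assuming the project is installed):

```bash
# print usage information, including the list of configured tasks
poetry run poe --help
```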
## CI Workflows
The project contains CI workflows for both GitHub and GitLab.
The main CI pipeline runs on each newly pushed commit and will:

- run all configured code analysis tools,
- run the code tests with multiple versions of Python,
- build and deploy the online project documentation website, and
- if a new version tag was pushed, launch the release workflow.
## Quality Control

### Static Analysis
Apart from code testing, most tools for quality control are added to the project as
`pre-commit` hooks. The `pre-commit` tool takes care of
installing, updating and running the tools according to the configuration in the
`.pre-commit-config.yaml` file.
For every new copy of the repository (e.g. after `git clone`), `pre-commit` must first
be activated. This is usually done using `pre-commit install`, which also requires that
`pre-commit` is already available. For more convenience, we simplified the procedure.
In this project, you can run:

```bash
poetry run poe init-dev
```
This will make sure that `pre-commit` is enabled in your repository copy.
Once enabled, every time you try to `git commit` some changed files,
various tools will run on those (and only those) files.
This means that (with some exceptions) `pre-commit` by default will run only
on the changed files that were added to the next commit
(i.e., files in the git staging area).
These files are usually colored green when running `git status`.
- Some tools only report the problems they detected
- Some tools actively modify files (e.g., fix formatting)

In any case, the `git commit` will fail if a file was modified by a tool or some
problems were reported. In order to complete the commit, you need to:

- resolve all problems (by fixing them or marking them as false alarms), and
- `git add` all changed files again (to update the files in the staging area).

After doing that, you can retry the `git commit` of your changes, as sketched below.
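A typical cycle might look like this sketch:

```bash
git commit -m "improve docs"  # fails, e.g. because a formatter modified some files
git add -u                    # re-stage the files updated by the hooks
git commit -m "improve docs"  # retry; succeeds once all hooks pass
```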
To avoid having to deal with many issues at once, it is a good habit to run
`pre-commit` by hand from time to time. In this project, this can be done with:

```bash
poetry run poe lint --all-files
```
### Testing
pytest is used as the main framework for testing.
The project uses the `pytest-cov` plugin to integrate `pytest` with
coverage, which collects and reports test coverage information.
In addition to writing regular unit tests with `pytest`, consider using
hypothesis, which integrates nicely with `pytest` and implements property-based
testing, i.e. the automatic generation of randomized inputs for test cases. This can
help to find bugs in edge cases that are easy to overlook in ad-hoc manual tests.
Such randomized tests are a good addition to hand-crafted tests and inputs.
To run all tests, either invoke `pytest` directly, or use the provided task:

```bash
poetry run poe test
```
!!! tip
    Add the flag `--cov` to enable test coverage tracking and get a table with
    the results after the tests are completed.
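For example (assuming the extra flag is passed through to `pytest`, as configured in this project):

```bash
# run all tests and print a coverage report afterwards
poetry run poe test --cov
```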
## Documentation
The project uses `mkdocs` with the popular and excellent
mkdocs-material theme to generate the project documentation website, which provides
both user and developer documentation.
`mkdocs` is configured in the `mkdocs.yml` file, which we prepared in a way that there is:

- no need to duplicate sections from files in other places (such as `README.md`)
- fully automatic API documentation based on the Python docstrings in the code
- a detailed test coverage report included in the website
The first point is important, because avoiding duplication means avoiding errors whenever text or examples are updated. The second point is convenient, as modules and functions do not need to be added by hand, which is easy to forget. The third point removes the need to use an external service such as CodeCov to store and present code coverage information.
As software changes over time and users cannot always keep up with the latest developments,
each new version of the software should provide version-specific documentation.
To make this both possible and convenient, this project uses
mike to generate and manage the `mkdocs` documentation for different versions of the software.
!!! tip
    You can easily add new pages (e.g. extended tutorials or topic-specific guides) to
    your documentation website by creating markdown files in the `docs/` directory and
    adding them to the `nav` section in `mkdocs.yml`.
### Offline Documentation
You can manually generate a local and fully offline copy of the documentation, which can be useful e.g. for previewing the results during active work on the documentation:

```bash
poetry install --with docs
poetry run poe docs
```
Once the documentation site is built, run `mkdocs serve` and
open http://localhost:8000
in your browser to see the local copy of the website.
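Concretely, this could look as follows (a sketch, assuming the docs dependencies are installed):

```bash
# serve the built documentation locally with live-reload
poetry run mkdocs serve
# then open http://localhost:8000 in your browser
```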
!!! tip
    You should probably always check bigger website updates locally before they are
    publicly deployed. The automatic pipelines can only catch technical problems, but
    you still might want to do some proof-reading, for example.
### Online Documentation
To avoid dependence on additional services such as readthedocs, the project website is set up for simple deployment using GitHub Pages or GitLab Pages.
The provided CI pipeline automatically generates the documentation for the latest
development version (i.e., the current state of the `main` branch) as well as for every
released version (i.e., marked by a version tag `vX.Y.Z`).
Publishing the documentation to a website using GitHub or GitLab Pages needs a bit of configuration. Please follow the steps for your respective hosting service.
For GitLab:

1. Create a new project access token for GitLab Pages deployment
    - in your GitLab project, go to *Settings > Access Tokens*
    - add a new token with the following settings:
        - Token name: `PAGES_DEPLOYMENT_TOKEN`
        - Expiration date: (far in the future)
        - Select a role: *Maintainer*
        - Select scopes: `read_repository`, `write_repository`
2. Provide the token as a masked(!) variable to the CI pipeline
    - in your GitLab project, go to *Settings > CI/CD*
    - in the section *Variables*, add a new variable with
        - Key: `PAGES_TOKEN`
        - Value: (the token string, as generated in the previous step)
        - enable *Mask variable*, so your token will not appear in logs
3. Ensure that the GitLab Pages URL is correct
    - in your GitLab project, go to *Deploy > Pages*
    - make sure that *Use unique domain* is NOT enabled
    - check that under *Access pages* the URL matches the `site_url` in your `mkdocs.yml`

For GitHub:

1. make sure that you pushed the repository and the CI pipeline completed at least once
2. check that a `gh-pages` branch exists (created by the CI; see the sketch after this list)
3. go to your GitHub repository *Settings* and from there to the settings for *Pages*
4. under *Build and deployment*, pick `gh-pages` as the branch for serving the documentation
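To verify the deployment branch from your local clone, a quick check could be:

```bash
# list remote branches and look for the pages deployment branch
git fetch origin
git branch -r | grep -E 'g[hl]-pages'
```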
!!! important
    When adding any kind of token to your repository configuration (which usually allows
    code and pipelines to access and modify your project), make sure that the token is protected.

    - In GitHub, tokens should always be added as secrets
    - In GitLab, tokens should be added as CI variables that are masked

    This will make sure that the token does not appear in the logs of CI pipeline runs
    and minimizes the risk of abuse for malicious purposes. NEVER save a token in a text
    file in your repository!
!!! tip
    Should anything go wrong and you need to manually access the data of the deployed
    website, you can find it in the `gh-pages` or `gl-pages` branch of the repository.
    Normally you should not need to use that branch directly, though.
## Releases
From time to time the project is ready for a new release for users.
### Creating a New Release
Before releasing a new version, push the commit the new release should be based on to the upstream repository, and make sure that:

- the CI pipeline completes successfully
- the version number in `pyproject.toml` is updated, in particular:
    - it must be larger than the previously released version
    - it should adequately reflect the severity of the changes
- the provided user and developer documentation is up to date, including:
    - a new section in the `CHANGELOG.md` file summarizing the changes in the new version
    - possibly revised information about contributors and/or maintainers
If this is the case, proceed with the release by:

1. creating a new tag that matches the version in the `pyproject.toml`: `git tag vX.Y.Z`
2. pushing the new tag to the upstream repository: `git push origin vX.Y.Z`
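For a hypothetical version 1.2.3, this could look like:

```bash
poetry version         # check that pyproject.toml indeed says 1.2.3
git tag v1.2.3
git push origin v1.2.3
```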
The pushed version tag will trigger a pipeline that will:

- build and deploy the documentation website for the specific version
- publish the package to the enabled targets (see below)
### Release Targets
The CI pipelines are built in such a way that features can be enabled, disabled and configured easily.
For GitLab, targets for releases can be enabled or disabled in the `variables`
section in `.gitlab-ci.yml`.

For GitHub, targets for releases can be enabled or disabled in `.github/workflows/ci.yml` and
configured by adapting the corresponding actions in `.github/workflows/releases.yml`.
#### GitHub / GitLab Release
By default, the release workflow will create a basic GitHub or GitLab Release that provides a snapshot of the repository as a download. This requires no additional configuration.
See here for information on how the GitHub release can be customized.
!!! note
    The GitHub Release can be used to trigger automated software publication of your
    released versions to Zenodo,
    based on the metadata provided in the `CITATION.cff` file.
#### PyPI and Compatible Indices
The CI pipelines support automatic releases to PyPI, Test PyPI or other custom repositories, but in any case this requires a bit of initial configuration.
For GitLab, the project uses the classic token-based workflow for automated releases to PyPI and Test PyPI.

Before the project can be released to PyPI or Test PyPI for the first time,
a new PyPI API token must be created in the PyPI account of the main project maintainer,
added to your CI as a masked variable, and a variable must be updated in the `.gitlab-ci.yml`.
The corresponding tokens can be added analogously to the `PAGES_TOKEN` for the online documentation,
which was explained above.
PyPI:

- add the token as a masked CI variable called `RELEASE_TOKEN_pypi`
- in `.gitlab-ci.yml`, set `release_to_pypi: "true"`

Test PyPI:

- add the token as a masked CI variable called `RELEASE_TOKEN_testpypi`
- in `.gitlab-ci.yml`, set `release_to_testpypi: "true"`

Custom Package Index:

- add the token as a masked CI variable called `RELEASE_TOKEN_custom`
- in `.gitlab-ci.yml`, set `release_to_custom: "true"`
- update `PKGIDX_URL` in the `release_custom_pypi` job to the correct legacy API endpoint
For GitHub, the project uses the new Trusted Publishers workflow for automated releases
to PyPI and Test PyPI, which is both more secure and more convenient than other
authorization methods.

Before the project can be released to PyPI or Test PyPI for the first time,
a pending publisher must be added in the PyPI account of the main project maintainer,
using `release.yml` as the requested workflow name.
!!! note
    It is important to use the correct workflow name, otherwise the workflow will fail!
Once this is done, set the corresponding option (`to_pypi` / `to_test_pypi`) to `true`
in the `publish` job in `ci.yml` to enable the corresponding publication target.
If the old and less secure token-based authentication method is needed, or
the package should be published to a different PyPI-compatible package index, please
adapt `release.yml` accordingly.
If for some reason you do not want to use the CI for the PyPI releases, you can skip these instructions
and manually use `poetry publish` to do the release, as sketched below.