Developer Guide¶
This guide mainly targets developers, maintainers and other technical contributors, and provides more information on how to work with this repository.
Important Information
TODO: Final Steps¶
Dear project author, thank you for using fair-python-cookiecutter!
Before diving into your actual project work, please complete the following steps to finalize the configuration of your project repository:
Inspect the generated project files¶
We suggest that you first familiarize yourself with the generated structure and make it "your own". The following sections of this guide provide a high-level overview, but you might want to inspect the various files to get a better understanding. A few files contain TODO items or sections -- please complete and remove them.
Test the tools locally¶
Once you have some idea of the repository structure, we suggest that you try running some operations on your computer, such as linting, running tests and building the documentation.
Push the repository¶
If you have not created an empty repository in your git hosting service already, you should create it now. Follow the instructions of your hosting service to push an existing repository (i.e. this one), which will consist of
- adding the remote repository locally (git remote add ...)
- pushing the contents to the remote (git push)
Check the CI¶
Your first push should have automatically triggered the CI pipeline. Please check that it runs successfully.
Set up Pages and Releases¶
For deployment of documentation pages and releases of your code, some additional configuration is required. Please consult the corresponding sections of this guide.
Overview¶
Repository Structure¶
Here is a non-exhaustive list of the most important files and directories in the repository.
- AUTHORS.md: acknowledges and lists all contributors
- CHANGELOG.md: summarizes the changes for each version of the software for users
- CODE_OF_CONDUCT.md: defines the social standards that must be followed by contributors
- CONTRIBUTING.md: explains how others can contribute to the project
- README.md: provides an overview and points to other resources
- CITATION.cff: metadata stating how to cite the project
- codemeta.json: metadata for harvesting by other tools and services
- LICENSE: the (main) license of the project
- LICENSES: copies of all licenses that apply to files in the project
- .reuse/dep5: granular license and copyright information for all files and directories
- pyproject.toml: project metadata, dependencies, development tool configurations
- poetry.lock: needed for reproducible installation of the project
- src: actual code provided by the project
- tests: all tests for the code in the project
- mkdocs.yml: configuration of the project website
- docs: most contents used for the project website
- .pre-commit-config.yaml: quality assurance tools used in the project
- .github/workflows: CI scripts for GitHub (QA, documentation and package deployment)
- .github/ISSUE_TEMPLATE: templates for the GitHub issue tracker
- .gitlab-ci.yml: mostly equivalent CI scripts, but for GitLab
- .gitlab/issue_templates: the same issue templates, but for GitLab
Tip
You might find various other files popping up which are generated by different tools.
Most of these should not be committed into the repository, so they are excluded
in the .gitignore file. Everything listed there is safe to delete.
Used Tools¶
Here is a non-exhaustive list of the most important tools used in the project.
- poetry for dependency management and packaging
- poethepoet for running common tasks
- pre-commit for orchestrating linters, formatters and other utilities
- mkdocs for generating the project documentation website
- mike for managing the mkdocs-generated documentation website
- flake8 for general linting (using various linter plugins)
- mypy for editor-independent type-checking
- pytest for unit testing
- pytest-cov for computing code coverage by tests
- hypothesis for property-based testing
- bandit for checking security issues in the code
- safety for checking security issues in the current dependencies
- black for source-code formatting
- autoflake for automatically removing unused imports
- pydocstyle for checking docstring conventions
- cffconvert to check the CITATION.cff (citation metadata)
- codemetapy to generate a codemeta.json (general software metadata)
- somesy to keep all important metadata continuously synchronized
- reuse to check REUSE-compliance (granular copyright and license metadata)
- licensecheck to scan for possible license incompatibilities in the dependencies
Tip
Most tools installed and used by this project are listed in the
pyproject.toml and .pre-commit-config.yaml files.
Basics¶
The project
- heavily uses pyproject.toml, which is a recommended standard
- adopts the src layout, to avoid common problems
- keeps the actual code (src) and test code (tests) separated
The pyproject.toml is the main configuration file for the project. It contains both
general information about the software as well as configuration for various tools.
In older software, most of this information is often scattered over many little
tool-specific configuration files and a setup.py, setup.cfg and/or requirements.txt
file.
Tip
pyproject.toml is the first place you should check
when looking for the configuration of some development tool.
Configuration¶
The main tool needed to manage and configure the project is Poetry.
Please follow its setup documentation to install it correctly. Poetry should not
be installed with pip like other Python tools.
Poetry performs many important tasks:
- it manages the virtual environment(s) used for the project
- it manages all the dependencies needed for the code to work
- it takes care of packaging the code into a pip-installable package
You can find a cheatsheet with the most important commands here and consult its official documentation for detailed information.
Note that poetry is only needed for development of the repository.
The end-users who just want to install and use this project
do not need to set up or know anything about poetry.
Tip
If you use poetry shell to activate the virtual environment of the project,
and the project is already installed with poetry install, you do not
have to prepend poetry run to the commands shown below.
Task Runner¶
It is good practice to have a common way of launching different project-related tasks. It removes the need to remember flags for various tools, and avoids duplicating the same commands in the CI pipelines. If something in a workflow needs to change, it can be changed in just one place, thus reducing the risk of making a mistake.
Often projects use a shell script or Makefile for this purpose. This project uses
poethepoet, as it integrates nicely with poetry.
The tasks are defined in pyproject.toml and can be launched using:
poetry run poe TASK_NAME
CI Workflows¶
The project contains CI workflows for both GitHub and GitLab.
The main CI pipeline runs on each newly pushed commit and will
- run all configured code analysis tools,
- run code tests with multiple versions of Python,
- build and deploy the online project documentation website, and
- launch the release workflow, if a new version tag was pushed.
Quality Control¶
Static Analysis¶
Except for code testing, most tools for quality control are added to the project as
pre-commit hooks. The pre-commit tool takes care of
installing, updating and running the tools according to the configuration in the
.pre-commit-config.yaml file.
For every new copy of the repository (e.g. after git clone), pre-commit first must
be activated. This is usually done using pre-commit install, which also requires that
pre-commit is already available. For more convenience, we simplified the procedure.
In this project, you can run:
poetry run poe init-dev
This will make sure that pre-commit is enabled in your repository copy.
Once enabled,
every time you try to git commit some changed files
various tools will run on those (and only those) files.
This means that (with some exceptions) pre-commit by default will run only
on the changed files that were added to the next commit
(i.e., files in the git staging area).
These files are usually colored in green when running git status.
- Some tools only report the problems they detected
- Some tools actively modify files (e.g., fix formatting)
In any case, the git commit will fail if a file was modified by a tool, or some
problems were reported. In order to complete the commit, you need to
- resolve all problems (by fixing them or marking them as false alarm), and
- git add all changed files again (to update the files in the staging area).
After doing that, you can retry to git commit your changes.
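To see which files belong to the staging area, and would therefore be checked on the next commit, a small helper like the following can be used. This is only a sketch: staged_files is a hypothetical name, it assumes that git is available on the PATH, and it is not part of the project's tooling.

```python
import subprocess


def staged_files(repo_dir: str = ".") -> list[str]:
    """List the paths that are currently in the git staging area."""
    result = subprocess.run(
        ["git", "diff", "--cached", "--name-only"],
        cwd=repo_dir,
        check=True,           # raise if git reports an error
        capture_output=True,  # collect stdout instead of printing it
        text=True,
    )
    return [line for line in result.stdout.splitlines() if line]


if __name__ == "__main__":
    for path in staged_files():
        print(path)
```

Files that were changed by a tool but not added again will be missing from this list, which is why the commit has to be retried after git add.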
To avoid having to deal with many issues at once, it is a good habit to run
pre-commit by hand from time to time. In this project, this can be done with:
poetry run poe lint --all-files
Testing¶
pytest is used as the main framework for testing.
The project uses the pytest-cov plugin
to integrate pytest with
coverage, which
collects and reports test coverage information.
In addition to writing regular unit tests with pytest, consider using
hypothesis,
which integrates nicely with pytest and implements property-based testing, i.e.,
the automatic generation of randomized inputs for test cases. This can help to find
bugs in edge cases that are easy to overlook in ad-hoc manual tests.
Such randomized tests can be a good addition to hand-crafted tests and inputs.
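A minimal sketch of such a property-based test is shown below. The function normalize_whitespace is a hypothetical stand-in for code from your own package. The tests do not check individual hand-picked inputs, but properties that must hold for any string that hypothesis generates.

```python
from hypothesis import given, strategies as st


def normalize_whitespace(text: str) -> str:
    """Hypothetical function under test: collapse all whitespace runs."""
    return " ".join(text.split())


@given(st.text())
def test_normalize_is_idempotent(text: str) -> None:
    # Normalizing an already normalized string must not change it.
    once = normalize_whitespace(text)
    assert normalize_whitespace(once) == once


@given(st.text())
def test_normalize_leaves_no_double_spaces(text: str) -> None:
    # No generated input may produce two consecutive spaces.
    assert "  " not in normalize_whitespace(text)
```

pytest collects these functions like any other test, and hypothesis calls each of them with many generated inputs, reporting a minimized failing example if a property is violated.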
To run all tests, either invoke pytest directly, or use the provided task:
poetry run poe test
Tip
Add the flag --cov to enable the test coverage tracking and get a table with
results after the tests are completed.
Documentation¶
The project uses mkdocs with the popular and excellent
mkdocs-material
theme to generate the project documentation website, which provides both user and
developer documentation.
mkdocs is configured in the mkdocs.yml file, which we prepared so that
- sections from files in other places (such as README.md) do not need to be duplicated
- API documentation pages are generated fully automatically from the Python docstrings in the code
- a detailed test coverage report is included in the website
The first point is important, because avoiding duplication means avoiding errors whenever text or examples are updated. The second point is convenient, as modules and functions do not need to be added by hand, which is easy to forget. The third point removes the need to use an external service such as CodeCov to store and present code coverage information.
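To illustrate the second point, the API pages are built from docstrings like the one in the following sketch. The function clamp is hypothetical, and the Google-style docstring layout is an assumption. Check the pydocstyle and mkdocs configuration of your project for the convention it actually expects.

```python
def clamp(value: float, lower: float, upper: float) -> float:
    """Restrict a value to the closed interval [lower, upper].

    Args:
        value: The number to restrict.
        lower: Smallest allowed result.
        upper: Largest allowed result.

    Returns:
        The value itself if it lies within the interval, otherwise the
        nearest bound.

    Raises:
        ValueError: If lower is greater than upper.
    """
    if lower > upper:
        raise ValueError("lower must not be greater than upper")
    return max(lower, min(value, upper))
```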
As software changes over time and users cannot always keep up with the latest developments,
each new version of the software should provide version-specific documentation.
To make this both possible as well as convenient, this project uses
mike to generate and manage the mkdocs
documentation for different versions of the software.
Tip
You can easily add new pages (e.g. extended tutorials or topic-specific guides) to
your documentation website by creating markdown files in the docs/ directory and
adding them to the nav section in mkdocs.yml.
Offline Documentation¶
You can manually generate a local and fully offline copy of the documentation, which can be useful for e.g. previewing the results during active work on the documentation:
poetry install --with docs
poetry run poe docs
Once the documentation site is built, run mkdocs serve and
open http://localhost:8000 in your browser to see the local copy of the website.
Tip
You should probably always check bigger website updates locally before they are publicly deployed. The automatic pipelines can only catch technical problems, so you might still want to do some proof-reading, for example.
Online Documentation¶
To avoid dependence on additional services such as readthedocs, the project website is set up for simple deployment using GitHub Pages or GitLab Pages.
The provided CI pipeline automatically generates the documentation for the latest
development version (i.e., current state of the main branch) as well as every released
version (i.e., marked by a version tag vX.Y.Z).
Publishing the documentation to a website using GitHub or GitLab Pages needs a bit of configuration. Please follow the steps for your respective hosting service.
GitLab:
- Create a new project access token for GitLab Pages deployment
  - in your GitLab project, go to Settings > Access Tokens
  - add a new token with the following settings:
    - Token name: PAGES_DEPLOYMENT_TOKEN
    - Expiration date: (far in the future)
    - Select a role: Maintainer
    - Select scopes: read_repository, write_repository
- Provide the token as a masked(!) variable to the CI pipeline
  - in your GitLab project, go to Settings > CI/CD
  - in the section Variables add a new variable with
    - Key: PAGES_TOKEN
    - Value: (the token string, as generated in the previous step)
    - enable Mask variable, so your token will not appear in logs
- Ensure that the GitLab pages URL is correct
  - in your GitLab project, go to Deploy > Pages
  - make sure that Use unique domain is NOT enabled
  - check that under Access pages the URL matches the site_url in your mkdocs.yml
GitHub:
- make sure that you pushed the repository and the CI pipeline completed at least once
- check that a gh-pages branch exists (created by the CI)
- go to your GitHub repository Settings and from there to settings for Pages
- under Build and deployment pick gh-pages as the branch for serving documentation
Important Information
When adding any kind of token to your repository configuration, which usually allows code and pipelines to access and modify your project, make sure that the token is protected.
- In GitHub, tokens should always be added as secrets
- In GitLab, tokens should be added as CI variables that are masked
This will make sure that the token will not appear in logs of the CI pipeline runs and minimize the risk of abuse for malicious purposes. NEVER save a token in a text file in your repository!
Tip
Should anything go wrong and you need to manually access the data of the deployed
website, you can find it in the gh-pages or gl-pages branch of the repository.
Normally you should not need to use that branch directly, though.
Releases¶
From time to time the project is ready for a new release for users.
Creating a New Release¶
Before releasing a new version, push the commit the new release should be based on to the upstream repository, and make sure that:
- the CI pipeline completes successfully
- the version number in pyproject.toml is updated, in particular:
  - it must be larger than the previous released version
  - it should adequately reflect the severity of changes
- the provided user and developer documentation is up-to-date, including:
  - a new section in the CHANGELOG.md file summarizing changes in the new version
  - possibly revised information about contributors and/or maintainers
If this is the case, proceed with the release by:
- creating a new tag that matches the version in the pyproject.toml: git tag vX.Y.Z
- pushing the new tag to the upstream repository: git push origin vX.Y.Z
The pushed version tag will trigger a pipeline that will:
- build and deploy the documentation website for the specific version
- publish the package to enabled targets (see below)
Release Targets¶
The CI pipelines are built in such a way that features can be enabled, disabled and configured easily.
On GitLab, targets for releases can be enabled or disabled in the variables section in .gitlab-ci.yml.
On GitHub, targets for releases can be enabled or disabled in .github/workflows/ci.yml and
configured by adapting the corresponding actions in .github/workflows/release.yml.
GitHub / GitLab Release¶
By default, the release workflow will create a basic GitHub or GitLab Release that provides a snapshot of the repository as a download. This requires no additional configuration.
See here for information on how the GitHub release can be customized.
Note
The GitHub Release can be used to trigger automated software publication of your
released versions to
Zenodo,
based on the metadata provided in the CITATION.cff file.
PyPI and Compatible Indices¶
The CI pipelines support automatic releases to PyPI, Test PyPI or other custom repositories, but in any case this requires a bit of initial configuration.
On GitLab, the project uses the classic token-based workflow for automated releases to PyPI and Test PyPI.
Before the project can be released to PyPI or Test PyPI for the first time,
a new PyPI API token must be created in the PyPI account of the main project maintainer
and added to your CI as a masked variable. In addition, a variable must be updated in the .gitlab-ci.yml.
The corresponding tokens can be added analogously to the PAGES_TOKEN for online documentation,
as explained in the Online Documentation section.
PyPI:
- add the token as a masked CI variable called RELEASE_TOKEN_pypi
- in .gitlab-ci.yml, set release_to_pypi: "true"
Test PyPI:
- add the token as a masked CI variable called RELEASE_TOKEN_testpypi
- in .gitlab-ci.yml, set release_to_testpypi: "true"
Custom Package Index:
- add the token as a masked CI variable called RELEASE_TOKEN_custom
- in .gitlab-ci.yml, set release_to_custom: "true"
- update PKGIDX_URL in the release_custom_pypi job to the correct legacy API endpoint
On GitHub, the project uses the new Trusted Publishers workflow for automated releases to PyPI and Test PyPI, which is both more secure and more convenient to use than other authorization methods.
Before the project can be released to PyPI or Test PyPI for the first time,
a pending publisher
must be added in the PyPI account of the main project maintainer, using
release.yml as the requested workflow name.
Note
It is important to use the correct workflow name, otherwise the workflow will fail!
Once this is done, set the corresponding option (to_pypi / to_test_pypi) to true
in the publish job in ci.yml to enable the corresponding publication target.
If the old and less secure token-based authentication method is needed or
the package should be published to a different PyPI-compatible package index, please
adapt release.yml accordingly.
If for some reason you do not want to use the CI for the PyPI releases, you can skip these instructions
and manually use poetry publish to do the release.