# Developer Guide
This guide is aimed mainly at developers, maintainers and other technical contributors, and provides more information on how to work with this repository.
## TODO: Final Steps

!!! important
    Dear project author, thank you for using fair-python-cookiecutter!

Before diving into your actual project work, please complete the following steps to finalize the configuration of your project repository:
### Inspect the generated project files
We suggest that you first familiarize yourself with the generated structure and make it "your own". The following sections of this guide provide a high-level overview, but you might want to inspect the various files to get a better understanding. A few files contain TODO items or sections; please complete and then remove them.
### Test the tools locally
After having some idea about the repository structure, we suggest that you try running some operations, such as linting, running the tests and building the documentation, locally on your computer.
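As a quick sketch, assuming Poetry is already installed, this could look as follows (the individual tasks are explained later in this guide):

```bash
# install the project together with all development dependencies
poetry install --with docs

# run the main development tasks (defined in pyproject.toml)
poetry run poe lint --all-files  # linters and formatters
poetry run poe test              # test suite
poetry run poe docs              # build the documentation website
```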
### Push the repository
If you have not created an empty repository in your git hosting service already, you should create it now. Follow the instructions of your hosting service to push an existing repository (i.e. this one), which will consist of:

- adding the remote repository locally (`git remote add ...`)
- pushing the contents to the remote (`git push`), as sketched below
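A minimal sketch of these two steps, assuming a hypothetical GitHub remote URL and a `main` branch (substitute the URL given by your hosting service):

```bash
# hypothetical remote URL -- replace with the one for your repository
git remote add origin git@github.com:your-org/your-project.git
git push -u origin main
```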
### Check the CI
Your first push should have automatically triggered the CI pipeline. Please check that it runs successfully.
### Set up Pages and Releases
For deployment of documentation pages and releases of your code, some additional configuration is required. Please consult the corresponding sections of this guide.
## Overview

### Repository Structure
Here is a non-exhaustive list of the most important files and directories in the repository.
- `AUTHORS.md`: acknowledges and lists all contributors
- `CHANGELOG.md`: summarizes the changes for each version of the software for users
- `CODE_OF_CONDUCT.md`: defines the social standards that must be followed by contributors
- `CONTRIBUTING.md`: explains how others can contribute to the project
- `README.md`: provides an overview and points to other resources

- `CITATION.cff`: metadata stating how to cite the project
- `codemeta.json`: metadata for harvesting by other tools and services
- `LICENSE`: the (main) license of the project
- `LICENSES`: copies of all licenses that apply to files in the project
- `.reuse/dep5`: granular license and copyright information for all files and directories

- `pyproject.toml`: project metadata, dependencies, and development tool configurations
- `poetry.lock`: needed for reproducible installation of the project
- `src`: actual code provided by the project
- `tests`: all tests for the code in the project
- `mkdocs.yml`: configuration of the project website
- `docs`: most contents used for the project website

- `.pre-commit-config.yaml`: quality assurance tools used in the project
- `.github/workflows`: CI scripts for GitHub (QA, documentation and package deployment)
- `.github/ISSUE_TEMPLATE`: templates for the GitHub issue tracker
- `.gitlab-ci.yml`: mostly equivalent CI scripts, but for GitLab
- `.gitlab/issue_templates`: the same issue templates, but for GitLab
!!! tip
    You might find various other files popping up which are generated by different tools.
    Most of these should not be committed into the repository, so they are excluded
    in the `.gitignore` file. Everything listed there is safe to delete.
### Used Tools
Here is a non-exhaustive list of the most important tools used in the project.
- `poetry` for dependency management and packaging
- `poethepoet` for running common tasks
- `pre-commit` for orchestrating linters, formatters and other utilities
- `mkdocs` for generating the project documentation website
- `mike` for managing the `mkdocs`-generated documentation website

- `flake8` for general linting (using various linter plugins)
- `mypy` for editor-independent type-checking
- `pytest` for unit testing
- `pytest-cov` for computing code coverage by tests
- `hypothesis` for property-based testing
- `bandit` for checking security issues in the code
- `safety` for checking security issues in the current dependencies

- `black` for source-code formatting
- `autoflake` for automatically removing unused imports
- `pydocstyle` for checking docstring conventions

- `cffconvert` to check the `CITATION.cff` (citation metadata)
- `codemetapy` to generate a `codemeta.json` (general software metadata)
- `somesy` to keep all important metadata continuously synchronized
- `reuse` to check REUSE-compliance (granular copyright and license metadata)
- `licensecheck` to scan for possible license incompatibilities in the dependencies
!!! tip
    Most tools installed and used by this project are listed in the
    `pyproject.toml` and `.pre-commit-config.yaml` files.
## Basics
The project:

- heavily uses `pyproject.toml`, which is a recommended standard
- adopts the `src` layout, to avoid common problems
- keeps the actual code (`src`) and test code (`tests`) separated
The `pyproject.toml` is the main configuration file for the project. It contains both
general information about the software as well as configuration for various tools.
In older projects, most of this information is often scattered over many small
tool-specific configuration files and a `setup.py`, `setup.cfg` and/or `requirements.txt`
file.
!!! tip
    `pyproject.toml` is the first place you should check
    when looking for the configuration of some development tool.
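For instance, to get a quick overview of which tools are configured there, you could simply list the tool-specific sections (a small illustration, assuming a POSIX shell):

```bash
# list all tool-specific configuration sections in pyproject.toml
grep '^\[tool' pyproject.toml
```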
## Configuration
The main tool needed to manage and configure the project is Poetry.
Please follow its setup documentation to install it correctly. Poetry should not
be installed with `pip` like other Python tools.
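One commonly recommended way is to install Poetry into an isolated environment via `pipx`; treat this as a sketch and check the official Poetry documentation for the current instructions:

```bash
# install Poetry in its own isolated environment
pipx install poetry

# verify that the installation works
poetry --version
```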
Poetry performs many important tasks:

- it manages the virtual environment(s) used for the project
- it manages all the dependencies needed for the code to work
- it takes care of packaging the code into a `pip`-installable package
You can find a cheatsheet with the most important commands here and consult its official documentation for detailed information.
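As a taste of everyday use, here is a small sketch of common commands (the package names are just examples):

```bash
poetry install                  # set up the virtual environment and install the project
poetry add requests             # example: add a new runtime dependency
poetry add --group dev ipython  # example: add a development-only dependency
poetry run pytest               # run a command inside the project's virtual environment
```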
Note that `poetry` is only needed for development of the repository.
End-users who just want to install and use this project
do not need to set up or know anything about poetry.
!!! tip
    If you use `poetry shell` to activate the virtual environment of the project,
    and the project is already installed with `poetry install`, you do not
    have to prepend `poetry run` to the commands shown below.
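For example, the difference looks like this sketch:

```bash
poetry shell   # spawns a subshell with the project virtualenv activated
poe test       # now works without the 'poetry run' prefix
```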
## Task Runner
It is good practice to have a common way of launching different project-related tasks. It removes the need to remember the flags for various tools, and avoids duplicating the same commands in the CI pipelines. If something in a workflow needs to change, it can be changed in just one place, thus reducing the risk of making a mistake.
Often projects use a shell script or `Makefile` for this purpose. This project uses
poethepoet, as it integrates nicely with `poetry`.

The tasks are defined in `pyproject.toml` and can be launched using:

```bash
poetry run poe TASK_NAME
```
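To explore which tasks are available, poethepoet lists the configured tasks in its help output (assuming the project is installed):

```bash
# print usage information, including the list of configured tasks
poetry run poe --help
```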
## CI Workflows
The project contains CI workflows for both GitHub and GitLab.
The main CI pipeline runs on each newly pushed commit and will:

- run all configured code analysis tools,
- run the code tests with multiple versions of Python,
- build and deploy the online project documentation website, and
- if a new version tag was pushed, launch the release workflow.
## Quality Control

### Static Analysis
Apart from code testing, most tools for quality control are added to the project as
`pre-commit` hooks. The `pre-commit` tool takes care of
installing, updating and running the tools according to the configuration in the
`.pre-commit-config.yaml` file.
For every new copy of the repository (e.g. after `git clone`), `pre-commit` must first
be activated. This is usually done using `pre-commit install`, which also requires that
`pre-commit` is already available. For more convenience, we simplified the procedure.
In this project, you can run:

```bash
poetry run poe init-dev
```
This will make sure that `pre-commit` is enabled in your repository copy.
Once enabled, every time you try to `git commit` some changed files,
various tools will run on those (and only those) files.
This means that (with some exceptions) `pre-commit` by default will run only
on the changed files that were added to the next commit
(i.e., files in the git staging area).
These files are usually colored green when running `git status`.
- Some tools only report the problems they detected
- Some tools actively modify files (e.g., fix formatting)

In any case, the `git commit` will fail if a file was modified by a tool or some
problems were reported. In order to complete the commit, you need to:

- resolve all problems (by fixing them or marking them as false alarms), and
- `git add` all changed files again (to update the files in the staging area).

After doing that, you can retry the `git commit` of your changes, as sketched below.
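A typical cycle might look like this sketch:

```bash
git commit -m "improve docs"  # fails, e.g. because a formatter modified some files
git add -u                    # re-stage the files updated by the hooks
git commit -m "improve docs"  # retry; succeeds once all hooks pass
```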
To avoid having to deal with many issues at once, it is a good habit to run
`pre-commit` by hand from time to time. In this project, this can be done with:

```bash
poetry run poe lint --all-files
```
### Testing
pytest is used as the main framework for testing.
The project uses the `pytest-cov` plugin to integrate `pytest` with
coverage, which collects and reports test coverage information.
In addition to writing regular unit tests with `pytest`, consider using
hypothesis, which integrates nicely with `pytest` and implements property-based
testing, i.e. the automatic generation of randomized inputs for test cases. This can
help to find bugs in edge cases that are easy to overlook in ad-hoc manual tests.
Such randomized tests are a good addition to hand-crafted tests and inputs.
To run all tests, either invoke `pytest` directly, or use the provided task:

```bash
poetry run poe test
```
!!! tip
    Add the flag `--cov` to enable test coverage tracking and get a table with
    the results after the tests are completed.
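For example (assuming the extra flag is passed through to `pytest`, as configured in this project):

```bash
# run all tests and print a coverage report afterwards
poetry run poe test --cov
```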
## Documentation
The project uses `mkdocs` with the popular and excellent
mkdocs-material theme to generate the project documentation website, which provides
both user and developer documentation.
`mkdocs` is configured in the `mkdocs.yml` file, which we prepared in a way that there is:

- no need to duplicate sections from files in other places (such as `README.md`)
- fully automatic API documentation based on the Python docstrings in the code
- a detailed test coverage report included in the website
The first point is important, because avoiding duplication means avoiding errors whenever text or examples are updated. The second point is convenient, as modules and functions do not need to be added by hand, which is easy to forget. The third point removes the need to use an external service such as CodeCov to store and present code coverage information.
As software changes over time and users cannot always keep up with the latest developments,
each new version of the software should provide version-specific documentation.
To make this both possible and convenient, this project uses
mike to generate and manage the `mkdocs` documentation for different versions of the software.
!!! tip
    You can easily add new pages (e.g. extended tutorials or topic-specific guides) to
    your documentation website by creating markdown files in the `docs/` directory and
    adding them to the `nav` section in `mkdocs.yml`.
### Offline Documentation
You can manually generate a local and fully offline copy of the documentation, which can be useful e.g. for previewing the results during active work on the documentation:

```bash
poetry install --with docs
poetry run poe docs
```
Once the documentation site is built, run `mkdocs serve` and
open http://localhost:8000
in your browser to see the local copy of the website.
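Concretely, this could look as follows (a sketch, assuming the docs dependencies are installed):

```bash
# serve the built documentation locally with live-reload
poetry run mkdocs serve
# then open http://localhost:8000 in your browser
```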
!!! tip
    You should probably always check bigger website updates locally before they are
    publicly deployed. The automatic pipelines can only catch technical problems, but
    you still might want to do some proof-reading, for example.
### Online Documentation
To avoid dependence on additional services such as readthedocs, the project website is set up for simple deployment using GitHub Pages or GitLab Pages.
The provided CI pipeline automatically generates the documentation for the latest
development version (i.e., the current state of the `main` branch) as well as for every
released version (i.e., marked by a version tag `vX.Y.Z`).
Publishing the documentation to a website using GitHub or GitLab Pages needs a bit of configuration. Please follow the steps for your respective hosting service.
For GitLab:

1. Create a new project access token for GitLab Pages deployment
    - in your GitLab project, go to *Settings > Access Tokens*
    - add a new token with the following settings:
        - Token name: `PAGES_DEPLOYMENT_TOKEN`
        - Expiration date: (far in the future)
        - Select a role: *Maintainer*
        - Select scopes: `read_repository`, `write_repository`
2. Provide the token as a masked(!) variable to the CI pipeline
    - in your GitLab project, go to *Settings > CI/CD*
    - in the section *Variables*, add a new variable with
        - Key: `PAGES_TOKEN`
        - Value: (the token string, as generated in the previous step)
        - enable *Mask variable*, so your token will not appear in logs
3. Ensure that the GitLab Pages URL is correct
    - in your GitLab project, go to *Deploy > Pages*
    - make sure that *Use unique domain* is NOT enabled
    - check that under *Access pages* the URL matches the `site_url` in your `mkdocs.yml`

For GitHub:

1. make sure that you pushed the repository and the CI pipeline completed at least once
2. check that a `gh-pages` branch exists (created by the CI; see the sketch after this list)
3. go to your GitHub repository *Settings* and from there to the settings for *Pages*
4. under *Build and deployment*, pick `gh-pages` as the branch for serving the documentation
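To verify the deployment branch from your local clone, a quick check could be:

```bash
# list remote branches and look for the pages deployment branch
git fetch origin
git branch -r | grep -E 'g[hl]-pages'
```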
!!! important
    When adding any kind of token to your repository configuration (which usually allows
    code and pipelines to access and modify your project), make sure that the token is protected.

    - In GitHub, tokens should always be added as secrets
    - In GitLab, tokens should be added as CI variables that are masked

    This will make sure that the token does not appear in the logs of CI pipeline runs
    and minimizes the risk of abuse for malicious purposes. NEVER save a token in a text
    file in your repository!
!!! tip
    Should anything go wrong and you need to manually access the data of the deployed
    website, you can find it in the `gh-pages` or `gl-pages` branch of the repository.
    Normally you should not need to use that branch directly, though.
## Releases
From time to time the project is ready for a new release for users.
### Creating a New Release
Before releasing a new version, push the commit the new release should be based on to the upstream repository, and make sure that:

- the CI pipeline completes successfully
- the version number in `pyproject.toml` is updated, in particular:
    - it must be larger than the previously released version
    - it should adequately reflect the severity of the changes
- the provided user and developer documentation is up to date, including:
    - a new section in the `CHANGELOG.md` file summarizing the changes in the new version
    - possibly revised information about contributors and/or maintainers
If this is the case, proceed with the release by:

1. creating a new tag that matches the version in the `pyproject.toml`: `git tag vX.Y.Z`
2. pushing the new tag to the upstream repository: `git push origin vX.Y.Z`
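For a hypothetical version 1.2.3, this could look like:

```bash
poetry version         # check that pyproject.toml indeed says 1.2.3
git tag v1.2.3
git push origin v1.2.3
```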
The pushed version tag will trigger a pipeline that will:

- build and deploy the documentation website for the specific version
- publish the package to the enabled targets (see below)
### Release Targets
The CI pipelines are built in such a way that features can be enabled, disabled and configured easily.
For GitLab, targets for releases can be enabled or disabled in the `variables`
section in `.gitlab-ci.yml`.

For GitHub, targets for releases can be enabled or disabled in `.github/workflows/ci.yml` and
configured by adapting the corresponding actions in `.github/workflows/releases.yml`.
#### GitHub / GitLab Release
By default, the release workflow will create a basic GitHub or GitLab Release that provides a snapshot of the repository as a download. This requires no additional configuration.
See here for information on how the GitHub release can be customized.
!!! note
    The GitHub Release can be used to trigger automated software publication of your
    released versions to Zenodo,
    based on the metadata provided in the `CITATION.cff` file.
#### PyPI and Compatible Indices
The CI pipelines support automatic releases to PyPI, Test PyPI or other custom repositories, but in any case this requires a bit of initial configuration.
For GitLab, the project uses the classic token-based workflow for automated releases to PyPI and Test PyPI.

Before the project can be released to PyPI or Test PyPI for the first time,
a new PyPI API token must be created in the PyPI account of the main project maintainer,
added to your CI as a masked variable, and a variable must be updated in the `.gitlab-ci.yml`.
The corresponding tokens can be added analogously to the `PAGES_TOKEN` for the online documentation,
which was explained above.
PyPI:

- add the token as a masked CI variable called `RELEASE_TOKEN_pypi`
- in `.gitlab-ci.yml`, set `release_to_pypi: "true"`

Test PyPI:

- add the token as a masked CI variable called `RELEASE_TOKEN_testpypi`
- in `.gitlab-ci.yml`, set `release_to_testpypi: "true"`

Custom Package Index:

- add the token as a masked CI variable called `RELEASE_TOKEN_custom`
- in `.gitlab-ci.yml`, set `release_to_custom: "true"`
- update `PKGIDX_URL` in the `release_custom_pypi` job to the correct legacy API endpoint
For GitHub, the project uses the new Trusted Publishers workflow for automated releases
to PyPI and Test PyPI, which is both more secure and more convenient than other
authorization methods.

Before the project can be released to PyPI or Test PyPI for the first time,
a pending publisher must be added in the PyPI account of the main project maintainer,
using `release.yml` as the requested workflow name.
!!! note
    It is important to use the correct workflow name, otherwise the workflow will fail!
Once this is done, set the corresponding option (`to_pypi` / `to_test_pypi`) to `true`
in the `publish` job in `ci.yml` to enable the corresponding publication target.
If the old and less secure token-based authentication method is needed, or
the package should be published to a different PyPI-compatible package index, please
adapt `release.yml` accordingly.
If for some reason you do not want to use the CI for the PyPI releases, you can skip these instructions
and manually use `poetry publish` to do the release, as sketched below.