User Manual

This is the reference manual for somesy. It provides details about its configuration and behavior.

Before you proceed, make sure you have read the introduction and the quick-start guide.

Somesy Metadata Schemas

Because the same information is represented in different ways and in more or less detail in different files, somesy requires all project information to be placed in a somesy-specific input section located in a supported input file. Somesy will use this as the single source of truth for the supported project metadata fields and can synchronize this information into different target files.

Info

The somesy schemas are designed to allow expressing metadata in the most useful and rich form, i.e. the richest form supported by at least one of the target formats.

Somesy project metadata consists of two main schemas - one schema for describing people (authors, maintainers, contributors), and a schema describing the project.

The somesy metadata fields (especially for people) are mostly based on Citation File Format 1.2, with some custom extensions. Somesy will try to stay aligned with future revisions of CITATION.cff, unless a deviation or extension is justified for technical or practical reasons.

One useful distinction somesy makes, in contrast to many target formats, is to allow stating all people in one place. If a person is both an author and a maintainer, that person does not need to be listed with all information twice.

This is done by adding the author and maintainer flags that can be set for every listed person. Somesy will take care of duplicating the entries where this is needed.
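The idea can be sketched in a few lines of Python. This is only an illustration, not somesy's actual implementation: plain dictionaries stand in for person records, and the helper name `with_flag` is hypothetical.

```python
# Illustrative only: plain dicts stand in for somesy's person records.
people = [
    {"given-names": "Jane", "family-names": "Doe", "author": True, "maintainer": True},
    {"given-names": "Another", "family-names": "Contributor"},
]


def with_flag(people, flag):
    """Return the people for whom the given boolean flag is set (unset counts as false)."""
    return [p for p in people if p.get(flag, False)]


# Each person is stated once; the per-role lists are derived from the flags.
authors = with_flag(people, "author")
maintainers = with_flag(people, "maintainer")

print([p["given-names"] for p in authors])
print([p["given-names"] for p in maintainers])
```

Jane is listed once in the input but ends up in both derived lists, while the second person appears in neither.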

Furthermore, somesy allows providing more fine-grained information about the contribution of a person and acknowledging contributors that are neither full authors nor maintainers.

Note

Currently, the information provided about contributors that are neither authors nor maintainers, as well as all the detailed information on contributions, is not used.

Nevertheless, we encourage tracking granular contributor and contribution information in order to motivate and acknowledge even minor or invisible contributions to a project.

Once CITATION.cff introduces corresponding mechanisms, somesy will be aligned with the corresponding capabilities. Furthermore, somesy might support the allcontributors tool as a target in the future.

Schemas Overview

Here is an overview of the schemas used in somesy.

The complete somesy input file (somesy.toml) or section (pyproject.toml).

| Field | Type | Required? | Default | Description |
|---|---|---|---|---|
| project | ProjectMetadata | yes | | Project metadata to be used and synchronized. |
| config | SomesyConfig | no | | somesy tool configuration (matches CLI flags). |

Pydantic model for Project Metadata Input.

| Field | Type | Required? | Default | Description |
|---|---|---|---|---|
| name | str | yes | | Project name. |
| description | str | yes | | Project description. |
| version | str | yes | | Project version. |
| license | LicenseEnum | yes | | SPDX License string. |
| homepage | Url | no | | URL of the project homepage. |
| repository | Url | no | | URL of the project source code repository. |
| documentation | Url | no | | URL of the project documentation. |
| keywords | str | no | | Keywords that describe the project. |
| people | Person | yes | | Project authors, maintainers and contributors. |

Metadata about a person in the context of a software project.

This schema is based on CITATION.cff 1.2, modified and extended for the needs of somesy.

| Field | Type | Required? | Default | Description |
|---|---|---|---|---|
| orcid | Url | no | | The person's ORCID url (not required, but highly suggested). |
| email | str | yes | | The person's email address. |
| family_names | str | yes | | The person's family names. |
| given_names | str | yes | | The person's given names. |
| name_particle | str | no | | The person's name particle, e.g., a nobiliary particle or a preposition meaning 'of' or 'from' (for example 'von' in 'Alexander von Humboldt'). |
| name_suffix | str | no | | The person's name-suffix, e.g. 'Jr.' for Sammy Davis Jr. or 'III' for Frank Edwin Wright III. |
| alias | str | no | | The person's alias. |
| affiliation | str | no | | The person's affiliation. |
| address | str | no | | The person's address. |
| city | str | no | | The person's city. |
| country | Country | no | | The person's country. |
| fax | str | no | | The person's fax number. |
| post_code | str | no | | The person's post-code. |
| region | str | no | | The person's region. |
| tel | str | no | | The person's phone number. |
| author | bool | no | false | Indicates whether the person is an author of the project (i.e. significant contributor). |
| publication_author | bool | no | | Indicates whether the person is to be listed as an author in academic citations. |
| maintainer | bool | no | false | Indicates whether the person is a maintainer of the project (i.e. for contact). |
| contribution | str | no | | Summary of how the person contributed to the project. |
| contribution_types | ContributionTypeEnum | no | | Relevant types of contributions (see https://allcontributors.org/docs/de/emoji-key). |
| contribution_begin | date | no | | Beginning date of the contribution. |
| contribution_end | date | no | | Ending date of the contribution. |

Pydantic model for somesy tool configuration.

Note that all fields match CLI options, and CLI options will override the values declared in a somesy input file (such as somesy.toml).

| Field | Type | Required? | Default | Description |
|---|---|---|---|---|
| show_info | bool | no | false | Show basic information messages on run (-v flag). |
| verbose | bool | no | false | Show verbose messages on run (-vv flag). |
| debug | bool | no | false | Show debug messages on run (-vvv flag). |
| input_file | Path | no | "somesy.toml" | Project metadata input file path. |
| no_sync_pyproject | bool | no | false | Do not sync with pyproject.toml. |
| pyproject_file | Path | no | "pyproject.toml" | pyproject.toml file path. |
| no_sync_package_json | bool | no | false | Do not sync with package.json. |
| package_json_file | Path | no | "package.json" | package.json file path. |
| no_sync_julia | bool | no | false | Do not sync with Project.toml. |
| julia_file | Path | no | "Project.toml" | Project.toml file path. |
| no_sync_fortran | bool | no | false | Do not sync with fpm.toml. |
| fortran_file | Path | no | "fpm.toml" | fpm.toml file path. |
| no_sync_pom_xml | bool | no | false | Do not sync with pom.xml. |
| pom_xml_file | Path | no | "pom.xml" | pom.xml file path. |
| no_sync_mkdocs | bool | no | false | Do not sync with mkdocs.yml. |
| mkdocs_file | Path | no | "mkdocs.yml" | mkdocs.yml file path. |
| no_sync_rust | bool | no | false | Do not sync with Cargo.toml. |
| rust_file | Path | no | "Cargo.toml" | Cargo.toml file path. |
| no_sync_cff | bool | no | false | Do not sync with CFF. |
| cff_file | Path | no | "CITATION.cff" | CFF file path. |
| no_sync_codemeta | bool | no | false | Do not sync with codemeta.json. |
| codemeta_file | Path | no | "codemeta.json" | codemeta.json file path. |

Metadata Mapping

From its own schema somesy must convert the information into the target formats. The following tables sketch how fields are mapped to corresponding other fields in some of the currently supported formats. Bold field names are mandatory, the others are optional.

| Somesy Field | Poetry Config | SetupTools Config | Java POM | Julia Config | Fortran Config | package.json | mkdocs.yml | Rust Config | CITATION.cff | CodeMeta |
|---|---|---|---|---|---|---|---|---|---|---|
| given-names | name+email | name | name | name+email | name+email | name | name+email | name+email | givenName | name+email |
| family-names | name+email | name | name | name+email | name+email | name | name+email | name+email | familyName | name+email |
| email | name+email | email | email | name+email | name+email | email | name+email | name+email | email | name+email |
| orcid | - | - | url | - | - | url | - | - | id | - |
| (many others) | - | - | - | - | - | - | - | - | (same) | - |

| Somesy Field | Poetry Config | SetupTools Config | Java POM | Julia Config | Fortran Config | package.json | mkdocs.yml | Rust Config | CITATION.cff | CodeMeta |
|---|---|---|---|---|---|---|---|---|---|---|
| name | name | name | name | name | name | name | site_name | name | title | name |
| description | description | description | description | - | description | description | site_description | description | abstract | description |
| license | license | license | licenses.license | - | license | license | - | license | license | license |
| version | version | version | version | version | version | version | - | version | version | version |
| author=true | authors | authors | developers | authors | author | author | site_author | authors | authors | author |
| maintainer=true | maintainers | maintainers | - | - | maintainer | maintainers | - | - | contact | maintainer |
| people | - | - | - | - | - | contributors | - | - | - | contributor |
| keywords | keywords | keywords | - | - | keywords | keywords | - | keywords | keywords | keywords |
| homepage | homepage | urls.homepage | urls | - | homepage | homepage | site_url | homepage | url | url |
| repository | repository | urls.repository | scm.url | - | - | repository | repo_url | repository | repository_code | codeRepository |
| documentation | documentation | urls.documentation | distributionManagement.site.url | - | - | - | - | documentation | - | buildInstructions |

Note that the mapping is often not 1-to-1. For example, CITATION.cff allows rich specification of author contact information and complex names. In contrast, poetry only supports a simple string with a name and email (like in git commits) to list authors and maintainers. Therefore somesy sometimes must do much more than just move or rename fields. This means that giving a clean and complete mapping overview is not feasible. In case of doubt or confusion, please open an issue or consult the somesy code.
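To make the lossy nature of such a mapping concrete, here is a minimal Python sketch of collapsing a structured person record into a single "name and email" string of the kind used in git commits. The function name `to_name_email` and the use of plain dictionaries are assumptions for illustration; somesy's real conversion code handles more fields and cases.

```python
def to_name_email(person):
    """Collapse a structured person record into a 'Name <email>' string."""
    full_name = f"{person['given-names']} {person['family-names']}"
    return f"{full_name} <{person['email']}>"


jane = {
    "given-names": "Jane",
    "family-names": "Doe",
    "email": "j.doe@example.com",
    "orcid": "https://orcid.org/0000-0000-0000-0001",
}

# The ORCID has no place in the target string and is dropped.
print(to_name_email(jane))  # Jane Doe <j.doe@example.com>
```

Going in the other direction (from such a string back to a full record) cannot recover the dropped fields, which is one reason the mapping is not a simple renaming.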

The somesy CLI tool

You can see all supported options of a somesy CLI command using --help, for example for the sync command:

somesy sync --help
                                                                                
 Usage: somesy sync [OPTIONS] COMMAND [ARGS]...                                 
                                                                                
 Sync project metadata input with metadata files.                               

╭─ Options ────────────────────────────────────────────────────────────────────╮
 --input-file            -i      FILE  Somesy input file path (default:       
                                       .somesy.toml)                          
                                       [default: None]                        
 --no-sync-pyproject     -P            Do not sync pyproject.toml file        
                                       (default: False)                       
 --pyproject-file        -p      FILE  Existing pyproject.toml file path      
                                       (default: pyproject.toml)              
                                       [default: None]                        
 --no-sync-package-json  -J            Do not sync package.json file          
                                       (default: False)                       
 --package-json-file     -j      FILE  Existing package.json file path        
                                       (default: package.json)                
                                       [default: None]                        
 --no-sync-julia         -L            Do not sync Project.toml (Julia) file  
                                       (default: False)                       
 --julia-file            -l      FILE  Custom Project.toml (Julia) file path  
                                       (default: Project.toml)                
                                       [default: None]                        
 --no-sync-fortran       -F            Do not sync fpm.toml (Fortran) file    
                                       (default: False)                       
 --fortran-file          -f      PATH  Custom fpm.toml (Fortran) file path    
                                       (default: fpm.toml)                    
                                       [default: None]                        
 --no-sync-pomxml        -X            Do not sync pom.xml (Java Maven) file  
                                       (default: False)                       
 --pomxml-file           -x      FILE  Custom pom.xml (Java Maven) file path  
                                       (default: pom.xml)                     
                                       [default: None]                        
 --no-sync-mkdocs        -D            Do not sync mkdocs.yml file (default:  
                                       False)                                 
 --mkdocs-file           -d      FILE  Custom mkdocs.yml file path (default:  
                                       mkdocs.yml)                            
                                       [default: None]                        
 --no-sync-rust          -R            Do not sync Cargo.toml file (default:  
                                       False)                                 
 --rust-file             -r      FILE  Custom Cargo.toml file path (default:  
                                       Cargo.toml)                            
                                       [default: None]                        
 --no-sync-cff           -C            Do not sync CITATION.cff file          
                                       (default: False)                       
 --cff-file              -c      FILE  CITATION.cff file path (default:       
                                       CITATION.cff)                          
                                       [default: None]                        
 --no-sync-codemeta      -M            Do not sync codemeta.json file         
                                       (default: False)                       
 --codemeta-file         -m      FILE  Custom codemeta.json file path         
                                       (default: codemeta.json)               
                                       [default: None]                        
 --help                                Show this message and exit.            
╰──────────────────────────────────────────────────────────────────────────────╯

Everything that can be configured as a CLI flag can also be set in a somesy.toml file in the [config] section. Options set in an input file override the defaults, while options passed as CLI arguments override the values set in the input file.
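The resulting precedence (defaults, then input file, then CLI) can be sketched as follows. This is a simplified illustration, not somesy's code: the function name `effective_config` is hypothetical, only three of the documented options are shown, and treating `None` as "not passed on the command line" is an assumption based on the `[default: None]` entries in the help output.

```python
# Defaults for a few options, as listed in the configuration table.
DEFAULTS = {"input_file": "somesy.toml", "verbose": False, "no_sync_cff": False}


def effective_config(file_config, cli_options):
    """Merge settings: defaults < input file [config] < CLI arguments."""
    merged = dict(DEFAULTS)
    merged.update(file_config)
    # Assumption: a CLI option that was not passed is represented as None.
    merged.update({k: v for k, v in cli_options.items() if v is not None})
    return merged


config = effective_config(
    file_config={"verbose": True},
    cli_options={"verbose": None, "no_sync_cff": True},
)
print(config)
```

Here `verbose` comes from the input file, `no_sync_cff` from the command line, and `input_file` keeps its default.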

Without an input file specifically provided, somesy will check if it can find a valid

  • .somesy.toml
  • somesy.toml
  • pyproject.toml (in tool.somesy section)
  • Project.toml (in tool.somesy section)
  • fpm.toml (in tool.somesy section)
  • package.json (in somesy section)
  • Cargo.toml (in package.metadata.somesy section)

which is located in the current working directory. If you want to provide the somesy input file from a different location, you can pass it with the -i option.
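A rough Python sketch of this lookup is shown below. It only checks which candidate files exist and which section would hold the somesy input; validating the content is left out. The names `CANDIDATES` and `existing_candidates` are hypothetical, and treating the order of the list above as a search order is an assumption of this sketch.

```python
from pathlib import Path

# (file name, dotted section holding the somesy input; None = the whole file).
# Entries follow the list above; using its order as a search order is an assumption.
CANDIDATES = [
    (".somesy.toml", None),
    ("somesy.toml", None),
    ("pyproject.toml", "tool.somesy"),
    ("Project.toml", "tool.somesy"),
    ("fpm.toml", "tool.somesy"),
    ("package.json", "somesy"),
    ("Cargo.toml", "package.metadata.somesy"),
]


def existing_candidates(directory="."):
    """Return (path, section) pairs for candidate input files present in `directory`."""
    base = Path(directory)
    return [
        (base / name, section)
        for name, section in CANDIDATES
        if (base / name).is_file()
    ]


for path, section in existing_candidates():
    print(path, "->", section or "(whole file)")
```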

Somesy Input File

Here is an example of how project metadata and somesy can be configured, shown for each of the supported input formats:

somesy.toml
[project]
name = "my-amazing-project"
version = "0.1.0"
description = "Brief description of my amazing software."

keywords = ["some", "descriptive", "keywords"]
license = "MIT"
repository = "https://github.com/username/my-amazing-project"

# This is you, the proud author of your project:
[[project.people]]
given-names = "Jane"
family-names = "Doe"
email = "j.doe@example.com"
orcid = "https://orcid.org/0000-0000-0000-0001"
author = true      # is a full author of the project (i.e. appears in citations)
maintainer = true  # currently maintains the project (i.e. is a contact person)

# this person is an acknowledged contributor, but not author or maintainer:
[[project.people]]
given-names = "Another"
family-names = "Contributor"
email = "a.contributor@example.com"
orcid = "https://orcid.org/0000-0000-0000-0002"
# ... but for scientific publications, this contributor should be listed as author:
publication_author = true

[config]
verbose = true     # show detailed information about what somesy is doing

pyproject.toml
[tool.poetry]
name = "my-amazing-project"
version = "0.1.0"
...

[tool.somesy.project]
name = "my-amazing-project"
version = "0.1.0"
description = "Brief description of my amazing software."

keywords = ["some", "descriptive", "keywords"]
license = "MIT"
repository = "https://github.com/username/my-amazing-project"

# This is you, the proud author of your project
[[tool.somesy.project.people]]
given-names = "Jane"
family-names = "Doe"
email = "j.doe@example.com"
orcid = "https://orcid.org/0000-0000-0000-0001"
author = true      # is a full author of the project (i.e. appears in citations)
maintainer = true  # currently maintains the project (i.e. is a contact person)

# this person is an acknowledged contributor, but not author or maintainer:
[[tool.somesy.project.people]]
given-names = "Another"
family-names = "Contributor"
email = "a.contributor@example.com"
orcid = "https://orcid.org/0000-0000-0000-0002"

[tool.somesy.config]
verbose = true     # show detailed information about what somesy is doing

Project.toml
name = "my-amazing-project"
version = "0.1.0"
uuid = "c7e460c6-3f3e-11ec-8d3d-0242ac130003"

[deps]
...

[tool.somesy.project]
name = "my-amazing-project"
version = "0.1.0"
description = "Brief description of my amazing software."

keywords = ["some", "descriptive", "keywords"]
license = "MIT"
repository = "https://github.com/username/my-amazing-project"

# This is you, the proud author of your project
[[tool.somesy.project.people]]
given-names = "Jane"
family-names = "Doe"
email = "j.doe@example.com"
orcid = "https://orcid.org/0000-0000-0000-0001"
author = true      # is a full author of the project (i.e. appears in citations)
maintainer = true  # currently maintains the project (i.e. is a contact person)

# this person is an acknowledged contributor, but not author or maintainer:
[[tool.somesy.project.people]]
given-names = "Another"
family-names = "Contributor"
email = "a.contributor@example.com"
orcid = "https://orcid.org/0000-0000-0000-0002"

[tool.somesy.config]
verbose = true     # show detailed information about what somesy is doing

fpm.toml
name = "my-amazing-project"
version = "0.1.0"

[tool.somesy.project]
name = "my-amazing-project"
version = "0.1.0"
description = "Brief description of my amazing software."

keywords = ["some", "descriptive", "keywords"]
license = "MIT"
repository = "https://github.com/username/my-amazing-project"

# This is you, the proud author of your project
[[tool.somesy.project.people]]
given-names = "Jane"
family-names = "Doe"
email = "j.doe@example.com"
orcid = "https://orcid.org/0000-0000-0000-0001"
author = true      # is a full author of the project (i.e. appears in citations)
maintainer = true  # currently maintains the project (i.e. is a contact person)

# this person is an acknowledged contributor, but not author or maintainer:
[[tool.somesy.project.people]]
given-names = "Another"
family-names = "Contributor"
email = "a.contributor@example.com"
orcid = "https://orcid.org/0000-0000-0000-0002"

[tool.somesy.config]
verbose = true     # show detailed information about what somesy is doing

package.json
{
  "name": "my-amazing-project",
  "version": "0.1.0",
  ...

  "somesy": {
    "project": {
      "name": "my-amazing-project",
      "version": "0.1.0",
      "description": "Brief description of my amazing software.",
      "keywords": ["some", "descriptive", "keywords"],
      "license": "MIT",
      "repository": "https://github.com/username/my-amazing-project",
      "people": [
        {
          "given-names": "Jane",
          "family-names": "Doe",
          "email": "j.doe@example.com",
          "orcid": "https://orcid.org/0000-0000-0000-0001",
          "author": true,
          "maintainer": true
        },
        {
          "given-names": "Another",
          "family-names": "Contributor",
          "email": "a.contributor@example.com",
          "orcid": "https://orcid.org/0000-0000-0000-0002"
        }
      ]
    },
    "config": {
      "verbose": true
    }
  }
}

All possible metadata fields and configuration options are explained further above.

pre-commit hook

We highly recommend using somesy as a pre-commit hook. A pre-commit hook runs on every commit to automatically point out issues or fix them on the spot, so if you do not use pre-commit in your project yet, you should start today! When used this way, somesy can fix most typical issues with your project metadata even before your changes can leave your computer.

To add somesy as a pre-commit hook, add it to your .pre-commit-config.yaml file in the root folder of your repository:

repos:
  # ... (your other hooks) ...
  - repo: https://github.com/Materials-Data-Science-and-Informatics/somesy
    rev: "v0.4.3"
    hooks:
      - id: somesy

Note that pre-commit gives somesy the staged version of files, so when using somesy with pre-commit, keep in mind that

  • if somesy changed some files, you need to git add them again (and rerun pre-commit)
  • if you explicitly run pre-commit, make sure to git add all changed files (just like before a commit)

Synchronization

Unless configured otherwise, somesy will create CITATION.cff and codemeta.json files if they do not exist. Other supported files (such as pyproject.toml or package.json) are updated if they already exist in your repository.

If you do not want somesy to create or synchronize these files, you can disable them via CLI options or in your somesy configuration.

Metadata Update Logic

In this section we explain a few details about how somesy updates metadata.

Somesy Inputs Override Target Values

In general, somesy updates fields in target files and formats based on the information stated in the somesy.toml.

It will convert the metadata into the form needed for a target, while trying to preserve as much information as possible. Then it carefully updates the file, while keeping all other fields in the target file unchanged.

In most cases, somesy will try not to interfere with other values, metadata and configuration you might have in a target file.

Info

For typically manually-edited files, it will even make sure that the comments stay in place! (currently works for TOML)

Tip

Only edit target files manually to add or update fields that somesy does not understand or care about!

Warning

Any changes you make in target files to fields that somesy understands will be overwritten the next time you run somesy.

Tip

  • update all project metadata supported by somesy in the somesy.toml
  • update other metadata directly in the target files

Checking Somesy Results

Note that somesy will generally try to do a good job, and in most cases it hopefully will, but in certain tricky situations it might not be able to figure out the needed changes correctly.

Danger

Always check that the files somesy synchronized look right before committing/pushing the changes!

Doing what somesy does in a way that is both convenient and correct is (maybe surprisingly to you) quite difficult, so while somesy should save you quite some tedious work, you should not use it blindly. You have been warned!

Person Identification Heuristics

One frequent source of high-level project metadata changes is the fluctuation of authors, maintainers and contributors, and possible changes to their contact and identification information.

Somesy will try its best to keep track of persons involved in your software project, but to avoid possible problems and unexpected behavior, it might be helpful to understand how somesy determines whether two metadata records describe the same real person.

When somesy compares two metadata records about a person, it will proceed as follows:

  1. If both records contain an ORCID, then the person is the same if the attached ORCIDs are equal, and different if they are not.
  2. Otherwise, if both records have an attached email address, and it is the same email, then they are the same person.
  3. Otherwise, the records are considered to be about the same person if they agree on the full name (i.e. given, middle and family name sequence).
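The three steps above can be sketched in Python as follows. This is a simplified reading of the rules, not somesy's actual code: records are plain dictionaries, the helper names `full_name` and `same_person` are hypothetical, and the full name is reduced to given and family names.

```python
def full_name(person):
    """Join given and family names; missing parts are skipped."""
    parts = [person.get("given-names"), person.get("family-names")]
    return " ".join(p for p in parts if p)


def same_person(a, b):
    """Decide whether two person records describe the same real person."""
    # 1. Both records have an ORCID: the ORCID alone decides.
    if a.get("orcid") and b.get("orcid"):
        return a["orcid"] == b["orcid"]
    # 2. Both records have an email address and it matches: same person.
    if a.get("email") and b.get("email") and a["email"] == b["email"]:
        return True
    # 3. Otherwise fall back to comparing the full names.
    return full_name(a) == full_name(b)


old = {"given-names": "Jane", "family-names": "Doe", "email": "j.doe@example.com"}
new = {
    "given-names": "Jane",
    "family-names": "Doe",
    "email": "jane@example.org",
    "orcid": "https://orcid.org/0000-0000-0000-0001",
}

# ORCID newly added and email changed, but the name still matches.
print(same_person(old, new))
```

Note that in this reading, two records with different ORCIDs are always treated as different people, even if name and email agree.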

Tip

State ORCIDs for persons whenever possible (i.e. the person has an ORCID)!

Tip

If a person does not have an ORCID, suggest that they should create one!

Somesy will usually correctly understand cases such as:

  1. An ORCID being added to a person (i.e. if it was not present before)
  2. A changed email address (if the name stays the same)
  3. A changed name (if the email address stays the same)
  4. Changes to any other relevant metadata attached to the person

Nevertheless, you should check the changes somesy makes before committing them to your repository, especially after you have significantly modified your project metadata.

Warning

Note that changing the ORCID will not be recognized, because ORCIDs are assumed to be unique per person.

If you initially have stated an incorrect ORCID for a person and then change it, somesy will think that this is a new person. Therefore, in such a case you will need to fix the ORCID in all configured somesy targets either before running somesy (so somesy will not create new person entries), or after running somesy (to remove the duplicate entries with the incorrect ORCID).

Warning

Person identification and merging is not applied to standards with free text fields for authors or maintainers, such as fpm.toml.

Codemeta

While somesy is modifying existing files for most supported formats and implements features such as person identification and merging, CodeMeta is implemented differently.

Because codemeta.json is a JSON-LD file, it actually represents a graph, which has various equally valid representations in a JSON file. Thus, supporting the same features as for other formats is technically much more challenging, if at all feasible. Therefore, for the time being, we regenerate the codemeta.json file directly from the source file, in order to avoid data inconsistency due to many pitfalls hiding in the details of the format.

Warning

The codemeta.json is overwritten and regenerated from scratch every time you sync, so do not edit it if you have the codemeta target enabled in somesy!

As codemeta.json is considered a technical "backend-format" derived from other inputs, in most cases you probably neither need to nor should edit it by hand anyway.

Of course, you are welcome to contribute an improved CodeMeta writer for somesy that can correctly understand and update the linked data graph which the codemeta.json file represents!

Using somesy to insert metadata into project documentation

While somesy can synchronize structured metadata files and formats, there is a common case that cannot be covered by the sync command - when project metadata should appear in plain text documents, such as documentation files and web pages.

Because the needs and the tooling used for documentation differ vastly between projects, somesy provides a very general solution to this problem with the fill command. It takes a Jinja2 template and returns the resulting file, in which the project metadata is inserted in the form dictated by the template.

For example, a template is used to generate the AUTHORS.md file in the somesy repository, which is also shown as the Credits page, using the following command:

somesy fill docs/_template_authors.md -o AUTHORS.md
_template_authors.md
# Authors and Contributors

**Authors** are people whose contributions significantly shaped
the state of `{{ project.name }}` at some point in time.

**Additional contributors** are people who contributed non-trivially to this project
in different ways, e.g. by providing smaller fixes and enhancements to the code
and/or documentation.

Of course, this is just a rough overview and categorization.
For a more complete overview of all contributors and contributions,
please inspect the git history of this repository.

## Authors

{% for p in project.authors() %}
{%- set contr_desc = p.contribution or "" -%}
- {{ p.full_name }} (
  [E-Mail](mailto:{{ p.email }}),
  [ORCID]({{ p.orcid }})
  ){{ ": "+contr_desc if contr_desc else "" }}
{% endfor %}

## Additional Contributors

{% for p in project.contributors() %}
{%- set contr_desc = p.contribution or "" -%}
- {{ p.full_name }} (
  [E-Mail](mailto:{{ p.email }}),
  [ORCID]({{ p.orcid }})
  ){{ ": "+contr_desc if contr_desc else "" }}
{% endfor %}

... maybe **[you](https://materials-data-science-and-informatics.github.io/somesy/latest/contributing)**?
AUTHORS.md
# Authors and Contributors

**Authors** are people whose contributions significantly shaped
the state of `somesy` at some point in time.

**Additional contributors** are people who contributed non-trivially to this project
in different ways, e.g. by providing smaller fixes and enhancements to the code
and/or documentation.

Of course, this is just a rough overview and categorization.
For a more complete overview of all contributors and contributions,
please inspect the git history of this repository.

## Authors

- Mustafa Soylu (
  [E-Mail](mailto:m.soylu@fz-juelich.de),
  [ORCID](https://orcid.org/0000-0003-2637-0432)
  ): Main developer, maintainer and tester.
- Anton Pirogov (
  [E-Mail](mailto:a.pirogov@fz-juelich.de),
  [ORCID](https://orcid.org/0000-0002-5077-7497)
  ): Concepts, tool development and enhancement, documentation.


## Additional Contributors

- Jens Bröder (
  [E-Mail](mailto:j.broeder@fz-juelich.de),
  [ORCID](https://orcid.org/0000-0001-7939-226X)
  ): Discussions and suggestions concerning metadata standards and usability.
- Volker Hofmann (
  [E-Mail](mailto:v.hofmann@fz-juelich.de),
  [ORCID](https://orcid.org/0000-0002-5149-603X)
  ): Discussions and suggestions concerning tool scope and usability.
- Stefan Sandfeld (
  [E-Mail](mailto:s.sandfeld@fz-juelich.de),
  [ORCID](https://orcid.org/0000-0001-9560-4728)
  )


... maybe **[you](https://materials-data-science-and-informatics.github.io/somesy/latest/contributing)**?

The template gets the complete ProjectMetadata as its context, so it is possible to access all included project and contributor information.

FAQ

Somesy introduces its own metadata format... isn't this counter-productive?

We don't propose to use somesy as a new "standard". On the contrary, the whole point of somesy is to help maintain standard-compliant metadata alongside other representations. To do its job, somesy needs to introduce a canonical format to express the metadata it tries to manage for you, because otherwise building such a tool is practically impossible. Should you decide after some time that you do not want to use it anymore, nothing is lost - you keep all your CITATION.cff and codemeta.json and pyproject.toml files and can continue to maintain them however you see fit.

The somesy-specific format is just the nice and convenient interface to make everybody's life easier. Furthermore, nobody needs to care whether, under the hood, you use somesy (or anything like it) or not - they can use the corresponding files they already know to get the information they need. So there is no "risk" involved in adopting somesy, because it does not try to abolish any other formats or standards, or to become one itself.

In my project, the effective authors and the publication authors are not the same! What to do?

The author flag in somesy is intended to mark people who significantly contributed to the project in a hands-on way and are closely familiar with details, i.e. can answer specific questions. A reason to stick with this strict understanding of "author" is that a user will usually be interested in contacting such a person to help them with problems.

However, we are aware that acknowledgement practices in different scientific communities vary and current practices in academic publication do not allow for sufficiently granular distinction of contributor roles. Even though the proper solution to this problem would be improving community practices, somesy supports the publication_author flag, which can be set independently of the author flag and will make sure that certain contributors will appear as authors in an academic citation context (i.e. reflected in the CITATION.cff file, which can be used for Zenodo publications), but will not appear as authors in a technical context (such as the metadata in a software registry like PyPI).
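The distinction between the two contexts can be sketched in Python as follows. This is an illustration under stated assumptions, not somesy's implementation: the function names are hypothetical, and the rule that an unset publication_author falls back to the author flag is an assumption of this sketch.

```python
def is_technical_author(person):
    """Author in a technical context (e.g. package registry metadata)."""
    return bool(person.get("author", False))


def is_citation_author(person):
    """Author in an academic citation context (e.g. CITATION.cff)."""
    pub = person.get("publication_author")
    # Assumption: if publication_author is unset, follow the author flag.
    return is_technical_author(person) if pub is None else bool(pub)


people = [
    {"given-names": "Jane", "family-names": "Doe", "author": True},
    {"given-names": "Another", "family-names": "Contributor", "publication_author": True},
]

print([p["given-names"] for p in people if is_technical_author(p)])
print([p["given-names"] for p in people if is_citation_author(p)])
```

With the example input from this manual, only Jane counts as an author in the technical context, while both people count as authors in the citation context.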