Overview • Description • Get started • Contributing • Citation • Acknowledgments • References
This repository is a bank that centralizes metadata files describing published trait databases. Its content (the metadata/ folder) is used by the R package traitdatabases to download, import, clean, and homogenize trait data.
Each metadata file holds all the information needed to document and process a trait database. It is structured in four sections:

- status: the status of the metadata file
- dataset: general metadata of the dataset
- taxonomy: information about taxonomic columns
- traits: description of the trait data
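
As a rough sketch of how the four sections might fit together in one file, the skeleton below assumes that each section is a top-level key and that traits is a list (as in the categorical example further down). The exact nesting is an assumption; a file generated by td_create_metadata_file is the authoritative template.

```yaml
# Assumed layout: one top-level key per section (not confirmed by this README)
status: draft

dataset:
  id: hodgson_2023
  # ... remaining dataset tags

taxonomy:
  binomial: Species
  # ... genus and species tags

traits:
  - variable: SLA
    # ... name, category, type, units
```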
Tags of the status section:

| Tag name | Description | Example |
|---|---|---|
| status | The status of the metadata file. Should be one of: draft; incomplete (some metadata still need to be added); complete (all metadata have been filled in) | draft |
Tags of the dataset section:

| Tag name | Description | Example |
|---|---|---|
| id | The dataset identifier[^1]. It is up to you to choose this identifier: it can be the name of the database, the first author and year, etc. | hodgson_2023 |
| title | The dataset title. Typically the title of the (data) paper. | "A functional trait database of arable weeds from Eurasia and North Africa" |
| description | A short description of the dataset. | "The functional traits of […] for 928 arable weed species." |
| license | The dataset license. | CC BY-SA 4.0 |
| bibtex | The name of the dataset citation file in BibTeX format (optional). | hodgson_2023.bib |
| doi | The Digital Object Identifier (DOI) of the dataset description. | 10.5287/ora-pp4y9nkoz |
| url | The URL of the dataset description (paper). | https://ora.ox.ac.uk/objects/uuid:abafafd9-e8a2-4e84-a339-0a11bf2858ae |
| taxon | The taxonomic group (mammals, birds, etc.). | plants |
| taxonomic_level | The taxonomic resolution (individuals, species, genus, etc.). | species |
| type | One of: static (a file that can be downloaded); api (data are accessed through a query). | static |
| file_url | The full URL to download the static file. NB: equal to .na if type: api. | https://ora.ox.ac.uk/objects/uuid:abafafd9-e8a2-4e84-a339-0a11bf2858ae/files/s8p58pf68w |
| file_name | The name of the static file (with file extension). NB: equal to .na if type: api. | Functional+trait+database+of+arable+weeds+from+Eurasia+and+North+Africa.xlsx |
| file_extension | The file extension of the static file. NB: equal to .na if type: api. | .xlsx |
| manual_download | Does the data file need to be downloaded manually? One of: yes (only for specific cases, such as data hosted by Wiley Online Library); no (most cases: the data file can be downloaded from the command line); .na (if type: api). | no |
| sheet | The number of the sheet that contains the data (only for xlsx files). NB: equal to .na for non-Excel files or if type: api. | 1 |
| long_format | Are the trait data in long format? One of: yes (data are in long format); no (data are in wide format, most cases); .na (if type: api). | no |
| skip_rows | The number of header rows to remove (if any). | .na |
| col_separator | The character used to separate columns (for txt or csv files). | .na |
| na_value | The characters used for missing values (if any). | NA |
| comment | Any relevant comment (if any). | .na |
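
To make the table concrete, here is a sketch of a dataset section for a static file, filled in with the example values listed above. The dataset key and the indentation are assumptions about the file layout; the values are copied from the Example column, including the elided description.

```yaml
# Assumed nesting: all dataset tags under a top-level "dataset" key
dataset:
  id: hodgson_2023
  title: "A functional trait database of arable weeds from Eurasia and North Africa"
  description: "The functional traits of […] for 928 arable weed species."
  license: CC BY-SA 4.0
  bibtex: hodgson_2023.bib
  doi: 10.5287/ora-pp4y9nkoz
  url: https://ora.ox.ac.uk/objects/uuid:abafafd9-e8a2-4e84-a339-0a11bf2858ae
  taxon: plants
  taxonomic_level: species
  type: static
  file_url: https://ora.ox.ac.uk/objects/uuid:abafafd9-e8a2-4e84-a339-0a11bf2858ae/files/s8p58pf68w
  file_name: Functional+trait+database+of+arable+weeds+from+Eurasia+and+North+Africa.xlsx
  file_extension: .xlsx
  manual_download: no
  sheet: 1
  long_format: no
  skip_rows: .na
  col_separator: .na
  na_value: NA
  comment: .na
```

For a dataset with type: api, the table indicates that file_url, file_name, file_extension, manual_download, sheet, and long_format would all be set to .na.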
Tags of the taxonomy section:

| Tag name | Description | Example |
|---|---|---|
| genus | The column name of the genus (when species and genus names are separated). | .na |
| species | The column name of the species (when species and genus names are separated). | .na |
| binomial | The column name of the binomial name. | Species |
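
The two situations described in the table could look roughly like this. The first case uses the example values above; the commented-out variant uses hypothetical column names (Genus, Epithet) purely for illustration, and the taxonomy key itself is an assumed layout.

```yaml
# Case from the table: one column ("Species") holds the full binomial name
taxonomy:
  genus: .na
  species: .na
  binomial: Species

# Hypothetical variant: genus and species stored in separate columns
# taxonomy:
#   genus: Genus        # hypothetical column name
#   species: Epithet    # hypothetical column name
#   binomial: .na
```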
Tags of the traits section:

| Tag name | Description | Example |
|---|---|---|
| variable | The column name of the trait (as in the data file). | SLA |
| name | The full name of the trait. | Specific leaf area |
| category | The category of the trait[^2]. | Leaf morphology |
| type | The type of the trait. One of: quantitative; categoric. | quantitative |
| units | The "original" unit of the trait (for quantitative traits only). | mm2.mg-1 |
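
As an illustration, a quantitative trait entry built from the example values in the table might look like the sketch below. It assumes that traits is a list with one item per trait and that quantitative traits need no further tags; neither point is stated explicitly here.

```yaml
# Assumed layout: one list item per trait under a top-level "traits" key
traits:
  - variable: SLA              # column name in the data file
    name: Specific leaf area
    category: Leaf morphology
    type: quantitative
    units: mm2.mg-1            # unit as reported in the original dataset
```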
For categorical traits, every category must be listed under levels with two fields: value and description. Together they describe each level of the categorical trait. For instance:
```yaml
- variable: VEGPROP
  name: Vegetative propagation of perennials
  category: reproduction
  type: categorical
  units: .na
  levels:
    - value: 1
      description: yes
    - value: 0
      description: no
```
Follow this 6-step procedure to submit a metadata file for a new trait dataset.

Before proceeding, make sure that the dataset you want to add is not already in traitdatabases-metadata. Also, have a look at the Hodgson et al. (2023) example to understand which information is expected in each field.
Click on the Fork icon at the top right of this repository, then click on ‘Create fork’.

```r
## Install < remotes > and < here > packages (if not already installed) ----
## requireNamespace() checks one package at a time, so loop over the names
for (pkg in c("remotes", "here")) {
  if (!requireNamespace(pkg, quietly = TRUE)) {
    install.packages(pkg)
  }
}

## Install < traitdatabases > from GitHub ----
remotes::install_github("frbcesab/traitdatabases")
```
Choose the name of your dataset. Then, use the function td_create_metadata_file as follows:

```r
traitdatabases::td_create_metadata_file(
  name = "hodgson_2023", # name of your dataset
  path = here::here("metadata")
)
```
Depending on the format, open the metadata file in a text editor or in Excel and fill in the metadata.

It is recommended to add a BibTeX file with the citation information of the trait database in the same folder as the metadata file.

Double-check that everything is complete and free of errors. Finally, update the status of the metadata file (‘complete’ or ‘incomplete’), stage the new files, and commit the changes.
Your pull request will contribute directly to the growth of the metadata on trait databases, and it will be reviewed as soon as possible.

Thank you for your contribution :)
All types of contributions are encouraged and valued. For more information, check out our Contributor Guidelines.

Please note that the traitdatabases-metadata project is released with a Contributor Code of Conduct. By contributing to this project, you agree to abide by its terms.
Please cite traitdatabases as:

Casajus N, Coux C & Frelat R (2025) traitdatabases: An R package to compile trait databases. R package version 0.0.0.9000. https://github.com/frbcesab/traitdatabases/
Acknowledgments: coming soon…

References: coming soon…
[^1]: The dataset identifier should be short and should only contain letters, numbers, and the underscore symbol (_).