Integrated Publishing Toolkit (IPT) - Swedish Biodiversity Data Infrastructure

The main data publishing tool provided by GBIF is the Integrated Publishing Toolkit (IPT). It is an open-source software tool for managing, publishing and sharing biodiversity data via the GBIF network. The IPT is widely used by GBIF nodes around the world (see https://www.gbif.org/ipt). The IPT can be set up locally by any data provider, but the most efficient and convenient way to use the IPT is to request a free account from a trusted data hosting center running an IPT instance. Swedish biodiversity data publishers are strongly encouraged to use the national IPT instance managed by SBDI/GBIF-Sweden (http://www.gbif.se/ipt).

The IPT supports publication of various types of data resources (occurrences, checklists, sampling-event data) in Darwin Core format (including registered Darwin Core Extensions) and metadata about these datasets in EML format. The IPT works exclusively with open data; sensitive data has to be filtered out before publication through the IPT.

Landing page for the Swedish IPT installation (https://www.gbif.se/ipt/).

The IPT provides a graphical web interface that allows you to manage your data and metadata manually. The data can be fed into the IPT manually as static files or automatically via a connection to a database. The IPT allows you to map the data fields from the source to the relevant Darwin Core terms, including terms from registered Darwin Core Extensions. The IPT also ensures that your metadata are structured according to the EML specification.
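The mapping step can be pictured as a simple lookup from source column names to Darwin Core terms. The following is only an illustrative sketch, not how the IPT works internally: the source column names (`species`, `lat`, `lon`, `date`, `record_id`) and the function `map_to_darwin_core` are hypothetical, while the target names are standard Darwin Core terms.

```python
# Hypothetical source column names mapped to standard Darwin Core terms.
COLUMN_TO_DWC = {
    "record_id": "occurrenceID",
    "species": "scientificName",
    "lat": "decimalLatitude",
    "lon": "decimalLongitude",
    "date": "eventDate",
}


def map_to_darwin_core(record, mapping=COLUMN_TO_DWC):
    """Rename the mapped columns of one source record; unmapped columns are dropped."""
    return {mapping[name]: value for name, value in record.items() if name in mapping}


source_row = {"record_id": "obs-001", "species": "Parus major", "lat": "59.33", "note": "seen at feeder"}
print(map_to_darwin_core(source_row))
```

In the IPT itself this mapping is configured through the web interface; a script like this is only useful for checking beforehand that every source column has an obvious Darwin Core counterpart.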

Each data resource in the IPT can be configured so that the dataset is published on demand, when the data owner requests it. It can also be set up so that the data are published automatically at a predetermined interval; this is the preferred option when the dataset is updated frequently and the data are fed into the IPT through a database connection. The mechanism for harvesting data from external systems can be selected to accommodate providers and the systems they have in place to expose their data (SOAP service, batch file download etc.).

The IPT publishes datasets as valid and versioned Darwin Core Archives (DwC-A). The DwC-A files can be accessed directly from the Application Programming Interface (API) of the IPT, which constitutes an integral part of the SBDI data layer. The published data are harvested at regular intervals and automatically become searchable through GBIF.org and the SBDI Bioatlas.
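A Darwin Core Archive is a zip file containing a `meta.xml` descriptor and one or more delimited text files. As a rough sketch of how a consumer might read the core table of an archive that has already been downloaded, the example below uses only the Python standard library. The function name `read_core_rows` is an assumption made for illustration, and the sketch ignores extension files, default values and custom quoting rules that a complete reader would need to handle.

```python
import csv
import io
import xml.etree.ElementTree as ET
import zipfile

# XML namespace used by the meta.xml descriptor of a Darwin Core Archive.
NS = "{http://rs.tdwg.org/dwc/text/}"


def read_core_rows(archive_bytes):
    """Return the core table of a Darwin Core Archive as a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(archive_bytes)) as zf:
        meta = ET.fromstring(zf.read("meta.xml"))
        core = meta.find(f"{NS}core")
        location = core.find(f"{NS}files").find(f"{NS}location").text
        # meta.xml writes a tab delimiter as the two characters "\t".
        delimiter = core.get("fieldsTerminatedBy", ",").replace("\\t", "\t")
        skip = int(core.get("ignoreHeaderLines", "0"))

        # Map column index -> short term name (last segment of the term URI).
        columns = {}
        id_elem = core.find(f"{NS}id")
        if id_elem is not None and id_elem.get("index") is not None:
            columns[int(id_elem.get("index"))] = "id"
        for field in core.findall(f"{NS}field"):
            if field.get("index") is not None:
                columns[int(field.get("index"))] = field.get("term").rsplit("/", 1)[-1]

        text = zf.read(location).decode("utf-8")

    rows = list(csv.reader(io.StringIO(text), delimiter=delimiter))[skip:]
    return [
        {name: row[i] for i, name in columns.items() if i < len(row)}
        for row in rows
    ]
```

The function takes the raw bytes of the archive, so it can be combined with any download method; each returned dict is keyed by short Darwin Core term names such as `scientificName`.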

Each dataset is provided with a DOI (Digital Object Identifier) by GBIF (the GBIF secretariat through the gbif.org infrastructure) when it is published by the IPT. The DOI allows researchers who use the data to cite the dataset. Data publishers are encouraged to also document their dataset(s) in data papers. The metadata associated with a data resource can be downloaded from the IPT in EML or RTF format and used in the publication of data papers.

The IPT itself does not process or validate the data. Such processing is reserved for consumers of the data, such as GBIF.org or Living Atlases (LA) platforms. There are several validation tools that allow you to find and correct any issues there might be in the dataset both before and after publication, as detailed below, and we recommend that you use one of these tools to maximize the utility of the data. More information on data validation can be found in the guide describing how to improve data quality.
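Because the IPT does not validate the data, a few basic checks can be run on the records before upload. The sketch below is not a substitute for the validation tools mentioned above; it only illustrates the kind of issue they look for. It assumes occurrence records held as dicts keyed by Darwin Core term names, and the function name `find_issues` is hypothetical. The four terms in `REQUIRED_TERMS` are those GBIF lists as required for occurrence datasets.

```python
# Terms GBIF lists as required for occurrence datasets.
REQUIRED_TERMS = ("occurrenceID", "basisOfRecord", "scientificName", "eventDate")

# Coordinate terms and the largest absolute value each may take.
COORDINATE_LIMITS = (("decimalLatitude", 90.0), ("decimalLongitude", 180.0))


def find_issues(record):
    """Return a list of human-readable problems found in one occurrence record."""
    issues = []
    for term in REQUIRED_TERMS:
        if not str(record.get(term, "")).strip():
            issues.append(f"missing {term}")
    for term, limit in COORDINATE_LIMITS:
        value = record.get(term)
        if value in (None, ""):
            continue  # coordinates are optional in this sketch
        try:
            number = float(value)
        except (TypeError, ValueError):
            issues.append(f"{term} is not a number")
            continue
        if not -limit <= number <= limit:
            issues.append(f"{term} out of range")
    return issues


record = {
    "occurrenceID": "obs-001",
    "basisOfRecord": "HumanObservation",
    "scientificName": "",
    "eventDate": "2020-05-17",
    "decimalLatitude": "95",
}
print(find_issues(record))
```

Running such a check over every record before upload gives a quick list of rows to fix, after which a dedicated validation tool can be used for a more thorough review.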

The IPT does not require taxonomic identifiers to adhere to any particular taxonomy. This allows the mobilization of data containing non-Linnaean taxonomic names, e.g. species hypotheses based on sequence data, names of functional groups, manuscript names, or identifiers of molecular taxonomic units.

An SBDI Data Publishing License Agreement must be signed before making your dataset public. There are no automatic checks in the IPT that stop you from publishing your data before such an agreement is signed, but doing so can have serious legal implications (e.g., GDPR infringements). The IPT supports openly licensed data only (CC0, CC-BY, CC-BY-NC), that is, the same data publishing licenses supported by SBDI.

The web-based interface of the IPT provides functions for administration of user accounts (including assignment of user roles) and registered publishing organizations. You can also add, remove or update information on the set of Darwin Core standards, Darwin Core Extensions and controlled vocabularies used in encoding the dataset. Access to most of these functions in the national IPT instance is restricted to SBDI staff.

More information about how to use the IPT can be found in the GBIF IPT video tutorial.

Contact the SBDI Support Center if you are interested in sharing your data, if you would like support in preparing your data for publication or if you would like to get an account created on the Swedish IPT installation.