Other data types and data subject to special requirements

Some data types may need to be adapted to fit specific needs, for example to be interoperable with other reporting systems or with a particular structure of DwC data cores, terms or extensions. One example is marine biological research data: data reported to the Ocean Biodiversity Information System (OBIS) should follow the OBIS best practices so that the same data can also be published in GBIF.org without additional processing. For further reading, see https://obis.org/manual/dataformat/.

Biologging data (covering positions, movements, and sensor data from tracked animals) constitute an example of a special data type that requires customized data publishing pipelines. Within SBDI, we can accommodate biologging data in various formats in the WRAM (Wireless Remote Animal Monitoring) or CAnMove (Centre for Animal Movement research) databases. There is some support in the LA platform for biologging data, and there is also active work to develop community standards and to integrate these data in biodiversity platforms. SBDI contributes actively to these efforts. For more information, see https://www.tdwg.org/community/mobs/ for machine observations in general, and https://github.com/tdwg/dwc-for-biologging for biologging in particular.

Archaeological data is another data resource requiring customized publishing tools and data standards. Special concerns arise in both the taxonomic and the temporal indexing of archaeological records of biodiversity. SBDI is active in developing both tools and standards for this type of data.

All data publishing tools support the inclusion of links to images or other media files documenting records in standard resource types, like Occurrence Data. To facilitate sharing, it is important that a direct link is provided to the location of the media file itself (image, sound, video, etc.), and that this file is made available under one of the SBDI open licenses. Neither SBDI nor GBIF provides primary storage for multimedia files; this is the responsibility of the data provider. With the increasing focus on machine observations, we expect a growing interest in publishing datasets that primarily consist of media files. For such datasets, there is a dedicated data standard for media resources that might be relevant, the Audubon Core. Collected information and best practices on how to share multimedia can be found in this GBIF blog post.
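As a rough illustration of what a direct link means in practice, the following Python sketch flags URLs that appear to point at a media file rather than at a landing page. It is a heuristic only, based on the file extension in the URL. The function name and the example.org URLs are hypothetical and not part of any SBDI or GBIF tool.

```python
import mimetypes

# Media categories considered here; an assumption made for this sketch.
MEDIA_PREFIXES = ("image/", "audio/", "video/")


def looks_like_direct_media_link(url):
    """Return True if the URL's file extension maps to an image, sound or video type.

    This only inspects the URL text. It does not fetch the resource, so a
    link without a file extension is reported as False even if it serves media.
    """
    guessed_type, _encoding = mimetypes.guess_type(url)
    return guessed_type is not None and guessed_type.startswith(MEDIA_PREFIXES)


# Hypothetical links, for illustration only.
direct_link = "https://example.org/media/parus_major_001.jpg"
landing_page = "https://example.org/records/12345"

direct_ok = looks_like_direct_media_link(direct_link)
landing_ok = looks_like_direct_media_link(landing_page)
```

A check like this can catch obvious landing-page links before publishing, but it cannot confirm that the file is reachable or that its license is one of the accepted open licenses.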

The raw data may contain information (fields) for which no appropriate Darwin Core term exists. As mentioned previously, the IPT gives considerable flexibility in specifying which ontologies or controlled vocabularies are used to describe the records in the dataset. Thus, if the fields are covered by a Darwin Core Extension, or by any other authoritative international standard ontology or controlled vocabulary, the information can be shared by referencing this source. If there is no such source, it is still possible to share the information using the DwC field dynamicProperties. The recommendation is to structure the information as key:value pairs in a common interchange format, such as JSON. Such an approach makes it possible to share these non-standard fields, and it makes it at least possible, if complicated, to use the information in analyses across datasets.
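A minimal Python sketch of this recommendation is shown here, assuming the non-standard fields are available as a dictionary for each record. The helper names, the field names (wingLengthInMm, moultScore, ringColour) and the occurrenceID value are hypothetical and chosen only for illustration.

```python
import json


def to_dynamic_properties(extra_fields):
    """Serialize non-standard fields as a compact JSON object of key:value pairs.

    Empty values are dropped, and keys are sorted so that the same input
    always produces the same string.
    """
    cleaned = {k: v for k, v in extra_fields.items() if v is not None and v != ""}
    return json.dumps(cleaned, sort_keys=True, ensure_ascii=False, separators=(",", ":"))


def from_dynamic_properties(value):
    """Parse a dynamicProperties string back into a dictionary."""
    return json.loads(value) if value else {}


# Hypothetical record: the first two keys are Darwin Core terms, the
# remaining information has no matching term and goes into dynamicProperties.
record = {
    "occurrenceID": "urn:example:occurrence:1",
    "scientificName": "Parus major",
    "dynamicProperties": to_dynamic_properties(
        {"wingLengthInMm": 74.5, "moultScore": 3, "ringColour": None}
    ),
}

# A consumer of the dataset can recover the individual values again.
recovered = from_dynamic_properties(record["dynamicProperties"])
```

Putting the unit in the key name, as in wingLengthInMm, is one way to keep the values interpretable for users who have only the published dataset.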

There are a number of concerns that may arise in taxonomic indexing of biodiversity data records. See below for more information on this.

If you have any questions concerning how to publish your data, do not hesitate to contact the SBDI Support Center for assistance in finding the most up-to-date information and in identifying the relevant best practices for your dataset.