Join us now: Best Practices for citizen science data
Oct. 19, 2020, 10:25 a.m.
By Lien Reyserhove, Quentin Groom, Tim Adriaens, Jan Pergl, Toril Moen, Sofie Meeus
The vast majority of species distribution records come from citizen scientists. These data complement records from professional researchers and help global efforts to conserve biodiversity and understand the natural world. Invasive alien species policies depend heavily on citizen science data. Many research and policy activities related to invasive alien species require pooling data, so open data publication, standardization and interoperability are particularly important. Without good data publishing practices, these data get lost or become useless: over time, the link between the data and the methods used to collect them disappears. The FAIR Principles have become the gold standard for publishing scientific data. They are meant to ensure that data are Findable, Accessible, Interoperable and Reusable, and can be used for science indefinitely. Nevertheless, these are generic principles for any kind of data, and there is a need to refine and tailor them specifically for biodiversity data collected by citizen scientists.
The primary place for publishing open biodiversity observation data is the Global Biodiversity Information Facility (GBIF). Publishing data through GBIF meets many of the FAIR principles automatically. For example, a dataset published through GBIF is accompanied by a Digital Object Identifier (DOI), which makes the dataset findable through a searchable resource such as Google Dataset Search. GBIF also obliges data publishers to conform to community standards, such as Darwin Core and Ecological Metadata Language, which makes the data interoperable. Even so, the richness of the metadata and attributes associated with a dataset (principles F2 and R1, https://go-fair.org/fair-principles) depends on how well publishers implement these standards. Rich metadata should provide information about the context, quality and characteristics of the data and should allow a computer to carry out routine sorting tasks. However, the richness of metadata associated with citizen science datasets varies widely. The problem is not that the bar needs to be set higher, but that the bar is not visible. We need clear, achievable guidelines for these datasets. These guidelines should be simple and available in a variety of languages, so that every researcher involved in alien species citizen science projects can understand and apply them.
Within Working Group 3 (Data Management and Standards) of the COST Action 17122 “Increasing understanding of alien species through citizen science” (Alien CSI, https://alien-csi.eu), we aim to develop metadata guidelines for publishing citizen science datasets on alien species. These guidelines should be tailored to the needs of citizen scientists and researchers alike. With this blog post, we are looking for collaboration with citizen science researchers, project coordinators and citizen science system owners, although everyone is welcome to join us. More specifically, we are looking for contributions to (1) formulate metadata guidelines for publishing citizen science datasets, (2) produce an informative leaflet to be distributed among citizen scientists and (3) translate these recommendations into a multitude of languages. Discussions will be organized monthly via conference call. Knowledge exchange and acquisition can also be facilitated through a Short-Term Scientific Mission (STSM).
Anyone willing to contribute to this work or to take part in an STSM is very welcome to join our working group. Please contact firstname.lastname@example.org for further information.
A conversation on this topic has been started on our community forum.
SangyaPundir [CC BY-SA 4.0], from Wikimedia Commons