This group supports the activities for Nodes associated with the 22nd meeting of the GBIF Governing Board in October 2015.

PRE-COURSE ACTIVITY: Promoting data publishing

Mélianie Raymond
761 days ago

In preparation for the Nodes training event in Madagascar, we would like to invite nodes and their teams to share their experiences with promoting data publishing:

What are the challenges that you face when promoting data publishing? How have these challenges changed over time? Which strategies and resources do you normally use to promote data publishing?

Looking forward to hearing your ideas!


Cees Hof
757 days ago

Some tips and tricks:

  • Start simple, start with metadata only. If data owners have difficulties with data publishing of "all" of their data, start with extended metadata publications (in GBIF obviously) and demonstrate the benefits of the online presence of your organization and (meta)data in GBIF.
  • Promote and realize the principle of "one-off publishing, multiple exposure". Show an example like a GBIF dataset also visible in VertNet. Try to hook up with other data networks, nationally or internationally. For example, Dutch GBIF metadata will soon also be visible in our national scholarly database of open datasets and publications.
  • Provide hosted IPT services; this removes the technical threshold that still exists for a lot of the (smaller) data publishers.
  • Optimise the visibility of your data publishers through your (multi or bilingual) website, project pages, social media, etc.

Nicolas Noé
739 days ago

As our community matures and the amount of data grows, I think it is finally time to pay attention to the licensing we choose to apply to our datasets. Public data is good, but Open Data that can actually be used without legal risks in a wide range of countries and situations is much, much more useful.

I think this subject has been a bit neglected so far for multiple reasons, the biggest being probably that it's extremely complex and that very few of us have even a "simple but solid" grasp of it. This is problematic when working with data owners, and that frequently leads to fear-based decisions using incorrect assumptions. I therefore think all of us who regularly promote data publishing should get better at this topic in the near future.

I think an excellent starting point is "Why we should publish our data under Creative Commons Zero (CC0)" by Canadensys/Peter Desmet. It debunks, in simple language, the most common myths and fears about things like citations, copyrightability of biodiversity data, ...

It's also worth noting that other Open communities (Open Source, other open data initiatives, Wikipedia ...) are older than GBIF and have already had much more experience with licensing issues. When faced with challenging licensing questions, I think it would be foolish NOT to look a bit further than our own community and build upon their past experiences (often learned the hard way). A good example is how and why OpenStreetMap had to change its license after a few years of activity, and the fact that some data had to be removed (if the data owner disapproved of the new license, or was simply impossible to reach).

Hope this helps! 

730 days ago

I agree with Nicolas. Data licensing is one of the biggest issues whenever I meet potential data publishers and try to convince them to publish their data. IPT version 2.2, for instance, requires data published through it to use CC0 or CC-BY. I know this decision came from a long and thorough consultation process and it has benefits, but it is difficult to convince new people to accept this.

The resource mentioned by Nicolas, "Why we should publish our data under Creative Commons Zero (CC0)", has been a good help, and it would be beneficial to have other materials to use for this purpose, e.g. brochures, flyers, etc., to raise awareness of the benefits of non-restrictive data licensing and to dismiss data holders' fears that their data will be wasted if published in the public domain.

Cees Hof
730 days ago

The issue of licensing should be considered in combination with topics such as data quality and sensitive data. In my experience, data owners do not fear the CC0 or CC-BY statements so much; they simply compensate for them by lowering the quality of their data, for example by blurring the geographic precision of the data.

Hanna Koivula
730 days ago

I agree! We just had a debate about this in the FinBIF data policy process. I'm thinking we might end up with a solution where we define a "data-core" which will be available under CC0, making the data discoverable globally. For example, a global portal would have CC0 access to all dataset-level metadata, the taxon-location-time of the record and the basis of record (or something else defining the fitness-for-use on a global scale). The richer data would be available, but with at least CC-BY or stricter licensing. Being very clear about the benefits and constraints of any license will be crucial, especially when mobilising sample-based data.
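
To make the "data-core" idea concrete, here is a minimal sketch in Python. It is only an illustration under stated assumptions: the choice of core fields, the function name `split_record` and the sample record are hypothetical, not an agreed FinBIF policy. The field names are Darwin Core terms.

```python
# Hypothetical "data-core" split: which fields count as core is an assumption
# made for illustration, not an agreed policy. Field names are Darwin Core terms.
CORE_FIELDS = (
    "scientificName",    # taxon
    "decimalLatitude",   # location
    "decimalLongitude",  # location
    "eventDate",         # time
    "basisOfRecord",     # basis of record
)


def split_record(record):
    """Split one occurrence record into a CC0 core and a richer CC-BY part."""
    core = {k: v for k, v in record.items() if k in CORE_FIELDS}
    rich = {k: v for k, v in record.items() if k not in CORE_FIELDS}
    return (
        {"license": "CC0", "fields": core},
        {"license": "CC-BY", "fields": rich},
    )


# Made-up sample record, for illustration only.
sample = {
    "scientificName": "Parus major",
    "decimalLatitude": 60.17,
    "decimalLongitude": 24.94,
    "eventDate": "2015-06-01",
    "basisOfRecord": "HumanObservation",
    "recordedBy": "A. Observer",
    "occurrenceRemarks": "Singing male",
}

core_part, rich_part = split_record(sample)
```

Here `core_part` holds what a global portal could expose under CC0, while `rich_part` holds the remaining fields under the stricter license.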


Anne-Sophie Archambeau
728 days ago

I agree with all the previous comments; they are really close to our own experience.

But I would like to add that we also promote and organise training on data papers, and that it is well received. Although people often remark that a data paper is not really recognised as a "real" publication for work evaluation, we have had really good feedback from those who have already published data papers (a few French data publishers have already observed a direct impact on the visibility and number of downloads of their datasets). So it helps to convince new publishers.

Dairo Escobar
724 days ago


In Colombia we have several challenges: 1) Publishers consider publishing resources (data and metadata) to be a time-consuming activity. 2) Lack of knowledge of the scope of the licenses for data use. 3) Few options to see who is using their data and how. 4) Publishers prefer to publish data without coordinates, for fear of how their data might be used and of not receiving attribution.

Our strategies to address these challenges focus on running trainings and workshops on publishing and on structuring data according to DwC. We present basic and advanced tools (as a data publishing kit), with step-by-step tutorials in documents and video. We also have a small team of people who help publishers through the process.