The BHL community has already come together to build the world’s largest open access digital library for biodiversity literature and archives. BHL is a vital knowledge repository containing over 500 years of media, text, and data about life on Earth. In the process of liberating content from physical materials, we have learned more about how valuable the data born in literature can be in supporting life on a sustainable planet.
BHL’s value should not be measured only by the content it has to offer the world. BHL’s global collective of naturalists, academic scholars, scientific researchers, librarians, information architects, and web developers represents a community that is diverse, deeply knowledgeable, and committed to BHL’s goal to provide “free, worldwide access to knowledge about life on Earth.” [1] BHL staff are both data curators and data generators: they create metadata, image collections, and share untold stories on the web. Together with Wikimedians, BHL can make universal bioliteracy more than a dream; it can become our shared reality.
The need for the global biodiversity community and its disparate data silos to unify and build a biodiversity knowledge graph rich in human and machine-curated interlinkages has never been greater. It is the missing “technical infrastructure” sought after by climate policymakers, national governments, and intergovernmental organizations. Converting BHL’s dark data into 5-star linked open data that can be shared widely, improved upon, and made actionable is BHL’s current challenge.
“If we want to address the big challenges we face around the future of land use, conservation, climate change, food security, and health, we need efficient ways to bring together all the data capable of helping us understand the changing state of the world and the essential role that biodiversity plays at all scales.” Donald Hobern, Former Executive Secretary of GBIF. [2]
Dr. Rod Page has advocated that we link together the diverse sources of biodiversity data into what he calls the “biodiversity knowledge graph” [3] (Appendix 2: Interviews). In a recent paper, Dr. Page makes the case for Wikidata as the “logical venue for a global database of taxonomic literature, the so-called ‘bibliography of life.’”
“It not only benefits from a community of active editors, it piggy backs on the remarkable fact that taxonomy is the only discipline to have its own Wikimedia Foundation project (Wikispecies). Consequently, a large number of taxonomic works and their authors already exist in Wikidata.” [4]
To further bolster Page’s argument, the number of global biodiversity databases contributing their identifiers to Wikidata is large and growing: Wikidata’s Query Service returns over 1,300 biodiversity databases and catalogs that interlink their records with Wikidata, making it a global biodiversity hub indeed.
There is much work to do to unlock, normalize, and semantically enrich the data present in BHL’s corpus, but with the help of a global movement of Wikimedians, this work does not have to fall on the shoulders of “a relatively small cadre of information professionals” [5]. Demand for BHL’s data is growing, and the work is already underway whether or not BHL actively participates in it. BHL’s main data outputs are already openly available. Bots and Wikimedians such as users Fae, Ambrosia10, Rdmpage, Uncommon_fritillary, Magnus_Manske, and many others are bulk-loading BHL’s data into the Wikimedia ecosystem.
Because BHL is a major data consumer and provider, its data should be measured against Berners-Lee’s 5-star rating scheme, pushed into the Wikimedia ecosystem, and interwoven into the growing global web of data. Integrating higher-quality BHL data into Wikimedia will also fulfill BHL’s commitments to the Bouchout Declaration, signed in 2014, to:
“promote free and open access to data and information about biodiversity by people and computers and to bring about an inclusive and shared knowledge management infrastructure that will allow our society to respond more effectively to the challenges of the present and future.” [6]
Wikimedia is a shared information infrastructure that can spread the Biodiversity Heritage Library’s data-rich collection beyond its own repository. By transforming BHL’s dark data into 5-star linked open data, BHL can realize its goal to become “the most comprehensive, reliable, and reputable” resource to support global challenges.
The challenges ahead will require the BHL community to act strategically and pursue new capacity-building partnerships. To overcome the digital constraints of legacy or missing metadata, prohibitive paywalls, data silos, and proprietary software and formats, BHL needs to extend beyond digitization and curation efforts within its siloed data ecosystem to bridge knowledge gaps and set biodiversity knowledge truly free.