At the heart of all this interlinking work is Wikidata, the Wikimedia Foundation’s most active project in terms of total edits and item count (Appendix 3: Statistics). The platform is generating and providing 5-star linked open data to the world at breakneck speed [1]. The idea for Wikidata was born in 2012 out of a desire to keep the Wikimedia Foundation’s universe of wikis (300+ language editions of Wikipedia and sister Wikimedia projects) in sync. Manually updating articles in each language edition whenever a statistic, figure, or fact about the world changed was time-consuming and inherently unsustainable.
Dr. Denny Vrandečić and Dr. Markus Krötzsch were tasked in 2012 with leading a team of Wikimedia Deutschland developers to build Wikidata [2]. Today, the collaboratively edited knowledge base has grown into the Internet's largest crowdsourced linked open data repository [3]. The Wikidata community has 24,000+ active users who have edited 102M+ items more than 1,800,000,000 times (Appendix 3: Statistics). All data in Wikidata is released under the Creative Commons CC0 license and is free for all to use.
“Wikidata helps with the problem of opening data silos. There is a lot of knowledge out there, we know a lot about our surroundings, but it is hidden behind paywalls in silos. Wikidata bursts these silos open for the greater public to be used.”
–Andra Waagmeester, Micelio Bioinformatician [4] (Appendix 2: Interviews)
Wikidata, the world’s largest openly editable knowledge base, and its underlying software, Wikibase, present a compelling vision: de-siloing all of the world’s data.
Achieving this vision, however, comes with challenges:
“The project suffers from the biases and vandalism that plague other Wikimedia projects. Including gender gaps in the contributor base—the majority of the volunteer editors are male. And the majority of the data is from—and about—the Northern Hemisphere. The project is young, Giesemann emphasizes” [6].
Despite its detractors, Wikidata holds extraordinary promise to break down data silos. This growing global data hub is now an integral part of our daily lives. Alexa, Siri, and Google all use Wikidata to drive their widely relied-upon search services. Google decommissioned its knowledge graph database, Freebase, and in 2014 opted to migrate its data to Wikidata [7]. Today, Google Search relies heavily on data from Wikidata, Wikipedia, schema.org microdata, and other licensed data sources to drive its knowledge graph panels [8].
Because Wikidata is a central data repository that drives core third-party Internet services, the quality, coverage, and structure of its data have never mattered more. At its best, Wikidata opens new knowledge pathways, connects disparate ideas, and helps uncover untold stories from the margins. At its worst, it can reinforce old power structures, highlight gaps in human knowledge, and reduce complexity to the point of harm [9].
As a data source, BHL holds immense untapped potential and would improve Wikidata’s quality and breadth in the life sciences. According to independent Wikimedian Andy Mabbett, the Wikispecies project is another rich source of biodiversity data and persistent identifiers that should be migrated to Wikidata to gain its multilingual, search, and interlinking advantages. BHL staff have deep expertise in advanced data modeling and crosswalking and are ready to assist with biodiversity data enrichment projects and ontology refinements.
Wikidata may be humanity’s opportunity to re-imagine and re-frame knowledge representation and, in so doing, perhaps a chance to change our collective narrative to respect, include, and celebrate the diversity of all life on Earth.
“Until Wikidata can give me a list of all movies shot in the 60s in Spain, in which a black female horse is stolen by a left-handed actor playing a Portuguese orphan, directed by a colorblind German who liked sailing, and written by a dog-owning women [sic] from Helsinki, we have more work to do.”
User: Tobias1984, Wikidata project chat, 2013