Wikipedia is a behemoth, wielding massive authority across the web. The site consistently ranks among the ten most visited sites in the world, receiving more than 4 billion visits per month on average.
Wikipedia is now a go-to source for fact-checking, and many academics argue that a popular Wikipedia page is more reliable than a scholarly peer-reviewed article [1]:
“...I argue that the content of a popular Wikipedia page is actually the most reliable form of information ever created. Think about it—a peer-reviewed journal article is reviewed by three experts (who may or may not actually check every detail), and then is set in stone. The contents of a popular Wikipedia page might be reviewed by thousands of people.” [2]
Perhaps a lesser-known fact about Wikipedia is that a majority of its impressive user traffic comes from its symbiotic relationship with Google. Wikimedia’s Diff blog reports that “75% of reader sessions (pages viewed by the same user) come from search engines and 90% of these come from a single search engine, Google.” [3]
These impressive stats are due in part to the fact that Google announced in 2014 that it would retire Freebase, the open database that had helped seed its Knowledge Graph, and it now uses Wikipedia and Wikidata as pre-eminent sources of structured data to drive its Knowledge Graph Search API. [4]
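To make that relationship concrete, here is a minimal sketch, in Python with the requests library, of a query against the public Knowledge Graph Search API; for most entities, the detailedDescription field in the response links back to the corresponding Wikipedia article. The API key and the example query string are placeholders for illustration, not part of any BHL workflow.

import requests

API_KEY = "YOUR_GOOGLE_API_KEY"  # placeholder; issued via the Google Cloud Console

def knowledge_graph_search(query, limit=1):
    # Query the public Knowledge Graph Search API endpoint.
    response = requests.get(
        "https://kgsearch.googleapis.com/v1/entities:search",
        params={"query": query, "key": API_KEY, "limit": limit},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("itemListElement", [])

for item in knowledge_graph_search("Biodiversity Heritage Library"):
    result = item.get("result", {})
    description = result.get("detailedDescription", {})
    # For most entities, detailedDescription.url points back to a Wikipedia article.
    print(result.get("name"), "->", description.get("url"))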
Siobhan Leachman explains how this all comes together in a “virtuous circle of reuse.” And it isn’t just Google that uses Wikipedia and its sister projects to proliferate BHL content throughout the web; other biodiversity knowledge bases do so as well:
“But it isn’t only Google that ingests Wikipedia, Wikidata and Wikimedia Commons content. Citizen science websites such as iNaturalist also use them. After writing an article in Wikipedia, that article can then be found in iNaturalist. If you go to the species page and look at the “About” tab you will see a copy of the Wikipedia species article. Wikimedia Commons images can also be curated into iNaturalist. [...] It assists citizen scientists in identifying their current biodiversity observations. Once an observation is uploaded and confirmed in iNaturalist, the observation data is in turn ingested into GBIF (Global Biodiversity Information Facility). So BHL plays a part in ensuring the improved accuracy of data generated by iNaturalist and used in GBIF.” [5]
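That chain of reuse can even be traced programmatically. The sketch below, in Python against the public iNaturalist and GBIF REST APIs, looks up a taxon on iNaturalist, whose record carries a wikipedia_url pointing at the article shown in the “About” tab, and then counts GBIF occurrence records for the same name. The endpoints and field names reflect a best-effort reading of those public APIs, and the example species is purely illustrative.

import requests

def inaturalist_taxon(name):
    # Look up a taxon on iNaturalist; matching records include a wikipedia_url
    # field, the article surfaced in the species page "About" tab.
    response = requests.get(
        "https://api.inaturalist.org/v1/taxa",
        params={"q": name, "per_page": 1},
        timeout=10,
    )
    response.raise_for_status()
    results = response.json().get("results", [])
    return results[0] if results else {}

def gbif_occurrence_count(scientific_name):
    # Count GBIF occurrence records for the same name; many of these records
    # originate as confirmed iNaturalist observations.
    response = requests.get(
        "https://api.gbif.org/v1/occurrence/search",
        params={"scientificName": scientific_name, "limit": 0},
        timeout=10,
    )
    response.raise_for_status()
    return response.json().get("count", 0)

taxon = inaturalist_taxon("Danaus plexippus")
print("Wikipedia article reused by iNaturalist:", taxon.get("wikipedia_url"))
print("GBIF occurrence records:", gbif_occurrence_count("Danaus plexippus"))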
Since 2008, BHL has been aware that Wikipedia is a logical pathway toward de-siloing BHL’s content. Yet despite all of the beneficial outcomes Wikipedia campaigns bring for GLAMs, let’s face it: article creation in Wikipedia is hard. To be a successful Wikipedia editor, one must master the maze of rules that undergirds the website, including but not limited to Wikipedia’s Five Pillars, the Manual of Style, and its article quality standards.
These policies exist for good reason: they are necessary measures in the fight against disinformation campaigns, which will only grow in the coming years with the rise of AI. Wikipedia is no stranger to attacks on its content and must remain vigilant in defense of knowledge integrity [6]. Now more than ever, we need evidence-based discourse.
Nevertheless, there remain knowledge equity problems to solve. The Wikimedia Research Team recognizes that vast and important knowledge domains have been omitted from Wikipedia due to restrictive content policies currently in place:
“While Wikipedia’s sourcing policies prevent information lacking reliable secondary sources from being considered, these policies have also prevented vast and important domains of knowledge from entering the project. This is particularly critical for cultures that rely on means of knowledge transmission and representation such as oral sources.”
To understand knowledge domain gaps, Alex Stinson, Lead Strategist at the Wikimedia Foundation, classifies them into three measurable areas (each seemingly more difficult to tackle than the last):
Coverage - Content created for or about any given knowledge entity
Language - Content translated into all the world’s languages
Geography - Content sourced from and/or about a specific region in the world
(Appendix 2: Interviews)
For a comprehensive look at knowledge gaps, the Wikimedia Research Team has developed a knowledge gaps taxonomy to help evaluate and measure inequalities in knowledge creation and access. The ultimate goal of this work is to create a visualization tool or dashboard that will allow Wikimedians to see where the knowledge gaps exist in real time.
As GLAM professionals, we know there are wide disparities across these areas, and much of our work has been focused on solving this immense problem. But efforts to illuminate all human knowledge will need to scale rapidly to have any real impact on the underserved and underrepresented groups that are now most threatened by the ravages of climate change. Many of these groups, which possess vast stores of biodiversity knowledge, transmit knowledge through oral traditions, not written ones. Indigenous communities and cultures and their knowledge about our natural world remain largely unaccounted for. Despite positive developments on the horizon for recognizing the unique contributions of Indigenous Knowledge to public policy, especially with regard to our relationship with nature, actionable steps need to be identified and taken [7]. Will Wikipedia community sysops review their policies to allow other knowledge creators to participate and be represented in the sum of all human knowledge? One hopes.
“Because the stories we tell—whether about climate change, United States history, cultural revitalization, or numerous other subjects—matter. Stories shape how we see one another, how we understand our pasts and presents, and how we collectively shape our futures.“ [8]