Like all Wikimedia projects, Wikidata is built on MediaWiki, open-source wiki software with over 1,500 extensions [1]. Wikibase is one of them: a suite of extensions, written primarily in PHP and developed for the Wikidata project, that provides collaborative editing and storage for structured linked data. Wikibase supports custom ontologies, SPARQL queries, federated queries (searching across many Wikibase instances), and data exports in multiple formats, including XML, RDF, and JSON [2].
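As a rough illustration of the export formats mentioned above, the sketch below builds the URLs from which a Wikibase instance typically serves a single entity in different serializations. It assumes the `Special:EntityData` page that Wikidata exposes (for example, `https://www.wikidata.org/wiki/Special:EntityData/Q42.json`). The helper name `entity_data_url` and the format table are illustrative only, and other installations may use a different article path.

```python
# Hypothetical helper: builds entity export URLs for a Wikibase instance.
# ASSUMPTION: the instance serves entities at
#   <base>/wiki/Special:EntityData/<id>.<ext>
# as Wikidata does; other installations may use a different article path.

FORMAT_EXTENSIONS = {
    "json": "json",       # Wikibase JSON
    "rdf-xml": "rdf",     # RDF serialized as XML
    "turtle": "ttl",      # RDF serialized as Turtle
    "n-triples": "nt",    # RDF serialized as N-Triples
}


def entity_data_url(base_url: str, entity_id: str, fmt: str = "json") -> str:
    """Return the export URL for one entity in the requested format."""
    if fmt not in FORMAT_EXTENSIONS:
        raise ValueError(f"unsupported format: {fmt!r}")
    ext = FORMAT_EXTENSIONS[fmt]
    return f"{base_url.rstrip('/')}/wiki/Special:EntityData/{entity_id}.{ext}"


if __name__ == "__main__":
    # Print one URL per supported format for the Wikidata item Q42.
    for fmt in FORMAT_EXTENSIONS:
        print(entity_data_url("https://www.wikidata.org", "Q42", fmt))
```

Fetching any of these URLs with an ordinary HTTP client should return the entity in the corresponding serialization, provided the instance follows the same URL layout.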
Alongside the Wikidata community, the Wikibase development community continues to grow and enhance the open-source software’s offerings. The Wikimedia Deutschland Team hopes to make Wikibase installations more accessible to non-technical audiences by building new community tools and offering cloud-hosted solutions. The recent publication “Strategy for the Wikibase Ecosystem” sets out a bold vision for the platform: it places Wikibase at the heart of the rapidly unfolding decentralized semantic web, making data exchange seamless across a universe of Wikibase nodes. The Wikibase Strategy opens with this quote:
“The internet explodes when somebody has the creativity to look at a piece of data that’s put there for one reason and realise they can connect it with something else” [3].
–Sir Tim Berners-Lee
Currently, there are three install methods and a fourth, invite-only hosted solution: Wikibase.cloud [4].
Some GLAMs are opting for a separate Wikibase instance to take advantage of:
control over data quality,
granular user permissions,
data privacy for in-copyright data or personally identifiable information (PII),
custom data modeling and the use of domain-specific ontologies, and
potential for SPARQL query federation with Wikidata and other Wikibase instances.
Naturally, the promise of search federation across a global network of Wikibase instances is exciting. Note, however, that federation options, like many Wikibase features, are still in the very early stages of development. Unless you are using the Wikidata ontology or appear on the list of federated knowledge base endpoints, your options for data interoperability are limited (Wikimedia Foundation, 2022). Through query federation, the Wikimedia Deutschland Team envisions a global linked open data ecosystem built on Wikibase.
“…we imagine that one day all the Wikibase instances will be connected between themselves and back to Wikidata.” [5]
– Wikimedia Deutschland | Tech News
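The query federation described above relies on the SPARQL 1.1 `SERVICE` keyword, which lets one endpoint delegate part of a query to another. The following is a minimal sketch, not a tested recipe: it only composes the query text, and it assumes a hypothetical local Wikibase that stores each item's Wikidata identifier as a string under a mapping property (the made-up `P1`) and whose query service is permitted to contact the Wikidata endpoint. The function name and the example prefix URL are likewise illustrative.

```python
# Sketch: compose a federated SPARQL query meant to run on a local Wikibase's
# query service and pull English labels from Wikidata via SERVICE.
# ASSUMPTIONS (hypothetical, not from any specific instance):
#   - the local instance stores each item's Wikidata QID as a string
#     under a mapping property (default "P1" is made up);
#   - the local query service is allowed to call the Wikidata endpoint.

WIKIDATA_ENDPOINT = "https://query.wikidata.org/sparql"


def build_federated_query(local_prop_prefix: str,
                          mapping_property: str = "P1",
                          limit: int = 10) -> str:
    """Return SPARQL text joining local items to their Wikidata labels."""
    return f"""
PREFIX lwdt: <{local_prop_prefix}>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>

SELECT ?localItem ?wikidataItem ?label WHERE {{
  ?localItem lwdt:{mapping_property} ?qid .
  BIND(IRI(CONCAT("http://www.wikidata.org/entity/", ?qid)) AS ?wikidataItem)
  SERVICE <{WIKIDATA_ENDPOINT}> {{
    ?wikidataItem rdfs:label ?label .
    FILTER(LANG(?label) = "en")
  }}
}}
LIMIT {limit}
""".strip()


if __name__ == "__main__":
    # The prefix below is a placeholder for a local instance's property namespace.
    print(build_federated_query("https://example.wikibase.cloud/prop/direct/"))
```

Whether such a query actually succeeds depends on the limitation noted above: the remote endpoint has to be one that the local query service is configured to federate with.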
A current focus for Wikimedia developers is making Wikibase deployments more accessible to organizations that lack resources and technical capacity. Additionally, the whole community is dealing with scaling issues for the Wikidata Query Service, which relies on Blazegraph, a triple-store database that has reached end-of-life. Not only is Blazegraph at capacity and suffering from performance issues, but its codebase also went dormant in 2018 (after its acquisition by Amazon). The Wikimedia Search Team is seeking a suitable replacement. The top candidates are Jena and Virtuoso, but data migration has yet to be completed [6]. Migrating off Blazegraph is crucial for the project’s overall sustainability.
Despite these architectural growing pains, many intrepid GLAMs are experimenting with Wikibase.cloud and/or the Wikibase Docker install option.
The promise of Wikibase has caught the attention of many national libraries, including the German National Library, which was among the first to use Wikibase to host an integrated authority file. National libraries in other European countries, including France, Spain, Italy, Wales, Sweden, Luxembourg, and the Netherlands, have followed suit with their own pilots [7].
Nevertheless, the development of Wikibase is still in its nascent stages, and for many libraries that lack a dedicated staff of developers, UX designers, information architects, and technical project managers, deployment and adoption are a very steep climb. Moreover, data interoperability across multiple Wikibases still needs further development and should include an accessible front-end interface for lay users. Accommodating other ontologies should also be considered, in particular BIBFRAME and the CIDOC Conceptual Reference Model (CRM), two semantic data models from the cultural heritage sector.
The good news is that BHL joined the Wikibase Stakeholder Group (WBSG) in 2022, becoming part of a group of like-minded organizations banding together to scope and test new Wikibase features that support community-driven use cases. WBSG organizations aim to pool resources to serve their collective users’ needs. Their work includes many investigations around Wikibase federated search: how it will work, and what functionality will be needed to ensure that Wikibase instances that model their data in diverse ways can still “talk” to each other.
In short, Wikibase is certainly worth watching. The community’s vision of unifying global knowledge through a distributed network of connected Wikibase instances is too compelling to ignore.
“If you expose your data for other people to consume then other people can improve that data for you. New knowledge can then be synthesized out of existing knowledge. [...] Putting your data in a format that other people recognize and expect (like semantic-linked open data) makes it actionable and usable.”
– James Hare, Internet Archive Wikibase Developer (Appendix 2: Interviews).