
Image Search with Structured Data Commons (SDC)

Published on Aug 20, 2023
BHL’s user community has had a long-standing desire for native image search. Millions of scientific illustrations, drawings, paintings, diagrams, maps, charts, tables, graphs, posters, photographs, and other biodiversity visual media exist within the BHL corpus. But since project inception, the metadata needed to drive image search has been missing.

In response, BHL has made great strides in collecting metadata for its images through various initiatives, including BHL Flickr, a grant from the National Endowment for the Humanities called “Art of Life,” and the eventual bulk upload to Flickr and Wikimedia Commons. Over the years, these efforts have gone a long way to unlock and expose BHL’s media to a larger digital audience.

With the recent departure of Communications Manager Grace Costantino, who spearheaded BHL’s media initiatives, the time has come to take stock of past efforts (Appendix 2: Interviews). BHL should explore new ways to bring images and metadata into a central place and begin normalizing, enhancing, and converting the media asset metadata into linked open data for greater access and dissemination on the web.

A timeline of BHL’s media initiatives, which aim to better describe, find, and promote reuse of images in the corpus. Image: Dearborn, 2022

Structured Data Commons (SDC), recently launched on Wikimedia Commons, is a natural place for image metadata enhancement to continue. SDC is a Wikibase instance installed on Wikimedia Commons to capture linked open data for the more than 60 million media files in the repository. This work is strategically important to the Wikimedia community because it provides the visual source material that drives content creation on all of Wikipedia's sister projects.

When browsing Commons, the main new feature to look for is an added tab where linked open data lives alongside existing free text descriptions (wikitext markup). This inconspicuous upgrade “makes the files on Commons much easier to view, search, edit, curate or organize, use and reuse, in many languages.”

Structured Data on Commons presents BHL with a new opportunity to collate its image collection into a central location and normalize over a decade of metadata accretion generated by users and machines.

The structured data panel for an image in Wikimedia Commons; Swine skeleton, after the technique of bone maceration, on display at the University of São Paulo Museum of Veterinary Anatomy. (Image: Wagner Souza e Silva)

SDC benefits for BHL users are manifold:

  • Greater searchability: SPARQL-driven queries for BHL images

  • Exposure to Wikimedia ecosystem: BHL images as source material for all sister projects, including Wikipedia’s 300+ language editions

  • Connected media: BHL’s images interconnected with other media databases on the web

  • User-generated metadata for BHL images: CC0, harvestable metadata to be reused by BHL Technical Team and other app developers

  • 5-star linked open data: conversion, normalization, and crowdsourcing of existing image metadata

  • Multi-lingual and accessible data: BHL images reaching new and under-served audiences through multilingual and accessibility features offered by Wikibase

Of the above benefits, perhaps the main advantage for BHL is that the metadata generated by volunteers could be harvested back into the BHL data ecosystem and used to drive native image searching, a long-standing request from BHL’s users.
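As a minimal sketch of how the searchability and harvesting benefits might look in practice, the Python below builds a SPARQL query for files with a given "depicts" value and pulls "depicts" item IDs out of a single media record. It is illustrative only and not BHL's implementation: the function names, the endpoint constant, and the placeholder item ID Q123 are assumptions, and the record shape is assumed to follow the JSON that the Commons wbgetentities API returns for media files. P180 is the Wikidata property "depicts."

```python
# Assumed endpoint of the Wikimedia Commons Query Service.
# It is listed for reference only; this sketch sends no requests.
WCQS_ENDPOINT = "https://commons-query.wikimedia.org/sparql"


def build_depicts_query(subject_qid, limit=50):
    """Return a SPARQL query for files whose 'depicts' (P180) value is subject_qid."""
    if not (subject_qid.startswith("Q") and subject_qid[1:].isdigit()):
        raise ValueError(f"not a Wikidata item ID: {subject_qid!r}")
    return (
        "PREFIX wd: <http://www.wikidata.org/entity/>\n"
        "PREFIX wdt: <http://www.wikidata.org/prop/direct/>\n"
        "SELECT ?file WHERE {\n"
        f"  ?file wdt:P180 wd:{subject_qid} .\n"
        "}\n"
        f"LIMIT {int(limit)}"
    )


def extract_depicts_ids(entity):
    """Collect item IDs from the P180 statements of one media record (a dict)."""
    # Assumption: a record with no statements may carry an empty list
    # instead of a dict, so anything that is not a dict is treated as empty.
    statements = entity.get("statements") or {}
    if not isinstance(statements, dict):
        return []
    ids = []
    for statement in statements.get("P180", []):
        value = statement.get("mainsnak", {}).get("datavalue", {}).get("value")
        if isinstance(value, dict) and "id" in value:
            ids.append(value["id"])
    return ids
```

Calling build_depicts_query("Q123", limit=5) yields a query string that could be pasted into a SPARQL client, and extract_depicts_ids applied to each harvested record gives the item IDs that a native image search index could be keyed on. Keeping query construction and record parsing as separate pure functions makes both easy to test without network access.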

BHL's contributions to the open access image space have left global audiences awestruck. Over 300,000 scientific images are in the public domain due to the hard work of BHL Staff. Image: Costantino, 2021

Continuing BHL’s image efforts on Wikimedia Commons (SDC) seems to be the logical next step toward BHL image search enhanced with 5-star linked open data.
