October 13, 2021
Rod is a Professor of Taxonomy at the University of Glasgow, Scotland, and an active member of the BHL’s Persistent Identifier Working Group (PIWG). His main interests are in phylogenetics, evolutionary biology, and bioinformatics. He blogs regularly at iPhylo and is the creator of BioStor, which he initially built to find and create metadata for BHL’s missing articles. BioStor has been a source of BHL article metadata since 2012 and has defined over 239,000 articles in BHL. Beyond BioStor, Rod has built many tools to connect BHL data to the broader web and is a prolific contributor of data to the emerging biodiversity knowledge graph. (Wikidata Profile: Q7356570)
Perhaps a BHL Lab to incubate cool projects, awarding a fellowship or annual stipend. Another idea is an annual challenge (see: GBIF Ebbe Nielsen Challenge)
Keep the challenges open-ended and less specific; these were more interesting (general scope: e.g. linking things to BHL)
Give people a six-month lead time to come up with proposals for their projects
Host a kick-off event
The work stems from the idea that biodiversity “things” should all be linked together. See this image for a visual representation of the main entities that should be linked.
The main antecedents that inspired Rod’s work:
TDWG, Index Fungorum, IPNI: there were many data sources, and I wanted to see and serve all of their information in Resource Description Framework (RDF)
There was a Thomson Reuters project which was a modern descendant of Zoological Record
I was aiming for DOIs for taxonomic names (but DOIs cost money), then tried to work with IBM to come up with Life Science Identifiers (e.g. Urn:lsid:zoobank:names:1112), but the identifiers weren’t resolvable.
Long story short: the biodiversity knowledge graph didn’t happen. We had all the data and the ID system but no linking.
The dream has yet to be realized but Wikidata has revived the dream a bit for me
But much more interlinking needs to happen and also community standards will need to coalesce around resource description
Overall: “Wikidata is amazing” – it has been a game changer.
High-level but here is the gist:
Rod has a huge database of article data (BioStor, JSTOR, and other sources) and hundreds of ways to get at this data. (see his GitHub page)
Rod uses Wikidata QuickStatements to add citation data to Wikidata. He recently built a tool called BHL2Wiki that takes article DOIs, fetches their metadata from the CrossRef API, and generates the statements. (see BHL2Wiki)
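To make the DOI-to-QuickStatements step more concrete, here is a minimal sketch of the idea. It is not BHL2Wiki's actual code. It assumes a metadata record shaped like a CrossRef work (`DOI`, `title`, `issued`), an English title without double quotes, and the Wikidata convention of storing DOIs in upper case. The function name `crossref_to_quickstatements` and the sample record are hypothetical, and the network call to CrossRef is left out so the example stays self-contained.

```python
def crossref_to_quickstatements(work: dict) -> str:
    """Turn a CrossRef-style work record into QuickStatements V1 commands.

    Assumption: `work` has the keys "DOI" and "title" (a list), and
    optionally "issued" with "date-parts", as CrossRef records do.
    """
    title = work["title"][0]
    doi = work["DOI"].upper()  # assumption: DOIs are stored upper-cased

    lines = [
        "CREATE",                       # start a new Wikidata item
        "LAST\tP31\tQ13442814",         # instance of: scholarly article
        f'LAST\tLen\t"{title}"',        # English label
        f'LAST\tP1476\ten:"{title}"',   # title (monolingual text)
        f'LAST\tP356\t"{doi}"',         # DOI
    ]

    date_parts = work.get("issued", {}).get("date-parts", [[]])[0]
    if date_parts:
        year = date_parts[0]
        # Publication date at year precision (the trailing /9).
        lines.append(f"LAST\tP577\t+{year:04d}-01-01T00:00:00Z/9")

    return "\n".join(lines)


# Hypothetical record, standing in for a CrossRef API response.
sample = {
    "DOI": "10.1234/example.5678",
    "title": ["An example article title"],
    "issued": {"date-parts": [[2012, 3]]},
}

print(crossref_to_quickstatements(sample))
```

The printed block can be pasted into QuickStatements as a batch. A real workflow would also check whether the article already exists in Wikidata before creating a new item.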
Another of Rod’s dreams: every taxonomic article in Wikidata
Here’s one that Rod built: ALEC (A List of Everything Cool) is a tool to explore biodiversity content in Wikidata
It’s really cool, but Rod has more conventional desires, e.g. what are all the articles in a publication?
Rod has a complementary relationship with BiCIKL collaborators (Pensoft and Plazi) – digitizing European natural history projects. They are able to secure EU funding to do this work.
Automatically cluster the names. Originally BioStor was trying to cluster names. (Rod has a paper about how to do this that David Shorthouse employs for Bionomia.)
Wikidata BHL Property | Defines |
BHL Page ID (P687) | segment starts |
BHL bibliography ID (P4327) | titles |
BHL part ID (P6535) | segments |
BHL creator ID (P4081) | author names |
BHL name ID (P8724) | taxa |
Yes, I have seen both properties used.
Firstly, the query you are trying to perform is BHL-specific, not necessarily a science question. We also need to learn to let go of control and let the community come up with its own standards for how to enter this data. Articles are being cataloged quite differently all over Wikidata, and data modeling is an ongoing conversation and in flux (some using the part ID, some using the page ID).
Also see: Wikidata wants to adopt the FRBR model for book cataloging (see: Wikidata Books project)
Massive restructuring could be done. The full-text search implementation is almost unusable: it doesn’t let you click on the snippet of text. See here for a better implementation.
Yes, but organization is key. Both in-person and virtual. (See workshop)
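Because articles may carry either identifier, a query for BHL articles in Wikidata has to allow for both. The sketch below builds a SPARQL query that matches items with a BHL page ID (P687) or a BHL part ID (P6535). The function name `build_bhl_article_query` and the variable names are hypothetical, and the query assumes it runs on the Wikidata Query Service, where the `wdt:` prefix is predefined.

```python
def build_bhl_article_query(limit: int = 10) -> str:
    """Build a SPARQL query for items with a BHL page ID or a BHL part ID."""
    return f"""
SELECT ?item ?bhlPage ?bhlPart WHERE {{
  {{ ?item wdt:P687 ?bhlPage . }}   # BHL page ID
  UNION
  {{ ?item wdt:P6535 ?bhlPart . }}  # BHL part ID
}}
LIMIT {limit}
""".strip()


print(build_bhl_article_query(limit=5))
```

The UNION keeps the query agnostic about which modeling choice an editor made, at the cost of leaving one of the two columns empty in each result row.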