The SciencePAD treasure hunt of Persistent Identifiers

Have you ever followed a reference for an article that eventually turned out to be unavailable? Have you always thought that experiment data or research results should be permanently accessible? If so, you will probably be interested in the latest news on the current status of software and digital information issues, and you shouldn’t miss the SciencePAD Persistent Identifiers Workshop (SPID2013).

The workshop is being organised by the European Middleware Initiative (EMI) together with its SciencePAD (formerly known as ScienceSoft) project and will take place on 30 January 2013 at CERN. SciencePAD is a community enabler that aims to provide the necessary one-stop-shop tools, including software catalogues, statistics, reference citation systems, marketplaces, links to technical services and platform integration support. It allows researchers and open source software developers from scientific communities to interact with each other within and beyond their area of expertise, and to contribute to the establishment of global knowledge networks in science.

“The aim of the SPID2013 Workshop is to bring together experts in the field of digital information and identification of digital objects such as software, data, publications, etc. to look at the current status of specification, implementation, policies, schemas and trends. In particular, discussions will focus on the current status of persistent identifiers for software objects and their relationship with other digital objects,” says Alberto Di Meglio, EMI project director and leader of the SciencePAD activities.

The need for persistent identifiers, that is, long-lived identifiers that let users refer to a digital object such as an e-print article, image, dataset or installation file for a piece of software, emerged in the early days of the World Wide Web. Originally, a URL was understood simply as the network location, on a particular server, from which a digital resource such as a web page could be retrieved, and it could be passed on to others so that they could access the resource too. As long as the way the data was accessed didn’t change, there was no problem. However, this approach proved unreliable because, in the long term, URLs often stop working.

When research results are published online by an institution, users expect them to be well managed (for example, protected against data loss or the expiry of web domains). However, access can still be lost even if the results themselves are well maintained, because a ‘traditional’ URL cannot be relied on to provide permanent access to a resource. Online data management therefore relies increasingly on persistent identifiers, which continue to provide information about the object they identify, no matter where it is stored or where it is in its life cycle. In fact, a persistent identifier can be used to obtain information about a resource even if the resource is no longer online. In other words, a persistent identifier continues to give meaningful information about a single object, whatever happens to that object, and users should be able to find out about the resource using only the identifier.
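For readers who like to see the idea in code, the indirection described above can be sketched in a few lines of Python. This is only an illustrative toy, not how any real identifier system is implemented: the `Resolver` class, its method names, the `example:1234` identifier and the example.org URLs are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Record:
    """Metadata kept for one identifier (field names are hypothetical)."""
    description: str
    location: Optional[str]  # current URL, or None once the resource is offline


@dataclass
class Resolver:
    """Toy in-memory registry mapping persistent identifiers to records."""
    records: dict = field(default_factory=dict)

    def register(self, pid: str, description: str, location: str) -> None:
        self.records[pid] = Record(description, location)

    def move(self, pid: str, new_location: str) -> None:
        # Only the record changes; the identifier that users cite stays the same.
        self.records[pid].location = new_location

    def withdraw(self, pid: str) -> None:
        # The resource goes offline, but its description remains available.
        self.records[pid].location = None

    def resolve(self, pid: str) -> Record:
        return self.records[pid]


# Hypothetical identifier and URLs, for illustration only.
resolver = Resolver()
pid = "example:1234"
resolver.register(pid, "Installation file for a software release",
                  "http://old-server.example.org/release.tar.gz")

resolver.move(pid, "http://new-server.example.org/release.tar.gz")
location_after_move = resolver.resolve(pid).location

resolver.withdraw(pid)
location_after_withdrawal = resolver.resolve(pid).location
description_after_withdrawal = resolver.resolve(pid).description
```

The point of the sketch is that the citation never changes: whoever holds `example:1234` reaches the new location after the move and can still read the description after the resource has been withdrawn.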

Attending the SciencePAD Persistent Identifiers Workshop will give you the opportunity to contribute to the ongoing efforts to integrate software information into digital knowledge networks and define what is required to make it happen.

For further information:

Budapest: 5th EMI All Hands Meeting - GridCast, 31 October 2013
The EGI Technical Forum 2012 is taking place at the Clarion Congress Hotel in Prague, 18 September 2012
EMI gave birth to ScienceSoft, May 2012
EGI Community Forum, 28 March 2012
Interview with Alberto Di Meglio, EMI Project Director, 28 March 2012

by Beatrice Bressan, for the European Middleware Initiative (EMI)

CERN Bulletin