Friday, 16 August 2013

Heritage Metadata at IPTC

Heritage is a fast-growing sector, with millions of images yet to be digitised, catalogued and discovered by the public. In June, the IPTC Photometadata Conference 'Image Rights: manage them - or lose them' in Barcelona held an afternoon session, 'Metadata of Cultural Heritage images', where we discussed the role of metadata in the heritage workflow and the need for data embedded in the image file.

I wrote about the need for attribution of heritage digital images in my previous blog 'Museums - How important is Attribution?'. Greg Reser, from the University of California San Diego, underlined the point again in his presentation at the conference. Greg is a metadata expert and, among other things, is Chair of the VRA (Visual Resources Association) Embedded Metadata Group. His presentation, downloadable here, showed what happens to a well-documented web image on a heritage site when the image is downloaded by a user. All too often, the image on the user's desktop is devoid of any documentation. Why? Because there is no metadata embedded in the image.

Greg set out to show us how embedded data can help ordinary users do their jobs in the education and heritage sectors. He was keen to emphasise that use of embedded metadata is not restricted to professionals working with Photoshop and Bridge. He showed that key information can be displayed by the file browser on both PC and Mac. He also showed how useful embedded metadata can be to people accessing images, demonstrating a utility developed by the VRA that brings embedded metadata into the notes section of PowerPoint, so teachers can access the information as they show the slides to students. The data can be uploaded into a class web gallery as well, for reference by students. The opportunities are endless. Although not all metadata fields are currently displayed in file browsers, selected information, including Dublin Core Description, can be displayed.
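To make the idea of embedded data concrete, here is a minimal Python sketch of how a script might pull the Dublin Core Description out of an image file. It assumes the file carries a single, uncompressed XMP packet (XMP is stored as plain XML inside the file), and the function names and the sample bytes are made up for illustration. This is not the VRA utility or any particular vendor's method.

```python
import re
import xml.etree.ElementTree as ET

# Standard namespace URIs for RDF and Dublin Core, as used in XMP.
NS = {
    "rdf": "http://www.w3.org/1999/02/22-rdf-syntax-ns#",
    "dc": "http://purl.org/dc/elements/1.1/",
}


def extract_xmp_packet(data: bytes):
    """Return the first XMP packet found in the raw file bytes, or None."""
    match = re.search(rb"<x:xmpmeta.*?</x:xmpmeta>", data, re.DOTALL)
    return match.group(0) if match else None


def dc_description(data: bytes):
    """Return the first dc:description value, or None if there is none."""
    packet = extract_xmp_packet(data)
    if packet is None:
        return None
    root = ET.fromstring(packet)
    # dc:description is stored as a language alternative:
    # dc:description / rdf:Alt / rdf:li
    item = root.find(".//dc:description/rdf:Alt/rdf:li", NS)
    return item.text if item is not None else None


# Invented stand-in for an image file: some bytes with an XMP packet inside.
SAMPLE = (
    b"\xff\xd8 image bytes "
    b'<x:xmpmeta xmlns:x="adobe:ns:meta/">'
    b'<rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#">'
    b'<rdf:Description rdf:about="" '
    b'xmlns:dc="http://purl.org/dc/elements/1.1/">'
    b"<dc:description><rdf:Alt>"
    b'<rdf:li xml:lang="x-default">Example description</rdf:li>'
    b"</rdf:Alt></dc:description>"
    b"</rdf:Description></rdf:RDF></x:xmpmeta>"
    b" more image bytes \xff\xd9"
)

print(dc_description(SAMPLE))
```

With a real file you would pass in the result of open(path, "rb").read(). An image that has been stripped of its metadata simply returns None, which is exactly the 'devoid of documentation' case described above.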

The next part of the conference detailed our effort to bring more heritage fields to the IPTC schema. The IPTC schema originated in the news industry and became the standard for embedded metadata in the broader image production, distribution and licensing industries. Until IPTC Extension was created in 2007, there were no specific fields for heritage objects or artworks. The need for a separate set of fields was demonstrated by this set of metadata found online some years back.

This is an understandable mismapping if the creator of the photograph and the creator of the artwork are listed in the same field. The Artwork and Object fields in IPTC Extension were created to avoid this kind of semantic muddle, and went some way towards making it possible to transfer data about heritage objects embedded in the image file. Since then, in the course of working with heritage metadata, we have identified gaps in the IPTC schema for heritage fields, and we are now running a project to add new fields at the next update of IPTC.
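The separation can be pictured as two distinct records, one for the photograph and one for the artwork it shows. The Python sketch below is only an illustration of that idea: the class and field names are invented for this post and are not the actual IPTC property names.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ArtworkOrObject:
    """Describes the artwork shown in the image, not the photograph."""
    title: Optional[str] = None
    creators: List[str] = field(default_factory=list)
    source: Optional[str] = None  # e.g. the holding institution
    copyright_notice: Optional[str] = None


@dataclass
class ImageRecord:
    """Describes the photograph (the digital image file) itself."""
    photo_creator: Optional[str] = None
    photo_copyright: Optional[str] = None
    artworks: List[ArtworkOrObject] = field(default_factory=list)


def credit_line(record: ImageRecord) -> str:
    """Build a credit that keeps artwork and photo creators apart."""
    parts = []
    for art in record.artworks:
        if art.creators:
            parts.append("Artwork: " + ", ".join(art.creators))
    if record.photo_creator:
        parts.append("Photo: " + record.photo_creator)
    return "; ".join(parts)


record = ImageRecord(
    photo_creator="Example Photographer",
    artworks=[ArtworkOrObject(title="Example Portrait",
                              creators=["Example Painter"])],
)
print(credit_line(record))
```

Because the two creators live in different places, a mapping tool has no reason to confuse the painter with the photographer.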

There are some immediate and obvious gaps. The date field is one. The IPTC Extension Date Created field for artworks is a date-formatted field. This doesn't encompass the uncertainty in heritage dates: circa dates and spans of time. So a text-formatted date field seems an obvious addition. Material (or Technique or Medium) is another, as is Style or Period. Then there is the need to link to other sources of information about an artwork on the web, so URL fields for the artwork are also on the list.
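The date problem is easy to show in a few lines of Python. The sketch below is an assumption-laden illustration, not a proposal for how IPTC should parse dates: parse_strict_date stands in for a date-formatted field, and parse_heritage_date is a hypothetical helper that reads a year range and a 'circa' flag out of free text.

```python
import re
from datetime import date


def parse_strict_date(value: str):
    """Mimic a date-formatted field: accept only a full ISO date."""
    try:
        return date.fromisoformat(value)
    except ValueError:
        return None


def parse_heritage_date(text: str):
    """Hypothetical helper: return (earliest_year, latest_year, is_circa)
    from a free-text date, or None if no year can be found."""
    is_circa = bool(re.match(r"\s*(circa|ca?\.)\s*", text, re.IGNORECASE))
    years = [int(y) for y in re.findall(r"\b\d{3,4}\b", text)]
    if not years:
        return None
    return (min(years), max(years), is_circa)


# A precise date fits a date-formatted field...
print(parse_strict_date("1650-03-01"))
# ...but a typical heritage date does not.
print(parse_strict_date("circa 1650"))

# A text field keeps the original wording, and software can still
# recover a searchable year range from it where possible.
print(parse_heritage_date("circa 1650"))
print(parse_heritage_date("1500-1520"))
```

The point of the text field is that nothing is lost: the wording the cataloguer chose travels with the image, and any structured interpretation is layered on top.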

We have produced a set of possible candidates for the IPTC schema which we are putting out to consultation in the next few weeks. But how far should we go in adding detailed heritage fields to the IPTC?

Generally speaking, the direction of travel should be towards granular, unambiguous data. The objection raised when I first started looking at IPTC, namely that no-one would use so many fields, is now overridden by the increasing automation of data transfer. It is down to the technical people to create selective user interfaces for appropriate sectors, but semantic precision is all-important for interoperability.

Nevertheless, in looking at the richness of heritage data, we realised that the project could become unwieldy for IPTC and that some criteria have to be set.

We looked at how embedded data is used in the heritage sector: for transfer of data within and between institutions (Institutional Use), and for display to the public on the desktop and on the web (Public Display). Perhaps the IPTC schema, which is viewable by anyone using the latest Photoshop and Bridge software, is suited to display of data once an image has left the institutional environment. The VRA schema (and other heritage schemas) can then be used for the detailed data used by heritage institutions, viewable by means of a custom XMP panel, with some fields common to both schemas. See the Metadata Deluxe site for the beta version of the current VRA custom XMP panel.

Here is the list of initial criteria we have set out for adding fields to the IPTC schema:

YES for IPTC if
  • about the digital image
  • needed to identify source, creator and copyright of artwork
  • links to more granular data on the web
  • descriptive data useful for end user
  • useful for image retrieval
  • can be mapped from other schemas

NO for IPTC if
  • about movement, condition, exhibition of work, except where needed for accreditation
  • specific to institution holding the work and not needed by outside users
  • detailed provenance (former owners etc)
  • about monetary value or insurance of work

We are looking at creating an overall schema for embedded metadata for heritage, to cover both institutional and public access uses. For now we have called it SCREM for Heritage Media Files (SChema for Rich Embedded Metadata for Heritage Media Files).

If you work with heritage images, see the presentation I made to the IPTC conference and give us your feedback on the candidate fields. A link showing the candidate fields will be up shortly for your responses.