Monday 7 December 2009

Is language a moving target?


I’ve been moved to respond to the blog post on Keywording Central on controlled vocabularies, synonyms and keywording standards.

I think it’s important that people have a clear idea of what’s going on behind the scenes when images are keyworded. Until a year or so ago I was in the dark myself. Then I met and started to work with Liisa Kaakinnen, who is a professional keywording and controlled vocabulary expert. Since then I have been deeply into words, and have found out quite a lot about how things work.

It is true to say that there is no standard controlled vocabulary, but there are standards in the way controlled vocabularies are created, which emerged from the library and text retrieval sector.

Libraries like Getty and Corbis drew on existing methods of creating soundly constructed controlled vocabularies, and adapted the rules to fit the image world. The difference for images is that there are descriptive terms which need to be interpreted from the material rather than simply extracted from words already there. The vocabularies have to reflect this in some measure.

It is no surprise that the vocabularies of Getty and Corbis and other large agencies are very similar. The knowledge base is the same, the material is similar, the user base is the same, and the staff in all likelihood migrate from one company to another with their accumulated experience in the field. The fact that this mass of experience has resulted in very similar controlled vocabularies should be reassuring. The proof of the pudding is in the eating.

The controlled vocabularies provide a way of making keywording consistent and exhaustive. It is important to distinguish between the process of keywording, where the controlled vocabulary is an essential tool, and the process of searching, where the user is mostly unaware of the vocabulary sitting in the background. The way terms are used and presented to the searcher depends on the search software, but a sound system still rests on a controlled vocabulary and a list of synonyms.

The keyworder will generally use the 'preferred term' to tag the image. The addition of synonyms will ensure the user can find the image however they express the search term. Synonyms are not 'hocus pocus' at all; they enable the searcher to use the word that is intuitive to them.
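To make the preferred term and synonym relationship concrete, here is a minimal sketch in Python. The vocabulary, the image names and the function names are all invented for illustration; no agency's actual system is being described. Images are tagged only with preferred terms, and the searcher's word is mapped to a preferred term before matching.

```python
# Hypothetical controlled vocabulary: preferred term -> list of synonyms.
VOCABULARY = {
    "automobile": ["car", "motor car", "motorcar"],
    "infant": ["baby", "newborn"],
}

# Hypothetical image library: each image is tagged with preferred terms only.
IMAGES = {
    "img_001.jpg": {"automobile"},
    "img_002.jpg": {"infant"},
    "img_003.jpg": {"automobile", "infant"},
}


def build_synonym_index(vocabulary):
    """Map every preferred term and synonym (lowercased) to its preferred term."""
    index = {}
    for preferred, synonyms in vocabulary.items():
        index[preferred.lower()] = preferred
        for synonym in synonyms:
            index[synonym.lower()] = preferred
    return index


def search(query, images, index):
    """Return the images tagged with the preferred term behind the query word."""
    preferred = index.get(query.strip().lower())
    if preferred is None:
        return []  # the word is not in the vocabulary at all
    return [name for name, tags in images.items() if preferred in tags]


INDEX = build_synonym_index(VOCABULARY)
print(search("Car", IMAGES, INDEX))
print(search("baby", IMAGES, INDEX))
```

The searcher never sees the word 'automobile', yet typing 'car' finds every image tagged with it. That is all a synonym list does.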

The fact that language grows and changes should not confuse people. Keyworders have been living with this phenomenon for many years, and are aware of the need to update terms. Vocabularies grow and change, but if they are set up using the rules they will remain stable when extra terms or groups of terms are added.

It isn't clear what the blog is trying to say. Of course vocabularies have to be created to suit the individual needs of agencies. But it is also true that they need to take into account the needs of their distributors, and the fact that these vary.

Reformatting keywords (especially for Alamy) can be very time consuming for picture agencies. If Kevin Townsend's blog is saying it is important to use the services of professionals, we would agree. It can save time and money. If he is saying that talking of controlled vocabularies and synonyms is magic wand waving, we couldn't disagree more.

Here are two solutions we know of:

One: We are part of the IPTC effort to investigate the creation of a limited 'standard' vocabulary which could make keywording more standardised, and enable keyword translation.

Two: Keyword output needs automating. To present keywords differently for different outlets, you need automated systems to intelligently shift the data. Time to talk to the software people, or find out about our automated solutions to these data problems.
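As a rough sketch of what automated keyword output might look like, the Python below formats one keyword list according to per-outlet rules. The outlet names, separators and limits are placeholders made up for this example; they are not the real requirements of Alamy or any other distributor.

```python
# Hypothetical formatting rules per outlet (placeholders, not real requirements).
OUTLET_FORMATS = {
    "outlet_a": {"separator": ", ", "lowercase": True, "max_keywords": 50},
    "outlet_b": {"separator": ";", "lowercase": False, "max_keywords": None},
}


def format_keywords(keywords, outlet):
    """Return the keyword list as one string formatted for the given outlet."""
    rules = OUTLET_FORMATS[outlet]
    seen = set()
    cleaned = []
    for keyword in keywords:
        keyword = keyword.strip()
        if rules["lowercase"]:
            keyword = keyword.lower()
        # Drop empty entries and case-insensitive duplicates, keeping first use.
        if keyword and keyword.lower() not in seen:
            seen.add(keyword.lower())
            cleaned.append(keyword)
    if rules["max_keywords"] is not None:
        cleaned = cleaned[: rules["max_keywords"]]
    return rules["separator"].join(cleaned)


keywords = ["Automobile", " Road ", "automobile"]
print(format_keywords(keywords, "outlet_a"))
print(format_keywords(keywords, "outlet_b"))
```

The point of keeping the rules in one table is that the keywords are stored once, and adding a new outlet means adding a row rather than re-keying the data.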

More on keywording on my website