User-based tagging is not exactly a recent phenomenon. Over the past few years it seems that practically every Web 2.0 innovation has allowed users to apply tags to items to aid retrieval. This has (somewhat predictably) caused a few in the library profession to throw their hands up in despair and claim that this ‘chaos’ will be the end of organised information retrieval as we know it. This egalitarian system of organising information was always likely to upset the more elitist members of the library fraternity. After all, what do ordinary users know about the subject matter of a particular source of information? No, it is best left to the ‘professionals’. But to what extent is this an accurate reflection of the use of tags by the general public?
There are a number of limitations to using tags to index an item. For example, tags are often applied inconsistently. Take a typical set of tags on Delicious. An article on the Library of Congress may be tagged in several ways:
Whilst all these tags may be appropriate for the individual, the lack of consistency does cause problems for other users. A user searching on any one of these terms will miss items that were tagged only with the other terms. Clearly, with the possibility of such a high number of ‘misses’, it cannot be described as a foolproof system by any means.
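To make these ‘misses’ concrete, here is a minimal Python sketch. The bookmarks and tags are invented purely for illustration (they are not the tags from the Delicious example above), and the exact-match lookup is an assumption about how a simple tag search behaves, not a description of how Delicious itself works.

```python
# Hypothetical bookmarks: each maps a made-up URL to the tags one user applied.
BOOKMARKS = {
    "example.org/loc-article-a": {"loc"},
    "example.org/loc-article-b": {"library-of-congress"},
    "example.org/loc-article-c": {"libraries", "usa"},
}


def search_by_tag(bookmarks, tag):
    """Return the URLs whose tag set contains exactly the given tag."""
    return sorted(url for url, tags in bookmarks.items() if tag in tags)


def missed_by_tag(bookmarks, tag):
    """Return the URLs an exact-match search on `tag` would not retrieve."""
    return sorted(set(bookmarks) - set(search_by_tag(bookmarks, tag)))


if __name__ == "__main__":
    print(search_by_tag(BOOKMARKS, "loc"))   # items found
    print(missed_by_tag(BOOKMARKS, "loc"))   # items on the same subject, not found
```

In this toy data all three items are about the same institution, yet a search on any single tag retrieves only one of them.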
There are also problems when it comes to homonyms. Take, for example, the word ‘bow’. This can refer to the weapon used to shoot arrows, as well as to a decoration on a present. Sticking with Delicious, should the user type in ‘bow’ as a search term, they would be presented with a list of links related to either medieval weaponry or decoration – two very different subjects with very different audiences! However, using Delicious, it should be possible to select an additional tag that will narrow the search down, although this is still likely to miss relevant materials.
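Narrowing by an additional tag can be sketched as a simple ‘all of these tags’ search. Again, the URLs and tags below are hypothetical, and the function name is my own rather than anything Delicious provides.

```python
# Hypothetical bookmarks for the homonym 'bow'; URLs and tags are invented.
BOOKMARKS = {
    "example.org/longbow-history": {"bow", "archery", "medieval"},
    "example.org/gift-wrapping": {"bow", "ribbon", "craft"},
    "example.org/crossbow-guide": {"archery", "medieval"},
}


def search_all_tags(bookmarks, *tags):
    """Return URLs tagged with every one of the given tags (an AND search)."""
    wanted = set(tags)
    return sorted(url for url, applied in bookmarks.items() if wanted <= applied)


if __name__ == "__main__":
    print(search_all_tags(BOOKMARKS, "bow"))             # both senses mixed together
    print(search_all_tags(BOOKMARKS, "bow", "archery"))  # narrowed to one sense
```

Adding ‘archery’ removes the gift-wrapping link, but the crossbow guide, which an archery enthusiast might well want, is still missed because nobody tagged it ‘bow’.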
Although there are some issues with consistency, when tagging habits are analysed the picture isn’t nearly as chaotic as one would expect. I have recently spent some time looking at such habits, and was interested to discover that there is a high level of commonality between users’ tags. Take an article by Peter Merholz entitled ‘Metadata for the masses’. At the time I analysed the tagging habits related to this item, it had been bookmarked by 56 members. Despite there being 61 different tags applied to the item, there was a high level of agreement between users regarding the ‘aboutness’ of the article. As you can see from the chart below, after the first three most popular tags there is a big drop-off in numbers.
In fact, should the third and fourth tags be combined (the latter is the plural form of the former) there is an even bigger drop-off. Furthermore, it appears that the first three tags (four if ‘folksonomies’ were incorporated) were applied by around 50% of those who tagged the item. This suggests that, contrary to popular perception, agreement can be reached regarding the ‘aboutness’ of a document without the guidance of an all-knowing professional.
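For anyone wanting to repeat this kind of analysis, the counting can be sketched in a few lines of Python. The data below is invented and far smaller than the 56 bookmarks discussed above, the variant mapping is an assumption, and ‘share of users’ is computed here as the fraction who applied at least one of the chosen tags, which is only one way of reading the figure.

```python
from collections import Counter

# Hypothetical data: each set holds the tags one user applied to the same item.
TAGGINGS = [
    {"folksonomy", "tagging", "metadata"},
    {"folksonomy", "metadata"},
    {"folksonomies", "tagging"},
    {"metadata", "ia"},
    {"web"},
    {"folksonomy", "classification"},
]

# Assumed mapping of variant forms (here a plural) onto one preferred form.
VARIANTS = {"folksonomies": "folksonomy"}


def normalise(tags, variants):
    """Replace each tag with its preferred form, if one is defined."""
    return {variants.get(tag, tag) for tag in tags}


def tag_counts(taggings, variants=None):
    """Count how many users applied each tag, optionally merging variants."""
    variants = variants or {}
    counts = Counter()
    for tags in taggings:
        counts.update(normalise(tags, variants))
    return counts


def share_using_any(taggings, tags, variants=None):
    """Fraction of users who applied at least one of the given tags."""
    variants = variants or {}
    wanted = set(tags)
    hits = sum(1 for applied in taggings if wanted & normalise(applied, variants))
    return hits / len(taggings)


if __name__ == "__main__":
    print(tag_counts(TAGGINGS).most_common())            # raw counts
    print(tag_counts(TAGGINGS, VARIANTS).most_common())  # plural merged into singular
    print(share_using_any(TAGGINGS, {"folksonomy", "tagging", "metadata"}, VARIANTS))
```

Merging the plural into the singular before counting is what makes the drop-off after the leading tags more pronounced, as described above.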
In his article, Merholz explains why he believes user-based tagging could be beneficial for organising information:
The potential benefits of free tagging should encourage practitioners to address such shortcomings [caused by inconsistencies in tagging]. In looking for a real-world analog, I thought of the foot-worn paths that appear in a landscape over time. Called “desire lines,” these trails demonstrate how a landscape’s users choose to move, which is often not on the paved paths. A smart landscape designer will let wanderers create paths through use, and then pave the emerging walkways, ensuring optimal utility.
I think this is exactly why user-based tagging could become a useful tool. Allowing users to freely create ‘paths’ to information should mean that a commonality eventually emerges. Once it has, it should be easier to develop a controlled language that is truly in the words of the user (rather than in the remote terminology of the ‘information professional’), making information more accessible to the average user. Although user-based tagging does have its limitations, it has great potential for organising large amounts of information – particularly on the internet. Ultimately our greatest concern should be making information easily accessible for the average user. By dispensing with terminology that we control and handing over a degree of control to the ‘masses’, we have the potential to attain this goal and to ensure that a system emerges that is user-friendly and relevant to the end-user.