I think I want to do some sort of review of how various socially-oriented websites have dealt with vocab control issues when their folksonomies become bothersome, aka when they realize things that should be retrieved are probably not being retrieved, and it is affecting the user experience enough to merit some sort of vocabulary control effort.
I became interested in how LibraryThing has gone about this when I did a search for books tagged with “science fiction” and was informed that the system automatically also searched for books tagged “scifi”, “sci-fi”, “science-fiction”, etc. “Who told it to do this?” I thought. Then I found out. For more on LibraryThing tagging and subject headings, see this post and this post.
Recently I read (via Book of Trogool citing ChristinasLISrant citing the Stack Overflow blog) about Stack Overflow’s tag synonym repository and wiki. The issues they’re discussing are relevant to almost any site dealing with user-generated tags, and the way they’re approaching it is interesting to read about. I didn’t even know about the existence of Stack Overflow before this but now I think I need to.
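To make the synonym idea concrete, here is a rough sketch of what a tag synonym table might look like. This is not how LibraryThing or Stack Overflow actually implement it (I have no idea what their code looks like); the names `SYNONYMS`, `canonical`, and `search`, and the sample books and tags, are all made up for illustration.

```python
# Hypothetical synonym table: variant tag -> preferred tag.
# In a real community this mapping would be proposed and voted on by users.
SYNONYMS = {
    "scifi": "science fiction",
    "sci-fi": "science fiction",
    "science-fiction": "science fiction",
}


def canonical(tag):
    """Return the preferred form of a tag (trimmed and lowercased first)."""
    cleaned = tag.strip().lower()
    return SYNONYMS.get(cleaned, cleaned)


def search(books, query_tag):
    """Return titles whose tags match the query once both are canonicalized."""
    wanted = canonical(query_tag)
    return [
        title
        for title, tags in books.items()
        if any(canonical(t) == wanted for t in tags)
    ]


# Made-up sample data: title -> tags exactly as users typed them.
BOOKS = {
    "Dune": ["sci-fi", "desert"],
    "Neuromancer": ["science fiction", "cyberpunk"],
    "Emma": ["romance"],
    "Foundation": ["SciFi"],
}

print(search(BOOKS, "science fiction"))
```

Because both the query and the stored tags pass through `canonical`, a search for “sci-fi” finds the same books as a search for “science fiction”, and nobody's original tags have to be rewritten. The hard part, of course, is deciding who gets to fill in the table.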
All of this got me thinking about what makes people tag, and more importantly, what makes people want to put the effort into telling a system, “Hey, these tags should be combined.” I think tagging is one story on a social website, and an entirely different one in library catalogs. I’ve read way too many lackluster articles about tagging in OPACs, and every OPAC I’ve used that offers the chance to tag seems to be failing at making it worthwhile or useful. The two go hand in hand, of course. If no one sees it as worthwhile to tag, the tagging feature will not be useful to anyone. (This only applies to catalogs that are starting from scratch with their folksonomies, not importing any tag data from somewhere else.) (Digression: are such things copyrightable? Can a site’s folksonomy be considered proprietary even though the tags that comprise it were created by users?)
It seems to me that the motivation behind tagging is either that the user is personally invested in the site as a community/social service (Wikipedia, Stack Overflow) or that the user knows they’re putting something into a mass of other things and the only way they or anyone will be able to find it is if they make the effort to tag it (this is why I tag on Flickr and Delicious but hardly anywhere else, depending on my mood). Sorry to say it, but I think most people don’t feel as personally invested in the library as they do in their online communities. (That’s something for my other blog, libraryrants.com.) (J/k. But why doesn’t anyone own that domain yet?!) As a user, I appreciate the service the library provides for me, but when I’m in the catalog I don’t usually have the time or patience to tag something. In any case, I only know what tags are relevant to apply after I’ve already finished a book, and why would I look the book up in the catalog after I’ve already read it? I would have to feel very motivated to say “I am going to go tag the books I read in my library catalog so I can help other users find them!” (1) I would have to believe that other users would make use of the tags, and (2) I would have to have a lot more time on my hands. And this is coming from someone who is really interested in tags and folksonomies and library catalogs! I think I’m starting to rant. I should buy that domain. Oh, one more thing though: (3) I wouldn’t be motivated to tag in a library catalog because I know that subject headings are there to help me and others find stuff. When I know there’s no such assistance, then I consider making the effort to tag.
But I haven’t started to read The Literature about tagging motivation yet. There are probably revelations to be had.
Anyways: back to the issue as it relates to online communities and social websites and not library catalogs. There is motivation, and there are people realizing that folksonomies + some vocab control = a worthwhile pursuit. To me this is pure magic, and I want to investigate all the creative ways people are coming up with to automatically combine tags, but also to make sure that the resultant preferred terms (and maybe, one day, hierarchies?) suit the language of the community as much as possible. As usual, I am overwhelmed with information.
Leftovers:
In late 2005, Alexander Street Press launched a folksonomically oriented database on women and social movements. But we found that when the size of the user community was only 500 or so academics, folksonomies were not that useful except as adjuncts to an existing taxonomy, or as a help for keyword or full-text search. They are not a silver bullet.
-Stephen Rhind-Tutt, president of Alexander Street Press
The vast majority of [EBSCO] users don’t use our services because they love them, they use them because they have to. I just can’t see college students tagging articles inside EBSCOhost or ProQuest.
-Michael Gorrell, CIO of EBSCO
–both cited in “Speaking Technically” [Discussion with seven database publishers]. American Libraries v. 39 no. 7 (August 2008) p. 54-7.
The differing terminology use in tag lists suggests that tagging may be a working example of Vannevar Bush’s associative trails. He argued that associative trails better represented how users actually work with their documents: as a holistic process of association closely tied to themselves and their work rather than by categorisation. This suggests that user tagging could provide additional access points to traditional controlled vocabularies and provide users with the associative classifications necessary to tie documents and articles to time, task and project relationships as well as other associations which are new and novel.
Usability studies show that information seekers in domains with a large number of objects prefer that related items be in meaningful groups to enable them to understand relationships quickly and thus decide how to proceed: without any means to explore and make sense of large quantities of similar items, users feel lost and fail.
Flat tag clouds as currently implemented are not sufficient to provide a semantic, rich and multidimensional browsing experience over large tagging spaces. There are several reasons for this:
1. Choosing tags by frequency of use inevitably causes a high semantic density with very few well-known and stable topics dominating the scene (as seen on RawSugar);
2. Providing only an alphabetical criterion to sort tags heavily limits the ability to quickly navigate, scan and extract, and hence build a coherent mental model out of tags;
3. A flat tag cloud cannot visually support semantic relationships between tags. We suggest that these relationships are needed to improve the user experience and general usefulness of the system;
4. Current tag clouds often fail to provide complex logical operation over tags. Simply clicking on a tag is not enough to enable a smooth and powerful exploration or refinement.
Even if FaceTag doesn’t promise to address all of these issues, we believe our approach can limit the impact of linguistic complications such as polysemy, homonymy and basic level variation while introducing an innovative, multidimensional and more semantic paradigm for organizing, navigating and searching large information spaces through tags.
To reach this goal, FaceTag contributes to social tagging systems in three ways:
1. The use of (optional) tag hierarchies. Users have the possibility to organize their resources by means of parent-child relationships;
2. Tag hierarchies are semantically assigned to editorially established facets that can be later leveraged on to flexibly navigate the resource domain;
3. Tagging and searching can be mixed to maximize findability, browsability and user-discovery.
It is 4:00 and I don’t have the attention span to read these yet BUT I will:
Ontology is Overrated
A Cognitive Analysis of Tagging


