Landing : Athabasca University

Fuzzy tags


In the past I have written about tags to which we attach some value. Once upon a time in the late 1990s I called these qualities - a means of describing the things that we find valuable in a resource that do not fall neatly into yes-no tag categories: funny, good for beginners, complex, helpful and so on. In recent years I've been calling them scalar tags, to reflect the fact that they carry a scalar value or weighting.

We need these things for a variety of reasons. In the first place, as it turns out, many of the binary-category, definitional, taxonomic tags that we use on many social systems are really nothing of the kind - they often relate to our opinions and feelings towards the things we tag which, by and large, are not black and white (they are also often about things that are relative to us and our context - such as 'my family' but that's another issue). On some sites as many as 30% of all tags used can be about subjective qualities of things. Secondly, they have a really great practical value: if tags carry a value then they can be used to rate things in multiple dimensions. Instead of the traditional 'I like this' or star ratings that simply suggest something is more or less good or bad, we can use scalar tags to describe in what ways we like them. Thus, we create a kind of disembodied user model of the aspects of ourselves that are important in relation to what we are tagging. This, as it happens, is mighty useful in an educational setting because liking a learning object or rating it highly is not really the problem: we want to know why it is liked, in what circumstances, for what purposes. Standard high-low ratings can be useful for things about which we have fairly consistent feelings such as movies, books, music and so on but the thing about learning is that it changes us. This means that what we valued when we were learning something no longer has any value to us because we have already learnt it (at least, that is often the case).

It occurred to me yesterday as I was blogging about sets that there is a much nicer term for this kind of thing than scalar tags: these are actually fuzzy tags. Fuzzy tags are of course ideally suited to being considered in fuzzy sets. Fuzzy tags are not categorisations as such, they are ways of attaching a value to a tag that reflects its degree of membership within a set of such tags. 
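As a minimal sketch of the idea, assuming that a fuzzy tag can be stored as a resource, a tag name and a degree of membership between 0 and 1, something like the following would do. The names `FuzzyTag`, `from_likert` and `fuzzy_set` are hypothetical, the five-point input scale is only one possible interface, and averaging the memberships is just one of many ways of combining several people's values.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FuzzyTag:
    """One person's fuzzy tag on one resource (hypothetical structure)."""
    resource: str
    tag: str
    membership: float  # degree of membership in the tag's fuzzy set, 0.0 to 1.0

    def __post_init__(self):
        if not 0.0 <= self.membership <= 1.0:
            raise ValueError("membership must lie between 0 and 1")


def from_likert(resource, tag, rating, points=5):
    """Map a rating on a 1..points scale onto a membership between 0 and 1."""
    if not 1 <= rating <= points:
        raise ValueError("rating outside the scale")
    return FuzzyTag(resource, tag, (rating - 1) / (points - 1))


def fuzzy_set(tags, tag):
    """Build the fuzzy set for one tag: resource -> mean membership (an assumption)."""
    totals = {}
    for t in tags:
        if t.tag == tag:
            total, count = totals.get(t.resource, (0.0, 0))
            totals[t.resource] = (total + t.membership, count + 1)
    return {r: total / count for r, (total, count) in totals.items()}
```

A crisp tag is then just the special case in which the membership is always 1.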

Fuzzy tags are not unproblematic. It is fiendishly hard to create an interface for them, they are highly susceptible to the cold start phenomenon, it is difficult to find uses that give people a non-altruistic motive for using them, and it is hard to get the right balance between counting the number of tags and the values attached to them when presenting resources that have been tagged that way. Do you give prime real estate to those with a higher average value and, if so, how do you balance it so that a resource that has been rated highly by one person does not override one that has been rated even higher by some people but lower by others? Not even considering other problems like the fact people rate things relatively differently, issues of tag ambiguity, personalisation factors and much else besides, it's a wicked problem with many viable solutions. 
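One of the many viable approaches to the balancing problem is a damped average, which pulls every resource towards a neutral value until enough people have tagged it. The sketch below is an illustration only, not a description of any system mentioned here: the function names, the neutral value of 0.5 and the damping weight of 3 are all assumptions chosen for the example.

```python
def damped_score(values, prior=0.5, weight=3.0):
    """Mean of the tag values, computed as if `weight` extra ratings of `prior` existed.

    With few ratings the score stays near `prior`; with many it approaches
    the plain mean. Both defaults are arbitrary assumptions.
    """
    return (prior * weight + sum(values)) / (weight + len(values))


def rank(resources, prior=0.5, weight=3.0):
    """Order resource names by damped score, highest first.

    `resources` maps a resource name to the list of values (0 to 1)
    that people attached to one fuzzy tag on that resource.
    """
    return sorted(
        resources,
        key=lambda name: damped_score(resources[name], prior, weight),
        reverse=True,
    )
```

With these defaults a resource given a single value of 1.0 scores 0.625, while one given 1.0, 1.0, 0.9, 0.6 and 0.5 by five people scores about 0.69, so the lone enthusiast does not override the wider, more mixed opinion. It does nothing for the other problems, such as people using the scale differently or tag ambiguity.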

If we can crack this problem (I have had a fair number of cracks at it already) then it opens the door to some interesting ways of looking at collectives. At the moment, we think of collectives as being a kind of human-machine cyberorganism that is formed from the actions of the crowd which are then processed to create something that has agency in a social system. What comes in the first page or two of a Google search is an implicit recommendation by the collective, because of the ingenious PageRank algorithm that underpins it. The big tags in tag clouds tell us what the collective finds interesting and suggests to us that we might find it interesting too (and we do - we are about 3-4 times more likely to click a big tag in a tag cloud than a small one).

Apart from in a few experimental systems (I think I've built most of them) collectives are a combination of machine intelligence and human intelligence, at least when they work well - they can equally combine mob stupidity and machine ignorance when they work badly. I think they can be a lot more useful when they also capture the affective stuff - human feelings and opinions as well as intelligence. This is about more than just good or bad feelings: it is about things that say what it is that affects us and how we are affected, as well as how much we are affected by them.  Fuzzy tags can give us richer folksonomies that reflect more of the diversity, hopes, interests and intentions of the crowd than simple taxonomic tags. Plus, 'fuzzy tag' is a really catchy name :-)


  • Eric von Stackelberg August 27, 2010 - 3:10pm

    I think "fuzzy tags" is catchy, but are scalar tags not more closely related to multi-valued tags, and fuzzy tags effectively a further extension similar to "fuzzy logic"?

  • Jon Dron August 27, 2010 - 4:11pm

    Because the scale could be anything (an infinite range between zero and one, for instance, as easily as a Likert scale) I think the term is accurate. Fuzzy logic is almost certainly the best kind for dealing with fuzzy tags which would best be seen as being members of fuzzy sets. Of course, this could just be fuzzy thinking on my part.

  • Glenn Groulx August 27, 2010 - 8:12pm

    hello jon,

    I am entirely unfamiliar with set theory, and interpret your ideas about the fuzzy tags from the perspective of developing a self-assessment model for independent learners within AU Landing.

    Here is a possible future scenario: a learner first engages in piling, which involves using tags and categories to describe their posts - the tagging process is random, unconnected. With the aid of a tutor and facilitator and peers, the learner begins to reflect on their tagging activity, and comment on and compare their own tags with the tags of others, drawing conclusions that impact their future tagging activity.

    I can see the import of independent learners using both meta-tabs (ways of developing pre-formed blog post templates to meet specific requirements, such as berry-picking posts, or piling posts, etc.) and fuzzy tags (as I interpret this, the idea of fuzzy tags for blogging refers to interpretative data that students use to add a layer of meaning.) For example, though the student has added tags to a post using a meta-tab (preformulated template that includes specific data), the student can also tag each of the individual text fields with descriptive tags, and add fuzzy tags as well. Students can select from a menu the kind of fuzzy tag category (how well the writing met expectations, for example) and attach values to each of the text fields. 

  • Jon Dron August 27, 2010 - 9:03pm

    Interesting ideas Glenn. I don't think a grasp of set theory is needed for the concept. When I first developed the concept of qualities I was looking for a way of doing collaborative filtering without having to match users with each other - to allow them to express things about the qualities of a resource that are valuable to them as well as what it is about and to allow others to use those expressions as well as to contribute to refining their expressiveness. As it went on, I came to see these as pedagogic metadata - ways to express why something was valuable from a learning perspective, which is itself a metacognitive and reflective process as well as being useful for finding the right thing at the right time. While I still find this a very powerful idea, one of the reasons we don't see it everywhere (yet) is that it adds a big layer of complexity to the process. It's not just more effort to add fuzzy tags in the first place (it's hard enough encouraging most people to tag at all), but geometrically increases the sparsity of metadata when new fuzzy tags are being added to the system. Each one is a whole new cold start problem because, in adding them, we are a) doing so in conjunction with other tags and b) inviting others to use them as rating objects. I might think something is good for beginners but you might disagree. There is great value to be had from multiple perspectives but each new tag is a new space for negotiation of meaning and value for each resource it is used on and, of course, it has to be used on at least a few to gain value and traction and, to make matters worse, it has to be contextual, within a context defined largely by other tags (or structural features of the system like groups). Just glance at The tags at to get a sense of how bad the problem can be even when we use simple categorisations, then multiply that by itself each time we add a new fuzzy tag! 
It's an interaction design issue in some respects but a logical break point in others. My concern about your suggestion is that it adds still further layers of complexity that would make it exponentially harder still to sustain unless you bullied learners into doing it - it specifies a very rigid way of learning and a vast amount of effort and attention. If this were guaranteed to result in better learning at lower personal, social and financial cost then it would certainly be worth it, but I'd have to question, if I were a learner adopting such an approach, whether I was really getting value out of it. What proof would there be? What theoretical basis would there even be for doing it? Might there be a more efficient way? It would make an interesting study! Until that point, the cascades of complexity make it hard to justify and I am fundamentally opposed to learner bullying unless that's what they very clearly both want and need.

  • Stuart Berry August 28, 2010 - 1:12pm

    Thanks for this discussion Jon

    In a few days I will begin the process of data collection for my dissertation and my research is looking at the use and value of online archives. My research environment is an online Master's level course and the students entering this course will have access to the online contributions of learners from two previous iterations of this course. They will be encouraged to work through the current course content while reading through and examining what previous learners had to say when they moved through the same course in an earlier iteration. As I spent time reading the previous contributions trying to understand the archive I began to realize that although there are good keyword search tools to assist in reaching back into the archive, very few of the contributions in the archive had any added tags. As this lack of tagging became clear to me I realized that I needed to better understand the nature and process of tagging and be in a position to offer my research participants some support around the process and value of tagging.

    I began to see the need for what you call fuzzy tags. In many of the archived discussions and comments I thought about what might be appropriate tags (had they been added) and I began to see two different kinds of tags. One kind that says to the reader, this post belongs in this category of postings or it is part of this larger set of discussions and another set of tags that tells the reader the intent of the posting. I think intent-tags would be very useful. In many of the postings I read through I had to scratch my head and wonder what the real intent of the posting was and if there was some way of adding intent then maybe we, the readers, would be able to engage the work of others in a very different way. I think this may prove to be one of the many challenges I will face as I move into this phase of my research but the concept of fuzzy tags may offer a different perspective and possibly provide participants with a clearer way of communicating with their online postings.

  • Glenn Groulx August 28, 2010 - 2:16pm

    hi Jon and Stuart,

    I have posted a more detailed commentary to this thread in the academic blogging circle blog here.

    I think, as you observed, Stuart, that students' lack of experience with tagging has made it harder for others to revisit the posts and derive some sense from them. It is only when students need to go back into their own archived content to extract meaning and direction that they realize how they might have left clues and additional details about the context, the sense-making, the reasons behind the posts. The effect of going back to previously written content, then, is that students alter their perception of the usefulness of their content, and begin revising and documenting more systematically. I am unsure at what point this actually occurs, and whether there are clearly defined developmental stages that unfold, but it might suggest that pedagogic meta-tagging be a task to develop in learners once they have amassed enough content already to make such revisions worthwhile (in view of the upcoming e-portfolio, for example).

  • Jon Dron August 28, 2010 - 4:55pm

    The value of tags is very high at a system level in a social space which has networks, groups and individuals - the more that people do it, the better it is for everyone. The benefits to the individual are often less obvious so it is not surprising they are not well used. Glenn, I like the way that you are exploring more ways that they can be valuable to the tagger. I too have experienced the benefits of using tags that I created up to a decade ago, and tag clouds in particular are extremely informative ways of understanding a person's or a group's interests, including my own. With one of my early systems I gained very valuable insights into the ways groups work and evolve through watching the coevolution of tag clouds over time. However, I think the main and obvious reason for using them is so that others with similar interests or needs will find what they point to: the main selfish motive is thus to build social capital and draw attention to something that, since it is being shown in a public space, is presumably intended to be shared.
    With that in mind, @Stuart, I'm intrigued by the notion of intent-tags. I guess that these would be ways of setting the context of why someone was writing or posting something in the first place, which would indeed be really useful to know, especially when aggregated with others using the same tag. However, it might be harder to get buy in from end users as the value at a personal level seems even less obvious than for conventional tags (I am calling these crisp tags incidentally because fuzzy set theorists call sets of bivalent objects crisp sets). It might be that I don't quite get the idea though as I can't get more than a vague notion of what words or short phrases would help establish intent, beyond things like 'coursework', 'portfolio entry' or other revealing signs of external regulation. I guess I might even be able to throw in things like 'research' for this kind of post, but after that I start to run out of ideas, unless I start on fuzzy intents that are possibly fairly universal - to try to encourage dialogue, to try to spark ideas, to be more visible internationally or locally for instance. Trouble with those is they make pretty poor metadata, if one is trying to aggregate them, and it is fairly hard to imagine many people searching for things because of them, especially if they get to be very honest - that people have posted them for, say, vanity or displaying their command of the written language :-). If the purpose of an intent tag is just to help explain the context of a post when you are reading it then it might be better to use a simple comment/abstract/short description field. Even then, it is hard to be sufficiently honest or clear about motivation sometimes. An interesting reflective process though with potential pedagogic value. Have I misunderstood or failed to identify what an intent tag might look like? Could you give examples?

  • Eric von Stackelberg August 30, 2010 - 9:15am

    If we consider

    a) crisp tags=tagging,

    b) fuzzy tags= tagging(0...1) /* fuzzy quality */,

    c) multi-valued tags=tagging((quality=0...1) /* fuzzy quality */, (value_to_learner=0...1) /* importance with fuzzy quality */)

    I had thought Intent would be identified in the fuzzy or multi-valued tag examples. As mentioned the cold start problem would be worse as the tags become more complex but through a combination of UI and  adaption, personalization to decrease the overhead with tagging we should get better data of all three types.

    I should note that I thought of scalar tags as c), which effectively include the confidence or fuzziness of b) for specific attributes. This way, as a learner, I can also review the tagging metadata created by more experienced learners.




  • Stuart Berry August 30, 2010 - 11:09am

    I think the above comments are an example of how a continued discussion can begin to flesh out an idea, push out and hopefully clarify points and when this has reached its natural end I wonder how this complete discussion might be tagged. I am not suggesting this conversation is anywhere near done - just an observation.

    I am trying to think through how I might answer you Jon with respect to an example of an intent tag. I have not yet gathered a single piece of data and yet as I read through the prior contributions of learners in my course archive I am concerned as to how others may find and then perceive value and what they might need to assist them in possibly finding their way through the material. (Maybe I am just expressing doubt about my research) However I see great potential in this archival mass partly due to my perspective and partly because I am working hard to find this value. I think tagging is just one element in this process.

    In reading through two years of course contributions I find rich discussions, great clarity of intent, as well as a lot of phatic asides. I think the phatic asides can be as valuable as the rich contributions  and in most cases it is clear that they are just an aside comment but there are also a number of postings that have rich content but appear to wander somewhere else completely. The idea of an intent tag in this instance might be "an aside" or "off topic" or something more focussed on the actual wandering. I understand your concern about the metadata but maybe the idea of an intent-tag is less about the broader aggregation and could be more seen as a set of sub-tags. This does assume that the author used some clear crisp or fuzzy tags as well but I think of systems that use hierarchical naming schemes and maybe tagging needs to evolve into some form of multi-layered hierarchy. One layer (the top) that spoke to a more universal audience and sub-layers that spoke to a local audience.

  • Jon Dron August 31, 2010 - 10:27am

    @Stuart - I completely agree re sub-tags (we have work in progress to try to do that and my earlier systems used a 2-stage hierarchy of crisp tags then a further layer of fuzzy ones) but it becomes very complex in interface terms and massively increases the cold start problem when we do that. Each higher level in the hierarchy is a little tag system in itself in which a separate tag cloud develops.

    I like the concept of phatic tags - it opens up a huge realm of other potential speech acts we might encapsulate as tags - performative tags (e.g. 'I agree', 'I do') for instance. I guess the value is particularly great if we are performing qualitative content analysis or want to visualise dialogue structures but I am always wary that every time we impose a constraint or affordance we are changing the system: in its weakest form it is a kind of social equivalent of Schroedinger's cat, in its strongest it can profoundly shape and form interactions. This can have value when we are strongly aware of what and why we are shaping it the way we do though, more often, we have tended to let it happen: for instance, the highly constraining form of the threaded discussion that is so ubiquitous and unreflectively implemented in much online learning. Giving people more control over the kind of tags and other fields used, as well as the ways things can interact, seems like a good idea.

    Given the developing discussion here, it seems like it would be good to let people (collectively) create what would effectively be a structured database: to add fields that allow us to capture more structured data such as 'intent tags' or Glenn's more structured tags, or Eric's multi-valued tags. Once a particular kind of data entry field has been created by someone, anyone else could use it when adding resources (blog posts, wiki pages, bookmarks etc) if they want. There is actually a forms plugin for Elgg that allows administrators to do something vaguely along those lines so it would not take too much to convert it into something a bit more social and let anyone play. Once you have defined a certain field type (e.g. 'intent', 'metatag' etc) you and others could re-use it in other forms. It would be hard to make it do more than act as a search field or maybe cluster things as behaviours are much harder to enable than data capture, but it might help to give a bit of hardness to the process if that's what people need.




    That's an interesting distinction: so, are multi-valued tags (i) effectively multi-dimensional fuzzy tags, or do you mean (ii) that this is a fixed array of two elements that includes both the evaluation itself and a rating of the personal value of that evaluation to you? It's beginning to sound quite recursive!  

    If we are talking about multi-dimensional fuzzy tags then we might as well simply use fuzzy tags and allow people to select more than one at a time: that is a more flexible solution than an array. If I use a fuzzy tag of 'value to me' then I can achieve exactly the same result without the complexity.

    If, on the other hand, it is a fixed array with two associated elements, the fuzzy tag itself and the value of the tagged resource to me as a learner then it gets more interesting but it is redundant: the fact that someone uses a tag in the first place carries with it an implication that the tag has some meaning or relevance - there would be no point in tagging something about which you care little. If I say that this is 'good for beginners' and, say, give it a rating of 5/5 it not only means that I think it is good for beginners but that I find this worth noting in the context I am using it, from my own perspective. If I give it a rating of 1/5 I am implying that 'good for beginners' is important to me but that this is hopeless in that context (but I might find other things of value about it). Why would I tag something using a quality that is unimportant to me?  So, the fact that I have tagged it in the first place carries an implication of both the value of the resource and its relevance to me as a learner. It is of no interest to others how useful I found it as a separate concern from the tag itself. The interesting thing to others is why I found it valuable and how valuable I found it in that context, which is what fuzzy tags already give us.

    And yes, the data become exponentially more sparse and the cold-start problem even colder.

    Alas, we gain nothing from people who are more experienced as tags are thrown off along the learning journey as we learn: when they tagged things they were in the same boat as we are. Now they have learned what it was they sought to learn, they are unlikely to go back and change the weighting unless they are either remarkably altruistic or forced to do so by teachers.


    All bets are off if we want to do some collaborative filtering or make use of a reputation system, where individual relevance does make a big difference and may well be worth recording, but there are gigantic cans of worms once we start down those paths about which many PhD theses have been written.