On many "Web 2.0" sites (e.g. Flickr and Last.fm), user-defined tags are the main means of classifying content. Tag clouds are the most common way to visualize this folksonomy data, even though they encode only basic information: the font size of a tag encodes the number of times it has been applied to the current subject, while the position of the tag carries no information.
We intend to explore other ways to visualize the information inherent in the tags. For example, a similarity value for each pair of tags could be calculated from the number of objects on which the two tags appear together, and then visualized. Tags could also be clustered according to this similarity.
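As a rough illustration of such a similarity value, the sketch below computes one for every pair of tags that co-occur. The proposal does not fix a formula, so the Jaccard index (shared objects divided by objects carrying either tag) is an assumption here, as are the function name `tag_similarities` and the sample data.

```python
from collections import defaultdict
from itertools import combinations


def tag_similarities(tagged_objects):
    """Return {(tag_a, tag_b): similarity} for every pair of tags that co-occur.

    tagged_objects maps an object id to the set of tags applied to it.
    The similarity is the Jaccard index, an assumed choice of measure.
    """
    # Invert the mapping: for each tag, collect the objects it appears on.
    objects_by_tag = defaultdict(set)
    for obj_id, tags in tagged_objects.items():
        for tag in tags:
            objects_by_tag[tag].add(obj_id)

    similarities = {}
    for a, b in combinations(sorted(objects_by_tag), 2):
        shared = objects_by_tag[a] & objects_by_tag[b]
        if shared:  # pairs that never appear together are left out
            either = objects_by_tag[a] | objects_by_tag[b]
            similarities[(a, b)] = len(shared) / len(either)
    return similarities


# Hypothetical sample data: four photos with their tags.
photos = {
    "p1": {"sunset", "beach"},
    "p2": {"sunset", "beach", "sea"},
    "p3": {"sunset", "city"},
    "p4": {"city"},
}
similarities = tag_similarities(photos)
```

Keys are alphabetically ordered tag pairs, so `similarities[("beach", "sea")]` is the value for that pair. Any other co-occurrence measure could be substituted in the last line of the loop without changing the rest.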
Using this additional information, we propose to render the enhanced tags in a manner similar to Venn diagrams: any two tags that appear together are represented as overlapping circles, where the size of each circle encodes the overall or local frequency of its tag, and the percentage of overlapping area encodes the similarity of the pair. However, this method does not work in all cases (see attached images).
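For a single pair of tags, the circle layout can be sketched as follows. The sketch assumes that circle area is proportional to tag frequency and that "percentage of the overlapping area" means the overlap as a fraction of the smaller circle's area; neither is specified in the proposal, and the function names are illustrative. It uses the standard formula for the area of intersection of two circles and a bisection search for the centre distance.

```python
import math


def lens_area(r1, r2, d):
    """Area of intersection of two circles with radii r1, r2 and centre distance d."""
    if d >= r1 + r2:
        return 0.0  # circles do not overlap
    if d <= abs(r1 - r2):
        return math.pi * min(r1, r2) ** 2  # smaller circle lies inside the larger

    def clamp(x):
        # Guard acos against rounding slightly outside [-1, 1].
        return max(-1.0, min(1.0, x))

    a1 = r1 * r1 * math.acos(clamp((d * d + r1 * r1 - r2 * r2) / (2 * d * r1)))
    a2 = r2 * r2 * math.acos(clamp((d * d + r2 * r2 - r1 * r1) / (2 * d * r2)))
    product = (-d + r1 + r2) * (d + r1 - r2) * (d - r1 + r2) * (d + r1 + r2)
    return a1 + a2 - 0.5 * math.sqrt(max(0.0, product))


def circle_pair(freq_a, freq_b, similarity, iterations=60):
    """Return (r1, r2, d): radii and centre distance for one pair of tags.

    Assumptions: circle area equals tag frequency, and the overlap area is
    `similarity` (0..1) times the area of the smaller circle.
    """
    r1 = math.sqrt(freq_a / math.pi)
    r2 = math.sqrt(freq_b / math.pi)
    target = similarity * math.pi * min(r1, r2) ** 2

    # The overlap shrinks as the centres move apart, so bisect on the distance.
    lo, hi = abs(r1 - r2), r1 + r2
    for _ in range(iterations):
        mid = (lo + hi) / 2
        if lens_area(r1, r2, mid) > target:
            lo = mid  # too much overlap: move the circles apart
        else:
            hi = mid  # too little overlap: move them closer
    return r1, r2, (lo + hi) / 2
```

For example, `circle_pair(100, 100, 0.5)` returns two equal radii and the distance at which half of each circle is covered. The sketch places one pair in isolation; with three or more tags, the pairwise distances generally cannot all be satisfied at once in the plane, which is one way the method can fail.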
We therefore also propose a second visualization: a feature vector is calculated for each tag, consisting of the relative number of subjects on which it appears together with each other tag, and the tags are then arranged using dimensionality reduction techniques such as self-organizing maps. In the resulting tag clouds, similar tags are placed close to each other.
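A minimal sketch of this second idea is given below, using only the Python standard library. It builds one co-occurrence vector per tag (each entry is the number of subjects carrying both tags, divided by the total number of subjects) and trains a very small self-organizing map on those vectors. The grid size, learning rate, neighbourhood schedule and all function names are assumptions chosen for illustration, not parameters taken from the proposal.

```python
import math
import random


def cooccurrence_vectors(tagged_objects):
    """Map each tag to a vector of relative co-occurrence counts with every tag."""
    tags = sorted({t for ts in tagged_objects.values() for t in ts})
    total = len(tagged_objects)
    return {
        a: [sum(1 for ts in tagged_objects.values() if a in ts and b in ts) / total
            for b in tags]
        for a in tags
    }


def best_cell(weights, vec):
    """Grid cell whose weight vector is closest to vec (squared Euclidean distance)."""
    return min(weights,
               key=lambda cell: sum((w - v) ** 2 for w, v in zip(weights[cell], vec)))


def train_som(vectors, rows=3, cols=3, epochs=200, seed=0):
    """Train a small rows x cols self-organizing map; returns {cell: weight vector}."""
    rng = random.Random(seed)
    data = list(vectors.values())
    dim = len(data[0])
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    for epoch in range(epochs):
        remaining = 1 - epoch / epochs
        rate = 0.5 * remaining                           # decaying learning rate
        radius = max(rows, cols) / 2 * remaining + 0.5   # shrinking neighbourhood
        for vec in rng.sample(data, len(data)):          # shuffled pass over the data
            winner = best_cell(weights, vec)
            for cell, w in weights.items():
                grid_dist2 = (cell[0] - winner[0]) ** 2 + (cell[1] - winner[1]) ** 2
                influence = math.exp(-grid_dist2 / (2 * radius ** 2))
                for i in range(dim):
                    w[i] += rate * influence * (vec[i] - w[i])
    return weights


def place_tags(tagged_objects, **som_args):
    """Return {tag: (row, col)}: the grid cell assigned to each tag."""
    vectors = cooccurrence_vectors(tagged_objects)
    weights = train_som(vectors, **som_args)
    return {tag: best_cell(weights, vec) for tag, vec in vectors.items()}
```

Calling `place_tags` on a mapping from object ids to tag sets yields a grid position per tag, which could then serve as the tag's position in the cloud. Tags with identical co-occurrence vectors always land in the same cell; how well merely similar tags end up close together depends on the training parameters and would need to be evaluated on real data.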
Neither idea is final yet, as we must still answer the question of whether either approach is feasible for large datasets, especially those with many overlapping tags. Our main objective is therefore to explore these and possibly other visualizations.