## Archive for the ‘Tagging’ Category

### Introducing tags to Journal of Cheminformatics

Sunday, March 24th, 2013

Introducing tags to Journal of Cheminformatics by Bailey Fallon.

From the post:

Journal of Cheminformatics will now be “tagging” its publications, allowing articles related by common themes to be linked together.

Where an article has been tagged, readers will be able to access all other articles that share the same tag via a link at the right hand side of the HTML, making it easier to find related content within the journal.

This functionality has been launched for three resources that appear frequently in Journal of Cheminformatics and we will continue to add tags when relevant.

• Open Babel: Open Babel is an open source chemical toolbox that interconverts over 110 chemical data formats. The first paper describing the features and implementation of Open Babel appeared in Journal of Cheminformatics in 2011, and this tag links it with a number of other papers that use the toolkit
• PubChem: PubChem is an open archive for the biological activities of small molecules, which provides search and analysis tools to assist users in locating desired information. This tag amalgamates the papers published in the PubChem3D thematic series with other papers reporting applications and developments of PubChem
• InChI: The InChI is a textual identifier for chemical substances, which provides a standard way of representing chemical information. It is machine readable, making it a valuable tool for cheminformaticians, and this tag links a number of papers in Journal of Cheminformatics that rely on its use

It’s not sophisticated authoring of associations but, carefully done, tagging can collate information resources for users.

On export to a topic map application, implied roles could be made explicit, assuming the original tagging was consistent.
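As a rough illustration of what making the implied roles explicit could look like, here is a minimal Python sketch. The association type (`uses-resource`), the role names (`article`, `resource`) and the article ids are all my assumptions; nothing here comes from the journal's actual tagging system.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Association:
    """One tagging relationship with its roles spelled out."""
    assoc_type: str
    roles: tuple  # (role name, player id) pairs


def export_tags(article_tags):
    """Turn {article: [tag, ...]} into associations with explicit roles."""
    associations = []
    for article, tags in sorted(article_tags.items()):
        for tag in tags:
            associations.append(Association(
                assoc_type="uses-resource",  # assumed association type
                roles=(("article", article), ("resource", tag)),
            ))
    return associations


# Hypothetical article ids; the tag names are the three from the post.
article_tags = {
    "article-001": ["Open Babel"],
    "article-002": ["PubChem", "InChI"],
}
associations = export_tags(article_tags)
for a in associations:
    print(a.assoc_type, dict(a.roles))
```

The export is only as good as the tagging: if one tag was sometimes used for "uses the toolkit" and sometimes for "criticizes the toolkit," a single role assignment like this would paper over the difference.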

### Managing Conference Hashtags

Tuesday, November 13th, 2012

David Karger tweets today:

Ironically amusing that ontology researchers can’t manage to agree on a canonical tag for their conference #iswc #iswc12 #iswc2012

If that’s true for ontology researchers, what chance does the rest of the world have?

Just to help ontology researchers along a bit (in LTM syntax):

*****

/* typing topics */

[conf = "conference"]

/* scoping topics */

[SWTwitter01 : conf = "Semantic Web, Twitter hashtag 01."]

[SWTwitter02 : conf = "Semantic Web, Twitter hashtag 02."]

[SWTwitter03 : conf = "Semantic Web, Twitter hashtag 03."]

[iswc2012 : conf = "ISWC 2012, The 11th International Semantic Web Conference" ("#iswc" / SWTwitter01) ("#iswc12" / SWTwitter02) ("#iswc2012" / SWTwitter03)]

*****

I added the “conf” typing topic to the scoping topics to distinguish those tags from tags for other subjects that share the ISWC acronym:

ISWC (International Standard Musical Work Code)

Welcome to ISWC 2013! The International Symposium on Wearable Computers (ISWC)

Wikipedia – ISWC, also lists:

International Speed Windsurfing Class

But missed:

International Student Welcome Committee

There remains the task of distinguishing tags in the wild from tags for these other subjects.

Once that is done, all the tweets about the conference, under these or other tags, can be collocated for a full set of tweets about the conference.

Other subjects and relationships, such as person, date, location, topic, tags, retweets, etc., can be just as easily added.

Personally I would make the default sort order for tweets a function of date/time, quite possibly mis-using sortname for that purpose. People are accustomed to seeing tweets in time order and fancy collocation can wait until they select an author, subject, tag, etc.
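To make the collocation idea concrete, here is a minimal Python sketch, assuming tweets arrive as simple dictionaries. The tweet records are invented for illustration, and the mapping simply treats all three hashtags as naming the conference; it does nothing about the harder problem of `#iswc` in the wild meaning one of the other subjects.

```python
from collections import defaultdict
from datetime import datetime

# Hashtag variants from the LTM example, mapped to the one subject they name.
HASHTAG_TO_SUBJECT = {
    "#iswc": "iswc2012",
    "#iswc12": "iswc2012",
    "#iswc2012": "iswc2012",
}


def collate(tweets):
    """Group tweets by subject, then sort each group by date/time."""
    by_subject = defaultdict(list)
    for tweet in tweets:
        # A set, so a tweet carrying two variant tags is collected only once.
        subjects = {
            HASHTAG_TO_SUBJECT[tag.lower()]
            for tag in tweet["tags"]
            if tag.lower() in HASHTAG_TO_SUBJECT
        }
        for subject in subjects:
            by_subject[subject].append(tweet)
    for group in by_subject.values():
        group.sort(key=lambda t: t["time"])
    return dict(by_subject)


# Hypothetical tweets, for illustration only.
tweets = [
    {"author": "alice", "time": datetime(2012, 11, 13, 10, 5), "tags": ["#iswc2012"]},
    {"author": "bob", "time": datetime(2012, 11, 13, 9, 30), "tags": ["#ISWC"]},
    {"author": "carol", "time": datetime(2012, 11, 13, 9, 45), "tags": ["#iswc12", "#iswc"]},
]
collated = collate(tweets)
for tweet in collated["iswc2012"]:
    print(tweet["time"].isoformat(), tweet["author"])
```

Time order is the default here; grouping by author or tag would be a second pass over the same collated list.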

### AgroTagger [Auto-Topic Map Authoring?]

Wednesday, November 7th, 2012

AgroTagger

From the webpage:

Used for indexing information resources, Agrotagger is a keyword extractor that uses the AGROVOC thesaurus as its set of allowable keywords. It can extract from Microsoft Office documents, PDF files and web pages.

There are currently several available services that can be accessed either as web interfaces for manual document upload or as REST web services that can be programmatically invoked:

I was following up on the AGROVOC thesaurus (see FAO thesaurus links with reegle) and found this interesting resource.

Doesn’t seem like a big jump to have a set of keywords that create topics, associations and occurrences with document author(s), journal, place of employment, etc.

Would need proofing but on the other hand could produce a topic map for proofing tout de suite. (No, Michel, I had to look it up.)
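Here is a minimal Python sketch of that jump, under loud assumptions: the three-term vocabulary is a hypothetical stand-in for a thesaurus (not AGROVOC data), the matching is naive phrase lookup rather than anything AgroTagger does, and the document id and author are invented.

```python
import re

# Hypothetical stand-in for a thesaurus; illustrative terms, not AGROVOC data.
VOCABULARY = {"maize", "soil erosion", "irrigation"}


def extract_keywords(text, vocabulary=VOCABULARY):
    """Return the vocabulary terms that occur in the text as whole words."""
    normalized = " ".join(re.findall(r"[a-z]+", text.lower()))
    padded = f" {normalized} "
    return sorted(term for term in vocabulary if f" {term} " in padded)


def draft_topic_map(doc_id, author, text):
    """Build a draft set of topics, occurrences and associations for proofing."""
    keywords = extract_keywords(text)
    return {
        "topics": sorted({doc_id, author, *keywords}),
        "occurrences": [(keyword, doc_id) for keyword in keywords],
        "associations": [("authored-by", doc_id, author)],
    }


draft = draft_topic_map(
    "report-42",
    "J. Doe",
    "Irrigation practices and soil erosion in upland fields.",
)
print(draft)
```

The output is a draft, nothing more. Proofing would still have to catch keywords that match the words but not the subject.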

### Citizen Archivist Dashboard ["...help the next person discover that record"]

Sunday, June 10th, 2012

Citizen Archivist Dashboard

What’s the common theme of these interfaces from the National Archives (United States)?

• Tag – Tagging is a fun and easy way for you to help make National Archives records found more easily online. By adding keywords, terms, and labels to a record, you can do your part to help the next person discover that record. For more information about tagging National Archives records, follow “Tag It Tuesdays,” a weekly feature on the NARAtions Blog. [includes "missions" (sets of materials for tagging), rated as "beginner," "intermediate," and "advanced." Or you can create your own mission.]
• Transcribe – By contributing to transcriptions, you can help the National Archives make historical documents more accessible. Transcriptions help in searching for the document as well as in reading and understanding the document. The work you do transcribing a handwritten or typed document will help the next person discover and use that record.

The transcription tool features over 300 documents ranging from the late 18th century through the 20th century for citizen archivists to transcribe. Documents include letters to a civil war spy, presidential records, suffrage petitions, and fugitive slave case files.

[A pilot project with 300 documents but one you should follow. Public transcription (crowd-sourced if you want the popular term) of documents has the potential to open up vast archives of materials.]

• Edit Articles – Our Archives Wiki is an online space for researchers, educators, genealogists, and Archives staff to share information and knowledge about the records of the National Archives and about their research.

Here are just a few of the ways you may want to participate:

• Create new pages and edit pre-existing pages
• Store useful information discovered during research
• Expand upon a description in our online catalog

• Upload & Share – Calling all researchers! Start sharing your digital copies of National Archives records on the Citizen Archivist Research group on Flickr today.

Researchers scan and photograph National Archives records every day in our research rooms across the country — that’s a lot of digital images for records that are not yet available online. If you have taken scans or photographs of records you can help make them accessible to the public and other researchers by sharing your images with the National Archives Citizen Archivist Research Group on Flickr.

• Index the Census – Citizen Archivists, you can help index the 1940 census!

The National Archives is supporting the 1940 census community indexing project along with other archives, societies, and genealogical organizations. The release of the decennial census is one of the most eagerly awaited record openings. The 1940 census is available to search and browse, free of charge, on the National Archives 1940 Census web site. But, the 1940 census is not yet indexed by name.

You can help index the 1940 census by joining the 1940 census community indexing project. To get started you will need to download and install the indexing software, register as an indexing volunteer, and download a batch of images to transcribe. When the index is completed, the National Archives will make the named index available for free.

The common theme?

The tagging entry sums it up with: “…you can do your part to help the next person discover that record.”

That’s the “trick” of topic maps. Once a fact about a subject is found, you can preserve your “finding” for the next person.

### How do you measure the impact of tagging on retrieval?

Thursday, May 31st, 2012

How do you measure the impact of tagging on retrieval? by Tony Russell-Rose.

From the post:

A client of mine wants to measure the difference between manual tagging and auto-classification on unstructured documents, focusing in particular on its impact on retrieval (i.e. relevance ranking). At the moment they are considering two contrasting approaches:

See Tony’s post for details.

What do you think?

### Closing the Knowledge Gap:.. (Lessons for TMs?)

Friday, December 30th, 2011

Closing the Knowledge Gap: A Case Study – How Cisco Unlocks Communications by Tony Frazier, Director of Product Management, Cisco Systems and David Fishman, Marketing, Lucid Imagination.

From the post:

Cisco Systems set out to build a system that takes the search for knowledge beyond documents into the content of social network inside the enterprise. The resulting Cisco Pulse platform was built to deliver corporate employees a better understanding who’s communicating with whom, how, and about what. Working with Lucid Imagination, Cisco turned to open source — specifically, Solr/Lucene technology — as the foundation of the search architecture.

Cisco’s approach to this project centered on vocabulary-based tagging and search. Every organization has the ability to define keywords for their personalized library. Cisco Pulse then tags a user’s activity, content and behavior in electronic communications to match the vocabulary, presenting valuable information that simplifies and accelerates knowledge sharing across an organization. Vocabulary-based tagging makes unlocking the relevant content of electronic communications safe and efficient.

You need to read the entire article but two things to note:

• No uniform vocabulary: Every “organization” created its own.
• Automatic tagging: Content was automatically tagged (read: users did not tag).

The article doesn’t go into any real depth about the tagging but it is implied that who created the content and other information is getting “tagged” as well.

In a topic maps context, I read that to mean that, given a declared vocabulary and automatic tagging, another process could create associations with roles and role players, and other topic map constructs, without bothering end users with those tasks.

Not to mention that declaring equivalents between tags as part of the reading/discovery process might be limited to some but not all users.

An incremental or perhaps even evolving authoring of a topic map.

Rather than a dead-tree resource, delivered as a fait accompli, a topic map can change as new information or new views of existing/new information are added to the map. (A topic map doesn’t have to be so useful. It can be the equivalent of a dead-tree resource if you really want.)

### Automatically creating tags for big blogs with WordPress (possible upgrade)

Wednesday, December 28th, 2011

Automatically creating tags for big blogs with WordPress (possible upgrade)

Ajay Ohri writes:

I use the simple-tags plugin in WordPress for automatically creating and posting tags. I am hoping this makes the site better to navigate. Given the fact that I had not been a very efficient tagger before, this plugin can really be useful for someone in creating tags for more than 100 (or 1000 posts) especially WordPress based blog aggregators. (added the hyperlink to simple-tags)

I am thinking about possible changes to this blog to make it more useful. Both for me and you.

Curious if anyone has experience with the “simple-tags” plugin? Was it useful?

Do you think it would be useful with the type of material you find here?

### Evolutionary Subject Tagging in the Humanities…

Saturday, December 3rd, 2011

Evolutionary Subject Tagging in the Humanities; Supporting Discovery and Examination in Digital Cultural Landscapes by Jack Ammerman, Vika Zafrin, Dan Benedetti, Garth W. Green.

Abstract:

In this paper, the authors attempt to identify problematic issues for subject tagging in the humanities, particularly those associated with information objects in digital formats. In the third major section, the authors identify a number of assumptions that lie behind the current practice of subject classification that we think should be challenged. We move then to propose features of classification systems that could increase their effectiveness. These emerged as recurrent themes in many of the conversations with scholars, consultants, and colleagues. Finally, we suggest next steps that we believe will help scholars and librarians develop better subject classification systems to support research in the humanities.

Truly remarkable piece of work!

Just to entice you into reading the entire paper, the authors challenge the assumption that knowledge is analogue. Successfully, in my view, but I already held that position so I was an easy sell.

BTW, if you are in my topic maps class, this paper is required reading. Summarize what you think are the strong/weak points of the paper in 2 to 3 pages.

### Modular Unified Tagging Ontology (MUTO)

Thursday, November 17th, 2011

Modular Unified Tagging Ontology (MUTO)

From the webpage:

The Modular Unified Tagging Ontology (MUTO) is an ontology for tagging and folksonomies. It is based on a thorough review of earlier tagging ontologies and unifies core concepts in one consistent schema. It supports different forms of tagging, such as common, semantic, group, private, and automatic tagging, and is easily extensible.

I thought the tagging axioms were worth repeating:

• A tag has always exactly one label – otherwise it is not a tag. (Additional labels can be separately defined, e.g. via skos:Concept.)
• Tags with the same label are not necessarily semantically identical. (Each tag has its own identity and property values.)
• A tag can itself be a resource of tagging (tagging of tags).

From the properties defined, however, it isn’t clear how to determine when tags do have the same meaning and/or how to communicate that understanding to others.

Ah, or would that be a tagging of a tagging?

That sounds like it leaves a lot of semantic detail on the cutting room floor but it may be that viable semantic systems, oh, say natural languages, do exactly that. Something to think about, isn’t it?
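The three axioms are easy to picture as a data structure. What follows is a minimal Python sketch, not MUTO itself: the class and field names are mine, and `meaning` is a hypothetical property I added only so the two example tags differ in something besides identity.

```python
from dataclasses import dataclass, field
from itertools import count

_ids = count(1)


@dataclass(eq=False)  # compare by identity: same label does not mean same tag
class Tag:
    label: str                                 # exactly one label per tag
    meaning: str = ""                          # hypothetical property, for illustration
    tags: list = field(default_factory=list)   # a tag can itself be tagged
    tag_id: int = field(default_factory=lambda: next(_ids))


java_island = Tag("java", meaning="island in Indonesia")
java_language = Tag("java", meaning="programming language")

# A tagging of a tag: one way to say something about a tag's meaning.
java_language.tags.append(Tag("ambiguous"))

print(java_island.label == java_language.label)  # labels match
print(java_island == java_language)              # the tags themselves do not
```

Note what the sketch cannot do: nothing in it says when two tags do mean the same thing. That judgment still has to come from somewhere, which is the question raised above.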

### uClassify

Saturday, July 2nd, 2011

uClassify

From the webpage:

uClassify is a free web service where you can easily create your own text classifiers. You can also directly use classifiers that have already been shared by the community.

Examples:

• Language detection
• Web page categorization
• Written text gender and age recognition
• Mood
• Spam filter
• Sentiment
• Automatic e-mail support
• See below for some examples

So what do you want to classify on? Only your imagination is the limit!

As of 1 July 2011, thirty-seven public classifiers are waiting on you and your imagination.

The emphasis is on tagging documents.

How useful is tagging documents when a search results in > 100 documents? Would your answer be the same or different if the search results were < 20 documents? What if the search results were > 500 documents?

I first saw this at textifter blog in the post A Classifier for the Masses.

### Modeling Social Annotation: A Bayesian Approach

Monday, January 3rd, 2011

Modeling Social Annotation: A Bayesian Approach

Authors: Anon Plangprasopchok, Kristina Lerman

Abstract:

Collaborative tagging systems, such as Delicious, CiteULike, and others, allow users to annotate resources, for example, Web pages or scientific papers, with descriptive labels called tags. The social annotations contributed by thousands of users can potentially be used to infer categorical knowledge, classify documents, or recommend new relevant information. Traditional text inference methods do not make the best use of social annotation, since they do not take into account variations in individual users’ perspectives and vocabulary. In a previous work, we introduced a simple probabilistic model that takes the interests of individual annotators into account in order to find hidden topics of annotated resources. Unfortunately, that approach had one major shortcoming: the number of topics and interests must be specified a priori. To address this drawback, we extend the model to a fully Bayesian framework, which offers a way to automatically estimate these numbers. In particular, the model allows the number of interests and topics to change as suggested by the structure of the data. We evaluate the proposed model in detail on the synthetic and real-world data by comparing its performance to Latent Dirichlet Allocation on the topic extraction task. For the latter evaluation, we apply the model to infer topics of Web resources from social annotations obtained from Delicious in order to discover new resources similar to a specified one. Our empirical results demonstrate that the proposed model is a promising method for exploiting social knowledge contained in user-generated annotations.

Questions:

1. How does a tagging vocabulary differ (if it does) from a regular vocabulary? (3-5 pages, no citations)
2. Would this technique be applicable to tracing vocabulary usage across cited papers? In other words, following an author backwards through materials they cite? (3-5 pages, no citations)
3. What other characteristics do you think a paper would have where the usage of a term had shifted to a different meaning? (3-5 pages, no citations)

### Survey on Social Tagging Techniques

Monday, December 6th, 2010

Survey on Social Tagging Techniques

Authors: Manish Gupta, Rui Li, Zhijun Yin, Jiawei Han

Keywords: Social tagging, bookmarking, tagging, social indexing, social classification, collaborative tagging, folksonomy, folk classification, ethnoclassification, distributed classification, folk taxonomy

Abstract:

Social tagging on online portals has become a trend now. It has emerged as one of the best ways of associating metadata with web objects. With the increase in the kinds of web objects becoming available, collaborative tagging of such objects is also developing along new dimensions. This popularity has led to a vast literature on social tagging. In this survey paper, we would like to summarize different techniques employed to study various aspects of tagging. Broadly, we would discuss about properties of tag streams, tagging models, tag semantics, generating recommendations using tags, visualizations of tags, applications of tags and problems associated with tagging usage. We would discuss topics like why people tag, what influences the choice of tags, how to model the tagging process, kinds of tags, different power laws observed in tagging domain, how tags are created, how to choose the right tags for recommendation, etc. We conclude with thoughts on future work in the area.

I recommend this survey in part due to its depth but also for not lacking a viewpoint:

…But fixed static taxonomies are rigid, conservative, and centralized. [cite omitted]…Hierarchical classifications are influenced by the cataloguer’s view of the world and, as a consequence, are affected by subjectivity and cultural bias. Rigid hierarchical classification systems cannot easily keep up with an increasing and evolving corpus of items…By their very nature, hierarchies tend to establish only one consistent, authoritative structured vision. This implies a loss of precision, erases differences of expression, and does not take into account the variety of user needs and views.

I am not innocent of having made similar arguments in other contexts. It makes good press among the young and dissatisfied, but it doesn’t bear up under close scrutiny.

For example, the claim is made that “hierarchical classifications” are “affected by subjectivity and cultural bias.” The implied claim is that social tagging is not. Yes? I would argue that all classification, hierarchical and otherwise, is affected by “subjectivity and cultural bias.”

Questions:

1. Choose one of the other claims about hierarchical classifications. Is it also true of social tagging? Why/Why not? (3-5 pages, no citations)
2. Choose a social tagging practice. What are its strengths/weaknesses? (3-5 pages, no citations)
3. How would you use topic maps with the social tagging practice in #2? (3-5 pages, no citations)

### Using Tag Clouds to Promote Community Awareness in Research Environments

Friday, October 15th, 2010

Using Tag Clouds to Promote Community Awareness in Research Environments

Authors: Alexandre Spindler, Stefania Leone, Matthias Geel, Moira C. Norrie

Keywords: Tag Clouds – Ambient Information – Community Awareness

Abstract:

Tag clouds have become a popular visualisation scheme for presenting an overview of the content of document collections. We describe how we have adapted tag clouds to provide visual summaries of researchers’ activities and use these to promote awareness within a research group. Each user is associated with a tag cloud that is generated automatically based on the documents that they read and write and is integrated into an ambient information system that we have implemented.

One of the selling points of topic maps has been the serendipitous discovery of new information. Discovery is predicated on awareness and this is an interesting approach to that problem.
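For readers who have not built one, a tag cloud boils down to counting terms and scaling the counts to display sizes. The Python sketch below is my own assumption about the general technique, not the authors' system: it uses plain word frequency over invented document titles, a tiny stopword list, and linear scaling to font sizes.

```python
import re
from collections import Counter

# Small illustrative stopword list; a real system would use a fuller one.
STOPWORDS = {"a", "an", "and", "for", "in", "of", "on", "the", "to", "with"}


def tag_cloud(documents, top_n=5, min_size=10, max_size=30):
    """Map the most frequent words to font sizes, scaled linearly by count."""
    counts = Counter(
        word
        for doc in documents
        for word in re.findall(r"[a-z]+", doc.lower())
        if word not in STOPWORDS
    )
    top = counts.most_common(top_n)
    if not top:
        return {}
    high, low = top[0][1], top[-1][1]
    span = (high - low) or 1  # avoid dividing by zero when all counts are equal
    return {
        word: min_size + (n - low) * (max_size - min_size) // span
        for word, n in top
    }


# Hypothetical titles standing in for what one researcher reads and writes.
documents = [
    "Tagging and topic maps",
    "Topic maps for tagging research",
    "Tagging in research groups",
]
cloud = tag_cloud(documents, top_n=3)
print(cloud)
```

One cloud per researcher, regenerated as they read and write, is the part that turns a visual summary into awareness.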

Questions:

1. To what extent does awareness of tagging by colleagues influence future tagging?
2. How would you design a project to measure the influence of tagging?
3. Would the influence of tagging change your design of an information interface? Why/Why not? If so, how?

### tagging, communities, vocabulary, evolution

Tuesday, October 5th, 2010

tagging, communities, vocabulary, evolution

Authors: Shilad Sen, Shyong K. (Tony) Lam, Al Mamunur Rashid, Dan Cosley, Dan Frankowski, Jeremy Osterhouse, F. Maxwell Harper, John Riedl

Keywords: communities, evolution, social book-marking, tagging, vocabulary

Abstract:

A tagging community’s vocabulary of tags forms the basis for social navigation and shared expression. We present a user-centric model of vocabulary evolution in tagging communities based on community influence and personal tendency. We evaluate our model in an emergent tagging system by introducing tagging features into the MovieLens recommender system. We explore four tag selection algorithms for displaying tags applied by other community members. We analyze the algorithms’ effect on vocabulary evolution, tag utility, tag adoption, and user satisfaction.

The influence of an interface on the creation of topic maps is an open area for research. Research on tagging behavior is an excellent starting point for such studies.

Question: Would you modify the experimental setup to test the creation of topics? If so, in what way? Why?