Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

October 9, 2010

Multiobjective Variable Neighborhood Search for Solving the Motif Discovery Problem

Filed under: Bioinformatics — Patrick Durusau @ 2:38 pm

Multiobjective Variable Neighborhood Search for Solving the Motif Discovery Problem Author(s): David L. González-Álvarez, Miguel A. Vega-Rodríguez, Juan A. Gómez-Pulido, Juan M. Sánchez-Pérez Keywords: Multiobjective Skewed Variable Neighborhood Search (MO–SVNS), Motif Discovery Problem (MDP), hypervolume indicator.

Abstract:

In this work we approach the Motif Discovery Problem (MDP) by using a trajectory-based heuristic. Identifying common patterns, motifs, in deoxyribonucleic acid (DNA) sequences is a major problem in bioinformatics, and it has not yet been resolved in an efficient manner. The MDP aims to discover patterns that maximize three objectives: support, motif length, and similarity. Therefore, the use of multiobjective evolutionary techniques can be a good tool to get quality solutions. We have developed a multiobjective version of the Variable Neighborhood Search (MO-VNS) in order to handle this problem. After accurately tuning this algorithm, we also have implemented its variant Multiobjective Skewed Variable Neighborhood Search (MO-SVNS) to analyze which version achieves more complete solutions. Moreover, in this work, we incorporate the hypervolume indicator, allowing future comparisons of other authors. As we will see, our algorithm achieves very good solutions, surpassing other proposals.

The need to discover subjects/motifs that are patterns in strings isn’t limited to deoxyribonucleic acid (DNA) sequences.

A large amount of work has gone into pattern matching in bioinformatics and topic map authors should take advantage of it.

Knowledge Construction with Social Web Tools

Filed under: Interface Research/Design,Topic Maps — Patrick Durusau @ 7:51 am

Knowledge Construction with Social Web Tools Author(s): Margarida Lucas, António Moreira Keywords: Social Web – Knowledge Construction – Interaction Analysis

Abstract:

This paper examines knowledge construction in a distributed learning environment supported by social web tools. Research data was gathered from online asynchronous discussions in a first-year Masters Degree course in Multimedia in Education. Our analysis was modeled on Gunawardena, Lowe and Anderson’s (1997) study and results indicate that, despite a significant percentage in the phase of sharing and comparing information, interaction at the highest levels of knowledge construction is relevant and suggests that knowledge was constructed.

Important work for two reasons:

First, studying actual use of social tools beats by a wide margin projection by programmers of what they think users will find intuitive or useful.

Second, how to study users and software is in its infancy and explorations such as this one provide a basis for further study in this critical area.

There is a “technical” side to topic maps but the “social” and “interface” sides are just as important. Alone, none of them are sufficient for a successful topic map application.

BigTable Model with Cassandra and HBase – Post

Filed under: Cassandra,HBase,NoSQL — Patrick Durusau @ 6:29 am

BigTable Model with Cassandra and HBase Non-hand-waving explanation of Cassandra and HBase.

Has anyone tried the column-of-values approach, where subjectIdentifier or subjectLocator is a set of values?
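A minimal sketch of what the question has in mind, in plain Python rather than against Cassandra or HBase. The row layout, the `rows` data, the example.org URLs and the `merge_by_identifier` helper are all hypothetical; the only idea taken from topic maps is that topics sharing a subject identifier represent the same subject.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical wide rows: row key -> column name -> set of values.
Rows = Dict[str, Dict[str, Set[str]]]

rows: Rows = {
    "t1": {"subjectIdentifier": {"http://example.org/id/puccini"}},
    "t2": {"subjectIdentifier": {"http://example.org/id/puccini",
                                 "http://example.org/people/giacomo-puccini"}},
    "t3": {"subjectIdentifier": {"http://example.org/id/verdi"}},
}


def merge_by_identifier(rows: Rows) -> List[Tuple[Set[str], Set[str]]]:
    """Group row keys whose subjectIdentifier sets share at least one value.

    Returns a list of (row keys, merged identifier set) pairs.
    """
    groups: List[Tuple[Set[str], Set[str]]] = []
    for key, columns in rows.items():
        ids = set(columns.get("subjectIdentifier", set()))
        keys = {key}
        remaining = []
        for group_keys, group_ids in groups:
            if group_ids & ids:
                # Shared identifier: fold the existing group into this one.
                keys |= group_keys
                ids |= group_ids
            else:
                remaining.append((group_keys, group_ids))
        remaining.append((keys, ids))
        groups = remaining
    return groups


groups = merge_by_identifier(rows)
for keys, ids in groups:
    print(sorted(keys), sorted(ids))
```

Here `t1` and `t2` end up in one group because their identifier sets intersect, while `t3` stays on its own. Whether a column store can do that intersection efficiently at scale is exactly the open question.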

October 8, 2010

Semantic Drift and Linked Data/Semantic Web

Filed under: Linked Data,OWL,Semantic Web,Subject Identity — Patrick Durusau @ 10:28 am

Overloading OWL sameAs starts with:

Description: General Issue: owl:sameAs is being used in the linked data community in a way that is inconsistent with its semantics.

Read the document but in summary: People use OWL sameAs to mean different things.

I don’t see how their usage can be “inconsistent with its semantics.”

Words don’t possess self-executing semantics that bind us. Rather the other way round I think.

If OWL sameAs had some “original” semantic, it changed by the process of semantic drift.

Semantic drift is where the semantics of a token changes over time or across communities due to its use by people.

URIs or tokens may be “stable,” but the evidence is that the semantics of URIs or tokens are not.

The question is how to manage changing, emerging, drifting semantics? (Not a question answered by a static semantic model of URI based identity.)

PS: RDF researchers have recognized semantic drift and have proposed solutions for addressing it. More on that anon.

Questions:

  • Select a classification more than 30 years old and randomly select one book for each 5 year period for the last 30 years. What (if any) semantic drift do you see in the use of this classification?
  • Exchange your list with a classmate. Do you agree/disagree with their evaluation? Why?
  • Repeat the exercise in #1 and #2 but use a classification where you can find books between 30 and 60 years ago. Select one book per 5 year period.

Union Catalogs of Learning Objects: Why Not?

Filed under: Digital Library — Patrick Durusau @ 6:36 am

Union Catalogs of Learning Objects: Why Not? Author(s): Ana M.B. Pavani Keywords: Metadata – Learning Objects – Digital Libraries – Union Catalogs – Open Archives Initiative Protocol for Metadata Harvesting

Abstract:

This work presents a combined view of digital libraries, union catalogs and digital learning materials; union catalogs of metadata of ETD – Electronic Theses and Dissertations are shown as a paradigm. From this integrated view, and based on the existing ETD solution, it suggests that union catalogs of learning objects (digital learning materials with independent identities) be implemented with the participation of institutions worldwide. Open and free software solutions, and training are part of the overall proposed strategy.

More of a call to action than a specific proposal.

Worth reading to be reminded how important it is to share resources.

Even if, like the first cataloging venture in the 13th century, the work of sharing will never be done.

A Haptic-Based Framework for Chemistry Education: Experiencing Molecular Interactions with Touch

A Haptic-Based Framework for Chemistry Education: Experiencing Molecular Interactions with Touch Author(s): Sara Comai, Davide Mazza Keywords: Haptic technology – Chemical education and teaching – Molecular interaction

Abstract:

The science of haptics has received a great attention in the last decade for data visualization and training. In particular haptics can be introduced as a novel technology for educational purposes. The usage of haptic technologies can greatly help to make the students feel sensations not directly experienceable and typically only reported as notions, sometimes also counter-intuitively, in textbooks. In this work, we present a haptically-enhanced system for the tactile exploration of molecules. After a brief description of the architecture of the developed system, the paper describes how it has been introduced in the usual didactic activity by providing a support for the comprehension of concepts typically explained only theoretically. Users feedbacks and impressions are reported as results of this innovation in teaching.

Imagine researchers using haptics to recognize molecules or molecular reactions.

Are the instances of recognition to be compared with other such instances?

How would you establish the boundaries for a “match?”

How would you communicate those boundaries to others?

Library Linked Data: Call for Use Cases

Filed under: Linked Data,Semantic Web — Patrick Durusau @ 6:11 am

Library Linked Data: Call for Use Cases

Just a quick reminder that the call for use cases from the W3C Library Linked Data Incubator Group ends on 15 October 2010.

The mailing list archives may be of interest: public-lld@w3.org.

I’m not a fan of “Linked Data” but it will be encountered by topic map authors and so we need to follow its development.

Inside Neo4j: Intro and roadmap

Filed under: Graphs,Neo4j,NoSQL,Software — Patrick Durusau @ 6:07 am

Inside Neo4j: Intro and roadmap

Chris Gioran has started a series of posts at A Digital Stain covering the internals of Neo4j.

Whether you are interested in Neo4j in particular or graph databases in general, this is a series of posts to watch closely.

October 7, 2010

Public Interchangeable Identifier

Filed under: Subject Identifiers,Subject Identity,Topic Maps — Patrick Durusau @ 7:19 am

I mentioned yesterday that creating a public interchangeable identifier isn’t as easy as identifying identifiers and documenting them publicly. (See: Recognizing an Interchangeable Identifier)

What if I identified (by some means) “Patrick” as an identifier and posted it to my website (public documentation).

Is that now a “public interchangeable identifier?”

No. Why?

First, there has to be some agreed upon means to declare an identifier to be an identifier. When I say agreed upon, it need not be something as formal as a standard but it has to be recognized by a community of users.

Second, it is important to know in what context this is an identifier. Akin to what we talk about as “scope” in topic maps. But with the recognition that the notion of “unconstrained” scope is a pernicious fiction. Scope may be unspecified but it is never unconstrained.

I would argue that no identifier exists without some defined scope. It may not be known or specified but the essence of an identifier, that it identifies some subject, exists only within some scope.
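One way to picture the point about scope is to treat an identifier as a pair of token and scope rather than a bare token. This is only a sketch: the `ScopedIdentifier` class, the scope labels, the registry contents and the `resolve` helper are hypothetical and not part of any topic maps standard.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class ScopedIdentifier:
    token: str
    scope: str  # hypothetical label for the context in which the token identifies


# Hypothetical registry: the same token identifies different subjects in different scopes.
registry: Dict[ScopedIdentifier, str] = {
    ScopedIdentifier("Patrick", "topic-maps-blog"): "the author of this blog",
    ScopedIdentifier("Patrick", "irish-saints"): "the patron saint of Ireland",
}


def resolve(token: str, scope: Optional[str] = None) -> List[str]:
    """Return the subjects a token identifies, optionally narrowed by scope.

    Leaving scope unspecified does not make the token universal; it only
    returns every candidate, leaving the choice to the caller.
    """
    return sorted(
        subject
        for ident, subject in registry.items()
        if ident.token == token and (scope is None or ident.scope == scope)
    )


print(resolve("Patrick", "topic-maps-blog"))
print(resolve("Patrick"))
```

With a scope the token resolves to one subject; without one, the caller gets both candidates back and still has to decide, which is the sense in which scope can be unspecified but not unconstrained.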

More on means to declare identifiers and their context anon.

Machine Learning Support for Human Articulation of Concepts from Examples – A Learning Framework

Filed under: Authoring Topic Maps,Topic Maps — Patrick Durusau @ 6:35 am

Machine Learning Support for Human Articulation of Concepts from Examples – A Learning Framework
Author(s): Gabriela Pavel Keywords: concept learning – machine learning – visual environment – learning framework

Abstract:

We aim to show that machine learning methods can provide meaningful feedback to help the student articulate concepts from examples, in particular from images. Therefore we present here a framework to support the learning through human visual classifications and machine learning methods.

In the article the sentence:

Images help people externalize their intuitive knowledge, within a process called articulation, or transfer from tacit to explicit knowledge.

Caught my eye.

The process of authoring of topic maps is articulation, or transfer from tacit to explicit knowledge.

The paper addresses the use of images not only to teach the concepts represented in those images but also to elicit from students their tacit knowledge of those concepts.

If that seems a bit mundane, imagine intelligence authors scanning images of people or locales and adding their tacit knowledge to a shareable data store.

KP-Lab System: A Collaborative Environment for Design, Realization and Examination of Different Knowledge Practices

Filed under: Collaboration,Exercises,Interface Research/Design,Software — Patrick Durusau @ 6:18 am

KP-Lab System: A Collaborative Environment for Design, Realization and Examination of Different Knowledge Practices Author(s): Ján Paralič, František Babič Keywords: collaborative system – practices – patterns – time-line – summative information

Abstract:

This paper presents a collaborative working and learning environment called KP-Lab System. It provides a complex and multifunctional application built on principles of semantic web, exploiting also some web2.0 approaches as Google Apps or mashups. This system offers virtual user environment with different, necessary and advanced features for collaborative learning or working knowledge intensive activities. This paper briefly presents the whole system with special emphasis on its semantic-based aspects and analytical tools.

Public Site: http://2d.mobile.evtek.fi/shared-space (Be aware that Firefox will say this is an untrusted site as of 6 October 2010. Not sure why but I just added a security exception to access the site.)

Software: http://www.kp-lab.org/tools

Exploration of semantic user interfaces is in its infancy and this is another attempt to explore that space.

Questions/Activities:

  1. Create account and login to public site (Organization: none)
  2. Comments on the interface?
  3. Suggestions for changes to interface?
  4. Download/install software (geeks)
  5. Create content (with other class members)
  6. Likes/dislikes managing content on basis of subject identity?

WebGraph

Filed under: Graphs,Indexing,Navigation,Searching,Software — Patrick Durusau @ 5:56 am

WebGraph was mentioned in the article Fast and Compact Web Graph Representations.

Great work on the web graph, with software and data sets for exploring!

(Warning: If you like this sort of thing you will lose hours if not days here.)

Questions:

  1. Is the Web Graph different from a graph of a topic map?
  2. How would you go about researching question #1?
  3. Would your answer to #1 vary depending on the topic map you chose?
  4. Would the size of a topic map affect your answer?
  5. How would you test your answer to #4?
  6. What other aspects of graphs would you want to explore on topic maps?

October 6, 2010

Recognizing an Interchangeable Identifier

Filed under: Indexing,Semantics,Subject Identifiers,Subject Identity — Patrick Durusau @ 7:13 am

Subjects & Identifiers shows why we need interchangeable identifiers.

Q: How would you recognize an interchangeable identifier?

A: Oh, yeah, that’s right. Anything we can talk about has an identifier, so how to recognize an interchangeable identifier?

If two people agree on column headers for a database table, they have interchangeable identifiers for the columns, at least between the two of them.

There are two requirements for interchangeable identifiers:

  1. Identification as an identifier.
  2. Notice of the identifier.

Any token can be an identifier under some circumstances so identifiers must be identified for interchange.

Notice of an identifier is usually a matter of being part of a profession or discipline. Some term is an identifier because it was taught to you as one.

That works for local interchange, but public interchange requires publicly documented identifiers.

That’s it. Identify identifiers and document the identifiers publicly and you will have public interchangeable identifiers.

It can’t be that simple? Well, truthfully, it’s not.

More on public interchangeable identifiers forthcoming.

Fast and Compact Web Graph Representations

Filed under: Data Structures,Graphs,Navigation,Searching,Software — Patrick Durusau @ 7:10 am

Fast and Compact Web Graph Representations Authors: Francisco Claude, Gonzalo Navarro Keywords: Compression, Web Graph, Data Structures

Abstract:

Compressed graph representations, in particular for Web graphs, have become an attractive research topic because of their applications in the manipulation of huge graphs in main memory. The state of the art is well represented by the WebGraph project, where advantage is taken of several particular properties of Web graphs to offer a trade-off between space and access time. In this paper we show that the same properties can be exploited with a different and elegant technique that builds on grammar-based compression. In particular, we focus on Re-Pair and on Ziv-Lempel compression, which, although cannot reach the best compression ratios of WebGraph, achieve much faster navigation of the graph when both are tuned to use the same space. Moreover, the technique adapts well to run on secondary memory and in distributed scenarios. As a byproduct, we introduce an approximate Re-Pair version that works efficiently with severely limited main memory.

Software & Examples: Fast and Compact Web Graph Representations

As topic maps grow larger and/or memory space becomes smaller (comparatively speaking), compressed graph work becomes increasingly relevant.

Gains in navigation speed are always welcome.
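For readers who have not met Re-Pair, the core idea is to repeatedly replace the most frequent pair of adjacent symbols with a new symbol and record the rule. The sketch below is a naive textbook-style version on a list of non-negative integers (think of concatenated adjacency lists). It is not the authors' implementation, it runs in quadratic time, and the function names are my own.

```python
from collections import Counter
from typing import Dict, List, Tuple


def repair_compress(seq: List[int]) -> Tuple[List[int], Dict[int, Tuple[int, int]]]:
    """Naive Re-Pair: replace the most frequent adjacent pair until none repeats."""
    seq = list(seq)
    rules: Dict[int, Tuple[int, int]] = {}
    next_symbol = max(seq, default=0) + 1  # assumes non-negative integer symbols
    while True:
        # Note: this counts overlapping pairs (e.g. in 7,7,7), a simplification.
        pairs = Counter(zip(seq, seq[1:]))
        if not pairs:
            break
        pair, count = pairs.most_common(1)[0]
        if count < 2:
            break
        rules[next_symbol] = pair
        out, i = [], 0
        while i < len(seq):
            if i + 1 < len(seq) and (seq[i], seq[i + 1]) == pair:
                out.append(next_symbol)  # left-to-right, non-overlapping replacement
                i += 2
            else:
                out.append(seq[i])
                i += 1
        seq = out
        next_symbol += 1
    return seq, rules


def repair_expand(seq: List[int], rules: Dict[int, Tuple[int, int]]) -> List[int]:
    """Expand rule symbols back into the original sequence."""
    out: List[int] = []
    stack = list(reversed(seq))
    while stack:
        symbol = stack.pop()
        if symbol in rules:
            left, right = rules[symbol]
            stack.append(right)
            stack.append(left)
        else:
            out.append(symbol)
    return out


original = [1, 2, 3, 1, 2, 3, 4, 1, 2]
compressed, rules = repair_compress(original)
print(compressed, rules)
```

The compressed sequence plus the rule table is enough to rebuild the input with `repair_expand`. The paper's contribution lies in doing this within tight memory and in navigating the graph without fully expanding it, none of which this sketch attempts.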

Mining Historic Query Trails to Label Long and Rare Search Engine Queries

Filed under: Authoring Topic Maps,Data Mining,Entity Extraction,Search Engines,Searching — Patrick Durusau @ 7:05 am

Mining Historic Query Trails to Label Long and Rare Search Engine Queries Authors: Peter Bailey, Ryen W. White, Han Liu, Giridhar Kumaran Keywords: Long queries, query labeling

Abstract:

Web search engines can perform poorly for long queries (i.e., those containing four or more terms), in part because of their high level of query specificity. The automatic assignment of labels to long queries can capture aspects of a user’s search intent that may not be apparent from the terms in the query. This affords search result matching or reranking based on queries and labels rather than the query text alone. Query labels can be derived from interaction logs generated from many users’ search result clicks or from query trails comprising the chain of URLs visited following query submission. However, since long queries are typically rare, they are difficult to label in this way because little or no historic log data exists for them. A subset of these queries may be amenable to labeling by detecting similarities between parts of a long and rare query and the queries which appear in logs. In this article, we present the comparison of four similarity algorithms for the automatic assignment of Open Directory Project category labels to long and rare queries, based solely on matching against similar satisfied query trails extracted from log data. Our findings show that although the similarity-matching algorithms we investigated have tradeoffs in terms of coverage and accuracy, one algorithm that bases similarity on a popular search result ranking function (effectively regarding potentially-similar queries as “documents”) outperforms the others. We find that it is possible to correctly predict the top label better than one in five times, even when no past query trail exactly matches the long and rare query. We show that these labels can be used to reorder top-ranked search results leading to a significant improvement in retrieval performance over baselines that do not utilize query labeling, but instead rank results using content-matching or click-through logs. 
The outcomes of our research have implications for search providers attempting to provide users with highly-relevant search results for long queries.

(Apologies for repeating the long abstract but this needs wider notice.)

What the authors call “label prediction algorithms” are a step in mining data for subjects.

The research may also improve search results through the use of labels for ranking.
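To make the idea concrete, here is a toy label predictor that borrows the label of the most similar logged query. It uses plain Jaccard term overlap, which is a stand-in of my own and not one of the four algorithms the paper compares; the log entries, the category strings and the function names are hypothetical.

```python
from typing import List, Optional, Tuple


def jaccard(a: str, b: str) -> float:
    """Term-overlap similarity between two queries (case-insensitive)."""
    terms_a, terms_b = set(a.lower().split()), set(b.lower().split())
    union = terms_a | terms_b
    return len(terms_a & terms_b) / len(union) if union else 0.0


def predict_label(query: str, labeled_log: List[Tuple[str, str]]) -> Optional[str]:
    """Return the label of the most similar logged query, or None if nothing overlaps."""
    best = max(labeled_log, key=lambda item: jaccard(query, item[0]), default=None)
    if best is None or jaccard(query, best[0]) == 0.0:
        return None
    return best[1]


# Hypothetical log of short queries with directory-style category labels.
log = [
    ("cheap flights paris", "Recreation/Travel"),
    ("python list sort", "Computers/Programming"),
]

print(predict_label("how to sort a python list of tuples by second item", log))
```

The long query never appears in the log, yet it still picks up a label from the short query it partially overlaps. For topic map authors, the label plays the role of a candidate subject awaiting confirmation.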

The RelFinder user interface: interactive exploration of relationships between objects of interest

Filed under: Associations,Interface Research/Design,RDF,Semantic Web,Software — Patrick Durusau @ 7:00 am

The RelFinder user interface: interactive exploration of relationships between objects of interest Authors: Steffen Lohmann, Philipp Heim, Timo Stegemann, Jürgen Ziegler Keywords: dbpedia, decision support, graph visualization, linked data, relationship discovery, relationship web, semantic user interfaces, semantic web, sparql, visual exploration

Abstract:

Being aware of the relationships that exist between objects of interest is crucial in many situations. The RelFinder user interface helps to get an overview: Even large amounts of relationships can be visualized, filtered, and analyzed by the user. Common concepts of knowledge representation are exploited in order to support interactive exploration both on the level of global filters and single relationships. The RelFinder is easy-to-use and works on every RDF knowledge base that provides standardized SPARQL access

Software: RelFinder

RelFinder presents a way to leverage data already in RDF for the creation of associations in topic maps.

Or to explore data already available in RDF.

Exploration of relationships is important for “data” but even more important for the syntaxes that contain data.

Such as equivalence between subjects represented by syntax tokens.

October 5, 2010

Re-Using Linked Data

Filed under: Authoring Topic Maps,Dataset,Linked Data,Topic Maps — Patrick Durusau @ 9:24 am

The German national library released its authority records as linked data.

News and reference services have content management systems that don’t use URIs, so how do they link up public linked data with their private data?

In a way that they can share the resulting linked data within their organization?

Exploration question: What mapping facilities exist in popular CMS systems for mapping linked data to local data?

I don’t know the answer to that but will be finding out.

In the meantime, if you know your CMS system cannot do such a mapping, consider using topic maps. (topicmaps.org)

Topic maps can create linked data that is not subject to the limitation of using URIs.

tagging, communities, vocabulary, evolution

Filed under: Authoring Topic Maps,Interface Research/Design,Tagging — Patrick Durusau @ 8:46 am

tagging, communities, vocabulary, evolution Authors: Shilad Sen, Shyong K. (Tony) Lam, Al Mamunur Rashid, Dan Cosley, Dan Frankowski, Jeremy Osterhouse, F. Maxwell Harper, John Riedl Keywords: communities, evolution, social book-marking, tagging, vocabulary

Abstract:

A tagging community’s vocabulary of tags forms the basis for social navigation and shared expression. We present a user-centric model of vocabulary evolution in tagging communities based on community influence and personal tendency. We evaluate our model in an emergent tagging system by introducing tagging features into the MovieLens recommender system. We explore four tag selection algorithms for displaying tags applied by other community members. We analyze the algorithms’ effect on vocabulary evolution, tag utility, tag adoption, and user satisfaction.

The influence of an interface on the creation of topic maps is an open area for research. Research on tagging behavior is an excellent starting point for such studies.

Question: Would you modify the experimental setup to test the creation of topics? If so, in what way? Why?

Context-aware intelligent recommender system

Filed under: Classification,Context-aware,Fuzzy Logic — Patrick Durusau @ 6:49 am

Context-aware intelligent recommender system Authors: Mehdi Elahi Keywords: active learning, classification, context-aware, fuzzy logic, recommendation systems, recommenders

Abstract:

This demo paper presents a context-aware recommendation system. The system mines data from user’s web searches and other sources to improve the presentation of content on visited web pages. While user is browsing the internet, a memory resident agent records and analyzes the content of the webpages that were either searched for or visited in order to identify topic preferences. Then, based on such information, the content of requested web page is ranked and classified with different styles. The demo shows how a music weblog can be modified automatically based on user’s affinities.

Context-aware recommendation systems help present relevant information in large topic maps but I am more interested in their use for authoring systems.

Automatic construction of topics/roles/associations based on prior choices (for user approval) comes to mind.

Not a tool for a casual author but certainly a power tool for professional information explorers. (librarians?)

Grist For Topic Map Mills: German National Library – Authority Files

Filed under: Dataset,Linked Data,RDF,Semantic Web — Patrick Durusau @ 6:09 am

German National Library – Authority Files (linked data)

A post from Lars Svensson announced the release of authority files from the German National Library:

The German National Library (DNB) has published the German library authority files as linked data. The dataset consists of 1.8 Mill differentiated persons from the PND (Personennamendatei, Name authority file), 187.000 subject headings from the SWD (Schlagwortnormdatei, Subject headings authority file), 1.3 Mill corporate bodies from the GKD (Gemeinsame Körperschaftsdatei, Corporate Body Authority file), and 51,000 classes from the German translation of the Dewey Decimal Classification (DDC).

Library students should take particular note of the subject heading and Dewey Decimal Classification materials.

For topic mappers, another set of identifiers that can be mapped between the data sets shown by the data cloud as well as those that don’t use URIs as identifiers (the vast majority of data).

This will also be of interest to the linked data community.

October 4, 2010

Topic Maps for Drupal

Filed under: Topic Map Software,Topic Maps — Patrick Durusau @ 9:07 am

Topic Maps for Drupal

Sam Hunting has released a topic maps module for Drupal.

Features a wiki-like syntax for the authoring of topic maps in-line with blog posts, for example, and the ability to use data plugins for the parsing of data.

(Well, now I will have to install/configure Drupal so I can test it out.)

QuaaxTM (Release)

Filed under: Topic Map Software,Topic Maps — Patrick Durusau @ 5:23 am

QuaaxTM has released version 0.5.3 of the QuaaxTM PHP Topic Maps engine.

Includes JTM 1.0 read/write and JTM 1.1 read support (QuaaxTMIO library).

(As per Johannes Schmidt, jschmidt@t8d.de.)

Understanding web documents using semantic overlays

Filed under: Interface Research/Design,Mapping,Semantic Web — Patrick Durusau @ 5:08 am

Understanding web documents using semantic overlays Authors: Grégoire Burel, Amparo Elizabeth Cano Keywords: semantic overlays, semantic web, web augmentation

Abstract:

The Ozone Browser is a platform independent tool that enables users to visually augment the knowledge presented in a web document in an unobtrusive way. This tool supports the user comprehension of Web documents through the use of Semantic Overlays. This tool uses linked data and lightweight semantics for getting relevant information within a document. The current implementation uses a JavaScript bookmarklet.

The “overlay” nature of this interface attracted my attention. Suspect it would work with “other” sources of page annotation, such as topic maps.

Suspicion only since the project page, http://oak.dcs.shef.ac.uk/sparks/ is a dead link as of 4 October 2010. I have written to the project and will update its status.


Update:

Apologies for the long delay in following up on this entry!

The correct URL, not the one reported in the article: http://nebula.dcs.shef.ac.uk/sparks/ozone.

Now I will have to find the time to try the bookmarklet. Comments if you have already?

Finding your way in a multi-dimensional semantic space with Luminoso

Filed under: Clustering,Interface Research/Design,Natural Language Processing — Patrick Durusau @ 4:53 am

Finding your way in a multi-dimensional semantic space with luminoso Authors: Robert H. Speer, Catherine Havasi, K. Nichole Treadway, Henry Lieberman Keywords: common sense, n-dimensional visualization, natural language processing, SVD

Abstract:

In AI, we often need to make sense of data that can be measured in many different dimensions — thousands of dimensions or more — especially when this data represents natural language semantics. Dimensionality reduction techniques can make this kind of data more understandable and more powerful, by projecting the data into a space of many fewer dimensions, which are suggested by the computer. Still, frequently, these results require more dimensions than the human mind can grasp at once to represent all the meaningful distinctions in the data.

We present Luminoso, a tool that helps researchers to visualize and understand a multi-dimensional semantic space by exploring it interactively. It also streamlines the process of creating such a space, by inputting text documents and optionally including common-sense background information. This interface is based on the fundamental operation of “grabbing” a point, which simultaneously allows a user to rotate their view using that data point, view associated text and statistics, and compare it to other data points. This also highlights the point’s neighborhood of semantically-associated points, providing clues for reasons as to why the points were classified along the dimensions they were. We show how this interface can be used to discover trends in a text corpus, such as free-text responses to a survey.

I particularly like the interactive rotation about a data point.

Makes me think of rotating among identifications, or even within complexes of subjects.

The presentation of “rotation” I suspect to be domain specific.

The “geek” graph/node presentation probably isn’t the best one for all audiences. Open question as to what might work better.

See: Luminoso (homepage) and Luminoso (Github)

A multimodal dialogue mashup for medical image semantics

Filed under: Data Integration,Interface Research/Design — Patrick Durusau @ 4:26 am

A multimodal dialogue mashup for medical image semantics Authors: Daniel Sonntag, and Manuel Möller Keywords: collaborative environments, design, touchscreen interface

Abstract:

This paper presents a multimodal dialogue mashup where different users are involved in the use of different user interfaces for the annotation and retrieval of medical images. Our solution is a mashup that integrates a multimodal interface for speech-based annotation of medical images and dialogue-based image retrieval with a semantic image annotation tool for manual annotations on a desktop computer. A remote RDF repository connects the annotation and querying task into a common framework and serves as the semantic backend system for the advanced multimodal dialogue a radiologist can use.

With regard to the semantics of the interface the authors say:

In a complex interaction system, a common ground of terms and structures is absolutely necessary. A shared representation and a common knowledge base ease the dataflow within the system and avoid costly and error-prone transformation processes.

I disagree with both statements but concede that for particular use cases, the cost of dataflow question will be resolved differently.

I like the article as an example of interface design.

October 3, 2010

Subjects & Identifiers

Filed under: Subject Identity,Topic Maps — Patrick Durusau @ 7:27 am

For all the talk about assigning subjects identifiers, all subjects already have identifiers.

The ones we can talk about anyway.* Try it, you will see what I mean. As soon as you say a name or otherwise identify a subject, it has an identifier.

In the classic topic map use case, mapping indexes together, all the subjects had identifiers, the words the indexers had used. But the indexers had used the same words for different subjects and different words for the same subjects.

The search for universal identifiers is a known dead end, so what is the next best solution?

Interchangeable Identifiers.

Interchangeable identifiers provide more information to assist in matching up different identifiers for the same subjects. And distinguishing different subjects.

The development of “interchange” markup for texts and data started over twenty (20) years ago and continues today.

The sooner we start exploring interchangeable identifiers the sooner we will make up for lost time.

*(I don’t worry about subjects I can’t talk about.)

Automatic generation of research trails in web history

Filed under: Interface Research/Design,Search Interface,Searching,Trails — Patrick Durusau @ 7:23 am

Automatic generation of research trails in web history Authors: Elin Rønby Pedersen, Karl Gyllstrom, Shengyin Gu, Peter Jin Hong Keywords: activity based computing, automatic clustering, ethnography, semantic clustering, task browser, web history

Abstract:

We propose the concept of research trails to help web users create and reestablish context across fragmented research processes without requiring them to explicitly structure and organize the material. A research trail is an ordered sequence of web pages that were accessed as part of a larger investigation; they are automatically constructed by filtering and organizing users’ activity history, using a combination of semantic and activity based criteria for grouping similar visited web pages. The design was informed by an ethnographic study of ordinary people doing research on the web, emphasizing a need to support research processes that are fragmented and where the research question is still in formation. This paper motivates and describes our algorithms for generating research trails.

Research trails can be applied in several situations: as the underlying mechanism for a research task browser, or as feed to an ambient display of history information while searching. A prototype was built to assess the utility of the first option, a research trail browser.
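To make the grouping idea concrete, here is a rough sketch of how visited pages might be clustered into trails by combining a semantic criterion (shared title words) with an activity criterion (time between visits). This is not the authors' algorithm, which the abstract does not spell out. The function names, the thresholds `max_gap`, `near_sim` and `far_sim`, and the sample history are all assumptions made for illustration.

```python
def tokens(title):
    """Lower-cased title words longer than three characters."""
    return {w for w in title.lower().split() if len(w) > 3}


def jaccard(a, b):
    """Jaccard similarity of two sets; 0.0 when both are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_trails(visits, max_gap=1800, near_sim=0.1, far_sim=0.3):
    """Group (timestamp_seconds, title) visits into ordered trails.

    A page joins the most similar existing trail. Less similarity is
    required when the trail was active recently (within max_gap seconds),
    more when the trail has been idle. Otherwise a new trail is started.
    """
    trails = []  # each trail: {"pages": [...], "terms": set(), "last": timestamp}
    for ts, title in sorted(visits):
        terms = tokens(title)
        best, best_score = None, 0.0
        for trail in trails:
            score = jaccard(terms, trail["terms"])
            needed = near_sim if ts - trail["last"] <= max_gap else far_sim
            if score >= needed and score > best_score:
                best, best_score = trail, score
        if best is None:
            best = {"pages": [], "terms": set(), "last": ts}
            trails.append(best)
        best["pages"].append(title)
        best["terms"] |= terms
        best["last"] = ts
    return [t["pages"] for t in trails]


# Hypothetical browsing history: two interleaved investigations.
history = [
    (0, "Topic maps introduction"),
    (600, "Cheap flights Lisbon"),
    (900, "Topic maps tutorial"),
    (90000, "Lisbon flights hotel deals"),
]
for trail in build_trails(history):
    print(trail)
```

With this sample history the interleaved visits separate into two trails, and the visit a day later rejoins the travel trail because it is similar enough to clear the stricter idle threshold.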

What is a map if it isn’t an accumulated set of research trails?

We are in the early stages of learning what it means to create, share and extend trails into information sets.

Will you be one of the explorers who creates research trails into information sets as they pass into the giga, tera and petabyte ranges and beyond?

Exploratory information search by domain experts and novices

Exploratory information search by domain experts and novices Authors: Ruogu Kang, Wai-Tat Fu Keywords: domain expertise, exploratory search, social search

Abstract:

The arising popularity of social tagging system has the potential to transform traditional web search into a new era of social search. Based on the finding that domain expertise could influence search behavior in traditional search engines, we hypothesized and tested the idea that domain expertise would have similar influence on search behavior in a social tagging system. We conducted an experiment comparing search behavior of experts and novices when they searched using a traditional search engine and a social tagging system. Results from our experiment showed that experts relied more on their own domain knowledge to generate search queries, while novices were influenced more by social cues in the social tagging system. Experts were also found to conform to each other more than novices in their choice of bookmarks and tags. Implications on the design of future social information systems are discussed.

Empirical validation of the idea that expert searchers (dare I say librarians?) can improve the search results for “novice” searchers.

A line of research that librarians need to take up and expand to combat budget cuts by the uninformed.

Note that experts suffer from the “vocabulary” problem just like novices, just in more sophisticated ways.

Designing a thesaurus-based comparison search interface for linked cultural heritage sources

Filed under: Classification,Heterogeneous Data,Interface Research/Design,Thesaurus — Patrick Durusau @ 7:15 am

Designing a thesaurus-based comparison search interface for linked cultural heritage sources Authors: Alia Amin, Michiel Hildebrand, Jacco van Ossenbruggen, Lynda Hardman Keywords: comparison search, thesauri, cultural heritage

Prototype: LISA, e-culture.multimedian.nl

Abstract:

Comparison search is an information seeking task where a user examines individual items or sets of items for similarities and differences. While this is a known information need among experts and knowledge workers, appropriate tools are not available. In this paper, we discuss comparison search in the cultural heritage domain, a domain characterized by large, rich and heterogeneous data sets, where different organizations deploy different schemata and terminologies to describe their artifacts. This diversity makes meaningful comparison difficult. We developed a thesaurus-based comparison search application called LISA, a tool that allows a user to search, select and compare sets of artifacts. Different visualizations allow users to use different comparison strategies to cope with the underlying heterogeneous data and the complexity of the search tasks. We conducted two user studies. A preliminary study identifies the problems experts face while performing comparison search tasks. A second user study examines the effectiveness of LISA in helping to solve comparison search tasks. The main contribution of this paper is to establish design guidelines for the data and interface of a comparison search application. Moreover, we offer insights into when thesauri and metadata are appropriate for use in such applications.

User-centric project that develops an interface into heterogeneous data sets.

I would characterize this as pre-mapping; that is, no “canonical” mapping has yet been established.

Perhaps a good idea to preserve a pre-mapping stage as any mapping represents but one choice among many.

October 2, 2010

Anything to Topic Maps

Filed under: Authoring Topic Maps,Topic Map Software,Topic Maps — Patrick Durusau @ 5:08 am

Anything to Topic Maps.

Lars Heuer announced Anything to Topic Maps (Email Announcement), saying that while the aim is to map anything, Mappify presently maps only Atom feeds. (more to follow)

Lars also illustrates how promiscuous topic mapping can lead to unexpected results. 😉
