Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 21, 2011

Feldspar: A System for Finding Information by Association

Filed under: Associations,Query Language,TMQL,Visual Query Language — Patrick Durusau @ 5:28 pm

Feldspar: A System for Finding Information by Association

…use non-specific requirements to find specific things.

Uses associations to build queries.

Associations developed by Google Desktop.

Very cool!

January 7, 2011

Association Game

Filed under: Associations,Humor — Patrick Durusau @ 7:16 pm

Actually it is called YouTube Name Mashup.

Enter a first and last name and it selects random YouTube videos.

It requires Chrome or Safari. (Seriously, even Mozilla dies.)

The association part?

Do this at a party, with or without topic mappers.

Divide into two teams. Each team has a turn at suggesting a first and last name for submission.

Teams have until the videos stop to write down a relationship (an association, to you topic map readers), including its roles, between any subject in one video and any subject in the other video.

The best relationship is determined by applause from those attending the party.

Five rounds maximum.

Remember, the point of this exercise is to have fun and practice some imaginative thinking.

December 30, 2010

Neo4J 1.2 – Released!

Filed under: Associations,Neo4j,NoSQL — Patrick Durusau @ 7:13 pm

Neo4J 1.2 Released!

New features:

  • The Neo4j Server

    The Neo4j standalone server builds upon the RESTful API that was pre-released for Neo4j 1.1. The server provides a complete stand alone Neo4j graph database experience, making it easy to access Neo4j from any programming language or platform. Some of you have already provided great client libraries for languages such as Python, Ruby, PHP, the .Net stack and more. Links and further information about client libraries can be found at: http://www.delicious.com/neo4j/drivers

  • Neo4j High Availability

    The High Availability feature of Neo4j provides an easy way to set up a cluster of graph databases. This allows for read scalability and tolerates faults in any of the participating machines. Writes are allowed to any machine, but are synchronized with a slight delay across all of them.

    High Availability in Neo4j is still in quite an early stage of its evolution and thus still have a few limitations. While it provides scalability for read load, write operations are slightly slower. Adding new machines to a cluster still requires some manual work, and very large transactions cannot be transmitted across machines. These limitations will be addressed in the next version of Neo4j.

  • Some other noteworthy changes include:
    • Additional services for the Neo4j kernel can now be loaded during startup, or injected into a running instance. Examples of such additional services are the Neo4j shell server and the Neo4j management interface.
    • Memory footprint and read performance has been improved.
    • A new cache implementation has been added for high load, low latency workloads.
    • A new index API has been added that is more tightly integrated with the database. This new index API supports indexing relationships as well as nodes, and also supports indexing and querying multiple properties for each node or relationship. The old index API has been deprecated but remains available and will continue to receive bug fixes for a while.
    • The Neo4j shell supports performing path algorithm queries.
    • Built in automatic feedback to improve future versions of Neo4j. See: http://blog.neo4j.org/2010/10/announcing-neo4j-12m01-and-focus-on.html

Let me repeat part of that:

This new index API supports indexing relationships as well as nodes, and also supports indexing and querying multiple properties for each node or relationship.

Will be looking at the details on the indexing, more comments to follow.
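
To make the quoted feature concrete, here is a minimal in-memory sketch of the idea. It is not the Neo4j index API: the `PropertyIndex` class, its methods, and the sample ids and properties are all hypothetical, invented only to show what indexing relationships as well as nodes, and querying on more than one property, might look like.

```python
from collections import defaultdict


class PropertyIndex:
    """Toy index mapping (key, value) pairs to entity ids (hypothetical, not Neo4j)."""

    def __init__(self):
        self._entries = defaultdict(set)

    def add(self, entity_id, key, value):
        # An entity may be indexed under any number of properties.
        self._entries[(key, value)].add(entity_id)

    def get(self, key, value):
        return set(self._entries.get((key, value), set()))

    def query(self, **criteria):
        """Return the entities that match every key/value pair given."""
        hits = [self.get(k, v) for k, v in criteria.items()]
        return set.intersection(*hits) if hits else set()


# One index for nodes, one for relationships: the same structure serves both.
node_index = PropertyIndex()
rel_index = PropertyIndex()

node_index.add("n1", "name", "Patrick")
node_index.add("n1", "topic", "topic maps")
node_index.add("n2", "name", "Neo")
node_index.add("n2", "topic", "graphs")

rel_index.add("r1", "type", "WRITES_ABOUT")
rel_index.add("r1", "since", 2010)

matching_nodes = node_index.query(name="Patrick", topic="topic maps")
matching_rels = rel_index.query(type="WRITES_ABOUT", since=2010)
```

The point of interest for topic maps is the second index: once relationships can be looked up by their own properties, associations stop being second-class citizens in a query.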

December 22, 2010

Speller Challenge

Filed under: Associations,Topic Maps — Patrick Durusau @ 7:57 pm

Speller Challenge

Microsoft Research and Bing are sponsoring a best speller contest!

From the website:

Important Dates

  • January 17th 2011 Registration opens
  • May 27th 2011 Challenge ends at 11:59PM PDT
  • June 17th 2011 Winners Announced
  • July 1st 2011 Camera-ready workshop paper
  • July 2011 Workshop to present results

Goal of contest:

The goal of the Speller Challenge (the “Challenge”) is to build the best speller that proposes the most plausible spelling alternatives for each search query. Spellers are encouraged to take advantage of cloud computing and must be submitted to the Challenge in the form of REST-based Web Services. At the end of the challenge, the entry that you designate as your “primary entry” will be judged according to the evaluation measures described below to determine five (5) winners of the prizes described below.

Variant spellings seem like a natural application for topic maps.

Not by use of variant name, although that might work.

I was thinking more along the lines of associations.

I am curious how to model different sort orders for the spellings of any single term, reasoning that the presentation of spelling choices should vary depending on geographic location or similar data.
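
One possible shape for this, offered as a rough sketch and not as anything the Challenge prescribes: treat each suggested spelling as an association between the typed term and a variant, scoped by locale, with a rank inside that scope. The `SpellingAssociation` type, the locale codes used as scopes, and the sample data are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpellingAssociation:
    term: str     # role: the term as typed
    variant: str  # role: the proposed spelling
    scope: str    # context in which the rank applies (here, a locale)
    rank: int     # lower rank is presented first


# Invented sample data: the same two variants, ordered differently per scope.
ASSOCIATIONS = [
    SpellingAssociation("colr", "color", "en-US", 1),
    SpellingAssociation("colr", "colour", "en-US", 2),
    SpellingAssociation("colr", "colour", "en-GB", 1),
    SpellingAssociation("colr", "color", "en-GB", 2),
]


def suggestions(term, scope, associations=ASSOCIATIONS):
    """Return the variants for a term, ordered by rank within the given scope."""
    matches = [a for a in associations if a.term == term and a.scope == scope]
    return [a.variant for a in sorted(matches, key=lambda a: a.rank)]
```

Here `suggestions("colr", "en-US")` and `suggestions("colr", "en-GB")` return the same variants in opposite orders, which is the behavior I was wondering how to model.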

December 10, 2010

Facets and “Undoable” Merges

After writing Identifying Subjects with Facets, I started thinking about the merge of the subjects matching a set of facets, so that the user could observe all the associations in which the members of that subject participate.

If merger is a matter of presentation to the user, then the user should be able to remove one of the members that make up a subject from the merge, which results in the removal of the associations in which that member participated.

No more or less difficult than the inclusion/exclusion based on the facets, except this time it involves removal on the basis of roles in associations. That is, the playing of a role, being a role, etc., are treated as facets of a subject.

Well, except that an individual member of a collective subject is being manipulated.

This capability would enable a user to control which members of a subject are represented in a merge, and to unravel a merge one member at a time.
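
A minimal sketch of what I have in mind, assuming the merge is only a view over members that stay separate underneath. The `MergedSubject` class, the dictionary layout for associations, and the topic ids are hypothetical.

```python
class MergedSubject:
    """A presentation-level merge: members are kept separate underneath."""

    def __init__(self, members):
        self.members = set(members)
        self.removed = []  # stack of members taken out, so removals can be undone

    def associations(self, all_associations):
        """Associations in which any current member plays a role."""
        return [
            a for a in all_associations
            if self.members & set(a["roles"].values())
        ]

    def remove(self, member):
        if member in self.members:
            self.members.remove(member)
            self.removed.append(member)

    def undo(self):
        if self.removed:
            self.members.add(self.removed.pop())


# Invented sample data: two topics merged into one subject for display.
ALL_ASSOCIATIONS = [
    {"type": "authored", "roles": {"author": "topic-a", "work": "post-1"}},
    {"type": "employed-by", "roles": {"employee": "topic-b", "employer": "org-1"}},
]

subject = MergedSubject(["topic-a", "topic-b"])
before = subject.associations(ALL_ASSOCIATIONS)    # both associations visible
subject.remove("topic-b")
after = subject.associations(ALL_ASSOCIATIONS)     # only the "authored" association
subject.undo()
restored = subject.associations(ALL_ASSOCIATIONS)  # both visible again
```

Because nothing is destroyed when a member is removed, unraveling a merge and putting it back together are the same cheap operation run in opposite directions.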

An effective visual representation of such a capability could be quite stunning.

December 5, 2010

idk (I Don’t Know) – Ontology, Semantic Web – Cablegate

Filed under: Associations,Ontology,Roles,Semantic Web,Subject Identity,Topic Maps — Patrick Durusau @ 4:45 pm

While researching the idk (I Don’t Know) post I ran across the suggestion unknown was not appropriate for an ontology:

Good principles of ontological design state that terms should represent biological entities that actually exist, e.g., functional activities that are catalyzed by enzymes, biological processes that are carried out in cells, specific locations or complexes in cells, etc. To adhere to these principles the Gene Ontology Consortium has removed the terms, biological process unknown ; GO:0000004, molecular function unknown ; GO:0005554 and cellular component unknown ; GO:0008372 from the ontology.

The “unknown” terms violated this principle of sound ontological design because they did not represent actual biological entities but instead represented annotation status. Annotations to “unknown” terms distinguished between genes that were curated when no information was available and genes that were not yet curated (i.e., not annotated). Annotation status is now indicated by annotating to the root nodes, i.e. biological_process ; GO:0008150, molecular_function ; GO:0003674, or cellular_component ; GO:0005575. These annotations continue to signify that a given gene product is expected to have a molecular function, biological process, or cellular component, but that no information was available as of the date of annotation.

Adhering to principles of correct ontology design should allow GO users to take advantage of existing tools and reasoning methods developed by the ontological community. (http://www.geneontology.org/newsletter/archive/200705.shtml, 5 December 2010)

I wonder what the restriction, “…entities that actually exist,” means?

If a leak of documents occurs, a leaker exists, but in a topic map, I would say that was a role, not an individual.

If the unknown person is represented as an annotation to a role, how do I annotate such an annotation with information about the unknown/unidentified leaker?

Since the leaker is unknown, I don’t think we can get that with an ontology, at least not directly.

Suggestions?

PS: A topic map can represent unknown functions, etc., as first class subjects (using topics) for an appropriate use case.
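
As a rough sketch of that PS, with an unidentified role player carried as a topic in its own right: the `Topic` and `Association` classes below are a simplification I made up for illustration, not the Topic Maps Data Model, and the occurrence attached to the unknown leaker is placeholder text, not a claim about any real event.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    topic_id: str
    names: list = field(default_factory=list)
    occurrences: dict = field(default_factory=dict)


@dataclass
class Association:
    assoc_type: str
    roles: dict  # role type -> topic_id of the role player


topics = {
    "cables": Topic("cables", names=["Leaked documents"]),
    # No name yet: the subject exists in the map even though it is unidentified.
    "unknown-leaker": Topic("unknown-leaker"),
}

leak = Association("leak", {"leaker": "unknown-leaker", "leaked": "cables"})

# Information about the unidentified player attaches to its topic, not to the role.
topics["unknown-leaker"].occurrences["note"] = "placeholder: whatever is known so far"


def describe(assoc, topics):
    """List each role with the name of its player, or a marker if unnamed."""
    lines = []
    for role, topic_id in sorted(assoc.roles.items()):
        topic = topics[topic_id]
        label = topic.names[0] if topic.names else "(unidentified)"
        lines.append(f"{role}: {label}")
    return lines
```

If the leaker is later identified, the name goes on the same topic (or the topic merges with one that already exists), and everything recorded against the unknown subject comes along with it.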

November 20, 2010

From Documents To Targets: Geographic References

Filed under: Associations,Geographic Information Retrieval,Ontology,Spatial Index — Patrick Durusau @ 9:18 pm

Exploiting geographic references of documents in a geographical information retrieval system using an ontology-based index

Author(s): Nieves R. Brisaboa, Miguel R. Luaces, Ángeles S. Places and Diego Seco

Keywords: Geographic information retrieval, Spatial index, Textual index, Ontology, System architecture

Abstract:

Both Geographic Information Systems and Information Retrieval have been very active research fields in the last decades. Lately, a new research field called Geographic Information Retrieval has appeared from the intersection of these two fields. The main goal of this field is to define index structures and techniques to efficiently store and retrieve documents using both the text and the geographic references contained within the text. We present in this paper two contributions to this research field. First, we propose a new index structure that combines an inverted index and a spatial index based on an ontology of geographic space. This structure improves the query capabilities of other proposals. Then, we describe the architecture of a system for geographic information retrieval that defines a workflow for the extraction of the geographic references in documents. The architecture also uses the index structure that we propose to solve pure spatial and textual queries as well as hybrid queries that combine both a textual and a spatial component. Furthermore, query expansion can be performed on geographic references because the index structure is based in an ontology.

Obviously relevant to the Afghan War Diary materials.

The authors observe:

…concepts such as the hierarchical nature of geographic space and the topological relationships between the geographic objects must be considered….

Interesting, but topic maps would help with “What defensive or offensive assets do I have in a geographic area?”
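
To illustrate the general idea in the abstract (a text index combined with a hierarchy of places, so that a query on a region also reaches the places inside it), here is a toy sketch. It is not the authors' index structure: plain dictionaries stand in for both the inverted index and the spatial index, and the documents are invented sample data.

```python
from collections import defaultdict

# A tiny containment hierarchy standing in for an ontology of geographic space.
CONTAINS = {
    "Afghanistan": ["Helmand", "Kandahar"],
    "Helmand": ["Lashkar Gah"],
}

# Invented documents, each with text and the places it refers to.
DOCUMENTS = {
    "doc1": {"text": "supply convoy report", "places": ["Lashkar Gah"]},
    "doc2": {"text": "convoy delayed", "places": ["Kandahar"]},
    "doc3": {"text": "weather report", "places": ["Helmand"]},
}


def build_indexes(documents):
    """Build a word -> documents index and a place -> documents index."""
    text_index = defaultdict(set)
    place_index = defaultdict(set)
    for doc_id, doc in documents.items():
        for word in doc["text"].lower().split():
            text_index[word].add(doc_id)
        for place in doc["places"]:
            place_index[place].add(doc_id)
    return text_index, place_index


def expand(place, contains=CONTAINS):
    """Return the place plus everything it contains, transitively."""
    found = [place]
    for child in contains.get(place, []):
        found.extend(expand(child, contains))
    return found


def search(word, place, text_index, place_index):
    """Hybrid query: documents containing the word and located in the place."""
    by_place = set()
    for name in expand(place):
        by_place |= place_index.get(name, set())
    return sorted(text_index.get(word.lower(), set()) & by_place)


TEXT_INDEX, PLACE_INDEX = build_indexes(DOCUMENTS)
```

With this data, `search("report", "Helmand", TEXT_INDEX, PLACE_INDEX)` finds `doc1` even though that document only mentions Lashkar Gah, because the query is expanded down the hierarchy.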

Associations: The Kind They Pay For

Filed under: Associations,Authoring Topic Maps,Data Mining,Data Structures — Patrick Durusau @ 4:56 pm

Fun at a Department Store: Data Mining Meets Switching Theory

Author(s): Anna Bernasconi, Valentina Ciriani, Fabrizio Luccio, Linda Pagli

Keywords: SOP, Implicants, Data Mining, Frequent Itemsets, Blulife

Abstract:

In this paper we introduce new algebraic forms, SOP+ and DSOP+, to represent functions f: {0,1}^n → ℕ, based on arithmetic sums of products. These expressions are a direct generalization of the classical SOP and DSOP forms.

We propose optimal and heuristic algorithms for minimal SOP+ and DSOP+ synthesis. We then show how the DSOP+ form can be exploited for Data Mining applications. In particular we propose a new compact representation for the database of transactions to be used by the LCM algorithms for mining frequent closed itemsets.

A new technique for extracting associations between items present (or absent) in transactions (sales transactions).

Of interest to people with the funds to pay for data mining and topic maps.

Topic maps are useful to bind the mining of such associations to other information systems, such as supply chains.
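
For readers new to this kind of mining, a very small sketch of the underlying notion: how often a pattern of present and absent items shows up across transactions. This is plain counting, not the SOP+/DSOP+ representation or the LCM algorithms from the paper, and the `support` function and the sample transactions are invented for illustration.

```python
# Invented sales transactions, each a set of items bought together.
TRANSACTIONS = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread"},
    {"jam"},
]


def support(transactions, present=(), absent=()):
    """Fraction of transactions with all `present` items and none of the `absent` ones."""
    if not transactions:
        return 0.0
    hits = sum(
        1 for t in transactions
        if set(present) <= t and not (set(absent) & t)
    )
    return hits / len(transactions)
```

Note that `absent` is treated as seriously as `present`: "bought bread but not jam" is just as much a pattern, and just as much something to bind to other information, as "bought bread and butter."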

Questions:

  1. How would you use data mining of transaction associations to guide collection development? (3-5 pages, with citations)
  2. How would you use topic maps with the mining of transaction associations? (3-5 pages, no citations)
  3. How would you bind an absence of data to other information? (3-5 pages, no citations)

Observation: Intelligence agencies recognize the absence of data as an association. Binding that absence to other data is a job for topic maps.

October 6, 2010

The RelFinder user interface: interactive exploration of relationships between objects of interest

Filed under: Associations,Interface Research/Design,RDF,Semantic Web,Software — Patrick Durusau @ 7:00 am

The RelFinder user interface: interactive exploration of relationships between objects of interest

Authors: Steffen Lohmann, Philipp Heim, Timo Stegemann, Jürgen Ziegler

Keywords: dbpedia, decision support, graph visualization, linked data, relationship discovery, relationship web, semantic user interfaces, semantic web, sparql, visual exploration

Abstract:

Being aware of the relationships that exist between objects of interest is crucial in many situations. The RelFinder user interface helps to get an overview: Even large amounts of relationships can be visualized, filtered, and analyzed by the user. Common concepts of knowledge representation are exploited in order to support interactive exploration both on the level of global filters and single relationships. The RelFinder is easy-to-use and works on every RDF knowledge base that provides standardized SPARQL access.

Software: RelFinder

RelFinder presents a way to leverage data already in RDF for the creation of associations in topic maps.

Or to explore data already available in RDF.

Exploration of relationships is important for “data” but even more important for the syntaxes that contain data.

One example is equivalence between subjects represented by syntax tokens.
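
For a feel of what relationship discovery between two objects involves, here is a small sketch that searches an in-memory list of triples for a chain connecting two subjects. RelFinder itself works against SPARQL endpoints; this breadth-first search over Python tuples is only a stand-in I wrote for illustration, and the `ex:` identifiers are made up.

```python
from collections import deque

# Invented triples: (subject, predicate, object).
TRIPLES = [
    ("ex:Alice", "ex:worksFor", "ex:Acme"),
    ("ex:Bob", "ex:worksFor", "ex:Acme"),
    ("ex:Bob", "ex:knows", "ex:Carol"),
]


def find_path(triples, start, goal):
    """Breadth-first search for a shortest chain of triples linking start to goal."""
    # Triples are followed in both directions, since a relationship connects both ends.
    neighbours = {}
    for s, p, o in triples:
        neighbours.setdefault(s, []).append((o, (s, p, o)))
        neighbours.setdefault(o, []).append((s, (s, p, o)))
    queue = deque([(start, [])])
    seen = {start}
    while queue:
        node, path = queue.popleft()
        if node == goal:
            return path
        for nxt, triple in neighbours.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, path + [triple]))
    return None  # no connection found
```

Each triple on the returned path is a candidate association for a topic map, with the subject and object as role players and the predicate suggesting the association type.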
