Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 6, 2015

Which Functor Do You Mean?

Filed under: Homonymous,Names,Subject Identity — Patrick Durusau @ 8:34 pm

Peteris Krumins calls attention to the classic confusion of names that topic maps address in On Functors.

From the post:

It’s interesting how the term “functor” means completely different things in various programming languages. Take C++ for example. Everyone who has mastered C++ knows that you call a class that implements operator() a functor. Now take Standard ML. In ML functors are mappings from structures to structures. Now Haskell. In Haskell functors are just homomorphisms over containers. And in Prolog functor means the atom at the start of a structure. They all are different. Let’s take a closer look at each one.

Peteris has said twice in the first paragraph that each of these “functors” is different. Don’t rush to his 2010 post to point out that they are different. That was the point of the post. Yes?

Exercise: All of these uses of functor could be scoped by language. What properties of each “functor” would you use to distinguish them besides their language of origin?
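As a starting point for the exercise, scoping by language can be pictured as a lookup keyed on the pair (name, scope) rather than on the name alone. This is only a rough sketch: the `SCOPED_NAMES` table and the `resolve` helper are hypothetical names, the descriptions are paraphrased from the quoted post, and none of it is a topic map API.

```python
# Hypothetical table: one name, "functor", scoped by programming language.
# The descriptions paraphrase the quoted post.
SCOPED_NAMES = {
    ("functor", "C++"): "class that implements operator()",
    ("functor", "Standard ML"): "mapping from structures to structures",
    ("functor", "Haskell"): "homomorphism over containers",
    ("functor", "Prolog"): "atom at the start of a structure",
}


def resolve(name, scope=None):
    """Return the descriptions that match a name, optionally narrowed by scope."""
    return [
        description
        for (known_name, known_scope), description in SCOPED_NAMES.items()
        if known_name == name and (scope is None or known_scope == scope)
    ]
```

Calling `resolve("functor")` returns all four senses, while `resolve("functor", "Prolog")` returns only one. The exercise asks what you would key on if the language were not available.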

May 2, 2015

Homonyms on EOL

Filed under: Homonymous — Patrick Durusau @ 8:50 pm

Homonyms on EOL [Encyclopedia of Life]

From the webpage:

Please join the Homonym Hunters community and help us find all the homonyms on EOL!

This collection is for all kinds of homonyms:

Cross-code homonyms

Homonyms across nomenclatural codes (ICBN, ICZN, ICNB, ICTV) are allowed, so there are plenty of them. Example: Satyrium, the orchid genus and Satyrium, the butterfly genus.

Cross-rank homonyms

At least in zoological nomenclature, homonyms are allowed if they refer to groups at different ranks. Example: Polyphaga, the roach genus and Polyphaga, the beetle suborder.

Invalid homonyms

Within codes and ranks, homonyms are not allowed, so only one of the homonymous names can be valid/accepted. If EOL gets these invalid names from a provider, we will have a page for it. Example: Acanthurus, the surgeon fish genus and Acanthurus, the weevil genus.

Comprehensive lists of homonyms have also been compiled elsewhere:

Systema Naturae 2000: Homonyms

Wikispecies: List of valid homonyms

In topic map parlance, the identification of homonyms across nomenclatural codes and across different ranks translates into setting the scope on a homonym.

That helps both people and machines in distinguishing homonyms.
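A minimal sketch of what that could look like for a machine, assuming each name record carries its nomenclatural code and rank as scope. The `TaxonName` record, the `group_by_scoped_name` helper, and the "second provider" entry are hypothetical; the other examples come from the EOL page quoted above.

```python
from collections import defaultdict, namedtuple

# Hypothetical record shape: a name plus the scope in which it is used.
TaxonName = namedtuple("TaxonName", ["name", "code", "rank", "note"])

RECORDS = [
    TaxonName("Satyrium", "ICBN", "genus", "orchid"),
    TaxonName("Satyrium", "ICZN", "genus", "butterfly"),
    TaxonName("Polyphaga", "ICZN", "genus", "roach"),
    TaxonName("Polyphaga", "ICZN", "suborder", "beetle"),
    # Invented duplicate, standing in for the same genus from another provider.
    TaxonName("Satyrium", "ICZN", "genus", "butterfly (second provider)"),
]


def group_by_scoped_name(records):
    """Group records that share name, code, and rank; these are merge candidates."""
    groups = defaultdict(list)
    for record in records:
        groups[(record.name, record.code, record.rank)].append(record.note)
    return dict(groups)
```

Grouping on the bare name would lump the orchid with the butterfly. Grouping on name plus scope keeps them apart and still brings the two butterfly records together. Note the limit of this key: invalid homonyms such as Acanthurus share both code and rank, so scope alone would not separate them and some further property would be needed.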

Setting the scope also helps merge homonyms correctly. For example, Aaron Black tweeted:

[Image: kirstie-alley]

As seen in the Washington Post.

Close to being a homonym anyway. 😉 I could distinguish Kirstie Alley from any possible Christie ally, even on a bad day. Our machines, not so much.

HT: Sam Hunting for the tweet.

February 27, 2013

URL Homonym Problem: A Topic Map Solution

Filed under: Homonymous,HTML5,W3C — Patrick Durusau @ 5:34 pm

You may have heard about the URL homonym problem.

The term “URL” is spelled and pronounced the same way but can mean:

URL as defined by Uniform Resource Identifier (URI): Generic Syntax, RFC 3986, or

URL as defined by HTML5 (Draft, December 17, 2012)

To refresh your memory:

URL in RFC 3986 is defined as:

The term “Uniform Resource Locator” (URL) refers to the subset of URIs that, in addition to identifying a resource, provide a means of locating the resource by describing its primary access mechanism (e.g., its network “location”).

A URL in RFC 3986 is a subtype of URI.

URL in HTML5 is defined as:

A URL is a string used to identify a resource.

A URL in HTML5 is a supertype of URI and IRI.

I would say that going from being a subtype of URI to being a supertype of URI + IRI is a “…willful violation of RFC 3986….”

In LTM syntax, I would solve the URL homonym problem as follows:

#VERSION "1.3"

/* association types */

[supertype-subtype = "Supertype-subtype";
@"http://psi.topicmaps.org/iso13250/model/supertype-subtype"]

[supertype = "Supertype";
@"http://psi.topicmaps.org/iso13250/model/supertype"]

[subtype = "Subtype";
@"http://psi.topicmaps.org/iso13250/model/subtype"]

/* topics */

[uri = "URI";
@"http://tools.ietf.org/html/rfc3986#1.1"]

[url-rfc3986 = "URL";;"URL-RFC 3986"
@"http://tools.ietf.org/html/rfc3986#1.1.3"]

supertype-subtype(uri : supertype,url-rfc3986 : subtype)

[url-html5 = "URL";;"URL-HTML5"
@"http://www.w3.org/TR/2012/CR-html5-20121217/infrastructure.html#urls"]

supertype-subtype(url-html5 : supertype,uri : subtype)

This is a solution to the URL homonym problem only in the sense of distinguishing which definition is in use.
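For readers who do not read LTM, the same idea can be sketched in Python: two topics share the base name "URL" but carry different subject identifiers, and the supertype-subtype associations point in opposite directions relative to URI. The dictionary layout and the helper names `same_subject` and `supertypes_of` are hypothetical; the identifiers are copied from the LTM above.

```python
# Topics keyed by the ids used in the LTM above; the dict layout is an assumption.
TOPICS = {
    "uri": {
        "name": "URI",
        "identifier": "http://tools.ietf.org/html/rfc3986#1.1",
    },
    "url-rfc3986": {
        "name": "URL",
        "identifier": "http://tools.ietf.org/html/rfc3986#1.1.3",
    },
    "url-html5": {
        "name": "URL",
        "identifier": "http://www.w3.org/TR/2012/CR-html5-20121217/infrastructure.html#urls",
    },
}

# (supertype, subtype) pairs, mirroring the two associations in the LTM.
SUPERTYPE_SUBTYPE = [("uri", "url-rfc3986"), ("url-html5", "uri")]


def same_subject(first_id, second_id):
    """Compare topics by subject identifier, never by name."""
    return TOPICS[first_id]["identifier"] == TOPICS[second_id]["identifier"]


def supertypes_of(topic_id):
    """Return the direct supertypes recorded for a topic."""
    return [sup for sup, sub in SUPERTYPE_SUBTYPE if sub == topic_id]
```

The two "URL" topics have equal names, yet `same_subject("url-rfc3986", "url-html5")` is false, so they would not merge. `supertypes_of` shows the reversal: URI sits above the RFC 3986 URL and below the HTML5 URL.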

March 21, 2011

Homonymous Authors

Filed under: Homonymous,Indexing — Patrick Durusau @ 8:53 am

A method for eliminating articles by homonymous authors from the large number of articles retrieved by author search.

Onodera, Natsuo, Mariko Iwasawa, Nobuyuki Midorikawa, Fuyuki Yoshikane, Kou Amano, Yutaka Ootani, Tadashi Kodama, Yasuhiko Kiyama, Hiroyuki Tsunoda, and Shizuka Yamazaki. 2011. “A method for eliminating articles by homonymous authors from the large number of articles retrieved by author search.” Journal of the American Society for Information Science & Technology 62, no. 4: 677-690.

Abstract:

This paper proposes a methodology which discriminates the articles by the target authors (‘true’ articles) from those by other homonymous authors (‘false’ articles). Author name searches for 2,595 ‘source’ authors in six subject fields retrieved about 629,000 articles. In order to extract true articles from the large amount of the retrieved articles, including many false ones, two filtering stages were applied. At the first stage any retrieved article was eliminated as false if either its affiliation addresses had little similarity to those of its source article or there was no citation relationship between the journal of the retrieved article and that of its source article. At the second stage, a sample of retrieved articles was subjected to manual judgment, and utilizing the judgment results, discrimination functions based on logistic regression were defined. These discrimination functions demonstrated both the recall ratio and the precision of about 95% and the accuracy (correct answer ratio) of 90-95%. Existence of common coauthor(s), address similarity, title words similarity, and interjournal citation relationships between the retrieved and source articles were found to be the effective discrimination predictors. Whether or not the source author was from a specific country was also one of the important predictors. Furthermore, it was shown that a retrieved article is almost certainly true if it was cited by, or cocited with, its source article. The method proposed in this study would be effective when dealing with a large number of articles whose subject fields and affiliation addresses vary widely.

Interesting study of heuristics that may be of assistance in creating topic maps from academic literature.

I suspect there are other “patterns” as it were in other forms of information that await discovery.
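The first filtering stage described in the abstract can be sketched roughly as follows. This is not the authors' implementation. The Jaccard token overlap, the 0.2 threshold, the record fields, and the `journal_links` set of journal pairs are all assumptions made for illustration, since the abstract does not specify how similarity or citation relationships were computed.

```python
def address_similarity(first, second):
    """Jaccard overlap of lowercase word tokens (an assumed stand-in measure)."""
    tokens_a = set(first.lower().split())
    tokens_b = set(second.lower().split())
    if not tokens_a or not tokens_b:
        return 0.0
    return len(tokens_a & tokens_b) / len(tokens_a | tokens_b)


def passes_first_stage(retrieved, source, journal_links, min_similarity=0.2):
    """Keep an article only if its address is similar to the source article's
    AND the two journals have a citation relationship.

    journal_links is a hypothetical set of frozenset({journal_a, journal_b})
    pairs; citation direction is ignored in this sketch.
    """
    similar = (
        address_similarity(retrieved["address"], source["address"])
        >= min_similarity
    )
    linked = frozenset((retrieved["journal"], source["journal"])) in journal_links
    return similar and linked
```

An article is eliminated as "false" when either test fails, which matches the "either ... or" wording of the abstract. The second stage, logistic regression over manually judged samples, is not sketched here.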
