Archive for the ‘Fuzzy Sets’ Category

A survey of fuzzy web mining

Thursday, April 18th, 2013

A survey of fuzzy web mining by Chun-Wei Lin and Tzung-Pei Hong. (Lin, C.-W. and Hong, T.-P. (2013), A survey of fuzzy web mining. WIREs Data Mining Knowl Discov, 3: 190–199. doi: 10.1002/widm.1091)

Abstract:

The Internet has become an unlimited resource of knowledge, and is thus widely used in many applications. Web mining plays an important role in discovering such knowledge. This mining can be roughly divided into three categories, including Web usage mining, Web content mining, and Web structure mining. Data and knowledge on the Web may, however, consist of imprecise, incomplete, and uncertain data. Because fuzzy-set theory is often used to handle such data, several fuzzy Web-mining techniques have been proposed to reveal fuzzy and linguistic knowledge. This paper reviews these techniques according to the three Web-mining categories above—fuzzy Web usage mining, fuzzy Web content mining, and fuzzy Web structure mining. Some representative approaches in each category are introduced and compared.

Written to cover fuzzy web mining but generally useful for data mining and organization as well.

Fuzzy techniques are probably closer to our mental processes than the precision of description logic.

It is worth being mindful that mathematical and logical proofs are justifications for conclusions we already hold.

They are not the paths by which we arrived at those conclusions.

Learning Fuzzy β-Certain and β-Possible rules…

Wednesday, April 18th, 2012

Learning Fuzzy β-Certain and β-Possible rules from incomplete quantitative data by rough sets by Ali Soltan Mohammadi, L. Asadzadeh, and D. D. Rezaee.

Abstract:

The rough-set theory proposed by Pawlak has been widely used in dealing with data classification problems. The original rough-set model is, however, quite sensitive to noisy data. The variable precision rough-set model was thus proposed, allowing a predefined tolerance degree of uncertainty and misclassification in the mining process. This paper deals with the problem of producing a set of fuzzy certain and fuzzy possible rules from incomplete quantitative data with a predefined tolerance degree of uncertainty and misclassification. A new method, which combines the variable precision rough-set model and fuzzy-set theory for incomplete quantitative data, is thus proposed to solve this problem. It first transforms each quantitative value into a fuzzy set of linguistic terms using membership functions, and then calculates the fuzzy β-lower and the fuzzy β-upper approximations. The certain and possible rules are then generated based on these fuzzy approximations. These rules can then be used to classify unknown objects.

In part interesting because of its full use of sample data to illustrate the process being advocated.
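To get a feel for the mechanics, here is a minimal sketch of the two core steps: fuzzification of quantitative values via membership functions, then a β-threshold test separating certain from possible rules. The data, the triangular term definitions, and the simplified β test are my own illustrations, not the paper's actual definitions.

```python
def triangular(x, a, b, c):
    """Triangular membership function rising a->b, falling b->c."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Step 1: transform a quantitative value into a fuzzy set of linguistic terms.
TERMS = {
    "low":    (0, 0, 50),
    "middle": (25, 50, 75),
    "high":   (50, 100, 100),
}

def fuzzify(x):
    return {t: triangular(x, *abc) for t, abc in TERMS.items()}

# Step 2: a simplified beta test over a small table of (value, class)
# examples.  A term supports a *certain* rule for a class if the
# membership-weighted proportion of matching objects in that class is at
# least beta, and a *possible* rule if it exceeds 1 - beta.
def beta_rules(examples, term, cls, beta=0.75):
    matching = [(fuzzify(x)[term], c) for x, c in examples]
    matching = [(m, c) for m, c in matching if m > 0]
    weight_all = sum(m for m, _ in matching)
    weight_cls = sum(m for m, c in matching if c == cls)
    ratio = weight_cls / weight_all if weight_all else 0.0
    return ratio >= beta, ratio > 1 - beta   # (certain?, possible?)

examples = [(10, "A"), (20, "A"), (30, "A"), (40, "B")]
certain, possible = beta_rules(examples, "low", "A", beta=0.75)
```

With this toy data, "low implies A" comes out as both a certain and a possible rule, since the class-A weight dominates the "low" term.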

Unless smooth sets are encountered in data by some mischance, rough sets will remain a mainstay of data mining for the foreseeable future.

We Are Not Alone!

Thursday, October 20th, 2011

While following some references I ran across: A proposal for transformation of topic-maps into similarities of topics (pdf) by Dr. Dominik Kuropka.

Abstract:

Newer information filtering and retrieval models like the Fuzzy Set Model or the Topic-based Vector Space Model consider term dependencies by means of numerical similarities between two terms. This leads to the question from what and how these numerical values can be deduced? This paper proposes an algorithm for the transformation of topic-maps into numerical similarities of paired topics. Further the relation of this work towards the above named information filtering and retrieval models is discussed.
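Kuropka's actual transformation is in the paper; purely to make the idea concrete, here is a naive stand-in that derives numerical similarities for paired topics from association-path length in a map. The decay factor and the tiny example map are assumptions for illustration only.

```python
from collections import deque

def pairwise_similarity(associations, decay=0.5):
    """Assign each topic pair decay**(shortest association-path length),
    found via breadth-first search over the association graph."""
    graph = {}
    for a, b in associations:
        graph.setdefault(a, set()).add(b)
        graph.setdefault(b, set()).add(a)
    sims = {}
    for start in graph:
        dist = {start: 0}
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for nxt in graph[node]:
                if nxt not in dist:
                    dist[nxt] = dist[node] + 1
                    queue.append(nxt)
        for topic, d in dist.items():
            if topic != start:
                sims[frozenset((start, topic))] = decay ** d
    return sims

sims = pairwise_similarity([("jazz", "music"), ("music", "opera")])
# sims[frozenset(("jazz", "opera"))] -> 0.25 (two hops at decay 0.5)
```

Directly associated topics get similarity 0.5, topics two associations apart get 0.25, and so on; any monotone decay would do for the general idea.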

Based in part on his paper Topic-Based Vector Space (2003).

This usage differs from ours in part because the work is designed to operate at the document level in a traditional IR-type context. “Topic maps,” in the ISO sense, are not limited to retrieval of documents or comparison by a particular method, however useful that method may be.

Still, it is good to get to know one’s neighbours so I will be sending him a note about our efforts.

North American Fuzzy Information Processing Society (NAFIPS)

Wednesday, October 19th, 2011

North American Fuzzy Information Processing Society (NAFIPS)

From the website:

As the premier fuzzy society in North America established in 1981, our purpose is to help guide and encourage the development of fuzzy sets and related technologies for the benefit of mankind. In this role, we understand the importance of, and the need for, developing a strong intellectual basis and encouraging new and innovative applications. In addition, we acknowledge our leadership role to foster interaction and technology transfer to other national and international organizations to bring the benefits of this technology to North America and the world.

Links, pointers to software, journals, etc.

NAFIPS 2012 : North American Fuzzy Information Processing Society

Wednesday, October 19th, 2011

NAFIPS 2012 : North American Fuzzy Information Processing Society

Dates:

When Aug 6, 2012 – Aug 8, 2012
Where Berkeley, CA
Submission Deadline Jan 29, 2012
Notification Due Mar 11, 2012
Final Version Due Apr 15, 2012

From the announcement:

Aims and Scope

NAFIPS 2012 aims to bring together researchers, engineers and practitioners to present the latest achievements and innovations in the area of fuzzy information processing, to discuss thought-provoking developments and challenges, to consider potential future directions.

Topics

The topics cover all aspects of fuzzy systems and their applications including, but not limited to:

  • fuzzy sets and fuzzy logic
  • mathematical foundations of fuzzy sets and fuzzy systems
  • approximate reasoning, fuzzy inference models, and soft computing
  • fuzzy decision analysis, decision making, optimization, and design
  • fuzzy system architectures and hardware
  • fuzzy methods in data analysis, statistics and imprecise probability
  • fuzzy databases and information retrieval
  • fuzzy pattern recognition and image processing
  • fuzzy sets in management science
  • fuzzy control and robotics
  • possibility theory
  • fuzzy sets and logic in ontology, web, and social networks
  • fuzzy preference modelling
  • fuzzy sets in operations research and manufacturing
  • fuzzy database mining and financial forecasting
  • fuzzy neural networks
  • evolutionary and hybrid systems
  • intelligent agents and ambient intelligence
  • learning, adaptive, and evolvable fuzzy systems

Subjects, Identifiers, IRIs, Crisp Sets

Sunday, September 19th, 2010

I was reading Fuzzy Sets, Uncertainty, and Information by George J. Klir and Tina A. Folger when it occurred to me that the use of IRIs as identifiers for subjects is, by definition, a “crisp set.”

Klir and Folger observe:

The crisp set is defined in such a way as to dichotomize the individuals in some given universe of discourse into two groups: members (those that certainly belong in the set) and nonmembers (those that certainly do not). A sharp, unambiguous distinction exists between the members and nonmembers of the class or category represented by the crisp set. (p. 3)

A subject can be assigned an IRI as an identifier, based on some set of properties.

That assignment and use as an identifier makes identification a crisp set operation.

It eliminates fuzzy, rough, soft, and other non-crisp set operations, as well as other means of identification.
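The contrast can be stated in a few lines of code (the function names and example IRI are illustrative, not from any standard):

```python
# Crisp identification: an IRI either matches or it does not.
def crisp_member(iri, assigned_iris):
    """Characteristic function of a crisp set: exactly 0 or 1."""
    return 1.0 if iri in assigned_iris else 0.0

# Fuzzy identification: membership may take any degree in [0, 1].
def fuzzy_member(degree):
    """Clamp a graded membership degree into [0, 1]."""
    return max(0.0, min(1.0, degree))

assigned = {"http://example.org/subject/tomato"}
crisp = crisp_member("http://example.org/subject/tomato", assigned)   # 1.0
graded = fuzzy_member(0.7)                                            # 0.7
```

The crisp function can never return 0.7; that is the dichotomy Klir and Folger describe.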

What formal characteristics of crisp sets are useful for topic maps?

Are those characteristics useful for topic map design, authoring or both?

Extra credit: Any set software you would suggest to test your answers?

“Linguistic terms do not hold exact meaning….”

Sunday, September 5th, 2010

In some background research I ran across:

One of the most important applications of fuzzy set theory is the concept of linguistic variables. A linguistic variable is a variable whose values are not numbers, but words or sentences in a natural or artificial language. The value of a linguistic variable is defined as an element of its term set: a predefined set of appropriate linguistic terms. Linguistic terms are essentially subjective categories for a linguistic variable.

Linguistic terms do not hold exact meaning, however, and may be understood differently by different people. The boundaries of a given term are rather subjective, and may also depend on the situation. Linguistic terms therefore cannot be expressed by ordinary set theory; rather, each linguistic term is associated with a fuzzy set. (“Soft sets and soft groups,” by Hacı Aktaş and Naim Çağman, Information Sciences, Volume 177, Issue 13, 1 July 2007, Pages 2726-2735)
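The quoted definition is easy to make concrete. Below, a linguistic variable “age” has a term set of three linguistic terms, each tied to a fuzzy set; the trapezoidal membership functions and their breakpoints are my own illustration, not taken from the cited paper.

```python
def trapezoid(x, a, b, c, d):
    """Trapezoidal membership: rises a->b, flat b->c, falls c->d."""
    if x <= a or x >= d:
        return 0.0
    if x < b:
        return (x - a) / (b - a)
    if x <= c:
        return 1.0
    return (d - x) / (d - c)

# Term set of the linguistic variable "age".
AGE_TERMS = {
    "young":       lambda x: trapezoid(x, -1, 0, 25, 40),
    "middle-aged": lambda x: trapezoid(x, 30, 40, 55, 65),
    "old":         lambda x: trapezoid(x, 55, 70, 120, 121),
}

def describe(age):
    """Value of the linguistic variable: each term with its degree."""
    return {term: mf(age) for term, mf in AGE_TERMS.items()}

# describe(35) -> "young" ~0.33, "middle-aged" 0.5, "old" 0.0
```

An age of 35 belongs partly to “young” and partly to “middle-aged,” which is exactly the graded, overlapping-boundary behavior the quote says ordinary set theory cannot express.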

Fuzzy sets are yet another useful approach that has recognized linguistic uncertainty as an issue and developed mechanisms to address it.

What is “linguistic uncertainty” if it isn’t a question of “subject identity?”

Fuzzy sets have developed another way to answer questions about subject identity.

As topic maps mature I want to see the development of equivalences between approaches to subject identity.

Imagine a topic map system consisting of a medical scanning system that identifies “subjects” in cultures using rough sets, with equivalences to “subjects” identified in published literature using fuzzy sets, refined by “subjects” from user contributions and interactions using PSIs or other mechanisms, past, present, or future.

“I say toh-mah-toh, you say toh-may-toh”

Friday, July 9th, 2010

I thought Rough Fuzzies, and Beyond? was a cute title.

But just scratching the surface of the rough sets literature, I find:

  • generalized fuzzy belief functions
  • generalized fuzzy rough approximation operators
  • fuzzy coverings
  • granular computing
  • training fuzzy systems
  • fuzzy generalization of rough sets
  • generalized fuzzy rough sets
  • fuzzy concept lattices
  • fuzzy implication operators
  • intuitionistic fuzzy implicators

How many of those would you think to search for?

These are the same semantic issues topic maps are designed to help resolve. But resolving them means someone (err, that would be those of us interested in the area) has to undertake the real work of resolving them.

The obvious answer is some robust system that allows tweets, instant messages, email (properly formatted), as well as updating protocols to update a topic map in real time. That is also an unlikely solution.

Suggestion:

What about an easier-to-reach solution? Lutz Maicher’s bibmap is a likely starting point.

We would have to ask Lutz about merging in additional data but I suspect he would be amenable to the suggestion.

Building a robust bibliography of topic map relevant materials would occupy us while waiting on more futuristic solutions.

Rough Fuzzies, and Beyond?

Friday, July 2nd, 2010

I was reading Rough Sets: Theoretical Aspects of Reasoning about Data by Zdzislaw Pawlak when I ran across this comparison of rough versus fuzzy sets:

Rough sets has often been compared to fuzzy sets, sometimes with a view to introduce them as competing models of imperfect knowledge. Such a comparison is unfounded. Indiscernibility and vagueness are distinct facets of imperfect knowledge. Indiscernibility refers to the granularity of knowledge, that affects the definition of universes of discourse. Vagueness is due to the fact that categories of natural language are often gradual notions, and refer to sets with smooth boundaries. Borrowing an example from image processing, rough set theory is about the size of pixels, fuzzy set theory is about the existence of more than two levels of grey. (pp. ix-x)
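The pixel analogy translates directly into code: the partition into indiscernibility classes fixes the “pixel size,” and a target set is approximated from inside (blocks wholly contained) and outside (blocks that touch it). A minimal sketch with made-up objects:

```python
def approximations(partition, target):
    """Pawlak-style lower and upper approximations of `target`
    relative to a partition into indiscernibility classes."""
    lower, upper = set(), set()
    for block in partition:          # each block = one "pixel" of knowledge
        if block <= target:          # block entirely inside the target
            lower |= block
        if block & target:           # block touches the target at all
            upper |= block
    return lower, upper

# Coarse "pixels": objects 1-8 grouped into indiscernible pairs.
partition = [{1, 2}, {3, 4}, {5, 6}, {7, 8}]
target = {2, 3, 4, 5}

lower, upper = approximations(partition, target)
# lower == {3, 4}; upper == {1, 2, 3, 4, 5, 6}; boundary = upper - lower
```

The boundary region (upper minus lower) is exactly the part of the target the granularity of knowledge cannot resolve; a finer partition shrinks it, just as smaller pixels sharpen an image.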

It occurred to me that the precision of our identifications, or perhaps better, the fixed precision of our identifications, is a real barrier to semantic integration. The precision I need for semantic integration varies from subject to subject, depending upon what I already know, what I need to know, and for what purpose. Very coarse identification may be acceptable for some purposes but not others.

I don’t know what it would look like to have varying degrees of precision to subject identification or even how that would be represented. But, I suspect solving those problems will be involved in any successful approach to semantic integration.