Archive for the ‘Fuzzy Matching’ Category

A survey of fuzzy web mining

Thursday, April 18th, 2013

A survey of fuzzy web mining by Chun-Wei Lin and Tzung-Pei Hong. (Lin, C.-W. and Hong, T.-P. (2013), A survey of fuzzy web mining. WIREs Data Mining Knowl Discov, 3: 190–199. doi: 10.1002/widm.1091)


The Internet has become an unlimited resource of knowledge, and is thus widely used in many applications. Web mining plays an important role in discovering such knowledge. This mining can be roughly divided into three categories, including Web usage mining, Web content mining, and Web structure mining. Data and knowledge on the Web may, however, consist of imprecise, incomplete, and uncertain data. Because fuzzy-set theory is often used to handle such data, several fuzzy Web-mining techniques have been proposed to reveal fuzzy and linguistic knowledge. This paper reviews these techniques according to the three Web-mining categories above—fuzzy Web usage mining, fuzzy Web content mining, and fuzzy Web structure mining. Some representative approaches in each category are introduced and compared.

The survey is written to cover fuzzy web mining, but it is generally useful for data mining and organization as well.

Fuzzy techniques are probably closer to our mental processes than the precision of description logic.

Bear in mind that mathematical and logical proofs are justifications for conclusions we already hold.

They are not the paths by which we arrived at those conclusions.

Learning Fuzzy β-Certain and β-Possible rules…

Wednesday, April 18th, 2012

Learning Fuzzy β-Certain and β-Possible rules from incomplete quantitative data by rough sets by Ali Soltan Mohammadi, L. Asadzadeh, and D. D. Rezaee.


The rough-set theory proposed by Pawlak, has been widely used in dealing with data classification problems. The original rough-set model is, however, quite sensitive to noisy data. Tzung thus proposed deals with the problem of producing a set of fuzzy certain and fuzzy possible rules from quantitative data with a predefined tolerance degree of uncertainty and misclassification. This model allowed, which combines the variable precision rough-set model and the fuzzy set theory, is thus proposed to solve this problem. This paper thus deals with the problem of producing a set of fuzzy certain and fuzzy possible rules from incomplete quantitative data with a predefined tolerance degree of uncertainty and misclassification. A new method, incomplete quantitative data for rough-set model and the fuzzy set theory, is thus proposed to solve this problem. It first transforms each quantitative value into a fuzzy set of linguistic terms using membership functions and then finding incomplete quantitative data with lower and the fuzzy upper approximations. It second calculates the fuzzy β-lower and the fuzzy β-upper approximations. The certain and possible rules are then generated based on these fuzzy approximations. These rules can then be used to classify unknown objects.

In part interesting because of its full use of sample data to illustrate the process being advocated.

Unless smooth sets in data are encountered by some mischance, rough sets will remain a mainstay of data mining for the foreseeable future.
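
To make the two steps named in the abstract more concrete, here is a minimal Python sketch. It is not the authors' algorithm: the fuzzification uses plain triangular membership functions with made-up breakpoints, and the β-approximations follow the crisp variable precision rough-set model (a class joins the β-lower approximation when at least 1 − β of its members are in the target set, and the β-upper approximation when more than β are). The names `triangular`, `fuzzify` and `beta_approximations` are hypothetical, and incomplete (missing) values are not handled.

```python
from collections import defaultdict


def triangular(x, a, b, c):
    """Membership of x in a triangular fuzzy set with feet a, c and peak b (a < b < c)."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


def fuzzify(x, terms):
    """Map a quantitative value to membership degrees for each linguistic term."""
    return {name: triangular(x, *abc) for name, abc in terms.items()}


def beta_approximations(objects, target, beta):
    """Crisp variable-precision approximations of `target` (illustration only).

    objects: dict mapping object id -> tuple of condition attribute values
    target:  set of object ids belonging to the class of interest
    beta:    admissible misclassification rate, 0 <= beta < 0.5
    """
    # Objects with identical attribute values are indiscernible.
    classes = defaultdict(set)
    for obj_id, attrs in objects.items():
        classes[attrs].add(obj_id)

    lower, upper = set(), set()
    for members in classes.values():
        inclusion = len(members & target) / len(members)
        if inclusion >= 1 - beta:   # almost entirely inside the target
            lower |= members
        if inclusion > beta:        # overlaps the target more than the tolerance
            upper |= members
    return lower, upper


# Assumed breakpoints for three linguistic terms on a 0-100 scale.
TERMS = {"low": (0, 25, 50), "middle": (25, 50, 75), "high": (50, 75, 100)}

if __name__ == "__main__":
    print(fuzzify(40, TERMS))
    objects = {1: ("low",), 2: ("low",), 3: ("low",), 4: ("low",), 5: ("low",),
               6: ("high",), 7: ("high",), 8: ("middle",), 9: ("middle",)}
    print(beta_approximations(objects, {1, 2, 3, 4, 6}, beta=0.25))
```

Certain rules would be read off the β-lower approximation and possible rules off the β-upper approximation. With β = 0 the sketch reduces to Pawlak's original approximations, which is where the sensitivity to noisy data shows up: a single stray object is enough to keep a class out of the lower approximation.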

North American Fuzzy Information Processing Society (NAFIPS)

Wednesday, October 19th, 2011

North American Fuzzy Information Processing Society (NAFIPS)

From the website:

As the premier fuzzy society in North America established in 1981, our purpose is to help guide and encourage the development of fuzzy sets and related technologies for the benefit of mankind. In this role, we understand the importance of, and the need for, developing a strong intellectual basis and encouraging new and innovative applications. In addition, we acknowledge our leadership role to foster interaction and technology transfer to other national and international organizations to bring the benefits of this technology to North America and the world.

Links, pointers to software, journals, etc.

NAFIPS 2012 : North American Fuzzy Information Processing Society

Wednesday, October 19th, 2011

NAFIPS 2012 : North American Fuzzy Information Processing Society


When: Aug 6, 2012 – Aug 8, 2012
Where: Berkeley, CA
Submission Deadline: Jan 29, 2012
Notification Due: Mar 11, 2012
Final Version Due: Apr 15, 2012

From the announcement:

Aims and Scope

NAFIPS 2012 aims to bring together researchers, engineers and practitioners to present the latest achievements and innovations in the area of fuzzy information processing, to discuss thought-provoking developments and challenges, to consider potential future directions.


The topics cover all aspects of fuzzy systems and their applications including, but not limited to:

  • fuzzy sets and fuzzy logic
  • mathematical foundations of fuzzy sets and fuzzy systems
  • approximate reasoning, fuzzy inference models, and soft computing
  • fuzzy decision analysis, decision making, optimization, and design
  • fuzzy system architectures and hardware
  • fuzzy methods in data analysis, statistics and imprecise probability
  • fuzzy databases and information retrieval
  • fuzzy pattern recognition and image processing
  • fuzzy sets in management science
  • fuzzy control and robotics
  • possibility theory
  • fuzzy sets and logic in ontology, web, and social networks
  • fuzzy preference modelling
  • fuzzy sets in operations research and manufacturing
  • fuzzy database mining and financial forecasting
  • fuzzy neural networks
  • evolutionary and hybrid systems
  • intelligent agents and ambient intelligence
  • learning, adaptive, and evolvable fuzzy systems

Introducing CorporateGroupings

Monday, September 19th, 2011

Introducing CorporateGroupings: where fuzzy concepts meet legal entities

From the webpage:

One of the key issues when you’re looking at any big company is what are the constituent parts – because these days a company of any size is pretty much never a single legal entity, but a web of companies, often spanning multiple jurisdictions.

Sometimes this is done because the company’s operations are in different territories, sometimes because the company is a conglomerate of different companies – an educational book publisher and a financial newspaper, for example. Sometimes it’s done to limit the company’s tax liability, or for other legal reasons (e.g. to benefit from a jurisdiction’s rules & regulations compared with the ‘parent’ company’s jurisdiction).

Whatever the reason, getting a handle on the constituent parts is pretty tricky, whether you’re a journalist, a campaigner, a government tax official or a competitor, and making it public is trickier still, meaning the same research is duplicated again and again. And while we may all want to ultimately surface in detail the complex cross-holdings of shareholdings between the different companies, that goal is some way off, not least because it’s not always possible to discover the shareholders of a company.


So you must make do with reading annual reports and trawling company registries around the world, and hoping you don’t miss any. We like to think OpenCorporates has already made this quite a bit easier, meaning that a single search for Tesco returns hundreds of results from around the world, not just those in the UK, or some other individual jurisdiction. But what about where the companies don’t include the group in the name, and how do you surface the information you’ve found for the rest of the world?

The solution to both, we think, is Corporate Groupings, a way of describing a grouping of companies without having to say exactly what legal form that relationship takes (it may be a subsidiary of a subsidiary, for example). In short, it’s what most humans (i.e. non tax-lawyers) think of when they think of a large company – whether it’s a HSBC, Halliburton or HP.

This could have legs.

Not to mention that what is a separate subject to you (a subsidiary) may be encompassed by a larger subject to me. Both are valid from a certain point of view.

FuzzyTable

Thursday, November 25th, 2010

Tackling Large Scale Data In Government.

OK, but I cite the post because of its coverage of FuzzyTable:

FuzzyTable is a large-scale, low-latency, parallel fuzzy-matching database built over Hadoop. It can use any matching algorithm that can compare two often high-dimensional items and return a similarity score. This makes it suitable not only for comparing fingerprints but other biometric modalities, images, audio, and anything that can be represented as a vector of features.

Hmmm, “anything that can be represented as a vector of features?”

Did someone mention subject identity? 😉

Worth a very close read. Software release coming.
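
To illustrate the "any matching algorithm that returns a similarity score" idea from the quote, here is a small single-process Python sketch. It is not FuzzyTable's API (the software was unreleased at the time of writing) and it involves no Hadoop or parallelism. The function names, the choice of cosine similarity and the 0.9 threshold are all assumptions made for the example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # treat an all-zero vector as matching nothing
    return dot / (norm_u * norm_v)


def fuzzy_match(query, records, similarity=cosine_similarity, threshold=0.9):
    """Return (score, key) pairs scoring at least `threshold`, best first.

    records:    dict mapping a record key -> feature vector
    similarity: any callable taking two vectors and returning a score,
                so the matcher can be swapped without touching the search
    """
    scored = [(similarity(query, vec), key) for key, vec in records.items()]
    hits = [(score, key) for score, key in scored if score >= threshold]
    return sorted(hits, reverse=True)


if __name__ == "__main__":
    records = {
        "a": [1.0, 0.0, 0.0],
        "b": [0.9, 0.1, 0.0],
        "c": [0.0, 1.0, 0.0],
    }
    for score, key in fuzzy_match([1.0, 0.0, 0.0], records):
        print(key, round(score, 4))
```

The point for subject identity is the pluggable `similarity` argument: two records count as the same subject whenever the chosen comparison says they are close enough, whatever the underlying features happen to be.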