Archive for the ‘Recommendation’ Category
Tuesday, May 14th, 2013
Social and Content Hybrid Image Recommender System for Mobile Social Networks by Faustino Sanchez, Marta Barrilero, Silvia Uribe, Federico Alvarez, Agustin Tena, Jose Manuel Menendez.
Recommender System for Sport Videos Based on User Audiovisual Consumption by Sanchez, F.; Alduan, M.; Alvarez, F.; Menendez, J.M.; Baez, O.
A pair of papers I discovered at: New Model to Recommend Media Content According to Your Preferences, which summarizes the work as:
The traditional recommender system usually use: semantic techniques which result in products defined by themes, similar tags to the user interests, algorithms that use collective intelligence of a large set of user, in a way that this traditional system recommends themes that suit other people with similar preferences.
From this knowledge state, an applied model of multimedia content that goes beyond this paradigm has been developed, and it incorporates other features of whose influence, the user is not always aware and because of that reason has not been used so far in these types of systems.
Therefore, researchers at the UPM have analyzed in depth the audiovisual features that can be influential for users and they proved that some of these features that determine aesthetic trends and usually go unnoticed can be decisive when defining the user tastes.
For example, researchers proved that in a movie, the relative information to the narrative rhythm (shot length, scenes and sequences), the movements (camera or frame content) or the image nature (brightness, color, texture, information quantity) is relevant when cataloguing the preferences of each piece of information. Analogously to the movies, the researchers have analyzed images using a subset of descriptors considered in the case of video.
In order to verify this model, researchers used a database of 70,000 users and a million of reviews in a set of 200 movies whose features were previously extracted.
These descriptors, once they are standardized, processed and generated adequate statistical data, allow researchers to formally characterize the contents and to find the influence degree on each user as well as their preference conditions.
This makes me curious: how might we exploit similar “unseen / unnoticed” factors that influence subject identification?
Both from a quality control perspective and for the design of topic map authoring/consumption interfaces.
Our senses, as Scrooge points out: A slight disorder of the stomach makes them cheats.
Now we know they may be cheating and we are unaware of it.
Posted in Multimedia, Recommendation | No Comments »
Thursday, May 2nd, 2013
Beer Mapper: An experimental app to find the right beer for you by Nathan Yau.
Nathan reviews an app that, with a data set of 10,000 beers, attempts to suggest similar beers based on how you score beers.
A clever app but I am betting on Lars Marius besting it more often than not!
Posted in Humor, Recommendation | 2 Comments »
Monday, April 8th, 2013
WTF: The Who to Follow Service at Twitter by Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, Reza Zadeh.
Abstract:
WTF (“Who to Follow”) is Twitter’s user recommendation service, which is responsible for creating millions of connections daily between users based on shared interests, common connections, and other related factors. This paper provides an architectural overview and shares lessons we learned in building and running the service over the past few years. Particularly noteworthy was our design decision to process the entire Twitter graph in memory on a single server, which significantly reduced architectural complexity and allowed us to develop and deploy the service in only a few months. At the core of our architecture is Cassovary, an open-source in-memory graph processing engine we built from scratch for WTF. Besides powering Twitter’s user recommendations, Cassovary is also used for search, discovery, promoted products, and other services as well. We describe and evaluate a few graph recommendation algorithms implemented in Cassovary, including a novel approach based on a combination of random walks and SALSA. Looking into the future, we revisit the design of our architecture and comment on its limitations, which are presently being addressed in a second-generation system under development.
You know it is going to be an amusing paper when footnote 1 reads:
The confusion with the more conventional expansion of the acronym is intentional and the butt of many internal jokes. Also, it has not escaped our attention that the name of the service is actually ungrammatical; the pronoun should properly be in the objective case, as in “whom to follow”.
Algorithmic recommendations may miss the mark for an end user.
On the other hand, what about an authoring interface that supplies recommendations of associations and other subjects?
A paper definitely worth a slow read!
I first saw this at: WTF: The Who to Follow Service at Twitter (Pankaj Gupta, Ashish Goel, Jimmy Lin, Aneesh Sharma, Dong Wang, Reza Zadeh).
Posted in Cassovary, Graphs, Hadoop, Recommendation | No Comments »
Monday, March 11th, 2013
Reco4j
Reco4j is an open source project that aims at developing a recommendation framework based on graph data sources. We choose graph databases for several reasons. They are NoSQL databases that are “schemaless”. This means that it is possible to extend the basic data structure with intermediate information, i.e. similarity value between item and so on. Moreover, since every information are expressed with some properties, nodes and relations, the recommendation process can be customized to work on every graph.
Indeed Reco4j can be used on every graph where “user” and “item” are represented by nodes and the preferences are modelled as relationship between them.
The current implementation leverages on Neo4j as first example of graph database integrated in our framework.
The main features of Reco4j are:
- Performance, leveraging on the graph database and storing information in it for future retrieving it produce fast recommendations also after a system restart;
- Use of Network structure, integrating the simple recommendation algorithms with (social) network analysis;
- General purpose, it can be used with preexisting databases;
- Customizability, editing the properties file the recommender framework can be adapted to the current graph structure and use several types of the recommendation algorithms;
- Ready for Cloud, leveraging on the graph database cloud features the recommendation process can be splitted on several nodes.
Just in case you don’t like the recommendations you get from Amazon.
BTW, “splitted” is an archaic past tense form of split. (According to Merriam-Webster.)
Say rather “…the recommendation process can be split onto several nodes.”
Posted in Graphs, Neo4j, Recommendation | No Comments »
Sunday, March 3rd, 2013
Graph Based Recommendations using “How-To” Guides Dataset by Marcel Caraciolo.
From the post:
In this post I’d like to introduce another approach for recommender engines using graph concepts to recommend novel and interesting items. I will build a graph-based how-to tutorials recommender engine using the data available on the website SnapGuide (By the way I am a huge fan and user of this tutorials website), the graph database Neo4J and the graph traversal language Gremlin.
What is SnapGuide ?
Snapguide is a web service for anyone who wants to create and share step-by-step “how to guides”. It is available on the web and iOS app. There you can find several tutorials with easy visual instructions for a wide array of topics including cooking, gardening, crafts, projects, fashion tips and more. It is free and anyone is invited to submit guides in order to share their passions and expertise with the community. I have extracted from their website for only research purposes the corpus of tutorials likes. Several users may like the tutorial and this signal can be quite useful to recommend similar tutorials based on what other users liked. Unfortunately I can’t provide the dataset for download but the code you can follow below for your own data set.
An excellent tutorial that walks you through the creation of graph based recommendations, from acquiring the data to posting queries to it.
The SnapGuide site looks like another opportunity for topic map related tutorial material.
Posted in Graphs, Networks, Recommendation | No Comments »
Thursday, January 17th, 2013
SVDFeature: A Toolkit for Feature-based Collaborative Filtering – implementation by Igor Carron.
From the post:
SVDFeature: A Toolkit for Feature-based Collaborative Filtering by Tianqi Chen, Weinan Zhang, Qiuxia Lu, Kailong Chen, Zhao Zheng, Yong Yu. The abstract reads:
In this paper we introduce SVDFeature, a machine learning toolkit for feature-based collaborative filtering. SVDFeature is designed to efficiently solve the feature-based matrix factorization. The feature-based setting allows us to build factorization models incorporating side information such as temporal dynamics, neighborhood relationship, and hierarchical information. The toolkit is capable of both rate prediction and collaborative ranking, and is carefully designed for efficient training on large-scale data set. Using this toolkit, we built solutions to win KDD Cup for two consecutive years.
The wiki for the project and attendant code is here.
Can’t argue with two KDD cups in as many years!
Licensed under Apache 2.0.
Posted in Feature Vectors, Filters, Recommendation | No Comments »
Sunday, January 6th, 2013
Reco4j
From the webpage:
Reco4j is an open source project that aims at developing a recommendation framework based on graph data sources. We choose graph databases for several reasons. They are NoSQL databases that are “schemaless”. This means that it is possible to extend the basic data structure with intermediate information, i.e. similarity value between item and so on. Moreover, since every information are expressed with some properties, nodes and relations, the recommendation process can be customized to work on every graph.
Indeed Reco4j can be used on every graph where “user” and “item” are represented by nodes and the preferences are modelled as relationship between them.
The current implementation leverages on Neo4j as first example of graph database integrated in our framework.
The main features of Reco4j are:
- Performance, leveraging on the graph database and storing information in it for future retrieving it produce fast recommendations also after a system restart;
- Use of Network structure, integrating the simple recommendation algorithms with (social) network analysis;
- General purpose, it can be used with preexisting databases;
- Customizability, editing the properties file the recommender framework can be adapted to the current graph structure and use several types of the recommendation algorithms;
- Ready for Cloud, leveraging on the graph database cloud features the recommendation process can be splitted on several nodes.
The current version has two different projects:
- reco4j-core: this project contains the base structure, the interface and the recommendation engine;
- reco4j-neo4j: this project contains the neo4j implementation of the framework.
The “similarity value” comment caught my eye.
How much similarity between two or more items do you need for them to count as the same item for some particular purpose?
I first saw this in a tweet by Peter Neubauer.
Posted in Graphs, Neo4j, Recommendation | No Comments »
Sunday, November 4th, 2012
Atepassar Recommendations: Recommending friends with MapReduce and Python by Marcel Caraciolo.
From the post:
In this post I will present one of the techniques used at Atépassar, a Brazilian social network that helps students around Brazil in order to pass the exams for a civil job, our recommender system.
(graphic omitted)
I will describe some of the data models that we use and discuss our approach to algorithmic innovation that combines offline machine learning with online testing. For this task we use distributed computing since we deal with over 140 thousand users. MapReduce is a powerful technique and we use it by writing Python code with the framework MrJob. I recommend you to read further about it at my last post here.
One of our recommender techniques is the simple ‘people you might know’ recommender algorithm. Indeed, there are several components behind the algorithm since at Atépassar, users can follow other people and also be followed by other people. In this post I will talk about the basic idea of the algorithm which can be derived for those other components. The idea of the algorithm is that if person A and person B do not know each other but they have a lot of mutual friends, then the system should recommend that they connect with each other.
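The mutual-friends idea can be sketched in a few lines. This is a minimal single-machine illustration in plain Python, not the MapReduce/MrJob code from the post: the "map" step emits every pair of people who share a friend, and the "reduce" step counts how often each pair appears. The names `recommend_connections` and `min_mutual`, and the toy network, are assumptions made for the example.

```python
from collections import Counter
from itertools import combinations


def recommend_connections(friends, min_mutual=2):
    """Return {(a, b): mutual_count} for pairs who are not yet connected."""
    mutual = Counter()
    # "Map": any two people in the same friend list share that friend.
    for person, their_friends in friends.items():
        for a, b in combinations(sorted(their_friends), 2):
            mutual[(a, b)] += 1
    # "Reduce": keep pairs with enough mutual friends who are not already friends.
    return {
        pair: count
        for pair, count in mutual.items()
        if count >= min_mutual and pair[1] not in friends.get(pair[0], set())
    }


# Hypothetical toy network of undirected friendships.
friends = {
    "ana": {"bia", "caio", "davi"},
    "bia": {"ana", "caio"},
    "caio": {"ana", "bia", "edu"},
    "davi": {"ana", "edu"},
    "edu": {"caio", "davi"},
}
suggestions = recommend_connections(friends, min_mutual=2)
```

Here `suggestions` pairs up ana with edu and caio with davi, each pair sharing two friends. The threshold `min_mutual` is the knob behind the "how many same friends?" question.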
Is there a presumption in social recommendation programs that there are no duplicate people in the network? Using different names? If two people have exactly the same friends, is there some chance they could be the same person?
How many “same” friends would you require? 20? 30? 50? Some other number?
Curious because determining personal identity, and the identity of the people behind two or more entries, may be a matter of pattern matching.
BTW, this is an interesting-looking blog. You may want to browse older entries or even subscribe.
Posted in MapReduce, Python, Recommendation | No Comments »
Monday, September 24th, 2012
How to Build a Recommendation Engine by John F. McGowan.
From the post:
This article shows how to build a simple recommendation engine using GNU Octave, a high-level interpreted language, primarily intended for numerical computations, that is mostly compatible with MATLAB. A recommendation engine is a program that recommends items such as books and movies for customers, typically of a web site such as Amazon or Netflix, to purchase. Recommendation engines frequently use statistical and mathematical methods to estimate what items a customer would like to buy or would benefit from purchasing.
From a purely business point of view, one would like to maximize the profit from a customer, discounted for time (a dollar today is worth more than a dollar next year), over the duration that the customer is a customer of the business. In a long term relationship with a customer, this probably means that the customer needs to be happy with most purchases and most recommendations.
Recommendation engines are “hot” right now. There are many attempts to apply advanced statistics and mathematics to predict what customers will buy, what purchases will make customers happy and buy again, and what purchases deliver the most value to customers. Data scientists are trying to apply a range of methods with fancy technical names such as principal component analysis (PCA), neural networks, and support vector machines (SVM) — amongst others — to predicting successful purchases and personalizing recommendations for individual customers based on their stated preferences, purchasing history, demographics and other factors.
This article presents a simple recommendation engine using Pearson’s product moment correlation coefficient, also known as the linear correlation coefficient. The engine uses the correlation coefficient to identify customers with similar purchasing patterns, and presumably tastes, and recommends items purchased by one customer to the other similar customer who has not purchased those items.
Probably not the recommendation engine you will use for commercial deployment.
But, it will give you a good start on understanding the principles of recommendation engines.
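To make the principle concrete, here is a rough sketch of a correlation-based recommender. It is written in Python rather than the GNU Octave used in the article, and it is not the article's code: the purchase matrix, the function names, and the choice to recommend from only the single most similar customer are all assumptions for illustration.

```python
from math import sqrt


def pearson(x, y):
    """Pearson's product moment correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # correlation is undefined for constant vectors; treat as none
    return cov / sqrt(var_x * var_y)


def recommend(purchases, customer):
    """Item indexes bought by the most similar customer but not by `customer`."""
    target = purchases[customer]
    best = max(
        (other for other in purchases if other != customer),
        key=lambda other: pearson(target, purchases[other]),
    )
    return [
        item
        for item, (mine, theirs) in enumerate(zip(target, purchases[best]))
        if theirs and not mine
    ]


# Hypothetical purchase vectors: 1 = bought the item, 0 = did not.
purchases = {
    "alice": [1, 1, 0, 0, 1],
    "bob": [1, 1, 1, 0, 1],
    "carol": [0, 0, 1, 1, 0],
}
picks = recommend(purchases, "alice")
```

In this toy data bob correlates most strongly with alice, so alice is offered the one item bob bought that she has not.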
My interest in recommendations isn’t so much to identify the subjects of recommendation, which are topics in their own rights, as in probing the basis for subject identification by multiple users.
That is, there is some identification that underlies a choice of some book or movie over another. It may not be possible to identify the components of that identification, but we do have the aftermath of that identification.
Rather than collapsing dimensions, I think we should expand the dimensions around choices to see if any patterns emerge.
I first saw this at DZone.
Posted in Recommendation | No Comments »
Saturday, September 15th, 2012
Overview of Nonparametric Techniques with Elaine Eisenbeisz.
Date: October 3, 2012
Time: 3pm Eastern Time UTC -4 (2pm Central, 1pm Mountain, 12pm Pacific)
From the description:
A distribution of data which is not normal does not mean it is abnormal. There are many data analysis techniques which do not require the assumption of normality.
This webinar will provide information on when it is best to use nonparametric alternatives and provides information on suggested tests to use in lieu of:
- Independent samples and paired t-tests
- Analysis of variance techniques
- Pearson’s Product Moment Correlation
- Repeated measures designs
A description of nonparametric techniques for use with count data and contingency tables will also be provided.
Movie ratings, a ranked population, are appropriate for nonparametric methods.
You just thought you didn’t know anything about nonparametric methods.
Applicable to all ranked populations (can you say recommendation?).
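As a small taste, Spearman's rank correlation is the usual nonparametric stand-in for Pearson's product moment correlation: rank the values, then correlate the ranks. The sketch below is plain Python with invented ratings; it is my illustration, not material from the webinar.

```python
from math import sqrt


def average_ranks(values):
    """Rank values 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of the 1-based positions in the run
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation computed on the ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return 0.0  # undefined when one side is all ties; treat as no correlation
    return cov / sqrt(var_x * var_y)


# Hypothetical star ratings by two viewers for the same five movies.
viewer_a = [5, 3, 4, 1, 2]
viewer_b = [4, 3, 5, 1, 2]
rho = spearman(viewer_a, viewer_b)
```

Only the ordering of the ratings matters here, not the distance between a 4 and a 5, which is the point of treating ratings as ranks.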
While you wait for the webinar, try some of the references from Wikipedia: Nonparametric Statistics.
Posted in Nonparametric, Recommendation, Statistics | No Comments »
Friday, September 14th, 2012
RecSys 2012: Beyond Five Stars by Daniel Tunkelang.
From the post:
I spent the past week in Dublin attending the 6th ACM International Conference on Recommender Systems (RecSys 2012). This young conference has become the premier global forum for discussing the state of the art in recommender systems, and I’m thrilled to have had the opportunity to participate.
Daniel’s review of RecSys 2012 with lots of links and pointers!
It will take you some time to work through all the hyperlinks so it is a good thing the weekend is upon us!
Enjoy!
Posted in Conferences, Recommendation | No Comments »
Tuesday, September 11th, 2012
Context-Aware Recommender Systems 2012 (In conjunction with the 6th ACM Conference on Recommender Systems (RecSys 2012))
I usually think of recommender systems as attempts to deliver content based on clues about my interests or context. If I dial 911, the location of the nearest pizza vendor probably isn’t high on my list of interests, etc.
As I looked over these proceedings, it occurred to me that subject identity, for merging purposes, isn’t limited to the context of the subject in question.
That is, some merging tests could depend upon my context as a user.
Take my 911 call for instance. For many purposes, a police substation, fire station, 24 hour medical clinic and a hospital are different subjects.
In a medical emergency situation, for which a 911 call might be a clue, all of those could be treated as a single subject – places for immediate medical attention.
What other subjects do you think might merge (or not) depending upon your context?
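A toy sketch of what a context-dependent merging test might look like, in Python. Everything here is hypothetical: the role labels attached to each subject, the function name `merge_by_context`, and the rule that subjects sharing the role demanded by the user's context collapse into one.

```python
def merge_by_context(subjects, context_role):
    """Group subject names; those offering `context_role` merge into one group."""
    merged, separate = [], []
    for name, roles in subjects.items():
        (merged if context_role in roles else separate).append(name)
    # Subjects outside the context stay distinct, one group each.
    groups = [[name] for name in sorted(separate)]
    if merged:
        groups.append(sorted(merged))
    return groups


# Hypothetical subjects with the roles each one can play.
subjects = {
    "police substation": {"policing", "immediate medical attention"},
    "fire station": {"firefighting", "immediate medical attention"},
    "24 hour medical clinic": {"outpatient care", "immediate medical attention"},
    "hospital": {"inpatient care", "immediate medical attention"},
    "pizza vendor": {"food"},
}

emergency_view = merge_by_context(subjects, "immediate medical attention")
hungry_view = merge_by_context(subjects, "food")
```

In `emergency_view` the four places for immediate medical attention form a single group and the pizza vendor stands alone; in `hungry_view` all five remain separate subjects.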
Table of Contents
- Optimal Feature Selection for Context-Aware Recommendation Using Differential Relaxation
Yong Zheng, Robin Burke, Bamshad Mobasher.
- Relevant Context in a Movie Recommender System: Users’ Opinion vs. Statistical Detection
Ante Odic, Marko Tkalcic, Jurij Franc Tasic, Andrej Kosir.
- Improving Novelty in Streaming Recommendation Using a Context Model
Doina Alexandra Dumitrescu, Simone Santini.
- Towards a Context-Aware Photo Recommender System
Fabricio Lemos, Rafael Carmo, Windson Viana, Rossana Andrade.
- Context and Intention-Awareness in POIs Recommender Systems
Hernani Costa, Barbara Furtado, Durval Pires, Luis Macedo, F. Amilcar Cardoso.
- Evaluation and User Acceptance Issues of a Bayesian-Classifier-Based TV Recommendation System
Benedikt Engelbert, Karsten Morisse, Kai-Christoph Hamborg.
- From Online Browsing to Offline Purchases: Analyzing Contextual Information in the Retail Business
Simon Chan, Licia Capra.
Posted in Context, Context-aware, Identity, Recommendation | No Comments »
Monday, July 9th, 2012
Recommendations and how to measure the ROI with some metrics ?
From the post:
We talked a lot about recommender systems, specially discussing the techniques and algorithms used to build and evaluate algorithmically those systems. But let’s discuss now how can we measure in quantitative terms how a social network or an on-line store can measure the return of investment (ROI) of a given recommendation.
The metrics used in recommender systems
We talk a lot about F1-measure, Accuracy, Precision, Recall, AUC, those buzzwords widely known by the machine learning researchers and data mining specialists. But do you know what is CTR, LOC, CER or TPR ? Let’s explain more about those metrics and how they can evaluate the quantitative benefits of a given recommendation.
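For readers who want the arithmetic behind a few of these terms, here is a minimal Python sketch of precision, recall, and F1 for one list of recommendations, plus click-through rate. The excerpt does not expand its acronyms, so reading CTR as click-through rate (clicks divided by impressions) is my assumption, and the sample data is invented.

```python
def precision_recall_f1(recommended, relevant):
    """Precision, recall and F1 for one list of recommendations."""
    recommended, relevant = set(recommended), set(relevant)
    hits = len(recommended & relevant)
    precision = hits / len(recommended) if recommended else 0.0
    recall = hits / len(relevant) if relevant else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = 2 * precision * recall / (precision + recall) if hits else 0.0
    return precision, recall, f1


def click_through_rate(clicks, impressions):
    """Assumed meaning of CTR: the fraction of shown recommendations clicked."""
    return clicks / impressions if impressions else 0.0


# Hypothetical example: four items recommended, three items actually relevant.
precision, recall, f1 = precision_recall_f1(["a", "b", "c", "d"], ["b", "d", "e"])
ctr = click_through_rate(clicks=30, impressions=1000)
```

Two of the four recommendations are relevant, so precision is 0.5 and recall is 2/3; the click-through rate in the example is 3%.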
Would you feel more comfortable if I said identification instead of recommendation?
Consider it done.
After all, a “recommendation” is some actor making a statement about an identified subject. Run of the mill stuff for a topic map.
The ROI question is whether there is some benefit to that statement + identification.
Assuming you are using a topic map or similar measures to track the source of a recommendation, you could begin to attach ROI to particular sources of recommendation.
Posted in Recommendation | No Comments »
Wednesday, June 13th, 2012
SeRSy 2012: International Workshop on Semantic Technologies meet Recommender Systems & Big Data
Important Dates:
Submission of papers: July 31, 2012
Notification of acceptance: August 21, 2012
Camera-ready versions: September 10, 2012
[In connection with the 11th International Semantic Web Conference, Boston, USA, November 11-15, 2012.]
The scope statement:
People generally need more and more advanced tools that go beyond those implementing the canonical search paradigm for seeking relevant information. A new search paradigm is emerging, where the user perspective is completely reversed: from finding to being found. Recommender Systems may help to support this new perspective, because they have the effect of pushing relevant objects, selected from a large space of possible options, to potentially interested users. To achieve this result, recommendation techniques generally rely on data referring to three kinds of objects: users, items and their relations.
Recent developments of the Semantic Web community offer novel strategies to represent data about users, items and their relations that might improve the current state of the art of recommender systems, in order to move towards a new generation of recommender systems which fully understand the items they deal with.
More and more semantic data are published following the Linked Data principles, that enable to set up links between objects in different data sources, by connecting information in a single global data space: the Web of Data. Today, Web of Data includes different types of knowledge represented in a homogeneous form: sedimentary one (encyclopedic, cultural, linguistic, common-sense) and real-time one (news, data streams, …). This data might be useful to interlink diverse information about users, items, and their relations and implement reasoning mechanisms that can support and improve the recommendation process.
The challenge is to investigate whether and how this large amount of wide-coverage and linked semantic knowledge can be automatically introduced into systems that perform tasks requiring human-level intelligence. Examples of such tasks include understanding a health problem in order to make a medical decision, or simply deciding which laptop to buy. Recommender systems support users exactly in those complex tasks.
The primary goal of the workshop is to showcase cutting edge research on the intersection of Semantic Technologies and Recommender Systems, by taking the best of the two worlds. This combination may provide the Semantic Web community with important real-world scenarios where its potential can be effectively exploited into systems performing complex tasks.
Should be interesting to see whether the semantic technologies or the recommender systems or both get the “rough” or inexact edges.
Posted in Conferences, Recommendation, Semantic Web | No Comments »
Monday, June 11th, 2012
Neo4j in the Trenches
Thursday June 14 10:00 PDT / 19:00 CEST
From the webpage:
OpenCredo discusses Opigram: a social recommendation engine
In this webinar, Nicki Watt of OpenCredo presents the lessons learned (and being learned) on an active Neo4j project: Opigram. Opigram is a socially oriented recommendation engine which is already live, with some 150k users and growing. The webinar will cover Neo4j usage, challenges encountered, and solutions to these challenges.
I was curious enough to run down the homepage for OpenCredo.
Now there is an interesting homepage!
The blog post titles promise some interesting reading.
I will report back as I find items of interest.
Posted in Graphs, Neo4j, Recommendation | No Comments »
Thursday, April 26th, 2012
Simple tools for building a recommendation engine by Joseph Rickert.
From the post:
Revolution’s resident economist, Saar Golde, is very fond of saying that “90% of what you might want from a recommendation engine can be achieved with simple techniques”. To illustrate this point (without doing a lot of work), we downloaded the million row movie dataset from www.grouplens.org with the idea of just taking the first obvious exploratory step: finding the good movies. Three zipped up .dat files comprise this data set. The first file, ratings.dat, contains 1,000,209 records of UserID, MovieID, Rating, and Timestamp for 6,040 users rating 3,952 movies. Ratings are whole numbers on a 1 to 5 scale. The second file, users.dat, contains the UserID, Gender, Age, Occupation and Zip-code for each user. The third file, movies.dat, contains the MovieID, Title and Genre associated with each movie.
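The "first obvious exploratory step" can be sketched roughly as follows. This is Python rather than the R used in the post, and it makes assumptions the excerpt does not state: that ratings.dat separates its four fields with `::`, and that a movie needs a minimum number of ratings before its average is trusted. The sample lines are invented.

```python
from collections import defaultdict


def top_movies(rating_lines, min_ratings=2, top_n=3):
    """Return (movie_id, mean_rating, count) rows, best average first."""
    totals = defaultdict(lambda: [0, 0])  # movie_id -> [sum of ratings, count]
    for line in rating_lines:
        # Assumed record layout: UserID::MovieID::Rating::Timestamp
        user_id, movie_id, rating, timestamp = line.strip().split("::")
        totals[movie_id][0] += int(rating)
        totals[movie_id][1] += 1
    averages = [
        (movie_id, total / count, count)
        for movie_id, (total, count) in totals.items()
        if count >= min_ratings  # skip movies with too few ratings to trust
    ]
    return sorted(averages, key=lambda row: (-row[1], row[0]))[:top_n]


# Hypothetical sample records in the assumed format.
sample = [
    "1::10::5::978300760",
    "2::10::4::978300761",
    "1::20::3::978300762",
    "3::20::2::978300763",
    "2::30::5::978300764",
]
best = top_movies(sample)
```

With the real file, `rating_lines` could be an open file handle on ratings.dat, and `min_ratings` would want to be far larger than 2.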
Curious, if a topic map engine performed 90% of the possible merges in a topic map, would that be enough?
Would your answer differ if the topic map had fewer than 10,000 topics and associations versus a topic map with 100 million topics and associations?
Would your answer differ based on a timeline of the data? Say the older the data, the less reliable the merging: recent medical data (up to ten years old), < 1% error rate; ten to twenty years, <= 10% error rate; more than twenty years, best efforts.
Which of course raises the question of how you would test for conformance to such requirements.
Posted in Dataset, R, Recommendation | No Comments »
Wednesday, April 25th, 2012
Auralist: introducing serendipity into music recommendation
Abstract:
Recommendation systems exist to help users discover content in a large body of items. An ideal recommendation system should mimic the actions of a trusted friend or expert, producing a personalised collection of recommendations that balance between the desired goals of accuracy, diversity, novelty and serendipity. We introduce the Auralist recommendation framework, a system that – in contrast to previous work – attempts to balance and improve all four factors simultaneously. Using a collection of novel algorithms inspired by principles of “serendipitous discovery”, we demonstrate a method of successfully injecting serendipity, novelty and diversity into recommendations whilst limiting the impact on accuracy. We evaluate Auralist quantitatively over a broad set of metrics and, with a user study on music recommendation, show that Auralist’s emphasis on serendipity indeed improves user satisfaction.
A deeply interesting article for anyone interested in recommendation systems and the improvement thereof.
It is research that should go forward, but I have some concerns about the article:
1) I am not convinced by the definition of “serendipity”:
Serendipity represents the “unusualness” or “surprise” of recommendations. Unlike novelty, serendipity encompasses the semantic content of items, and can be imagined as the distance between recommended items and their expected contents. A recommendation of John Lennon to listeners of The Beatles may well be accurate and novel, but hardly constitutes an original or surprising recommendation. A serendipitous system will challenge users to expand their tastes and hopefully provide more interesting recommendations, qualities that can help improve recommendation satisfaction [23]
Or perhaps I am “hearing” it in the context of discovery, such as searching for Smokestack Lightning and finding not the Yardbirds but Howlin’ Wolf as the performer. Serendipity in that sense does not carry any sense of “challenge.”
2) A survey of 21 participants, mostly students, is better than experimenters asking each other for feedback but only just. The social sciences department should be able to advise on test protocols and procedures.
3) There was no showing that “user satisfaction,” the item to be measured, is the same thing as “serendipity.” I am not entirely sure that, other than by example, “serendipity” can even be discussed, let alone measured.
Take my Howlin’ Wolf example. How close or far away is the “serendipity” there versus an instance of “serendipity” as offered by Auralist? Unless and until we can establish a metric, at least a loose one, it is hard to say which one has more “serendipity.”
Posted in Music, Recommendation, Serendipity | No Comments »
Wednesday, April 25th, 2012
LAILAPS
From the website:
LAILAPS combines a keyword driven search engine for an integrative access to life science databases, machine learning for a content driven relevance ranking, recommender systems for suggestion of related data records and query refinements with a user feedback tracking system for an self learning relevance training.
Features:
- ultra fast keyword based search
- non-static relevance ranking
- user specific relevance profiles
- suggestion of related entries
- suggestion of related query terms
- self learning by user tracking
- deployable at standard desktop PC
- 100% JAVA
- installer for in-house deployment
I like the idea of a recommender system that “suggests” related data records and query refinements. It could be wrong.
I am as guilty as anyone of thinking in terms of “correct” recommendations that always lead to relevant data.
That is, applying “crisp” set thinking to what is obviously a “rough” set situation. We as readers have to sort out the items in the “rough” set and construct for ourselves a temporary and fleeting “crisp” set for some particular purpose.
If you are using LAILAPS, I would appreciate a note about your experiences and impressions.
Posted in Keywords, Machine Learning, Query Rewriting, Recommendation, Relevance, Search Engines | No Comments »
Friday, April 13th, 2012
Neo4J Tales from the Trenches: A Recommendation Engine Case Study
25 April 2012 – At 18:30 (“Oh to be in London,” he wished. Not for the last time.)
From the post:
In this talk for the Neo4j User Group, Nicki Watt and Michal Bachman present the lessons learned (and being learned) on an active Neo4J project – Opigram.
Opigram is a socially orientated recommendation engine which is already live, with some 150k users and growing. Nicki and Michal will outline their usage of Neo4j, and some of the challenges they have encountered, as well as the approaches and implications taken to address them.
Sounds like a good introduction to Neo4j in the context of an actual project.
Posted in Neo4j, Recommendation | No Comments »
Saturday, February 25th, 2012
Similarity-based Recommendation Engines by Josh Adell.
From the post:
I am currently participating in the Neo4j-Heroku Challenge. My entry is a — as yet, unfinished — beer rating and recommendation service called FrostyMug. All the major functionality is complete, except for the actual recommendations, which I am currently working on. I wanted to share some of my thoughts and methods for building the recommendation engine.
I hear “similarity” as a measure of subject identity: beers recommended to X; movies enjoyed by Y users, even though those are group subjects.
Or perhaps better, as a possible means of subject identity. A person could list all the movies they have enjoyed and that list be the same as a recommendation list. Same subject, just a different method of identification. (Unless the means of subject identification has an impact on the subject you think is being identified.)
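To make “similarity” concrete, here is a minimal sketch of a similarity-weighted recommendation. It is not Josh Adell's method, which the excerpt above does not show. The ratings are invented, and cosine similarity over co-rated beers is simply one common choice among several.

```python
from math import sqrt

# Hypothetical ratings (user -> {beer: score}); illustrative data only.
ratings = {
    "ann": {"stout": 5, "ipa": 3, "porter": 4},
    "bob": {"stout": 4, "ipa": 2, "porter": 5, "lager": 4},
    "cat": {"ipa": 5, "lager": 2, "pilsner": 4},
}

def cosine_similarity(a, b):
    """Cosine similarity over the items both users rated (0.0 if none)."""
    shared = set(a) & set(b)
    if not shared:
        return 0.0
    dot = sum(a[i] * b[i] for i in shared)
    norm_a = sqrt(sum(a[i] ** 2 for i in shared))
    norm_b = sqrt(sum(b[i] ** 2 for i in shared))
    return dot / (norm_a * norm_b)

def recommend(user, ratings, top_n=3):
    """Rank items the user has not rated by similarity-weighted mean score."""
    totals, weights = {}, {}
    for other, other_ratings in ratings.items():
        if other == user:
            continue
        sim = cosine_similarity(ratings[user], other_ratings)
        if sim <= 0:
            continue
        for item, score in other_ratings.items():
            if item in ratings[user]:
                continue  # skip beers the user already rated
            totals[item] = totals.get(item, 0.0) + sim * score
            weights[item] = weights.get(item, 0.0) + sim
    ranked = sorted(totals, key=lambda i: totals[i] / weights[i], reverse=True)
    return ranked[:top_n]

print(recommend("ann", ratings))  # ['pilsner', 'lager']
```

Note one weakness of this particular measure: two users who share a single rated beer always get a similarity of 1.0, so a real system would want a minimum overlap or a different measure.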
Posted in Contest, Heroku, Neo4j, Recommendation | 2 Comments »
Tuesday, January 10th, 2012
Proceedings of the 2nd International Workshop on Information Heterogeneity and Fusion in Recommender Systems
I am still working on the proceedings for the main conference but thought these might be of interest:
- Information market based recommender systems fusion
Efthimios Bothos, Konstantinos Christidis, Dimitris Apostolou, Gregoris Mentzas
Pages: 1-8
doi>10.1145/2039320.2039321
- A kernel-based approach to exploiting interaction-networks in heterogeneous information sources for improved recommender systems
Oluwasanmi Koyejo, Joydeep Ghosh
Pages: 9-16
doi>10.1145/2039320.2039322
- Learning multiple models for exploiting predictive heterogeneity in recommender systems
Clinton Jones, Joydeep Ghosh, Aayush Sharma
Pages: 17-24
doi>10.1145/2039320.2039323
- A generic semantic-based framework for cross-domain recommendation
Ignacio Fernández-Tobías, Iván Cantador, Marius Kaminskas, Francesco Ricci
Pages: 25-32
doi>10.1145/2039320.2039324
- Hybrid algorithms for recommending new items
Paolo Cremonesi, Roberto Turrin, Fabio Airoldi
Pages: 33-40
doi>10.1145/2039320.2039325
- Expert recommendation based on social drivers, social network analysis, and semantic data representation
Maryam Fazel-Zarandi, Hugh J. Devlin, Yun Huang, Noshir Contractor
Pages: 41-48
doi>10.1145/2039320.2039326
- Experience Discovery: hybrid recommendation of student activities using social network data
Robin Burke, Yong Zheng, Scott Riley
Pages: 49-52
doi>10.1145/2039320.2039327
- Personalizing tags: a folksonomy-like approach for recommending movies
Alan Said, Benjamin Kille, Ernesto W. De Luca, Sahin Albayrak
Pages: 53-56
doi>10.1145/2039320.2039328
- Personalized pricing recommender system: multi-stage epsilon-greedy approach
Toshihiro Kamishima, Shotaro Akaho
Pages: 57-64
doi>10.1145/2039320.2039329
- Matrix co-factorization for recommendation with rich side information and implicit feedback
Yi Fang, Luo Si
Pages: 65-69
doi>10.1145/2039320.2039330
Posted in Conferences, Heterogeneous Data, Recommendation | No Comments »
Tuesday, December 13th, 2011
DiveRS 2011 – ACM RecSys 2011 Workshop on Novelty and Diversity in Recommender Systems
From the conference page:
Most research and development efforts in the Recommender Systems field have been focused on accuracy in predicting and matching user interests. However there is a growing realization that there is more than accuracy to the practical effectiveness and added-value of recommendation. In particular, novelty and diversity have been identified as key dimensions of recommendation utility in real scenarios, and a fundamental research direction to keep making progress in the field.
Novelty is indeed essential to recommendation: in many, if not most scenarios, the whole point of recommendation is inherently linked to a notion of discovery, as recommendation makes most sense when it exposes the user to a relevant experience that she would not have found, or thought of by herself –obvious, however accurate recommendations are generally of little use.
Not only does a varied recommendation provide in itself for a richer user experience. Given the inherent uncertainty in user interest prediction –since it is based on implicit, incomplete evidence of interests, where the latter are moreover subject to change–, avoiding a too narrow array of choice is generally a good approach to enhance the chances that the user is pleased by at least some recommended item. Sales diversity may enhance businesses as well, leveraging revenues from market niches.
It is easy to increase novelty and diversity by giving up on accuracy; the challenge is to enhance these aspects while still achieving a fair match of the user’s interests. The goal is thus generally to enhance the balance in this trade-off, rather than just a diversity or novelty increase.
DiveRS 2011 aims to gather researchers and practitioners interested in the role of novelty and diversity in recommender systems. The workshop seeks to advance towards a better understanding of what novelty and diversity are, how they can improve the effectiveness of recommendation methods and the utility of their outputs. We aim to identify open problems, relevant research directions, and opportunities for innovation in the recommendation business. The workshop seeks to stir further interest for these topics in the community, and stimulate the research and progress in this area.
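The accuracy versus diversity trade-off described above can be illustrated with a small greedy re-ranking sketch. It is not taken from the workshop materials. The relevance scores and genre tags are invented, and the scoring rule (in the style of maximal marginal relevance, with an assumed weight `lam`) is only one of many ways to balance the two goals.

```python
# Hypothetical candidates: item -> (predicted relevance, set of genre tags).
candidates = {
    "A": (0.95, {"action"}),
    "B": (0.90, {"action"}),
    "C": (0.85, {"action", "comedy"}),
    "D": (0.60, {"documentary"}),
}

def jaccard(x, y):
    """Jaccard similarity between two tag sets."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def rerank(candidates, k, lam):
    """Greedily pick k items, scoring each remaining item as
    lam * relevance - (1 - lam) * (max similarity to items already picked)."""
    picked = []
    remaining = dict(candidates)
    while remaining and len(picked) < k:
        def score(item):
            relevance, tags = remaining[item]
            redundancy = max(
                (jaccard(tags, candidates[p][1]) for p in picked), default=0.0
            )
            return lam * relevance - (1 - lam) * redundancy
        best = max(remaining, key=score)
        picked.append(best)
        del remaining[best]
    return picked

print(rerank(candidates, k=3, lam=1.0))  # ['A', 'B', 'C']  accuracy only
print(rerank(candidates, k=3, lam=0.5))  # ['A', 'D', 'C']  balanced
```

With `lam=1.0` the list is three action titles. With `lam=0.5` the documentary displaces a near-duplicate, at the cost of a lower predicted relevance for that slot.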
The abstract from “Fusion-based Recommender System for Improving Serendipity” by Kenta Oku, Fumio Hattori reads:
Recent work has focused on new measures that are beyond the accuracy of recommender systems. Serendipity, which is one of these measures, is defined as a measure that indicates how the recommender system can find unexpected and useful items for users. In this paper, we propose a Fusion-based Recommender System that aims to improve the serendipity of recommender systems. The system is based on the novel notion that the system finds new items, which have the mixed features of two user-input items, produced by mixing the two items together. The system consists of item-fusion methods and scoring methods. The item-fusion methods generate a recommendation list based on mixed features of two user-input items. Scoring methods are used to rank the recommendation list. This paper describes these methods and gives experimental results.
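The abstract gives no detail on the item-fusion or scoring methods, so the following is only a guess at the general shape of the idea, not the authors' algorithm. It assumes items are described by hypothetical feature sets and scores a candidate by the smaller of its Jaccard similarities to the two input items, so that a candidate must draw on both inputs to rank at all.

```python
# Hypothetical catalogue: item -> set of descriptive features.
catalogue = {
    "western": {"frontier", "outlaws", "horses"},
    "space opera": {"space", "aliens", "crew"},
    "space western": {"frontier", "outlaws", "space", "crew"},
    "cattle drive": {"frontier", "horses"},
    "alien invasion": {"aliens", "space", "army"},
    "moon prospectors": {"frontier", "space", "mining"},
}

def jaccard(x, y):
    """Jaccard similarity between two feature sets."""
    union = x | y
    return len(x & y) / len(union) if union else 0.0

def fusion_recommend(item_a, item_b, catalogue):
    """Rank other items by how well they mix the features of both inputs."""
    feats_a, feats_b = catalogue[item_a], catalogue[item_b]
    scored = []
    for item, feats in catalogue.items():
        if item in (item_a, item_b):
            continue
        # Assumed scoring rule: the weaker of the two similarities.
        score = min(jaccard(feats, feats_a), jaccard(feats, feats_b))
        if score > 0:
            scored.append((score, item))
    scored.sort(reverse=True)
    return [item for _, item in scored]

print(fusion_recommend("western", "space opera", catalogue))
# ['space western', 'moon prospectors']
```

Items that resemble only one of the two inputs ("cattle drive", "alien invasion") drop out, which is roughly the point: the result is something neither input alone would have suggested.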
Interested yet?
Posted in Diversity, Novelty, Recommendation | No Comments »
Thursday, October 6th, 2011
VII PythonBrasil
Marcel Caraciolo covers his slides from keynotes at VII PythonBrasil; the most interesting for topic mappers would be Crab – A Python Framework for Building Recommender Systems.
Recommender systems by necessity have to identify the interests of a user (2 subjects, interests and user), match those to other interests (another subject) and then produce a recommendation (yet another subject), plus relationship subjects if you are interested. Recommender systems are already identifying all those subjects and gathering instances of them together.
What would you do to make their constantly interim results available to other systems?
Posted in Python, Recommendation | No Comments »
Tuesday, October 4th, 2011
VinWiki Part 1: Building an intelligent Web app using Seam, Hibernate, RichFaces, Lucene and Mahout
From the webpage:
This is the first post in a four part series about a wine rating and recommendation Web application, named VinWiki, built using open source technology. The purpose of this series is to document key design and implementation decisions, which may be of interest to anyone wanting to build an intelligent Web application using Java technologies. The end result will not be a 100% functioning Web application, but will have enough functionality to prove the concepts.
I thought about Lars Marius and his expertise at beer evaluation when I saw this series. Not that Lars would need it but it looks like the sort of thing you could build to recommend things you know something about, and like. Whatever that may be.
Posted in Lucene, Mahout, Recommendation | No Comments »
Monday, October 3rd, 2011
Algorithms of the Intelligent Web Review by Pearlene McKinley
From the post:
I have always had an interest in AI, machine learning, and data mining but I found the introductory books too mathematical and focused mostly on solving academic problems rather than real-world industrial problems. So, I was curious to see what this book was about.
I have read the book front-to-back (twice!) before I write this report. I started reading the electronic version a couple of months ago and read the paper print again over the weekend. This is the best practical book in machine learning that you can buy today — period. All the examples are written in Java and all algorithms are explained in plain English. The writing style is superb! The book was written by one author (Marmanis) while the other one (Babenko) contributed in the source code, so there are no gaps in the narrative; it is engaging, pleasant, and fluent. The author leads the reader from the very introductory concepts to some fairly advanced topics. Some of the topics are covered in the book and some are left as an exercise at the end of each chapter (there is a “To Do” section, which was a wonderful idea!). I did not like some of the figures (they were probably made by the authors not an artist) but this was only a minor aesthetic inconvenience.
The book covers four cornerstones of machine learning and intelligence, i.e. intelligent search, recommendations, clustering, and classification. It also covers a subject that today you can find only in the academic literature, i.e. combination techniques. Combination techniques are very powerful and although the author presents the techniques in the context of classifiers, it is clear that the same can be done for recommendations — as the Bell Korr team did for the Netflix prize.
Wonder if this will be useful in the Stanford AI course that starts next week with more than 130,000 students? Introduction to Artificial Intelligence – Stanford Class
I am going to order a copy, if for no other reason than to evaluate the reviewer’s claim of explanations “in plain English.” I have seen some fairly clever explanations of AI algorithms and would like to see how these stack up.
Posted in Algorithms, Artificial Intelligence, Classification, Clustering, Recommendation, Search Algorithms | No Comments »
Saturday, September 24th, 2011
Recommendation Engine by Ricky Ho.
From the post:
In a classical model of recommendation system, there are “users” and “items”. User has associated metadata (or content) such as age, gender, race and other demographic information. Items also has its metadata such as text description, price, weight … etc. On top of that, there are interaction (or transaction) between user and items, such as userA download/purchase movieB, userX give a rating 5 to productY … etc.
Ricky does a good job of stepping through the different approaches to making recommendations. Important for topic map interfaces that recommend additional topics to their users.
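The classical model in the quoted passage (users, items, and interactions between them) can be sketched as three record types. The class and field names below are assumptions made for illustration, not anything from Ricky's post; the sample data reuses the userA/movieB and userX/productY examples from the quote.

```python
from dataclasses import dataclass, field
from typing import Optional

@dataclass
class User:
    user_id: str
    metadata: dict = field(default_factory=dict)  # e.g. age, gender

@dataclass
class Item:
    item_id: str
    metadata: dict = field(default_factory=dict)  # e.g. description, price

@dataclass
class Interaction:
    user_id: str
    item_id: str
    kind: str                      # "download", "purchase", "rating", ...
    value: Optional[float] = None  # only set for ratings

def rating_matrix(interactions):
    """Collect explicit ratings into {user_id: {item_id: value}}."""
    matrix = {}
    for event in interactions:
        if event.kind == "rating" and event.value is not None:
            matrix.setdefault(event.user_id, {})[event.item_id] = event.value
    return matrix

interactions = [
    Interaction("userA", "movieB", "purchase"),
    Interaction("userX", "productY", "rating", 5.0),
]
print(rating_matrix(interactions))  # {'userX': {'productY': 5.0}}
```

Content-based approaches work mostly from the two metadata dictionaries, while collaborative approaches work mostly from the interaction list.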
Posted in Recommendation | No Comments »
Thursday, September 22nd, 2011
A Graph-Based Movie Recommender Engine by Marko A. Rodriguez.
From the post:
A recommender engine helps a user find novel and interesting items within a pool of resources. There are numerous types of recommendation algorithms and a graph can serve as a general-purpose substrate for evaluating such algorithms. This post will demonstrate how to build a graph-based movie recommender engine using the publicly available MovieLens dataset, the graph database Neo4j, and the graph traversal language Gremlin. Feel free to follow along in the Gremlin console as the post will go step-by-step from data acquisition, to parsing, and ultimately, to traversing.
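As a rough idea of what such a traversal computes, here is a sketch in plain Python rather than Gremlin. It does not reproduce Marko's code or use the MovieLens data; the `liked` edges are invented. The walk goes from a movie to the users who liked it, then out to the other movies those users liked, counting how many paths reach each one.

```python
from collections import Counter

# Hypothetical "liked" edges (user -> movies rated highly); not MovieLens data.
liked = {
    "u1": {"Toy Story", "Aladdin", "Babe"},
    "u2": {"Toy Story", "Aladdin"},
    "u3": {"Toy Story", "Heat"},
    "u4": {"Heat", "Casino"},
}

def co_liked(movie, liked):
    """Walk movie -> users who liked it -> their other liked movies,
    counting the number of paths that reach each movie."""
    counts = Counter()
    for user, movies in liked.items():
        if movie in movies:
            counts.update(movies - {movie})
    # Sort by path count (descending), then title, for a stable order.
    return sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))

print(co_liked("Toy Story", liked))
# [('Aladdin', 2), ('Babe', 1), ('Heat', 1)]
```

In a graph database the same two-step walk is expressed as a traversal over edges instead of a loop over a dictionary, which is what makes the graph a general-purpose substrate for trying out different algorithms.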
As important as graph engines, algorithms and research are at present, and as important as they will become, I think the Neo4j community itself is worthy of direct study. There are stellar contributors to the technology and the community, but is that what makes it such an up-and-coming community? Or perhaps how they contributed? It would take a raft (is that the term for a group of sociologists?) of sociologists and perhaps there are existing studies of online communities that might have some clues. I mention that because there are other groups I would like to see duplicate the success of the Neo4j community.
Marko takes you from data import to a useful (albeit limited) application in fewer than 2,500 words (measured to the end of the conclusion, excluding further reading).
And leaves you with suggestions for further exploring.
That is a blog post that promotes a paradigm. (And for anyone who takes offense at that observation, it applies to my efforts as well. There are other ways to promote a paradigm but you have to admit, this is a fairly compelling one.)
Put Marko’s post on your read-with-evening-coffee list.
Posted in Graphs, Gremlin, Neo4j, Recommendation | No Comments »
Monday, September 19th, 2011
Recommender Systems
This website provides support for “Recommender Systems: An Introduction” and “Recommender Systems Handbook.”
Recommender systems are an important area of research for topic maps because recommendation of necessity involves recognition (or attempted recognition) of subjects similar to an example subject. That recommendation may be captured in relationship to a particular set of user characteristics or it can be used as the basis for identifying a subject.
The site offers pointers to very strong teaching materials (as of 19 September 2011):
Slides
Tutorials
Courses
If you want to contribute teaching materials, please contact dietmar.jannach (at) udo.edu.
Posted in Recommendation, Similarity | No Comments »
Sunday, September 11th, 2011
New Challenges in Distributed Information Filtering and Retrieval
Proceedings of the 5th International Workshop on New Challenges in Distributed Information Filtering and Retrieval
Palermo, Italy, September 17, 2011.
Edited by:
Cristian Lai – CRS4, Loc. Piscina Manna, Building 1 – 09010 Pula (CA), Italy
Giovanni Semeraro – Dept. of Computer Science, University of Bari, Aldo Moro, Via E. Orabona, 4, 70125 Bari, Italy
Eloisa Vargiu – Dept. of Electrical and Electronic Engineering, University of Cagliari, Piazza d’Armi, 09123 Cagliari, Italy
Table of Contents:
- Experimenting Text Summarization on Multimodal Aggregation
Giuliano Armano, Alessandro Giuliani, Alberto Messina, Maurizio Montagnuolo, Eloisa Vargiu
- From Tags to Emotions: Ontology-driven Sentimental Analysis in the Social Semantic Web
Matteo Baldoni, Cristina Baroglio, Viviana Patti, Paolo Rena
- A Multi-Agent Decision Support System for Dynamic Supply Chain Organization
Luca Greco, Liliana Lo Presti, Agnese Augello, Giuseppe Lo Re, Marco La Cascia, Salvatore Gaglio
- A Formalism for Temporal Annotation and Reasoning of Complex Events in Natural Language
Francesco Mele, Antonio Sorgente
- Interaction Mining: the new Frontier of Call Center Analytics
Vincenzo Pallotta, Rodolfo Delmonte, Lammert Vrieling, David Walker
- Context-Aware Recommender Systems: A Comparison Of Three Approaches
Umberto Panniello, Michele Gorgoglione
- A Multi-Agent System for Information Semantic Sharing
Agostino Poggi, Michele Tomaiuolo
- Temporal characterization of the requests to Wikipedia
Antonio J. Reinoso, Jesus M. Gonzalez-Barahona, Rocio Muñoz-Mansilla, Israel Herraiz
- From Logical Forms to SPARQL Query with GETARUN
Rocco Tripodi, Rodolfo Delmonte
- ImageHunter: a Novel Tool for Relevance Feedback in Content Based Image Retrieval
Roberto Tronci, Gabriele Murgia, Maurizio Pili, Luca Piras, Giorgio Giacinto
Posted in Filters, Image Recognition, Information Retrieval, Ontology, Recommendation, SPARQL, Semantic Web, Summarization, Temporal Semantic Analysis | No Comments »