Archive for the ‘Multimedia’ Category

What You Don’t See Makes A Difference

Tuesday, May 14th, 2013

Social and Content Hybrid Image Recommender System for Mobile Social Networks by Faustino Sanchez, Marta Barrilero, Silvia Uribe, Federico Alvarez, Agustin Tena, Jose Manuel Menendez.

Recommender System for Sport Videos Based on User Audiovisual Consumption by Sanchez, F.; Alduan, M.; Alvarez, F.; Menendez, J.M.; Baez, O.

A pair of papers I discovered at: New Model to Recommend Media Content According to Your Preferences, which summarizes the work as:

Traditional recommender systems usually use semantic techniques that describe products by themes or by tags similar to the user’s interests, or algorithms that exploit the collective intelligence of a large set of users, so that the system recommends items that suit other people with similar preferences.

From this starting point, the researchers developed an applied model for multimedia content that goes beyond this paradigm: it incorporates features whose influence the user is not always aware of and which, for that reason, have not been used so far in these types of systems.

To that end, researchers at the UPM analyzed in depth the audiovisual features that can influence users, and showed that some features that determine aesthetic trends, and that usually go unnoticed, can be decisive in defining a user’s tastes.

For example, the researchers showed that in a movie, information related to narrative rhythm (shot, scene and sequence length), movement (of the camera or of the frame content) or the nature of the image (brightness, color, texture, amount of information) is relevant when cataloguing each user’s preferences. Analogously, the researchers analyzed still images using a subset of the descriptors considered for video.

In order to verify this model, the researchers used a database of 70,000 users and one million reviews of a set of 200 movies whose features had been previously extracted.

These descriptors, once standardized, processed and turned into adequate statistical data, allow the researchers to formally characterize the contents and to find the degree of influence on each user, as well as the conditions of their preferences.
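As a rough illustration of the kind of low-level descriptors involved, here is a minimal numpy sketch that computes brightness, a crude colorfulness proxy, and "information quantity" (histogram entropy) for a still image. The function names and exact measures are my own invention, not the UPM pipeline:

```python
import numpy as np

def image_descriptors(rgb):
    """Low-level descriptors of the kind the article mentions:
    brightness, color, and 'information quantity'.
    `rgb` is an (H, W, 3) uint8 array."""
    rgb = rgb.astype(np.float64)
    # Brightness: mean luma (Rec. 601 weights), scaled to [0, 1].
    luma = 0.299 * rgb[..., 0] + 0.587 * rgb[..., 1] + 0.114 * rgb[..., 2]
    brightness = luma.mean() / 255.0
    # Crude colorfulness proxy: mean |R - G| channel difference.
    colorfulness = np.abs(rgb[..., 0] - rgb[..., 1]).mean() / 255.0
    # Information quantity: Shannon entropy of the 256-bin luma histogram.
    hist, _ = np.histogram(luma, bins=256, range=(0, 255))
    p = hist / hist.sum()
    p = p[p > 0]
    entropy = float(-(p * np.log2(p)).sum())
    return {"brightness": brightness, "colorfulness": colorfulness,
            "entropy": entropy}

# A flat gray frame carries almost no information; noise carries a lot.
flat = np.full((64, 64, 3), 128, dtype=np.uint8)
noisy = np.random.randint(0, 256, (64, 64, 3), dtype=np.uint8)
print(image_descriptors(flat)["entropy"] <
      image_descriptors(noisy)["entropy"])  # True
```

A real system would add texture measures and, for video, per-shot statistics, but the principle is the same: numbers a viewer never consciously registers can still be computed and correlated with their preferences.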

This makes me curious: how might similar “unseen/unnoticed” factors that influence subject identification be exploited?

Both from a quality control perspective and in the design of topic map authoring/consumption interfaces.

Our senses, as Scrooge points out: A slight disorder of the stomach makes them cheats.

Now we know they may be cheating and we are unaware of it.

Distributed Multimedia Systems (Archives)

Tuesday, February 12th, 2013

Proceedings of the International Conference on Distributed Multimedia Systems

From the webpage:

DMS 2012 Proceedings August 9 to August 11, 2012 Eden Roc Renaissance Miami Beach, USA
DMS 2011 Proceedings August 18 to August 19, 2011 Convitto della Calza, Florence, Italy
DMS 2010 Proceedings October 14 to October 16, 2010 Hyatt Lodge at McDonald’s Campus, Oak Brook, Illinois, USA
DMS 2009 Proceedings September 10 to September 12, 2009 Hotel Sofitel, Redwood City, San Francisco Bay, USA
DMS 2008 Proceedings September 4 to September 6, 2008 Hyatt Harborside at Logan Int’l Airport, Boston, USA
DMS 2007 Proceedings September 6 to September 8, 2007 Hotel Sofitel, Redwood City, San Francisco Bay, USA

For coverage, see the Call for Papers, DMS 2013.

Another archive with topic map related papers!

DMS 2013

Tuesday, February 12th, 2013

DMS 2013: The 19th International Conference on Distributed Multimedia Systems


Paper submission due: April 29, 2013
Notification of acceptance: May 31, 2013
Camera-ready copy: June 15, 2013
Early conference registration due: June 15, 2013
Conference: August 8 – 10, 2013

From the call for papers:

With today’s proliferation of multimedia data (e.g., images, animations, video, and sound), comes the challenge of using such information to facilitate data analysis, modeling, presentation, interaction and programming, particularly for end-users who are domain experts, but not IT professionals. The main theme of the 19th International Conference on Distributed Multimedia Systems (DMS’2013) is multimedia inspired computing. The conference organizers seek contributions of high quality papers, panels or tutorials, addressing any novel aspect of computing (e.g., programming language or environment, data analysis, scientific visualization, etc.) that significantly benefits from the incorporation/integration of multimedia data (e.g., visual, audio, pen, voice, image, etc.), for presentation at the conference and publication in the proceedings. Both research and case study papers or demonstrations describing results in research area as well as industrial development cases and experiences are solicited. The use of prototypes and demonstration video for presentations is encouraged.


Topics of interest include, but are not limited to:

Distributed Multimedia Technology

  • media coding, acquisition and standards
  • QoS and Quality of Experience control
  • digital rights management and conditional access solutions
  • privacy and security issues
  • mobile devices and wireless networks
  • mobile intelligent applications
  • sensor networks, environment control and management

Distributed Multimedia Models and Systems

  • human-computer interaction
  • languages for distributed multimedia
  • multimedia software engineering issues
  • semantic computing and processing
  • media grid computing, cloud and virtualization
  • web services and multi-agent systems
  • multimedia databases and information systems
  • multimedia indexing and retrieval systems
  • multimedia and cross media authoring

Applications of Distributed Multimedia Systems

  • collaborative and social multimedia systems and solutions
  • humanities and cultural heritage applications, management and fruition
  • multimedia preservation
  • cultural heritage preservation, management and fruition
  • distance and lifelong learning
  • emergency and safety management
  • e-commerce and e-government applications
  • health care management and disability assistance
  • intelligent multimedia computing
  • internet multimedia computing
  • virtual, mixed and augmented reality
  • user profiling, reasoning and recommendations

The presence of information/data doesn’t mean topic maps return good ROI.

On the other hand, the presence of information/data does mean semantic impedance is present.

The question is: what need do you have to overcome semantic impedance, and at what cost?

Efficient similarity search on multimedia databases [No Petraeus Images, Yet]

Wednesday, November 14th, 2012

Efficient similarity search on multimedia databases by Mariela Lopresti, Natalia Miranda, Fabiana Piccoli, Nora Reyes.


Manipulating and retrieving multimedia data has received increasing attention with the advent of cloud storage facilities. The ability to query by similarity over large data collections is mandatory to improve storage and user interfaces. But these are expensive operations to solve on the CPU alone; thus, it is convenient to take High Performance Computing (HPC) techniques into account in their solutions. The Graphics Processing Unit (GPU), as an alternative HPC device, has been increasingly used to speed up certain computing processes. This work introduces a pure GPU architecture to build the Permutation Index and to solve approximate similarity queries on multimedia databases. The empirical results of each implementation achieve different levels of speedup, which are related to the characteristics of the GPU and the particular database used.
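The permutation index at the heart of the paper can be sketched on the CPU. In this simplified illustration (all names and parameters are mine, not the authors' code), each object is represented by the order in which it "sees" a set of reference points (permutants), queries are filtered by comparing orderings with the Spearman footrule, and a small candidate set is re-ranked by true distance:

```python
import numpy as np

def permutation(point, permutants):
    """Order of the permutants by distance from `point` (closest first)."""
    return np.argsort(np.linalg.norm(permutants - point, axis=1))

def footrule(p, q):
    """Spearman footrule distance between two permutant orderings."""
    return int(np.abs(np.argsort(p) - np.argsort(q)).sum())

rng = np.random.default_rng(0)
db = rng.random((1000, 16))                       # database vectors
permutants = db[rng.choice(1000, 32, replace=False)]
index = [permutation(x, permutants) for x in db]  # precomputed index

def approx_search(query, k=5, candidates=50):
    """Filter by footrule against the index, then re-rank the best
    `candidates` by true distance (the usual approximate scheme)."""
    qp = permutation(query, permutants)
    order = np.argsort([footrule(qp, p) for p in index])[:candidates]
    true_d = np.linalg.norm(db[order] - query, axis=1)
    return order[np.argsort(true_d)][:k]

q = db[7] + 0.001        # a near-duplicate of item 7
print(approx_search(q))  # item 7 should rank first
```

The GPU version in the paper parallelizes exactly the expensive parts of this loop: computing permutations for the whole database and evaluating footrule distances for each query.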

No images have been published, yet, in the widening scandal around David Petraeus.

When they do, searching multimedia databases such as Flickr, Facebook, YouTube and others will be a hot issue.

Once found, there is the problem of finding the unique images again, while not retrieving the duplicates again.

Center for Intelligent Information Retrieval (CIIR) [University of Massachusetts Amherst]

Tuesday, August 28th, 2012

Center for Intelligent Information Retrieval (CIIR)

From the webpage:

The Center for Intelligent Information Retrieval (CIIR) is one of the leading research groups working in the areas of information retrieval and information extraction. The CIIR studies and develops tools that provide effective and efficient access to large networks of heterogeneous, multimedia information.

CIIR accomplishments include significant research advances in the areas of retrieval models, distributed information retrieval, information filtering, information extraction, topic models, social network analysis, multimedia indexing and retrieval, document image processing, search engine architecture, text mining, structured data retrieval, summarization, evaluation, novelty detection, resource discovery, interfaces and visualization, digital libraries, computational social science, and cross-lingual information retrieval.

The CIIR has published more than 900 papers on these areas, and has worked with over 90 government and industry partners on research and technology transfer. Open source software supported by the Center is being used worldwide.

Please contact us to talk about potential new projects, collaborations, membership, or joining us as a graduate student or visiting researcher.

To get an idea of the range of their activities, visit the publications page and just browse.

Machine See, Machine Do

Friday, May 4th, 2012

While we wait for maid service robots, comes news that computers can be trained as human mimics for labeling multimedia resources. Game-powered machine learning reports success with game-based training for music labeling.

The authors, Luke Barrington, Douglas Turnbull, and Gert Lanckriet, neatly summarize music labeling as a problem of volume:

…Pandora, a popular Internet radio service, employs musicologists to annotate songs with a fixed vocabulary of about five hundred tags. Pandora then creates personalized music playlists by finding songs that share a large number of tags with a user-specified seed song. After 10 y of effort by up to 50 full time musicologists, less than 1 million songs have been manually annotated (5), representing less than 5% of the current iTunes catalog.
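The tag-overlap playlist idea described here fits in a few lines. A toy sketch with hypothetical tag data (Pandora's real vocabulary runs to roughly five hundred expert tags):

```python
# Hypothetical tag data for illustration only.
songs = {
    "So What":                 {"jazz", "modal", "trumpet", "cool"},
    "Giant Steps":             {"jazz", "bebop", "saxophone", "fast"},
    "Kind of Blue":            {"jazz", "modal", "cool", "mellow"},
    "Smells Like Teen Spirit": {"rock", "grunge", "loud"},
}

def playlist(seed, k=2):
    """Rank the other songs by the number of tags shared with the seed."""
    seed_tags = songs[seed]
    scored = [(len(tags & seed_tags), title)
              for title, tags in songs.items() if title != seed]
    return [title for score, title in sorted(scored, reverse=True)[:k]]

print(playlist("So What"))  # ['Kind of Blue', 'Giant Steps']
```

The recommendation logic is trivial; the bottleneck, as the authors point out, is producing the tags in the first place.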

A problem that extends to the “…7 billion images are uploaded to Facebook each month (1), YouTube users upload 24 h of video content per minute….”

The authors created Herd It to:

… investigate and answer two important questions. First, we demonstrate that the collective wisdom of Herd It’s crowd of nonexperts can train machine learning algorithms as well as expert annotations by paid musicologists. In addition, our approach offers distinct advantages over training based on static expert annotations: it is cost-effective, scalable, and has the flexibility to model demographic and temporal changes in the semantics of music. Second, we show that integrating Herd It in an active learning loop trains accurate tag models more effectively; i.e., with less human effort, compared to a passive approach.

The approach promises an augmentation (not replacement) of human judgement with regard to classification of music. An augmentation that would enable human judgement to reach further across the musical corpus than ever before:

…while a human-only approach requires the same labeling effort for the first song as for the millionth, our game-powered machine learning solution needs only a small, reliable training set before all future examples can be labeled automatically, improving efficiency and cost by orders of magnitude. Tagging a new song takes 4 s on a modern CPU: in just a week, eight parallel processors could tag 1 million songs or annotate Pandora’s complete song collection, which required a decade of effort from dozens of trained musicologists.
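The throughput claim in that passage checks out with back-of-the-envelope arithmetic, taking the quoted 4 seconds per song and eight parallel processors:

```python
seconds_per_song = 4                    # quoted tagging time per song
processors = 8                          # quoted number of parallel CPUs
week_seconds = 7 * 24 * 3600            # one week = 604,800 seconds
songs_per_week = processors * week_seconds // seconds_per_song
print(songs_per_week)  # 1209600: comfortably over a million songs a week
```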

A promising technique for IR with regard to multimedia resources.

What I wonder about is the extension of the technique, games designed to train machine learning for:

  • e-discovery in legal proceedings
  • “tagging” or indexing if you will, text resources
  • vocabulary expansion for searching
  • contexts for semantic matching
  • etc.

A first person shooter game that annotates the New York Times archives would be really cool!

Excellent Papers for 2011 (Google)

Friday, March 23rd, 2012

Excellent Papers for 2011 (Google)

Corinna Cortes and Alfred Spector of Google Research have collected great papers published by Googlers in 2011.

To be sure, there are the obligatory papers on searching and natural language processing, but there are also papers on audio processing, human-computer interfaces, multimedia, systems and other topics.

Many of these will be the subjects of separate posts in the future. For now, peruse at your leisure and sing out when you see one of special interest.

Semantic Multimedia

Sunday, February 13th, 2011

Special Issue on Semantic Multimedia.

The Journal of Semantic Computing has issued the following call for papers:

In the new millennium Multimedia Computing plays an increasingly important role as more and more users produce and share a constantly growing amount of multimedia documents. The sheer number of documents available in large media repositories or even the World Wide Web makes indexing and retrieval of multimedia documents as well as browsing and annotation more important tasks than ever before. Research in this area is of great importance because of the very limited understanding of the semantics of such data sources as well as the limited ways in which they can be accessed by the users today. The field of Semantic Computing has much to offer with respect to these challenges. This special issue invites articles that bring together Semantic Computing and Multimedia to address the challenges arising by the constant growth of Multimedia.

Important Dates
June 3, 2011: Submissions due
August 3, 2011: Notification date
October 18, 2011: Final versions due

Personally I would argue there is “…very limited understanding of the semantics of … [all] data sources….” 😉

Multimedia documents are simply more popular and more prominent, so the failure of understanding there may be more visible.

SAMT 2010 – Conference

Thursday, November 11th, 2010

SAMT 2010 – Semantic and Digital Media Technologies

Saarbrücken, Germany, 1-3 December 2010

From the announcement:

Large amounts of multimedia material, such as images, audio, video, and 3D/4D material, as well as computer generated 2D, 3D, and 4D content, already exist and are growing at increasing rates. While these amounts are growing, managing distribution of and access to multimedia material is becoming ever harder, both for lay and professional users.

The SAMT conference series tackles these problems by investigating the semantics and pragmatics of multimedia generation, management, and user access. The conference targets scientifically valuable research tackling the semantic gap between the low-level signal data representation of multimedia material and the high-level meaning that providers, consumers, and prosumers associate with the content.

I won’t be in Germany in early December but would appreciate a note from anyone who can attend this conference.

This is an opportunity to see a very strong program of speakers and to mingle with others working in the field. If you are in Germany on the conference dates, it would be time well spent.

The University of Amsterdam’s Concept Detection System at ImageCLEF 2009

Saturday, November 6th, 2010

The University of Amsterdam’s Concept Detection System at ImageCLEF 2009 by Koen E. A. van de Sande, Theo Gevers and Arnold W. M. Smeulders.

Keywords: Color, Invariance, Concept Detection, Object and Scene Recognition, Bag-of-Words, Photo Annotation, Spatial Pyramid


Our group within the University of Amsterdam participated in the large-scale visual concept detection task of ImageCLEF 2009. Our experiments focus on increasing the robustness of the individual concept detectors based on the bag-of-words approach, and less on the hierarchical nature of the concept set used. To increase the robustness of individual concept detectors, our experiments emphasize in particular the role of visual sampling, the value of color invariant features, the influence of codebook construction, and the effectiveness of kernel-based learning parameters. The participation in ImageCLEF 2009 has been successful, resulting in the top ranking for the large-scale visual concept detection task in terms of both EER and AUC. For 40 out of 53 individual concepts, we obtain the best performance of all submissions to this task. For the hierarchical evaluation, which considers the whole hierarchy of concepts instead of single detectors, using the concept likelihoods estimated by our detectors directly works better than scaling these likelihoods based on the class priors.
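The bag-of-words approach the abstract relies on can be sketched end to end: cluster local descriptors into a visual codebook, then represent each image as a normalized histogram of codebook assignments. A simplified numpy illustration with random stand-ins for SIFT-like descriptors (the authors additionally use color-invariant features, spatial pyramids and kernel SVMs):

```python
import numpy as np

def kmeans(data, k, iters=20, seed=0):
    """Tiny k-means to build the visual codebook."""
    rng = np.random.default_rng(seed)
    centers = data[rng.choice(len(data), k, replace=False)]
    for _ in range(iters):
        # Assign each descriptor to its nearest center, then recenter.
        d = np.linalg.norm(data[:, None] - centers[None], axis=2)
        labels = d.argmin(axis=1)
        for j in range(k):
            if (labels == j).any():
                centers[j] = data[labels == j].mean(axis=0)
    return centers

def bow_histogram(descriptors, codebook):
    """Quantize local descriptors against the codebook and return a
    normalized word-count histogram: the image's bag-of-words vector."""
    d = np.linalg.norm(descriptors[:, None] - codebook[None], axis=2)
    words = d.argmin(axis=1)
    hist = np.bincount(words, minlength=len(codebook)).astype(float)
    return hist / hist.sum()

rng = np.random.default_rng(1)
local_descriptors = rng.random((500, 8))   # stand-ins for SIFT descriptors
codebook = kmeans(local_descriptors, k=16)
image = rng.random((40, 8))                # one image's local descriptors
h = bow_histogram(image, codebook)
print(h.shape, round(float(h.sum()), 6))   # (16,) 1.0
```

A concept detector is then simply a classifier (here the authors use kernel-based learning) trained on these fixed-length histograms, one per concept.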

Good example of the content to expect from ImageCLEF papers.

This is a very important area of rapidly developing research.

ImageCLEF – The CLEF Cross Language Image Retrieval Track

Saturday, November 6th, 2010

ImageCLEF – The CLEF Cross Language Image Retrieval Track.

The European side of working with digital images.

From the 2009 event website:

ImageCLEF is the cross-language image retrieval track run as part of the Cross Language Evaluation Forum (CLEF) campaign. This track evaluates retrieval of images described by text captions based on queries in a different language; both text and image matching techniques are potentially exploitable.

TREC Video Retrieval Evaluation

Saturday, November 6th, 2010

TREC Video Retrieval Evaluation.

Since I have posted several resources on digital video and concept discovery today, listing the TREC track on the same seemed appropriate.

From the website:

The TREC conference series is sponsored by the National Institute of Standards and Technology (NIST) with additional support from other U.S. government agencies. The goal of the conference series is to encourage research in information retrieval by providing a large test collection, uniform scoring procedures, and a forum for organizations interested in comparing their results. In 2001 and 2002 the TREC series sponsored a video “track” devoted to research in automatic segmentation, indexing, and content-based retrieval of digital video. Beginning in 2003, this track became an independent evaluation (TRECVID) with a workshop taking place just before TREC.

You will find publications, tools, bibliographies, data sets, etc. A first class resource site.

Internet Multimedia Search and Mining

Saturday, November 6th, 2010

Internet Multimedia Search and Mining by Xian-Sheng Hua, Marcel Worring, and Tat-Seng Chua.


In this chapter, we address the visual learning of automatic concept detectors from web video as available from services like YouTube. While allowing a much more efficient, flexible, and scalable concept learning compared to expert labels, web-based detectors perform poorly when applied to different domains (such as specific TV channels). We address this domain change problem using a novel approach, which – after an initial training on web content – performs a highly efficient online adaptation on the target domain.

In quantitative experiments on data from YouTube and from the TRECVID campaign, we first validate that domain change appears to be the key problem for web-based concept learning, with much more significant impact than other phenomena like label noise. Second, the proposed adaptation is shown to improve the accuracy of web-based detectors significantly, even over SVMs trained on the target domain. Finally, we extend our approach with active learning such that adaptation can be interleaved with manual annotation for an efficient exploration of novel domains.
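The adaptation loop the chapter describes (train on web data, then interleave uncertainty-based queries on the target domain with re-training) can be sketched with a toy classifier. This shows only the structure of the loop: logistic regression stands in for the SVMs, and the two "domains" are synthetic 2-D Gaussians of my own construction:

```python
import numpy as np

def sigmoid(z):
    return 1 / (1 + np.exp(-np.clip(z, -30, 30)))

def train(X, y, w=None, epochs=300, lr=0.3):
    """Logistic regression by gradient descent (a stand-in for the
    kernel SVMs used in the chapter)."""
    Xb = np.hstack([X, np.ones((len(X), 1))])     # add a bias column
    w = np.zeros(Xb.shape[1]) if w is None else w.copy()
    for _ in range(epochs):
        w -= lr * Xb.T @ (sigmoid(Xb @ w) - y) / len(y)
    return w

def predict(X, w):
    return sigmoid(np.hstack([X, np.ones((len(X), 1))]) @ w)

rng = np.random.default_rng(0)
# "Web" source domain and a shifted "target" domain (e.g. a TV channel).
X_web = rng.normal(0, 1, (200, 2)); y_web = (X_web[:, 0] > 0).astype(float)
X_tgt = rng.normal(0, 1, (200, 2)) + [2, 0]
y_tgt = (X_tgt[:, 0] > 2).astype(float)

w = train(X_web, y_web)                 # detector trained on web content
X_l, y_l = list(X_web), list(y_web)
for _ in range(20):                     # active adaptation loop
    p = predict(X_tgt, w)
    i = int(np.abs(p - 0.5).argmin())   # most uncertain target example
    X_l.append(X_tgt[i]); y_l.append(y_tgt[i])   # "manual" annotation
    w = train(np.array(X_l), np.array(y_l), w)   # warm-start re-training

acc = ((predict(X_tgt, w) > 0.5) == y_tgt).mean()
print(f"target accuracy after adaptation: {acc:.2f}")
```

The point of the loop is economy: each queried label is the one the current detector is least sure about, which is where manual annotation effort buys the most.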

The authors cite authority for the proposition that by 2013, 91% of all Internet traffic will be digital video.

Perhaps, perhaps not, but in any event, “concept detection” is an important aid to topic map authors working with digital video.


  1. Later research on “concept detection” in digital video? (annotated bibliography)
  2. Use in library contexts? (3-5 pages, citations)
  3. How would you design human augmentation of automated detection? (project)