Archive for the ‘Tensors’ Category

Text To Image Synthesis Using Thought Vectors

Sunday, August 28th, 2016

Text To Image Synthesis Using Thought Vectors by Paarth Neekhara.

Abstract:

This is an experimental tensorflow implementation of synthesizing images from captions using Skip Thought Vectors. The images are synthesized using the GAN-CLS Algorithm from the paper Generative Adversarial Text-to-Image Synthesis. This implementation is built on top of the excellent DCGAN in Tensorflow. The following is the model architecture. The blue bars represent the Skip Thought Vectors for the captions.

OK, that didn’t grab my attention, but this did:

[Image: generated-images-tensorflow-460. Full size image.]

Not quite “Tea, Earl Grey, Hot,” but a step in that direction!

Holographic Embeddings of Knowledge Graphs [Are You Blinding/Gelding Raw Data?]

Monday, October 19th, 2015

Holographic Embeddings of Knowledge Graphs by Maximilian Nickel, Lorenzo Rosasco, Tomaso Poggio.

Abstract:

Learning embeddings of entities and relations is an efficient and versatile method to perform machine learning on relational data such as knowledge graphs. In this work, we propose holographic embeddings (HolE) to learn compositional vector space representations of entire knowledge graphs. The proposed method is related to holographic models of associative memory in that it employs circular correlation to create compositional representations. By using correlation as the compositional operator HolE can capture rich interactions but simultaneously remains efficient to compute, easy to train, and scalable to very large datasets. In extensive experiments we show that holographic embeddings are able to outperform state-of-the-art methods for link prediction in knowledge graphs and relational learning benchmark datasets.
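The abstract leans on circular correlation as the compositional operator, so a small sketch of that operator may help. It computes the correlation directly from its definition and again via the FFT, then forms a dot product with a relation vector, which is how I read the paper's scoring of a triple. The vectors, the dimension, and the function names are my own illustrative assumptions, not the authors' code.

```python
import numpy as np

def circular_correlation(a, b):
    """Direct definition: result[k] = sum_i a[i] * b[(i + k) % d]."""
    d = len(a)
    return np.array([sum(a[i] * b[(i + k) % d] for i in range(d))
                     for k in range(d)])

def circular_correlation_fft(a, b):
    """Same quantity for real vectors, computed as ifft(conj(fft(a)) * fft(b))."""
    return np.real(np.fft.ifft(np.conj(np.fft.fft(a)) * np.fft.fft(b)))

# Hypothetical random embeddings, only to exercise the operator.
rng = np.random.default_rng(0)
d = 8
subject, obj, relation = rng.normal(size=(3, d))

composed = circular_correlation_fft(subject, obj)  # stays d-dimensional
score = float(relation @ composed)                 # unnormalized triple score
```

Note that the composed vector has the same width as its inputs, which is the "fixed-width" property the authors mention, and that swapping subject and object generally changes the result, so the operator can distinguish the direction of a relation.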

Heavy sledding but also a good candidate for practicing How to Read a Paper.

I suggest that in part because of this comment by the authors in the conclusion:

In future work we plan to further exploit the fixed-width representations of holographic embeddings in complex scenarios, as they are especially suitable to model higher-arity relations (e.g., taughtAt(John, AI, MIT)) and facts about facts (e.g., believes(John, loves(Tom, Mary))).

Any representation in which statements of “higher-arity relations” and “facts about facts” are not easily recorded and processed is seriously impaired when it comes to capturing human knowledge.

Perhaps capturing only triples and “facts” explains the multiple failures of the U.S. intelligence community. It is working with tools that blind and geld its raw data. The rich nuances of intelligence data are lost in a grayish paste suitable for computer consumption.

A line of research worth following. Maximilian Nickel’s homepage at MIT is a good place to start.

I first saw this in a tweet by Stefano Bertolo.

The tensor renaissance in data science

Saturday, May 16th, 2015

The tensor renaissance in data science by Ben Lorica.

From the post:

After sitting in on UC Irvine Professor Anima Anandkumar’s Strata + Hadoop World 2015 in San Jose presentation, I wrote a post urging the data community to build tensor decomposition libraries for data science. The feedback I’ve gotten from readers has been extremely positive. During the latest episode of the O’Reilly Data Show Podcast, I sat down with Anandkumar to talk about tensor decomposition, machine learning, and the data science program at UC Irvine.

Modeling higher-order relationships

The natural question is: why use tensors when (large) matrices can already be challenging to work with? Proponents are quick to point out that tensors can model more complex relationships. Anandkumar explains:

Tensors are higher order generalizations of matrices. While matrices are two-dimensional arrays consisting of rows and columns, tensors are now multi-dimensional arrays. … For instance, you can picture tensors as a three-dimensional cube. In fact, I have here on my desk a Rubik’s Cube, and sometimes I use it to get a better understanding when I think about tensors. … One of the biggest use of tensors is for representing higher order relationships. … If you want to only represent pair-wise relationships, say co-occurrence of every pair of words in a set of documents, then a matrix suffices. On the other hand, if you want to learn the probability of a range of triplets of words, then we need a tensor to record such relationships. These kinds of higher order relationships are not only important for text, but also, say, for social network analysis. You want to learn not only about who is immediate friends with whom, but, say, who is friends of friends of friends of someone, and so on. Tensors, as a whole, can represent much richer data structures than matrices.

The passage:

…who is friends of friends of friends of someone, and so on. Tensors, as a whole, can represent much richer data structures than matrices.

caught my attention.

The same could be said about other data structures, such as graphs.

I mention graphs because data representations carry assumptions and limitations that aren’t labeled for casual users. Directed acyclic graphs, for example, do not support the representation of husband-wife relationships: a mutual relationship needs an edge in each direction, and that pair of edges is a cycle.
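One way to make the husband-wife point concrete is to store the relationship as two directed edges and check whether the graph is still acyclic. This is a minimal sketch using Python's standard `graphlib` module (Python 3.9 or later); the names and edges are hypothetical, and modeling the relationship as a pair of directed edges is my assumption.

```python
from graphlib import CycleError, TopologicalSorter

def is_acyclic(edges):
    """Return True if the directed edges admit a topological order."""
    predecessors = {}
    for source, target in edges:
        predecessors.setdefault(source, set())
        predecessors.setdefault(target, set()).add(source)
    try:
        # static_order() raises CycleError once iterated if a cycle exists.
        tuple(TopologicalSorter(predecessors).static_order())
        return True
    except CycleError:
        return False

# Hypothetical family data: parent-of edges only point one way.
parent_edges = [("Alice", "Carol"), ("Bob", "Carol")]
# A mutual spouse relationship needs an edge in each direction.
spouse_edges = [("Alice", "Bob"), ("Bob", "Alice")]

print(is_acyclic(parent_edges))
print(is_acyclic(parent_edges + spouse_edges))
```

The parent-only graph passes the check, while adding the two spouse edges fails it, which is the unlabeled limitation a casual user of a DAG-based tool would run into.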

BTW, the Wikipedia entry on tensors has this introduction to defining tensor:

There are several approaches to defining tensors. Although seemingly different, the approaches just describe the same geometric concept using different languages and at different levels of abstraction.

Wonder if there is a mapping between the components of the different approaches?

Suggestions of other tensor resources appreciated!

Tensor Decompositions and Applications

Tuesday, March 26th, 2013

Tensor Decompositions and Applications by Tamara G. Kolda and Brett W. Bader.

Abstract:

This survey provides an overview of higher-order tensor decompositions, their applications, and available software. A tensor is a multidimensional or N-way array. Decompositions of higher-order tensors (i.e., N-way arrays with N ≥ 3) have applications in psychometrics, chemometrics, signal processing, numerical linear algebra, computer vision, numerical analysis, data mining, neuroscience, graph analysis, and elsewhere. Two particular tensor decompositions can be considered to be higher-order extensions of the matrix singular value decomposition: CANDECOMP/PARAFAC (CP) decomposes a tensor as a sum of rank-one tensors, and the Tucker decomposition is a higher-order form of principal component analysis. There are many other tensor decompositions, including INDSCAL, PARAFAC2, CANDELINC, DEDICOM, and PARATUCK2 as well as nonnegative variants of all of the above. The N-way Toolbox, Tensor Toolbox, and Multilinear Engine are examples of software packages for working with tensors.

At forty-five pages and two hundred and forty-five (245) references, this is a broad survey of tensor decomposition with numerous pointers to other surveys and more specialized works.
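The abstract's description of CP as "a sum of rank-one tensors" can be sketched in a few lines of NumPy. To be clear about what this is: it builds a three-way tensor from made-up factor matrices, it does not fit a decomposition to data, and the sizes, rank, and variable names are arbitrary choices of mine.

```python
import numpy as np

rng = np.random.default_rng(0)
I, J, K, R = 4, 3, 2, 2          # tensor dimensions and CP rank (arbitrary)
A = rng.normal(size=(I, R))      # one factor matrix per mode
B = rng.normal(size=(J, R))
C = rng.normal(size=(K, R))

# Sum of R rank-one tensors, each the outer product of one column
# from A, one from B, and one from C.
X = np.zeros((I, J, K))
for r in range(R):
    X += np.einsum("i,j,k->ijk", A[:, r], B[:, r], C[:, r])

# The same construction written as a single einsum.
X_einsum = np.einsum("ir,jr,kr->ijk", A, B, C)

# Flattening the last two modes gives a matrix whose rank is at most R.
unfolded_rank = np.linalg.matrix_rank(X.reshape(I, J * K))
```

A fitting routine goes the other way: given `X`, it searches for `A`, `B`, and `C` that reproduce it as closely as possible, which is what the software packages listed in the abstract are for.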

I found this shortly after discovering the post I cover in: Tensors and Their Applications…

As I said in the earlier post, this has a lot of promise.

It isn’t yet clear to me, though, how you would compare/contrast tensors with different dimensions, and perhaps even a different number of dimensions.

Still, there is a lot of reading to do, so perhaps I haven’t reached that point yet.

Tensors and Their Applications…

Saturday, March 23rd, 2013

Tensors and Their Applications in Graph-Structured Domains by Maximilian Nickel and Volker Tresp. (Slides.)

Along with the slides, you will like the abstract and bibliography found at: Machine Learning on Linked Data: Tensors and their Applications in Graph-Structured Domains.

Abstract:

Machine learning has become increasingly important in the context of Linked Data as it is an enabling technology for many important tasks such as link prediction, information retrieval or group detection. The fundamental data structure of Linked Data is a graph. Graphs are also ubiquitous in many other fields of application, such as social networks, bioinformatics or the World Wide Web. Recently, tensor factorizations have emerged as a highly promising approach to machine learning on graph-structured data, showing both scalability and excellent results on benchmark data sets, while matching perfectly to the triple structure of RDF. This tutorial will provide an introduction to tensor factorizations and their applications for machine learning on graphs. By the means of concrete tasks such as link prediction we will discuss several factorization methods in-depth and also provide necessary theoretical background on tensors in general. Emphasis is put on tensor models that are of interest to Linked Data, which will include models that are able to factorize large-scale graphs with millions of entities and known facts or models that can handle the open-world assumption of Linked Data. Furthermore, we will discuss tensor models for temporal and sequential graph data, e.g. to analyze social networks over time.
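The remark that tensors match "the triple structure of RDF" is easier to see with a toy example. The sketch below stores (subject, predicate, object) facts in a binary entity × entity × relation array, which is one common layout for this kind of factorization. The facts and names are hypothetical, and nothing here is taken from the tutorial's own code.

```python
import numpy as np

# Hypothetical (subject, predicate, object) facts.
triples = [
    ("John", "worksAt", "MIT"),
    ("John", "knows", "Mary"),
    ("Mary", "worksAt", "MIT"),
]

entities = sorted({s for s, _, _ in triples} | {o for _, _, o in triples})
relations = sorted({p for _, p, _ in triples})
entity_index = {name: i for i, name in enumerate(entities)}
relation_index = {name: k for k, name in enumerate(relations)}

# X[i, j, k] == 1 means "entity i stands in relation k to entity j".
X = np.zeros((len(entities), len(entities), len(relations)), dtype=np.int8)
for s, p, o in triples:
    X[entity_index[s], entity_index[o], relation_index[p]] = 1

print(X.shape, int(X.sum()))
```

Each relation gets its own slice of the array, so link prediction becomes a question of which zero entries a factorization of `X` scores highly. A real data set would need a sparse representation rather than a dense array.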

Devising a system to deal with the heterogeneous nature of linked data.

Just skimming the slides I could see, this looks very promising.

I first saw this in a tweet by Stefano Bertolo.


Update: I just got an email from Maximilian Nickel and he has altered the transition between slides. Working now!

From slide 53 forward is pure gold for topic map purposes.

Heavy sledding but let me give you one statement from the slides that should capture your interest:

Instance matching: Ranking of entities by their similarity in the entity-latent-component space.

Although written about linked data, the technique is not limited to linked data.

What is more, Maximilian offers proof that the technique scales!

Complex, configurable, scalable determination of subject identity!
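To illustrate the instance-matching statement quoted above, here is a minimal sketch of ranking entities by similarity in a latent-component space. The entity names and latent vectors are hand-made, and cosine similarity is my assumption; the slides only say "similarity".

```python
import numpy as np

def rank_by_similarity(latent, names, query):
    """Rank all other entities by cosine similarity to `query`."""
    unit = latent / np.linalg.norm(latent, axis=1, keepdims=True)
    q = names.index(query)
    sims = unit @ unit[q]
    order = np.argsort(-sims)  # most similar first
    return [(names[i], float(sims[i])) for i in order if i != q]

# Hypothetical entities and hand-made latent components; in practice the
# rows would come from a tensor factorization.
names = ["J. Smith", "John Smith", "Mary Jones"]
latent = np.array([[0.9, 0.1],
                   [0.8, 0.2],
                   [0.1, 0.9]])

ranking = rank_by_similarity(latent, names, "J. Smith")
for name, similarity in ranking:
    print(f"{name}: {similarity:.3f}")
```

With these values "John Smith" ranks ahead of "Mary Jones" as a candidate match for "J. Smith". The configurable part is the choice of similarity measure and the cutoff for declaring two entities the same subject.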

[Update: deleted note about issues with slides, which read: (Slides for ISWC 2012 tutorial, Chrome is your best bet. Even better bet, Chrome on Windows. Chrome on Ubuntu crashed every time I tried to go to slide #15. Windows gets to slide #46 before failing to respond. I have written to inquire about the slides.)]