Archive for the ‘Factor Graphs’ Category

Accelerating Inference: towards a full Language, Compiler and Hardware stack

Friday, December 14th, 2012

Accelerating Inference: towards a full Language, Compiler and Hardware stack by Shawn Hershey, Jeff Bernstein, Bill Bradley, Andrew Schweitzer, Noah Stein, Theo Weber, Ben Vigoda.

Abstract:

We introduce Dimple, a fully open-source API for probabilistic modeling. Dimple allows the user to specify probabilistic models in the form of graphical models, Bayesian networks, or factor graphs, and performs inference (by automatically deriving an inference engine from a variety of algorithms) on the model. Dimple also serves as a compiler for GP5, a hardware accelerator for inference.

From the introduction:

Graphical models alleviate the complexity inherent to large dimensional statistical models (the so-called curse of dimensionality) by dividing the problem into a series of logically (and statistically) independent components. By factoring the problem into subproblems with known and simple interdependencies, and by adopting a common language to describe each subproblem, one can considerably simplify the task of creating complex Bayesian models. Modularity can be taken advantage of further by leveraging this modeling hierarchy over several levels (e.g. a submodel can also be decomposed into a family of sub-submodels). Finally, by providing a framework which abstracts the key concepts underlying classes of models, graphical models allow the design of general algorithms which can be efficiently applied across completely different fields, and systematically derived from a model description.
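To make the factoring point concrete, here is a minimal sketch in plain Python. It is not Dimple code: the three binary variables A, B, C, the two factor tables and the function names are all made up for illustration. It computes the marginal of C twice, once by summing the full joint and once by using the factorization to eliminate A and then B.

```python
from itertools import product

# Two local factors over binary variables A-B and B-C.
# The numbers are arbitrary illustrative values, not taken from the paper.
f_ab = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}
f_bc = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}


def marginal_c_brute_force():
    """Sum the full joint f_ab(a, b) * f_bc(b, c) over every assignment."""
    scores = [0.0, 0.0]
    for a, b, c in product((0, 1), repeat=3):
        scores[c] += f_ab[(a, b)] * f_bc[(b, c)]
    total = sum(scores)
    return [s / total for s in scores]


def marginal_c_factored():
    """Use the factorization: sum out A first, then B."""
    # Message into B: for each value of b, sum f_ab over a.
    msg_b = [sum(f_ab[(a, b)] for a in (0, 1)) for b in (0, 1)]
    # Combine the message with f_bc and sum out b.
    scores = [sum(msg_b[b] * f_bc[(b, c)] for b in (0, 1)) for c in (0, 1)]
    total = sum(scores)
    return [s / total for s in scores]


if __name__ == "__main__":
    print(marginal_c_brute_force())
    print(marginal_c_factored())
```

Both functions return the same distribution. On a chain of n variables the brute-force sum grows exponentially in n, while the factored version does a fixed amount of work per link, which is the saving the introduction is pointing at.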

Suggestive of sub-models of merging?

I first saw this in a tweet from Stefano Bertolo.

factorie: Probabilistic programming with imperatively-defined factor graphs

Friday, March 11th, 2011

factorie: Probabilistic programming with imperatively-defined factor graphs

From the website:

FACTORIE has been successfully applied to various tasks in natural language processing and information integration, including

  • named entity recognition
  • entity resolution
  • relation extraction
  • parsing
  • schema matching
  • ontology alignment
  • latent-variable generative models, including latent Dirichlet allocation.

Sounds like topic map tasks to me!

Currently at version 0.90, but the website indicates that the project is planning a 1.0 release in early 2011.

Just so you know what you are looking forward to:

FACTORIE is a toolkit for deployable probabilistic modeling, implemented as a software library in Scala. It provides its users with a succinct language for creating relational factor graphs, estimating parameters and performing inference. Key features:

  • It is object-oriented, enabling encapsulation, abstraction and inheritance in the definition of random variables, factors, inference and learning methods.
  • It is scalable, with demonstrated success on problems with many millions of variables and factors, and on models that have changing structure, such as case factor diagrams. It has also been plugged into a database back-end, representing a new approach to probabilistic databases capable of handling billions of variables.
  • It is flexible, supporting multiple modeling and inference paradigms. Its original emphasis was on conditional random fields, undirected graphical models, MCMC inference, online training, and discriminative parameter estimation. However, it now also supports directed generative models (such as latent Dirichlet allocation), and has preliminary support for variational inference, including belief propagation and mean-field methods.
  • It is embedded into a general purpose programming language, providing model authors with familiar and extensive resources for implementing the procedural aspects of their solution, including the ability to beneficially mix data pre-processing, diagnostics, evaluation, and other book-keeping code in the same files as the probabilistic model specification.
  • It allows the use of imperative (procedural) constructs to define the factor graph—an unusual and powerful facet that enables significant efficiencies and also supports the injection of both declarative and procedural domain knowledge into model design.
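For readers who have not met MCMC inference on an undirected model, here is a rough sketch of a Gibbs sampler in plain Python. It does not use FACTORIE or its API. The chain A-B-C, the factor tables and the function name are assumptions made up for this example. Each sweep resamples one variable at a time from its conditional given its neighbors, and the fraction of sweeps with C = 0 estimates that marginal.

```python
import random

# Hypothetical pairwise factors on a chain A-B-C of binary variables.
f_ab = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.4, (1, 1): 0.6}
f_bc = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}


def gibbs_marginal_c(sweeps=50_000, burn_in=1_000, seed=0):
    """Estimate P(C = 0) by Gibbs sampling over (a, b, c)."""
    rng = random.Random(seed)
    a, b, c = 0, 0, 0
    hits = 0
    for sweep in range(sweeps + burn_in):
        # Resample A given B: only the factor f_ab touches A.
        w0, w1 = f_ab[(0, b)], f_ab[(1, b)]
        a = 0 if rng.random() < w0 / (w0 + w1) else 1
        # Resample B given A and C: both factors touch B.
        w0 = f_ab[(a, 0)] * f_bc[(0, c)]
        w1 = f_ab[(a, 1)] * f_bc[(1, c)]
        b = 0 if rng.random() < w0 / (w0 + w1) else 1
        # Resample C given B: only the factor f_bc touches C.
        w0, w1 = f_bc[(b, 0)], f_bc[(b, 1)]
        c = 0 if rng.random() < w0 / (w0 + w1) else 1
        # Count samples only after the burn-in period.
        if sweep >= burn_in and c == 0:
            hits += 1
    return hits / sweeps


if __name__ == "__main__":
    print(gibbs_marginal_c())
```

For these particular tables the exact value of P(C = 0), obtained by enumeration, is 0.525, so the estimate should land close to that. The point of the sketch is that each update only looks at the factors touching one variable, which is why this style of inference can cope with very large graphs.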

The structure of generative models can be expressed as a program that describes the generative storyline. The structure of undirected graphical models can be specified in an entity-relationship language, in which the factor templates are expressed as compatibility functions on arbitrary entity-relationship expressions; alternatively, factor templates may also be specified as formulas in first-order logic. However, most generally, data can be stored in arbitrary data structures (much as one would in deterministic programming), and the connectivity patterns of factor templates can be specified in a Turing-complete imperative style. This usage of imperative programming to define various aspects of factor graph construction and operation is an innovation originated in FACTORIE; we term this approach imperatively-defined factor graphs. The above three methods for specifying relational factor graph structure can be mixed in the same model.
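A rough sketch of what "imperatively-defined" might look like, in plain Python rather than FACTORIE's Scala. This is not the FACTORIE API: the Token class, the function names and the skip-chain rule are hypothetical, chosen only to show ordinary loops and conditions deciding which variables share a factor.

```python
from dataclasses import dataclass


@dataclass
class Token:
    word: str
    label: str  # current label assignment, e.g. "PER" or "O"


def skip_chain_pairs(tokens):
    """Enumerate factor neighborhoods with ordinary imperative code.

    Hypothetical rule: two label variables share a factor when their
    tokens are capitalized and spelled identically.
    """
    pairs = []
    for i, left in enumerate(tokens):
        for j in range(i + 1, len(tokens)):
            right = tokens[j]
            if left.word == right.word and left.word[:1].isupper():
                pairs.append((i, j))
    return pairs


def model_score(tokens, agree_weight=1.0):
    """Unnormalized log-score: reward each linked pair whose labels agree."""
    return sum(
        agree_weight
        for i, j in skip_chain_pairs(tokens)
        if tokens[i].label == tokens[j].label
    )


if __name__ == "__main__":
    sentence = [Token("Smith", "PER"), Token("met", "O"), Token("Smith", "PER")]
    print(skip_chain_pairs(sentence))
    print(model_score(sentence))
```

Note that the set of factors is not fixed in advance. It is recomputed from whatever is in the data structure, so changing the tokens changes the graph. That is the flavor of the "connectivity patterns specified in an imperative style" described above, though the real library is considerably richer.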