Archive for the ‘Hypergraphs’ Category

ConceptNet5 [Herein of Hypergraphs]

Friday, May 17th, 2013

ConceptNet5

From the website:

ConceptNet is a semantic network containing lots of things computers should know about the world, especially when understanding text written by people.

It is built from nodes representing concepts, in the form of words or short phrases of natural language, and labeled relationships between them. These are the kinds of things computers need to know to search for information better, answer questions, and understand people’s goals. If you wanted to build your own Watson, this should be a good place to start!

ConceptNet contains everyday basic knowledge:

(…)

ConceptNet 5 is a graph

To be precise, it’s a hypergraph, meaning it has edges about edges. Each statement in ConceptNet has justifications pointing to it, explaining where it comes from and how reliable the information seems to be.

Previous versions of ConceptNet have been distributed as idiosyncratic database structures plus some software to interact with them. ConceptNet 5 is not a piece of software or a database; it is a graph. It’s a set of nodes and edges, which we can represent in multiple formats including JSON. You probably know better than we do what software you want to use to interact with it!

(That said, you can have our idiosyncratic Solr index if you want, but that’s not ConceptNet, it’s just a system for quickly looking things up in ConceptNet.)

Some other interesting properties:

  • The ConceptNet graph is ID-less. Every node and assertion contains all the information necessary to identify it and no more in its URI, and does not rely on arbitrarily-assigned IDs. The advantage of this is that if multiple branches of ConceptNet are developed in multiple places, we can later merge them simply by taking the union of the nodes and edges. (And we hope for this to happen!)
  • ConceptNet supports linked data: you can download a list of links to the greater Semantic Web, via DBPedia and via RDF/OWL WordNet. For example, our concept cat is linked to the DBPedia node at http://dbpedia.org/resource/Cat.
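The “edges about edges” and “ID-less” properties can be sketched in a few lines of Python. This is only an illustration, not ConceptNet’s own code: the class names, the `merge` helper, the `/s/site_a` style source names and the exact URI layout are assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """An edge whose identity is its content, not an assigned ID."""
    rel: str
    start: str
    end: str

    @property
    def uri(self) -> str:
        # Assumed URI layout, derived entirely from the content.
        return f"/a/[{self.rel}/,{self.start}/,{self.end}/]"


@dataclass(frozen=True)
class Justification:
    """An edge about an edge: it points at an assertion's URI."""
    source: str
    target_uri: str


def merge(branch_a, branch_b):
    # With content-derived identity, merging two branches is a set union.
    return set(branch_a) | set(branch_b)


cat = Assertion("/r/IsA", "/c/en/cat", "/c/en/animal")
site_a = {cat, Justification("/s/site_a", cat.uri)}
site_b = {Assertion("/r/IsA", "/c/en/cat", "/c/en/animal"),
          Justification("/s/site_b", cat.uri)}
merged = merge(site_a, site_b)
```

The assertion built independently at each site collapses into one element of `merged`, while both justifications survive and still point at it.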

In addition to being a data source, ConceptNet5 offers an interesting notion of “ID-less” nodes and edges.

Information on the software setup (Solr and Python) used to deliver ConceptNet5 as a hypergraph is also available.

I first saw this in Max De Marzi’s Knowledge Bases in Neo4j. You will find that Max’s approach involves dumbing down the hypergraph.

Why Hypergraphs?

Thursday, April 25th, 2013

Why Hypergraphs? by Linas Vepstas.

From the post:

OpenCog uses hypergraphs to represent knowledge. Why? I don’t think this is clearly, succinctly explained anywhere, so I will try to do so here. This is a very important point: I can’t begin to tell you how many times I went searching for some whiz-bang logic programming system, or inference engine, or theorem-prover, or some graph re-writing engine, or some probabilistic programming system, only to throw my hands up and realize that, after many wasted hours, none of them do what I want. If you’re interested in AGI, then let me assure you: they don’t do what you want, either. So, what do I want them to do, and why?

Well, let’s begin easy: with graph re-writing systems. These days, almost everyone agrees that a great way to represent knowledge is with graphs. The structure IsA(Cat, Animal) looks like a graph with two vertexes, Cat and Animal, and a labelled edge, IsA, between them. If I also know that IsA(Binky, Cat), then, in principle, I should be able to deduce that IsA(Binky, Animal). This is a simple transitive relationship, and the act of logical deduction, for this example, is a simple graph re-write rule: If you see two IsA edges in a row, you should draw a third IsA edge between the first and the last vertex. Easy, right?

So perhaps you’d think that all logic induction and reasoning engines have graph rewrite systems at their core, right? So you’d think. In fact, almost none of them do. And those that do, do it in some internal, ad hoc, non-public, undocumented way: there’s no API, it’s not exposed externally; it’s not an ‘official’ part of the system for you to use or tinker with.
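The IsA rewrite rule from the excerpt is small enough to sketch. This is a toy in Python, not anything from OpenCog; the function name `close_isa` and the choice to store edges as pairs are assumptions.

```python
def close_isa(edges):
    """Apply IsA(a, b), IsA(b, c) => IsA(a, c) until no new edge appears."""
    closed = set(edges)
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closed):
            for (c, d) in list(closed):
                # Two IsA edges in a row: draw a third from first to last.
                if b == c and (a, d) not in closed:
                    closed.add((a, d))
                    changed = True
    return closed


facts = {("Binky", "Cat"), ("Cat", "Animal")}
derived = close_isa(facts)
```

After the call, `derived` holds the two original edges plus the deduced `("Binky", "Animal")`.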

You know how I feel about AI triumphalism so I won’t bother to repeat the rant.

However, the hypergraph part of this work looks interesting, whatever your views on AI.

A good place to start would be the OpenCog Development page.

HyperGraphDB: A Generalized Graph Database

Thursday, February 14th, 2013

HyperGraphDB: A Generalized Graph Database by Borislav Iordanov.

Abstract:

We present HyperGraphDB, a novel graph database based on generalized hypergraphs where hyperedges can contain other hyperedges. This generalization automatically reifies every entity expressed in the database thus removing many of the usual difficulties in dealing with higher-order relationships. An open two-layered architecture of the data organization yields a highly customizable system where specific domain representations can be optimized while remaining within a uniform conceptual framework. HyperGraphDB is an embedded, transactional database designed as a universal data model for highly complex, large scale knowledge representation applications such as found in artificial intelligence, bioinformatics and natural language processing.

A formal treatment of HyperGraphDB.

Merits being printed out and given a slow read.
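To make “hyperedges can contain other hyperedges” concrete, here is a minimal Python sketch. It is not HyperGraphDB’s Java API; the `Atom` class and the sample names are assumptions chosen to show why reification comes for free when a link is itself an atom.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Atom:
    """A node or a link; a link is an atom with a non-empty target tuple."""
    value: str
    targets: Tuple["Atom", ...] = ()

    @property
    def arity(self) -> int:
        return len(self.targets)


alice = Atom("Alice")
bob = Atom("Bob")
carol = Atom("Carol")

# An ordinary link between two nodes.
knows = Atom("knows", (alice, bob))

# Because the link is itself an atom, another link can point at it
# without any extra reification step.
claim = Atom("assertedBy", (knows, carol))
```

Here `claim` is a higher-order relationship: its first target is the `knows` link rather than a plain node.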

Borislav comments on both RDF and Topic Maps:

…Two other prominent issues are contextuality (scoping) and reification.

Those and other considerations from semantic web research disappear or find natural solutions in the model implemented by HyperGraphDB.

But when I search the paper, scoping comes up in an NLP example as:

The tree-like structure of the document is also recorded in HyperGraphDB with scoping parent-child binary links between (a) the document and its paragraphs, (b) a paragraph and its sentences, (c) a sentence and each linguistic relationship inferred from it.

Scoping at least in one sense of the word, but not in the sense of, say, a name being “scoped” by the language French.

Reification, other than the discussion of RDF and topic maps, doesn’t appear again in the paper.

As I said, it needs a slow read but if you see something about scoping and/or reification that I have missed, please give a shout!

Hypergraph-based multidimensional data modeling…

Thursday, February 14th, 2013

Hypergraph-based multidimensional data modeling towards on-demand business analysis by Duong Thi Anh Hoang, Torsten Priebe and A. Min Tjoa. (iiWAS ’11: Proceedings of the 13th International Conference on Information Integration and Web-based Applications and Services, pages 36-43.)

Abstract:

In the last few years, web-based environments have witnessed the emergence of new types of on-demand business analysis that facilitate complex and integrated analytical information from multidimensional databases. In these on-demand environments, users of business intelligence architectures can have very different reporting and analytical needs, requiring much greater flexibility and adaptability of today’s multidimensional data modeling. While structured data models for OLAP have been studied in detail, a majority of current approaches has not put its focus on the dynamic aspect of the multidimensional design and/or semantic enriched impact model. Within the scope of this paper, we present a flexible approach to model multidimensional databases in the context of dynamic web-based analysis and adaptive users’ requirements. The introduced formal approach is based on hypergraphs with the ability to provide formal constructs specifying the different types of multidimensional elements and relationships which enable the support of highly customized business analysis. The introduced hypergraphs are used to formally define the semantics of multidimensional models with unstructured ad-hoc analytic activities. The proposed model also supports a formal representation of advanced concepts like dynamic hierarchies, many-to-many associations, additivity constraints etc. Some scenario examples are also provided to motivate and illustrate the proposed approach.

If you like illustrations of technologies with examples from the banking industry, this is the paper on hypergraphs for you.

Besides, banks are where they keep the money. ;-)

Seriously, a very well illustrated introduction to the use of hypergraphs and multidimensional data modeling, plus why multidimensional data models matter to clients. (Another place where they keep money.)

HyperGraphDB 1.2 Final

Thursday, December 27th, 2012

HyperGraphDB 1.2 Final

From the post:

HyperGraphDB is a general purpose, free open-source data storage mechanism. Geared toward modern applications with complex and evolving domain models, it is suitable for semantic web, artificial intelligence, social networking or regular object-oriented business applications.

This release contains numerous bug fixes and improvements over the previous 1.1 release. A fairly complete list of changes can be found at the Changes for HyperGraphDB, Release 1.2 wiki page.

  1. Introduction of a new HyperNode interface together with several implementations, including subgraphs and access to remote database peers. The ideas behind it are documented in the blog post HyperNodes Are Contexts.
  2. Introduction of a new interface HGTypeSchema and generalized mappings between arbitrary URIs and HyperGraphDB types.
  3. Implementation of storage based on the BerkeleyDB Java Edition (many thanks to Alain Picard and Sebastian Graf!). This version of BerkeleyDB doesn’t require native libraries, which makes it easier to deploy and, in addition, performs better for smaller datasets (under 2-3 million atoms).
  4. Implementation of parameterized pre-compiled queries for improved query performance. This is documented in the Variables in HyperGraphDB Queries blog post.

HyperGraphDB is a Java based product built on top of the Berkeley DB storage library.

This release dates from November 4, 2012. Apologies for missing the news until now.

Getting Started With Hyperdex

Saturday, August 11th, 2012

Getting Started With Hyperdex by Ṣeyi Ogunyẹ́mi.

From the post:

Alright, let’s start this off with a fitting soundtrack just because we can. Open it up in a tab and come back?

Greetings, valiant adventurer!

So, I heard you care about data. You aren’t storing your precious data in anything that acknowledges PUT requests before being certain it’ll be able to return it to you? Well then, you’ve come to the right place.

Okay, I’m clearly excited, but with good reason. Some time in the past few months, I ran into a paper; “HyperDex: A Distributed, Searchable Key-Value Store” from a team at Cornell. By now the typical reaction to NoSQL news tends to be that your eyes glaze over and you start mouthing “…is Web-Scale™”, but this isn’t “yet another NoSQL database”. So, I’ve finally gotten round to writing this piece in hopes of sharing it with others.

Before plunging into the deep end, it’s probably a good idea to discuss why I’ve found HyperDex to be particularly exciting. For reasons that will probably be in a different blog post, I’ve been researching the design of a distributed key/value store with support for strong consistency (for the morbidly curious, it’s connected to Ampify). You must realise that the state-of-the-art distributed key/value stores such as Dynamo (and its open-source clone, Riak) tend to aim for eventual consistency.

If you aren’t already experimenting with HyperDex, you may well be after reading this post.

Hypergraphs and Colored Maps

Monday, July 23rd, 2012

Hypergraphs and Colored Maps by James Mallos.

From the post:

A graph, in general terms, is a set of vertices connected by edges. Finding good colorings for the vertices (or edges) of a graph may seem like a hobby interest, but, in fact, graphs bearing certain rule-based colorings represent mathematical objects that are more general than graphs themselves. By bearing colors, graphs let us see objects that could not be quite so easily drawn.

I discovered James’ site, Weave Anything, while searching for blogs on hypergraphs.

I recommend it as an entertaining way to learn more about graphs, hypergraphs, hypermaps and similar structures.

Software support for SBGN maps: SBGN-ML and LibSBGN

Friday, July 20th, 2012

Software support for SBGN maps: SBGN-ML and LibSBGN by Martijn P. van Iersel, Alice C. Villéger, Tobias Czauderna, Sarah E. Boyd, Frank T. Bergmann, Augustin Luna, Emek Demir, Anatoly Sorokin, Ugur Dogrusoz, Yukiko Matsuoka, Akira Funahashi, Mirit I. Aladjem, Huaiyu Mi, Stuart L. Moodie, Hiroaki Kitano, Nicolas Le Novère, and Falk Schreiber. (Bioinformatics 2012 28: 2016-2021.)

Warning: Unless you really like mapping and markup languages this is likely to be a boring story. If you do (and I do), it is the sort of thing you will print out and enjoy reading. Just so you know.

Abstract:

Motivation: LibSBGN is a software library for reading, writing and manipulating Systems Biology Graphical Notation (SBGN) maps stored using the recently developed SBGN-ML file format. The library (available in C++ and Java) makes it easy for developers to add SBGN support to their tools, whereas the file format facilitates the exchange of maps between compatible software applications. The library also supports validation of maps, which simplifies the task of ensuring compliance with the detailed SBGN specifications. With this effort we hope to increase the adoption of SBGN in bioinformatics tools, ultimately enabling more researchers to visualize biological knowledge in a precise and unambiguous manner.

Availability and implementation: Milestone 2 was released in December 2011. Source code, example files and binaries are freely available under the terms of either the LGPL v2.1+ or Apache v2.0 open source licenses from http://libsbgn.sourceforge.net.

Contact: sbgn-libsbgn@lists.sourceforge.net

I included the hyperlinks to standards and software for the introduction but not the article references. Those are of interest too but for the moment I only want to entice you to read the article in full. There is a lot of graph work going on in bioinformatics and we would all do well to be more aware of it.

The Systems Biology Graphical Notation (SBGN, Le Novère et al., 2009) facilitates the representation and exchange of complex biological knowledge in a concise and unambiguous manner: as standardized pathway maps. It has been developed and supported by a vibrant community of biologists, biochemists, software developers, bioinformaticians and pathway databases experts.

SBGN is described in detail in the online specifications (see http://sbgn.org/Documents/Specifications). Here we summarize its concepts only briefly. SBGN defines three orthogonal visual languages: Process Description (PD), Entity Relationship (ER) and Activity Flow (AF). SBGN maps must follow the visual vocabulary, syntax and layout rules of one of these languages. The choice of language depends on the type of pathway or process being depicted and the amount of available information. The PD language, which originates from Kitano’s Process Diagrams (Kitano et al., 2005) and the related CellDesigner tool (Funahashi et al., 2008), is equivalent to a bipartite graph (with a few exceptions) with one type of nodes representing pools of biological entities, and a second type of nodes representing biological processes such as biochemical reactions, transport, binding and degradation. Arcs represent consumption, production or control, and can only connect nodes of differing types. The PD language is very suitable for metabolic pathways, but struggles to concisely depict the combinatorial complexity of certain proteins with many phosphorylation states. The ER language, on the other hand, is inspired by Kohn’s Molecular Interaction Maps (Kohn et al., 2006), and describes relations between biomolecules. In ER, two entities can be linked with an interaction arc. The outcome of an interaction (for example, a protein complex), is considered an entity in itself, represented by a black dot, which can engage in further interactions. Thus ER represents dependencies between interactions, or putting it differently, it can represent which interaction is necessary for another one to take place. Interactions are possible between two or more entities, which make ER maps roughly equivalent to a hypergraph in which an arc can connect more than two nodes. 
ER is more concise than PD when it comes to representing protein modifications and protein interactions, although it is less capable when it comes to presenting biochemical reactions. Finally, the third language in the SBGN family is AF, which represents the activities of biomolecules at a higher conceptual level. AF is suitable to represent the flow of causality between biomolecules even when detailed knowledge on biological processes is missing.

Efficient integration of the SBGN standard into the research cycle requires adoption by visualization and modeling software. Encouragingly, a growing number of pathway tools (see http://sbgn.org/SBGN_Software) offer some form of SBGN compatibility. However, current software implementations of SBGN are often incomplete and sometimes incorrect. This is not surprising: as SBGN covers a broad spectrum of biological phenomena, complete and accurate implementation of the full SBGN specifications represents a complex, error-prone and time-consuming task for individual tool developers. This development step could be simplified, and redundant implementation efforts avoided, by accurately translating the full SBGN specifications into a single software library, available freely for any tool developer to reuse in their own project. Moreover, the maps produced by any given tool usually cannot be reused in another tool, because SBGN only defines how biological information should be visualized, but not how the maps should be stored electronically. Related community standards for exchanging pathway knowledge, namely BioPAX (Demir et al., 2010) and SBML (Hucka et al., 2003), have proved insufficient for this role (more on this topic in Section 4). Therefore, we observed a second need, for a dedicated, standardized SBGN file format.

Following these observations, we started a community effort with two goals: to encourage the adoption of SBGN by facilitating its implementation in pathway tools, and to increase interoperability between SBGN-compatible software. This has resulted in a file format called SBGN-ML and a software library called LibSBGN. Each of these two components will be explained separately in the next sections.

Of course, there is always the data prior to this markup and the data that comes afterwards, so you could say I see a role for topic maps. ;-)

Clustering by hypergraphs and dimensionality of cluster systems

Friday, May 11th, 2012

Clustering by hypergraphs and dimensionality of cluster systems by S. Albeverio and S.V. Kozyrev.

Abstract:

In the present paper we discuss the clustering procedure in the case where instead of a single metric we have a family of metrics. In this case we can obtain a partially ordered graph of clusters which is not necessarily a tree. We discuss a structure of a hypergraph above this graph. We propose two definitions of dimension for hyperedges of this hypergraph and show that for the multidimensional p-adic case both dimensions are reduced to the number of p-adic parameters.

We discuss the application of the hypergraph clustering procedure to the construction of phylogenetic graphs in biology. In this case the dimension of a hyperedge will describe the number of sources of genetic diversity.

A pleasant reminder that hypergraphs and hyperedges are simplifications of the complexity we find in nature.

If hypergraphs/hyperedges are simplifications, what would you call a graph/edges?

A simplification of a simplification?

Graphs are useful sometimes.

Useful sometimes doesn’t mean useful at all times.

Neo4j – Hyperedges and Cypher – Suggested Revisions

Friday, March 30th, 2012

Recently “Hyperedges and Cypher” was cited to illustrate “improvements” to Neo4j documentation. It is deeply problematic.

The first paragraph and header read:

5.1 Hyperedges and Cypher

Imagine a user being part of different groups. A group can have different roles, and a user can be part of different groups. He also can have different roles in different groups apart from the membership. The association of a User, a Group and a Role can be referred to as a HyperEdge. However, it can be easily modeled in a property graph as a node that captures this n-ary relationship, as depicted below in the U1G2R1 node.

This is the reader’s first encounter with “hyperedge” (other than in the table of contents). The manual offers no definition for or illustration of a “hyperedge.”

When terms are introduced, they need to be defined.

Here is the Neo4j illustration for the preceding description (from the latest milestone release):

[Image: Neo4j-Cypher-HyperEdge]

I don’t get that graph from the description in the text.

This graph comes closer:

[Image: User-Roles-Group]

You may object that role1 and role2 should be nodes rather than edges, but that is a modeling decision, another area where the Neo4j manual is weak. The reader doesn’t share in that process; nodes and edges suddenly appear and the reader must work out why.

If the current prose were cleaned up with a better description, modeling choices and alternatives could be illustrated, along with Cypher queries.

On hypergraphs/hyperedges:

A user having different roles in different groups could be modeled with a hyperedge, but not necessarily so. If Neo4j isn’t going to support hyperedges, why bring it up? Show the modeling that Neo4j does support.
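For what it’s worth, the modeling choice at issue fits in a short Python sketch: a three-way (User, Group, Role) association stored directly as a hyperedge, and the same association reified as an extra node with three binary edges, which is the workaround the manual describes. The hub names `H1`, `H2` and the relationship labels are assumptions for the example, and this is plain Python, not Cypher.

```python
# Each hyperedge joins three things at once: (user, group, role).
hyperedges = {("User1", "Group2", "Role1"), ("User1", "Group1", "Role2")}


def reify(hyperedges):
    """Replace every ternary hyperedge with a hub node and three binary edges."""
    nodes, edges = set(), set()
    for i, (user, group, role) in enumerate(sorted(hyperedges), start=1):
        hub = f"H{i}"  # hypothetical name for the node standing in for the hyperedge
        nodes.update({user, group, role, hub})
        edges.update({
            (user, "hasRoleInGroup", hub),
            (hub, "hasGroup", group),
            (hub, "hasRole", role),
        })
    return nodes, edges


def roles_of(user, group, edges):
    """Roles a user holds in one particular group, read off the reified graph."""
    hubs = {t for (s, rel, t) in edges if s == user and rel == "hasRoleInGroup"}
    in_group = {h for h in hubs if (h, "hasGroup", group) in edges}
    return {t for (s, rel, t) in edges if s in in_group and rel == "hasRole"}


nodes, edges = reify(hyperedges)
```

The hub node is what keeps “Role1 in Group2” from being confused with “Role2 in Group1”; drop it and connect the user straight to groups and roles, and that pairing is lost.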

If I were going to discuss hyperedges/hypergraphs at all, I would point out examples of where they are used, along with citations to the literature.

Visualization of Hyperedges in Fixed Graph Layouts

Wednesday, March 28th, 2012

Visualization of Hyperedges in Fixed Graph Layouts by Martin Junghans.

Abstract:

Graphs and their visualizations are widely used to communicate the structure of complex data in a formal way. Hypergraphs are dedicated to represent real-world data as they allow to relate multiple objects with each other. However, existing graph drawing techniques lack the ability to embed hyperedges into fixed two-dimensional graph layouts. We utilize a set of curves to visualize hyperedges and employ an energy-based technique to position them in the layout. By avoiding node occlusion and cluster intersections we are able to preserve the expressiveness of the given graph layout. Additionally, we investigate techniques to reduce the visual complexity of hypergraph drawings. A comprehensive evaluation using real-world data sets demonstrates the suitability of the proposed hyperedge layout techniques.

A thesis I ran across today while researching the display of hyperedges.

Graphs are being used for the storage/analysis/visualization of data. Given the history of hypergraphs in CS research, hypergraphs aren’t far behind. Now would be the time to get ahead of the curve, however briefly.

…trimming the spring algorithm for drawing hypergraphs

Tuesday, March 20th, 2012

…trimming the spring algorithm for drawing hypergraphs by Harri Klemetti, Ismo Lapinleimu, Erkki Mäkinen, and Mika Sieranta. ACM SIGCSE Bulletin, Volume 27 Issue 3, Sept. 1995.

Abstract:

Graph drawing problems provide excellent material for programming projects. As an example, this paper describes the results of an undergraduate project which dealt with hypergraph drawing. We introduce a practical method for drawing hypergraphs. The method is based on the spring algorithm, a well-known method for drawing normal graphs.

Not the earliest or the latest on drawing hypergraphs (for which there is apparently no consensus) but something I ran across while researching the issue. Thought it best to write it down so I can refer to it from other posts.

Hypergraphs have a long history with analysis of relational databases and I suspect their application to modeling NoSQL databases has already happened or at least isn’t far off. Not to mention their relevance to graph databases.

In any event, being able to visualize hypergraphs, by one or more methods, is likely to be useful not only for topic map authors and users but for other investigators as well.

Bipartite Graphs as Intermediate Model for RDF

Saturday, March 3rd, 2012

Bipartite Graphs as Intermediate Model for RDF by Jonathan Hayes and Claudio Gutierrez.

Abstract:

RDF Graphs are sets of assertions in the form of subject-predicate-object triples of information resources. Although for simple examples they can be understood intuitively as directed labeled graphs, this representation does not scale well for more complex cases, particularly regarding the central notion of connectivity of resources.

We argue in this paper that there is need for an intermediate representation of RDF to enable the application of well-established methods from Graph Theory. We introduce the concept of Bipartite Statement-Value Graph and show its advantages as intermediate model between the abstract triple syntax and data structures used by applications. In the light of this model we explore issues like transformation costs, data/schema structure, the notion of connectivity, and database mappings.

A quite different take on the representation of RDF than in Is That A Graph In Your Cray? Here we encounter hypergraphs for modeling RDF.
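A rough Python sketch of the idea as I read the abstract: every triple becomes a statement node, every resource becomes a value node, and edges only run between the two kinds. The function name, the `st0`, `st1` statement labels and the sample triples are assumptions, not taken from the paper.

```python
def to_bipartite(triples):
    """Turn (subject, predicate, object) triples into a statement-value graph."""
    values, statements, edges = set(), [], []
    for i, (s, p, o) in enumerate(triples):
        st = f"st{i}"  # hypothetical label for the statement node
        statements.append(st)
        values.update((s, p, o))
        # Every edge joins a statement node to a value node, never two of a kind.
        edges += [(st, "subject", s), (st, "predicate", p), (st, "object", o)]
    return values, statements, edges


triples = [
    ("ex:a", "ex:knows", "ex:b"),
    ("ex:knows", "rdfs:subPropertyOf", "ex:related"),
]
values, statements, edges = to_bipartite(triples)

# ex:knows is a single value node, reachable from both statements.
touching_knows = [st for (st, role, v) in edges if v == "ex:knows"]
```

In a directed labeled graph, `ex:knows` would be an edge label in the first triple and a node in the second; here it is one node, so ordinary connectivity questions can be asked of it.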

Suggestions on how to rank graph representations of RDF?

Or perhaps better, suggestions on how to rank graph representations for use cases?

Putting the question of what (connections/properties) we want to model before the question of how (RDF, etc.) we intend to model it.

Isn’t that the right order?

Comments?

The hypernode model and its associated query language

Wednesday, February 8th, 2012

The hypernode model and its associated query language

Abstract:

A data model called the hypernode model, whose single data structure is the hypernode, is introduced. Hypernodes are graphs whose node set can contain graphs in addition to primitive nodes. Hypernodes can be used to represent arbitrarily complex objects and can support the encapsulation of information, to any level. A declarative logic-based language for the hypernode model is introduced and shown to be query complete. It is then shown that hypernodes can represent extensional functions, nested relations, and composite objects. Thus, the model is at least as expressive as the functional and nested relational database models. It is demonstrated that the hypernode model can be regarded as an object-oriented one.

Interesting departure from hypergraphs with hyperedges, the latter being replaced in this model with hypernodes. A hypernode consists of a unique label, nodes (which may be primitive or hypernodes), and edges between nodes.
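A minimal Python sketch of that structure, to show the nesting. The class, the `depth` helper and the sample data are assumptions for illustration; edges refer to nodes by name, with a nested hypernode referred to by its label.

```python
from dataclasses import dataclass, field
from typing import List, Tuple, Union

Node = Union[str, "Hypernode"]


@dataclass
class Hypernode:
    label: str
    nodes: List[Node] = field(default_factory=list)
    edges: List[Tuple[str, str]] = field(default_factory=list)

    def depth(self) -> int:
        """Levels of encapsulation: 1 for a hypernode holding only primitives."""
        nested = [n.depth() for n in self.nodes if isinstance(n, Hypernode)]
        return 1 + max(nested, default=0)


address = Hypernode("ADDRESS", ["street", "12 Main St"], [("street", "12 Main St")])
person = Hypernode(
    "PERSON",
    ["name", "Ada", "home", address],
    [("name", "Ada"), ("home", "ADDRESS")],
)
```

`person` contains `address` as one of its nodes, which is the encapsulation “to any level” that the abstract mentions.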

The authors went on to create an implementation and storage model for this model.

A transient hypergraph-based model for data access

Monday, February 6th, 2012

A transient hypergraph-based model for data access by Carolyn Watters and Michael A. Shepherd.

Abstract:

Two major methods of accessing data in current database systems are querying and browsing. The more traditional query method returns an answer set that may consist of data values (DBMS), items containing the answer (full text), or items referring the user to items containing the answer (bibliographic). Browsing within a database, as best exemplified by hypertext systems, consists of viewing a database item and linking to related items on the basis of some attribute or attribute value.

A model of data access has been developed that supports both query and browse access methods. The model is based on hypergraph representation of data instances. The hyperedges and nodes are manipulated through a set of operators to compose new nodes and to instantiate new links dynamically, resulting in transient hypergraphs. These transient hypergraphs are virtual structures created in response to user queries, and lasting only as long as the query session. The model provides a framework for general data access that accommodates user-directed browsing and querying, as well as traditional models of information and data retrieval, such as the Boolean, vector space, and probabilistic models. Finally, the relational database model is shown to provide a reasonable platform for the implementation of this transient hypergraph-based model of data access.

I call your attention to the line that reads:

The hyperedges and nodes are manipulated through a set of operators to compose new nodes and to instantiate new links dynamically, resulting in transient hypergraphs.

It would be very useful for a topic map to create subject representatives (nodes) and relationships between subjects (edges) dynamically, and differently depending upon the user.
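A toy Python sketch of the transient idea, under heavy assumptions: the `Session` class, its two methods and the sample data are mine, and the paper’s actual operators are richer than this. Stored hyperedges persist; a query adds a hyperedge that lives only as long as the session object does.

```python
class Session:
    def __init__(self, nodes, hyperedges):
        self.nodes = nodes            # persistent: node id -> attributes
        self.hyperedges = hyperedges  # persistent: name -> set of node ids
        self.transient = {}           # exists only for this session

    def query(self, name, predicate):
        """Create a transient hyperedge over the nodes matching the predicate."""
        members = {n for n, attrs in self.nodes.items() if predicate(attrs)}
        self.transient[name] = members
        return members

    def browse(self, node):
        """Names of all hyperedges, stored or transient, that contain the node."""
        combined = {**self.hyperedges, **self.transient}
        return sorted(k for k, members in combined.items() if node in members)


nodes = {
    "d1": {"topic": "graphs", "year": 1990},
    "d2": {"topic": "graphs", "year": 1985},
    "d3": {"topic": "sql", "year": 1990},
}
stored = {"by_smith": {"d1", "d3"}}

session = Session(nodes, stored)
hits = session.query("recent_graphs",
                     lambda a: a["topic"] == "graphs" and a["year"] >= 1988)
links = session.browse("d1")
```

Querying and browsing meet in the same structure: the query result becomes one more hyperedge to browse along, and the stored data is left untouched.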

Don’t be daunted by the complexity of the proposal.

The authors had a working prototype 22 years ago using a relational database.

(Historical note: You will not find HyTime mentioned in this paper because it was published prior to the first edition of HyTime.)

Dynamic Shortest Path Algorithms for Hypergraphs

Monday, February 6th, 2012

Dynamic Shortest Path Algorithms for Hypergraphs by Jianhang Gao, Qing Zhao, Wei Ren, Ananthram Swami, Ram Ramanathan and Amotz Bar-Noy.

Abstract:

A hypergraph is a set V of vertices and a set of non-empty subsets of V, called hyperedges. Unlike graphs, hypergraphs can capture higher-order interactions in social and communication networks that go beyond a simple union of pairwise relationships. In this paper, we consider the shortest path problem in hypergraphs. We develop two algorithms for finding and maintaining the shortest hyperpaths in a dynamic network with both weight and topological changes. These two algorithms are the first to address the fully dynamic shortest path problem in a general hypergraph. They complement each other by partitioning the application space based on the nature of the change dynamics and the type of the hypergraph.

The applicability of hypergraphs to “…social and communication networks…” should push this item to near the top of your reading list. In addition to the alphabet soup of government agencies from around the world, e-commerce and traditional vendors are mining such networks. Developing solutions and/or having the skills to mine such networks should make you a hot-ticket item.
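To get a feel for the static version of the problem, here is a Python sketch using plain Dijkstra. It is not either of the paper’s dynamic algorithms. It assumes each hyperedge carries one non-negative weight and that crossing a hyperedge between any two of its vertices costs that weight; the function name and sample data are also assumptions.

```python
import heapq


def shortest_hyperpath_cost(hyperedges, source, target):
    """hyperedges: list of (set_of_vertices, weight); returns a cost or None."""
    incident = {}
    for idx, (verts, _) in enumerate(hyperedges):
        for v in verts:
            incident.setdefault(v, []).append(idx)

    dist = {source: 0}
    heap = [(0, source)]
    relaxed = set()  # hyperedges already expanded from their closest vertex
    while heap:
        d, v = heapq.heappop(heap)
        if v == target:
            return d
        if d > dist.get(v, float("inf")):
            continue  # stale heap entry
        for idx in incident.get(v, []):
            if idx in relaxed:
                continue
            relaxed.add(idx)
            verts, w = hyperedges[idx]
            # One hyperedge reaches all of its vertices at once.
            for u in verts:
                if d + w < dist.get(u, float("inf")):
                    dist[u] = d + w
                    heapq.heappush(heap, (d + w, u))
    return None


edges = [({"a", "b", "c"}, 2), ({"c", "d"}, 1), ({"a", "d"}, 5)]
cost = shortest_hyperpath_cost(edges, "a", "d")
```

Each hyperedge is expanded only once, from the first of its vertices to be settled, which is safe because Dijkstra settles vertices in order of increasing distance.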

Cypher Cookbook

Friday, October 14th, 2011

Cypher Cookbook

I have been learning to bake bread so you can imagine my disappointment when I saw “Cypher Cookbook” only to find that Peter was talking about Neo4j queries. Really! ;-)

From the first entry:

Hyperedges and Cypher

Imagine a user being part of different groups. A group can have different roles, and a user can be part of different groups. He also can have different roles in different groups apart from the membership. The association of a User, a Group and a Role can be referred to as a HyperEdge. However, it can be easily modeled in a property graph as a node that captures this n-ary relationship, as depicted below in the U1G2R1 node.

The graph model is necessary to illustrate the query but hyperedges need to also be treated under modeling or the current Domain Modeling Gallery. I would argue that full examples should be provided for as many domains as possible. (See how easy it is to assign work to others? I don’t know how soon but I hope to be a contributor in that respect.)

Another “cookbook” section could address importing data into Neo4j. Particularly from some of the larger public databases.

If anyone who wants wider adoption of Neo4j needs a motivating example, consider the number of people who use DocBook (it’s an XML format) versus ODF or OOXML (used by OpenOffice and MS Office; well, MS Office saves as both). If you want wide adoption (which I personally think is a good idea for graph databases), then use can’t be a test of user dedication or integrity.

…Advanced Graph-Analysis Algorithms on Very Large Graphs

Tuesday, July 26th, 2011

Enabling Rapid Development and Execution of Advanced Graph-Analysis Algorithms on Very Large Graphs by Aydin Buluc, John Gilbert, Adam Lugowski, and Steve Reinhardt.

Great overview of the Knowledge Discovery Toolbox project and its goals.

From the website:

The Knowledge Discovery Toolbox (KDT) provides domain experts with a simple interface to analyze very large graphs quickly and effectively without requiring knowledge of the underlying graph representation or algorithms. The current version provides a tiny selection of functions on directed graphs, from simple exploratory functions to complex algorithms. Because KDT is open-source, it can be customized or extended by interested (and intrepid) users.

HyperGraphDB

Sunday, June 5th, 2011

HyperGraphDB has changed in appearance since my last visit.

From the website:

HyperGraphDB is a general purpose, open-source data storage mechanism based on a powerful knowledge management formalism known as directed hypergraphs. While a persistent memory model designed mostly for knowledge management, AI and semantic web projects, it can also be used as an embedded object-oriented database for Java projects of all sizes. Or a graph database. Or a (non-SQL) relational database.

Read Alex Popescu’s HyperGraphDB interview with Borislav Iordanov for a high-level overview.

Watch Borislav Iordanov’s HyperGraphDB Presentation at StrangeLoop 2010.

Feature Summary

  • Powerful data modeling and knowledge representation.
  • Graph-oriented storage.
  • N-ary, higher order relationships (edges) between graph nodes.
  • Graph traversals and relational-style queries.
  • Customizable indexing.
  • Customizable storage management.
  • Extensible, dynamic DB schema through custom typing.
  • Out of the box Java OO database.
  • Fully transactional and multi-threaded, MVCC/STM.
  • P2P framework for data distribution.
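
The “N-ary, higher order relationships” item is the one that makes this a hypergraph store. As a rough sketch of the idea, and explicitly not HyperGraphDB's Java API, the Python below models atoms where a link may have any number of targets and a target may itself be a link. The class names, the `order` helper and the sample atoms are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """A plain atom that points at nothing."""
    value: str


@dataclass(frozen=True)
class Link:
    """An atom that points at any number of other atoms (nodes or links)."""
    label: str
    targets: tuple


def order(atom):
    """0 for a node, 1 for a link over nodes, 2 for a link over such links, ..."""
    if isinstance(atom, Node):
        return 0
    return 1 + max((order(t) for t in atom.targets), default=0)


# N-ary: one link joining three atoms at once.
membership = Link("member", (Node("User1"), Node("Group2"), Node("Role1")))

# Higher order: a link whose target is another link (an edge about an edge).
is_a = Link("IsA", (Node("cat"), Node("animal")))
justification = Link("justifies", (Node("some source"), is_a))

print(len(membership.targets), order(is_a), order(justification))
```

Because a link is just another atom, statements about statements (provenance, reliability and the like) need no separate mechanism.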

HyperGraphDB implements TopicMaps 1.0, TuProlog and a number of other models/standards.

Definitely worth taking out for a spin!

HyperGraphDB – Data Management for Complex Systems

Wednesday, December 22nd, 2010

HyperGraphDB – Data Management for Complex Systems Author: Borislav Iordanov

Presentation on the architecture of HyperGraphDB.

Slides and MP3 file are available at the presentation link.

Covers the architecture of HyperGraphDB in just under 20 minutes.

Good for an overview but I would suggest looking at the documentation, etc. for a more detailed view.

The documentation describes its topic map component in part as:

In HGTM, all topic maps constructs are represented as HGDB atoms. The Java classes implementing those atoms are in the package org.hypergraphdb.apps.tm. The API is an almost complete implementation of the 1.0 specification. Everything except merging is implementing. Merging wouldn’t be hard, but I haven’t found the need for it yet.

I will be following up with the HyperGraphDB project on how merging was understood.

Will report back on what comes of that discussion.

Peter McBrien

Saturday, May 22nd, 2010

Peter McBrien focuses on data modeling and integration.

He is part of the AutoMed project on database integration. Recent work includes temporal constraints and P2P exchange of heterogeneous data.

Publications (dblp).

Homepage

Databases: Tools and Data for Teaching and Research: Useful collection of datasets and other materials on databases, data modeling and integration.

I first encountered Peter’s research in Comparing and Transforming Between Data Models via an Intermediate Hypergraph Data Model.

From a topic map perspective, the authors assumed the identities of the subjects to which their transformation rules were applied. Someone less familiar with the schema languages could have made other choices.

That’s the hard question, isn’t it? How to have reliable integration without presuming a common perspective/interpretation of the schema languages?

*****
PS: This is the first of many posts on researchers working in areas of interest to the topic maps community.