Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 4, 2012

Social networks in the database: using a graph database

Filed under: Graph Databases,Neo4j,Social Networks — Patrick Durusau @ 7:17 pm

Social networks in the database: using a graph database

The Neo4j response to Lorenzo Alberton’s post on social networks in a relational database.

From the post:

Recently Lorenzo Alberton gave a talk on Trees In The Database where he showed the most used approaches to storing trees in a relational database. Now he has moved on to an even more interesting topic with his article Graphs in the database: SQL meets social networks. Right from the beginning of his excellent article Alberton puts this technical challenge in a proper context:

Graphs are ubiquitous. Social or P2P networks, thesauri, route planning systems, recommendation systems, collaborative filtering, even the World Wide Web itself is ultimately a graph! Given their importance, it’s surely worth spending some time in studying some algorithms and models to represent and work with them effectively.

After a brief explanation of what a graph data structure is, the article goes on to show how graphs can be represented in a table-based database. The rest of the article shows in detail how an adjacency list model can be used to represent a graph in a relational database. Different examples are used to illustrate what can be done in this way.

Graph databases and Neo4j in particular offer advantages when used with graphs but the Neo4j post overlooks several points.

Unlike graph databases, SQL databases are very nearly ubiquitous. A user’s first “taste” of graph processing may well come via a SQL database, and that taste may lead them to expect more graph capabilities than a SQL solution can offer.

As Lorenzo points out in his posting, performance will vary depending upon the graph operations you need to perform. That is true for SQL databases and graph databases alike. Having a graph database doesn’t mean all graph algorithms run efficiently on your data set.
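To make the adjacency list approach concrete, here is a minimal sketch using Python's built-in sqlite3 module. It is not taken from either post: the table names (`person`, `friendship`), the sample people, and the helper functions are all my own assumptions for illustration. It shows two graph operations with different shapes: a fixed two-hop query done with plain joins, and an open-ended reachability query that needs a recursive common table expression.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database holding a tiny friendship graph (made-up data)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Adjacency list: one row per directed edge. Both directions are
        -- stored so that a friendship behaves as an undirected edge.
        CREATE TABLE friendship (
            src INTEGER NOT NULL REFERENCES person(id),
            dst INTEGER NOT NULL REFERENCES person(id),
            PRIMARY KEY (src, dst)
        );
        """
    )
    people = [(1, "Ann"), (2, "Bob"), (3, "Cy"), (4, "Dee"), (5, "Eve"), (6, "Flo")]
    conn.executemany("INSERT INTO person VALUES (?, ?)", people)
    for a, b in [(1, 2), (2, 3), (3, 4), (1, 5)]:  # Flo has no friends yet
        conn.execute("INSERT INTO friendship VALUES (?, ?)", (a, b))
        conn.execute("INSERT INTO friendship VALUES (?, ?)", (b, a))
    return conn


def friends_of_friends(conn, person_id):
    """Names two hops away that are not already direct friends."""
    rows = conn.execute(
        """
        SELECT DISTINCT p.name
        FROM friendship f1
        JOIN friendship f2 ON f2.src = f1.dst
        JOIN person p ON p.id = f2.dst
        WHERE f1.src = :pid
          AND f2.dst <> :pid
          AND f2.dst NOT IN (SELECT dst FROM friendship WHERE src = :pid)
        ORDER BY p.name
        """,
        {"pid": person_id},
    ).fetchall()
    return [name for (name,) in rows]


def reachable(conn, person_id):
    """Names of everyone connected to person_id by any number of hops."""
    rows = conn.execute(
        """
        WITH RECURSIVE reach(id) AS (
            SELECT :pid
            UNION  -- UNION (not UNION ALL) drops repeats, so cycles terminate
            SELECT f.dst FROM friendship f JOIN reach r ON f.src = r.id
        )
        SELECT p.name FROM reach JOIN person p ON p.id = reach.id
        WHERE p.id <> :pid
        ORDER BY p.name
        """,
        {"pid": person_id},
    ).fetchall()
    return [name for (name,) in rows]


conn = build_demo_db()
print(friends_of_friends(conn, 1))  # ['Cy']
print(reachable(conn, 1))           # ['Bob', 'Cy', 'Dee', 'Eve']
```

The two-hop query costs one extra self-join per hop, while the reachability query has no fixed depth at all. That difference is the sort of thing that decides whether a given graph operation is comfortable in SQL.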

Finally:

A table-based system makes a good fit for static and simple data structures, ….

That isn’t going to ring true for anyone familiar with Oracle, PostgreSQL, MySQL, SQL Server, Informix, DB2 or any number of other “table-based systems.”

February 29, 2012

Solving Problems on Recursively Constructed Graphs

Filed under: Graph Databases,Graphs — Patrick Durusau @ 7:22 pm

Solving Problems on Recursively Constructed Graphs by Richard B. Borie, R. Gary Parker, and Craig A. Tovey.

Abstract:

Fast algorithms can be created for many graph problems when instances are confined to classes of graphs that are recursively constructed. This article first describes some basic conceptual notions regarding the design of such fast algorithms, and then the coverage proceeds through several recursive graph classes. Specific classes include trees, series-parallel graphs, k-terminal graphs, treewidth-k graphs, k-trees, partial k-trees, k-jackknife graphs, pathwidth-k graphs, bandwidth-k graphs, cutwidth-k graphs, branchwidth-k graphs, Halin graphs, cographs, cliquewidth-k graphs, k-NLC graphs, k-HB graphs, and rankwidth-k graphs. The definition of each class is provided. Typical algorithms are applied to solve problems on instances of most classes. Relationships between the classes are also discussed.

Part survey and part tutorial, this article (at 51 pages) will take some time to read in detail.

I wanted to mention it because one of the topics being discussed in the graph reading club will be the partitioning of graph databases.

I suspect, but obviously don’t know for certain, that the graph databases constructed in enterprise settings are not going to be random graphs. That is to say, some (all?) will have repetitive (if not recursive) structures that can be exploited to speed up particular graph operations on those databases.
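As a small illustration of what “exploiting recursive structure” can mean, here is a sketch for the simplest class in the abstract, trees. Finding a maximum independent set (a largest set of vertices with no two adjacent) is NP-hard on general graphs, but on a tree a bottom-up pass solves it in linear time. This is the standard textbook dynamic program, not code from the article. The function name and the input format (vertices numbered 0..n-1, edges as pairs) are my own assumptions.

```python
from collections import defaultdict


def max_independent_set_size(n, edges, root=0):
    """Size of a largest set of pairwise non-adjacent vertices in a tree.

    Vertices are 0..n-1. `edges` is assumed to form a tree; this is not checked.
    """
    if n == 0:
        return 0
    adj = defaultdict(list)
    for u, v in edges:
        adj[u].append(v)
        adj[v].append(u)

    # Depth-first order from the root, remembering each vertex's parent.
    parent = {root: None}
    order = []
    stack = [root]
    while stack:
        u = stack.pop()
        order.append(u)
        for v in adj[u]:
            if v != parent[u]:
                parent[v] = u
                stack.append(v)

    # take[u]: best size within u's subtree when u is in the set.
    # skip[u]: best size within u's subtree when u is left out.
    take = [1] * n
    skip = [0] * n
    # Parents precede their children in `order`, so walking it in reverse
    # finishes every child before its values are folded into the parent.
    for u in reversed(order):
        p = parent[u]
        if p is not None:
            take[p] += skip[u]                 # u's parent is in, so u is out
            skip[p] += max(take[u], skip[u])   # u's parent is out, u is free
    return max(take[root], skip[root])


# A path on four vertices: the best choice picks two of them.
print(max_independent_set_size(4, [(0, 1), (1, 2), (2, 3)]))  # 2
```

Each subtree is summarized by just two numbers, and a parent only needs the summaries of its children. The other classes in the survey generalize that same idea with larger, but still bounded, summaries.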

Suggestions of other resources on recursively constructed graphs?

Graph Databases: Information Silo Busters

In a post about InfiniteGraph 2.1 I found the following:

Other big data solutions all lack one thing, Clark contends. There is no easy way to represent the connection information, the relationships across the different silos of data or different data stores, he says. “That is where Objectivity can provide the enhanced storage for actually helping extract and persist those relationships so you can then ask queries about how things are connected.”

(Brian Clark, vice president, Data Management, Objectivity)

It was the last line of the post but I would have sharpened it and made it the lead slug.

Think about what Clark is saying: Not only can we persist relationship information within a datastore but also generate and persist relationship information between datastores. With no restriction on the nature of the datastores.

Try doing that with a relational database and SQL.

What I find particularly attractive is that persisting relationships across datastores means that we can jump the hurdle of making everyone use a common data model. It can be as common (in the graph) as it needs to be and no more.

Of course, I think of this as being particularly suited for topic maps, since we can document why we have mapped components of diverse data models to particular points in the graph. But what did you expect?

But used robustly, graph databases are going to allow you to perform integration across whatever datastores are available to you, using whatever data models they use, mapped to whatever data model you like. Others, in turn, may map your graph database to the models they prefer.

I think the need for documenting those mappings is one that needs attention sooner rather than later.
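To show what persisting documented relationships across datastores might look like, here is a minimal in-memory sketch. It is not the InfiniteGraph or Objectivity API, and every name in it (`RecordRef`, `Relationship`, `CrossStoreGraph`, the store names and keys) is hypothetical. The records stay where they are. The graph holds only pointers to them, plus the reason each mapping was made.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RecordRef:
    """Pointer to a record that stays in its own datastore."""
    store: str  # name of the datastore, e.g. "crm_sql" (made-up)
    key: str    # identifier as that datastore knows it


@dataclass(frozen=True)
class Relationship:
    source: RecordRef
    target: RecordRef
    kind: str       # what the link asserts, e.g. "same_customer"
    rationale: str  # why the mapping was made


@dataclass
class CrossStoreGraph:
    relationships: list = field(default_factory=list)

    def relate(self, source, target, kind, rationale):
        """Record a link between two records; undocumented links are refused."""
        if not rationale.strip():
            raise ValueError("every mapping needs a documented rationale")
        rel = Relationship(source, target, kind, rationale)
        self.relationships.append(rel)
        return rel

    def connected_to(self, ref):
        """Relationships touching `ref`, in either direction."""
        return [r for r in self.relationships if ref in (r.source, r.target)]

    def stores_linked_with(self, store):
        """Sorted names of other stores sharing a relationship with `store`."""
        linked = set()
        for r in self.relationships:
            pair = {r.source.store, r.target.store}
            if store in pair:
                linked |= pair - {store}
        return sorted(linked)


graph = CrossStoreGraph()
customer = RecordRef("crm_sql", "customer/42")
account = RecordRef("billing_docs", "acct-9001")
ticket = RecordRef("support_kv", "T-77")
graph.relate(customer, account, "same_customer",
             "tax ID matches in both systems")
graph.relate(ticket, customer, "raised_by",
             "ticket e-mail equals CRM contact e-mail")
print(graph.stores_linked_with("crm_sql"))  # ['billing_docs', 'support_kv']
```

Making the rationale a required field is a design choice on my part. It means the mapping cannot be recorded without saying why it was made.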

BTW, feel free to use the phrase “Graph Databases: Information Silo Busters.” (with or without attribution – I want information silos to fall more than I want personal recognition.)
