Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 20, 2011

Graphity: An efficient Graph Model for Retrieving the Top-k News Feeds for users in social networks

Filed under: Graphity,Graphs,Neo4j,Networks,Social Media,Social Networks — Patrick Durusau @ 4:11 pm

Graphity: An efficient Graph Model for Retrieving the Top-k News Feeds for users in social networks by Rene Pickhardt.

From the post:

I already said that my first research results have been submitted to SIGMOD conference to the social networks and graph databases track. Time to sum up the results and blog about them.

I created a data model to make retrieval of social news feeds in social networks very efficient. It is able to dynamically retrieve more than 10’000 temporal ordered news feeds per second in social networks with millions of users like Facebook and Twitter by using graph data bases (like neo4j)
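To make "top-k temporally ordered news feed" concrete, here is a minimal sketch in plain Python. It is not Graphity's data model (the paper and source code were not yet available at the time of writing); it only shows the generic k-way merge that any such retrieval has to perform. The function name `top_k_feed` and the sample streams are hypothetical, and each followed user's stream is assumed to be already sorted newest-first.

```python
import heapq
from itertools import islice

def top_k_feed(followed_streams, k):
    """Return the k newest (timestamp, author, text) items across all streams.

    Assumes every stream is already sorted newest-first, so the lazy merge
    only needs to look at roughly k items plus one head item per stream.
    """
    merged = heapq.merge(*followed_streams, key=lambda item: item[0], reverse=True)
    return list(islice(merged, k))

# Hypothetical streams of the users somebody follows.
streams = [
    [(105, "alice", "a3"), (101, "alice", "a2"), (90, "alice", "a1")],
    [(103, "bob", "b2"), (95, "bob", "b1")],
    [(104, "carol", "c1")],
]
feed = top_k_feed(streams, 3)
print(feed)
```

The point of the sketch is that the cost depends on k and the number of followed users, not on the total number of stored items.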

10,000 temporally ordered news feeds per second? I can imagine any number of use cases that fit comfortably within those performance numbers!

How about you?

Looking forward to the paper (and source code)!

November 17, 2011

Angels of the Right – version 2.0

Filed under: Networks,Visualization — Patrick Durusau @ 8:38 pm

Angels of the Right – version 2.0

From the post:

I’ve been working for the past several months to build AngelsOfTheRight.net a new interactive version of the conservative philanthropy network data from the Media Matters Conservative Transparency Project and other sources. The idea is to have an atlas where you can dive in, explore, and see which organisations have similar patterns of funding relationships. As always, my hope is to make some of these invisible economic and power relationships a bit more tangible.

If you want to see network maps pushed really hard in HTML5, this looks like the place to be.

Certainly useful visualization techniques for a number of purposes.

November 1, 2011

Graph and Network Analysis – DERI 2011

Filed under: Graphs,Networks — Patrick Durusau @ 3:34 pm

Graph and Network Analysis – DERI 2011 – Dr. Derek Greene

From the website:

Summer School: July 6, 2011 – July 13, 2011. DERI, NUI Galway

Supporting material for the tutorial “Graph and Network Analysis” by Dr. Derek Greene from the Clique Research Cluster, providing an introduction to social network analysis, with examples using the Python NetworkX library.

Related Resources:

  • NetworkX: Library for network analysis (recommended v1.5) for Python (recommended v2.6.x / 2.7.x)
  • Gephi: Java interactive visualisation platform and toolkit – “Photoshop for graphs”.
  • Graclus: Graph partitioning tool
  • Louvain: Disjoint community finding software
  • CFinder: Overlapping community finding software
  • Moses: Overlapping community finding software
  • GCE: Overlapping community finding software
  • Dynamic community finding software

In case you weren’t able to make it to the Summer School, the next best thing!

October 21, 2011

ForceAtlas2

Filed under: Gephi,Graphs,Networks,Social Graphs,Social Networks — Patrick Durusau @ 7:27 pm

ForceAtlas2 (paper) +appendices by Mathieu Jacomy, Sebastien Heymann, Tommaso Venturini, and Mathieu Bastian.

Abstract:

ForceAtlas2 is a force vector algorithm proposed in the Gephi software, appreciated for its simplicity and for the readability of the networks it helps to visualize. This paper presents its distinctive features, its energy-model and the way it optimizes the “speed versus precision” approximation to allow quick convergence. We also claim that ForceAtlas2 is handy because the force vector principle is unaffected by optimizations, offering a smooth and accurate experience to users.

I knew I had to cite this paper when I read:

These earliest Gephi users were not fully satisfied with existing spatialization tools. We worked on empirical improvements and that’s how we created the first version of our own algorithm, ForceAtlas. Its particularity was a degree-dependant repulsion force that causes less visual cluttering. Since then we steadily added some features while trying to keep in touch with users’ needs. ForceAtlas2 is the result of this long process: a simple and straightforward algorithm, made to be useful for experts and profanes. (footnotes omitted, emphasis added)

Profanes. I like that! Well, rather I like the literacy that enables a writer to use that in a technical paper.

Highly recommended paper.

September 27, 2011

Production and Network Formation Games with Content Heterogeneity

Filed under: Games,Group Theory,Networks — Patrick Durusau @ 6:49 pm

Production and Network Formation Games with Content Heterogeneity by Yu Zhang, Jaeok Park, and Mihaela van der Schaar.

Abstract:

Online social networks (e.g. Facebook, Twitter, Youtube) provide a popular, cost-effective and scalable framework for sharing user-generated contents. This paper addresses the intrinsic incentive problems residing in social networks using a game-theoretic model where individual users selfishly trade off the costs of forming links (i.e. whom they interact with) and producing contents personally against the potential rewards from doing so. Departing from the assumption that contents produced by difference users is perfectly substitutable, we explicitly consider heterogeneity in user-generated contents and study how it influences users’ behavior and the structure of social networks. Given content heterogeneity, we rigorously prove that when the population of a social network is sufficiently large, every (strict) non-cooperative equilibrium should consist of either a symmetric network topology where each user produces the same amount of content and has the same degree, or a two-level hierarchical topology with all users belonging to either of the two types: influencers who produce large amounts of contents and subscribers who produce small amounts of contents and get most of their contents from influencers. Meanwhile, the law of the few disappears in such networks. Moreover, we prove that the social optimum is always achieved by networks with symmetric topologies, where the sum of users’ utilities is maximized. To provide users with incentives for producing and mutually sharing the socially optimal amount of contents, a pricing scheme is proposed, with which we show that the social optimum can be achieved as a non-cooperative equilibrium with the pricing of content acquisition and link formation.

The “content heterogeneity” caught my eye but after reading the abstract, this appears relevant to topic maps for another reason.

One of the projects I hear discussed from time to time is a “public” topic map that encourages users to interact in a social context and to add content to the topic map. Group dynamics and the study of the same seem directly relevant to such “public” topic maps.

Interesting paper, but I am not altogether sure about the “social optimum” as outlined in the paper. Not that I find it objectionable, but “social optimums” are more a matter of social practice than of engineering.

The ubiquity of small-world networks

Filed under: Clustering,Graphs,Networks — Patrick Durusau @ 6:46 pm

The ubiquity of small-world networks by Qawi K. Telesford, Karen E. Joyce, Satoru Hayasaka, Jonathan H. Burdette, and Paul J. Laurienti.

Abstract:

Small-world networks by Watts and Strogatz are a class of networks that are highly clustered, like regular lattices, yet have small characteristic path lengths, like random graphs. These characteristics result in networks with unique properties of regional specialization with efficient information transfer. Social networks are intuitive examples of this organization with cliques or clusters of friends being interconnected, but each person is really only 5-6 people away from anyone else. While this qualitative definition has prevailed in network science theory, in application, the standard quantitative application is to compare path length (a surrogate measure of distributed processing) and clustering (a surrogate measure of regional specialization) to an equivalent random network. It is demonstrated here that comparing network clustering to that of a random network can result in aberrant findings and networks once thought to exhibit small-world properties may not. We propose a new small-world metric, {\omega} (omega), which compares network clustering to an equivalent lattice network and path length to a random network, as Watts and Strogatz originally described. Example networks are presented that would be interpreted as small-world when clustering is compared to a random network but are not small-world according to {\omega}. These findings have significant implications in network science as small-world networks have unique topological properties, and it is critical to accurately distinguish them from networks without simultaneous high clustering and low path length.
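The metric described in the abstract is, as I read the paper, ω = L_rand / L − C / C_latt, where L and C are the path length and clustering of the network under study, L_rand is the path length of an equivalent random network and C_latt is the clustering of an equivalent lattice. The sketch below computes L and C in plain Python for a small ring lattice and then evaluates ω for hand-picked reference values. The helper names are hypothetical, and building the "equivalent" random and lattice networks (the hard part) is deliberately left out.

```python
from collections import deque

def ring_lattice(n, k):
    """Ring of n nodes, each linked to its k nearest neighbours on each side."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k + 1):
            j = (i + step) % n
            adj[i].add(j)
            adj[j].add(i)
    return adj

def average_clustering(adj):
    """Mean local clustering coefficient; nodes of degree < 2 count as 0."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue
        links = sum(1 for u in nbrs for v in nbrs if u < v and v in adj[u])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)

def average_path_length(adj):
    """Mean shortest-path length over reachable ordered pairs, via BFS."""
    total, pairs = 0, 0
    for source in adj:
        dist = {source: 0}
        queue = deque([source])
        while queue:
            u = queue.popleft()
            for v in adj[u]:
                if v not in dist:
                    dist[v] = dist[u] + 1
                    queue.append(v)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def omega(L, C, L_rand, C_latt):
    """Small-world metric: path length vs. random, clustering vs. lattice."""
    return L_rand / L - C / C_latt

lattice = ring_lattice(10, 2)
C_latt = average_clustering(lattice)
L_latt = average_path_length(lattice)
print(C_latt, L_latt)

# Illustrative numbers only: a network whose path length matches the random
# reference but whose clustering is half that of the lattice reference.
print(omega(L=2.0, C=0.3, L_rand=2.0, C_latt=0.6))
```

Values of ω near zero are read as small-world, positive values as leaning toward random, and negative values as leaning toward lattice-like.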

What sort of network is your topic map?

I wonder whether classes of topic maps will emerge, some of which are small-world networks and others not. I ask because knowing the conditions/requirements that lead to one type or the other would be another tool for designing topic maps for particular purposes.

September 16, 2011

Active Learning for Node Classification in Assortative and Disassortative Networks

Filed under: Clustering,Networks — Patrick Durusau @ 6:42 pm

Active Learning for Node Classification in Assortative and Disassortative Networks by Cristopher Moore, Xiaoran Yan, Yaojia Zhu, Jean-Baptiste Rouquier, and Terran Lane.

Abstract:

In many real-world networks, nodes have class labels, attributes, or variables that affect the network’s topology. If the topology of the network is known but the labels of the nodes are hidden, we would like to select a small subset of nodes such that, if we knew their labels, we could accurately predict the labels of all the other nodes. We develop an active learning algorithm for this problem which uses information-theoretic techniques to choose which nodes to explore. We test our algorithm on networks from three different domains: a social network, a network of English words that appear adjacently in a novel, and a marine food web. Our algorithm makes no initial assumptions about how the groups connect, and performs well even when faced with quite general types of network structure. In particular, we do not assume that nodes of the same class are more likely to be connected to each other—only that they connect to the rest of the network in similar ways.

If the abstract doesn’t recommend this paper as weekend reading, perhaps the following quote from the paper will:

our focus is on the discovery of functional communities in the network, and our underlying generative model is designed around the assumption of that these communities exist.

You will recall from Don’t Trust Your Instincts that we are likely to see what we expect to see in text, or in this case, networks. Not that using this approach frees us from introducing bias, but it does ensure the observer bias is uniformly applied across the data set. Which may lead to results that startle us, interest us or that we consider to be spurious. In any event, this is one more approach to test and possibly illuminate our understanding of a network.

PS: Are communities the equivalent of clusters?

September 2, 2011

Category-Based Routing in Social Networks:…

Filed under: Identity,Networks,Social Networks — Patrick Durusau @ 7:58 pm

Category-Based Routing in Social Networks: Membership Dimension and the Small-World Phenomenon (Short) by David Eppstein, Michael T. Goodrich, Maarten Löffler, Darren Strash, and Lowell Trott.

Abstract:

A classic experiment by Milgram shows that individuals can route messages along short paths in social networks, given only simple categorical information about recipients (such as “he is a prominent lawyer in Boston” or “she is a Freshman sociology major at Harvard”). That is, these networks have very short paths between pairs of nodes (the so-called small-world phenomenon); moreover, participants are able to route messages along these paths even though each person is only aware of a small part of the network topology. Some sociologists conjecture that participants in such scenarios use a greedy routing strategy in which they forward messages to acquaintances that have more categories in common with the recipient than they do, and similar strategies have recently been proposed for routing messages in dynamic ad-hoc networks of mobile devices. In this paper, we introduce a network property called membership dimension, which characterizes the cognitive load required to maintain relationships between participants and categories in a social network. We show that any connected network has a system of categories that will support greedy routing, but that these categories can be made to have small membership dimension if and only if the underlying network exhibits the small-world phenomenon.
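The greedy strategy the abstract mentions can be sketched in a few lines of Python: each holder hands the message to the acquaintance who shares the most categories with the recipient, and gives up if nobody shares more than the holder does. This is only an illustration of that rule, not the paper's construction; the people, categories and the function name `greedy_route` are all made up.

```python
def greedy_route(graph, categories, source, target):
    """Route by always forwarding to the neighbour sharing the most categories
    with the target. Returns the path, or None if greedy routing gets stuck."""
    def shared(node):
        return len(categories[node] & categories[target])

    path = [source]
    current = source
    while current != target:
        if target in graph[current]:
            nxt = target  # deliver directly when the recipient is an acquaintance
        else:
            nxt = max(graph[current], key=shared, default=None)
            # Only forward to someone strictly closer (in shared categories).
            if nxt is None or shared(nxt) <= shared(current):
                return None
        path.append(nxt)
        current = nxt
    return path

# Hypothetical acquaintance network and category memberships.
graph = {
    "ann": ["bob", "cat"],
    "bob": ["ann", "dan"],
    "cat": ["ann"],
    "dan": ["bob", "eve"],
    "eve": ["dan"],
}
categories = {
    "ann": {"chicago"},
    "bob": {"chicago", "lawyer"},
    "cat": {"chicago", "teacher"},
    "dan": {"boston", "lawyer"},
    "eve": {"boston", "lawyer", "harvard"},
}
print(greedy_route(graph, categories, "ann", "eve"))
print(greedy_route(graph, categories, "cat", "eve"))
```

Because the number of shared categories must strictly increase at every hop, the loop always terminates. The second call shows the failure case: cat knows only ann, who is no closer to eve than cat is.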

So, if identity is a social construct and the result of small-world networks, then we may need a different kind of precision (from scientific measurement) to identify subjects.

Perhaps the reverse of the 20-questions game: how many questions do we need to identify a particular subject? Does anyone remember if there was a common number of questions that was sufficient in the 20-questions game?

July 25, 2011

Visualizing NetworkX graphs in the browser using D3

Filed under: Graphs,Networks,Visualization — Patrick Durusau @ 6:40 pm

Visualizing NetworkX graphs in the browser using D3 by Drew Conway.

From the post:

During one of our impromptu sprints at SciPy 2011, the NetworkX team decided it would be nice to add the ability to export networks for visualization with the D3 JavaScript library. This would allow people to post their visualizations online very easily. Mike Bostock, the creator and maintainer of D3, also has a wonderful example of how to render a network using a force-directed layout in the D3 examples gallery.

So, we decided to insert a large portion of Mike’s code into the development version of NetworkX in order to allow people to quickly export networks to JSON and visualize them in the browser. Unfortunately, I have not had the chance to write any tests for this code, so it is only available in my fork of the main NetworkX repository on Github. But, if you clone this repository and install it you will have the new features (along with an additional example file for building networks for web APIs in NX).
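For a sense of what "export networks to JSON" means here, the following is a rough sketch in plain Python of the nodes/links layout that D3's force-directed examples of that period consumed, with links referring to nodes by position. It is not the code from the fork described above; the function name `to_node_link` and the `name` attribute are assumptions made for illustration.

```python
import json

def to_node_link(edges):
    """Turn a list of (u, v) edges into a {"nodes": [...], "links": [...]} dict.

    Links refer to nodes by their index in the "nodes" list.
    """
    names = sorted({node for edge in edges for node in edge})
    index = {name: i for i, name in enumerate(names)}
    return {
        "nodes": [{"name": name} for name in names],
        "links": [{"source": index[u], "target": index[v]} for u, v in edges],
    }

edges = [("a", "b"), ("b", "c"), ("a", "c")]
doc = to_node_link(edges)
payload = json.dumps(doc, indent=2)
print(payload)
```

The resulting string can be written to a file and loaded by a page that draws the layout.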

You have to see the visualization in the post to get the full impact. You won’t be disappointed!

July 23, 2011

Information Propagation in Twitter’s Network

Filed under: Networks,Similarity,Social Networks — Patrick Durusau @ 3:12 pm

Information Propagation in Twitter’s Network

From the post:

It’s well-known that Twitter’s most powerful use is as tool for real-time journalism. Trying to understand its social connections and outstanding capacity to propagate information, we have developed a mathematical model to identify the evolution of a single tweet.

The way a tweet is spread through the network is closely related with Twitter’s retweet functionality, but retweet information is fairly incomplete due to the fight for earning credit/users by means of being the original source/author. We have taken into consideration this behavior and our approach uses text similarity measures as complement of retweet information. In addition, #hashtags and urls are included in the process since they have an important role in Twitter’s information propagation.

Once we designed (and implemented) our mathematical model, we tested it with some Twitter’s topics we had tracked using a visualization tool (Life of a Tweet). Our conclusions after the experiments were:

  1. Twitter’s real propagation is based on information (tweets’ content) and not on Twitter’s structure (retweet).
  2. Based on we can detect Twitter’s real propagation, we can retrieve Twitter’s real networks.
  3. Text similarity scores allow us to select how fuzzy are the tweet’s connections and, in extension, the network’s connections. This means that we can set a minimum threshold to determine when two tweets contain the same concept.
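The third conclusion, a similarity threshold deciding when two tweets count as connected, can be illustrated with a small Python sketch. The post does not say which similarity measure was used, so Jaccard similarity over word, #hashtag and @mention tokens is an assumption here, and the tweets are invented.

```python
import re
from itertools import combinations

def tokens(text):
    """Lower-cased words, keeping a leading # or @ so hashtags and mentions count."""
    return set(re.findall(r"[#@]?\w+", text.lower()))

def jaccard(a, b):
    """Jaccard similarity of two texts: shared tokens over all tokens."""
    ta, tb = tokens(a), tokens(b)
    if not ta and not tb:
        return 0.0
    return len(ta & tb) / len(ta | tb)

def similarity_links(tweets, threshold):
    """Pairs of tweet indices whose similarity reaches the threshold."""
    return [
        (i, j)
        for (i, a), (j, b) in combinations(enumerate(tweets), 2)
        if jaccard(a, b) >= threshold
    ]

tweets = [
    "Earthquake hits the city center #quake",
    "earthquake hits city center, reports say #quake",
    "Great coffee this morning",
]
print(jaccard(tweets[0], tweets[1]))
print(similarity_links(tweets, 0.5))
print(similarity_links(tweets, 0.7))
```

Raising the threshold makes the recovered network sparser, which is the "how fuzzy" knob the authors describe.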

Interesting. Useful for anyone who wants to grab “real” connections and networks to create topics for merging further information about the same.

You may want to also look at: Meme Diffusion Through Mass Social Media which is about a $900K NSF project on tracking memes through social media.

Admittedly an important area of research but the results I would view with a great deal of caution. Here’s why:

  1. Memes travel through news outlets, print, radio, TV, websites
  2. Memes travel through social outlets, such as churches, synagogues, mosques, social clubs
  3. Memes travel through business relationships and work places
  4. Memes travel through family gatherings and relationships
  5. Memes travel over cell phone conversations as well as tweets

That some social media is easier to obtain and process than others doesn’t make it a reliable basis for decision making.

June 29, 2011

Path Finding with Neo4j

Filed under: Graphs,Neo4j,Networks — Patrick Durusau @ 9:04 am

Path Finding with Neo4j by Josh Adell.

From the post:

In my previous post I talked about graphing databases (Neo4j in particular) and how they can be applied to certain classes of problems where data may have multiple degrees of separation in their relationships.

The thing that makes graphing databases useful is the ability to find relationship paths from one node to another. There are many algorithms for finding paths efficiently, depending on the use case.
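As a reminder of what the simplest such algorithm looks like, here is a breadth-first search for a shortest relationship path, written in plain Python over an adjacency dictionary. It does not use Neo4j or its API; the graph and names are hypothetical.

```python
from collections import deque

def shortest_path(graph, start, goal):
    """Breadth-first search: return a shortest path (fewest hops) or None."""
    parents = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            # Walk the parent links back to the start to rebuild the path.
            path = []
            while node is not None:
                path.append(node)
                node = parents[node]
            return path[::-1]
        for neighbour in graph.get(node, ()):
            if neighbour not in parents:
                parents[neighbour] = node
                queue.append(neighbour)
    return None

# Hypothetical "knows" relationships.
graph = {
    "ann": ["bob", "cat"],
    "bob": ["dan"],
    "cat": ["dan"],
    "dan": ["eve"],
    "eve": [],
    "zed": [],
}
print(shortest_path(graph, "ann", "eve"))
print(shortest_path(graph, "ann", "zed"))
```

Weighted relationships or very large graphs call for other algorithms, which is where the choice "depending on the use case" comes in.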

When they say “multiple degrees of separation in their relationships” that sounds a lot like topic maps to me. Or at least topic maps in some use cases I should say.

Enjoy the post and what I anticipate to follow.

May 25, 2011

GraphStream 1.0 Release

Filed under: Graphs,Networks — Patrick Durusau @ 1:25 pm

GraphStream 1.0 Release

From the website:

With GraphStream you deal with graphs. Static and Dynamic.
You create them from scratch, from a file or any source.
You display and render them.

From Getting Started:

GraphStream is a graph handling Java library that focuses on the dynamics aspects of graphs. Its main focus is on the modeling of dynamic interaction networks of various sizes.

The goal of the library is to provide a way to represent graphs and work on it. To this end, GraphStream proposes several graph classes that allow to model directed and undirected graphs, 1-graphs or p-graphs (a.k.a. multigraphs, that are graphs that can have several edges between two nodes).

GraphStream allows to store any kind of data attribute on the graph elements: numbers, strings, or any object.

Moreover, in addition, GraphStream provides a way to handle the graph evolution in time. This means handling the way nodes and edges are added and removed, and the way data attributes may appear, disappear and evolve.

You can also get an idea of the range of capabilities from the GraphStream 1.0 video.

GraphStream 1.0 Video

Filed under: Graphs,Networks — Patrick Durusau @ 1:24 pm

GraphStream 1.0 Video

I could roll this into a post about the GraphStream 1.0 release but this is a serious piece of work on its own.

The following connections demonstration should be of interest to intelligence communities around the world.

High-quality intelligence is no longer the sole province of those who can afford one-off computer installations.

April 29, 2011

Horton: online query execution on large distributed graphs

Filed under: Graphs,Networks,Query Language,Social Graphs — Patrick Durusau @ 1:12 pm

Horton: online query execution on large distributed graphs by Sameh Elnikety, Microsoft Research.

The presentation addresses three problems with large, distributed graphs:

  1. How to partition the graph
  2. How to query the graph
  3. How to update the graph

Investigates a graph query language, execution engine and optimizer, and concludes with initial results.

April 26, 2011

…Efficient Subgraph Matching on Huge Networks (or, > 1 billion edges < 1 second)

Filed under: Graphs,Networks,RDF — Patrick Durusau @ 2:18 pm

A Budget-Based Algorithm for Efficient Subgraph Matching on Huge Networks by Matthias Bröcheler, Andrea Pugliese, and V.S. Subrahmanian. (Presented at GDM 2011.)

Abstract:

As social network and RDF data grow dramatically in size to billions of edges, the ability to scalably answer queries posed over graph datasets becomes increasingly important. In this paper, we consider subgraph matching queries which are often posed to social networks and RDF databases — for such queries, we want to find all matching instances in a graph database. Past work on subgraph matching queries uses static cost models which can be very inaccurate due to long-tailed degree distributions commonly found in real world networks. We propose the BudgetMatch query answering algorithm. BudgetMatch costs and recosts query parts adaptively as it executes and learns more about the search space. We show that using this strategy, BudgetMatch can quickly answer complex subgraph queries on very large graph data. Specifically, on a real world social media data set consisting of 1.12 billion edges, we can answer complex subgraph queries in under one second and significantly outperform existing subgraph matching algorithms.

Built on top of Neo4j, BudgetMatch dynamically updates the budgets assigned to vertices.

Aggressive pruning gives some rather attractive results.

April 25, 2011

The igraph library

Filed under: Graphs,Networks,Visualization — Patrick Durusau @ 3:35 pm

The igraph library

From the website:

igraph is a free software package for creating and manipulating undirected and directed graphs. It includes implementations for classic graph theory problems like minimum spanning trees and network flow, and also implements algorithms for some recent network analysis methods, like community structure search.

The efficient implementation of igraph allows it to handle graphs with millions of vertices and edges. The rule of thumb is that if your graph fits into the physical memory then igraph can handle it.

….

  • igraph contains functions for generating regular and random graphs according to many algorithms and models from the network theory literature.
  • igraph provides routines for manipulating graphs, adding and removing edges and vertices.
  • You can assign numeric or textual attribute to the vertices or edges of the graph, like edge weights or textual vertex ids.
  • A rich set of functions calculating various structural properties, eg. betweenness, PageRank, k-cores, network motifs, etc. are also included.
  • Force based layout generators for small and large graphs
  • The R package and the Python module can visualize graphs many ways, in 2D and 3D, interactively or non-interactively.
  • igraph provides data types for implementing your own algorithm in C, R, Python or Ruby.
  • Community structure detection algorithms using many recently developed heuristics.
  • igraph can read and write many file formats, e.g., GraphML, GML or Pajek.
  • igraph contains efficient functions for deciding graph isomorphism and subgraph isomorphism
  • It also contains an implementation of the push/relabel algorithm for calculating maximum network flow, and this way minimum cuts, vertex and edge connectivity.
  • igraph is well documented both for users and developers.
  • igraph is open source and distributed under GNU GPL.

April 15, 2011

Network Workbench

Filed under: Networks,Visualization — Patrick Durusau @ 6:25 am

Network Workbench: A Workbench for Network Scientists

From the website:

Network Workbench: A Large-Scale Network Analysis, Modeling and Visualization Toolkit for Biomedical, Social Science and Physics Research.

This project will design, evaluate, and operate a unique distributed, shared resources environment for large-scale network analysis, modeling, and visualization, named Network Workbench (NWB). The envisioned data-code-computing resources environment will provide a one-stop online portal for researchers, educators, and practitioners interested in the study of biomedical, social and behavioral science, physics, and other networks.

The NWB will support network science research across scientific boundaries. Users of the NWB will have online access to major network datasets or can upload their own networks. They will be able to perform network analysis with the most effective algorithms available. In addition, they will be able to generate, run, and validate network models to advance their understanding of the structure and dynamics of particular networks. NWB will provide advanced visualization tools to interactively explore and understand specific networks, as well as their interaction with other types of networks.

A major computer science challenge is the development of an algorithm integration framework that supports the easy integration and dissemination of existing and new algorithms and can deal with the multitude of network data formats in existence today. Another challenge is the design and implementation of an easy to use menu-based, online portal interface for interactive algorithm selection, data manipulation, user and session management. The NWB will be evaluated in diverse research projects and educational settings in biology, social and behavioral science, and physics research. It will be well documented and available as open source for easy duplication and usage at other sites. An annual summer school and a series of workshops and tutorials are planned to introduce the tool to diverse research communities.

April 14, 2011

Cytoscape

Filed under: Graphs,Networks,Visualization — Patrick Durusau @ 7:21 am

Cytoscape: An Open Source Platform for Complex-Network Analysis and Visualization

From the website:

Cytoscape is an open source software platform for visualizing complex-networks and integrating these with any type of attribute data. A lot of plugins are available for various kinds of problem domains, including bioinformatics, social network analysis, and semantic web.

Alluded to in AllegroMCODE: GPU-accelerated Cytoscape Plugin TM Explorer?, but on encountering Cytoscape again, I decided it needed a separate posting.

April 11, 2011

University of Florida Sparse Matrix Collection

Filed under: Algorithms,Computational Geometry,Graphs,Matrix,Networks — Patrick Durusau @ 5:49 am

University of Florida Sparse Matrix Collection

It’s not what you think. Well, it is but it is so much more. You really have to see the images at this site.

Abstract (from the paper by the same title):

We describe the University of Florida Sparse Matrix Collection, a large and actively growing set of sparse matrices that arise in real applications. The Collection is widely used by the numerical linear algebra community for the development and performance evaluation of sparse matrix algorithms. It allows for robust and repeatable experiments: robust because performance results with artificially-generated matrices can be misleading, and repeatable because matrices are curated and made publicly available in many formats. Its matrices cover a wide spectrum of domains, include those arising from problems with underlying 2D or 3D geometry (as structural engineering, computational fluid dynamics, model reduction, electromagnetics, semiconductor devices, thermodynamics, materials, acoustics, computer graphics/vision, robotics/kinematics, and other discretizations) and those that typically do not have such geometry (optimization, circuit simulation, economic and financial modeling, theoretical and quantum chemistry, chemical process simulation, mathematics and statistics, power networks, and other networks and graphs). We provide software for accessing and managing the Collection, from MATLAB, Mathematica, Fortran, and C, as well as an online search capability. Graph visualization of the matrices is provided, and a new multilevel coarsening scheme is proposed to facilitate this task.

A Java viewer for matrices is also found here.

April 10, 2011

TCS: Call for papers on Graph Searching

Filed under: Graphs,Networks,Search Algorithms,Searching — Patrick Durusau @ 2:52 pm

TCS: Call for papers on Graph Searching

From the call:

Manuscripts are solicited for a special issue in the journal “Theoretical Computer Science” (TCS) on “Theory and Applications of Graph Searching Problems”. This special issue will be dedicated to the 60th birthday of Lefteris M. Kirousis.

….

  • Graph Searching and Logic
  • Graph Parameters Related to Graph Searching
  • Graph searching and Robotics
  • Conquest and Expansion Games
  • Database Theory and Robber and Marshals Games
  • Probabilistic Techniques in Graph Searching
  • Monotonicity and Connectivity in Graph Searching
  • New Variants of Graph Searching
  • Graph Searching and Distributed Computing
  • Graph Searching and Network Security

Deadline for submission is: 31 July 2011.

Interesting as a submission venue, or worth waiting for the issue to appear.

February 28, 2011

RDBMS in the Social Networks Age

Filed under: Networks,RDBMS,Social Networks — Patrick Durusau @ 10:03 am

RDBMS in the Social Networks Age by Lorenzo Alberton.

A slide deck that made me wish I had seen the presentation!

Its treatment of graph representation in a relational system is particularly strong.

The bibliography is useful as well.

Just to tempt you into viewing the slide deck, slide 19, The Boring Stuff, is very amusing.

February 20, 2011

AllegroMCODE: GPU-accelerated Cytoscape Plugin TM Explorer?

Filed under: Bioinformatics,Biomedical,Graphic Processors,Networks — Patrick Durusau @ 10:45 am

AllegroMCODE: GPU-accelerated Cytoscape Plugin

From the website:

AllegroMCODE is a high-performance Cytoscape plugin to find clusters, or highly interconnected groups of nodes in a huge complex network such as a protein interaction network and a social network in real time. AllegroMCODE finds the same clusters as the MCODE plugin does, but the analysis usually takes less than a second even for a large complex network. The plugin user interface of AllegroMCODE is based on MCODE and has additional features. AllegroMCODE is an open source software and freely available under LGPL.

Cluster has various meanings according to the sources of networks. For instance, a protein-protein interaction network is represented as proteins are nodes and interactions between proteins are edges. Clusters in the network can be considered as protein complexes and functional modules, which can be identified as highly interconnected subgraphs. For social networks, people and their relationships are represented as nodes and edges, respectively. A cluster in the network can be considered as a community which has strong inter-relationship among their members.

AllegroMCODE exploits our high performance GPU computing architecture to make your analysis task faster than ever. The analysis task of the MCODE algorithm to find the clusters can be long for large complex networks even though the MCODE is a relatively fast method of clustering. AllegroMCODE provides our parallel algorithm implementation base on the original sequential MCODE algorithm. It can achieve two orders of magnitude speedup for the analysis of a large complex network by using the latest graphics card. You can also exploit the GPU acceleration without any special graphics hardware since it provides the seamless remote processing in our free GPU computing server.

You do not need to purchase any special GPU hardware or systems and also not to care about the tedious installation task of them. All you have to do are to install the AllegroMCODE plugin module on your computer and create a free account on our server.

Simply awesome!

The ability to dynamically explore and configure topic maps will be priceless.

A greater gap than between hot-lead type and a modern word processor.

Will take weeks/months to fully explore but wanted to bring it to your attention.

February 13, 2011

Software for Non-Human Users?

The description of: Emerging Intelligent Data and Web Technologies (EIDWT-2011) is a call for software designed for non-human users.

The Social Life of Information by John Seely Brown and Paul Duguid, makes it clear that human users don’t want to share data because sharing data represents a loss of power/status.

A poll of the readers of CACM or Computer would report a universal experience of working in an office where information is hoarded up by individuals in order to increase their own status or power.

9/11 was preceded and, to this day, followed by a non-sharing of intelligence data. Even national peril cannot overcome the non-sharing reflex with regard to data.

EIDWT-2011 and conferences like it, are predicated on a sharing of data known to not exist, at least among human users.

Hence, I suspect the call must be directed at software for non-human users.

January 23, 2011

A Path Algebra for Mapping Multi-Relational Networks to Single-Relational Networks

Filed under: Data Structures,Graphs,Neo4j,Networks — Patrick Durusau @ 4:54 pm

A Path Algebra for Mapping Multi-Relational Networks to Single-Relational Networks

A proposal for re-use of existing algorithms, designed for single-relational networks, with multi-relational networks.

By mapping multi-relational networks onto single relational networks.

Makes me wonder if heterogeneous identifications could be mapped in a similar way to a single identifier?

Or would there be too much information loss?

Depends on the circumstances and goals.
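As a rough illustration of the mapping idea (not the paper's notation or its full algebra), the sketch below joins one relation type with its inverse to derive a single-relational network. The relation names and sample data are hypothetical.

```python
from collections import defaultdict

# Hypothetical multi-relational network as (source, relation, target) triples.
edges = [
    ("alice", "authored", "paper1"),
    ("bob", "authored", "paper1"),
    ("bob", "authored", "paper2"),
    ("carol", "authored", "paper2"),
    ("paper2", "cites", "paper1"),
]

def relation(triples, label):
    """Return the (source, target) pairs that carry one relation label."""
    return {(s, t) for s, r, t in triples if r == label}

def inverse(pairs):
    """Reverse the direction of every pair."""
    return {(t, s) for s, t in pairs}

def join(left, right):
    """Compose two relations: (a, c) whenever (a, b) is in left and (b, c) is in right."""
    by_source = defaultdict(set)
    for b, c in right:
        by_source[b].add(c)
    return {(a, c) for a, b in left for c in by_source[b]}

authored = relation(edges, "authored")
# author -> paper -> author, dropping self-loops, gives a single-relational coauthor network.
coauthor = {(a, b) for a, b in join(authored, inverse(authored)) if a != b}
```

The derived `coauthor` set can be handed to any algorithm that expects one edge type. What is lost along the way (here, which paper connects two authors) is exactly the information-loss question raised above.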

January 14, 2011

Graphs, Networks and Semantics

Filed under: Networks,Semantics — Patrick Durusau @ 4:46 pm

I have been reading a lot of graph and network theory stuff lately.

It occurred to me that in any graph or network, the nodes that we choose to write down, as well as the edges between them, are arbitrary choices on our part.

That is to say that someone else, drawing the same graph or network, might include more or fewer nodes and edges.

Neither one would be more correct than the other, but they would be different networks or graphs.

I mention that because it implies that for every graph or network that we write down, there are other graphs or networks lurking just beyond our reach.

Perhaps within the reach of others, but just not ourselves.

That seems to me to strike at the heart of the notion of primitives in the various logics and ontologies.

There may well be primitives from a particular point of view but only from a particular point of view.

So when someone assures you that a particular set of primitives is required for their semantic solution, be sure you hear that as the limits of their graph, network, or semantics. Your mileage may vary.

January 11, 2011

Every Subject A Topic?

Filed under: Authoring Topic Maps,Graphs,Networks — Patrick Durusau @ 10:16 am

The obvious answer to the question: Every Subject A Topic?, is no but I wanted to write up a specific use case I saw discussed today.

I was watching Understanding Graph Databases with Darren Wood, part of the NoSQL Tapes earlier today.

Wood mentioned that in intelligence work a node that has a lot of connections to other nodes really isn’t that interesting.

For example, when modeling telephone calls, the fact that everyone calls the local pizza place isn’t all that interesting.

On the other hand, a node with few connections, especially a connection that bridges subgraphs, could be very interesting.
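A minimal sketch of that idea, assuming an undirected call graph: flag nodes whose removal splits the graph, then rank them by how few connections they have. The call records and names are invented, and the brute-force component count is only meant for small examples.

```python
from collections import defaultdict

# Hypothetical call records; every name is made up for illustration.
calls = [
    ("ann", "pizza"), ("bob", "pizza"), ("cat", "pizza"),
    ("ann", "bob"), ("bob", "cat"),
    ("cat", "dan"),   # dan is the only link between the two groups
    ("dan", "eve"),
    ("eve", "fay"), ("eve", "gus"), ("fay", "gus"),
]

def build_graph(pairs):
    """Build an undirected adjacency map from (caller, callee) pairs."""
    graph = defaultdict(set)
    for a, b in pairs:
        graph[a].add(b)
        graph[b].add(a)
    return dict(graph)

def count_components(graph, skip=None):
    """Count connected components, optionally ignoring one node."""
    seen, components = set(), 0
    for start in graph:
        if start == skip or start in seen:
            continue
        components += 1
        stack = [start]
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            stack.extend(n for n in graph[node] if n != skip and n not in seen)
    return components

def bridging_nodes(graph):
    """Nodes whose removal splits the graph, fewest connections first."""
    base = count_components(graph)
    cut = [n for n in graph if count_components(graph, skip=n) > base]
    return sorted(cut, key=lambda n: (len(graph[n]), n))

graph = build_graph(calls)
candidates = bridging_nodes(graph)
```

In this toy data the pizza place never shows up as a candidate, while the two-connection node that joins the groups sorts to the front.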

I thought about that in terms of modeling say campaign finances with a topic map.

I could have a topic that represents Democrats, one that represents Republicans and one for each of the other parties.

Plus create an association with each of those topics for each donation.

But noisy when you think about it from the perspective of the resulting graph.

Some options come to mind:

  1. Preserve the information but as part of each donation represented as a topic.
  2. Create a topic that is just the number of donations and the sum donated.
  3. A variant on #2 except by zip code, to enable a map coloring of donations by zip code.
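Options 2 and 3 amount to aggregating before building the map. A small sketch, assuming donation records with hypothetical field names (`party`, `zip`, `amount`) and invented values:

```python
from collections import defaultdict

# Hypothetical donation records; field names and values are invented.
donations = [
    {"donor": "D1", "party": "Democratic", "zip": "30301", "amount": 250},
    {"donor": "D2", "party": "Republican", "zip": "30301", "amount": 100},
    {"donor": "D3", "party": "Democratic", "zip": "30014", "amount": 50},
    {"donor": "D4", "party": "Democratic", "zip": "30301", "amount": 75},
]

def summarize(records, *keys):
    """Collapse donations into one summary per key tuple: count and total."""
    summary = defaultdict(lambda: {"count": 0, "total": 0})
    for rec in records:
        group = tuple(rec[k] for k in keys)
        summary[group]["count"] += 1
        summary[group]["total"] += rec["amount"]
    return dict(summary)

by_party = summarize(donations, "party")             # option 2: one summary per party
by_party_zip = summarize(donations, "party", "zip")  # option 3: per party and zip code
```

Each summary entry could then become a single topic, rather than one association per donation.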

Will have to think about different ways to create a topic map on the same data.

To establish a baseline for comparing modeling choices.

Finishing up ODF edits this month but perhaps something in the February time frame.

January 10, 2011

NoSQL Tapes

Filed under: Cassandra,CouchDB,Graphs,MongoDB,Neo4j,Networks,NoSQL,OrientDB,Social Networks — Patrick Durusau @ 1:33 pm

NoSQL Tapes: A filmed compilation of interviews, explanations & case studies

From the email announcement by Tim Anglade:

Late last year, as the NOSQL Summer drew to a close, I got the itch to start another NOSQL community project. So, with the help of vendors Scality and InfiniteGraph, I toured around the world for 77 days to meet and record video interviews with 40+ NOSQL vendors, users and dudes-you-can-trust.

….

My original goals were to attempt to map a comprehensive view of the NOSQL world, its origins, its current trends and potential future. NOSQL knowledge seemed to me to be heavily fragmented and hard to reconcile across projects, vendors & opinions. I wanted to try to foster more sharing in our community and figure out what people thought ‘NOSQL’ meant. As it happens, I ended up learning quite a lot in the process (as I’m sure even seasoned NOSQLers on this list will too).

I’d like to take this opportunity to thank everybody who agreed to participate in this series: 10gen, Basho, Cloudant, CouchOne, FourSquare, Ben Black, RethinkDB, MarkLogic, Cloudera, SimpleGeo, LinkedIn, Membase, Ryan Rawson, Cliff Moon, Gemini Mobile, Furuhashi-san, Luca Garulli, Sergio Bossa, Mathias Meyer, Wooga, Neo4J, Acunu (and a few other special guests I’m keeping under wraps for now); I couldn’t have done it without them and learned by leaps & bounds for every hour I spent with each of them.

I’d also like to thank my two sponsors, Scality & InfiniteGraph, from the bottom of my heart. They were supportive in a way I didn’t think companies could be and gave me total control of the shape & content of the project. I’d encourage you to check them out if you haven’t done so already.

As always, I’ll be glad to take any comments or suggestions you may have either by email (tim@nosqltapes.com) or on Twitter (@timanglade).

Simply awesome!

January 9, 2011

Center for Computational Analysis of Social and Organizational Systems (CASOS)

Center for Computational Analysis of Social and Organizational Systems (CASOS)

Home of both ORA and AutoMap but I thought it merited an entry of its own.

Directed by Dr. Kathleen Carley:

CASOS brings together computer science, dynamic network analysis and the empirical study of complex socio-technical systems. Computational and social network techniques are combined to develop a better understanding of the fundamental principles of organizing, coordinating, managing and destabilizing systems of intelligent adaptive agents (human and artificial) engaged in real tasks at the team, organizational or social level. Whether the research involves the development of metrics, theories, computer simulations, toolkits, or new data analysis techniques, advances in computer science are combined with a deep understanding of the underlying cognitive, social, political, business and policy issues.

CASOS is a university wide center drawing on a group of world class faculty, students and research and administrative staff in multiple departments at Carnegie Mellon. CASOS fosters multi-disciplinary research in which students and faculty work with students and faculty in other universities as well as scientists and practitioners in industry and government. CASOS research leads the way in examining network dynamics and in linking social networks to other types of networks such as knowledge networks. This work has led to the development of new statistical toolkits for the collection and analysis of network data (Ora and AutoMap). Additionally, a number of validated multi-agent network models in areas as diverse as network evolution, bio-terrorism, covert networks, and organizational adaptation have been developed and used to increase our understanding of real socio-technical systems.

CASOS research spans multiple disciplines and technologies. Social networks, dynamic networks, agent based models, complex systems, link analysis, entity extraction, link extraction, anomaly detection, and machine learning are among the methodologies used by members of CASOS to tackle real world problems.

Definitely a group that bears watching by anyone interested in topic maps!

AutoMap – Extracting Topic Maps from Texts?

Filed under: Authoring Topic Maps,Entity Extraction,Networks,Semantics,Software — Patrick Durusau @ 10:59 am

AutoMap: Extract, Analyze and Represent Relational Data from Texts (according to its webpage).

From the webpage:

AutoMap is a text mining tool that enables the extraction of network data from texts. AutoMap can extract content analytic data (words and frequencies), semantic networks, and meta-networks from unstructured texts developed by CASOS at Carnegie Mellon. Pre-processors for handling pdf’s and other text formats exist. Post-processors for linking to gazetteers and belief inference also exist. The main functions of AutoMap are to extract, analyze, and compare texts in terms of concepts, themes, sentiment, semantic networks and the meta-networks extracted from the texts. AutoMap exports data in DyNetML and can be used interoperably with *ORA.

AutoMap uses parts of speech tagging and proximity analysis to do computer-assisted Network Text Analysis (NTA). NTA encodes the links among words in a text and constructs a network of the linked words.

AutoMap subsumes classical Content Analysis by analyzing the existence, frequencies, and covariance of terms and themes.

For a rough cut at a topic map from a text, AutoMap looks like a useful tool.
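To give a feel for the proximity part of Network Text Analysis, here is a toy sketch that links words appearing within a few positions of each other. It is not AutoMap's implementation: there is no part-of-speech tagging, stop-word handling or thesaurus, and the function name and `window` parameter are my own assumptions.

```python
import re
from collections import Counter

def proximity_network(text, window=3):
    """Count links between each word and the next (window - 1) words after it."""
    words = re.findall(r"[a-z]+", text.lower())
    links = Counter()
    for i, word in enumerate(words):
        for other in words[i + 1 : i + window]:
            if other != word:
                # Store each link once, regardless of word order.
                links[tuple(sorted((word, other)))] += 1
    return links

sample = "Topic maps merge topic maps"
network = proximity_network(sample, window=2)
```

The keys of `network` are word pairs and the values are link weights, which is enough to start sketching candidate topics and associations from a text.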

In addition to the software, training material and other information is available.

My primary interest is the application of such a tool to legislative debates, legislation and court decisions.

None of those occur in a vacuum and topic maps could help provide a context for understanding such material.

ORA – Topic Maps as Networks?

Filed under: Networks,Software — Patrick Durusau @ 10:28 am

ORA (Organization Risk Analyzer) is a toolkit developed for the analysis of organizational networks that could prove to be very useful for topic maps when viewed as networks.

From the website:

*ORA is a dynamic meta-network assessment and analysis tool developed by CASOS at Carnegie Mellon. It contains hundreds of social network, dynamic network metrics, trail metrics, procedures for grouping nodes, identifying local patterns, comparing and contrasting networks, groups, and individuals from a dynamic meta-network perspective. *ORA has been used to examine how networks change through space and time, contains procedures for moving back and forth between trail data (e.g. who was where when) and network data (who is connected to whom, who is connected to where …), and has a variety of geo-spatial network metrics, and change detection techniques. *ORA can handle multi-mode, multi-plex, multi-level networks. It can identify key players, groups and vulnerabilities, model network changes over time, and perform COA analysis. It has been tested with large networks (10^6 nodes per 5 entity classes). Distance based, algorithmic, and statistical procedures for comparing and contrasting networks are part of this toolkit.
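The move from trail data to network data can be sketched in a few lines. This is only my reading of the description above, not *ORA's procedure; the record layout `(who, where, when)` and all values are invented.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical trail data: (who, where, when). All values are invented.
trail = [
    ("ann", "cafe", 1), ("bob", "cafe", 1),
    ("ann", "office", 2), ("cat", "office", 2),
    ("bob", "cafe", 3),
]

def who_where(records):
    """Agent-to-location edges, weighted by number of visits."""
    edges = defaultdict(int)
    for who, where, _ in records:
        edges[(who, where)] += 1
    return dict(edges)

def who_whom(records):
    """Agent-to-agent edges for agents seen at the same place at the same time."""
    present = defaultdict(set)
    for who, where, when in records:
        present[(where, when)].add(who)
    edges = defaultdict(int)
    for people in present.values():
        for a, b in combinations(sorted(people), 2):
            edges[(a, b)] += 1
    return dict(edges)

location_net = who_where(trail)
social_net = who_whom(trail)
```

Going the other way, from network back to trail, is not possible from these outputs alone, since the time dimension has been folded into edge weights.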

Comments on which parts of this toolkit you find the most useful welcome.
