Another Word For It – Patrick Durusau on Topic Maps and Semantic Diversity

May 30, 2012

Graph Theory and Complex Networks: An Introduction

Filed under: Graphs,Networks — Patrick Durusau @ 9:32 am

Graph Theory and Complex Networks: An Introduction by Maarten van Steen.

From the webpage:

GTCN aims to explain the basics of graph theory that are needed at an introductory level for students in computer or information sciences. To motivate students and to show that even these basic notions can be extremely useful, the book also aims to provide an introduction to the modern field of network science.

I take the starting-point that mathematics for most students is unnecessarily intimidating. Explicit attention is paid in the first chapters to mathematical notations and proof techniques, emphasizing that the notations form the biggest obstacle, not the mathematical concepts themselves. Taking this approach has allowed me to gradually prepare students for using tools that are necessary to put graph theory to work: complex networks.

In the second part of the book the student learns about random networks, small worlds, the structure of the Internet and the Web, and social networks. Again, everything is discussed at an elementary level, but such that in the end students indeed have the feeling that they:

  1. Have learned how to read and understand the basic mathematics related to graph theory
  2. Understand how basic graph theory can be applied to optimization problems such as routing in communication networks
  3. Know a bit more about this sometimes mystical field of small worlds and random networks.

The full text of Graph Theory and Complex Networks (GTCN) is available as a “personalized” download (“personalized for” at the top of each page and “your email address” at the bottom of each page) or from Amazon for $25.00.

Additional course materials are also available at this site.

You will be amused to read about the difficulty of graph/network notation:

It is also not that difficult, as most notations come directly from set theory.

That's reassuring. 😉

GTCN offers suggestions for translating mathematical notation into English. A useful skill, here and elsewhere.

I ran across this resource at: Introductions to Graph Databases and Theory.

May 29, 2012

Network diagrams simplified

Filed under: Graphics,Networks,Visualization — Patrick Durusau @ 9:14 am

Network diagrams simplified

Kim Rees at Flowing Data writes of a new network visualization technique by Cody Dunne:

In essence, he aggregates leaf nodes into a fan glyph that describes the underlying data in its size, arc, and color. Span nodes are similarly captured into crescent glyphs. The result is an easy to read, high level look at the network. You can easily compare different sections of the network, understand areas that may have been occluded by the lines in a traditional diagram, and see relationships far more quickly.

The explanation is useful but I think you will find the visualizations impressive!

Check Kim’s post for images and links to more materials.

May 25, 2012

Silo Indictment #1,000,001

Filed under: Functional Decomposition,Hierarchy,Networks,Silos — Patrick Durusau @ 5:08 am

Derek Miers writes what may be the 1,000,001st indictment of silos in Silos and Functional Decomposition:

I think we would all agree that BPM and business architecture set out to overcome the issues associated with silos. And I think we would also agree that the problems associated with silos derive from functional decomposition.

While strategy development usually takes a broad, organization-wide view, so many change programs still cater to the sub-optimization perspectives of individual silos. Usually, these individual change programs consist of projects that deal with the latest problem to rise to the top of the political agenda – effectively applying a Band-Aid to fix a broken customer-facing process or put out a fire associated with some burning platform.

Silo-based thinking is endemic to Western culture – it's everywhere. This approach to management is very much a command-and-control mentality injected into our culture by folks like Smith, Taylor, Newton and Descartes. Let's face it: the world has moved on, and the network is now far more important than the hierarchy.

But guess what technique about 99.9% of us use to fix the problems associated with functional decomposition? You guessed it: yet more functional decomposition. I think Einstein had something to say about using the same techniques and expecting different results. This is a serious groupthink problem!

When we use functional decomposition to model processes, we usually conflate the organizational structure with the work itself. Rather than breaking down the silos, this approach reinforces them – effectively putting them on steroids. And when other techniques emerge that explicitly remove the conflation of process and organizational structure, those who are wedded to old ways of thinking come out of the woodwork to shoot them down. Examples include role activity diagrams (Martyn Ould), value networks (Verna Allee), and capability mapping (various authors, including Forrester analysts).

Or it may be silo indictment #1,000,002; it is hard to keep an accurate count.

I don’t doubt a word that Derek says, although I might put a different emphasis on parts of it.

But in any case, let’s just emphasize agreement that silos are a problem.

Now what?

May 24, 2012

Lima on Networks

Filed under: Complex Networks,Complexity,Graphs,Networks — Patrick Durusau @ 2:10 pm

I saw a mention of RSA Animate – The Power of Networks by Manuel Lima over at Flowing Data.

A high speed chase through ideas but the artistry of the presentation and presenter make it hold together quite nicely.

Manuel makes the case that the organization of information is more complex than trees. In fact, he makes a good case for networks being a better model.

If that bothers you, you might want to cut Manuel some slack or perhaps even support the “network” (singular) model.

There are those of us who don’t think a single network is sufficient.

😉

Resources to review before viewing the video:

Science and Complexity – Warren Weaver (1948 – reprint): The paper that Manuel cites in his presentation.

Wikipedia – Complexity: Not bad as Wikipedia entries go. At least it's a starting point.

May 20, 2012

An Example of Social Network Analysis with R using Package igraph

Filed under: igraph,Networks,R,Social Networks — Patrick Durusau @ 10:33 am

An Example of Social Network Analysis with R using Package igraph by Yanchang Zhao.

From the post:

This post presents an example of social network analysis with R using package igraph.

The data to analyze is Twitter text data of @RDataMining used in the example of Text Mining, and it can be downloaded as file "termDocMatrix.rdata" at the Data webpage. Putting it in a general scenario of social networks, the terms can be taken as people and the tweets as groups on LinkedIn, and the term-document matrix can then be taken as the group membership of people. We will build a network of terms based on their co-occurrence in the same tweets, which is similar to a network of people based on their group memberships.

I like the re-use of traditional social network analysis with tweets.

And the building of a network of terms based on co-occurrence.
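That construction is easy to sketch outside of R as well. Here is a minimal sketch of the same idea, with a toy term-document matrix of my own and Python's networkx standing in for the igraph package the post uses:

import numpy as np
import networkx as nx

# Toy term-document matrix: rows = terms, columns = tweets (made-up data,
# standing in for the termDocMatrix.rdata file used in the post).
terms = ["mining", "data", "r", "network", "analysis"]
M = np.array([
    [1, 1, 0, 1, 0],
    [1, 1, 1, 1, 0],
    [0, 1, 1, 0, 1],
    [0, 0, 0, 1, 1],
    [0, 0, 0, 1, 1],
])

# Term-term co-occurrence: entry (i, j) counts the tweets mentioning both terms.
C = M @ M.T

# Build an undirected, weighted term network from the co-occurrence counts.
G = nx.Graph()
G.add_nodes_from(terms)
for i in range(len(terms)):
    for j in range(i + 1, len(terms)):
        if C[i, j] > 0:
            G.add_edge(terms[i], terms[j], weight=int(C[i, j]))

print(G.edges(data=True))

Swap in your own term-document matrix and the same dozen lines give you a term network of the same flavor as the one the post goes on to analyze.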

It may or may not serve your purposes, but:

If you don’t look, you won’t see.

May 7, 2012

Clusters & Communities (overlapping dense groups in networks)

Filed under: CFinder,Clustering,Networks — Patrick Durusau @ 7:18 pm

Clusters & Communities (overlapping dense groups in networks) (CFinder)

From the webpage:

CFinder is free software for finding and visualizing overlapping dense groups of nodes in networks, based on the Clique Percolation Method (CPM) of Palla et al., Nature 435, 814-818 (2005). CFinder was recently applied to the quantitative description of the evolution of social groups: Palla et al., Nature 446, 664-667 (2007).

CFinder offers a fast and efficient method for clustering data represented by large graphs, such as genetic or social networks and microarray data. CFinder is also very efficient for locating the cliques of large sparse graphs.

I rather like the title of the webpage as opposed to simply "CFinder," which is what I started to use. That would be accurate but wouldn't capture the notion of discovering overlapping dense groups in networks.

Whether we realize it or not, the choice of a basis for relationships can produce or conceal any number of dense overlapping groups in a network.

I have mentioned CFinder elsewhere but wanted to call it out, in part to raise its position on my horizon.
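If you want to experiment with the Clique Percolation Method without installing CFinder, networkx ships a k-clique community routine based on the same idea. A toy sketch (my own example graph, not CFinder's code):

import networkx as nx
from networkx.algorithms.community import k_clique_communities

# Toy graph with two dense regions that share node "d", so the k=3 communities overlap.
G = nx.Graph()
G.add_edges_from([
    ("a", "b"), ("a", "c"), ("b", "c"), ("b", "d"), ("c", "d"),  # dense group 1
    ("d", "e"), ("d", "f"), ("e", "f"), ("e", "g"), ("f", "g"),  # dense group 2
])

# CPM: a community is a union of adjacent k-cliques (k-cliques sharing k-1 nodes).
for community in k_clique_communities(G, 3):
    print(sorted(community))

Node "d" shows up in both communities, which is exactly the overlap that distinguishes CPM from hard clustering.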

Parallel clustering with CFinder

Filed under: CFinder,Clustering,Networks,Parallel Programming — Patrick Durusau @ 7:18 pm

Parallel clustering with CFinder by Peter Pollner, Gergely Palla, and Tamas Vicsek.

Abstract:

The amount of available data about complex systems is increasing every year, as measurements of larger and larger systems are collected and recorded. A natural representation of such data is given by networks, whose size follows the size of the original system. The current trend of multiple cores in computing infrastructures calls for a parallel reimplementation of earlier methods. Here we present the grid version of CFinder, which can locate overlapping communities in directed, weighted or undirected networks based on the clique percolation method (CPM). We show that the computation of the communities can be distributed among several CPUs or computers. Although switching to the parallel version does not necessarily lead to a gain in computing time, it definitely makes the community structure of extremely large networks accessible.

If you aren’t familiar with CFinder, you should be.

May 1, 2012

Basic graph analytics using igraph

Filed under: Graphs,igraph,Networks — Patrick Durusau @ 4:47 pm

Basic graph analytics using igraph by Ricky Ho.

From the post:

Social network sites such as Facebook and Twitter have become an integral part of people's lives. People interact with each other in different forms of activities, and a lot of information has been captured in the social network. Mining such a network can reveal some very useful information that can help an organization gain competitive advantages.

I recently came across a powerful tool called igraph that provides some very powerful graph mining capabilities. Following are some interesting things that I have found.

Ricky doesn't give a link to igraph, which you can find here (there is also a development version).

He does cover (a rough sketch of the same basics, in Python rather than R, follows the list):

  • Create a Graph
  • Basic Graph Algorithms
  • Graph Statistics
  • Centrality Measures

April 21, 2012

Complex Networks -> Local Networks

Filed under: Complex Networks,Graphs,Networks — Patrick Durusau @ 4:34 pm

The Game of Go: A Complex Network post reminds us that complex networks, with care, can be decomposed into local networks.

From the post:

Using a database containing around 5 000 games played by professional and amateur go players in international tournaments, Bertrand Georgeot from the Theoretical Physics Laboratory (UniversitĆ© Toulouse III-Paul Sabatier/CNRS) and Olivier Giraud from the Laboratory of Theoretical Physics and Statistical Models (UniversitĆ© Paris-Sud/CNRS) applied network theory to this game of strategy. They constructed a network whose nodes are local patterns on the board, while the vertices (which represent the links) reflect the sequence of moves. This enabled them to recapture part of the local game strategy. In this game, where players place their stones at the intersections of a grid consisting of 19 vertical and 19 horizontal lines (making 361 intersections), the researchers studied local patterns of 9 intersections. They showed that the statistical frequency distribution of these patterns follows Zipf’s law, similar to the frequency distribution of words in a language.

Although the go network’s features resemble those of other real networks (social networks or the Internet), it has its own specific properties. While the most recent simulation programs already include statistical data from real games, albeit at a still basic level, these new findings should allow better modeling of this kind of board game.

The researchers did not even attempt to solve the entire board but rather looked for “local” patterns on the board.

What “local patterns” are you missing in “big data?”

Article reference: The game of go as a complex network. B. Georgeot and O. Giraud 2012 EPL 97 68002.

Abstract:

We study the game of go from a complex network perspective. We construct a directed network using a suitable definition of tactical moves including local patterns, and study this network for different datasets of professional and amateur games. The move distribution follows Zipf’s law and the network is scale free, with statistical peculiarities different from other real directed networks, such as, e.g., the World Wide Web. These specificities reflect in the outcome of ranking algorithms applied to it. The fine study of the eigenvalues and eigenvectors of matrices used by the ranking algorithms singles out certain strategic situations. Our results should pave the way to a better modelization of board games and other types of human strategic scheming.

March 30, 2012

DiaGen/DiaMeta

Filed under: DiaGen,DiaMeta,Graphs,Networks — Patrick Durusau @ 4:38 pm

DiaGen/DiaMeta

From the webpage:

The Diagram Editor Generator

DiaGen

DiaGen is a system for easily developing powerful diagram editors. It consists of two main parts:

  • A framework of Java classes that provide generic functionality for editing and analyzing diagrams.
  • A GUI tool (the DiaGen designer) for specifying the diagram language and automatically generating a visual editor from this specification.

The combination of the following main features distinguishes DiaGen from other existing diagram editing/analysis systems:

  • DiaGen editors include an analysis module to recognize the structure and syntactic correctness of diagrams on-line during the editing process. The structural analysis is based on hypergraph transformations and grammars, which provide a flexible syntactic model and allow for efficient parsing.
  • DiaGen has been specially designed for fault-tolerant parsing and handling of diagrams that are only partially correct.
  • DiaGen uses the structural analysis results to provide syntactic highlighting and an interactive automatic layout facility. The layout mechanism is based on flexible geometric constraints and relies on an external constraint-solving engine.
  • DiaGen combines free-hand editing in the manner of a drawing program with syntax-directed editing for major structural modifications of the diagram. The language implementor can therefore easily supply powerful syntax-oriented operations to support frequent editing tasks, but she does not have to worry about explicitly considering every editing requirement that may arise.
  • DiaGen is entirely written in Java and is based on Java SE (Version 6 is required). It is therefore platform-independent and can take full advantage of all the features of the Java2D graphics API: For example, DiaGen supports unrestricted zooming, and rendering quality is adjusted automatically during user interactions.

DiaGen uses hypergraph grammars to specify diagram languages. While this approach is powerful and its theory is very clear, it is hard for the inexperienced user to model an editor. A very common solution to this problem is meta-modeling, which is used by DiaMeta.

DiaMeta

DiaMeta allows the use of meta models instead of grammars to specify visual languages. The current implementation employs the Eclipse Modeling Framework (EMF). Additionally, support for MOF (via MOFLON) is currently being added. Editors generated by DiaMeta have the same benefits as those generated by DiaGen, and they show similar behaviour to DiaGen editors.

If you need a custom diagram editor, this may be a good place to look around.

NodeXL: Network Overview, Discovery and Exploration for Excel

Filed under: Excel,Graphs,Networks,NodeXL — Patrick Durusau @ 4:37 pm

NodeXL: Network Overview, Discovery and Exploration for Excel

From the webpage:

NodeXL is a free, open-source template for MicrosoftĀ® ExcelĀ® 2007 and 2010 that makes it easy to explore network graphs. With NodeXL, you can enter a network edge list in a worksheet, click a button and see your graph, all in the familiar environment of the Excel window.

NodeXL Features

  • Flexible Import and Export Import and export graphs in GraphML, Pajek, UCINet, and matrix formats.
  • Direct Connections to Social Networks Import social networks directly from Twitter, YouTube, Flickr and email, or use one of several available plug-ins to get networks from Facebook, Exchange and WWW hyperlinks.
  • Zoom and Scale Zoom into areas of interest, and scale the graph’s vertices to reduce clutter.
  • Flexible Layout Use one of several “force-directed” algorithms to lay out the graph, or drag vertices around with the mouse. Have NodeXL move all of the graph’s smaller connected components to the bottom of the graph to focus on what’s important.
  • Easily Adjusted Appearance Set the color, shape, size, label, and opacity of individual vertices by filling in worksheet cells, or let NodeXL do it for you based on vertex attributes such as degree, betweenness centrality or PageRank.
  • Dynamic Filtering Instantly hide vertices and edges using a set of sliders – hide all vertices with degree less than five, for example.
  • Powerful Vertex Grouping Group the graph’s vertices by common attributes, or have NodeXL analyze their connectedness and automatically group them into clusters. Make groups distinguishable using shapes and color, collapse them with a few clicks, or put each group in its own box within the graph. “Bundle” intergroup edges to make them more manageable.
  • Graph Metric Calculations Easily calculate degree, betweenness centrality, closeness centrality, eigenvector centrality, PageRank, clustering coefficient, graph density and more.
  • Task Automation Perform a set of repeated tasks with a single click.

Homepage for NodeXL, which uses Excel as the framework for display and exploration of graphs.

There is something to be said for software that ties itself to other successful software. I think that is "increased chances of success." Don't you?

Structural Analysis of Large Networks: Observations and Applications

Filed under: Graphs,Networks,Social Networks — Patrick Durusau @ 4:34 pm

Structural Analysis of Large Networks: Observations and Applications by Mary McGlohon.

Abstract:

Network data (also referred to as relational data, social network data, real graph data) has become ubiquitous, and understanding patterns in this data has become an important research problem. We investigate how interactions in social networks are formed and how these interactions facilitate diffusion, model these behaviors, and apply these findings to real-world problems.

We examined graphs of size up to 16 million nodes, across many domains from academic citation networks, to campaign contributions and actor-movie networks. We also performed several case studies in online social networks such as blogs and message board communities.

Our major contributions are the following: (a) We discover several surprising patterns in network topology and interactions, such as the Popularity Decay power law (in-links to a blog post decay with a power law with a -1.5 exponent) and the oscillating size of connected components; (b) We propose generators such as the Butterfly generator that reproduce both established and new properties found in real networks; (c) several case studies, including a proposed method of detecting misstatements in accounting data, where using network effects gave a significant boost in detection accuracy.

A dissertation that establishes it isn’t the size of the network (think “web scale”) but the skill with which it is analyzed that is important.

McGlohon investigates the discovery of outliers, fraud and the like.

Worth reading and then formulating questions for your graph/graph database vendor about their support for such features.
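McGlohon's "Popularity Decay" finding is also the kind of claim you can sanity-check on your own data. A minimal sketch (synthetic counts, not her data) that estimates a power-law exponent by least squares on log-log values:

import numpy as np

# Synthetic example: number of new in-links on day t after a post is published.
days = np.arange(1, 31)
links = 200 * days ** -1.5 + np.random.default_rng(0).normal(0, 0.2, size=days.size)
links = np.clip(links, 0.1, None)  # keep values positive before taking logs

# Fit log(links) = a + b * log(days); the slope b estimates the exponent.
b, a = np.polyfit(np.log(days), np.log(links), 1)
print("estimated exponent:", round(b, 2))  # should come out near -1.5 for this synthetic series

A log-log least-squares fit is a crude estimator, but it is enough to see whether your own in-link counts decay anything like a power law.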

The structure and function of complex networks (2003)

Filed under: Complex Networks,Graphs,Networks — Patrick Durusau @ 4:34 pm

The structure and function of complex networks by M. E. J. Newman (2003).

Abstract:

Inspired by empirical studies of networked systems such as the Internet, social networks, and biological networks, researchers have in recent years developed a variety of techniques and models to help us understand or predict the behavior of these systems. Here we review developments in this field, including such concepts as the small-world effect, degree distributions, clustering, network correlations, random graph models, models of network growth and preferential attachment, and dynamical processes taking place on networks.

Not the earliest survey of work on complex networks nor the latest but one that gives a good overview of the area. I will be citing later survey work on graphs and complex networks. Your pointers/suggestions are most welcome.

March 9, 2012

Global Brain Institute

Filed under: Artificial Intelligence,Networks — Patrick Durusau @ 8:44 pm

Global Brain Institute

From the webpage (under development):

The Global Brain can be defined as the distributed intelligence emerging from the planetary network of people and machines, as supported by the Internet. The Global Brain Institute (GBI) was founded in January 2012 at the Vrije Universiteit Brussel to research this revolutionary phenomenon. The GBI grew out of the Global Brain Group, an international community of researchers founded in 1996.

Mission

Tim Berners-Lee's breakthrough invention of the Web stems from a simple and easy way to link any kind of information, anywhere on Earth. Since then, the development of the web has been largely an erratic proliferation of mutually incompatible Web 2.0 technologies with no clear direction. This demands a new unified paradigm to facilitate their integration.

The Global Brain Institute intends to develop a theory of the global brain that would help us to understand and steer this on-going evolution towards ever-stronger interconnection between humans and machines. If successful, this would help us achieve a much higher level of distributed intelligence that would allow us to efficiently tackle global problems too complex for present approaches.

Objectives

  • Develop a theory of the Global Brain that may offer us a long-term vision of where our information society is heading.
  • Build a mathematical model and computer simulation of the structure and dynamics of the Global Brain.
  • Survey the most important developments in society and ICT that are likely to impact on the evolution of the Global Brain.
  • Compare these observations with the implications of the theory.
  • Investigate how both observed and theorized developments may contribute to the main indicators of globally intelligent organization:
    • education, democracy, freedom, peace, development, sustainability, well-being, etc.
  • Disseminate our understanding of the Global Brain towards a wider public, so as to make people aware of this impending revolution.

Our approach

We see people, machines and software systems as agents that communicate via a complex network of communication links. Problems, questions or opportunities define challenges that may incite these agents to act.

Challenges that cannot be fully resolved by a single agent are normally propagated to one or more other agents, along the links in the network. These agents contribute their own expertise to resolving the challenge, and if necessary propagate the challenge further, until it is fully resolved. Thus, the skills and knowledge of the different agents are pooled into a collective intelligence much more powerful than the one of its individual members.

The propagation of challenges across the global network is a complex, self-organizing process, similar to the “spreading activation” that characterizes thinking in the human brain. This process will typically change the network by reinforcing useful links, while weakening the others. Thus, the network learns or adapts to new challenges, becoming more intelligent in the process.
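The description stays at the conceptual level, but the propagate-and-reinforce loop is easy to caricature in a few lines. A toy sketch of my own (not a GBI model): a challenge is passed along weighted links until some agent can resolve it, links on the successful path are strengthened, and dead ends are weakened.

import random

# Agents and weighted communication links (toy data).
links = {
    "alice": {"bob": 0.5, "carol": 0.5},
    "bob":   {"dave": 0.5},
    "carol": {"dave": 0.5},
    "dave":  {},
}
can_solve = {"alice": False, "bob": False, "carol": False, "dave": True}

def propagate(agent, rng, hops=0, max_hops=5):
    """Pass a challenge along links until some agent resolves it,
    reinforcing every link on the successful path."""
    if can_solve[agent] or hops >= max_hops:
        return can_solve[agent], []
    neighbors = list(links[agent])
    if not neighbors:
        return False, []
    weights = [links[agent][n] for n in neighbors]
    nxt = rng.choices(neighbors, weights=weights, k=1)[0]
    solved, path = propagate(nxt, rng, hops + 1, max_hops)
    if solved:
        links[agent][nxt] += 0.1   # reinforce the useful link
        return True, [(agent, nxt)] + path
    links[agent][nxt] *= 0.9       # weaken a link that led nowhere
    return False, []

rng = random.Random(1)
print(propagate("alice", rng))
print(links["alice"])  # weights drift toward whichever neighbor reaches dave

Run it a few hundred times and the link weights out of "alice" tilt toward whichever route reliably reaches an agent that can resolve the challenge, which is the learning-network behavior the GBI text describes.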

Sounds to me like there are going to be subject identity issues galore in a project such as this one.

January 22, 2012

The Role of Social Networks in Information Diffusion

Filed under: Networks,Social Graphs,Social Media,Social Networks — Patrick Durusau @ 7:35 pm

The Role of Social Networks in Information Diffusion by Eytan Bakshy, Itamar Rosenn, Cameron Marlow and Lada Adamic.

Abstract:

Online social networking technologies enable individuals to simultaneously share information with any number of peers. Quantifying the causal effect of these technologies on the dissemination of information requires not only identification of who influences whom, but also of whether individuals would still propagate information in the absence of social signals about that information. We examine the role of social networks in online information diffusion with a large-scale field experiment that randomizes exposure to signals about friends’ information sharing among 253 million subjects in situ. Those who are exposed are significantly more likely to spread information, and do so sooner than those who are not exposed. We further examine the relative role of strong and weak ties in information propagation. We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information. This suggests that weak ties may play a more dominant role in the dissemination of information online than currently believed.

Sample size: 253 million Facebook users.

Pay attention to the line:

We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information.

If you have a "Web scale" (whatever that means) information delivery issue, you should not only target CNN and Drudge with press releases but also consider targeting actors with abundant weak ties.

I think this could be important in topic map driven applications that "push" novel information into the social network of a large, distributed company. You know how few of us actually read the tiresome broadcast stuff from HR, etc., so what if the important parts were "reported" piecemeal by others?

It is great to have a large functioning topic map but it doesn’t become useful until people make the information it delivers their own and take action based upon it.

January 21, 2012

Getting Started With The Gephi…

Filed under: Gephi,Graphs,Networks,Visualization — Patrick Durusau @ 10:10 pm

Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I by Tony Hirst.

From the post:

A couple of weeks ago, I came across Gephi, a desktop application for visualising networks.

And quite by chance, a day or two after I was asked about any tools I knew of that could visualise and help analyse social network activity around an OU course… which I take as a reasonable justification for exploring exactly what Gephi can do 🙂

So, after a few false starts, here's what I've learned so far…

First up, we need to get some graph data – netvizz – facebook to gephi suggests that the netvizz facebook app can be used to grab a copy of your Facebook network in a format that Gephi understands, so I installed the app, downloaded my network file, and then uninstalled the app… (can't be too careful 😉

Once Gephi is launched (and updated, if it's a new download – you'll see an updates prompt in the status bar along the bottom of the Gephi window, right hand side) Open… the network file you downloaded.

If you like part 1 as an introduction to Gephi, be sure to take in:

Getting Started With Gephi Network Visualisation App – My Facebook Network, Part II: Basic Filters

which starts out:

In Getting Started With Gephi Network Visualisation App – My Facebook Network, Part I I described how to get up and running with the Gephi network visualisation tool using social graph data pulled out of my Facebook account. In this post, I'll explore some of the tools that Gephi provides for exploring a network in a more structured way.

If you aren't familiar with Gephi, and if you haven't read Part I of this series, I suggest you do so now…

…done that…?

Okay, so where do we begin? As before, I'm going to start with a fresh worksheet, and load my Facebook network data, downloaded via the netvizz app, into Gephi, but as an undirected graph this time! So far, so exactly the same as last time. Just to give me some pointers over the graph, I'm going to set the node size to be proportional to the degree of each node (that is, the number of people each person is connected to).

You will find lots more to explore with Gephi but this should give you a good start.

January 16, 2012

Graph Theory and Network Science (Aurelius)

Filed under: Graphs,Neo4j,Networks — Patrick Durusau @ 2:37 pm

Graph Theory and Network Science (Aurelius)

When a post that starts out:

Graph theory and network science are two related academic fields that have found application in numerous commercial industries. The terms 'graph' and 'network' are synonymous and one or the other is favored depending on the domain of application. A Rosetta Stone of terminology is provided below to help ground the academic terms to familiar, real-world structures.

And ends:

Ranking web pages is analogous to determining the most influential people in a social network or finding the most relevant concepts in a knowledge network. Finally, all these problems are variations of one general process: graph traversing. Graph traversing is the simple process of moving from one vertex to another vertex over the edges in the graph and either mutating the structure or collecting bits of information along the way. The result of a traversal is either an evolution of the graph or a statistic about the graph.

The tools and techniques developed by graph theorists and network scientists have an astounding number of practical applications. Interestingly enough, once one has a general understanding of graph theory and network science, the world's problems start to be seen as one and the same problem.

With about as nice an introduction to why graphs/networks are important as I have read in a long time, you know it is going to be a good day!

Particularly when the sources cited on graph traversal are none other than Marko A. Rodriguez and Peter Neubauer of Neo4j fame. (They may be famous for other reasons as well but you will have to contribute those.) (Yes, I noticed who is associated with the site.)

I find the notion of mutating the structure of a graph based on traversal to be deeply interesting and useful in a topic maps context.
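As a toy illustration of "collect or mutate as you walk" (my own example, not Rodriguez and Neubauer's code), here is a sketch that walks a graph, tallies visits, and merges any pair of adjacent vertices carrying the same subject identifier, which is roughly the mutation a topic map merge would ask of a traversal:

import networkx as nx

# Toy graph; the "sid" attribute plays the role of a subject identifier (invented URLs).
G = nx.Graph()
G.add_node(1, sid="http://example.org/vannevar-bush")
G.add_node(2, sid="http://example.org/vannevar-bush")   # same subject, second node
G.add_node(3, sid="http://example.org/claude-shannon")
G.add_edges_from([(1, 2), (2, 3)])

visits = {}

def traverse(g, start):
    """Traversal that collects a statistic (visit counts) and mutates the
    structure (merges adjacent nodes with identical subject identifiers)."""
    stack, seen = [start], set()
    while stack:
        v = stack.pop()
        if v not in g or v in seen:
            continue
        seen.add(v)
        visits[v] = visits.get(v, 0) + 1
        # Mutate: fold any same-subject neighbor into v.
        for w in list(g.neighbors(v)):
            if g.nodes[v]["sid"] == g.nodes[w]["sid"]:
                g = nx.contracted_nodes(g, v, w, self_loops=False)
        # Collect: continue the walk over the (possibly updated) neighborhood.
        stack.extend(w for w in g.neighbors(v) if w not in seen)
    return g

G = traverse(G, 1)
print(G.nodes(data=True), visits)

The attributes and URLs are invented; the point is only that the traversal both gathers a statistic and rewrites the graph as it goes.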

January 15, 2012

On the Hyperbolicity of Small-World Networks and Tree-Like Graphs

Filed under: Graphs,Networks,Small World — Patrick Durusau @ 9:11 pm

On the Hyperbolicity of Small-World Networks and Tree-Like Graphs by Wei Chen, Wenjie Fang, Guangda Hu and Michael W. Mahoney.

Abstract:

Hyperbolicity is a property of a graph that may be viewed as being a “soft” version of a tree, and recent empirical and theoretical work has suggested that many graphs arising in Internet and related data applications have hyperbolic properties. Here, we consider Gromov’s notion of $\delta$-hyperbolicity, and we establish several positive and negative results for small-world and tree-like random graph models. In particular, we show that small-world random graphs built from underlying grid structures do not have strong improvement in hyperbolicity, even when the rewiring greatly improves decentralized navigation. On the other hand, for a class of tree-like graphs called ringed trees that have constant hyperbolicity, adding random links among the leaves in a manner similar to the small-world graph constructions may easily destroy the hyperbolicity of the graphs, except for a class of random edges added using an exponentially decaying probability function based on the ring distance among the leaves. In addition to being of interest in their own right, our main results shed light on the relationship between hyperbolicity and navigability, as well as the relationship between $\delta$-hyperbolicity and the use of randomness in common random graph constructions.

Something to keep you off the streets after work this coming week. 😉

To understand why this work (and work like it) is important:

Hyperbolicity, a property of metric spaces that generalizes the idea of Riemannian manifolds with negative curvature, has received considerable attention in both mathematics and computer science. When applied to graphs, as is typical in computer science applications, one may think of hyperbolicity as characterizing a "soft" version of a tree – trees are graphs that have hyperbolicity equal to zero, and graphs that "look like" trees in terms of their metric structure have "small" hyperbolicity. Since trees are an important class of graphs and since tree-like graphs arise in numerous applications, the idea of hyperbolicity has received attention in a range of applications. For example, it has found usefulness in the visualization of the Internet, the Web, and other large graphs []; it has been applied to questions of compact routing, navigation, and decentralized search in Internet graphs and small-world social networks []; and it has been applied to a range of other problems such as distance estimation, network security, sensor networks, and traffic flow and congestion minimization []. (cites omitted)
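For intuition, the δ-hyperbolicity of a small finite graph can be computed by brute force from the four-point condition: for every quadruple of vertices, take the three pairwise distance sums, and half the gap between the two largest is that quadruple's δ. A sketch (O(n^4), so only for toy graphs):

from itertools import combinations
import networkx as nx

def delta_hyperbolicity(G):
    """Brute-force Gromov delta via the four-point condition (connected graphs only)."""
    d = dict(nx.all_pairs_shortest_path_length(G))
    delta = 0.0
    for x, y, z, w in combinations(G.nodes, 4):
        s = sorted([d[x][y] + d[z][w], d[x][z] + d[y][w], d[x][w] + d[y][z]])
        delta = max(delta, (s[2] - s[1]) / 2)
    return delta

# A tree is 0-hyperbolic; a long cycle is about as far from tree-like as it gets.
print(delta_hyperbolicity(nx.balanced_tree(2, 3)))   # 0.0
print(delta_hyperbolicity(nx.cycle_graph(12)))       # noticeably larger

A topic map rendered as a graph could, in principle, be scored the same way, which bears on the question below.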

Interesting question to which I don’t know the answer off hand: Do topic maps exhibit the characteristics of a “small-world network?” Or for that matter, has anyone investigated the nature of the graphs that result from topic maps?

I will be blogging more about the graph nature of topic maps. Please post pointers, suggestions, questions.

January 14, 2012

Sufficient Conditions for Formation of a Network Topology by Self-interested Agents

Filed under: Graphs,Networks — Patrick Durusau @ 7:41 pm

Sufficient Conditions for Formation of a Network Topology by Self-interested Agents by Swapnil Dhamal and Y. Narahari.

Abstract:

The current literature on network formation primarily addresses the problem: given a set of self-interested nodes and a set of conditions, what topologies are pairwise stable and hence are likely to emerge. A pairwise stable network is one in which no node wants to delete any of its links and no two nodes would want to create a link between them. Pairwise stable networks span a wide range of topologies and some of them might be far from desirable. Motivated by the necessity for ensuring that the emerging network is exactly a desired one, we study the following reverse problem: given a network topology, what conditions are required so that best response strategies played by self-interested agents ultimately result in that network topology. We propose a sequential network formation game model that captures principal determinants of network formation, namely benefits from immediate neighbors, costs of maintaining links with immediate neighbors, benefits from indirect neighbors, and bridging benefits. Based on this model, we analyze some common network topologies, namely star, complete graph and bipartite Turán graph, and derive a set of sufficient conditions under which these network topologies emerge.
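To make the "pairwise stable" definition concrete, here is a minimal sketch of the stability test itself, with a toy utility function of my own choosing (net benefit per direct link plus a smaller benefit per two-hop neighbor), not the authors' model:

import networkx as nx

def utility(G, i, benefit=1.0, cost=0.8, indirect=0.3):
    """Toy payoff: net benefit per direct link plus a smaller benefit
    for each node exactly two hops away (an assumption, not the paper's model)."""
    direct = set(G.neighbors(i))
    two_hop = {w for v in direct for w in G.neighbors(v)} - direct - {i}
    return len(direct) * (benefit - cost) + len(two_hop) * indirect

def pairwise_stable(G):
    # No node gains by deleting one of its links...
    for u, v in G.edges:
        H = G.copy(); H.remove_edge(u, v)
        if utility(H, u) > utility(G, u) or utility(H, v) > utility(G, v):
            return False
    # ...and no unlinked pair would both (weakly, one strictly) gain by linking.
    for u in G.nodes:
        for v in G.nodes:
            if u < v and not G.has_edge(u, v):
                H = G.copy(); H.add_edge(u, v)
                gu = utility(H, u) - utility(G, u)
                gv = utility(H, v) - utility(G, v)
                if gu >= 0 and gv >= 0 and (gu > 0 or gv > 0):
                    return False
    return True

print(pairwise_stable(nx.star_graph(4)))    # True with these toy parameters
print(pairwise_stable(nx.empty_graph(4)))   # False: linking is profitable here

Under these made-up parameters the star survives the test and the empty network does not; the paper works the problem in the other direction, asking what conditions force a chosen topology to emerge.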

I mention this to suggest that, as future work, merging of nodes based on the subject they represent should be considered a benefit. Depending upon the linking, a merger could reduce the number of links between nodes, something the authors already recognize as a potential benefit. Some merging operations may also turn out to be too expensive, either temporarily or over the longer term.

Rather than an axiomatic merging rule, a cost/benefit merging rule that is invoked only in some circumstances.

January 11, 2012

Social Networks and Archival Context Project (SNAC)

Filed under: Archives,Networks,Social Graphs,Social Networks — Patrick Durusau @ 8:03 pm

Social Networks and Archival Context Project (SNAC)

From the homepage:

The Social Networks and Archival Context Project (SNAC) will address the ongoing challenge of transforming description of and improving access to primary humanities resources through the use of advanced technologies. The project will test the feasibility of using existing archival descriptions in new ways, in order to enhance access and understanding of cultural resources in archives, libraries, and museums.

Archivists have a long history of describing the people who – acting individually, in families, or in formally organized groups – create and collect primary sources. They research and describe the people who create and are represented in the materials comprising our shared cultural legacy. However, because archivists have traditionally described records and their creators together, this information is tied to specific resources and institutions. Currently there is no system in place that aggregates and interrelates those descriptions.

Leveraging the new standard Encoded Archival Context-Corporate Bodies, Persons, and Families (EAC-CPF), the SNAC Project will use digital technology to "unlock" descriptions of people from finding aids and link them together in exciting new ways.

On the Prototype page you will find the following description:

While many of the names found in finding aids have been carefully constructed, frequently in consultation with LCNAF, many other names present extraction and matching challenges. For example, many personal names are in direct rather than indirect (or catalog entry) order. Life dates, if present, sometimes appear in parentheses or brackets. Numerous names sometimes appear in the same <persname>, <corpname>, or <famname>. Many names are incorrectly tagged, for example, a personal name tagged as a .

We will continue to refine the extraction and matching algorithms over the course of the project, but it is anticipated that it will only be possible to address some problems through manual editing, perhaps using “professional crowd sourcing.”

While the project is still a prototype, it occurs to me that it would make a handy source of identifiers.

Try:

Or one of the many others you will find at: Find Corporate, Personal, and Family Archival Context Records.

OK, now I have a question for you: All of the foregoing also appear in Wikipedia.

For your comparison:

If you could choose only one identifier for a subject, would you choose the SNAC or the Wikipedia links?

I ask because some semantic approaches take a “one ring” approach to identification. Ignoring the existence of multiple identifiers, even URL identifiers for the same subjects.

Of course, you already know that with topic maps you can have multiple identifiers for any subject.

In CTM syntax:

bush-vannevar
  http://socialarchive.iath.virginia.edu/xtf/view?docId=bush-vannevar-1890-1974-cr.xml ;
  http://en.wikipedia.org/wiki/Vannevar_Bush ;
  - "Vannevar Bush" ;
  - varname: "Bush, Vannevar, 1890-1974" ;
  - varname: "Bush, Vannevar, 1890-" .

Which of course means that if I want to make a statement about the webpage for Vannevar Bush at Wikipedia, I can do so without any confusion:

wikipedia-vannevar-bush
  = http://en.wikipedia.org/wiki/Vannevar_Bush ;
  descr: "URL as subject locator." .

Or I can comment on a page at SNAC and map additional information to it. And you will always know if I am using the URL as an identifier or to point you towards a subject.

January 7, 2012

Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting

Filed under: Graphs,Law,Networks — Patrick Durusau @ 3:56 pm

Network Analysis and Law: Introductory Tutorial @ Jurix 2011 Meeting

Slides from a tutorial given by Daniel Martin Katz at the Jurix 2011 meeting.

Runs 317 slides but is awash in links to resources and software.

You will either learn a lot about network analysis or if you already know network analysis, you will be entertained and informed.

Saw it referenced at: http://computationallegalstudies.com/

January 4, 2012

Big Brother’s Name is…

Filed under: Marketing,Networks,Social Media,Social Networks — Patrick Durusau @ 7:09 am

not the FBI, CIA, Interpol, Mossad, NSA or any other government agency.

Walmart all but claims that name at: Social Genome.

From the webpage:

In a sense, the social world – all the millions and billions of tweets, Facebook messages, blog postings, YouTube videos, and more – is a living organism itself, constantly pulsating and evolving. The Social Genome is the genome of this organism, distilling it to the most essential aspects.

At the labs, we have spent the past few years building and maintaining the Social Genome itself. We do this using public data on the Web, proprietary data, and a lot of social media. From such data we identify interesting entities and relationships, extract them, augment them with as much information as we can find, then add them to the Social Genome.

For example, when Susan Boyle was first mentioned on the Web, we quickly detected that she was becoming an interesting person in the world of social media. So we added her to the Social Genome, then monitored social media to collect more information about her. Her appearances became events, and the bigger events were added to the Social Genome as well. As another example, when a new coffee maker was mentioned on the Web, we detected and added it to the Social Genome. We strive to keep the Social Genome up to date. For example, we typically detect and add information from a tweet into the Social Genome within two seconds, from the moment the tweet arrives in our labs.

As a result of our effort, the Social Genome is a vast, constantly changing, up-to-date knowledge base, with hundreds of millions of entities and relationships. We then use the Social Genome to perform semantic analysis of social media, and to power a broad array of e-commerce applications. For example, if a user never uses the word "coffee", but has mentioned many gourmet coffee brands (such as "Kopi Luwak") in his tweets, we can use the Social Genome to detect the brands, and infer that he is interested in gourmet coffee. As another example, using the Social Genome, we may find that a user frequently mentions movies in her tweets. As a result, when she tweeted "I love salt!", we can infer that she is probably talking about the movie "salt", not the condiment (both of which appear as entities in the Social Genome).

Two seconds after you hit “send” on your tweet, it has been stripped, analyzed and added to the Social Genome at WalMart. For every tweet. Plus other data.

How should we respond to this news?

One response is to trust that WalMart, and whoever it sells this data trove to, will use the information to enhance your shopping experience and help you achieve greater fulfilment by balancing shopping against your credit limit.

Another response is to ask for legislation to attempt regulation of a multi-national corporation that is larger than many governments.

Another response is to hold sit-ins and social consciousness raising events at WalMart locations.

My suggestion? One good turn deserves another.

WalMart is owned by someone. WalMart has a board of directors. WalMart has corporate officers. WalMart has managers, sales representatives, attorneys and advertising executives. All of whom have information footprints. Perhaps not as public as ours, but they exist. Why not gather up information on who is running WalMart? Fighting fire with fire, as they say. Publish that information so that regulators, stock brokers, divorce lawyers and others can have access to it.

Let’s welcome WalMart as “Little Big Brothers.”

January 3, 2012

Tribeforth

Filed under: Networks,Politics — Patrick Durusau @ 5:14 pm

Tribeforth

From the homepage:

Tribeforth Foundation is a group of people developing and promoting a collective intelligence computer system to assist in stimulating new solutions, ideas and connections on a global scale. The system as it is planned is not unlike an everyday wiki. The key difference is that millions can speak as one without losing a voice, and the software tunes the conversation into reason. This keeps us from getting lost in syntax and helps us to work with the real semantics.

Heavily rooted in collective intelligence and the semantic web (Web 3.0) we are organizing a collection of open source software and then extending them to create the most high tech discussion platform in human history. Available to anyone, anywhere as a basic standard of living.

A handful of powerful, fundamental principles and values guide us here at Tribeforth. We use these principles to create new tools for all of us.

A project built on the principles of self reflection, an echo of human ingenuity.

I don’t know if topic maps would be of assistance or not but when you are talking about making connections that persist across semantic boundaries (my words, not theirs), then you are going to need topic maps or something very similar.

I suppose I am a bit old school for the disclaimer:

THE TRIBEFORTH SYSTEM WILL NOT COLLECT ANY INFORMATION REGARDING MILITARY PERSONNEL, SYSTEMS, EQUIPMENT, PLANNING OR DEPLOYMENT. INCIDENTS REGARDING HUMAN RIGHTS VIOLATIONS ARE NOT SUBJECT TO THIS POLICY.

Existing solutions/structures are not going to go into the night quietly. That is a historical certainty. I would rather be prepared for the push back.

Voting Networks in the Danish Parliament [2004-2011]

Filed under: Graphs,Networks,R — Patrick Durusau @ 5:12 pm

Voting Networks in the Danish Parliament [2004-2011]

From the post:

One of my Christmas presents was the book Beautiful Visualization. Chapter 8 by Andrew Odewahn is a very nice piece on visualizing the U.S. Senate social graph. Odewahn basically builds an affinity network, where ties represent whether two senators have voted in the same manner during a given time period. The rules for creating the network are nicely broken down into the following steps:

  1. Nodes represent senators
  2. Nodes are colored according to party affiliation
  3. Nodes are connected with an edge if two senators voted together more than 65% of the time during a given timeframe

Based on the above rules Odewahn builds a series of interesting graphs, showing that there are a few consistently bipartisan senators on both sides in almost every session of the Congress.
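Odewahn's construction is easy to reproduce on any roll-call matrix. A minimal sketch with toy vote data (four members, six votes, the same 65% threshold):

import networkx as nx
from itertools import combinations

# Toy roll-call data: member -> (party, votes), 1 = yes, 0 = no.
votes = {
    "A": ("red",  [1, 1, 0, 1, 1, 0]),
    "B": ("red",  [1, 1, 0, 1, 0, 0]),
    "C": ("blue", [0, 0, 1, 0, 1, 1]),
    "D": ("blue", [0, 0, 1, 1, 1, 1]),
}

G = nx.Graph()
for member, (party, _) in votes.items():
    G.add_node(member, party=party)          # rules 1 and 2: one node per member, party as color

for (m1, (_, v1)), (m2, (_, v2)) in combinations(votes.items(), 2):
    agreement = sum(a == b for a, b in zip(v1, v2)) / len(v1)
    if agreement > 0.65:                     # rule 3: edge if they voted together > 65% of the time
        G.add_edge(m1, m2, weight=agreement)

print(G.edges(data=True))

With real roll-call data the interesting part is watching how the picture changes as you move the threshold or the time window.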

But rather than just grousing about American politics (don’t get me started), the author produces voting network graphs of the Danish Parliament!

I leave it for you to decide if the results signal hope or despair. 😉

December 29, 2011

Backbone of the flavor network

Filed under: Food,Graphs,Networks — Patrick Durusau @ 9:12 pm

Backbone of the flavor network by Nathan Yau at FlowingData.

From the post:

Food flavors across cultures and geography vary a lot. Some cuisines use a lot of scallion and ginger, whereas another might use a lot of onion and butter. Then again, everyone seems to use garlic. Yong-Yeol Ahn, et al. took a closer look at what makes food taste different, breaking ingredients into flavor compounds and examining what the ingredients had in common. A flavor network was the result:

Each node denotes an ingredient, the node color indicates food category, and node size reflects the ingredient prevalence in recipes. Two ingredients are connected if they share a significant number of flavor compounds, link thickness representing the number of shared compounds between the two ingredients. Adjacent links are bundled to reduce the clutter.

Mushrooms and liver are on the edges, out on their lonesome.

You really need to see the graph/network with the post.

I am not ready to throw over my Julia Child’s cookbook in favor of using it to create recipes but it is an impressive piece of work.

It could certainly figure as data merged into a recipe topic map to explain (or suggest) possible substitutions of ingredients.
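The construction behind the figure is simple enough to mimic for such an experiment. A minimal sketch with invented ingredient-to-compound sets (not the Ahn et al. data):

import networkx as nx
from itertools import combinations

# Invented flavor-compound sets, standing in for the paper's data.
compounds = {
    "garlic":   {"allicin", "diallyl_disulfide", "sulfide_x"},
    "onion":    {"diallyl_disulfide", "sulfide_x", "propanethial"},
    "scallion": {"diallyl_disulfide", "propanethial"},
    "ginger":   {"gingerol", "zingerone"},
}

G = nx.Graph()
for ingredient in compounds:
    G.add_node(ingredient)                   # node size would map to prevalence in recipes

for a, b in combinations(compounds, 2):
    shared = compounds[a] & compounds[b]
    if shared:                               # connect only ingredients that share compounds
        G.add_edge(a, b, weight=len(shared))  # link thickness = number of shared compounds

print(G.edges(data=True))

Edges with large weights are the obvious substitution candidates to surface in the topic map.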

Any foodies in the house?

December 26, 2011

Hive Plots

Filed under: Hive Plots,Networks,Visualization — Patrick Durusau @ 8:24 pm

Hive Plots

From the website:

Hive plots – for the impatient

The hive plot is a rational visualization method for drawing networks. Nodes are mapped to and positioned on radially distributed linear axes – this mapping is based on network structural properties. Edges are drawn as curved links. Simple and interpretable.

The purpose of the hive plot is to establish a new baseline for visualization of large networks – a method that is both general and tunable and useful as a starting point in visually exploring network structure.

You will really have to visit the link to properly experience hive plots. No description on my part would really be adequate.
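If you want a feel for the layout rule itself (not the authors' hive plot software), the core step is small: pick an axis for each node from a structural property and a position along that axis from another. A rough sketch, leaving the curved edges to whatever plotting library you prefer:

import math
import networkx as nx

G = nx.karate_club_graph()
degrees = dict(G.degree())
max_deg = max(degrees.values())

def axis_for(node):
    """Assign an axis by a structural property: low, medium or high degree."""
    d = degrees[node]
    return 0 if d <= 3 else 1 if d <= 8 else 2

positions = {}
for node in G.nodes:
    axis = axis_for(node)
    angle = axis * 2 * math.pi / 3                 # three axes, 120 degrees apart
    radius = 0.2 + 0.8 * degrees[node] / max_deg   # position along the axis by degree
    positions[node] = (radius * math.cos(angle), radius * math.sin(angle))

# Edges would then be drawn as curves (e.g. Bezier arcs) between the axes.
print(positions[0], positions[33])

The degree cutoffs and the choice of three axes are arbitrary on my part; the hive plot site discusses how to choose the mappings from network structural properties.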

December 25, 2011

Arnetminer

Filed under: Networks,Social Networks — Patrick Durusau @ 6:07 pm

Arnetminer: search and mining of academic social networks

From the webpage:

Arnetminer (arnetminer.org) aims to provide comprehensive search and mining services for researcher social networks. In this system, we focus on: (1) creating a semantic-based profile for each researcher by extracting information from the distributed Web; (2) integrating academic data (e.g., the bibliographic data and the researcher profiles) from multiple sources; (3) accurately searching the heterogeneous network; (4) analyzing and discovering interesting patterns from the built researcher social network. The main search and analysis functions in arnetminer include:

  • Profile search: input a researcher name (e.g., Jie Tang), the system will return the semantic-based profile created for the researcher using information extraction techniques. In the profile page, the extracted and integrated information include: contact information, photo, citation statistics, academic achievement evaluation, (temporal) research interest, educational history, personal social graph, research funding (currently only US and CN), and publication records (including citation information, and the papers are automatically assigned to several different domains).
  • Expert finding: input a query (e.g., data mining), the system will return experts on this topic. In addition, the system will suggest the top conference and the top ranked papers on this topic. There are two ranking algorithms, VSM and ACT. The former is similar to the conventional language model and the latter is based on our Author-Conference-Topic (ACT) model. Users can also provide feedback on the search results.
  • Conference analysis: input a conference name (e.g., KDD), the system returns who are the most active researchers on this conference, and the top-ranked papers.
  • Course search: input a query (e.g., data mining), the system will tell you who are teaching courses relevant to the query.
  • Associate search: input two researcher names, the system returns the association path between the two researchers. The function is based on the well-known "six-degree separation" theory.
  • Sub-graph search: input a query (e.g., data mining), the system first tells you what topics are relevant to the query (e.g., five topics "Data mining", "XML Data", "Data Mining / Query Processing", "Web Data / Database design", "Web Mining" are relevant), and then display the most important sub-graph discovered on each relevant topic, augmented with a summary for the sub-graph.
  • Topic browser: based on our Author-Conference-Topic (ACT) model, we automatically discover 200 hot topics from the publications. For each topic, we automatically assign a label to represent its meanings. Furthermore, the browser presents the most active researchers, the most relevant conferences/papers, and the evolution trend of the topic is discovered.
  • Academic ranks: we define 8 measures to evaluate the researcher's achievement. The measures include "H-index", "Citation", "Uptrend", "Activity", "Longevity", "Diversity", "Sociability", "New Star". For each measure, we output a ranking list in different domains. For example, one can search for who has the highest citation number in the "data mining" domain.
  • User management: one can register as a user to: (1) modify the extracted profile information; (2) provide feedback on the search results; (3) follow researchers in arnetminer; (4) create an arnetminer page (which can be used to advertise confs/workshops, or recruit students).

Arnetminer.org has been in operation on the internet for more than three years. Currently, the academic network includes more than 6,000 conferences, 3,200,000 publications, 700,000 researcher profiles. The system attracts users from more than 200 countries and receives >200,000 access logs per day. The top five countries where users come from are United States, China, Germany, India, and United Kingdom.

A rich data source and a way to explore who’s who in particular domains.

December 18, 2011

Arts, Humanities, and Complex Networks…

Filed under: Conferences,Graphs,Networks — Patrick Durusau @ 8:46 pm

Arts, Humanities, and Complex Networks – 3rd Leonardo satellite symposium at NetSci2012

Dates:

The deadline for applications is March 16, 2012.
Decisions for acceptance will be sent out by March 30, 2012.
The symposium will take place at Northwestern University near Chicago, IL on the shores of Lake Michigan on June 19, 2012.

From the webpage:

We are pleased to announce the third Leonardo satellite symposium at NetSci2012 on Arts, Humanities, and Complex Networks. The aim of the symposium is to foster cross-disciplinary research on complex systems within or with the help of arts and humanities.

The symposium will highlight arts and humanities as an interesting source of data, where the combined experience of arts, humanities research, and natural science makes a huge difference in overcoming the limitations of artificially segregated communities of practice.

Furthermore, the symposium will focus on striking examples, where artists and humanities researchers make an impact within the natural sciences. By bringing together network scientists and specialists from the arts and humanities we strive for a better understanding of networks and their visualizations in general.

The overall mission is to bring together pioneer work, leveraging previously unused potential by developing the right questions, methods, and tools, as well as dealing with problems of information accuracy and incompleteness. Running parallel to the NetSci2012 conference, the symposium will also provide a unique opportunity to mingle with leading researchers and practitioners of complex network science, potentially sparking fruitful collaborations.

In addition to keynotes and interdisciplinary discussion, we are looking for a number of contributed talks. Selected papers will be published in print in a Special Section of Leonardo Journal (MIT Press), as well as online in Leonardo Transactions.

For previous edition papers and video presentations please visit the following URLs:
2010 URL: http://artshumanities.netsci2010.net
2011 URL: http://artshumanities.netsci2011.net

This sounds seriously cool!

You do realize the graphs and networks of the "hard" sciences are impoverished when compared to the networks encountered on a daily basis by humanists? In the humanities, some of the nodes and edges can only be deduced from their impact on other nodes and edges. And they are themselves subject to being influenced by other unseen and perhaps unknowable nodes and edges.

Still, it can be instructive to simplify a graph from the humanities for representation by hard science techniques. At least we can gain some sense of what has to be thrown away from the humanities side.

December 14, 2011

Network Graph Visualizer

Filed under: Collaboration,Networks,Software,Visualization — Patrick Durusau @ 7:44 pm

Network Graph Visualizer

I ran across this at Github while tracking the progress of a project.

Although old hat (2008), I thought it worth pointing out as a graph that has one purpose, to keep developers informed of each other's activities in a collaborative environment, and it does that very well.

I suspect there is a lesson there for topic map software (or even software in general).

December 2, 2011

Netwitness

Filed under: Networks,Security — Patrick Durusau @ 4:56 pm

Netwitness

I was researching the InfiniteGraph post when the splash on the Netwitness homepage, "Know Everything Answer Anything: The Revolutionary Approach to Network Monitoring", caught my eye.

You really need to watch the intro videos for NetWitness Investigator. I thought it was impressive but then I used command line tools more than a decade ago for similar purposes. And the art has changed a lot since then.

Although I think the interface is “busy,” I did like the idea of blurring the separation of querying and navigation.

That is an idea we would do well to think about for topic maps. (Would depend on the domain.)

The NetWitness Investigator is a free download so I am going to play with a copy. (Well, on my Windows box b/c Linux is available for the commercial version.)

