Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

May 12, 2014

High-Performance Browser Networking

Filed under: Networks,Topic Map Software,WWW — Patrick Durusau @ 10:42 am

High-Performance Browser Networking by Ilya Grigorik.

From the foreword:

In High Performance Browser Networking, Ilya explains many whys of networking: Why latency is the performance bottleneck. Why TCP isn’t always the best transport mechanism and UDP might be your better choice. Why reusing connections is a critical optimization. He then goes even further by providing specific actions for improving networking performance. Want to reduce latency? Terminate sessions at a server closer to the client. Want to increase connection reuse? Enable connection keep-alive. The combination of understanding what to do and why it matters turns this knowledge into action.

Ilya explains the foundation of networking and builds on that to introduce the latest advances in protocols and browsers. The benefits of HTTP 2.0 are explained. XHR is reviewed and its limitations motivate the introduction of Cross-Origin Resource Sharing. Server-Sent Events, WebSockets, and WebRTC are also covered, bringing us up to date on the latest in browser networking.

Viewing the foundation and latest advances in networking from the perspective of performance is what ties the book together. Performance is the context that helps us see the why of networking and translate that into how it affects our website and our users. It transforms abstract specifications into tools that we can wield to optimize our websites and create the best user experience possible. That’s important. That’s why you should read this book.

Network latency may be responsible for a non-responsive app but can you guess who the user is going to blame?

Right in one, the app!

“Not my fault” isn’t a line item on any bank deposit form.

You or someone on your team needs to be tasked with performance, including reading High-Performance Browser Networking.

I first saw this in a tweet by Jonas Bonér.

May 10, 2014

Self organising hypothesis networks

Filed under: Medical Informatics,Networks,Self-Organizing — Patrick Durusau @ 3:51 pm

Self organising hypothesis networks: a new approach for representing and structuring SAR knowledge by Thierry Hanser, et al. (Journal of Cheminformatics 2014, 6:21)

Abstract:

Background

Combining different sources of knowledge to build improved structure activity relationship models is not easy owing to the variety of knowledge formats and the absence of a common framework to interoperate between learning techniques. Most of the current approaches address this problem by using consensus models that operate at the prediction level. We explore the possibility to directly combine these sources at the knowledge level, with the aim to harvest potentially increased synergy at an earlier stage. Our goal is to design a general methodology to facilitate knowledge discovery and produce accurate and interpretable models.

Results

To combine models at the knowledge level, we propose to decouple the learning phase from the knowledge application phase using a pivot representation (lingua franca) based on the concept of hypothesis. A hypothesis is a simple and interpretable knowledge unit. Regardless of its origin, knowledge is broken down into a collection of hypotheses. These hypotheses are subsequently organised into a hierarchical network. This unification permits to combine different sources of knowledge into a common formalised framework. The approach allows us to create a synergistic system between different forms of knowledge and new algorithms can be applied to leverage this unified model. This first article focuses on the general principle of the Self Organising Hypothesis Network (SOHN) approach in the context of binary classification problems along with an illustrative application to the prediction of mutagenicity.

Conclusion

It is possible to represent knowledge in the unified form of a hypothesis network allowing interpretable predictions with performances comparable to mainstream machine learning techniques. This new approach offers the potential to combine knowledge from different sources into a common framework in which high level reasoning and meta-learning can be applied; these latter perspectives will be explored in future work.

One interesting feature of this publication is a graphic abstract.

Assuming one could control the length of the graphic abstracts, that would be an interesting feature for conference papers.

What should be the icon for repeating old news before getting to the new stuff? 😉

Among a number of good points in this paper, see in particular:

  • Distinction between SOHN and “a Galois lattice used in Formal Concept
    Analysis [19] (FCA)” (at page 10).
  • Discussion of the transparency of this approach at page 21.

In a very real sense, announcing an answer to a medical question may be welcome, but it isn’t very informative. Nor will it enable others to advance the medical arts.

Are there other domains where answers are important, but how you arrived at an answer is equally important, if not more so?

May 2, 2014

Experimental CS – Networks

Filed under: Computer Science,Networks — Patrick Durusau @ 8:00 pm

Design and analysis of experiments in networks: Reducing bias from interference by Dean Eckles, Brian Karrer, and Johan Ugander.

Abstract:

Estimating the effects of interventions in networks is complicated when the units are interacting, such that the outcomes for one unit may depend on the treatment assignment and behavior of many or all other units (i.e., there is interference). When most or all units are in a single connected component, it is impossible to directly experimentally compare outcomes under two or more global treatment assignments since the network can only be observed under a single assignment. Familiar formalism, experimental designs, and analysis methods assume the absence of these interactions, and result in biased estimators of causal effects of interest. While some assumptions can lead to unbiased estimators, these assumptions are generally unrealistic, and we focus this work on realistic assumptions. Thus, in this work, we evaluate methods for designing and analyzing randomized experiments that aim to reduce this bias and thereby reduce overall error. In design, we consider the ability to perform random assignment to treatments that is correlated in the network, such as through graph cluster randomization. In analysis, we consider incorporating information about the treatment assignment of network neighbors. We prove sufficient conditions for bias reduction through both design and analysis in the presence of potentially global interference. Through simulations of the entire process of experimentation in networks, we measure the performance of these methods under varied network structure and varied social behaviors, finding substantial bias and error reductions. These improvements are largest for networks with more clustering and data generating processes with both stronger direct effects of the treatment and stronger interactions between units.

Deep sledding but that is to be expected as CS matures and abandons simplistic models, such as non-interaction between units in a network.

While I was reading the abstract, it occurred to me that merges that precipitate other merges could be said to cause interaction between topics.

Since the authors found error reduction in networks with as few as 1,000 vertices, you should not wait until you are building very large topic maps to take this paper into account.

March 25, 2014

Network Analysis and the Law:…

Filed under: Law,Networks — Patrick Durusau @ 7:35 pm

Network Analysis and the Law: Measuring the Legal Importance of Precedents at the U.S. Supreme Court by James H. Fowler, et al.

Abstract:

We construct the complete network of 26,681 majority opinions written by the U.S. Supreme Court and the cases that cite them from 1791 to 2005. We describe a method for using the patterns in citations within and across cases to create importance scores that identify the most legally relevant precedents in the network of Supreme Court law at any given point in time. Our measures are superior to existing network-based alternatives and, for example, offer information regarding case importance not evident in simple citation counts. We also demonstrate the validity of our measures by showing that they are strongly correlated with the future citation behavior of state courts, the U.S. Courts of Appeals, and the U.S. Supreme Court. In so doing, we show that network analysis is a viable way of measuring how central a case is to law at the Court and suggest that it can be used to measure other legal concepts.

Danny Bickson pointed this paper out in: Spotlight: Ravel Law – introducing graph analytics to law research.

Interesting paper but remember that models are just that, models. Subsets of a more complex reality.

For example, I don’t know of any models of the Supreme Court (U.S.) that claim to be able to predict The switch in time that saved nine. If you don’t know the story, it makes really interesting reading. I won’t spoil the surprise but you will come away feeling the law is less “fixed” than you may have otherwise thought.

I commend this paper to you but if you need legal advice, it’s best to consult an attorney and not a model.

February 20, 2014

Mapping Twitter Topic Networks:…

Filed under: Networks,Politics,Skepticism,Tweets — Patrick Durusau @ 9:13 pm

Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters by Marc A. Smith, Lee Rainie, Ben Shneiderman and Itai Himelboim.

From the post:

Conversations on Twitter create networks with identifiable contours as people reply to and mention one another in their tweets. These conversational structures differ, depending on the subject and the people driving the conversation. Six structures are regularly observed: divided, unified, fragmented, clustered, and inward and outward hub and spoke structures. These are created as individuals choose whom to reply to or mention in their Twitter messages and the structures tell a story about the nature of the conversation.

If a topic is political, it is common to see two separate, polarized crowds take shape. They form two distinct discussion groups that mostly do not interact with each other. Frequently these are recognizably liberal or conservative groups. The participants within each separate group commonly mention very different collections of website URLs and use distinct hashtags and words. The split is clearly evident in many highly controversial discussions: people in clusters that we identified as liberal used URLs for mainstream news websites, while groups we identified as conservative used links to conservative news websites and commentary sources. At the center of each group are discussion leaders, the prominent people who are widely replied to or mentioned in the discussion. In polarized discussions, each group links to a different set of influential people or organizations that can be found at the center of each conversation cluster.

While these polarized crowds are common in political conversations on Twitter, it is important to remember that the people who take the time to post and talk about political issues on Twitter are a special group. Unlike many other Twitter members, they pay attention to issues, politicians, and political news, so their conversations are not representative of the views of the full Twitterverse. Moreover, Twitter users are only 18% of internet users and 14% of the overall adult population. Their demographic profile is not reflective of the full population. Additionally, other work by the Pew Research Center has shown that tweeters’ reactions to events are often at odds with overall public opinion— sometimes being more liberal, but not always. Finally, forthcoming survey findings from Pew Research will explore the relatively modest size of the social networking population who exchange political content in their network.

Great study on political networks but all the more interesting for introducing an element of sanity into discussions about Twitter.

At a minimum, Twitter having 18% of all Internet users and 14% of the overall adult population casts serious doubt on metrics using Twitter to rate software popularity. (“It’s all we have” is a pretty lame excuse for using bad metrics.)

Not to say it isn’t important to mine Twitter data for what content it holds but at the same time to remember Twitter isn’t the world.

I first saw this at Mapping Twitter Topic Networks: From Polarized Crowds to Community Clusters by FullTextReports.

January 8, 2014

BIIIG:…

Filed under: BI,Graphs,Neo4j,Networks — Patrick Durusau @ 8:03 pm

BIIIG: Enabling Business Intelligence with Integrated Instance Graphs by André Petermann, Martin Junghanns, Robert Müller, and Erhard Rahm.

Abstract:

We propose a new graph-based framework for business intelligence called BIIIG supporting the flexible evaluation of relationships between data instances. It builds on the broad availability of interconnected objects in existing business information systems. Our approach extracts such interconnected data from multiple sources and integrates them into an integrated instance graph. To support specific analytic goals, we extract subgraphs from this integrated instance graph representing executed business activities with all their data traces and involved master data. We provide an overview of the BIIIG approach and describe its main steps. We also present initial results from an evaluation with real ERP data.

A very interesting paper: on the one hand it talks about merging data from heterogeneous data sets, and at the same time it claims to be using Neo4j.

In case you didn’t know, Neo4j enforces normalization and doesn’t have a concept of merging nodes. (True, Cypher has a “merge” operator but it doesn’t “merge” nodes in any meaningful sense of the word. Either a node is matched or a new node is created. Not how I interpret “merge.”)
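
A minimal sketch of that match-or-create behavior, using the Neo4j Python driver (the URI, credentials, label, and property names below are placeholders for illustration):

    from neo4j import GraphDatabase

    # Assumed local instance; adjust the URI and credentials for your setup.
    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))

    with driver.session() as session:
        # First run: no matching node exists, so MERGE *creates* one.
        session.run("MERGE (e:Employee {empl_number: 42}) "
                    "SET e.name = 'A. Smith'")
        # Second run: the pattern matches, so MERGE binds the existing node
        # and SET adds properties to it. At no point are two pre-existing
        # nodes fused into one.
        session.run("MERGE (e:Employee {empl_number: 42}) "
                    "SET e.degree = 'PhD', e.phone = '555-0100'")

    driver.close()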

It took more than one read but in puzzling over:

For integrated objects we can merge the properties from the sources. For the example in Fig. 2, we can combine employees objects with CIT.employees.erp_empl_number = ERP.EmplyeeTable.number and merge their properties from both sources (name, degree, dob, address, phone).

I realized the authors were producing a series of graphs where only the final version of the graph has the “merged” nodes. If you notice, the nodes are created first and then populated with associations, which resolves the question of using different pointers from the original sources.

The authors also point out that Neo4j cannot manage sets of graphs. I had overlooked that point. That is a fairly severe limitation.

Do spend some time at the Database Group Leipzig. There are several other recent papers that look very interesting.

Introducing mangal,…

Filed under: Environment,Graphs,Networks,R,Taxonomy — Patrick Durusau @ 9:37 am

Introducing mangal, a database for ecological networks

From the post:

Working with data on ecological networks is usually a huge mess. Most of the time, what you have is a series of matrices with 0 and 1, and in the best cases, another file with some associated metadata. The other issue is that, simply put, data on ecological networks are hard to get. The Interaction Web Database has some, but it's not as actively maintained as it should be, and the data are not standardized in any way. When you need to pull a lot of networks to compare them, it means that you need to go through a long, tedious, and error-prone process of cleaning and preparing the data. It should not be that way, and that is the particular problem I've been trying to solve since this spring.

About a year ago, I discussed why we should have a common language to represent interaction networks. So with this idea in mind, and with great feedback from colleagues, I assembled a series of JSON schemes to represent networks, in a way that will allow programmatic interaction with the data. And I'm now super glad to announce that I am looking for beta-testers, before I release the tool in a formal way. This post is the first part of a series of two or three posts, which will give information about the project, how to interact with the database, and how to contribute data. I'll probably try to write a few use-cases, but if reading these posts inspires you, feel free to suggest some!

So what is that about?

mangal (another word for a mangrove, and a type of barbecue) is a way to represent and interact with networks in a way that is (i) relatively easy and (ii) allows for powerful analyses. It's built around a data format, i.e. a common language to represent ecological networks. You can have an overview of the data format on the website. The data format was conceived with two ideas in mind. First, it must make sense from an ecological point of view. Second, it must be easy to use to exchange data, send them to a database, and get them through APIs. Going on a website to download a text file (or an Excel one) should be a thing of the past, and the data format is built around the idea that everything should be done in a programmatic way.

Very importantly, the data specification explains how data should be formatted when they are exchanged, not when they are used. The R package, notably, uses igraph to manipulate networks. It means that anyone with a database of ecological networks can write an API to expose these data in the mangal format, and in turn, anyone can access the data with the URL of the API as the only information.

Because everyone uses R, as I've mentioned above, we are also releasing an R package (unimaginatively titled rmangal). You can get it from GitHub, and we'll see in a minute how to install it until it is released on CRAN. Most of these posts will deal with how to use the R package, and what can be done with it. Ideally, you won't need to go on the website at all to interact with the data (but just to make sure you do, the website has some nice eye-candy, with clickable maps and animated networks).

An excellent opportunity to become acquainted with the iGraph package for R (299 pages), iGraph for Python (394 pages), and the iGraph C Library (812 pages).

Unfortunately, iGraph does not support hypergraphs (multigraphs with parallel edges it can handle, but not hyperedges).

December 13, 2013

Immersion Reveals…

Filed under: Graphs,Networks,Social Networks — Patrick Durusau @ 4:24 pm

Immersion Reveals How People are Connected via Email by Andrew Vande Moere.

From the post:

Immersion [mit.edu] is a quite revealing visualization tool of which the NSA – or your own national security agency – can only be jealous of… Developed by MIT students Daniel Smilkov, Deepak Jagdish and César Hidalgo, Immersion generates a time-varying network visualization of all your email contacts, based on how you historically communicated with them.

Immersion is able to aggregate and analyze the “From”, “To”, “Cc” and “Timestamp” data of all the messages in any (authorized) Gmail, MS Exchange or Yahoo email account. It then filters out the ‘collaborators’ – people from whom one has received, and sent, at least 3 email messages from, and to.

Remember what I said about IT making people equal?

Access someone’s email account (accounts are hacked often enough) and you can have a good idea of their social network.

Or I assume you can run it across mailing list archives with a diluted result for any particular person.
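
For the curious, a rough sketch of that collaborator filter in Python, run over a local mbox archive rather than a live account (the file name and owner address are placeholders; the 3-message threshold follows the post):

    import mailbox
    from collections import Counter
    from email.utils import getaddresses

    ME = "me@example.com"          # placeholder for the account owner
    sent, received = Counter(), Counter()

    for msg in mailbox.mbox("archive.mbox"):
        senders = [addr.lower() for _, addr in
                   getaddresses(msg.get_all("From", []))]
        recipients = [addr.lower() for _, addr in
                      getaddresses(msg.get_all("To", []) +
                                   msg.get_all("Cc", []))]
        if ME in senders:
            for r in recipients:
                sent[r] += 1
        elif ME in recipients:
            for s in senders:
                received[s] += 1

    # "Collaborators": at least 3 messages in each direction, per the post.
    collaborators = {a for a in sent
                     if sent[a] >= 3 and received[a] >= 3}
    print(sorted(collaborators))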

December 1, 2013

Computational Social Science

Filed under: Graphs,Networks,Social Networks,Social Sciences — Patrick Durusau @ 9:26 pm

Georgia Tech CS 8803-CSS: Computational Social Science by Jacob Eisenstein

From the webpage:

The principle aim for this graduate seminar is to develop a broad understanding of the emerging cross-disciplinary field of Computational Social Science. This includes:

  • Methodological foundations in network and content analysis: understanding the mathematical basis for these methods, as well as their practical application to real data.
  • Best practices and limitations of observational studies.
  • Applications to political science, sociolinguistics, sociology, psychology, economics, and public health.

Consider this as an antidote to the “everything’s a graph, so let’s go” type approach.

Useful application of graph or network analysis requires a bit more than enthusiasm for graphs.

Just scanning the syllabus, devoting serious time to the readings will give you a good start on the skills required to be useful with network analysis.

I first saw this in a tweet by Jacob Eisenstein.

November 13, 2013

PowerLyra

Filed under: GraphLab,Graphs,Networks — Patrick Durusau @ 8:36 pm

PowerLyra by Danny Bickson.

Danny has posted an email from Rong Chen, Shanghai Jiao Tong University, which reads in part:

We argued that the skewed distribution in natural graphs also calls for differentiated processing of high-degree and low-degree vertices. We then developed PowerLyra, a new graph analytics engine that embraces the best of both worlds of existing frameworks, by dynamically applying different computation and partition strategies for different vertices. PowerLyra uses Pregel/GraphLab-like computation models to process low-degree vertices to minimize computation, communication and synchronization overhead, and uses a PowerGraph-like computation model to process high-degree vertices to reduce load imbalance and contention. To seamlessly support all PowerLyra applications, PowerLyra further introduces an adaptive unidirectional graph communication.

PowerLyra additionally proposes a new hybrid graph cut algorithm that embraces the best of both worlds in edge-cut and vertex-cut, adopting edge-cut for low-degree vertices and vertex-cut for high-degree vertices. Theoretical analysis shows that the expected replication factor of random hybrid-cut is always better than both random vertex-cut and edge-cut. For skewed power-law graphs, empirical validation shows that random hybrid-cut also decreases the replication factor of the current default heuristic vertex-cut (Grid) from 5.76X to 3.59X and from 18.54X to 6.76X for constants 2.2 and 1.8 of the synthetic graph respectively. We also develop a new distributed greedy heuristic hybrid-cut algorithm, namely Ginger, inspired by Fennel (a greedy streaming edge-cut algorithm for a single machine). Compared to Grid vertex-cut, Ginger can reduce the replication factor by up to 2.92X (from 2.03X) and 3.11X (from 1.26X) for synthetic and real-world graphs respectively.

Finally, PowerLyra adopts a locality-conscious data layout optimization in the graph ingress phase to mitigate poor locality during vertex communication. We argue that a small increase in graph ingress time (less than 10% for power-law graphs and 5% for real-world graphs) is worthwhile for an often larger speedup in execution time (usually more than 10%, and 21% for the Twitter follow graph).

The website of PowerLyra: http://ipads.se.sjtu.edu.cn/projects/powerlyra.html
….
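
For intuition, here is a toy sketch of the random hybrid-cut idea in Python (the threshold and hash-based placement are illustrative only, not PowerLyra's actual code):

    def hybrid_cut(edges, in_degree, num_machines, threshold=100):
        """Assign each directed edge (src, dst) to a machine.

        Low-degree targets: edge-cut style, all in-edges of dst co-located.
        High-degree targets: vertex-cut style, in-edges spread by source.
        """
        placement = {}
        for src, dst in edges:
            if in_degree[dst] <= threshold:
                placement[(src, dst)] = hash(dst) % num_machines
            else:
                placement[(src, dst)] = hash(src) % num_machines
        return placement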

Pass this along if you are interested in cutting edge graph software development.

November 9, 2013

Analyzing Social Media Networks using NodeXL [D.C., Nov. 13th]

Filed under: Graphs,Microsoft,Networks,NodeXL,Visualization — Patrick Durusau @ 8:22 pm

Analyzing Social Media Networks using NodeXL by Marc Smith.

From the post:

I am excited to have the opportunity to present a NodeXL workshop with Data Community DC on November 13th at 6pm in Washington, D.C.

In this session I will describe the ways NodeXL can simplify the process of collecting, storing, analyzing, visualizing and publishing reports about connected structures. NodeXL supports the exploration of social media with import features that pull data from personal email indexes on the desktop, Twitter, Flickr, Youtube, Facebook and WWW hyperlinks.

NodeXL allows non-programmers to quickly generate useful network statistics and metrics and create visualizations of network graphs. Filtering and display attributes can be used to highlight important structures in the network. Innovative automated layouts make creating quality network visualizations simple and quick.

Apologies for the short notice but I just saw the workshop announcement today.

If you are in the D.C. area and have any interest in graphs or visualization at all, you need to catch this presentation.

If you don’t believe me, take a look at the NodeXL gallery that Marc mentions in his post:

http://nodexlgraphgallery.org/Pages/Default.aspx

Putting graph visualization into the hands of users?

October 26, 2013

Sylva

Filed under: Graphs,Networks — Patrick Durusau @ 2:40 pm

Sylva: A Relaxed Schema Graph Database Management System

From the webpage:

Sylva [from the Old Spanish “silva”, a Renaissance type of book to organize knowledge] is a free, easy-to-use, flexible, and scalable database management system that helps you collect, collaborate, visualize and query large data sets.

No programming knowledge is required to use Sylva!

The data sets are stored according to the schemas (graphs) created by the user, and can be visualized as networks and as lists. Researchers have absolute freedom to grant different permissions to collaborators and to import and export their schemas and data sets.

Start using Sylva as soon as it becomes available!

Not available, yet, but putting it on a watch list.

The splash page reads as though graph operations can be restricted by a schema.

That would be an interesting capability.

In particular if workflow could be modeled by a schema.

September 27, 2013

Meet Node-RED…

Filed under: Networks,Organic Programming,Programming — Patrick Durusau @ 3:09 pm

Meet Node-RED, an IBM project that fulfills the internet of things’ missing link by Stacey Higginbotham.

From the post:

If you play around with enough connected devices or hang out with enough people thinking about what it means to have 200 connected gizmos in your home, eventually you get to a pretty big elephant in the room: How the heck are you going to connect all this stuff? To a hub? To the internet? To each other?

It’s one thing to set a program to automate your lights/thermostat/whatever to go to a specific setting when you hit a button/lock your door/exit your home’s Wi-Fi network, but it’s quite another to have a truly intuitive and contextual experience in a connected home if you have to manually program it using IFTTT or a series of apps. Imagine if instead of popping a couple Hue Light Bulbs into your bedroom lamp, you brought home 30 or 40 for your entire home. That’s a lot of adding and setting preferences.

Organic programming: Just let it go

If you take this out of the residential setting and into a factory or office it’s magnified and even more daunting because of a variety of potential administrative tasks and permissions required. Luckily, there are several people thinking about this problem. Mike Kuniavsky, a principal in the innovation services group at PARC, first introduced me to this concept back in February and will likely touch on this in a few weeks at our Mobilize conference next month. He likens it to a more organic way of programming.

The basic idea is to program the internet of things much like you play a Sims-style video game — you set things up to perform in a way you think will work and then see what happens. Instead of programming an action, you’re programming behaviors and trends in a device or class of devices. Then you put them together, give them a direction and they figure out how to get there.

Over at IBM, a few engineers are actually building something that might be helpful in implementing such systems. It’s called node-RED and it’s a way to interject a layer of behaviors for devices using a visual interface. It’s built on top of node.js and is available over on github.

If you have ever seen the Eureka episode H.O.U.S.E. Rules, you will have serious doubts about the wisdom of “…then see what happens” with regard to your house. 😉

I wonder if this will be something truly different, like organic computing or a continuation of well known trends.

Early computers were programmed using switches but quickly migrated to assembler; few write assembler now and those chores are handled by compilers.

Some future compiler may accept examples of the “same” subject and decide on the most effective way to search and collate all the data for a given subject.

That will require a robust understanding of subject identity on the part of the compiler writers.

September 26, 2013

Time-varying social networks in a graph database…

Filed under: AutoComplete,Graphs,Neo4j,Networks,Social Networks,Time — Patrick Durusau @ 4:02 pm

Time-varying social networks in a graph database: a Neo4j use case by Ciro Cattuto, Marco Quaggiotto, André Panisson, and Alex Averbuch.

Abstract:

Representing and efficiently querying time-varying social network data is a central challenge that needs to be addressed in order to support a variety of emerging applications that leverage high-resolution records of human activities and interactions from mobile devices and wearable sensors. In order to support the needs of specific applications, as well as general tasks related to data curation, cleaning, linking, post-processing, and data analysis, data models and data stores are needed that afford efficient and scalable querying of the data. In particular, it is important to design solutions that allow rich queries that simultaneously involve the topology of the social network, temporal information on the presence and interactions of individual nodes, and node metadata. Here we introduce a data model for time-varying social network data that can be represented as a property graph in the Neo4j graph database. We use time-varying social network data collected by using wearable sensors and study the performance of real-world queries, pointing to strengths, weaknesses and challenges of the proposed approach.

A good start on modeling networks that vary based on time.

If the overhead sounds daunting, remember the graph data used here measured the proximity of actors every 20 seconds for three days.

Imagine if you added social connections between those actors: attendance at the same schools/conferences, co-authored papers, etc.

We are slowly losing our reliance on simplification of data and models to make them computationally tractable.
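
As a minimal sketch of the property-graph model the paper describes (my own illustrative schema, not the paper's exact model), timestamped contacts can be stored and queried by time window with the Neo4j Python driver:

    from neo4j import GraphDatabase

    driver = GraphDatabase.driver("bolt://localhost:7687",
                                  auth=("neo4j", "password"))

    with driver.session() as session:
        # One CONTACT edge per 20-second proximity reading.
        session.run(
            "MERGE (a:Person {id: $a}) MERGE (b:Person {id: $b}) "
            "CREATE (a)-[:CONTACT {start: $start, end: $end}]->(b)",
            a=1, b=2, start=1379951940, end=1379951960)

        # All contacts overlapping a given time window.
        result = session.run(
            "MATCH (a:Person)-[c:CONTACT]->(b:Person) "
            "WHERE c.start < $win_end AND c.end > $win_start "
            "RETURN a.id AS a, b.id AS b, c.start AS start, c.end AS end",
            win_start=1379950000, win_end=1379960000)
        for record in result:
            print(record["a"], record["b"], record["start"], record["end"])

    driver.close()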

September 25, 2013

Easier than Excel:…

Filed under: Excel,Gephi,Graphs,Networks,Social Networks — Patrick Durusau @ 4:59 pm

Easier than Excel: Social Network Analysis of DocGraph with Gephi by Janos G. Hajagos and Fred Trotter. (PDF)

From the session description:

The DocGraph dataset was released at Strata RX 2012. The dataset is the result of FOI request to CMS by healthcare data activist Fred Trotter (co-presenter). The dataset is minimal where each row consists of just three numbers: 2 healthcare provider identifiers and a weighting factor. By combining these three numbers with other publicly available information sources novel conclusions can be made about delivery of healthcare to Medicare members. As an example of this approach see: http://tripleweeds.tumblr.com/post/42989348374/visualizing-the-docgraph-for-wyoming-medicare-providers

The DocGraph dataset consists of over 49,685,810 relationships between 940,492 different Medicare providers. The complete dataset is too big for traditional tools, but useful subsets of the larger dataset can be analyzed with Gephi. Gephi is an open-source tool to visually explore and analyze graphs. This tutorial will teach participants how to use Gephi for social network analysis on the DocGraph dataset.

Outline of the tutorial:

Part 1: DocGraph and the network data model (30% of the time)

  • The DocGraph dataset
  • The raw data
  • Helper data (NPI associated data)
  • The graph / network data model
  • Nodes versus edges
  • How graph models are integral to social networking
  • Other Healthcare graph data sets

Part 2: Using Gephi to perform analysis (70% of the time)

  • Basic usage of Gephi
  • Saving and reading the GraphML format
  • Laying out edges and nodes of a graph
  • Navigating and exploring the graph
  • Generating graph metrics on the network
  • Filtering a subset of the graph
  • Producing the final output of the graph

Links from the last slide:

http://strata.oreilly.com/2012/11/docgraph-open-social-doctor-data.html (information)

https://github.com/jhajagos/DocGraph (code)

http://notonlydev.com/docgraph-data (open source $1 covers bandwidth fees)

https://groups.google.com/forum/#!forum/docgraph (mailing list)

Just in case you don’t have it bookmarked already: Gephi.

The type of workshop that makes an entire conference seem like lagniappe.

Just sorry I will have to appreciate it from afar.

Work through this one carefully. You will acquire useful skills doing so.
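
If you want a head start before working through the tutorial, a sketch along these lines (networkx assumed; "docgraph.csv" is a placeholder for a subset of the raw three-column file) turns DocGraph rows into a GraphML file Gephi can open:

    import csv
    import networkx as nx

    G = nx.DiGraph()
    with open("docgraph.csv", newline="") as f:
        # Each row: referring NPI, referred-to NPI, weight.
        for from_npi, to_npi, weight in csv.reader(f):
            G.add_edge(from_npi, to_npi, weight=int(weight))

    print(G.number_of_nodes(), "providers,",
          G.number_of_edges(), "referral edges")
    nx.write_graphml(G, "docgraph_subset.graphml")  # File > Open in Gephi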

September 22, 2013

Relationship Timelines

Filed under: Associations,Networks,Time,Timelines — Patrick Durusau @ 3:09 pm

Relationship Timelines by Skye Bender-deMoll.

From the post:

I finally had a chance to pull together a bunch of interesting timeline examples–mostly about the U.S. Congress. Although several of these are about networks, the primary features being visualized are changes in group structure and membership over time. Should these be called “alluvial diagrams”, “stream graphs” “Sankey charts”, “phase diagrams”, “cluster timelines”?

From the U.S. Congress to characters in the Lord of the Rings (movie version) and beyond, Skye explores visualization of dynamic relationships over time.

Raises the interesting issue: how do you represent a dynamic relationship in a topic map?

For example, at some point in a topic map of a family, the mother and father did not know each other. At some later point they met, but were not yet married. Still later they were married and later still, had children. Other events in their lives happened before or after those major events.

Scope could mark off segments of events, but you would have to create a date/time datatype, or use one from the W3C’s XML Schema Part 2: Datatypes Second Edition, to calculate which scope precedes or follows another scope.
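
A bare-bones sketch of one modeling option: attach a validity interval to each association and compute precedence between scopes from the intervals (the names and structure here are mine, not a TMDM feature):

    from dataclasses import dataclass
    from datetime import date

    @dataclass
    class ScopedAssociation:
        kind: str            # e.g. "acquainted", "married", "parent-of"
        members: tuple       # the players in the association
        start: date
        end: date = date.max # still in force

    timeline = [
        ScopedAssociation("acquainted", ("mother", "father"), date(1970, 6, 1)),
        ScopedAssociation("married",    ("mother", "father"), date(1974, 3, 15)),
        ScopedAssociation("parent-of",  ("mother", "child"),  date(1977, 9, 2)),
    ]

    def in_force(associations, on):
        """Which associations held on a given date?"""
        return [a for a in associations if a.start <= on < a.end]

    print([a.kind for a in in_force(timeline, date(1975, 1, 1))])
    # -> ['acquainted', 'married']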

A closely related problem is to show what facts were known to a person at some point in time. Or as put by Howard Baker:

“What did the President know and when did he know it?” [during the Watergate hearings]

That may again be a relevant question in the not too distant future.

Suggestions for a robust topic map modeling solution would be most welcome!

September 11, 2013

…Conceptual Model For Evolving Graphs

Filed under: Distributed Computing,Evoluntionary,Graphs,Networks — Patrick Durusau @ 5:17 pm

An Analytics-Aware Conceptual Model For Evolving Graphs by Amine Ghrab, Sabri Skhiri, Salim Jouili, and Esteban Zimanyi.

Abstract:

Graphs are ubiquitous data structures commonly used to represent highly connected data. Many real-world applications, such as social and biological networks, are modeled as graphs. To answer the surge for graph data management, many graph database solutions were developed. These databases are commonly classified as NoSQL graph databases, and they provide better support for graph data management than their relational counterparts. However, each of these databases implement their own operational graph data model, which differ among the products. Further, there is no commonly agreed conceptual model for graph databases.

In this paper, we introduce a novel conceptual model for graph databases. The aim of our model is to provide analysts with a set of simple, well-defined, and adaptable conceptual components to perform rich analysis tasks. These components take into account the evolving aspect of the graph. Our model is analytics-oriented, flexible and incremental, enabling analysis over evolving graph data. The proposed model provides a typing mechanism for the underlying graph, and formally defines the minimal set of data structures and operators needed to analyze the graph.

The authors concede that much work remains to be done, both theoretical and practical on their proposal.

With the rise of distributed computing, every “fact” depends upon a calculated moment of now. What was a “fact” five minutes ago may no longer be considered a “fact” but an “error.”

Who is responsible for changes in “facts,” warranties for “facts,” who gives and gets notices about changes in “facts,” all remain to be determined.

Models for evolving graphs may assist in untangling the rights, obligations and relationships that are nearly upon us with distributed computing.

August 26, 2013

CCNx®

Filed under: CCNx,Networks — Patrick Durusau @ 1:44 pm

CCN® (Content-Centric Networking)

From the about page:

Project CCNx® exists to develop, promote, and evaluate a new approach to communication architecture we call content-centric networking. We seek to carry out this mission by creating and publishing open protocol specifications and an open source software reference implementation of those protocols. We provide support for a community of people interested in experimentation, research, and building applications with this technology, all contributing to its evolution.

Research Origins and Current State

CCNx technology is still at an early stage of development, with pure infrastructure and no applications, best suited to researchers and adventurous network engineers or software developers. If you’re looking for cool applications ready to download and use, you are a little too early.

Project CCNx is sponsored by the Palo Alto Research Center (PARC) and is based upon the PARC Content-Centric Networking (CCN) architecture, which is the focus of a major, long-term research and development program. There are interesting problems in many areas still to be solved to fully realize and apply the vision, but we believe that enough of an architectural foundation is in place to enable significant experiments to begin. Since this new approach to networking can be deployed through middleware software communicating in an overlay on existing networks, it is possible to start applying it now to solve communication problems in new ways. Project CCNx is an invitation to join us and participate in this exploration of the frontier of content networking.

An odd echo of my post earlier today on HSA – Heterogeneous System Architecture, where heterogeneous processors share the same data.

The abstract from the paper, Networking Named Content by Van Jacobson, Diana K. Smetters, James D. Thornton, Michael F. Plass, Nicholas H. Briggs and Rebecca L. Braynard (2009), gives a good overview:

Network use has evolved to be dominated by content distribution and retrieval, while networking technology still speaks only of connections between hosts. Accessing content and services requires mapping from the what that users care about to the network’s where. We present Content-Centric Networking (CCN) which treats content as a primitive – decoupling location from identity, security and access, and retrieving content by name. Using new approaches to routing named content, derived heavily from IP, we can simultaneously achieve scalability, security and performance. We implemented our architecture’s basic features and demonstrate resilience and performance with secure file downloads and VoIP calls.

I rather like that: “…requires mapping from the what that users care about to the network’s where.”

As a user I don’t care nearly as much where content is located as I do about the content itself.

Do you?

You may have to get out your copy of TCP/IP Illustrated by W. Richard Stevens but it will be worth the effort.

I haven’t gone over all the literature but I haven’t seen any mention of the same data originating from multiple addresses. Not the caching of content (that’s pretty obvious), but the same named content at different locations.

The usual content semantic issues plus being able to say that two or more named contents are the same content.

August 13, 2013

Are EigenVectors Dangerous?

Filed under: Graphs,Mathematics,Networks,PageRank,Ranking — Patrick Durusau @ 7:44 pm

neo4j: Extracting a subgraph as an adjacency matrix and calculating eigenvector centrality with JBLAS by Mark Needham.

Mark continues his exploration of Eigenvector centrality by adding the Eigenvector centrality values back to the graph from which they were computed.

Putting the Eigenvector centrality measure results back into Neo4j makes them easier to query.

What troubles me is that Eigenvector centrality values are based only upon the recorded information we have for the graph.

There is no allowance for missing relationships or any validation of the Eigenvector centrality values found.

Recall that Paul Revere was a “terrorist” in his day. Combine the NSA using algorithms to declare nodes “important” with detainees’ lack of access to courts, and Eigenvector centrality values start to look dangerous.

How would you validate Eigenvector centrality values? Not mathematically but against known values or facts outside of your graph.

How Important is Your Node in the Social Graph?

Filed under: Graphs,Mathematics,Networks,PageRank,Ranking — Patrick Durusau @ 6:08 pm

Java/JBLAS: Calculating eigenvector centrality of an adjacency matrix by Mark Needham.

OK, Mark’s title is more accurate but mine is more likely to get you to look beyond the headline. 😉

From the post:

I recently came across a very interesting post by Kieran Healy where he runs through a bunch of graph algorithms to see whether he can detect the most influential people behind the American Revolution based on their membership of various organisations.

The first algorithm he looked at was betweenness centrality which I’ve looked at previously and is used to determine the load and importance of a node in a graph.

This algorithm would assign a high score to nodes which have a lot of nodes connected to them even if those nodes aren’t necessarily influential nodes in the graph.

If we want to take the influence of the other nodes into account then we can use an algorithm called eigenvector centrality.

You may remember Kieran Healy’s post from Using Metadata to Find Paul Revere [In a Perfect World], where I pointed out that Kieran was using clean data. No omissions, no variant spellings, no confusion of any sort.

I suspect any sort of analysis would succeed with the proviso that it only gets clean data. Unlikely in an unclean data world.

But that to one side, Mark does a great job of assembling references on eigenvectors and code for processing. Follow all the resources in Mark’s post and you will have a much deeper understanding of this area.

Be sure to take note of the comparison between PageRank and Eigenvector centrality. Results are computational artifacts of choices that are visible when examining the end results.
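
To get a feel for the computation Mark walks through, here is a minimal power-iteration sketch over an adjacency matrix in Python with numpy (JBLAS is the Java analogue; the toy graph is my own):

    import numpy as np

    def eigenvector_centrality(A, iterations=100, tol=1e-9):
        """Power iteration on a symmetric adjacency matrix A."""
        x = np.ones(A.shape[0])
        x /= np.linalg.norm(x)
        for _ in range(iterations):
            x_new = A @ x
            x_new /= np.linalg.norm(x_new)
            if np.allclose(x, x_new, atol=tol):
                break
            x = x_new
        return x_new

    # Toy graph: node 0 touches everyone; node 3 only touches node 0.
    A = np.array([[0, 1, 1, 1],
                  [1, 0, 1, 0],
                  [1, 1, 0, 0],
                  [1, 0, 0, 0]], dtype=float)
    print(eigenvector_centrality(A))  # node 0 scores highest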

PS: The Wikipedia link for Centrality cites Opsahl, Tore; Agneessens, Filip; Skvoretz, John (2010). “Node centrality in weighted networks: Generalizing degree and shortest paths“. Social Networks 32 (3): 245. doi:10.1016/j.socnet.2010.03.006 as a good summary. The link for the title leads to a preprint which is freely available.

June 21, 2013

Graphillion

Filed under: Graphillion,Graphs,Networks,Python — Patrick Durusau @ 6:07 pm

Graphillion

From the webpage:

Graphillion is a Python library for efficient graphset operations. Unlike existing graph tools such as NetworkX, which are designed to manipulate just a single graph at a time, Graphillion handles a large set of graphs very efficiently. Surprisingly, trillions of trillions of graphs can be processed on a single computer with Graphillion.

You may be curious about an uncommon concept of graphset, but it comes along with any graph or network when you consider multiple subgraphs cut from the graph; e.g., considering possible driving routes on a road map, examining feasible electric flows on a power grid, or evaluating the structure of chemical reaction networks. The number of such subgraphs can be trillions even in a graph with just a few hundreds of edges, since subgraphs increase exponentially with the graph size. It takes millions of years to examine all subgraphs with a naive approach as demonstrated in the funny movie above; Graphillion is our answer to resolve this issue.

Graphillion allows you to exhaustively but efficiently search a graphset with complex, even nonconvex, constraints. In addition, you can find top-k optimal graphs from the complex graphset, and can also extract common properties among all graphs in the set. Thanks to these features, Graphillion has a variety of applications including graph database, combinatorial optimization, and a graph structure analysis. We will show some practical use cases in the following tutorial, including evaluation of power distribution networks.

Just skimming the tutorial, this looks way cool!
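
A taste of the API, following the project's own tutorial helpers (the vertex numbering and path count come from the 3x3 grid example in the tutorial):

    from graphillion import GraphSet
    import graphillion.tutorial as tl

    universe = tl.grid(2, 2)         # a 3x3 grid graph, as an edge list
    GraphSet.set_universe(universe)

    # Every self-avoiding path between opposite corners, as one GraphSet.
    paths = GraphSet.paths(1, 9)
    print(len(paths))                # 12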

Be sure to check out the references:

  • Takeru Inoue, Hiroaki Iwashita, Jun Kawahara, and Shin-ichi Minato: “Graphillion: Software Library Designed for Very Large Sets of Graphs in Python,” Hokkaido University, Division of Computer Science, TCS Technical Reports, TCS-TR-A-13-65, June 2013. (pdf)
  • Takeru Inoue, Keiji Takano, Takayuki Watanabe, Jun Kawahara, Ryo Yoshinaka, Akihiro Kishimoto, Koji Tsuda, Shin-ichi Minato, and Yasuhiro Hayashi, “Loss Minimization of Power Distribution Networks with Guaranteed Error Bound,” Hokkaido University, Division of Computer Science, TCS Technical Reports, TCS-TR-A-12-59, 2012. (pdf)
  • Ryo Yoshinaka, Toshiki Saitoh, Jun Kawahara, Koji Tsuruma, Hiroaki Iwashita, and Shin-ichi Minato, “Finding All Solutions and Instances of Numberlink and Slitherlink by ZDDs,” Algorithms 2012, 5(2), pp.176-213, 2012. (doi)
  • DNET – Distribution Network Evaluation Tool

I first saw this in a tweet by David Gutelius.

June 13, 2013

Loopy Lattices Redux

Filed under: Faunus,Graphs,Networks,Titan — Patrick Durusau @ 4:45 pm

Loopy Lattices Redux by Marko A. Rodriguez.

Comparison of Titan and Faunus counting the number of paths in a 20 x 20 lattice.

Interesting from a graph-theoretic perspective, but since the count can be determined analytically, I am not sure of the utility of being able to count the paths.
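
For reference, if the paths in question are the monotone (shortest) corner-to-corner walks on a 20 x 20 lattice, the analytic count is a single binomial coefficient:

    from math import comb

    # 40 unit steps in total; choose which 20 of them move right.
    print(comb(40, 20))   # 137846528820, roughly 137.8 billion paths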

In some ways this reminds me of Counting complex disordered states by efficient pattern matching: chromatic polynomials and Potts partition functions by Marc Timme, Frank van Bussel, Denny Fliegner and Sebastian Stolzenberg, New Journal of Physics 11 (2009) 023001.

The question Timme and colleagues were investigating was the coloring of nodes in a graph which depended upon the coloring of other nodes. For a chess board sized graph, the calculation is estimated to take billions of years. The technique developed here takes less than seven (7) seconds for a chess board sized graph.

Traditionally, assigning a color to a vertex required knowledge of the entire graph. Here, instead of assigning a color, the color that should be assigned is represented by a formula stating the unknowns. Once all the nodes have such a formula:

The computation of the chromatic polynomial has been reduced to a process of alternating expansion of expressions and symbolically replacing terms in an appropriate order. In the language of computer science, these operations are represented as the expanding, matching and sorting of patterns, making the algorithm suitable for computer algebra programs optimized for pattern matching.

What isn’t clear is whether a similar technique could be applied to merging conditions where the merging state of a proxy depends upon, potentially, all other proxies.

May 31, 2013

“You Know Because I Know”

Filed under: Graphs,Multidimensional,Networks — Patrick Durusau @ 5:06 pm

“You Know Because I Know”: a Multidimensional Network Approach to Human Resources Problem by Michele Coscia, Giulio Rossetti, Diego Pennacchioli, Damiano Ceccarelli, Fosca Giannotti.

Abstract:

Finding talents, often among the people already hired, is an endemic challenge for organizations. The social networking revolution, with online tools like Linkedin, made possible to make explicit and accessible what we perceived, but not used, for thousands of years: the exact position and ranking of a person in a network of professional and personal connections. To search and mine where and how an employee is positioned on a global skill network will enable organizations to find unpredictable sources of knowledge, innovation and know-how. This data richness and hidden knowledge demands for a multidimensional and multiskill approach to the network ranking problem. Multidimensional networks are networks with multiple kinds of relations. To the best of our knowledge, no network-based ranking algorithm is able to handle multidimensional networks and multiple rankings over multiple attributes at the same time. In this paper we propose such an algorithm, whose aim is to address the node multi-ranking problem in multidimensional networks. We test our algorithm over several real world networks, extracted from DBLP and the Enron email corpus, and we show its usefulness in providing less trivial and more flexible rankings than the current state of the art algorithms.

Although framed in a human resources context, it isn’t much of a jump to see this work as applicable to other multidimensional networks.

Including multidimensional networks of properties that define subject identities.

May 19, 2013

Visualizing your LinkedIn graph using Gephi (Parts 1 & 2)

Filed under: Gephi,Graphics,Networks,Social Networks,Visualization — Patrick Durusau @ 1:41 pm

Visualizing your LinkedIn graph using Gephi – Part 1

&

Visualizing your LinkedIn graph using Gephi – Part 2

by Thomas Cabrol.

From part 1:

Graph analysis becomes a key component of data science. A lot of things can be modeled as graphs, but social networks are really one of the most obvious examples.

In this post, I am going to show how one could visualize its own LinkedIn graph, using the LinkedIn API and Gephi, a very nice software for working on this type of data. If you don’t have it yet, just go to http://gephi.org/ and download it now !

My objective is to simply look at my connections (the “nodes” or “vertices” of the graph), see how they relate to each other (the “edges”) and find clusters of strongly connected users (“communities”). This is somewhat emulating what is available already in the InMaps data product, but, hey, this is cool to do it by ourselves, no ?

The first thing to do for running this graph analysis is to be able to query LinkedIn via its API. You really don’t want to get the data by hand… The API uses the oauth authentification protocol, which will let an application make queries on behalf of a user. So go to https://www.linkedin.com/secure/developer and register a new application. Fill the form as required, and in the OAuth part, use this redirect URL for instance:

Great introduction to Gephi!

As a bonus, it reinforces the lesson that ETL isn’t required to re-use data.

ETL may be required in some cases but in a world of data APIs those are getting fewer and fewer.

Think of it this way: Non-ETL data access means someone else is paying for maintenance, backups, hardware, etc.

How much of your IT budget is supporting duplicated data?

May 18, 2013

Graph Representation – Edge List

Filed under: Graphs,Networks,Programming — Patrick Durusau @ 12:44 pm

Graph Representation – Edge List

From the post:

An Edge List is a form of representation for a graph. It maintains a list of all the edges in the graph. For each edge, it keeps track of the 2 connecting vertices as well as the weight between them.

Followed by C++ code as an example.

A hypergraph would require tracking of 3 or more connected nodes.
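
The representation itself is a one-liner in most languages; a Python sketch of the weighted, 2-vertex case the post describes, with the hypergraph variant for contrast:

    # Weighted edge list: one (u, v, weight) triple per edge.
    edges = [
        (0, 1, 4.0),
        (0, 2, 1.5),
        (1, 2, 2.0),
    ]

    for u, v, w in edges:
        print(f"{u} -- {v} (weight {w})")

    # A hyperedge instead carries a set of 3 or more vertices.
    hyperedges = [({0, 1, 2}, 1.0), ({1, 2, 3, 4}, 0.5)]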

May 14, 2013

HeadStart for Planet Earth [Titan]

Filed under: Education,Graphs,Networks,Titan — Patrick Durusau @ 8:45 am

Educating the Planet with Pearson by Marko A. Rodriguez.

From the post:

Pearson is striving to accomplish the ambitious goal of providing an education to anyone, anywhere on the planet. New data processing technologies and theories in education are moving much of the learning experience into the digital space — into massive open online courses (MOOCs). Two years ago Pearson contacted Aurelius about applying graph theory and network science to this burgeoning space. A prototype proved promising in that it added novel, automated intelligence to the online education experience. However, at the time, there did not exist scalable, open-source graph database technology in the market. It was then that Titan was forged in order to meet the requirement of representing all universities, students, their resources, courses, etc. within a single, unified graph. Moreover, beyond representation, the graph needed to be able to support sub-second, complex graph traversals (i.e. queries) while sustaining at least 1 billion transactions a day. Pearson asked Aurelius a simple question: “Can Titan be used to educate the planet?” This post is Aurelius’ answer.

Liking the graph approach in general and Titan in particular does not make me any more comfortable with some aspects of this posting.

You don’t need to spin up a very large Cassandra database on Amazon to see the problems.

Consider the number of concepts for educating the world, some 9,000 if the chart is to be credited.

Suggested Upper Merged Ontology (SUMO) has “~25,000 terms and ~80,000 axioms when all domain ontologies are combined.”

And the SUMO totals are before you get into the weeds of any particular subject, discipline or course material.

Or the subset of concepts and facts represented in DBpedia:

The English version of the DBpedia knowledge base currently describes 3.77 million things, out of which 2.35 million are classified in a consistent Ontology, including 764,000 persons, 573,000 places (including 387,000 populated places), 333,000 creative works (including 112,000 music albums, 72,000 films and 18,000 video games), 192,000 organizations (including 45,000 companies and 42,000 educational institutions), 202,000 species and 5,500 diseases.

In addition, we provide localized versions of DBpedia in 111 languages. All these versions together describe 20.8 million things, out of which 10.5 million overlap (are interlinked) with concepts from the English DBpedia. The full DBpedia data set features labels and abstracts for 10.3 million unique things in up to 111 different languages; 8.0 million links to images and 24.4 million HTML links to external web pages; 27.2 million data links into external RDF data sets, 55.8 million links to Wikipedia categories, and 8.2 million YAGO categories. The dataset consists of 1.89 billion pieces of information (RDF triples) out of which 400 million were extracted from the English edition of Wikipedia, 1.46 billion were extracted from other language editions, and about 27 million are data links to external RDF data sets. The Datasets page provides more information about the overall structure of the dataset. Dataset Statistics provides detailed statistics about 22 of the 111 localized versions.

I don’t know if the 9,000 concepts cited in the post would be sufficient for a worldwide HeadStart program in multiple languages.

Moreover, why would any sane person want a single unified graph to represent course delivery from Zaire to the United States?

How is a single unified graph going to deal with the diversity of educational institutions around the world? A diversity that I take as a good thing.

It sounds like Pearson is offering a unified view of education.

My suggestion is to consider the value of your own diversity before passing on that offer.

May 13, 2013

Motif Simplification…[Simplifying Graphs]

Filed under: Graphics,Graphs,Interface Research/Design,Networks,Visualization — Patrick Durusau @ 3:22 pm

Motif Simplification: Improving Network Visualization Readability with Fan, Connector, and Clique Glyphs by Cody Dunne and Ben Shneiderman.

Abstract:

Analyzing networks involves understanding the complex relationships between entities, as well as any attributes they may have. The widely used node-link diagrams excel at this task, but many are difficult to extract meaning from because of the inherent complexity of the relationships and limited screen space. To help address this problem we introduce a technique called motif simplification, in which common patterns of nodes and links are replaced with compact and meaningful glyphs. Well-designed glyphs have several benefits: they (1) require less screen space and layout effort, (2) are easier to understand in the context of the network, (3) can reveal otherwise hidden relationships, and (4) preserve as much underlying information as possible. We tackle three frequently occurring and high-payoff motifs: fans of nodes with a single neighbor, connectors that link a set of anchor nodes, and cliques of completely connected nodes. We contribute design guidelines for motif glyphs; example glyphs for the fan, connector, and clique motifs; algorithms for detecting these motifs; a free and open source reference implementation; and results from a controlled study of 36 participants that demonstrates the effectiveness of motif simplification.

When I read “replace,” “aggregation,” etc., I automatically think about merging in topic maps. 😉

After replacing “common patterns of nodes and links” I may still be interested in the original content of those nodes and links.

Or I may wish to partially unpack them based on some property in the original content.

Definitely a paper for a slow, deep read.

Not to mention research on the motifs in graph representations of your topic maps.
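
Of the three motifs, the fan is the simplest to detect yourself; a sketch with networkx (the helper name and toy graph are mine):

    import networkx as nx
    from collections import defaultdict

    def find_fans(G):
        """Group degree-1 nodes by their lone neighbor (the fan hub)."""
        fans = defaultdict(list)
        for node in G:
            if G.degree(node) == 1:
                hub = next(iter(G[node]))
                fans[hub].append(node)
        # Keep only real fans: a hub with 2+ single-neighbor leaves.
        return {hub: leaves for hub, leaves in fans.items()
                if len(leaves) >= 2}

    G = nx.Graph([(0, 1), (0, 2), (0, 3), (3, 4)])
    print(find_fans(G))   # {0: [1, 2]}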

I first saw this in Visualization Papers at CHI 2013 by Enrico Bertini.

May 12, 2013

Guess: The Graph Exploration System

Filed under: Graphs,Networks,Visualization — Patrick Durusau @ 4:47 pm

Guess: The Graph Exploration System

From the webpage:

GUESS is an exploratory data analysis and visualization tool for graphs and networks. The system contains a domain-specific embedded language called Gython (an extension of Python, or more specifically Jython) which supports the operators and syntactic sugar necessary for working on graph structures in an intuitive manner. An interactive interpreter binds the text that you type in the interpreter to the objects being visualized for more useful integration. GUESS also offers a visualization front end that supports the export of static images and dynamic movies.

Graph movies? Cool!

If you could catch a graph in an unguarded moment, what would you want to capture in a movie?

See also: Sourceforge – Guess.

Bond Percolation in GraphLab

Filed under: GraphLab,Graphs,Networks — Patrick Durusau @ 4:11 pm

Bond Percolation in GraphLab by Danny Bickson.

From the post:

I was asked by Prof. Scott Kirkpatrick to help and implement bond percolation in GraphLab. It is an oldie but goldie problem which is closely related to the connected components problem.

Here is an explanation about bond percolation from Wikipedia:

A representative question (and the source of the name) is as follows. Assume that some liquid is poured on top of some porous material. Will the liquid be able to make its way from hole to hole and reach the bottom? This physical question is modelled mathematically as a three-dimensional network of n × n × n vertices, usually called “sites”, in which the edge or “bonds” between each two neighbors may be open (allowing the liquid through) with probability p, or closed with probability 1 – p, and they are assumed to be independent. Therefore, for a given p, what is the probability that an open path exists from the top to the bottom? The behavior for large n is of primary interest. This problem, called now bond percolation, was introduced in the mathematics literature by Broadbent & Hammersley (1957), and has been studied intensively by mathematicians and physicists since.

In social networks, Danny notes this algorithm is used to find groups of friends.

Similar mazes appear in puzzle books.

My curiosity is about finding groups of subject identity properties.
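
A small Monte Carlo sketch of the Wikipedia question in two dimensions (union-find over an n x n grid of sites, each bond open with probability p; the parameters are illustrative):

    import random

    def percolates(n, p, rng=random):
        """Does an open path cross an n x n site grid, top to bottom?"""
        parent = list(range(n * n + 2))   # plus virtual top/bottom nodes
        TOP, BOTTOM = n * n, n * n + 1

        def find(x):
            while parent[x] != x:
                parent[x] = parent[parent[x]]   # path halving
                x = parent[x]
            return x

        def union(a, b):
            parent[find(a)] = find(b)

        for r in range(n):
            for c in range(n):
                site = r * n + c
                if r == 0:
                    union(site, TOP)
                if r == n - 1:
                    union(site, BOTTOM)
                # Open each bond to the right / below with probability p.
                if c + 1 < n and rng.random() < p:
                    union(site, site + 1)
                if r + 1 < n and rng.random() < p:
                    union(site, site + n)
        return find(TOP) == find(BOTTOM)

    # Near the 2-D bond percolation threshold p = 0.5:
    trials = 200
    print(sum(percolates(50, 0.5) for _ in range(trials)) / trials)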

A couple of other percolation resources of interest:

Percolation Exercises by Eric Mueller.

PercoVis (Mac), visualization of percolation by Daniel B. Larremore.

May 11, 2013

Medusa: Simplified Graph Processing on GPUs

Filed under: GPU,Graphs,Networks,Parallel Programming — Patrick Durusau @ 12:30 pm

Medusa: Simplified Graph Processing on GPUs by Jianlong Zhong, Bingsheng He.

Abstract:

Graphs are the de facto data structures for many applications, and efficient graph processing is a must for the application performance. Recently, the graphics processing unit (GPU) has been adopted to accelerate various graph processing algorithms such as BFS and shortest path. However, it is difficult to write correct and efficient GPU programs and even more difficult for graph processing due to the irregularities of graph structures. To simplify graph processing on GPUs, we propose a programming framework called Medusa which enables developers to leverage the capabilities of GPUs by writing sequential C/C++ code. Medusa offers a small set of user-defined APIs, and embraces a runtime system to automatically execute those APIs in parallel on the GPU. We develop a series of graph-centric optimizations based on the architecture features of GPU for efficiency. Additionally, Medusa is extended to execute on multiple GPUs within a machine. Our experiments show that (1) Medusa greatly simplifies implementation of GPGPU programs for graph processing, with much fewer lines of source code written by developers; (2) The optimization techniques significantly improve the performance of the runtime system, making its performance comparable with or better than the manually tuned GPU graph operations.

Just in case you are interested in high performance graph processing. 😉
