Archive for the ‘Gephi’ Category

Visualizing your LinkedIn graph using Gephi (Parts 1 & 2)

Sunday, May 19th, 2013

Visualizing your LinkedIn graph using Gephi – Part 1

&

Visualizing your LinkedIn graph using Gephi – Part 2

by Thomas Cabrol.

From part 1:

Graph analysis becomes a key component of data science. A lot of things can be modeled as graphs, but social networks are really one of the most obvious examples.

In this post, I am going to show how one could visualize one’s own LinkedIn graph, using the LinkedIn API and Gephi, a very nice piece of software for working on this type of data. If you don’t have it yet, just go to http://gephi.org/ and download it now!

My objective is to simply look at my connections (the “nodes” or “vertices” of the graph), see how they relate to each other (the “edges”) and find clusters of strongly connected users (“communities”). This is somewhat emulating what is available already in the InMaps data product, but, hey, it is cool to do it by ourselves, no?

The first thing to do for running this graph analysis is to be able to query LinkedIn via its API. You really don’t want to get the data by hand… The API uses the OAuth authentication protocol, which will let an application make queries on behalf of a user. So go to https://www.linkedin.com/secure/developer and register a new application. Fill the form as required, and in the OAuth part, use this redirect URL for instance:

Great introduction to Gephi!

As a bonus, reinforces the lesson that ETL isn’t required to re-use data.

ETL may be required in some cases but in a world of data APIs those are getting fewer and fewer.

Think of it this way: Non-ETL data access means someone else is paying for maintenance, backups, hardware, etc.

How much of your IT budget is supporting duplicated data?

Rebuilding Gephi’s core for the 0.9 version

Tuesday, March 5th, 2013

Rebuilding Gephi’s core for the 0.9 version by Mathieu Bastian.

From the post:

This is the first article about the future Gephi 0.9 version. Our objective is to prepare the ground for a future 1.0 release and focus on solving some of the most difficult problems. It all starts with the core of Gephi and we’re giving today a preview of the upcoming changes in that area. In fact, we’re rewriting the core modules from scratch to improve performance, stability and add new features. The core modules represent and store the graph and attributes in memory so it’s available to the rest of the application. Rewriting Gephi’s core is like replacing the engine of a truck and involves adapting a lot of interconnected pieces. Gephi’s current graph structure engine was designed in 2009 and didn’t change much in multiple releases. Although it’s working, it doesn’t have the level of quality we want for Gephi 1.0 and needs to be overhauled. The aim is to complete the new implementation and integrate it in the 0.9 version.

Deeply interesting work!

To follow, consider subscribing to: gephi-dev — List for core developers.

Large Steam network visualization with Google Maps + Gephi

Wednesday, November 21st, 2012

Large Steam network visualization with Google Maps + Gephi

From the post:

I’ve used Google Maps API to visualize a relatively large network collected from Steam Community members. The data is collected from public player profiles that Valve reveals through their Steam Web API. For each player their links to friends and links to Steam Groups they belong are collected. This creates a social network which can be visualized using Gephi.

Graph consists of 212600 nodes and 4045203 edges. Before filtering outliers and low/high degree nodes there are approximately 800 000 groups and over 11 million users.
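The post does not say how the low/high degree nodes were filtered. As a rough sketch of the idea only, assuming a plain edge list and a single pass over the original degrees (the function name and the toy data are mine, not the author's):

```python
from collections import defaultdict

def filter_by_degree(edges, min_degree, max_degree):
    """Keep only edges whose endpoints both have a degree in [min_degree, max_degree]."""
    degree = defaultdict(int)
    for source, target in edges:
        degree[source] += 1
        degree[target] += 1
    # Degrees are measured once, on the unfiltered graph (no iterative pruning).
    keep = {node for node, d in degree.items() if min_degree <= d <= max_degree}
    return [(s, t) for s, t in edges if s in keep and t in keep]

# Hypothetical toy network: "hub" touches everyone, "loner" has a single link.
edges = [("hub", "a"), ("hub", "b"), ("hub", "c"), ("hub", "loner"),
         ("a", "b"), ("b", "c"), ("a", "c")]
filtered = filter_by_degree(edges, min_degree=2, max_degree=3)
print(filtered)
```

In Gephi itself the same effect comes from a degree range filter, so code like this is only useful when the graph is too large to load before trimming.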

Very impressive visualization.

Enjoy!

“Drug Deal” Network Analysis with Gephi (Tutorial)

Friday, November 9th, 2012

“Drug Deal” Network Analysis with Gephi (Tutorial) by A. J. Hirst.

A.J. reviews Even Wholesale Drug Dealers Can Use a Little Retargeting: Graphing, Clustering & Community Detection in Excel and Gephi, suggests that you read it before continuing, and then reviews how to use Gephi to converse with the drug dealer data set.

Good tutorial on Gephi and just as good on “conversing” with the data.

Gephi Blueprints plugin

Wednesday, November 7th, 2012

Gephi Blueprints plugin by David Suvee.

From the homepage:

The Gephi Blueprints plugin allows a user to import graph-data from any graph database that implements the Tinkerpop Blueprints generic graph API. Out of the box, the plugin provides support for TinkerGraph, Neo4j, OrientDB, Dex and RexterGraph. Additionally, it also provides support for the FluxGraph temporal graph database.

Excellent!

Not to mention having a short list of interesting graph software to boot!

Information Diffusion on Twitter by @snikolov

Friday, October 26th, 2012

Information Diffusion on Twitter by @snikolov by Marti Hearst.

From the post:

Today Stan Nikolov, who just finished his masters at MIT in studying information diffusion networks, walked us through one particular theoretical model of information diffusion which tries to predict under what conditions an idea stops spreading based on a network’s structure (from the popular Easley and Kleinberg Network book). Stan also gathered a huge amount of Twitter data, processed it using Pig scripts, and graphed the results using Gephi. The video lecture below shows you some great visualizations of the spreading behavior of the data!

(video omitted)

The slides in his Lecture Notes let you see the Pig scripts in more detail.

Another deeply awesome lecture from Marti’s class on Twitter and big data.

Also an example of the level of analysis that a Twitter stream will need to withstand to avoid “imperial entanglements.”

Twitter Results Recipe with Gephi Garnish

Tuesday, October 2nd, 2012

Grabbing Twitter Search Results into Google Refine And Exporting Conversations into Gephi by Tony Hirst.

From the post:

How can we get a quick snapshot of who’s talking to whom on Twitter in the context of a particular hashtag?

What follows is a detailed recipe with the answer to that question.
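Tony's recipe uses Google Refine. If you would rather script it, the core step (turning tweets into "who mentions whom" edges) might look something like the sketch below. The tweet tuples and the function name are made up for illustration, and the regular expression is a simplification of what Twitter accepts as a screen name.

```python
import re
from collections import Counter

MENTION = re.compile(r"@(\w+)")

def mention_edges(tweets):
    """Count directed (author -> mentioned user) pairs across (author, text) tweets."""
    counts = Counter()
    for author, text in tweets:
        for mentioned in MENTION.findall(text):
            if mentioned.lower() != author.lower():  # ignore self-mentions
                counts[(author.lower(), mentioned.lower())] += 1
    return counts

# Hypothetical search results for one hashtag.
tweets = [
    ("alice", "@bob have you tried Gephi? #dataviz"),
    ("bob", "@alice yes, and @carol has too #dataviz"),
    ("alice", "Thanks @bob! #dataviz"),
]
edges = mention_edges(tweets)

# One "source,target,weight" row per edge, a shape Gephi can import as CSV.
for (source, target), weight in sorted(edges.items()):
    print(f"{source},{target},{weight}")
```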

NodeGL: An online interactive viewer for NodeXL graphs uploaded to Google Spreadsheet

Friday, March 30th, 2012

NodeGL: An online interactive viewer for NodeXL graphs uploaded to Google Spreadsheet.

Martin Hawksey writes:

Recently Tony (Hirst) tipped me off about a new viewer for Gephi graphs. Developed by Raphaël Velt, it uses JavaScript to parse Gephi .gexf files and output the result on an HTML5 canvas. The code for the viewer is on github, available under an MIT license if you want to download and remash; I’ve also put an instance here if you want to play. Looking for a solution to render NodeXL data from a Google Spreadsheet in a similar way, here is some background in the development of NodeGL – an online viewer of NodeXL graphs hosted on Google Spreadsheets.

Introduction to NodeGL.

Getting Started With The Gephi…

Saturday, January 21st, 2012

Getting Started With The Gephi Network Visualisation App – My Facebook Network, Part I by Tony Hirst.

From the post:

A couple of weeks ago, I came across Gephi, a desktop application for visualising networks.

And quite by chance, a day or two after I was asked about any tools I knew of that could visualise and help analyse social network activity around an OU course… which I take as a reasonable justification for exploring exactly what Gephi can do :-)

So, after a few false starts, here’s what I’ve learned so far…

First up, we need to get some graph data – netvizz – facebook to gephi suggests that the netvizz facebook app can be used to grab a copy of your Facebook network in a format that Gephi understands, so I installed the app, downloaded my network file, and then uninstalled the app… (can’t be too careful ;-)

Once Gephi is launched (and updated, if it’s a new download – you’ll see an updates prompt in the status bar along the bottom of the Gephi window, right hand side) Open… the network file you downloaded.

If you like part 1 as an introduction to Gephi, be sure to take in:

Getting Started With Gephi Network Visualisation App – My Facebook Network, Part II: Basic Filters

which starts out:

In Getting Started With Gephi Network Visualisation App – My Facebook Network, Part I I described how to get up and running with the Gephi network visualisation tool using social graph data pulled out of my Facebook account. In this post, I’ll explore some of the tools that Gephi provides for exploring a network in a more structured way.

If you aren’t familiar with Gephi, and if you haven’t read Part I of this series, I suggest you do so now…

…done that…?

Okay, so where do we begin? As before, I’m going to start with a fresh worksheet, and load my Facebook network data, downloaded via the netvizz app, into Gephi, but as an undirected graph this time! So far, so exactly the same as last time. Just to give me some pointers over the graph, I’m going to set the node size to be proportional to the degree of each node (that is, the number of people each person is connected to).
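Gephi does the degree-to-size mapping for you in its ranking settings. To make the idea concrete, here is a small sketch that assumes a linear mapping and an undirected edge list. The size bounds and the friend list are invented for the example.

```python
from collections import Counter

def node_sizes(edges, min_size=10.0, max_size=50.0):
    """Map each node's degree linearly onto [min_size, max_size]."""
    degree = Counter()
    for a, b in edges:  # undirected: each edge counts for both endpoints
        degree[a] += 1
        degree[b] += 1
    low, high = min(degree.values()), max(degree.values())
    span = (high - low) or 1  # avoid dividing by zero when all degrees match
    return {node: min_size + (d - low) / span * (max_size - min_size)
            for node, d in degree.items()}

# Hypothetical friendship edges.
friends = [("me", "ann"), ("me", "ben"), ("me", "cat"), ("ann", "ben")]
sizes = node_sizes(friends)
print(sizes)
```

The best connected node gets the largest size and the least connected the smallest, with everyone else in between.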

You will find lots more to explore with Gephi but this should give you a good start.

Running along the graph using Neo4J Spatial and Gephi

Thursday, January 5th, 2012

Running along the graph using Neo4J Spatial and Gephi

Just to whet your appetite:

When I started running some years ago, I bought a Garmin Forerunner 405. It’s a nifty little device that tracks GPS coordinates while you are running. After a run, the device can be synchronized by uploading your data to the Garmin Connect website. Based upon the tracked time and GPS coordinates, the Garmin Connect website provides you with a detailed overview of your run, including distance, average pace, elevation loss/gain and lap splits. It also visualizes your run, by overlaying the tracked course on Bing and/or Google maps. Pretty cool! One of my last runs can be found here.

Apart from simple aggregations such as total distance and average speed, the Garmin Connect website provides little or no support to gain deeper insights in all of my runs. As I often run the same course, it would be interesting to calculate my average pace at specific locations. When combining the data of all of my courses, I could deduct frequently encountered locations. Finally, could there be a correlation between my average pace and my distance from home? In order to come up with answers to these questions, I will import my running data into a Neo4J Spatial datastore. Neo4J Spatial extends the Neo4J Graph Database with the necessary tools and utilities to store and query spatial data in your graph models. For visualizing my running data, I will make use of Gephi, an open-source visualization and manipulation tool that allows users to interactively browse and explore graphs.
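The article answers these questions with Neo4J Spatial. Just to show what "average pace at specific locations" means as a computation, here is a much cruder sketch that snaps GPS samples to a grid by rounding coordinates. The sample tuples, the pace unit (seconds per kilometre) and the rounding precision are all my assumptions.

```python
from collections import defaultdict

def average_pace_by_cell(samples, precision=3):
    """Average the pace of all (lat, lon, pace) samples that fall in the same grid cell."""
    buckets = defaultdict(list)
    for lat, lon, pace in samples:
        # Rounding to 3 decimals groups points into cells of roughly 100 m.
        cell = (round(lat, precision), round(lon, precision))
        buckets[cell].append(pace)
    return {cell: sum(paces) / len(paces) for cell, paces in buckets.items()}

# Hypothetical samples from two runs over the same stretch, plus one elsewhere.
samples = [
    (50.8466, 4.3528, 300.0),
    (50.8468, 4.3531, 320.0),
    (50.8512, 4.3602, 290.0),
]
paces = average_pace_by_cell(samples)
print(paces)
```

Cells that collect samples from many runs are the "frequently encountered locations" the author mentions. A spatial index is the better tool once the data grows.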

Suggestion: If you want to know where you go and/or how you spend your time, try tracking both for a week. Faithfully record how you spend your time, reading, commuting, TV, exercise, work, etc., in say 30 minute intervals. Also keep track of your physical location. Don’t try to be overly precise, use big buckets. And no peeking as to how the week is shaping up. I think you will be surprised at how your week shapes up.

Gephi: Graph Streaming API

Tuesday, November 22nd, 2011

Gephi: Graph Streaming API

Matt O’Donnell, @mdbod, wanted more information on the graph streaming API for Gephi, then tweets the URL you see above.

I have collaborated with Matt before. It is like working with a caffeinated fire hose. ;-)

Seriously, Matt does extremely good work, ranging from biblical languages, linguistics and markup languages to NLP and beyond.

Looking forward to him working on topic maps and related areas.

Gephi adds Neo4j graph database support

Monday, November 21st, 2011

Gephi adds Neo4j graph database support (screencast)

From the webpage:

Neo4j is a powerful, award-winning graph database written in Java. It can store billions of nodes and relationships and allows very fast query/traversal. We release today a new version of the Neo4j Plugin supporting the latest 1.5 version of Neo4j. In Gephi, go to Tools > Plugins to install the plug-in.

The plugin lets you visualize a graph stored in a Neo4j database and play with it. Features include full import, traversal, filter, export and lazy loading.

Warning: A real time sink! ;-)

Seriously, very cool plugin that will enhance your use of Neo4j!

Enjoy!

Visualizing RDF Schema inferencing through Neo4J, Tinkerpop, Sail and Gephi

Monday, November 21st, 2011

Visualizing RDF Schema inferencing through Neo4J, Tinkerpop, Sail and Gephi by Dave Suvee.

From the post:

Last week, the Neo4J plugin for Gephi was released. Gephi is an open-source visualization and manipulation tool that allows users to interactively browse and explore graphs. The graphs themselves can be loaded through a variety of file formats. Thanks to Martin Škurla, it is now possible to load and lazily explore graphs that are stored in a Neo4J data store.

In one of my previous articles, I explained how Neo4J and the Tinkerpop framework can be used to load and query RDF triples. The newly released Neo4J plugin now allows to visually browse these RDF triples and perform some more fancy operations such as finding patterns and executing social network analysis algorithms from within Gephi itself. Tinkerpop’s Sail implementation also supports the notion of RDF Schema inferencing. Inferencing is the process where new (RDF) data is automatically deducted from existing (RDF) data through reasoning. Unfortunately, the Sail reasoner cannot easily be integrated within Gephi, as the Gephi plugin grabs a lock on the Neo4J store and no RDF data can be added, except through the plugin itself.

Being able to visualize the RDF Schema reasoning process and graphically indicate which RDF triples were added manually and which RDF data was automatically inferred would be a nice to have. To implement this feature, we should be able to push graph changes from Tinkerpop and Neo4J to Gephi. Luckily, the Gephi graph streaming plugin allows us to do just that. In the rest of this article, I will detail how to setup the required Gephi environment and how we can stream (inferred) RDF data from Neo4J to Gephi.
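If RDF Schema inferencing is new to you, a toy version may help before reading Dave's article. The sketch below is not his code and does not use Tinkerpop or Sail. It applies two of the standard RDFS entailment rules (rdfs9 and rdfs11) to plain Python tuples and keeps the inferred triples apart from the asserted ones, which is the distinction he wants to show in Gephi. The `ex:` names are invented.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

def infer(asserted):
    """Apply two RDFS rules until nothing new appears; return only the inferred triples."""
    known = set(asserted)
    changed = True
    while changed:
        changed = False
        new = set()
        for s, p, o in known:
            for s2, p2, o2 in known:
                if p2 != SUBCLASS or s2 != o:
                    continue  # we only chain through "o rdfs:subClassOf o2"
                if p == SUBCLASS:
                    # rdfs11: subClassOf is transitive.
                    new.add((s, SUBCLASS, o2))
                elif p == RDF_TYPE:
                    # rdfs9: an instance of a class is an instance of its superclasses.
                    new.add((s, RDF_TYPE, o2))
        if not new <= known:
            known |= new
            changed = True
    return known - set(asserted)

# Hypothetical asserted triples.
asserted = {
    ("ex:Dog", SUBCLASS, "ex:Mammal"),
    ("ex:Mammal", SUBCLASS, "ex:Animal"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
}
inferred = infer(asserted)
for triple in sorted(inferred):
    print("inferred:", triple)
```

A real reasoner covers many more rules and is far more efficient than this nested loop. The point is only that the inferred set is new data derived from the old.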

Visual is good!

Visual display and exploration of graphs is better!

Visual display and exploration of Neo4j data stores from within Gephi is the best!

Dave concludes:

With just a few lines of code we are able to stream (inferred) RDF triples to Gephi and make use of its powerful visualization and analysis tools to explore and inspect our datasets. As always, the complete source code can be found on the Datablend public GitHub repository. Make sure to surf the internet to find some other nice Gephi streaming examples, the coolest one probably being the visualization of the Egyptian revolution on Twitter.

Other suggestions for Gephi streaming examples?

ForceAtlas2

Friday, October 21st, 2011

ForceAtlas2 (paper) +appendices by Mathieu Jacomy, Sebastien Heymann, Tommaso Venturini, and Mathieu Bastian.

Abstract:

ForceAtlas2 is a force vector algorithm proposed in the Gephi software, appreciated for its simplicity and for the readability of the networks it helps to visualize. This paper presents its distinctive features, its energy-model and the way it optimizes the “speed versus precision” approximation to allow quick convergence. We also claim that ForceAtlas2 is handy because the force vector principle is unaffected by optimizations, offering a smooth and accurate experience to users.

I knew I had to cite this paper when I read:

These earliest Gephi users were not fully satisfied with existing spatialization tools. We worked on empirical improvements and that’s how we created the first version of our own algorithm, ForceAtlas. Its particularity was a degree-dependant repulsion force that causes less visual cluttering. Since then we steadily added some features while trying to keep in touch with users’ needs. ForceAtlas2 is the result of this long process: a simple and straightforward algorithm, made to be useful for experts and profanes. (footnotes omitted, emphasis added)

Profanes. I like that! Well, rather I like the literacy that enables a writer to use that in a technical paper.

Highly recommended paper.

Introducing Gephi 0.7

Saturday, August 13th, 2011

Introducing Gephi 0.7

Yes, I know that Gephi 0.8 is out in alpha release but this video is worth viewing, even though it is about the “old” version.

From the description:

The video highlights the following features:

  • grouping: Group nodes into clusters and navigate in multi-level graphs.
  • multi-level layout: Very fast layout algorithm that coarsens the graph to reduce computation.
  • interaction: Highlight neighbors and interact directly with the visualization when using tools.
  • partitioning: Use data attributes to colorize partitions and communities.
  • ranking: Use degree, metrics or data attributes to set nodes/edges’ color and size.
  • metrics: Run various algorithms in one click and get an HTML report page.
  • data laboratory: Data table view with search feature.
  • dynamics: Use Timeline to explore dynamic graphs.
  • filtering: Dynamic queries, create and combine a large set of filters.
  • auto update: The application updates itself, including its core and plugins.
  • vectorial preview: Switch to the preview tab to put the final touch before exporting in SVG or PDF.

Gephi News: new Visualization API

Friday, August 12th, 2011

Gephi News: new Visualization API

Work is underway on a new visualization API for Gephi. If you are interested in writing graph visualization software, here’s your opportunity to make a difference.

A New Best Friend: Gephi for Large-scale Networks

Tuesday, August 9th, 2011

A New Best Friend: Gephi for Large-scale Networks

Though I never intended it, some posts of mine from a few years back dealing with 26 tools for large-scale graph visualization have been some of the most popular on this site. Indeed, my recommendation for Cytoscape for viewing large-scale graphs ranks within the top 5 posts all time on this site.

When that analysis was done in January 2008 my company was in the midst of needing to process the large UMBEL vocabulary, which now consists of 28,000 concepts. Like anything else, need drives research and demand, and after reviewing many graphing programs, we chose Cytoscape, then provided some ongoing guidelines in its use for semantic Web purposes. We have continued to use it productively in the intervening years.

Like for any tool, one reviews and picks the best at the time of need. Most recently, however, with growing customer usage of large ontologies and the development of our own structOntology editing and managing framework, we have begun to butt up against the limitations of large-scale graph and network analysis. With this post, we announce our new favorite tool for semantic Web network and graph analysis — Gephi — and explain its use and showcase a current example.

Times change and sometimes software choices do as well.

This is a case in point that reviews the current limitations of Cytoscape, the good points of Gephi, its needed improvements and pointers to more resources on Gephi. Can’t ask for much more.

Scientific graphs Generators plugin

Thursday, April 28th, 2011

Scientific graphs Generators plugin

A new plugin for Gephi, described as:

Cezary Bartosiak and Rafał Kasprzyk just released the Complex Generators plugin, introducing many awaited scientific generators. These generators are extremely useful for scientists, as they help to simulate various real networks. They can test their models and algorithms on well-studied graph examples. For instance, the Watts-Strogatz generator creates networks as described by Duncan Watts in his Six Degrees book.

The plugin contains the following generators:

  • Balanced Tree
  • Barabasi Albert
  • Barabasi Albert Generalized
  • Barabasi Albert Simplified A
  • Barabasi Albert Simplified B
  • Erdos Renyi Gnm
  • Erdos Renyi Gnp
  • Kleinberg
  • Watts Strogatz Alpha
  • Watts Strogatz Beta
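To give a feel for what such a generator does, here is a minimal sketch of the simplest model on the list, Erdos Renyi Gnp, where every possible edge is included independently with probability p. This is my own illustration of the textbook model, not the plugin's code, and the function name and parameters are assumptions.

```python
import random
from itertools import combinations

def erdos_renyi_gnp(n, p, seed=None):
    """Return the edges of a G(n, p) random graph on nodes 0..n-1."""
    rng = random.Random(seed)  # a seed makes the result reproducible
    # Visit every unordered pair once and keep it with probability p.
    return [(u, v) for u, v in combinations(range(n), 2) if rng.random() < p]

edges = erdos_renyi_gnp(n=20, p=0.1, seed=42)
print(len(edges), "edges out of", 20 * 19 // 2, "possible")
```

The loop over all pairs is quadratic in n, which is fine for small test graphs and too slow for very large ones.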

Playing with Gephi, Bio4j and Go

Wednesday, March 30th, 2011

Playing with Gephi, Bio4j and Go

From the blog:

It had already been some time without having some fun with Gephi so today I told myself: why not trying visualizing the whole Gene Ontology and seeing what happens?

First of all I had to generate the corresponding file in gexf format containing all the terms and relationships belonging to the ontology.

For that I did a small program (GenerateGexfGo.java) which uses Bio4j for terms/relationships info retrieval and a couple of XML Gexf wrapper classes from the github project Era7BioinfoXML.
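The author's program is written in Java and pulls its data from Bio4j. If you just want to see what a gexf file of terms and relationships amounts to, here is a small Python sketch using only the standard library. The element and attribute names follow the GEXF 1.2 format as I understand it, and the two terms stand in for the whole ontology.

```python
import xml.etree.ElementTree as ET

def build_gexf(terms, relationships):
    """Build a minimal static, directed GEXF 1.2 document as an ElementTree element."""
    gexf = ET.Element("gexf", {"xmlns": "http://www.gexf.net/1.2draft", "version": "1.2"})
    graph = ET.SubElement(gexf, "graph", {"mode": "static", "defaultedgetype": "directed"})
    nodes = ET.SubElement(graph, "nodes")
    for term_id, label in terms.items():
        ET.SubElement(nodes, "node", {"id": term_id, "label": label})
    edges = ET.SubElement(graph, "edges")
    for index, (source, target, kind) in enumerate(relationships):
        ET.SubElement(edges, "edge", {"id": str(index), "source": source,
                                      "target": target, "label": kind})
    return gexf

# Sample input only: two terms and one relationship.
terms = {"GO:0008150": "biological_process", "GO:0009987": "cellular process"}
relationships = [("GO:0009987", "GO:0008150", "is_a")]

xml_text = ET.tostring(build_gexf(terms, relationships), encoding="unicode")
print(xml_text)
```

Writing `xml_text` to a file with a .gexf extension should give something Gephi can open, though I have only sketched the bare minimum of the format here.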

This looks like fun!

And a good way to look at an important data set, that could benefit from a topic map.

Gephi Workshop

Saturday, March 5th, 2011

Gephi Workshop 23 March 2011

It’s events like this that make me wish I were on the West Coast.

Even so, there are a number of resources listed for those of us who cannot attend.

From the website:

The next Gephi Workshop will be on Wednesday, March 23rd at 1PM at the IC classroom in Green Library.

I’ll occasionally be able to provide two-hour workshops on the basics of using Gephi, the network analysis package with which I’ve made the images and videos below. The workshops will focus on:

  • getting graph data into Gephi using .gexf, .csv and database connections
  • running Filters, Analytics and Layouts on the data
  • optimization of Gephi for large datasets
  • overview of layout algorithms and strategies for their use
  • creating dynamic (time-enabled) networks
  • general Q&A

Visualising Twitter Dynamics in Gephi, Part 1 – Post

Thursday, February 17th, 2011

Visualising Twitter Dynamics in Gephi, Part 1

From the post:

In the following posts I’m finally keeping my promise to explore in earnest the use of Gephi’s dynamic timeline feature for visualising Twitter-based discussions as they unfolded in real time. A few months ago, Jean posted a first glimpse of our then still very experimental data on Twitter dynamics, with a string of caveats attached – and I followed up on this a little while later with some background on the Gawk scripts we’re using to generate timeline data in GEXF format from our trusty Twapperkeeper archives (note that I’ve updated one of the scripts in that post, to make the process case-insensitive). Building on those posts, here I’ll outline the entire process and show some practical results (disclaimer: actual dynamic animations will follow in part two, tomorrow – first we’re focussing on laying the groundwork).

This article was mentioned in Dynamic Twitter graphs with R and Gephi (clip and code) as an interesting example of “aging” edges.

While there is an obvious time component to tweets, is there an implied relevancy based on time for other information as well?

Tactical information should be displayed to ground level commanders and be sans longer term planning data, while for command headquarters, tactical information is just clutter on the display.

Looks like a fruitful area for exploration.

Mapping Wikileaks’ Cablegate topics using Python, MongoDB, Neo4j and Gephi

Saturday, February 5th, 2011

Mapping Wikileaks’ Cablegate topics using Python, MongoDB, Neo4j and Gephi

Data and slides and movies while the conference is ongoing! Oh My!

This is the sort of effort that topic maps needs to step up to and compete against.

I have some thoughts on what that would take with the Afghan war diaries that I will be posting later today.

Mapping Wikileaks’ Cablegate using Python, mongoDB and Gephi – Saturday, 5 February 2011

Wednesday, February 2nd, 2011

Mapping Wikileaks’ Cablegate using Python, mongoDB and Gephi

From the website:

Text analysis and graph visualization on the Wikileaks Cablegate dataset.

We propose to present a complete work-flow of textual data analysis, from acquisition to visual exploration of a complex network. Through the presentation of a simple software specifically developed for this talk, we will cover a set of productive and widely used softwares and libraries in text analysis, then introduce some features of Gephi, an open-source network visualization & analysis software, using the data collected and transformed with cablegate-semnet.

See: cablegate-semnet

If you are in (or can be) Brussels, Belgium this coming Saturday and Sunday, don’t miss this presentation!

There will be many others worthy of your attention as well.

GSoC 2010 mid-term: Graph Streaming API – Post

Wednesday, January 26th, 2011

GSoC 2010 mid-term: Graph Streaming API by André Panisson.

From the blog:

The purpose of the Graph Streaming API project, run by André Panisson, is to build a unified framework for streaming graph objects. Gephi’s data structure and visualization engine has been built with the idea that a graph is not static and might change continuously. By connecting Gephi with external data-sources, we leverage its power to visualize and monitor complex systems or enterprise data in real-time. Moreover, the idea of streaming graph data goes beyond Gephi, and a unified and standardized API could bring interoperability with other available tools for graph and network analysis, as they could start to interoperate with other tools in a distributed and cooperative fashion.
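To make "streaming graph objects" a little more concrete: as far as I understand the streaming plugin, a stream is a sequence of small JSON events, with keys such as "an" (add node) and "ae" (add edge). Treat the event shapes below as my assumption and check the plugin documentation, since the format may have changed since this mid-term report. The sketch only builds the event lines and does not talk to Gephi.

```python
import json

def add_node(node_id, **attributes):
    """Build an assumed 'add node' event for the given node id."""
    return json.dumps({"an": {node_id: attributes}})

def add_edge(edge_id, source, target, directed=False, **attributes):
    """Build an assumed 'add edge' event linking two existing nodes."""
    body = {"source": source, "target": target, "directed": directed}
    body.update(attributes)
    return json.dumps({"ae": {edge_id: body}})

# Hypothetical two-node graph, emitted as one event per line.
events = [
    add_node("A", label="Node A"),
    add_node("B", label="Node B"),
    add_edge("AB", "A", "B"),
]
for line in events:
    print(line)
```

Because each event stands alone, a client can keep sending them as the underlying data changes, which is what makes the real-time monitoring described above possible.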

There are times when no comment seems adequate. This is one of those times.

Read the post, play with the code, follow the work (and support it!).

Gephi – The Open Graph Viz Platform

Sunday, August 8th, 2010

Gephi is an “interactive visualization and exploration platform” for graphs.

From the site:

  • Exploratory Data Analysis: intuition-oriented analysis by networks manipulations in real time.
  • Link Analysis: revealing the underlying structures of associations between objects, in particular in scale-free networks.
  • Social Network Analysis: easy creation of social data connectors to map community organizations and small-world networks.
  • Biological Network analysis: representing patterns of biological data.
  • Poster creation: scientific work promotion with hi-quality printable maps.

I find the notion of interaction with a graph, or in our case a topic map represented as a graph quite fascinating.

Imagine selecting or even adding properties as the basis for merging and then examining those results in an interactive rather than batch process.

Can “drag-n-drop” topic map authoring be that far away?