Archive for the ‘Github’ Category

The GitHub Data Challenge II

Friday, April 5th, 2013

From the webpage:

There are millions of projects on GitHub. Every day, people from around the world are working to make these projects better. Opening issues, pushing code, submitting Pull Requests, discussing project details — GitHub activity is a papertrail of progress. Have you ever wondered what all that data looks like? There are millions of stories to tell; you just have to look.

Last year we held our first data challenge. We saw incredible visualizations, interesting timelines and compelling analysis.

What stories will be told this year? It’s up to you!

To Enter

Send a link to a GitHub repository or gist with your graph(s) along with a description to data@github.com before midnight, May 8th, 2013 PST.

With the data set approaching 100M rows, how would you visualize it, and what questions would you explore?

GitHub Social Graphs with Groovy and GraphViz

Tuesday, May 29th, 2012

From the post:

Using the GitHub API, Groovy and GraphViz to determine, interpret and render a graph of the relationships between GitHub users based on the watchers of their repositories. The end result can look something like this.

[Image omitted. I started to embed the image, but at the narrow width of my blog it just didn't look good. See the post for the full-size version.]
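To give a rough feel for the idea without the image, here is a minimal sketch in Python (not the Groovy code from the post). It assumes the watcher lists have already been fetched and are held in a plain dictionary; the sample data, the `owner/repo` key format, and the function names `watcher_edges` and `to_dot` are all hypothetical, chosen for illustration only. It draws an edge from each watcher to the owner of the watched repository and emits GraphViz DOT text.

```python
def watcher_edges(watchers_by_repo):
    """Return sorted (watcher, owner) pairs, skipping users who watch their own repos."""
    edges = set()
    for repo, watchers in watchers_by_repo.items():
        # Assumption: keys look like "owner/repo".
        owner = repo.split("/", 1)[0]
        for watcher in watchers:
            if watcher != owner:
                edges.add((watcher, owner))
    return sorted(edges)


def to_dot(edges, name="github_watchers"):
    """Render the edge list as a GraphViz directed graph in DOT syntax."""
    lines = [f"digraph {name} {{"]
    for src, dst in edges:
        lines.append(f'    "{src}" -> "{dst}";')
    lines.append("}")
    return "\n".join(lines)


# Hypothetical sample data standing in for results fetched from the GitHub API.
SAMPLE = {
    "alice/parser": ["bob", "carol"],
    "bob/webapp": ["alice", "bob"],
}

if __name__ == "__main__":
    print(to_dot(watcher_edges(SAMPLE)))
```

Saving the output to a file such as `watchers.dot` and running `dot -Tpng watchers.dot -o watchers.png` should render it, assuming GraphViz is installed.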

A must-see for all Groovy fans!

For an alternative, see:

Mining GitHub – Followers in Tinkerpop

Pointers to social graphs for GitHub using other tools appreciated!

Mining GitHub – Followers in Tinkerpop

Monday, May 14th, 2012

Patrick Wagstrom writes:

Development of any moderately complex software package is a social process. Even if a project is developed entirely by a single person, there is still a social component that consists of all of the people who use the software, file bugs, and provide recommendations for enhancements. This social aspect is one of the driving forces behind the proliferation of social software development sites such as GitHub, SourceForge, Google Code, and BitBucket.

These sites combine together a variety of tools that are common for software development such as version control, bug trackers, mailing lists, release management, project planning, and wikis. In addition, some of these have more social aspects that allow you to find and follow individual developers or watch particular projects. In this post I’m going to show you how we can use some of this information to gain insight into a software development community, specifically the community around the Tinkerpop stack of tools for graph databases.

GitHub as a social community. Who knew? ;-)

A very instructive walk through Gremlin, GraphML, and R with a prepared data set. It doesn’t get much better than this!