Archive for the ‘Contest’ Category

Data Science Bowl 2018 – Spot Nuclei. Speed Cures.

Tuesday, January 16th, 2018

Spot Nuclei. Speed Cures.

From the webpage:

The 2018 Data Science Bowl offers our most ambitious mission yet: Create an algorithm to automate nucleus detection and unlock faster cures.

Compete on Kaggle

Three months. $100,000.

Even if you “lose,” think of the experience you will gain. No losers.


PS: Just thinking out loud, but if:

This dataset contains a large number of segmented nuclei images. The images were acquired under a variety of conditions and vary in the cell type, magnification, and imaging modality (brightfield vs. fluorescence). The dataset is designed to challenge an algorithm’s ability to generalize across these variations.

isn’t the ability to generalize, if it comes with lower performance on any particular condition, a downside?

Why not use the best algorithm for each specified set of data conditions, “merging” those algorithms so to speak, so that scientists always have the best algorithm for their specific data set?

So outside the contest, perhaps recognizing the conditions of the images is the most important task, so that images can be matched to the algorithms that perform best under those conditions.
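A minimal sketch of that idea in Python: classify the acquisition conditions first, then dispatch to whichever segmentation model benchmarks best under them. Everything here (the condition keys, the model names) is invented for illustration:

```python
# Hypothetical sketch: route each image to the segmentation model that
# benchmarks best under its acquisition conditions.

def detect_conditions(image_meta):
    # In practice this would be a classifier; here, a metadata lookup.
    return (image_meta["modality"], image_meta["stain"])

def segment_fluorescence(image):
    return f"fluorescence-model({image})"

def segment_brightfield(image):
    return f"brightfield-model({image})"

# Best-known model per condition, updated as benchmarks evolve.
BEST_MODEL = {
    ("fluorescence", "dapi"): segment_fluorescence,
    ("brightfield", "h&e"): segment_brightfield,
}

def segment(image, image_meta):
    return BEST_MODEL[detect_conditions(image_meta)](image)

print(segment("img01", {"modality": "brightfield", "stain": "h&e"}))
# brightfield-model(img01)
```

The dispatch table is the “merging”: one entry point for scientists, many condition-specific algorithms behind it.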

Anyone interested in collaborating on a topic map entry?

Re-imagining Legislative Data – Semantic Integration Alert

Tuesday, June 27th, 2017

Innovate, Integrate, and Legislate: Announcing an App Challenge by John Pull.

From the post:

This morning, on Tuesday, June 27, 2017, Library of Congress Chief Information Officer Bernard A. Barton, Jr., is scheduled to make an in-person announcement to the attendees of the 2017 Legislative Data & Transparency Conference in the CVC. Mr. Barton will deliver a short announcement about the Library’s intention to launch a legislative data App Challenge later this year. This pre-launch announcement will encourage enthusiasts and professionals to bring their app-building skills to an endeavor that seeks to create enhanced access and interpretation of legislative data.

The themes of the challenge are INNOVATE, INTEGRATE, and LEGISLATE. Mr. Barton’s remarks are below:

Here in America, innovation is woven into our DNA. A week from today our nation celebrates its 241st birthday, and those years have been filled with great minds who surveyed the current state of affairs, analyzed the resources available to them, and created devices, systems, and ways of thinking that created a better future worldwide.

The pantheon includes Benjamin Franklin, George Washington Carver, Alexander Graham Bell, Bill Gates, and Steve Jobs. It includes first-generation Americans like Nikola Tesla and Albert Einstein, for whom the nation was an incubator of innovation. And it includes brilliant women such as Grace Hopper, who led the team that invented the first computer language compiler, and Shirley Jackson, whose groundbreaking research with subatomic particles enabled the inventions of solar cells, fiber-optics, and the technology that brings us something we use every day: call waiting and caller ID.

For individuals such as these, the drive to innovate takes shape through understanding the available resources, surveying the landscape for what’s currently possible, and taking it to the next level. It’s the 21st Century, and society benefits every day from new technology, new generations of users, and new interpretations of the data surrounding us. Social media and mobile technology have rewired the flow of information, and some may say it has even rewired the way our minds work. So then, what might it look like to rewire the way we interpret legislative data?

It can be said that the legislative process – at a high level – is linear. What would it look like if these sets of legislative data were pushed beyond a linear model and into dimensions that are as-yet-unexplored? What new understandings wait to be uncovered by such interpretations? These understandings could have the power to evolve our democracy.

That’s a pretty grand statement, but it’s not without basis. The sets of data involved in this challenge are core to a legislative process that is centuries old. It’s the source code of American government. An informed citizenry is better able to participate in our democracy, and this is a very real opportunity to contribute to a better understanding of the work being done in Washington. It may even provide insights for the people doing the work around the clock, both on the Hill, and in state and district offices. Your innovation and integration may ultimately benefit the way our elected officials legislate for our future.

Improve the future, and be a part of history.

The 2017 Legislative Data App Challenge will launch later this summer. Over the next several weeks information will be made available at, and individuals are invited to connect via

I mention this as a placeholder only because Pull’s post is general enough to mean several things, their opposites, or something entirely different.

The gist of the post is that later this summer (2017), a challenge involving an “app” will be announced. The “app” will access/deliver/integrate legislative data. Beyond that, no further information is available at this time.

Watch for future posts as more information becomes available.

The Hack2Win 2017 5K – IP Address 1 July 2017

Monday, June 12th, 2017

No, not an annoying road race; that’s $5K in USD!

Hack2Win 2017 – The Online Version

From the post:

Want to get paid for a vulnerability similar to this one?

Contact us at:

We are proud to announce the first online hacking competition!

The rules are very simple – you need to hack the D-link router (AC1200 / DIR-850L) and you can win up to 5,000$ USD.

To try and help you win – we bought a D-link DIR-850L device and plugged it to the internet (we will disclose the IP address on 1st of July 2017) for you to try to hack it, while the WAN access is the only point of entry for this device, we will be accepting LAN vulnerabilities as well.

If you successfully hack it – submit your findings to us ssd[], you will get paid and we will report the information to the vendor.

The competition will end on the 1st of September 2017 or if a total of 10,000$ USD was handed out to eligible research.
… (emphasis in original)

Great opportunity to learn about the D-Link router (AC1200 / DIR-850L), because hacking it with already-known methods doesn’t count:

Usage of any known method of hacking – known methods including anything that we can use Google/Bing/etc to locate – this includes: documented default password (that cannot be changed), known vulnerabilities/security holes (found via Google, exploit-db, etc)

Makes me think having all the known vulnerabilities of the D-link router (AC1200 / DIR-850L) could be a competitive advantage.

Topic maps anyone?

PS: For your convenience, I have packaged up the D-Link files as of Monday, 12 June 2017 for the AC1200, hardware version A1,

Outbrain Challenges the Research Community with Massive Data Set

Sunday, November 13th, 2016

Outbrain Challenges the Research Community with Massive Data Set by Roy Sasson.

From the post:

Today, we are excited to announce the release of our anonymized dataset that discloses the browsing behavior of hundreds of millions of users who engage with our content recommendations. This data, which was released on the Kaggle platform, includes two billion page views across 560 sites, document metadata (such as content categories and topics), served recommendations, and clicks.

Our “Outbrain Challenge” is a call out to the research community to analyze our data and model user reading patterns, in order to predict individuals’ future content choices. We will reward the three best models with cash prizes totaling $25,000 (see full contest details below).

The sheer size of the data we’ve released is unprecedented on Kaggle, the competition’s platform, and is considered extraordinary for such competitions in general. Crunching all of the data may be challenging to some participants—though Outbrain does it on a daily basis.

The rules caution:

The data is anonymized. Please remember that participants are prohibited from de-anonymizing or reverse engineering data or combining the data with other publicly available information.

That would be a more interesting question than the ones presented for the contest.

After the 2016 U.S. presidential election we know that racists, sexists, nationalists, etc., are driven by single factors, so assuming you have good tagging, what’s the problem?


Or is human behavior not only complex but variable?

Good luck!

Improv at DARPA (No, Not Comedy)

Saturday, March 12th, 2016

Improv Proposers Day Webcast Special Notice March 29 and March 30, 2016 (DARPA SN-16-26)

From the notice:


The DARPA/DSO Improv program is seeking prototype products and systems that have the potential to threaten current military operations, equipment, or personnel and are assembled primarily from commercially available technology. The technology scope of Improv is broad, and the program is structured to encourage participation by a wide range of technical specialists, researchers, developers, and skilled hobbyists. Performers may reconfigure, repurpose, program, reprogram, modify, combine, or recombine commercially available technology in any way within the bounds of local, state, and federal laws and regulations. Use of components, products, and systems from non-military technical specialties (e.g., transportation, construction, maritime, and communications) is of particular interest.

A seven-hour webcast on each of two days, and recording it is prohibited:

Tuesday, March 29, 2016 at 10:00 a.m. – 5:00 p.m., and Wednesday, March 30, 2016 at 10:00 a.m. – 5:00 p.m.

No cost but pre-registration is required:

This looks like fun!

Your effort won’t be wasted in any event. If your idea isn’t funded here, you can still market it to others.

PS: I tried to register on 12 March 2016 and the website was down. 🙁 Will try again next week.

Microsoft Quantum Challenge [Deadline April 29, 2016.]

Tuesday, February 2nd, 2016

Microsoft Quantum Challenge

From the webpage:

Join students from around the world to investigate and solve problems facing the quantum universe using Microsoft’s simulator, LIQUi|>.

Win big prizes, or the opportunity to interview for internships at Microsoft Research.

Objectives of the Quantum Challenge

The Quantum Architectures and Computing Group (QuArC) is seeking exceptional students!

We want to find students who are eager to expand their knowledge of quantum computing, and who can translate thoughts into programs. Thereby we will expand the use of Microsoft’s Quantum Simulator LIQUi|>.

How to enter

First of all, REGISTER for the Challenge so that you can receive updates about the contest.

In the challenge you will use the LIQUi|> simulator to solve a novel problem and then report on your findings. So, think of a project. Then, download the simulator from GitHub and work with it to solve your problem. Finally, write a report about your findings and submit it. Your report submission will enter you into the Challenge.

In the report, present a description of the project including goals, methods, challenges, and any result obtained using LIQUi|>. You do not need to submit circuits and the software you develop, however, sample input and output for LIQUi|> must be submitted to show you used the simulator in the project. Your entry must consist of six pages or less, in PDF format.

The Challenge is open to students at colleges and universities world-wide (with a few restrictions) and aged 18+. NO PURCHASE NECESSARY. For full details, see the Official Rules

The prizes

The Quantum Challenge is your chance to win a big prize!

  • First Prize:  $5,000
  • Second Prizes:   Four at $2,500
  • Honorary Mention: Certificates will be presented to runner-up entries

Extra – visits or internship interviews

As a result of the challenge, some entrants could be invited to visit the QuArC team at Microsoft Research in Redmond, or have an opportunity to interview for internships at Microsoft Research. Internships are highly prestigious and involve working with the QuArC team for 12 weeks on cutting edge research.

If you are young enough to enter, just a word of warning about the “big prize.” $5,000 today isn’t a “big prize.” Maybe a nice weekend if you keep it low key but only just.

Interaction with the QuArC team, either by winning or in online discussions is the real prize.

Besides, who needs $5,000 if you can break quantum encrypted bank transfer orders? 😉

Introducing Kaggle Datasets [No Data Feudalism Here]

Saturday, January 23rd, 2016

Introducing Kaggle Datasets

From the post:

At Kaggle, we want to help the world learn from data. This sounds bold and grandiose, but the biggest barriers to this are incredibly simple. It’s tough to access data. It’s tough to understand what’s in the data once you access it. We want to change this. That’s why we’ve created a home for high quality public datasets, Kaggle Datasets.

Kaggle Datasets has four core components:

  • Access: simple, consistent access to the data with clear licensing
  • Analysis: a way to explore the data without downloading it
  • Results: visibility to the previous work that’s been created on the data
  • Conversation: forums and comments for discussing the nuances of the data

Are you interested in publishing one of your datasets on Kaggle Datasets? Submit a sample here.

Unlike some medievalists who publish in the New England Journal of Medicine, Kaggle not only makes the data sets freely available, but offers tools to help you along.

Kaggle will also assist you in making your datasets available as well.

Science Bowl [Different from the Quiche Bowl?]

Sunday, December 20th, 2015

Even basic cable has an overwhelming number of “bowl” (American football) games. Mostly corporate sponsor names, although the “Cure Bowl” was sponsored by AutoNation. It’s for a worthy cause (breast cancer research) but that isn’t obvious from a TV listing.

If you aren’t interested in encouraging physical injuries, including concussions, you have to look elsewhere for bowl game excitement.

Have you considered the Second Annual Data Science Bowl?

From the web page:

We all have a heart. Although we often take it for granted, it’s our heart that gives us the moments in life to imagine, create, and discover. Yet cardiovascular disease threatens to take away these moments. Each day, 1,500 people in the U.S. alone are diagnosed with heart failure—but together, we can help. We can use data science to transform how we diagnose heart disease. By putting data science to work in the cardiology field, we can empower doctors to help more people live longer lives and spend more time with those that they love.

Declining cardiac function is a key indicator of heart disease. Doctors determine cardiac function by measuring end-systolic and end-diastolic volumes (i.e., the size of one chamber of the heart at the beginning and middle of each heartbeat), which are then used to derive the ejection fraction (EF). EF is the percentage of blood ejected from the left ventricle with each heartbeat. Both the volumes and the ejection fraction are predictive of heart disease. While a number of technologies can measure volumes or EF, Magnetic Resonance Imaging (MRI) is considered the gold standard test to accurately assess the heart’s squeezing ability.


The challenge with using MRI to measure cardiac volumes and derive ejection fraction, however, is that the process is manual and slow. A skilled cardiologist must analyze MRI scans to determine EF. The process can take up to 20 minutes to complete—time the cardiologist could be spending with his or her patients. Making this measurement process more efficient will enhance doctors’ ability to diagnose heart conditions early, and carries broad implications for advancing the science of heart disease treatment.

The 2015 Data Science Bowl challenges you to create an algorithm to automatically measure end-systolic and end-diastolic volumes in cardiac MRIs. You will examine MRI images from more than 1,000 patients. This data set was compiled by the National Institutes of Health and Children’s National Medical Center and is an order of magnitude larger than any cardiac MRI data set released previously. With it comes the opportunity for the data science community to take action to transform how we diagnose heart disease.

This is not an easy task, but together we can push the limits of what’s possible. We can give people the opportunity to spend more time with the ones they love, for longer than ever before.
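As an aside, once an algorithm has produced the two volumes, the ejection fraction the quote describes is simple arithmetic; a minimal sketch (the sample volumes are illustrative, not from the contest data):

```python
def ejection_fraction(edv_ml, esv_ml):
    """EF (%) = (EDV - ESV) / EDV * 100, i.e. stroke volume over end-diastolic volume."""
    return (edv_ml - esv_ml) / edv_ml * 100.0

# Illustrative values for a healthy left ventricle: EDV 120 mL, ESV 50 mL.
print(round(ejection_fraction(120.0, 50.0), 1))  # 58.3
```

The hard part of the contest is estimating EDV and ESV from the MRI stacks, not this last step.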


  • February 29, 2016 – First submission and team merger deadline. Your team must make its first submission by this deadline. This is also the last day you may merge with another team.
  • March 7, 2016 – Stage one deadline and stage two data release. Your model must be finalized and uploaded to Kaggle by this deadline. After this deadline, the test set is released, the answers to the validation set are released, and participants make predictions on the test set.
  • March 14, 2016 – Final submission deadline.

Motivations (in no particular order):

  • Bragging rights!
  • Experience with a complex data modeling problem.
  • Prizes:
    • 1st place – $125,000
    • 2nd place – $50,000
    • 3rd place – $25,000
  • Substantial contribution to bioinformatics/heart research

I first saw this in a tweet by Kirk Borne.

3 ways to win “Practical Data Science with R”! (Contest ends December 12, 2015 at 11:59pm EST)

Friday, December 4th, 2015

3 ways to win “Practical Data Science with R”!.

Renee is running a contest to give away three copies of “Practical Data Science with R” by Nina Zumel and John Mount!

You must enter on or before December 12, 2015 at 11:59pm EST.

Three ways to win, see Renee’s post for the details!

Bytes that Rock! Software Awards 2015 (Nominations Open Now – Close 16th November 2015)

Friday, November 13th, 2015

Bytes that Rock! Software Awards 2015 (Nominations Open Now – Close 16th November 2015)

An awards program for excellence in software and blogs!

The only limitation I could find is:

Bytes that Rock recognizes the best software and blogs for their excellence in the past 12 months.

Your game/software/blog may have been excellent three (3) years ago but that doesn’t count. 😉

Subject to that mild limitation, step up and:

Submit a blog, software, or game by clicking on the categories below!

Software blogs
VideoGame blogs
Security blogs

PC Software
Software UI
Innovative Software
Protection Software
Open Source Software

PC Games
Indie Games
Mods for games

This is not a “next week,” “after I ask X,” or “when I get home” task.

This is a hit a submit link now task!

You will feel better after having made a nomination. Promise. 😉


(Select the graphic for a much larger version of the image.)

Data Science Challenge 3

Friday, October 24th, 2014

Data Science Challenge 3

From the post:

Challenge Period

The Fall 2014 Data Science Challenge runs October 11, 2014 through January 21, 2015.

Challenge Prerequisite

You must pass Data Science Essentials (DS-200) prior to registering for the Challenge.

Challenge Description

The Fall 2014 Data Science Challenge incorporates three independent problems derived from real-world scenarios and data sets. Each problem has its own data, can be solved independently, and should take you no longer than eight hours to complete. The Fall 2014 Challenge includes problems dealing with online travel services, digital advertising, and social networks.

Problem 1: SmartFly
You have been contacted by a new online travel service called SmartFly. SmartFly provides its customers with timely travel information and notifications about flights, hotels, destination weather, and airport traffic, with the goal of making your travel experience smoother. SmartFly’s product team has come up with the idea of using the flight data that it has been collecting to predict whether customers’ flights will be delayed in order to respond proactively. The team has now contacted you to help test out the viability of the idea. You will be given SmartFly’s data set from January 1 to September 30, 2014 and be asked to return a list of upcoming flights sorted from the most likely to the least likely to be delayed.
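The SmartFly deliverable is just a ranking by predicted delay probability. A hedged sketch, with a stand-in scoring function where a trained model would go (the feature names are invented):

```python
# Sketch of the SmartFly deliverable: score upcoming flights, then
# sort most-likely-delayed first. predict_delay_probability is a
# stand-in for a real trained classifier.

def predict_delay_probability(flight):
    # Hypothetical features: historical delay rates for route and carrier.
    return 0.6 * flight["route_delay_rate"] + 0.4 * flight["carrier_delay_rate"]

def rank_flights(flights):
    return sorted(flights, key=predict_delay_probability, reverse=True)

upcoming = [
    {"id": "SF101", "route_delay_rate": 0.10, "carrier_delay_rate": 0.20},
    {"id": "SF102", "route_delay_rate": 0.50, "carrier_delay_rate": 0.30},
]
print([f["id"] for f in rank_flights(upcoming)])  # ['SF102', 'SF101']
```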

Problem 2: Almost Famous
Congratulations! You have just published your first book on data science, advanced analytics, and predictive modeling. You’ve also decided to use your skills as a data scientist to build and optimize a website that promotes your book, and you have started several ad campaigns on a popular search engine in order to drive traffic to your site. Using your skills in data munging and statistical analysis, you will be asked to evaluate the performance of a series of campaigns directed towards site visitors using the log data in Hadoop as your source of truth.

Problem 3: WINKLR
WINKLR is a curiously popular social network for fans of the 1970s sitcom Happy Days. Users can post photos, write messages, and, most importantly, follow each other’s posts. This helps members keep up with new content from their favorite users. To help its users discover new people to follow on the site, WINKLR is building a new machine learning system called The Fonz to predict who a given user might like to follow. Phase One of The Fonz project is underway. The engineers can export the entire user graph as tuples. You have joined the Fonz project to implement Phase Two, which improves on this result. Given the user graph and the list of frequent-click tuples, you are being asked to select a 70,000 tuple subset in “user1,user2” format, where you believe user1 is most likely to want to follow user2. These will result in emails to the users, inviting them to follow the recommended user.
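A common baseline for this kind of “who to follow” task is a common-neighbors score over the follow graph; a toy sketch (a real entry would also weight the frequent-click tuples and keep only the top 70,000 pairs):

```python
from collections import defaultdict
from itertools import combinations

def common_neighbor_scores(follows):
    """Score candidate (user1, user2) pairs by how many followees they share."""
    followees = defaultdict(set)
    for follower, followee in follows:
        followees[follower].add(followee)
    scores = {}
    for u1, u2 in combinations(sorted(followees), 2):
        overlap = len(followees[u1] & followees[u2])
        # Only recommend pairs that overlap and don't already follow.
        if overlap and u2 not in followees[u1]:
            scores[(u1, u2)] = overlap
    return scores

graph = [("a", "x"), ("a", "y"), ("b", "x"), ("b", "y"), ("b", "z")]
print(common_neighbor_scores(graph))  # {('a', 'b'): 2}
```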

Prize for success: CCP: Data Scientist status

Great way to start 2015!

I first saw this in a tweet by Sarah.

Magna Carta Ballot – Deadline 31 October 2014

Tuesday, October 7th, 2014

Win a chance to see all four original 1215 Magna Carta manuscripts together for the first time #MagnaCartaBallot

From the post:

Magna Carta is one of the world’s most influential documents. Created in 1215 by King John and his barons, it has become a potent symbol of liberty and the rule of law.

Eight hundred years later, all four surviving original manuscripts are being brought together for the first time on 3 February 2015. The British Library, Lincoln Cathedral and Salisbury Cathedral have come together to stage a one-off, one-day event sponsored by Linklaters.

This is your chance to be part of history as we give 1,215 people the unique opportunity to see all four Magna Carta documents at the British Library in London.

The unification ballot to win tickets is free to enter. The closing date is 31 October 2014.

According to the FAQ you have to get yourself to London on the specified date and required time.

Good luck!

Apps for Energy

Friday, January 31st, 2014

Apps for Energy

Deadline: March 9, 2014

From the webpage:

The Department of Energy is awarding $100,000 in prizes for the best web and mobile applications that use one or more featured APIs, standards or ideas to help solve a problem in a unique way.

Submit an application by March 9, 2014!

Not much in the way of semantic integration opportunities, at least as the contest is written.

Still, it is an opportunity to work with government data and there is a chance you could win some money!

The Structure Data awards:… [Vote For GraphLab]

Wednesday, January 29th, 2014

The Structure Data awards: Honoring the best data startups of 2013 by Derrick Harris.

From the post:

Data is taking over the world, which makes for an exciting time to be covering information technology. Almost every new company understands the importance of analyzing data, and many of their products — from fertility apps to stream-processing engines — are based on this understanding. Whether it’s helping users do new things or just do the same old things better, data analysis really is changing the enterprise and consumer technology spaces, and the world, in general.

With that in mind, we have decided to honor some of the most-promising, innovative and useful data-based startups with our inaugural Structure Data awards. The criteria were simple. Companies (or projects) must have launched in 2013; must have been covered in Gigaom; and, most importantly, must make the collection and analysis of data a key part of the user experience. Identifying these companies was the easy part; the hard part was paring down the list of categories and candidates to a reasonable number.

Just a quick heads-up about the Readers’ Choice awards at Gigaom. Voting closes 14 February 2014.

If you need a suggestion under Machine Learning/AI, vote for GraphLab!

Want to win $1,000,000,000 (yes, that’s one billion dollars)?

Wednesday, January 22nd, 2014

Want to win $1,000,000,000 (yes, that’s one billion dollars)? by Ann Drobnis.

The offer is one billion dollars for picking the winners of every game in the NCAA men’s basketball tournament in the Spring of 2014.

Unfortunately, none of the news stories I saw had links back to any authentic information from Quicken Loans and Berkshire Hathaway about the offer.

After some searching I found: Win a Billion Bucks with the Quicken Loans Billion Dollar Bracket Challenge by Clayton Closson, on January 21, 2014 on the Quicken Loans blog. (As far as I can tell it is an authentic post on the QL website.)

From that post:

You could be America’s next billionaire if you’re the grand prize winner of the Quicken Loans Billion Dollar Bracket Challenge. You read that right: one billion. Not one million. Not one hundred million. Not five hundred million. One billion U.S. dollars.

All you have to do is pick a perfect tournament bracket for the upcoming 2014 tournament. That’s it. Guess all the winners of all the games correctly, and Quicken Loans, along with Berkshire Hathaway, will make you a billionaire. The official press release is below. The contest starts March 3, 2014, so we’ll soon have all the info on how and when to enter your perfect bracket.

Good luck, my friends. This is your chance to play in perhaps the biggest sweepstakes in U.S. history. It’s your chance for a billion.

Oh, and by the way, the 20 closest imperfect brackets will win a cool hundred grand to put toward their home (or new home). Plus, in conjunction with the sweepstakes, Quicken Loans will donate $1 million to Detroit and Cleveland nonprofits to help with education of inner city youth.

So, to recap: If you’re perfect, you’ll win a billion. If you’re not perfect, you could win $100,000. The entry period begins Monday, March 3, 2014 and runs until Wednesday, March 19, 2014. Stay tuned on how to enter.

Contest updates at:

The odds against winning are absurd but this has all the markings of a big data project. Historical data, current data on the teams and players, models, prior outcomes to test your models, etc.
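For a sense of scale: a 64-team single-elimination bracket has 63 games, so coin-flip picks are perfect with probability 1 in 2^63, and even an implausibly good 70%-per-game model still leaves odds in the billions to one:

```python
games = 63  # a 64-team, single-elimination field

# Coin-flip picks: one perfect bracket in 2**63.
coin_flip_odds = 2 ** games
print(f"{coin_flip_odds:,}")  # 9,223,372,036,854,775,808

# A (generous) model that calls every game at 70% accuracy:
p_perfect = 0.7 ** games
print(f"about 1 in {1 / p_perfect:,.0f}")
```

That second number lands somewhere around five or six billion, which is why the historical and player data matter: every point of per-game accuracy moves the odds by orders of magnitude.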

I wonder if Watson likes basketball?

Social Science Dataset Prize!

Wednesday, January 22nd, 2014

Statwing is awarding $1,500 for the best insights from its massive social science dataset by Derrick Harris.

All submissions are due through the form on this page by January 30 at 11:59pm PST.

From the post:

Statistics startup Statwing has kicked off a competition to find the best insights from a 406-variable social science dataset. Entries will be voted on by the crowd, with the winner getting $1,000, second place getting $300 and third place getting $200. (Check out all the rules on the Statwing site.) Even if you don’t win, though, it’s a fun dataset to play with.

The data comes from the General Social Survey and dates back to 1972. It contains variables ranging from sex to feelings about education funding, from education level to whether respondents think homosexual men make good parents. I spent about an hour slicing and dicing variable within the Statwing service, and found some at least marginally interesting stuff. Contest entries can use whatever tools they want, and all 79 megabytes and 39,662 rows are downloadable from the contest page.

Time is short so you better start working.

The rules page, where you make your submission, emphasizes:

Note that this is a competition for the most interesting finding(s), not the best visualization.

Use any tool or method, just find the “most interesting finding(s)” as determined by crowd vote.

On the dataset:

Every other year since 1972, the General Social Survey (GSS) has asked thousands of Americans 90 minutes of questions about religion, culture, beliefs, sex, politics, family, and a lot more. The resulting dataset has been cited by more than 14,000 academic papers, books, and dissertations—more than any except the U.S. Census.
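If you crunch the GSS extract outside Statwing, a cross-tab of two variables takes only a few lines; the variable names and rows below are invented for illustration, not actual GSS codes:

```python
from collections import Counter

# Invented respondent rows; real GSS variables have their own codes.
rows = [
    {"year": 1972, "confidence_in_press": "high"},
    {"year": 1972, "confidence_in_press": "low"},
    {"year": 2012, "confidence_in_press": "low"},
    {"year": 2012, "confidence_in_press": "low"},
]

# Count respondents per (year, answer) cell.
crosstab = Counter((r["year"], r["confidence_in_press"]) for r in rows)
print(crosstab[(2012, "low")])  # 2
```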

I can’t decide if Americans have more odd opinions now than before. 😉

Maybe some number crunching will help with that question.

Neo4j GraphGist December Challenge

Sunday, December 8th, 2013

Neo4j GraphGist December Challenge

Meetup Slides say: Deadline for entry is January 31st (2014). I mention that because the webpage still says Dec 31, 2013.

From the webpage:

This time we want you to look into these 10 categories and provide us with really easy to understand and still insightful Graph Use-Cases: Do not take the example keywords literally, you know your domain much better than we do!

  • Education – Schools, Universities, Courses, Planning, Management etc
  • Finance – Loans, Risks Fraud
  • Life Science – Biology, Genetics, Drug research, Medicine, Doctors, Referrals
  • Manufacturing – production line management, supply chain, parts list, product lines
  • Sports – Football, Baseball, Olympics, Public Sports
  • Resources – Energy Market, Consumption, Resource exploration, Green Energy, Climate Modeling
  • Retail – Recommendations, Product categories, Price Management, Seasons, Collections
  • Telecommunication – Infrastructure, Authorization, Planning, Impact
  • Transport – Shipping, Logistics, Flights, Cruises, Road/Train optimizations, Schedules
  • Advanced Graph Gists – for those of you that run outside of the competition anyway, give your best 🙂


In each of our 10 categories we want to offer Amazon gift-cards valued:

  1. Winner: 300 USD
  2. Second: 150 USD
  3. Third: 50 USD
  4. Every participant gets a special GraphGist t-shirt too.

In addition to the resources at the webpage, you may find AsciiDoc Cheatsheet helpful.

The meetup video where the GraphGist was announced.

Easy to understand graph use cases should not be too difficult.

Easy to solve graph use cases, that may be another matter. 😉

Vidi Competition [Closes 14th February 2014]

Monday, December 2nd, 2013

Vidi Competition by Marieke Guy. (A public email notice I received today.)

At the start of November the LinkedUp Project launched the second in our LinkedUp Challenge – the Vidi Competition.

For the Vidi Competition we are inviting you to design and build innovative and robust prototypes and demos for tools that analyse and/or integrate open web data for educational purposes. The competition will run from 4th November 2013 till 14th February 2014. Prizes (up to €3,000 for first) will be awarded at the European Semantic Web Conference in Crete, Greece in May 2014. You can find out full details on the LinkedUp Challenge Website.

For this Competition we have one open track and two focused tracks that may guide teams or provide inspiration.

We’ve recently published blog posts on the tracks:

  • Pathfinder: Using linked data to ease access to recommendations and guidance
  • Simplificator: Using linked data to add context to domain-specific resources

There is also a blog post detailing the technical support we can offer.

We’d like to complement these posts with an online webinar which will introduce LinkedUp and the Vidi Competition. There will also be an opportunity to ask our technical support team questions and find out more about the data sets available. The webinar will take approximately 45 minutes and will be recorded.

The webinar is still in planning but is likely to take place in the next couple of weeks, if you are interested in participating please register your email address and we will share times with you.

A collection of suggested data sources can be found at the LinkedUp Data Repository

The overall theme of the competition:

We’re inviting you to design and build innovative and robust prototypes and demos for tools that analyse and/or integrate open web data for educational purposes. You can submit your Web application, App, analysis toolkit, documented API or any other tool that connects, exploits or analyses open or linked data and that addresses real educational needs. Your tool still may contain some bugs, as long as it has a stable set of features and you have some proof that it can be deployed on a realistic scale.

You could approach this competition several ways:

  1. Do straight linked data as a credential of your ability to produce and use linked data.
  2. Do straight linked data and supplement it with a topic map, either separately or as part of the competition.
  3. Create a solution (topic maps and/or linked data) and approach people with an interest in these resources.

A regular reader of this blog recently reminded me people are not shopping for topic maps (or linked data) but for results. (That’s #3 in my list.)

BRDI Announces Data and Information Challenge

Thursday, October 10th, 2013

BRDI Announces Data and Information Challenge by Stephanie Hagstrom.

From the post:

The National Academy of Sciences Board on Research Data and Information (BRDI; announces an open challenge to increase awareness of current issues and opportunities in research data and information. These issues include, but are not limited to, accessibility, integration, discoverability, reuse, sustainability, perceived versus real value and reproducibility.

A Letter of Intent is requested by December 1, 2013 and the deadline for final entries is May 15, 2014.

Awardees will be invited to present their projects at the National Academy of Sciences in Washington DC as part of a symposium of the regularly scheduled Board of Research Data and Information meeting in the latter half of 2014.

More information is available at Please contact Cheryl Levey ( with any questions.

This looks quite interesting.

The main site reports:

The National Academy of Sciences Board on Research Data and Information (BRDI; is holding an open challenge to increase awareness of current issues and opportunities in research data and information. These issues include, but are not limited to, accessibility, integration, discoverability, reuse, sustainability, perceived versus real value and reproducibility. Opportunities include, but are not limited to, analyzing such data and information in new ways to achieve significant societal benefit.

Entrants are expected to describe one or more of the following:

  • Novel ideas
  • Tools
  • Processes
  • Models
  • Outcomes

using research data and information. There is no restriction on the type of data or information, or the type of innovation that can be described. All data and tools that form the basis of a contestant’s entry must be made freely and openly available. The challenge is held in memory of Lee Dirks, a pioneer in scholarly communication.

Anticipated outcomes of the challenge include the potential for original and innovative solutions to societal problems using existing research data and information, national recognition for the successful contestants and possibly their institutions.

Looks ideal for a topic map-based proposal.

Suggestions on data sets?

Legislative XML Data Mapping [$10K]

Friday, September 13th, 2013

Legislative XML Data Mapping (Library of Congress)

First, the important stuff:

First Place: $10K

Entry due by: December 31 at 5:00pm EST

Second, the details:

The Library of Congress is sponsoring two legislative data challenges to advance the development of international data exchange standards for legislative data. These challenges are an initiative to encourage broad participation in the development and application of legislative data standards and to engage new communities in the use of legislative data. Goals of this initiative include:
• Enabling wider accessibility and more efficient exchange of the legislative data of the United States Congress and the United Kingdom Parliament,
• Encouraging the development of open standards that facilitate better integration, analysis, and interpretation of legislative data,
• Fostering the use of open source licensing for implementing legislative data standard.

The Legislative XML Data Mapping Challenge invites competitors to produce a data map for US bill XML and the most recent Akoma Ntoso schema and UK bill XML and the most recent Akoma Ntoso schema. Gaps or issues identified through this challenge will help to shape the evolving Akoma Ntoso international standard.

The winning solution will win $10,000 in cash, as well as opportunities for promotion, exposure, and recognition by the Library of Congress. For more information about prizes please see the Official Rules.
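At its simplest, the data map the challenge asks for is an element-to-element correspondence between two schemas. Here is a toy sketch of that idea in Python, using hypothetical element names (the real US bill and Akoma Ntoso schemas are far richer, and a real entry would also need to handle attributes, namespaces, and structural differences):

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping from US bill XML element names to Akoma Ntoso names.
US_TO_AKN = {
    "bill": "bill",
    "legis-body": "body",
    "section": "section",
    "header": "heading",
}

def rename_elements(elem, mapping):
    """Recursively rename elements per the mapping; unknown tags pass through,
    which makes gaps in the crosswalk easy to spot."""
    elem.tag = mapping.get(elem.tag, elem.tag)
    for child in elem:
        rename_elements(child, mapping)
    return elem

src = ET.fromstring(
    "<bill><legis-body><section><header>Short title</header>"
    "</section></legis-body></bill>"
)
out = rename_elements(src, US_TO_AKN)
# out serializes with Akoma Ntoso-style tag names in place of the US ones.
```

The gaps and issues the challenge mentions would show up here as source elements with no sensible target, or targets whose structure the source cannot supply.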

Can you guess what tool or technique I would suggest that you use? 😉

The winner is announced February 12, 2014 at 5:00pm EST.

Too late for the holidays this year, too close to Valentine's Day, what holiday will you be wanting to celebrate?

Ponder This

Thursday, August 1st, 2013

Ponder This: August 2013

From the webpage:

Welcome to our monthly puzzles

You are cordially invited to match wits with some of the best minds in IBM Research.

Seems some of us can’t see a problem without wanting to take a crack at solving it. Does that sound like you? Good. Forge ahead and ponder this month’s problem. We’ll post a new one every month, and allow a month for you to submit solutions (we may even publish submitted answers, especially if they’re correct). We usually won’t reply individually to submitted solutions but every few days we will update a list of people who answered correctly. At the beginning of the next month, we’ll post the answer.

The August 2013 puzzle starts off:

Put five-bit numbers on the vertices of a 9-dimensional hypercube such that, from any vertex, you can reach any number in no more than two moves along the edges of the hypercube.

A worked example using four-bit numbers on a 5-dimensional hypercube is given.
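For readers who want to experiment, the puzzle's validity condition is easy to check mechanically: from every vertex, every possible number must appear on some vertex at most two edge moves away. A sketch in Python (the labeling itself is yours to invent; this only verifies a candidate):

```python
def within_two_moves(v, n):
    """All vertices of the n-dimensional hypercube reachable from v
    in at most two moves along edges (a move flips one bit)."""
    reachable = {v}
    for i in range(n):
        reachable.add(v ^ (1 << i))                  # one move
        for j in range(n):
            reachable.add(v ^ (1 << i) ^ (1 << j))   # two moves (j == i returns to v)
    return reachable

def labeling_is_valid(labels, n, k):
    """labels maps each vertex of the n-cube to a k-bit number.
    Valid when every k-bit number is within two moves of every vertex."""
    all_numbers = set(range(1 << k))
    return all(
        {labels[u] for u in within_two_moves(v, n)} >= all_numbers
        for v in range(1 << n)
    )
```

For the August puzzle itself, n = 9 and k = 5, so `labeling_is_valid` checks all 512 vertices against all 32 numbers.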

Are higher dimension data structures in your future?

British Library Labs – Competition 2013

Sunday, May 5th, 2013

British Library Labs – Competition 2013

Deadline for entry: Wednesday 26 June, 2013 (midnight GMT)

From the webpage:

We want you to propose an innovative and transformative project using the British Library’s digital collections and if your idea is chosen, the Labs team will work with you to make it happen and you could win a prize of up to £3,000.

From the digitisation of thousands of books, newspapers and manuscripts, the curation of UK websites, bird sounds or location data for our maps, over the last two decades we’ve been faithfully amassing a vast and wide-ranging number of digital collections for the nation. What remains elusive, however, is understanding what researchers need in place in order to unlock the potential for new discoveries within these fascinating and diverse sets of digital content.

The Labs competition is designed to attract scholars, explorers, trailblazers and software developers who see the potential for new and innovative research and development opportunities lurking within these immense digital collections. Through soliciting imaginative and transformative projects utilising this content you will be giving us a steer as to the types of new processes, platforms, arrangements, services and tools needed to make it more accessible. We’ll even throw the Library’s resources behind you to make your idea a reality.

Numerous ways to get support for developing your idea before submission.

In terms of PR for your solution (hopefully topic maps based) do note:


Winners will get direct curatorial and financial support for completing their project from the Labs team, which may involve an expenses paid residency at the British Library for a mutually agreed period of time (dependent on the winners’ circumstances, the winning ideas, access to resources and budget allowing).

  • Winners will receive £3000 for completing their project
  • Runners-up will receive £1000 for completing their project

The work will take place between Saturday July 6 and Monday 4 November, 2013, with the completed projects being showcased during November 2013 when prizes will be awarded.

What happens to your ideas?

All ideas will be posted on the Labs website after they have been judged. All project ideas submitted for the competition can continue to be worked on and where possible the Labs team will provide support (time and resources permitting). Well developed projects will be showcased together with the competition winners during November 2013.

This is also a good excuse to spend more time at the British Library website. I don’t spend nearly enough time there myself.

KDD Cup 2013 – Author-Paper Identification Challenge

Thursday, April 18th, 2013

KDD Cup 2013 – Author-Paper Identification Challenge

Started: 3:47 am, Thursday 18 April 2013 UTC
Ends: 12:00 am, Wednesday 12 June 2013 UTC (54 total days)

From the post:

The ability to search literature and collect/aggregate metrics around publications is a central tool for modern research. Both academic and industry researchers across hundreds of scientific disciplines, from astronomy to zoology, increasingly rely on search to understand what has been published and by whom.

Microsoft Academic Search is an open platform that provides a variety of metrics and experiences for the research community, in addition to literature search. It covers more than 50 million publications and over 19 million authors across a variety of domains, with updates added each week. One of the main challenges of providing this service is caused by author-name ambiguity. On one hand, there are many authors who publish under several variations of their own name. On the other hand, different authors might share a similar or even the same name.

As a result, the profile of an author with an ambiguous name tends to contain noise, resulting in papers that are incorrectly assigned to him or her. This KDD Cup task challenges participants to determine which papers in an author profile were truly written by a given author.
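A common first step in author-name disambiguation of this kind is "blocking": grouping records by a crude normalized name key before doing any finer comparison. A minimal sketch of that step (not the contest's prescribed method):

```python
def name_key(name):
    """Crude blocking key: (last name, first initial), lowercased.
    Records sharing a key become candidates for closer comparison."""
    parts = name.replace(".", " ").split()
    return (parts[-1].lower(), parts[0][0].lower())
```

Note the key deliberately over-groups: "John Smith" and "Jane Smith" share a key, which is exactly the ambiguity the contest asks entrants to resolve with further evidence such as co-authors, venues, and topics.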

$7,500 and bragging rights.

Is there going to be a topic map entry this year?

Increasing Interoperability of Data for Social Good [$100K]

Saturday, March 23rd, 2013

Increasing Interoperability of Data for Social Good

March 4, 2013 through May 7, 2013 11:30 AM PST

Each Winner to Receive $100,000 Grant

Got your attention? Good!

From the notice:

The social sector is full of passion, intuition, deep experience, and unwavering commitment. Increasingly, social change agents from funders to activists, are adding data and information as yet one more tool for decision-making and increasing impact.

But data sets are often isolated, fragmented and hard to use. Many organizations manage data with multiple systems, often due to various requirements from government agencies and private funders. The lack of interoperability between systems leads to wasted time and frustration. Even those who are motivated to use data end up spending more time and effort on gathering, combining, and analyzing data, and less time on applying it to ongoing learning, performance improvement, and smarter decision-making.

It is the combining, linking, and connecting of different “data islands” that turns data into knowledge – knowledge that can ultimately help create positive change in our world. Interoperability is the key to making the whole greater than the sum of its parts. The Bill & Melinda Gates Foundation, in partnership with Liquidnet for Good, is looking for groundbreaking ideas to address this significant, but solvable, problem. See the website for more detail on the challenge and application instructions. Each challenge winner will receive a grant of $100,000.

From the details website:

Through this challenge, we’re looking for game-changing ideas we might never imagine on our own and that could revolutionize the field. In particular, we are looking for ideas that might provide new and innovative ways to address the following:

  • Improving the availability and use of program impact data by bringing together data from multiple organizations operating in the same field and geographical area;
  • Enabling combinations of data through application programming interface (APIs), taxonomy crosswalks, classification systems, middleware, natural language processing, and/or data sharing agreements;
  • Reducing inefficiency for users entering similar information into multiple systems through common web forms, profiles, apps, interfaces, etc.;
  • Creating new value for users trying to pull data from multiple sources;
  • Providing new ways to access and understand more than one data set, for example, through new data visualizations, including mashing up government and other data;
  • Identifying needs and barriers by experimenting with increased interoperability of multiple data sets;
  • Providing ways for people to access information that isn’t normally accessible (for example, using natural language processing to pull and process stories from numerous sources) and combining that information with open data sets.

Successful Proposals Will Include:

  • Identification of specific data sets to be used;
  • Clear, compelling explanation of how the solution increases interoperability;
  • Use case;
  • Description of partnership or collaboration, where applicable;
  • Overview of how solution can be scaled and/or adapted, if it is not already cross-sector in nature;
  • Explanation of why the organization or group submitting the proposal has the capacity to achieve success;
  • A general approach to ongoing sustainability of the effort.

I could not have written a more topic map oriented challenge. You?
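One of the listed ideas, the "taxonomy crosswalk," boils down to a lookup table between classification schemes. A toy sketch with made-up codes (any real crosswalk would need curation, many-to-many handling, and versioning):

```python
# Made-up crosswalk between two hypothetical nonprofit classification schemes.
CROSSWALK = {"EDU-01": "B20", "HLT-03": "E70"}

def translate(records, field, crosswalk):
    """Re-key one dataset's category codes into the other scheme,
    leaving unmapped codes untouched so coverage gaps stay visible."""
    return [{**r, field: crosswalk.get(r[field], r[field])} for r in records]

rows = [{"org": "A", "code": "EDU-01"}, {"org": "B", "code": "XYZ-99"}]
merged_view = translate(rows, "code", CROSSWALK)
# "A" now carries code "B20"; the unmapped "XYZ-99" passes through unchanged.
```

Once two datasets share a code scheme, the "combinations of data" the challenge asks for reduce to ordinary joins.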

They suggest the usual social data sites.

Apache Solr 4 Cookbook (Win a free copy)

Saturday, March 16th, 2013

Apache Solr 4 Cookbook (Win a free copy)

Deadline 28.03.2013.

From the post:

Readers would be pleased to know that we have teamed up with Packt Publishing to organize a Giveaway of the Apache Solr 4 Cookbook. Two lucky winners will win a copy of the book (in eBook format). Keep reading to find out how you can be one of the Lucky Winners.

Let’s start with a little reminder about the book:

  • Learn how to make Apache Solr search faster, more complete, and comprehensively scalable
  • Solve performance, setup, configuration, analysis, and query problems in no time
  • Get to grips with, and master, the new exciting features of Apache Solr 4

Read more about this book and download free Sample Chapter.

How to Enter?

All you need to do is head on over to the book page (Apache Solr 4 Cookbook) and look through the product description of the book and drop a line via the comments below this post to let us know what interests you the most about this book. It’s that simple.

Product Description:


The contest will close on 28.03.2013. Winners will be contacted by email, so be sure to use your real email address when you comment!

Who Will Win?

The winners will be chosen randomly by the team from readers entering the competition who replied with an on-topic comment.

If you want to increase your chances of winning, write a small review of the book using the sample chapter on and also forward the same post to

You would know I would see this contest two (2) days after purchasing an electronic copy of this book!

I may enter the contest anyway so I can forward someone the “extra” copy of it.

Netflix Cloud Prize [$10K plus other stuff]

Friday, March 15th, 2013

Netflix Cloud Prize

Duration of Contest: 13th March 2013 to 15th September 2013.

From github:

This contest is for software developers.

Step 0 – You need your own GitHub account

Step 1 – Read the rules in the Wiki

Step 2 – Fork this repo to your own GitHub account

Step 3 – Send us your email address

Step 4 – Modify your copy of the repo as your Submission


We want you to build something cool using or modifying our open source software. Your submission will be a standalone program or a patch for one of our open source projects. Your submission will be judged in these categories:

  1. Best Example Application Mash-Up

  2. Best New Monkey

  3. Best Contribution to Code Quality

  4. Best New Feature

  5. Best Contribution to Operational Tools, Availability, and Manageability

  6. Best Portability Enhancement

  7. Best Contribution to Performance Improvements

  8. Best Datastore Integration

  9. Best Usability Enhancement

  10. Judges Choice Award

If you win, you’ll get US$10,000 cash, US$5000 AWS credits, a trip to Las Vegas for two, a ticket to Amazon’s user conference, and fame and notoriety (at least within Netflix Engineering).

I can see several of those categories where topic maps would make a nice fit.


Yes, I have an ulterior motive. Having topic maps underlying one or more winners or even runners-up in this contest would promote topic maps and gain needed visibility.

I first saw this at: $10k prizes up for grabs in Netflix cloud contest by Elliot Bentley.

Competition: visualise open government data and win $2,000

Wednesday, February 13th, 2013

Competition: visualise open government data and win $2,000 by Simon Rogers.

Closing date: 23:59 BST on 2 April 2013

What can you do with the thousands of open government datasets? With Google and Open Knowledge Foundation we are launching a competition to find the best dataviz out there. You might even win a prize.

(graphic omitted)

Governments around the world are releasing a tidal wave of open data – on everything from spending through to crime and health. Now you can compare national, regional and city-wide data from hundreds of locations around the world.

But how good is this data? We want to see what you can do with it. What apps and visualisations can you make with this data? We want to see how the data changes the way you see the world.

In conjunction with Google and the Open Knowledge Foundation (who will be helping us judge the results), see if you can win the $2,000 prize.

All we want you to do is to take an open dataset from any government open data website (there’s a list of them at the bottom of this article) and visualise it.

The competition is open to citizens of the UK, US, France, Germany, Spain, Netherlands, Sweden. The winner will take home $2,000 and the result will be published on the Guardian Datastore on our Show and Tell site.

Here are some of the key datasets we’ve found (list below) – and feel free to bring your own data to the party – we only ask that it is freely available and open as in

You are visualizing data anyway, why not take a chance on free PR and $2,000?

The Power of Semantic Diversity

Sunday, February 10th, 2013

Prize-based contests can provide solutions to computational biology problems by Karim R Lakhani, et al. (Nature Biotechnology 31, 108–111 (2013) doi:10.1038/nbt.2495)

From the article:

Advances in biotechnology have fueled the generation of unprecedented quantities of data across the life sciences. However, finding analysts who can address such ‘big data’ problems effectively has become a significant research bottleneck. Historically, prize-based contests have had striking success in attracting unconventional individuals who can overcome difficult challenges. To determine whether this approach could solve a real big-data biologic algorithm problem, we used a complex immunogenomics problem as the basis for a two-week online contest broadcast to participants outside academia and biomedical disciplines. Participants in our contest produced over 600 submissions containing 89 novel computational approaches to the problem. Thirty submissions exceeded the benchmark performance of the US National Institutes of Health’s MegaBLAST. The best achieved both greater accuracy and speed (1,000 times greater). Here we show the potential of using online prize-based contests to access individuals without domain-specific backgrounds to address big-data challenges in the life sciences.


Over the last ten years, online prize-based contest platforms have emerged to solve specific scientific and computational problems for the commercial sector. These platforms, with solvers in the range of tens to hundreds of thousands, have achieved considerable success by exposing thousands of problems to larger numbers of heterogeneous problem-solvers and by appealing to a wide range of motivations to exert effort and create innovative solutions [18, 19]. The large number of entrants in prize-based contests increases the probability that an ‘extreme-value’ (or maximally performing) solution can be found through multiple independent trials; this is also known as a parallel-search process [19]. In contrast to traditional approaches, in which experts are predefined and preselected, contest participants self-select to address problems and typically have diverse knowledge, skills and experience that would be virtually impossible to duplicate locally [18]. Thus, the contest sponsor can identify an appropriate solution by allowing many individuals to participate and observing the best performance. This is particularly useful for highly uncertain innovation problems in which prediction of the best solver or approach may be difficult and the best person to solve one problem may be unsuitable for another [19].
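The "parallel-search" claim — that the best result over many independent attempts improves with the size of the pool — is easy to illustrate with a quick simulation (illustrative distribution, not the paper's data):

```python
import random

def best_of(n_solvers, rng):
    """Best score among n independent attempts, each drawn from the
    same (illustrative) uniform skill distribution."""
    return max(rng.random() for _ in range(n_solvers))

rng = random.Random(42)
trials = 200
small_pool = sum(best_of(5, rng) for _ in range(trials)) / trials
large_pool = sum(best_of(500, rng) for _ in range(trials)) / trials
# The expected maximum of k uniform draws is k/(k+1), so the larger
# pool's best result is reliably higher.
```

The simulation assumes identically distributed solvers; the paper's stronger point is that contest pools are also more *heterogeneous* than a preselected expert panel, which widens the tail further.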

An article that merits wider reading than it is likely to get behind a pay-wall.

A semantically diverse universe of potential solvers is more effective than a semantically monotone group of selected experts.

An indicator of what to expect from the monotone logic of the Semantic Web.

Good for scheduling tennis matches with Tim Berners-Lee.

For more complex tasks, rely on semantically diverse groups of humans.

I first saw this at: Solving Big-Data Bottleneck: Scientists Team With Business Innovators to Tackle Research Hurdles.

Call for KDD Cup Competition Proposals

Sunday, February 10th, 2013

Call for KDD Cup Competition Proposals

From the post:

Please let us know if you are interested in being considered for the 2013 KDD Cup Competition by filling out the form below.

This is the official call for proposals for the KDD Cup 2013 competition. The KDD Cup is the well known data mining competition of the annual ACM SIGKDD International Conference on Knowledge Discovery and Data Mining. KDD-2013 conference will be held in Chicago from August 11 – 14, 2013. The competition will last between 6 and 8 weeks and the winners should be notified by end-June. The winners will be announced in the KDD-2013 conference and we are planning to run a workshop as well.

A good competition task is one that is practically useful, scientifically or technically challenging, can be done without extensive application domain knowledge, and can be evaluated objectively. Of particular interest are non-traditional tasks/data that require novel techniques and/or thoughtful feature construction.

Proposals should involve data and a problem whose successful completion will result in a contribution of some lasting value to a field or discipline. You may assume that Kaggle will provide the technical support for running the contest. The data needs to be available no later than mid-March.

If you have initial questions about the suitability of your data/problem feel free to reach out to claudia.perlich [at]

Do you have:

non-traditional tasks/data that require[s] novel techniques and/or thoughtful feature construction?

Is collocation of information on the basis of multi-dimensional subject identity a non-traditional task?

Does extraction of multiple dimensions of a subject identity from users require novel techniques?

If so, what data sets would you suggest using in this challenge?
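To make "multi-dimensional subject identity" concrete: one hypothetical merging rule treats two records as the same subject when they agree on any one sufficient set of identifying properties. A sketch (the field names and rules are invented for illustration):

```python
def same_subject(a, b, identity_rules):
    """Two records denote the same subject if they agree, with non-null
    values, on every key in at least one rule (a hypothetical merging rule)."""
    return any(
        all(a.get(k) is not None and a.get(k) == b.get(k) for k in rule)
        for rule in identity_rules
    )

# Hypothetical identity dimensions: an ORCID alone suffices,
# or name and affiliation together.
rules = [("orcid",), ("name", "affiliation")]
r1 = {"name": "J. Smith", "affiliation": "MIT", "orcid": None}
r2 = {"name": "J. Smith", "affiliation": "MIT", "orcid": "0000-0001"}
r3 = {"name": "J. Smith", "affiliation": "Oxford"}
```

Collocating records under such multi-property rules, rather than a single key, is the non-traditional twist a topic map entry would bring to a KDD Cup task.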

I first saw this at: 19th ACM SIGKDD Knowledge Discovery and Data Mining Conference.

International Space Apps Challenge

Monday, February 4th, 2013

International Space Apps Challenge

From the webpage:

The International Space Apps Challenge is a two-day technology development event during which citizens from around the world will work together to address current challenges relevant to both space exploration and social need.

NASA believes that mass collaboration is key to creating and discovering state-of-the-art technology. The International Space Apps Challenge aims to engage YOU in developing innovative solutions to our toughest challenges.

Join us on April 20-21, 2013, as we join together cities around the world to be part of pioneering the future. Sign up to be notified when registration opens in early 2013!

The list of challenges will be released around March 15th.

I won’t be able to attend in person but would be interested in participating with others should a semantic integration challenge come up.

I first saw this at: NASA launches second International Space Apps Challenge by Alex Howard.