Archive for the ‘Public Data’ Category

Enigma

Friday, May 10th, 2013

Enigma

I suppose it had to happen. With all the noise about public data sets, someone was bound to create a startup to search them. ;-)

Not a lot of detail at the site but you can sign up for a free trial.

Features:

100,000+ Public Data Sources: Access everything from import bills of lading, to aircraft ownership, lobbying activity, real estate assessments, spectrum licenses, financial filings, liens, government spending contracts and much, much more.

Augment Your Data: Get a more complete picture of investments, customers, partners, and suppliers. Discover unseen correlations between events, geographies and transactions.

API Access: Get direct access to the data sets, relational engine and NLP technologies that power Enigma.

Request Custom Data: Can’t find a data set anywhere else? Need to synthesize data from disparate sources? We are here to help.

Discover While You Work: Never miss a critical piece of information. Enigma uncovers entities in context, adding intelligence and insight to your daily workflow.

Powerful Context Filters: Our vast collection of public data sits atop a proprietary data ontology. Filter results by topics, tags and source to quickly refine and scope your query.

Focus on the Data: Immerse yourself in the details. Data is presented in its raw form, full screen and without distraction.

Curated Metadata: Source data is often unorganized and poorly documented. Our domain experts focus on sanitizing, organizing and annotating the data.

Easy Filtering: Rapidly prototype hypotheses by refining and shaping data sets in context. Filter tools allow the sorting, refining, and mathematical manipulation of data sets.

The “proprietary data ontology” jumps out at me as an obvious question. Do users get to know what the ontology is?

Not to mention the “our domain experts focus on sanitizing,….” claim. That works for some cases; take legal research, for example. I am not sure that “your” experts work as well as “my” experts in less focused areas.

Looking forward to learning more about Enigma!

Scenes from a Dive

Wednesday, March 20th, 2013

Scenes from a Dive – what’s big data got to do with fighting poverty and fraud? by Prasanna Lal Das.

From the post:

A more detailed recap will follow soon but here’s a very quick hats off to the about 150 data scientists, civic hackers, visual analytics savants, poverty specialists, and fraud/anti-corruption experts that made the Big Data Exploration at Washington DC over the weekend such an eye-opener. We invite you to explore the work that the volunteers did (these are rough documents and will likely change as you read them so it’s okay to hold off if you would rather wait for a ‘final’ consolidated document). The projects that the volunteers worked on include:

Here are some visualizations that some project teams built. A few photos from the event are here (thanks @neilfantom). More coming soon (and yes, videos too!). Thanks @francisgagnon for the first blog about the event. The event hashtag was #data4good (follow @datakind and @WBopenfinances for more updates on Twitter).

Great meeting and projects, but I would suggest a different sort of “big data”:

A better start would be to require recipients to grant reporting access to all bank accounts where funds will be transferred, and to require the same of any entity paid out of those accounts, continuing down the chain until the transfers to any one entity (or related entity) total less than $1,000 over 90 days.

With the exception of the “related entity” information, banks already keep transfer of funds information as a matter of routine business. It would be “big data” that is rich in potential for spotting fraud and waste.

The reporting banks should also be required to deliver other banking records they have on the accounts where funds are transferred and other activity in those accounts.

Before crying “invasion of privacy,” remember World Bank funding is voluntary.

As is acceptance of payment from World Bank funded projects. Anyone and everyone is free to decline such funding and avoid the proposed reporting requirements.

“Big data” to track fraud and waste is already collected by the banking industry.

The question is whether we will use that “big data” to effectively track fraud and waste or wait for particularly egregious cases to come to light?
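To make the proposed tracing rule concrete, here is a minimal sketch in Python. Everything in it is an assumption made for illustration: the transfer records, the entity names and the single fixed 90-day window are invented, and the “related entity” test is left out because banks do not hold that information.

```python
from collections import defaultdict, deque
from datetime import date, timedelta

THRESHOLD = 1000              # dollars, from the proposal above
WINDOW = timedelta(days=90)   # reporting window, from the proposal above


def entities_to_report(transfers, recipients, window_end):
    """Return every entity that falls under the reporting rule.

    transfers:  iterable of (payer, payee, amount, day) tuples (hypothetical shape).
    recipients: entities that received the funds directly.
    window_end: last day of the 90-day window being examined.
    """
    window_start = window_end - WINDOW

    # Total paid from each payer to each payee inside the window.
    totals = defaultdict(lambda: defaultdict(float))
    for payer, payee, amount, day in transfers:
        if window_start < day <= window_end:
            totals[payer][payee] += amount

    # Breadth-first walk of the payment graph, starting at the recipients.
    reported = set(recipients)
    queue = deque(recipients)
    while queue:
        payer = queue.popleft()
        for payee, total in totals.get(payer, {}).items():
            # Stop following the money once the total is under the threshold.
            if total >= THRESHOLD and payee not in reported:
                reported.add(payee)
                queue.append(payee)
    return reported


# Invented sample data: two small payments to shell_co add up to $1,100.
SAMPLE_TRANSFERS = [
    ("ministry", "contractor", 50000, date(2013, 3, 1)),
    ("contractor", "supplier", 1200, date(2013, 3, 5)),
    ("contractor", "cafe", 40, date(2013, 3, 6)),
    ("supplier", "shell_co", 600, date(2013, 3, 10)),
    ("supplier", "shell_co", 500, date(2013, 3, 12)),
]
traced = entities_to_report(SAMPLE_TRANSFERS, ["ministry"], date(2013, 3, 31))
```

The point of summing per payee is that payments split to stay under $1,000 are still caught, while the cafe that received $40 drops out of the chain.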

February NYC DataKind Meetup (video)

Friday, March 15th, 2013

February NYC DataKind Meetup (video)

From the post:

A video of our February NYC DataKind Meetup is online for those of you who couldn’t join us in New York. Hear about the projects our amazing Data Ambassadors are working on with Medic Mobile, Sunlight Foundation, and Refugees United as well as listen to Anoush Tatevossian from the UN Global Pulse talk about how the UN is using data for the greater good. It was a fantastic event and we’re thrilled to get to share it with all of you.

A great meeting format: beer before and during the presentations.

Need to recommend that format to Balisage.

None for the speaker, though; they could be the “designated driver” before and during their presentation.

New Army Guide to Open-Source Intelligence

Sunday, September 16th, 2012

New Army Guide to Open-Source Intelligence

If you don’t know Full Text Reports, you should.

A top-tier research professional’s hand-picked selection of documents from academe, corporations, government agencies, interest groups, NGOs, professional societies, research institutes, think tanks, trade associations, and more.

You will winnow some chaff but also find jewels like Open Source Intelligence (PDF).

From the post:

  • Provides fundamental principles and terminology for Army units that conduct OSINT exploitation.
  • Discusses tactics, techniques, and procedures (TTP) for Army units that conduct OSINT exploitation.
  • Provides a catalyst for renewing and emphasizing Army awareness of the value of publicly available information and open sources.
  • Establishes a common understanding of OSINT.
  • Develops systematic approaches to plan, prepare, collect, and produce intelligence from publicly available information from open sources.

Impressive intelligence overview materials.

It would be nice to re-work it into a topic map intelligence approach document, with the ability to insert a client’s name and industry-specific examples. It has that militaristic tone that is hard to capture with civilian writers.

Importing public data with SAS instructions into R

Wednesday, July 11th, 2012

Importing public data with SAS instructions into R by David Smith.

From the post:

Many public agencies release data in a fixed-format ASCII (FWF) format. But with the data all packed together without separators, you need a “data dictionary” defining the column widths (and metadata about the variables) to make sense of them. Unfortunately, many agencies make such information available only as a SAS script, with the column information embedded in a PROC IMPORT statement.

David reports on the SAScii package from Anthony Damico.

You still have to parse the files but it gets you one step closer to having useful information.
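For readers who have not met fixed-width files, here is a minimal sketch in Python of what the “data dictionary” buys you. It is not how SAScii works internally; the layout, variable names and sample lines below are all invented for illustration, and a real layout would come from the agency's SAS script.

```python
import io

# Hypothetical data dictionary: (variable name, column width).
LAYOUT = [("state", 2), ("year", 4), ("population", 8)]


def parse_fixed_width(lines, layout):
    """Yield one dict per line, slicing each field by its declared width."""
    for line in lines:
        line = line.rstrip("\n")
        if not line:
            continue  # skip blank lines
        record, start = {}, 0
        for name, width in layout:
            # Fields are padded with spaces, so strip them after slicing.
            record[name] = line[start:start + width].strip()
            start += width
        yield record


# Invented sample: no separators, so the widths are the only guide.
SAMPLE = io.StringIO("NY201019378102\nGA2010 9687653\n")
rows = list(parse_fixed_width(SAMPLE, LAYOUT))
```

Without the widths, a line like `GA2010 9687653` is just a run of characters; with them, it splits cleanly into three variables.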

Data-gov Wiki

Monday, June 27th, 2011

Data-gov Wiki

From the wiki:

The Data-gov Wiki is a project being pursued in the Tetherless World Constellation at Rensselaer Polytechnic Institute. We are investigating open government datasets using semantic web technologies. Currently, we are translating such datasets into RDF, getting them linked to the linked data cloud, and developing interesting applications and demos on linked government data. Most of the datasets shown on this page come from the US government’s data.gov Web site, although some are from other countries or non-government sources.

Try out their Drupal site with new demos:

Linking Open Government Data

Setting to one side my misgivings about the “openness” that releasing government data brings, the Drupal site is a job well done and merits your attention.
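As a rough idea of what “translating such datasets into RDF” involves, here is a minimal sketch in Python that turns rows of a CSV file into N-Triples. It is not the project's own converter; the namespace, the naming scheme for rows and properties, and the sample data are all assumptions made for illustration.

```python
import csv
import io
from urllib.parse import quote

BASE = "http://example.org/dataset/"  # hypothetical namespace


def escape_literal(value):
    """Escape the characters N-Triples does not allow raw in a literal."""
    return (value.replace("\\", "\\\\")
                 .replace('"', '\\"')
                 .replace("\n", "\\n")
                 .replace("\r", "\\r"))


def row_to_ntriples(row_id, row, base=BASE):
    """Return one triple per column: the row is the subject, the column the predicate."""
    subject = f"<{base}row/{row_id}>"
    triples = []
    for column, value in row.items():
        # Percent-encode the column name so it is safe inside an IRI.
        predicate = f"<{base}property/{quote(column, safe='')}>"
        triples.append(f'{subject} {predicate} "{escape_literal(value)}" .')
    return triples


# Invented sample data standing in for a government dataset.
SAMPLE_CSV = io.StringIO("agency,budget\nEPA,8700\n")
triples = []
for number, row in enumerate(csv.DictReader(SAMPLE_CSV), start=1):
    triples.extend(row_to_ntriples(number, row))
```

Every value here is a plain string literal. Linking into the linked data cloud, as the wiki describes, would mean replacing some of those literals with IRIs that other datasets also use.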

Open Government Data 2011 wrap-up

Sunday, June 19th, 2011

Open Government Data 2011 wrap-up by Lutz Maicher.

From the post:

On June 16, 2011 the OGD 2011 – the first Open Data Conference in Austria – took place. Thanks to a lot of preliminary work of the Semantic Web Company the topic open (government) data is very hot in Austria, especially in Vienna and Linz. Hence 120 attendees (see the list here) for the first conference is a real success. Congrats to the organizers. And congrats to the community which made the conference to a very vital and interesting event.

If there is a Second Open Data Conference, it is a venue where topic maps should put in an appearance.

PublicData.EU Launched During DAA

Sunday, June 19th, 2011

PublicData.EU Launched During DAA

From the post:

During the Digital Agenda Assembly this week in Brussels the new portal PublicData.EU was launched in beta. This is a step aimed to make public data easier to find across the EU. As it says on the ‘about’ page:

“In order to unlock the potential of digital public sector information, developers and other prospective users must be able to find datasets they are interested in reusing. PublicData.eu will provide a single point of access to open, freely reusable datasets from numerous national, regional and local public bodies throughout Europe.

Information about European public datasets is currently scattered across many different data catalogues, portals and websites in many different languages, implemented using many different technologies. The kinds of information stored about public datasets may vary from country to country, and from registry to registry. PublicData.eu will harvest and federate this information to enable users to search, query, process, cache and perform other automated tasks on the data from a single place. This helps to solve the “discoverability problem” of finding interesting data across many different government websites, at many different levels of government, and across the many governments in Europe.

In addition to providing access to official information about datasets from public bodies, PublicData.eu will capture (proposed) edits, annotations, comments and uploads from the broader community of public data users. In this way, PublicData.eu will harness the social aspect of working with data to create opportunities for mass collaboration. For example, a web developer might download a dataset, convert it into a new format, upload it and add a link to the new version of the dataset for others to use. From fixing broken URLs or typos in descriptions to substantive comments or supplementary documentation about using the datasets, PublicData.eu will provide up to date information for data users, by data users.”

PublicData.EU is built by the Open Knowledge Foundation as part of the LOD2 project. “PublicData.eu is powered by CKAN, a data catalogue system used by various institutions and communities to manage open data. CKAN and all its components are open source software and used by a wide community of catalogue operators from across Europe, including the UK Government’s data.gov.uk portal.”
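For a sense of what searching a CKAN catalogue looks like from code, here is a minimal sketch in Python. It assumes a portal running a CKAN release that exposes the version 3 Action API (`package_search`), which may not match the beta described above; the host name and the sample response are invented, and no network request is made.

```python
import json
from urllib.parse import urlencode


def package_search_url(base_url, query, rows=10):
    """Build a dataset search URL for a CKAN portal (Action API, version 3)."""
    params = urlencode({"q": query, "rows": rows})
    return f"{base_url.rstrip('/')}/api/3/action/package_search?{params}"


def dataset_titles(response_text):
    """Pull dataset titles out of a package_search JSON response."""
    payload = json.loads(response_text)
    if not payload.get("success"):
        raise ValueError("CKAN reported a failed request")
    # Fall back to the dataset's name when it has no title.
    return [d.get("title") or d["name"] for d in payload["result"]["results"]]


# Hypothetical portal address; substitute a real CKAN instance.
url = package_search_url("https://catalogue.example.org/", "air quality", rows=5)

# Invented response in the shape the Action API returns.
SAMPLE_RESPONSE = json.dumps({
    "success": True,
    "result": {
        "count": 2,
        "results": [
            {"name": "air-quality-vienna", "title": "Air Quality Vienna"},
            {"name": "air-quality-linz", "title": ""},
        ],
    },
})
titles = dataset_titles(SAMPLE_RESPONSE)
```

Note what comes back: names, titles and other catalogue metadata about datasets, matched by keyword.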

Here’s a European marketing opportunity for topic maps. How would a topic map solution be different from what is offered here? (There are similar opportunities in the US as well.)