Archive for the ‘EU’ Category

DCAT Application Profile for Data Portals in Europe – Final Draft

Wednesday, May 22nd, 2013

DCAT Application Profile for Data Portals in Europe – Final Draft

From the post:

The DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe. Its basic use case is to enable a cross-data portal search for data sets and make public sector data better searchable across borders and sectors. This can be achieved by the exchange of descriptions of data sets among data portals.

This final draft is open for public review until 10 June 2013. Members of the public are invited to download the specification and post their comments directly on this page. To be able to do so you need to be registered and logged in.

If you are interested in integration of data from European data portals, it is worth the time to register, etc.

It is not all the data you will need to integrate a data set, but it is at least a start in the right direction.
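As a rough illustration of what exchanging data set descriptions makes possible, here is a minimal sketch of a cross-portal search. The keys borrow property names from DCAT and Dublin Core (dct:title, dcat:keyword, dct:publisher), but the records, the portal names, and the `search` helper are all invented for the example and are not taken from the DCAT-AP specification.

```python
# Hypothetical data set descriptions harvested from two portals.
# Keys borrow DCAT / Dublin Core property names; the values are made up.
PORTAL_A = [
    {"dct:title": "Road freight transport",
     "dcat:keyword": ["transport", "freight"],
     "dct:publisher": "Portal A"},
    {"dct:title": "Fertility indicators",
     "dcat:keyword": ["population"],
     "dct:publisher": "Portal A"},
]
PORTAL_B = [
    {"dct:title": "Rail freight volumes",
     "dcat:keyword": ["transport", "rail"],
     "dct:publisher": "Portal B"},
]


def search(catalogs, term):
    """Return (publisher, title) pairs whose title or keywords contain term."""
    term = term.lower()
    hits = []
    for catalog in catalogs:
        for record in catalog:
            haystack = [record["dct:title"]] + record["dcat:keyword"]
            if any(term in text.lower() for text in haystack):
                hits.append((record["dct:publisher"], record["dct:title"]))
    return hits


results = search([PORTAL_A, PORTAL_B], "freight")
for publisher, title in results:
    print(publisher, "-", title)
```

The search only works because both portals use the same property names for their descriptions, which is the point of an application profile.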

Pan-European open data…

Wednesday, March 13th, 2013

Pan-European open data available online from EuroGeographics

From the post:

Data compiled from national mapping supplied by 45 European countries and territories can now be downloaded for free at

From today (8 March 2013), the 1:1 million scale topographic dataset, EuroGlobalMap will be available free of charge for any use under a new open data licence. It is produced using authoritative geo-information provided by members of EuroGeographics, the Association for European Mapping, Cadastre and Land Registry Authorities.


“World leaders acknowledge the need for further mainstream sustainable development at all levels, integrating economic, social and environmental aspects and recognising their inter-linkages,” she said. [EuroGeographics’ President, Ingrid Vanden Berghe]

“Geo-information is key. It provides a vital link among otherwise unconnected information and enables the use of location as the basis for searching, cross-referencing, analysing and understanding Europe-wide data.”

Geographic location is a common binding point for information.

Interesting to think about geographic steganography. Right latitude but wrong longitude, or other variations.

Six Degrees of Francis Bacon…

Friday, March 8th, 2013

Six Degrees of Francis Bacon, a 17th century social network by Nathan Yau.

From the post:

Network of Francis Bacon

Nathan points us to a project to determine the relationships of Francis Bacon:

Six Degrees of Francis Bacon.

Imagine that instead of collecting “door pass” data in the Man Bites Dog story about the influence of special interests in the EU Parliament, the study collected financial, social, education, and other relationships with members of the EU Parliament and the favors it bestows.

Same outcome? Or different?

EU Commission – Open Data Portal Open

Tuesday, February 26th, 2013

EU Commission – Open Data Portal Open

From the post:

The European Union Commission has unveiled a new Open Data Portal, with over 5,580 data sets – the majority of which comes from the Eurostat (the statistical office of the European Union). The portal is the result of the Commission’s ‘Open Data Strategy for Europe’, and will publish data from the European Commission and other bodies of the European Union; it already holds data from the European Environment Agency.

The portal has a SPARQL endpoint to provide linked data, and will also feature applications that use this data. The published data can be downloaded by everyone interested to facilitate reuse, linking and the creation of innovative services. This shows the commitment of the Commission to the principles of openness and transparency.

For more information

If the Commission is committed to “principles of openness and transparency,” when can we expect to see:

  1. Rosters of the institutions and individual participants in EU funded research from 1980 to present?
  2. Economic analysis of the results of EU funded projects, on a project by project basis, from 1980 to present?

Note that from 1984 to 2013, the total research funding exceeds EUR 118 billion.

To be fair, CORDIS: Community Research and Development Information Service has report summaries and project reports for FP5, FP6 and FP7. And CORDIS Search Service provides coverage back to the early 1980’s.

About Projects on Cordis has a wealth of information to guide searching into EU funded research.

While a valuable resource, CORDIS requires the extraction of detailed information on a project by project basis, making large scale analysis difficult if not prohibitively expensive.

PS: Of the 5855 datasets, some 5680 were previously published by Eurostat; the European Environment Agency accounts for 106. Perhaps a net increase of 59 datasets over those previously available.
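A count of datasets per publisher, like the one in the PS, is the sort of question a SPARQL endpoint can answer directly. The sketch below only builds the query and the request URL; it sends nothing. The endpoint address is a placeholder, and the query assumes the portal types its records as dcat:Dataset and names the source with dct:publisher, neither of which is confirmed by the post.

```python
from urllib.parse import urlencode

# Placeholder endpoint: the post does not give the real address.
ENDPOINT = "https://example.org/sparql"

# Assumes records are typed dcat:Dataset and carry dct:publisher;
# that modelling is a guess, not something the post states.
QUERY = """
PREFIX dcat: <http://www.w3.org/ns/dcat#>
PREFIX dct:  <http://purl.org/dc/terms/>
SELECT ?publisher (COUNT(?dataset) AS ?datasets)
WHERE {
  ?dataset a dcat:Dataset ;
           dct:publisher ?publisher .
}
GROUP BY ?publisher
ORDER BY DESC(?datasets)
"""


def build_request_url(endpoint, query):
    """Encode the query as the 'query' parameter of a SPARQL protocol GET."""
    return endpoint + "?" + urlencode({"query": query.strip()})


url = build_request_url(ENDPOINT, QUERY)
print(url)
```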

LobbyPlag: compares text of EU regulation with texts of lobbyists’ proposals

Wednesday, February 13th, 2013

LobbyPlag: compares text of EU regulation with texts of lobbyists’ proposals

From the post:

A service called LobbyPlag lets users view provisions of EU regulations and compare them to provisions of lobbyists’ proposals.

The example currently available on LobbyPlag concerns the General Data Protection Regulation (GDPR).

Click here to see how LobbyPlag compares the GDPR’s forum shopping provision to what the site claims are lobbyists’ proposals for that provision.

LobbyPlag is an interesting use of legal text comparison tools to promote transparency.

See the original post for more details and links.

Another step in the right direction.
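The post does not say how LobbyPlag performs its comparison, so the following is only a sketch of one generic approach, using Python's standard difflib module at the word level. The two "provisions" are invented placeholder sentences, not GDPR or lobbyist wording, and the helper names are hypothetical.

```python
import difflib

# Invented placeholder texts, not actual regulation or lobbyist wording.
proposal = "The controller shall notify the authority without undue delay."
regulation = ("The controller shall notify the supervisory authority "
              "without undue delay.")


def similarity(a, b):
    """Ratio in [0, 1] of matching words between two texts."""
    return difflib.SequenceMatcher(None, a.split(), b.split()).ratio()


def added_words(a, b):
    """Words that a word-level diff reports as present in b but not in a."""
    return [line[2:] for line in difflib.ndiff(a.split(), b.split())
            if line.startswith("+ ")]


score = similarity(proposal, regulation)
print(round(score, 2))
print(added_words(proposal, regulation))
```

A high similarity score with a short list of added or removed words is the pattern a reader would look for when asking whether a provision tracks a proposal closely.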

Simulating the European Commission

Saturday, February 2nd, 2013

Did you see Gary Marcus’ “We are not yet ready to simulate the brain,” in last Thursday’s Financial Times?

Gary writes:

The 10-year €1.19bn project to simulate the entire human brain, announced on Monday by the European Commission is, at about a sixth of the cost of the Large Hadron Collider, the biggest neuroscience project undertaken. It is an important, but flawed, step to a better understanding of the organ’s workings.

His analysis is telling but he misses the true goal of the project even as he writes:

Even so, it could foster a great deal of useful science. The crucial question is how the money will be spent. Much of the infrastructure developed will serve a vast number of projects, and the funding will support more than 250 scientists from more than 80 institutions, each with his or her own research agenda. A great many, such as Yadin Dudai (who specialises in memory), Seth Grant (who studies the genetics and evolution of neural function) and Stanislas Dehaene (who works on the brain basis of mathematics and consciousness), are stellar.

Supporting researchers, +1! Building the infrastructure of drones, managers, auditors, meeting coordinators and the like for this project, -1!

Every field of research could benefit from the funding that will now be diverted into “infrastructure” that exists only to be “infrastructure” (read employment).

My counter proposal is to simulate the EU commission using Steven Santy’s online “Magic Eight Ball.”

Put the question: Should project [name] be funded? to the Magic Eight Ball as many times as there are EU votes on projects and sum the answers.

Would avoid some of the “infrastructure” expenses and result in equivalent funding decisions.
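For anyone who wants to run the counter proposal, here is a tongue-in-cheek sketch. The answer list is an abbreviated stand-in rather than Steven Santy's implementation, and the scoring of answers, the number of votes, and the "fund on a positive sum" rule are all assumptions made for the example.

```python
import random

# Abbreviated stand-in answers with an assumed score for each:
# +1 favourable, 0 non-committal, -1 unfavourable.
ANSWERS = {
    "It is certain": 1,
    "Outlook good": 1,
    "Reply hazy, try again": 0,
    "Ask again later": 0,
    "Don't count on it": -1,
    "Very doubtful": -1,
}


def simulate_votes(project, votes, seed=None):
    """Ask the question once per vote and sum the scored answers."""
    rng = random.Random(seed)
    question = f"Should project {project} be funded?"
    total = sum(ANSWERS[rng.choice(list(ANSWERS))] for _ in range(votes))
    return question, total


def decide(total):
    """Fund on a positive sum; anything else is treated as a rejection."""
    return "fund" if total > 0 else "do not fund"


question, total = simulate_votes("[name]", votes=27, seed=2013)
print(question, "->", decide(total))
```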

If that sounds harsh, recall EU provincialism funds only EU-based research. As though scientific research and discovery depends upon nationality or geographic location. In that regard, the EU is like Alabama, only larger.

OpenAIRE Study

Sunday, January 20th, 2013

Implementing Open Access Mandates in Europe: OpenAIRE Study by Thembani Malapela.

From the webpage:

Implementing Open Access Mandates in Europe : OpenAIRE Study on the Development of Open Access Repository Communities in Europe is the title of a recent book authored by Birgit Schmidt and Iryna Kuchma. The book highlights the existing open access policies in Europe and provides an overview of publisher’s self archiving policies and it further gives strategies for policy implementation. Such strategies include both institutional and national – which have been used in implementing open access policy mandates. This work provides a unique overview of national awareness of open access in 32 European countries involving all EU member states and in addition, Norway, Iceland, Croatia, Switzerland and Turkey.

What makes this book an interesting read is that it taps into activities implemented through OpenAIRE project and related repository projects by other stakeholders in Europe. Despite its extensive coverage on the implementation of Open Access Mandates in the region, the authors acknowledge, “the main issues that still need to be resolved in the coming years include the effective promotion of open access among research communities and support in copyright management for researchers and research institutions as well as intermediaries such as libraries and repositories”.

The more data that becomes “open,” the greater the semantic diversity you will find.

Important to follow the discussion as you prepare to map more and more information into your topic map.

EU – Law-Related Authority Files

Friday, January 11th, 2013

The EU Data Portal has a number of law-related authority files:

I first saw these at: New EU Data Portal links to several law-related authority files.

European Commission’s Low Attack on Open Source [TMs and Transparency]

Monday, January 7th, 2013

European Commission’s Low Attack on Open Source by Glyn Moody.

From the post:

If ACTA was the biggest global story of 2012, more locally there’s no doubt that the UK government’s consultation on open standards was the key event. As readers will remember, this was the final stage in a long-running saga with many twists and turns, mostly brought about by some uncricket-like behaviour by proprietary software companies who dread a truly level playing-field for government software procurement.

Justice prevailed in that particular battle, with open standards being defined as those with any claimed patents being made available on a royalty-free basis. But of course these things are never that simple. While the UK has seen the light, the EU has actually gone backwards on open standards in recent times.

Again, as long-suffering readers may recall, the original European Interoperability Framework also required royalty-free licensing, but what was doubtless a pretty intense wave of lobbying in Brussels overturned that, and EIF v2 ended up pushing FRAND, which effectively locks out open source – the whole point of the exercise.

Shamefully, some parts of the European Commission are still attacking open source, as I revealed a couple of months ago when Simon Phipps spotted a strange little conference with the giveaway title of “Implementing FRAND standards in Open Source: Business as usual or mission impossible?”

The plan was pretty transparent: organise something in the shadows, so that the open source world would be caught hopping. The fact that I only heard about it a few weeks beforehand, when I spend most of my waking hours scouting out information on the open source world, open standards and Europe, reading thousands of posts and tweets a week, shows how quiet the Commission kept about this.

This secrecy allowed the organisers to cherry pick participants to tilt the discussion in favour of software patents in Europe (which shouldn’t even exist, of course, according to the European Patent Convention), FRAND supporters and proprietary software companies, even though the latter are overwhelmingly American (so much for loyalty to the European ideal.) The plan was clearly to produce the desired result that open source was perfectly compatible with FRAND, because enough people at this conference said so.

But the “EU” hasn’t “gone backwards” on open standards. Organizations, as juridical entities, can’t go backwards or forwards on any topic. Officers, members, representatives of organizations, that is a different matter.

That is where topic maps could help bring transparency to a process such as the opposition to open source software.

For example, it is not:

  • “some parts of the European Commission” but named individuals with photographs and locations
  • “the organizers” but named individuals with specified relationships to commercial software vendors
  • “enough people at this conference” but paid representatives of software vendors and others financially interested in a no open source outcome

TM’s can help tear away the governmental and corporate veil over these “consultations.”

What you will find are people who are profiting or intend to do so from their opposition to open source software.

Their choice, but they should be forced to declare their allegiance to seek personal profit over public good.

I first saw this at: EU Experiences Setback in Open Source.

New EU Data Portal [Transparency/Innovation?]

Wednesday, December 26th, 2012

EU Commission unwraps public beta of open data portal with 5800+ datasets, ahead of Jan 2013 launch by Robin Wauters.

The EU Data Portal.

From the post:

Good news for open data lovers in the European Union and beyond: the European Commission on Christmas Eve quietly pushed live the public beta version of its all-new open data portal.

For the record: open data is general information that can be freely used, re-used and redistributed by anyone. In this case, it concerns all the information that public bodies in the European Union produce, collect or pay for (it’s similar to the United States government’s

This could include geographical data, statistics, meteorological data, data from publicly funded research projects, and digitised books from libraries.

The post also quotes the portal website as saying:

This portal is about transparency, open government and innovation. The European Commission Data Portal provides access to open public data from the European Commission. It also provides access to data of other Union institutions, bodies, offices and agencies at their request.

The published data can be downloaded by everyone interested to facilitate reuse, linking and the creation of innovative services. Moreover, this Data Portal promotes and builds literacy around Europe’s data.

Eurostat is the largest data contributor so signs of “transparency” should be there, if anywhere.

The first twenty (20) data sets from Eurostat are:

  • Quarterly cross-trade road freight transport by type of transport (1 000 t, Mio Tkm)
  • Turnover by residence of client and by employment size class for div 72 and 74
  • Generation of waste by sector
  • Standardised incidence rate of accidents at work by economic activity, severity and age
  • At-risk-of-poverty rate of older people, by age and sex (Source: SILC)
  • Telecommunication services: Access to networks (1 000)
  • Production of environmentally harmful chemicals, by environmental impact class
  • Fertility indicators
  • Area under wine-grape vine varieties broken down by vine variety, age of the vines and NUTS 2 regions – Romania
  • Severe material deprivation rate by most frequent activity status (population aged 18 and over)
  • Government bond yields, 10 years’ maturity – monthly data
  • Material deprivation for the ‘Economic strain’ and ‘Durables’ dimensions, by number of item (Source: SILC)
  • Participation in non-formal taught activities within (or not) paid hours by sex and working status
  • Number of persons by working status within households and household composition (1 000)
  • Percentage of all enterprises providing CVT courses, by type of course and size class
  • EU Imports from developing countries by income group
  • Extra-EU imports of feedingstuffs: main EU partners
  • Production and international trade of foodstuffs: Fresh fish and fish products
  • General information about the enterprises
  • Agricultural holders

When I think of government “transparency,” I think of:

  • Who is making the decisions?
  • What are their relationships to the people asking for the decisions? School, party, family, social, etc.
  • What benefits are derived from the decisions?
  • Who benefits from those decisions?
  • What are the relationships between those who benefit and those who decide?
  • Remembering it isn’t the “EU” that makes a decision for good or ill for you.

    Some named individual or group of named individuals, with input from other named individuals, with whom they had prior relationships, made those decisions.

    Transparency in government would name the names and relationships of those individuals.

    BTW, I would be very interested to learn what sort of “innovation” you can derive from any of the first twenty (20) data sets listed above.

    The holidays may have exhausted my imagination because I am coming up empty.

    Upcoming release of EuroVoc 4.4, EU’s multilingual thesaurus [December 18, 2012]

    Wednesday, December 12th, 2012

    Upcoming release of EuroVoc 4.4, EU’s multilingual thesaurus

    From the post:

    EuroVoc 4.4 will be released on December 18, 2012. During this day, the website might be temporary unavailable.

    6.883 thesaurus concepts

    This new edition is the result of a thorough revision among other things according to the concepts introduced by the ‘Lisbon Treaty’. It includes 6.883 thesaurus concepts of which 85 concepts are new, 142 have been updated and 28 have been classified as obsolete concepts.

    These new concepts are the results of the proposals sent by the librarians from the libraries of the national parliaments in Europe, the European Institutions namely the European Parliament and the users of EuroVoc. All the terms in Portuguese have been revised according to the Portuguese language spelling reform. The prior lexical value remains available as Non-Preferred Terms.

    EuroVoc, the EU’s multilingual thesaurus

    EuroVoc is a multilingual, multidisciplinary thesaurus covering the activities of the EU, the European Parliament in particular. It contains terms in 22 EU languages. It is managed by the Publications Office, which moved forward to ontology-based thesaurus management and semantic web technologies conformant to W3C recommendations as well as latest trends in thesaurus standards.

    There are documents prior to this version of the thesaurus and even documents prior to there being a EuroVoc thesaurus at all.

    And there will be documents after EuroVoc has been superseded.

    Not to mention in between there will be documents that use other vocabularies.

    Good thing we have topic maps to use this resource to its best advantage.

    A way station in a sea of semantic currents and drifts.

    Parallel Language Corpus Hunting?

    Friday, April 27th, 2012

    Parallel language corpus hunters, particularly in legal informatics, can rejoice!

    [A] parallel corpus of all European Union legislation, called the Acquis Communautaire, translated into all 22 languages of the EU nations — has been expanded to include EU legislation from 2004-2010…

    If you think semantic impedance in one language is tough, step up and try that across twenty-two (22) languages.

    Of course, these countries share something of a common historical context. Imagine the gulf when you move up to languages from other historical contexts.

    See: DGT-TM-2011, Parallel Corpus of All EU Legislation in Translation, Expanded to Include Data from 2004-2010 for links and other details.

    European Legislation Identifier: Document and Slides

    Wednesday, March 21st, 2012

    European Legislation Identifier: Document and Slides

    From LegalInformatics:

    John Dann of the Luxembourg Service Central de Législation has kindly given his permission for us to post the following documents related to the proposed European Legislation Identifier (ELI) standard:

    If you are interested in legal identifiers or legislative materials in Europe more generally, this should be of interest.

    European Commission launches consultation into e-interoperability

    Thursday, February 23rd, 2012

    European Commission launches consultation into e-interoperability by Derek du Preez.

    From the post:

    The European Commission (EC) has launched a one month public consultation into the problem of incompatible vocabularies used by developers of public administration IT systems.

    “Core vocabularies” are used to make sharing and reusing data easier, and the EC hopes that if they are defined properly, it will be able to quickly and effectively launch e-Government cross-border services.

    The EC has divided the consultation into three separate core vocabularies; person, business and location.

    Despite the minimal nature of the core vocabularies, I think the expectations for their use are set by the final paragraph of this report:

    Once the public consultation is over, the working groups will seek endorsement from EU Member States. This means that the vocabularies will not become a legal obligation, but will give them further exposure for wider use.

    If you have pointers to current incompatible vocabularies, I would appreciate a ping. Just so we can revisit those vocabularies in say five years to see the result of “exposure for wider use.”

    SEALS – Community Page

    Sunday, January 29th, 2012

    SEALS – Semantic Evaluation At Large Scale – Community Page

    The community page was added after my first post on the SEALS project.

    The next community event:

    SEALS to present evaluation results at ESWC 2012

    SEALS is pleased to announce that the workshop Evaluation of Semantic Technologies (IWEST 2012) has been confirmed to take place at the leading semantic web conference, ESWC (Extended Semantic Web Conference) 2012, scheduled to take place May 27-31, 2012 in beautiful Crete, Greece.

    This workshop will be a venue for researchers and tool developers, firstly, to initiate discussion about the current trends and future challenges of evaluating semantic technologies. Secondly, to support communication and collaboration with the goal of aligning the various evaluation efforts within the community and accelerating innovation in all the associated fields as has been the case with both the TREC benchmarks in information retrieval and the TPC benchmarks in database research.

    A call for papers will be published soon. All SEALS community members and evaluation campaign participants are especially encouraged to submit and participate.

    If you attend, I am particularly interested in the results of the discussion about “aligning the various evaluation efforts within the community….”

    I say that because when the project started, the “about” page reported:

    This is a very active research area, currently supported by more than 3000 individuals integrated in 360 organisations which have produced around 700 tools, but still suffers from a lack of standard benchmarks and infrastructures for assessing research outcomes. Due to its physically boundless nature, it remains relatively disorganized and lacks common grounds for assessing research and technological outcomes.

    Sounds untidy, even diverse, doesn’t it? 😉

    To tell the truth, I am not bothered by the repetition of semantic diversity in efforts to reduce semantic diversity. I find it refreshing that our languages burst the bonds that would be imposed upon them on a regular basis. Tyrants of thought, social, political and economic arrangements, the well- and the ill-intended, all fail. (Some last longer than others but on a historical time scale, the governments of the East and West are ephemera. Their peoples, the originators of language and semantics, will persist.)

    We can reduce semantic diversity when needful, or account for it, but even those efforts, as SEALS points out, exhibit the same semantic diversity as the area they purport to address.

    EC Tender for Open Data Portal

    Friday, July 22nd, 2011

    Deadline: 19 September 2011

    From the announcement:

    Today, [19 July 2011] the European Commission has taken a new step in realising an European Data Portal. They have published a call for tenders to develop the data portal on it’s electronic Tender Portal All information can be found on this page.

    Luxembourg, 19 July 2011

    (by Tom Kronenburg)

    At the Digital Agenda Assembly workshop on Open Data in June, mr. Khalil Rouhana of the European Commission announced the intention (slide 7) to build a European Open Data portal. Rouhana said that a EC Portal should become operational in 2012, holding a significant amount of EC datasets. It is also planned that by 2013 a pan/european data portal should present datasets published by the Member States.

    Today, the European Commission has taken a new step in realizing the European Data Portal. The EC has published a call for tenders to develop the data portal on it’s electronic Tender Portal The call for tenders is one of the necessary steps for realizing the ambition of creating one pan-european Open Data portal.

    The tender procedure will result in a contract that encompasses four types of services:

    • to develop and administer a web portal to act as a single point of access to data sets produced and held by European Commission services (and by extension to data sets produced and held by other European institutions/bodies and other public bodies),
    • to assist the Commission with the definition and implementation of a data set publication process,
    • to assist the Commission with the preparation of data sets for publication via the portal,
    • to assist the Commission in supporting for engaging the stakeholders’ community interested in re-using the published data sets.

    I checked to be sure and the tender is open to people based in the United States.

    This looks like it could be both interesting and fun.

    Check with your usual major players to see if you can contract out for part of the action in case they are successful.

    T-Rex Information Extraction

    Friday, October 15th, 2010

    T-Rex (Trainable Relation Extraction).

    Tools for document classification, entity and relation (read association) extraction.

    Topic maps of any size are going to be constructed from mining of “data” and in a lot of cases that will mean “documents” (to the extent that is a meaningful distinction).

    Interesting toolkit for that purpose but apparently not being maintained. Parked at Sourceforge after having been funded by the EU.

    Does anyone have a status update on this project?