Archive for the ‘EU’ Category

EU’s Unfunded Hear/See No Evil Policy

Friday, April 28th, 2017

EU lawmakers vote to make YouTube fight online hate speech by Julia Fioretti.

From the post:

Video-sharing platforms such as Google’s YouTube and Vimeo will have to take measures to protect citizens from content containing hate speech and incitement to violence under measures voted by EU lawmakers on Tuesday.

The proliferation of hate speech and fake news on social media has led to companies coming under increased pressure to take it down quickly, while internet campaigners have warned an excessive crackdown could endanger freedom of speech.

Members of the culture committee in the European Parliament voted on a legislative proposal that covers everything from 30 percent quotas for European works on video streaming websites such as Netflix to advertising times on TV to combating hate speech.

Ironically, the reported vote was by the “CULT” committee. No, I’m not making that up! I can prove that from the documents page:

From the report,


Amendment 18

(28) Some of the content stored on video-sharing platforms is not under the editorial responsibility of the video-sharing platform provider. However, those providers typically determine the organisation of the content, namely programmes or user-generated videos, including by automatic means or algorithms. Therefore, those providers should be required to take appropriate measures to protect minors from content that may impair their physical, mental or moral development and protect all citizens from incitement to violence or hatred directed against a group of persons or a member of such a group defined by reference to sex, race, colour, religion, descent or national or ethnic origin.
… (emphasis in original)

In addition to being censorship, unfunded censorship at that, the EU report runs afoul of the racist reality of the EU.

If you’re up for some difficult reading, consider Intolerance, Prejudice and Discrimination – A European Report by Forum Berlin, Andreas Zick, Beate Küpper, and Andreas Hövermann.

From page 13 of the report:

  • Group-focused enmity is widespread in Europe. It is weakest in the Netherlands, and strongest in Poland and Hungary. With respect to anti-immigrant attitudes, anti-Muslim attitudes and racism there are only minor differences between the countries, while differences in the extent of anti-Semitism, sexism and homophobia are much more marked.
  • About half of all European respondents believe there are too many immigrants in their country. Between 17 percent in the Netherlands and more than 70 percent in Poland believe that Jews seek to benefit from their forebears’ suffering during the Nazi era. About one third of respondents believe there is a natural hierarchy of ethnicity. Half or more condemn Islam as “a religion of intolerance”. A majority in Europe also subscribe to sexist attitudes rooted in traditional gender roles and demand that: “Women should take their role as wives and mothers more seriously.” With a figure of about one third, Dutch respondents are least likely to affirm sexist attitudes. The proportion opposing equal rights for homosexuals ranges between 17 percent in the Netherlands and 88 percent in Poland; they believe it is not good “to allow marriages between two men or two women”.

At the risk of insulting our simian relatives, this new EU policy can be summarized by:

(source: Three Wise Monkeys)

Suppressing hate speech does not result in less hate, only in less evidence of it.

While this legislation is pending, YouTube and Vimeo should occasionally suspend access of EU viewers for an hour. EU voters may decide they need more responsible leadership.

The Right to be Forgotten in the Media: A Data-Driven Study

Wednesday, July 27th, 2016

The Right to be Forgotten in the Media: A Data-Driven Study.

Abstract:

Due to the recent “Right to be Forgotten” (RTBF) ruling, for queries about an individual, Google and other search engines now delist links to web pages that contain “inadequate, irrelevant or no longer relevant, or excessive” information about that individual. In this paper we take a data-driven approach to study the RTBF in the traditional media outlets, its consequences, and its susceptibility to inference attacks. First, we do a content analysis on 283 known delisted UK media pages, using both manual investigation and Latent Dirichlet Allocation (LDA). We find that the strongest topic themes are violent crime, road accidents, drugs, murder, prostitution, financial misconduct, and sexual assault. Informed by this content analysis, we then show how a third party can discover delisted URLs along with the requesters’ names, thereby putting the efficacy of the RTBF for delisted media links in question. As a proof of concept, we perform an experiment that discovers two previously-unknown delisted URLs and their corresponding requesters. We also determine 80 requesters for the 283 known delisted media pages, and examine whether they suffer from the “Streisand effect,” a phenomenon whereby an attempt to hide a piece of information has the unintended consequence of publicizing the information more widely. To measure the presence (or lack of presence) of a Streisand effect, we develop novel metrics and methodology based on Google Trends and Twitter data. Finally, we carry out a demographic analysis of the 80 known requesters. We hope the results and observations in this paper can inform lawmakers as they refine RTBF laws in the future.

Not collecting data prior to laws and policies seems to be a trademark of the legislative process.

Otherwise, the “Right to be Forgotten” (RTBF) nonsense that only impacts searching and then only in particular ways could have been avoided.

The article does helpfully outline how to discover delistings, starting from 283 known delisted links.

Seriously? Considering that Facebook has 1 Billion+ users, much ink and electrons are being spilled over a minimum of 283 delisted links?

It’s time for the EU to stop looking for mites and mole hills to attack.

Especially since they are likely to resort to outright censorship as their next move.

That always ends badly.

EU Plays Whack-a-Mole with URLs (RTBF)

Sunday, June 5th, 2016

Researchers Uncover a Flaw in Europe’s Tough Privacy Rules by Mark Scott.

From the post:

Europe likes to think it leads the world in protecting people’s privacy, and that is particularly true for the region’s so-called right to be forgotten. That legal right allows people connected to the Continent to ask the likes of Google to remove links about themselves from online search results, under certain conditions.

Yet that right — one of the world’s most widespread efforts to protect people’s privacy online — may not be as effective as many European policy makers think, according to new research by computer scientists based, in part, at New York University.

The academic team, which also included experts from the Federal University of Minas Gerais in Brazil, said that in roughly a third of the cases examined, the researchers were able to discover the names of people who had asked for links to be removed. Those results, based on the researchers’ use of basic coding, came despite the individuals’ expressed efforts to remove their names from online searches.

The findings, which had not previously been made public and will be presented at an academic conference next month, raise questions about how successful Europe’s “right to be forgotten” can be if people’s identities can still be found with just a few clicks of a mouse. The paper says such breaches may undermine “the spirit” of the legal ruling.

From the positive conclusions on the Right to Be Forgotten (RTBF) by the paper authors:


We end this paper with a few opinions and recommendations based on the results and observations of this paper. After having studied RTBF and its consequences from a data perspective, the authors feel that RTBF has been largely working and responding to legitimate privacy concerns of many Europeans. We feel that Google’s process for determining which links should be delisted seems fair and reasonable. We feel that Google is being fairly transparent about how it processes RTBF requests [13]. Other academics have called more transparency [12]. However, by being more specific about how delisting decisions are made, it may become easier for the attacker to rediscover delisted URLs and the corresponding requesters.

I have to conclude they are collectively innocent of reading George Orwell’s 1984.

…if all records told the same tale — then the lie passed into history and became truth. ‘Who controls the past,’ ran the Party slogan, ‘controls the future: who controls the present controls the past.’ (George Orwell, 1984, Part 1, Chapter 3)

The paper does expose that the EU’s efforts to control the past are akin to playing whack-a-mole:

with URLs.

Except that unlike the video, the EU doesn’t play very well.

As the paper outlines in some detail, delisting isn’t the same thing as making all records tell the same tale.

Not only can you discover the “delisted,” you can often find evidence of who requested the “delisting.”

If “delisting” at Google becomes commonplace it will create opportunities for new web services. For example, a web service that accepts URLs and passes through the content, annotated with “Google Delisted Content – Suspected Delister: (delister’s name and current twitter handle).”
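As a rough illustration of the pass-through service described above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the `DelistingRecord` type, the `annotate` function, the lookup table and the example.org URLs are hypothetical, and no real delisting data or Google API is involved. It only shows the annotation step, not fetching content or discovering delistings.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass(frozen=True)
class DelistingRecord:
    """One hypothetical entry in a locally maintained table of delisted URLs."""
    url: str
    suspected_delister: Optional[str] = None  # None when no requester is suspected


def normalize(url: str) -> str:
    """Trim whitespace and a trailing slash so lookups are less brittle."""
    return url.strip().rstrip("/")


def annotate(url: str, content: str, table: Dict[str, DelistingRecord]) -> str:
    """Pass content through unchanged, prefixing a notice if the URL is in the table."""
    record = table.get(normalize(url))
    if record is None:
        return content
    delister = record.suspected_delister or "unknown"
    notice = f"[Google Delisted Content - Suspected Delister: {delister}]"
    return notice + "\n" + content


# Hypothetical data: placeholder pages, not real delistings.
TABLE = {
    normalize(r.url): r
    for r in [
        DelistingRecord("http://example.org/news/1", "(name and handle)"),
        DelistingRecord("http://example.org/news/2"),
    ]
}
```

Calling `annotate("http://example.org/news/1", page_text, TABLE)` would return the page text with the notice line on top, while any URL not in the table comes back untouched.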

1984 did not end well.

For a different (not necessarily better) outcome, resist all attempts to control the past, or to at least make it harder to discover.

EU Too Obvious With Wannabe A Monopoly Antics

Wednesday, April 20th, 2016

If you ever had any doubts (I didn’t) that the EU is as immoral as any other government, recent moves by the EU in the area of software will cure those.

EU hits Google with second antitrust charge by Foo Yun Chee reports:

EU antitrust regulators said that by requiring mobile phone manufacturers to pre-install Google Search and the Google Chrome browser to get access to other Google apps, the U.S. company was harming consumers by stifling competition.

Show of hands. How many of you think the EU gives a sh*t about consumers?

Yeah, that’s what I thought as well.

Or as Chee quotes European Competition Commissioner Margrethe Vestager:

“We believe that Google’s behavior denies consumers a wider choice of mobile apps and services and stands in the way of innovation by other players,” she said.

Hmmm, “other players.” Those don’t sound like consumers, those sound like people who will be charging consumers.

If you need confirmation of that reading, consider Anti-innovation: EU excludes open source from new tech standards by Glyn Moody.

From the post:


“Open” is generally used in the documents to denote “open standards,” as in the quotation above. But the European Commission is surprisingly coy about what exactly that phrase means in this context. It is only on the penultimate page of the ICT Standardisation Priorities document that we finally read the following key piece of information: “ICT standardisation requires a balanced IPR [intellectual property rights] policy, based on FRAND licensing terms.”

It’s no surprise that the Commission was trying to keep that particular detail quiet, because FRAND licensing—the acronym stands for “fair, reasonable, and non-discriminatory”—is incompatible with open source, which will therefore find itself excluded from much of the EU’s grand new Digital Single Market strategy. That’s hardly a “balanced IPR policy.”

Glyn goes on to say that FRAND licensing is the result of lobbying by American technical giants, but that seems unlikely.

The EU has attempted to favor EU-origin “allegedly” competitive software for years.

I say “allegedly” because the EU never points to competitive software in its antitrust proceedings that was excluded, only to the speculation that but for those evil American monopolists, there would be this garden of commercial and innovative European software. You bet.

There is a lot of innovative European software, but it hasn’t been produced in the same mindset that afflicts officials at the EU. They are fixated on an out-dated software sales/licensing model. Consider the rising number of companies based on nothing but open source if you want a sneak peek at the market of the future.

Being mired in market models from the past, the EU sees only protectionism (the Google complaint) and out-dated notions of software licensing (FRAND) as foundations for promoting a software industry in Europe.

Not to mention that the provincialism of the EU makes it the enemy of a growing software industry in Europe. Did you know that EU-funded startups are limited to hiring EU residents? (Or so I have been told, by EU startups.) It certainly works that way with EU awards.

There is nothing inconsistent with promoting open source and a vibrant EU software industry, so long as you know something about both. Knowing nothing about either has led the EU astray.

More Bad News For EC Brain Project Wood Pigeons

Sunday, February 14th, 2016

I heard the story of how the magpie tried to instruct other birds, particularly the wood pigeon, on how to build nests in a different form but the lesson was much the same.

The EC Brain project reminds me of the wood pigeon hearing “…take two sticks…” and running off to build its nest.

With no understanding of the human brain, the EC set out to build one, on a ten-year deadline.

Byron Spice’s report in: Project Aims to Reverse-engineer Brain Algorithms, Make Computers Learn Like Humans casts further doubt upon that project:

Carnegie Mellon University is embarking on a five-year, $12 million research effort to reverse-engineer the brain, seeking to unlock the secrets of neural circuitry and the brain’s learning methods. Researchers will use these insights to make computers think more like humans.

The research project, led by Tai Sing Lee, professor in the Computer Science Department and the Center for the Neural Basis of Cognition (CNBC), is funded by the Intelligence Advanced Research Projects Activity (IARPA) through its Machine Intelligence from Cortical Networks (MICrONS) research program. MICrONS is advancing President Barack Obama’s BRAIN Initiative to revolutionize the understanding of the human brain.

“MICrONS is similar in design and scope to the Human Genome Project, which first sequenced and mapped all human genes,” Lee said. “Its impact will likely be long-lasting and promises to be a game changer in neuroscience and artificial intelligence.”

Artificial neural nets process information in one direction, from input nodes to output nodes. But the brain likely works in quite a different way. Neurons in the brain are highly interconnected, suggesting possible feedback loops at each processing step. What these connections are doing computationally is a mystery; solving that mystery could enable the design of more capable neural nets.

My goodness! Unknown loops in algorithms?

The Carnegie Mellon project is exploring potential algorithms, not trying to engineer the unknown.

If the EC had titled its project the Graduate Assistant and Hospitality Industry Support Project, one could object to the use of funds for travel junkets but it would otherwise be intellectually honest.

Consciousness May Be the Product of Carefully Balanced Chaos [Show The Red Card]

Thursday, January 28th, 2016

Consciousness May Be the Product of Carefully Balanced Chaos by sciencehabit.

From the posting:

The question of whether the human consciousness is subjective or objective is largely philosophical. But the line between consciousness and unconsciousness is a bit easier to measure. In a new study (abstract) of how anesthetic drugs affect the brain, researchers suggest that our experience of reality is the product of a delicate balance of connectivity between neurons—too much or too little and consciousness slips away. During wakeful consciousness, participants’ brains generated “a flurry of ever-changing activity”, and the fMRI showed a multitude of overlapping networks activating as the brain integrated its surroundings and generated a moment to moment “flow of consciousness.” After the propofol kicked in, brain networks had reduced connectivity and much less variability over time. The brain seemed to be stuck in a rut—using the same pathways over and over again.

These researchers need to be shown the red card as they say in soccer.

I thought it was agreed that during the Human Brain Project, no one would research or publish new information about the human brain, in order to allow the EU project to complete its “working model” of the human brain.

The Human Brain Project is a butts in seats and/or hotels project and a gum ball machine will be able to duplicate its results. But discovering vast amounts of unknown facts demonstrates the lack of an adequate foundation for the project at its inception.

In other words, more facts may decrease public support for ill-considered WPA projects for science.

Calling into question the “judgement” of award managers (favoritism would be a more descriptive term) surely merits the “red card” in this instance.

(Note to readers: This post is to be read as sarcasm. The excellent research reported by Enzo Tagliazucchi, et al. in Large-scale signatures of unconsciousness are consistent with a departure from critical dynamics is an indication of some of the distance between current research and replication of a human brain.)

The full abstract if you are interested:

Loss of cortical integration and changes in the dynamics of electrophysiological brain signals characterize the transition from wakefulness towards unconsciousness. In this study, we arrive at a basic model explaining these observations based on the theory of phase transitions in complex systems. We studied the link between spatial and temporal correlations of large-scale brain activity recorded with functional magnetic resonance imaging during wakefulness, propofol-induced sedation and loss of consciousness and during the subsequent recovery. We observed that during unconsciousness activity in frontothalamic regions exhibited a reduction of long-range temporal correlations and a departure of functional connectivity from anatomical constraints. A model of a system exhibiting a phase transition reproduced our findings, as well as the diminished sensitivity of the cortex to external perturbations during unconsciousness. This framework unifies different observations about brain activity during unconsciousness and predicts that the principles we identified are universal and independent from its causes.

The “official” version of this article lies behind a paywall but you can see it at: http://arxiv.org/pdf/1509.04304.pdf for free.

Kudos to the authors for making their work accessible to everyone!

I first saw this in a Facebook post by Simon St. Laurent.

Maybe Corporations Aren’t Sovereign States After All

Monday, December 21st, 2015

Revealed: how Google enlisted members of US Congress it bankrolled to fight $6bn EU antitrust case by Harry Davies.

From the post:

Google enlisted members of the US congress, whose election campaigns it had funded, to pressure the European Union to drop a €6bn antitrust case which threatens to decimate the US tech firm’s business in Europe.

The coordinated effort by senators and members of the House of Representatives, as well as by a congressional committee, formed part of a sophisticated, multimillion-pound lobbying drive in Brussels, which Google has significantly ramped up as it fends off challenges to its dominance in Europe.

An investigation by the Guardian into Google’s multifaceted lobbying campaign in Europe has uncovered fresh details of its activities and methods. Based on documents obtained under a freedom of information request and a series of interviews with EU officials, MEPs and Brussels lobbyists, the investigation has also found:

If you appreciate a tale of how a major corporation attempts to bully a sovereign government by buying up the support of another sovereign government, then this post by Harry Davies will be a great joy.

For the most part I’m not sympathetic to the EU’s complaints because it is attempting to create safe harbors for EU search companies to replicate what Google already offers. Why anyone would want more page-rank search engines is unknown. Been there, done that.

The EU could fund innovative research into the next-generation search technology and draw customers away from Google with better search results and the ad cash that goes with them.

Instead, the EU wants to hold Google back while inefficient and higher priced competitors bilk EU consumers. That hardly seems like a winning model for technological development.

Seat warmers in the EU will prattle on about privacy and other EU fictions in the actions against Google.

Anyone who thinks removing search results from Google and only Google increases privacy is on par with Americans who fear terrorism. It’s some as yet undiagnosed mental disorder.

How people that ignorant reliably travel back and forth to work every day is a tribute to modern transportation systems.

Google should start doing rolling one-week Google blackouts across the EU. Paying penalties under SAAs and/or with lost revenue would be a small price to pay for rationality on the part of the EU.

The best defense against a monopoly is a better product than the monopoly, not the same product at a higher price from smaller EU vendors.

PS: You might want to notice the EU is trying to favor EU search vendors, not EU citizens, whatever they may claim to the contrary. Another commonality between governments.

Introducing OpenAI [Name Surprise: Not SkyNet II or Terminator]

Friday, December 11th, 2015

Introducing OpenAI by Greg Brockman, Ilya Sutskever, and the OpenAI team.

From the webpage:

OpenAI is a non-profit artificial intelligence research company. Our goal is to advance digital intelligence in the way that is most likely to benefit humanity as a whole, unconstrained by a need to generate financial return.

Since our research is free from financial obligations, we can better focus on a positive human impact. We believe AI should be an extension of individual human wills and, in the spirit of liberty, as broadly and evenly distributed as is possible safely.

The outcome of this venture is uncertain and the work is difficult, but we believe the goal and the structure are right. We hope this is what matters most to the best in the field.

Background

Artificial intelligence has always been a surprising field. In the early days, people thought that solving certain tasks (such as chess) would lead us to discover human-level intelligence algorithms. However, the solution to each task turned out to be much less general than people were hoping (such as doing a search over a huge number of moves).

The past few years have held another flavor of surprise. An AI technique explored for decades, deep learning, started achieving state-of-the-art results in a wide variety of problem domains. In deep learning, rather than hand-code a new algorithm for each problem, you design architectures that can twist themselves into a wide range of algorithms based on the data you feed them.

This approach has yielded outstanding results on pattern recognition problems, such as recognizing objects in images, machine translation, and speech recognition. But we’ve also started to see what it might be like for computers to be creative, to dream, and to experience the world.

Looking forward

AI systems today have impressive but narrow capabilities. It seems that we’ll keep whittling away at their constraints, and in the extreme case they will reach human performance on virtually every intellectual task. It’s hard to fathom how much human-level AI could benefit society, and it’s equally hard to imagine how much it could damage society if built or used incorrectly.

OpenAI

Because of AI’s surprising history, it’s hard to predict when human-level AI might come within reach. When it does, it’ll be important to have a leading research institution which can prioritize a good outcome for all over its own self-interest.

We’re hoping to grow OpenAI into such an institution. As a non-profit, our aim is to build value for everyone rather than shareholders. Researchers will be strongly encouraged to publish their work, whether as papers, blog posts, or code, and our patents (if any) will be shared with the world. We’ll freely collaborate with others across many institutions and expect to work with companies to research and deploy new technologies.

OpenAI’s research director is Ilya Sutskever, one of the world experts in machine learning. Our CTO is Greg Brockman, formerly the CTO of Stripe. The group’s other founding members are world-class research engineers and scientists: Trevor Blackwell, Vicki Cheung, Andrej Karpathy, Durk Kingma, John Schulman, Pamela Vagata, and Wojciech Zaremba. Pieter Abbeel, Yoshua Bengio, Alan Kay, Sergey Levine, and Vishal Sikka are advisors to the group. OpenAI’s co-chairs are Sam Altman and Elon Musk.

Sam, Greg, Elon, Reid Hoffman, Jessica Livingston, Peter Thiel, Amazon Web Services (AWS), Infosys, and YC Research are donating to support OpenAI. In total, these funders have committed $1 billion, although we expect to only spend a tiny fraction of this in the next few years.

You can follow us on Twitter at @open_ai or email us at info@openai.com.

Seeing that Elon Musk is the co-chair of this project I was surprised the name wasn’t SkyNet II or Terminator. But OpenAI is a more neutral one and given the planned transparency of the project, a good one.

I also appreciate the project not being engineered for the purpose of spending money over a ten-year term. Doing research first and then formulating plans for the next step in research sounds like a more sensible plan.

Whether any project ever achieves “artificial intelligence” equivalent to human intelligence or not, this project may be a template for how to usefully explore complex scientific questions.

Lessons in Truthful Disparagement

Friday, October 30th, 2015

Cathy O’Neil (mathbabe) featured a guest post on her blog about the EU Human Brain Project.

I am taking notes on truthful disparagement from Dirty Rant About The Human Brain Project.

Just listing the main section headers:

  1. We have no fucking clue how to simulate a brain.
  2. We have no fucking clue how to wire up a brain.
  3. We have no fucking clue what makes human brains work so well.
  4. We have no fucking clue what the parameters are.
  5. We have no fucking clue what the important thing to simulate is.

The guest post was authored by a neuroscientist.

Cathy has just posted her slides for a day long workshop on data science (to be held in Stockholm), if you want something serious to read after you stop laughing about the EU Human Brain Project.

BBC Pages Censored by the EU

Friday, June 26th, 2015

List of BBC web pages which have been removed from Google’s search results by Neil McIntosh.

From the post:

Since a European Court of Justice ruling last year, individuals have the right to request that search engines remove certain web pages from their search results. Those pages usually contain personal information about individuals.

Following the ruling, Google removed a large number of links from its search results, including some to BBC web pages, and continues to delist pages from BBC Online.

The BBC has decided to make clear to licence fee payers which pages have been removed from Google’s search results by publishing this list of links. Each month, we’ll republish this list with new removals added at the top.

We are doing this primarily as a contribution to public policy. We think it is important that those with an interest in the “right to be forgotten” can ascertain which articles have been affected by the ruling. We hope it will contribute to the debate about this issue. We also think the integrity of the BBC’s online archive is important and, although the pages concerned remain published on BBC Online, removal from Google searches makes parts of that archive harder to find.

The pages affected by delinking may disappear from Google searches, but they do still exist on BBC Online. David Jordan, the BBC’s Director of Editorial Policy and Standards, has written a blog post which explains how we view that archive as “a matter of historic public record” and, thus, something we alter only in exceptional circumstances. The BBC’s rules on deleting content from BBC Online are strict; in general, unless content is specifically made available only for a limited time, the assumption is that what we publish on BBC Online will become part of a permanently accessible archive. To do anything else risks reducing transparency and damaging trust.

Kudos to the BBC for demonstrating the extent of censorship implied by the EU’s “right to be forgotten.” The “right to be forgotten” combines ignorance of technology with eurocentrism at its very worst. Not to mention being futile when directed at a search engine.

Just to get you started, here are the links from the post:

One caveat: when looking through this list it is worth noting that we are not told who has requested the delisting, and we should not leap to conclusions as to who is responsible. The request may not have come from the obvious subject of a story.

May 2015

http://news.bbc.co.uk/1/hi/england/humber/5070882.stm

http://news.bbc.co.uk/1/hi/england/london/6173888.stm

http://www.bbc.co.uk/news/uk-scotland-edinburgh-east-fife-17449896

http://news.bbc.co.uk/2/hi/uk_news/england/tees/4072892.stm

http://news.bbc.co.uk/1/hi/uk/8229401.stm

http://news.bbc.co.uk/1/hi/northern_ireland/1697871.stm

http://www.bbc.co.uk/news/uk-wales-mid-wales-26820735

http://news.bbc.co.uk/2/hi/business/7968536.stm

http://news.bbc.co.uk/2/hi/business/8607205.stm

http://news.bbc.co.uk/1/hi/england/cornwall/7475762.stm

http://news.bbc.co.uk/1/hi/england/2843343.stm

http://news.bbc.co.uk/2/hi/uk_news/england/3445793.stm

http://news.bbc.co.uk/2/hi/uk_news/england/london/6184091.stm

http://news.bbc.co.uk/1/hi/scotland/8529436.stm

http://news.bbc.co.uk/1/hi/england/surrey/8626921.stm

http://news.bbc.co.uk/1/hi/england/lancashire/7017043.stm

http://www.bbc.co.uk/news/uk-england-lancashire-22570334

http://www.bbc.co.uk/news/uk-scotland-glasgow-west-22633321

http://news.bbc.co.uk/2/hi/uk_news/england/manchester/7031790.stm

http://news.bbc.co.uk/1/hi/england/london/6256193.stm

http://news.bbc.co.uk/1/hi/scotland/7730169.stm

http://news.bbc.co.uk/1/hi/england/london/4102529.stm

http://news.bbc.co.uk/2/hi/uk_news/239774.stm

http://news.bbc.co.uk/1/hi/england/london/3562355.stm

http://news.bbc.co.uk/2/hi/uk_news/england/london/3562355.stm

http://news.bbc.co.uk/2/hi/health/6390421.stm

http://news.bbc.co.uk/1/hi/england/lincolnshire/4465225.stm

April 2015

http://www.bbc.co.uk/news/health-15982608

http://news.bbc.co.uk/2/hi/uk_news/england/cambridgeshire/3837895.stm

http://www.bbc.co.uk/news/uk-england-13524740

http://news.bbc.co.uk/2/hi/uk_news/37979.stm

http://www.bbc.co.uk/news/uk-scotland-edinburgh-east-fife-16986231

http://news.bbc.co.uk/1/hi/england/southern_counties/3124151.stm

http://news.bbc.co.uk/1/hi/northern_ireland/7220428.stm

http://news.bbc.co.uk/1/hi/northern_ireland/7218858.stm

http://news.bbc.co.uk/1/hi/northern_ireland/7229438.stm

http://www.bbc.co.uk/wales/bllcks/me_and_mine/

http://www.bbc.co.uk/wales/bllcks/me_and_mine/iangwynhughes.shtml

http://www.bbc.co.uk/wales/bllcks/me_and_mine/ianwinterton.shtml

http://www.bbc.co.uk/wales/bllcks/me_and_mine/jonfortgang.shtml

http://www.bbc.co.uk/wales/bllcks/me_and_mine/mylesgascoyne.shtml

http://www.bbc.co.uk/wales/bllcks/me_and_mine/sandraosborne.shtml

http://www.bbc.co.uk/news/uk-scotland-glasgow-west-27238412

March 2015

http://www.bbc.co.uk/1/hi/england/west_midlands/3082071.stm

http://news.bbc.co.uk/1/hi/world/europe/863439.stm

http://m.bbc.co.uk/news/uk-scotland-glasgow-west-20998106

http://www.bbc.com/news/uk-12520150

http://news.bbc.co.uk/2/hi/uk_news/education/1471655.stm

http://www.bbc.co.uk/news/uk-scotland-tayside-central-11536013

http://news.bbc.co.uk/1/hi/uk/179398.stm

http://news.bbc.co.uk/1/hi/northern_ireland/7009880.stm

http://news.bbc.co.uk/2/hi/uk_news/england/beds/bucks/herts/3649829.stm

http://news.bbc.co.uk/1/hi/in_pictures/4697892.stm

http://www.bbc.co.uk/newsbeat/20357076

http://news.bbc.co.uk/2/hi/health/6917049.stm

February 2015

http://www.bbc.co.uk/blogs/legacy/theeditors/2007/06/shock_tactics.html

http://news.bbc.co.uk/1/hi/england/london/7506139.stm

http://news.bbc.co.uk/1/hi/england/london/7604051.stm

http://news.bbc.co.uk/1/hi/england/london/4102529.stm

http://news.bbc.co.uk/1/hi/england/london/4093123.stm

http://news.bbc.co.uk/1/hi/health/2068088.stm

http://news.bbc.co.uk/2/hi/europe/126040.stm

http://news.bbc.co.uk/1/hi/uk/146650.stm

http://news.bbc.co.uk/1/hi/uk/3228040.stm

http://news.bbc.co.uk/1/hi/uk/765246.stm

http://news.bbc.co.uk/1/hi/england/southern_counties/4717327.stm

http://news.bbc.co.uk/1/hi/uk/146080.stm

http://news.bbc.co.uk/1/hi/england/2176641.stm

http://www.bbc.co.uk/news/uk-england-gloucestershire-13469941

http://www.bbc.co.uk/news/uk-england-lancashire-16928146

http://www.bbc.co.uk/tyne/content/articles/2006/02/07/shearer_qa_feature.shtml

January 2015

http://www.bbc.co.uk/news/uk-scotland-edinburgh-east-fife-20682672

http://www.bbc.co.uk/news/uk-scotland-edinburgh-east-fife-19559270

http://www.bbc.co.uk/schools/citizenx/being/rights/asylum_p2_big.swf

http://news.bbc.co.uk/1/hi/england/beds/bucks/herts/3663494.stm

December 2014

http://news.bbc.co.uk/1/hi/talking_point/3309723.stm

http://news.bbc.co.uk/1/hi/england/west_midlands/4896906.stm

http://www.bbc.co.uk/leicester/content/articles/2006/01/23/jnrft05_u10s_league_summary_22012006_feature.shtml

November 2014

http://www.bbc.co.uk/news/uk-scotland-tayside-central-13361261

http://www.bbc.co.uk/news/uk-wales-south-east-wales-24740420

http://www.bbc.co.uk/news/uk-england-13524740

http://news.bbc.co.uk/1/hi/england/3536133.stm

http://news.bbc.co.uk/2/hi/uk_news/215647.stm

http://news.bbc.co.uk/1/hi/scotland/7742450.stm

http://news.bbc.co.uk/2/hi/uk_news/england/3536133.stm

http://news.bbc.co.uk/1/hi/uk/7389677.stm

http://news.bbc.co.uk/1/hi/england/2781665.stm

http://news.bbc.co.uk/2/hi/talking_point/3735199.stm

http://news.bbc.co.uk/1/hi/england/3445763.stm

http://www.bbc.co.uk/blogs/legacy/thereporters/robertpeston/2007/10/merrills_mess.html

http://news.bbc.co.uk/1/hi/northern_ireland/3874393.stm

http://news.bbc.co.uk/2/hi/uk_news/scotland/north_east/8309109.stm

http://news.bbc.co.uk/2/hi/uk_news/northern_ireland/1630200.stm

http://news.bbc.co.uk/1/hi/education/1793669.stm

http://news.bbc.co.uk/1/hi/wales/1564461.stm

http://news.bbc.co.uk/2/hi/uk_news/scotland/1397426.stm

http://news.bbc.co.uk/2/hi/science/nature/2943946.stm

http://news.bbc.co.uk/2/hi/uk_news/england/oxfordshire/3497532.stm

http://news.bbc.co.uk/2/hi/programmes/correspondent/1888430.stm

http://news.bbc.co.uk/1/hi/talking_point/4232440.stm

http://www.bbc.co.uk/gloucestershire/getfresh/2003/10/wicca_questions.shtml

http://news.bbc.co.uk/1/hi/england/north_yorkshire/7303297.stm

http://news.bbc.co.uk/2/hi/uk_news/wales/920077.stm

http://news.bbc.co.uk/1/hi/england/north_yorkshire/7359543.stm

http://news.bbc.co.uk/1/hi/england/nottinghamshire/4757993.stm

http://news.bbc.co.uk/1/hi/england/nottinghamshire/5237884.stm

http://news.bbc.co.uk/1/hi/england/southern_counties/3777733.stm

http://news.bbc.co.uk/2/hi/uk_news/england/southern_counties/3143478.stm

http://www.bbc.co.uk/news/uk-england-13524740

October 2014

http://news.bbc.co.uk/1/hi/england/2051061.stm

http://news.bbc.co.uk/1/hi/scotland/1887975.stm

http://news.bbc.co.uk/1/hi/scotland/tayside_and_central/7150460.stm

http://www.bbc.co.uk/news/uk-england-lancashire-12045141

http://news.bbc.co.uk/1/hi/england/1766321.stm

http://news.bbc.co.uk/olmedia/1765000/images/_1766321_malcolmbell300.jpg

http://news.bbc.co.uk/2/hi/uk_news/england/2594317.stm

http://www.bbc.co.uk/news/uk-wales-mid-wales-16110563

http://news.bbc.co.uk/1/hi/england/oxfordshire/6361347.stm

http://news.bbc.co.uk/1/hi/programmes/panorama/3710528.stm

http://news.bbc.co.uk/1/hi/programmes/panorama/3008433.stm

http://news.bbc.co.uk/media/images/39191000/jpg/_39191603_vennslim.jpg

http://www.bbc.co.uk/drama/spooks/spooksexpert_questions_1.shtml

http://news.bbc.co.uk/1/hi/scotland/north_east/8309109.stm

http://news.bbc.co.uk/1/hi/scotland/1397426.stm

http://news.bbc.co.uk/2/hi/europe/1105488.stm

http://news.bbc.co.uk/1/hi/uk/818889.stm

http://news.bbc.co.uk/2/hi/uk_news/813596.stm

http://news.bbc.co.uk/2/hi/uk_news/england/bristol/somerset/3721062.stm

http://www.bbc.co.uk/news/mobile/uk-14265891

http://news.bbc.co.uk/1/hi/scotland/2168512.stm

http://news.bbc.co.uk/1/hi/sci/tech/323866.stm

http://news.bbc.co.uk/olmedia/320000/images/_323866_debbiefair.jpg

September 2014

http://news.bbc.co.uk/1/hi/wales/3536991.stm

http://news.bbc.co.uk/1/hi/england/london/4022365.stm

http://news.bbc.co.uk/1/hi/england/london/4025739.stm

http://news.bbc.co.uk/1/hi/england/london/4041953.stm

http://news.bbc.co.uk/1/hi/uk/375816.stm

http://news.bbc.co.uk/1/hi/england/1786346.stm

http://news.bbc.co.uk/1/hi/england/1829377.stm

http://news.bbc.co.uk/1/hi/england/2205961.stm

http://news.bbc.co.uk/1/hi/england/west_midlands/3283037.stm

http://news.bbc.co.uk/1/hi/england/wiltshire/3132175.stm

http://news.bbc.co.uk/1/hi/entertainment/1352097.stm

http://news.bbc.co.uk/1/hi/entertainment/1449259.stm

http://news.bbc.co.uk/1/hi/health/3093087.stm

http://news.bbc.co.uk/1/hi/uk/3115844.stm

http://news.bbc.co.uk/1/hi/england/southern_counties/3143478.stm

http://news.bbc.co.uk/sport1/hi/sports_talk/1521047.stm

http://news.bbc.co.uk/sport1/hi/sports_talk/2254216.stm

http://www.bbc.co.uk/northernireland/yourplaceandmine/topics/your_questions/A745823.shtml

http://news.bbc.co.uk/1/hi/england/southern_counties/3143478.stm

http://www.bbc.co.uk/news/uk-wales-south-east-wales-24740420

http://www.bbc.co.uk/wiltshire/content/articles/2006/01/16/mwyml_reports_feature.shtml

http://www.bbc.co.uk/wiltshire/content/articles/2006/01/17/mwyml_reports_feature.shtml

http://news.bbc.co.uk/1/hi/england/3536133.stm

http://news.bbc.co.uk/1/hi/in_pictures/4005059.stm

http://news.bbc.co.uk/2/hi/europe/2263029.stm

http://news.bbc.co.uk/media/images/38259000/jpg/_38259272_alexanderbbc150.jpg

http://news.bbc.co.uk/media/images/40517000/jpg/_40517263_sami300.jpg

http://www.bbc.co.uk/threecounties/teens/2004/07/james_tapping_work_exp.shtml

http://news.bbc.co.uk/1/hi/england/southern_counties/3156658.stm

http://news.bbc.co.uk/2/hi/uk_news/england/southern_counties/3156658.stm

August 2014

http://news.bbc.co.uk/1/hi/england/2246690.stm

http://news.bbc.co.uk/1/hi/england/cumbria/4493558.stm

http://news.bbc.co.uk/1/hi/uk/146432.stm

http://news.bbc.co.uk/2/hi/programmes/click_online/4316658.stm

http://news.bbc.co.uk/2/hi/programmes/click_online/4386216.stm

http://news.bbc.co.uk/1/hi/uk/469609.stm

http://news.bbc.co.uk/1/hi/wales/920077.stm

http://news.bbc.co.uk/1/hi/england/kent/6161563.stm

http://news.bbc.co.uk/2/hi/europe/3209541.stm

http://news.bbc.co.uk/2/hi/uk_news/3206355.stm

http://news.bbc.co.uk/1/hi/england/bristol/7720506.stm

http://news.bbc.co.uk/2/hi/uk_news/northern_ireland/1382875.stm

July 2014

http://news.bbc.co.uk/1/hi/england/2236046.stm

http://news.bbc.co.uk/1/hi/england/bristol/3721062.stm

http://news.bbc.co.uk/1/hi/programmes/newsnight/4746523.stm

http://news.bbc.co.uk/1/hi/uk/971231.stm

http://news.bbc.co.uk/1/hi/world/middle_east/8375952.stm

http://news.bbc.co.uk/2/hi/talking_point/4137317.stm

http://www.bbc.co.uk/blogs/legacy/thereporters/robertpeston/2007/10/merrills_mess.html

http://www.bbc.co.uk/news/10603523

One consequence of this listing is that I will have to follow the BBC blog to catch the new list of deletions, month by month. The writing is always enjoyable but it’s one more thing to track.

The thought does occur to me that analysis of the EU censored pages may reveal patterns of what materials are the most likely subjects of censorship.
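As a minimal sketch of that kind of analysis, the snippet below tallies the first topical path segment of a handful of URLs copied from the list above. The function name `section_of` and the rule for skipping edition markers such as `1`, `2`, `hi` and `news` are my own assumptions about how BBC URLs are laid out, not anything the BBC publishes.

```python
import re
from collections import Counter
from urllib.parse import urlparse

# Segments that mark edition or site area rather than topic (my assumption).
SKIP = {"1", "2", "hi", "news"}

def section_of(url):
    """Return the first path segment that looks like a section name."""
    for segment in urlparse(url).path.split("/"):
        if segment and segment not in SKIP:
            # New-style slugs end in a story id, e.g. "...-west-22633321".
            return re.sub(r"-\d+$", "", segment)
    return "(none)"

# A few URLs taken from the monthly lists above.
sample = [
    "http://news.bbc.co.uk/1/hi/england/london/7506139.stm",
    "http://news.bbc.co.uk/1/hi/england/london/7604051.stm",
    "http://news.bbc.co.uk/1/hi/health/2068088.stm",
    "http://news.bbc.co.uk/1/hi/northern_ireland/7220428.stm",
    "http://www.bbc.co.uk/news/uk-scotland-glasgow-west-22633321",
]

counts = Counter(section_of(u) for u in sample)
for section, n in counts.most_common():
    print(section, n)
```

Run over the full list, a tally like this would only show which BBC sections are delisted most often. Saying anything about subject matter would still require reading the pages.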

In addition to the BBC list, one can imagine a search engine that only indexes EU censored pages. Would ad revenue sustain such an index or would it be pay-per-view?

It would be very ironic if EU censorship resulted in more publicity for people exercising their “right to be forgotten.” Not only ironic, but appropriate as well.

PS: You can follow the BBC Internet Blog on Twitter: @bbcinternetblog.

Google Antitrust Charges: Guilty Until Proven Innocent

Wednesday, April 15th, 2015

The EU antitrust charges against Google will be news for some time, so start with the primary sources.

Competition Commissioner Margrethe Vestager

First, the official press release from the European Commission: Antitrust: Commission sends Statement of Objections to Google on comparison shopping service; opens separate formal investigation on Android, which reads in part:

The European Commission has sent a Statement of Objections to Google alleging the company has abused its dominant position in the markets for general internet search services in the European Economic Area (EEA) by systematically favouring its own comparison shopping product in its general search results pages. The Commission’s preliminary view is that such conduct infringes EU antitrust rules because it stifles competition and harms consumers. Sending a Statement of Objections does not prejudge the outcome of the investigation.

EU Commissioner in charge of competition policy Margrethe Vestager said: “The Commission’s objective is to apply EU antitrust rules to ensure that companies operating in Europe, wherever they may be based, do not artificially deny European consumers as wide a choice as possible or stifle innovation”.

“In the case of Google I am concerned that the company has given an unfair advantage to its own comparison shopping service, in breach of EU antitrust rules. Google now has the opportunity to convince the Commission to the contrary.

In the first paragraph, “Sending a Statement of Objections does not prejudge the outcome….” and by the fourth paragraph, “…Google now has the opportunity to convince the Commission to the contrary.”???

That sounds remarkably like “guilty until proven innocent” to me. You?

Can you imagine a judge in a US antitrust trial telling the defendant:

“We are going to have a fair trial and you will have the opportunity to convince me you’re not guilty.”

It’s unfortunate that vendors continue to use the EU as a pawn in efforts to compete with other vendors. It just encourages the EU, with its admittedly Euro-centric view of the world, to attempt to manage activities best left un-managed. Yes, Google is the world leader in search, if you think indexing 5% of the web constitutes leadership. A “leader” that is still wedded to its lemming (page-rank) based ranking algorithm.

Apparently the EU hasn’t noticed that raw search data is now easily available for potential competitors to Google. (You know it as Common Crawl. Link is to a series of my posts on Common Crawl.) The EU is unaware of the ongoing revolution in deep learning, which will make lemming-based ranking passé. (Yes, Google has contributed heavily to that research but research isn’t criminal, at least not yet.) And the very technology for performing Internet searches may be about to change (Darpa/Memex).

Does Google dominate the ad-supported, users-as-end-product, search market? Sure. If you don’t like that, why not create a search service that returns one (1) result, the one that I am looking for? No ads, no selling my information, just returning one useful result. Given the time wasted in a day scrolling through some search engine results, do you see a market for that among professionals?

If I search for pizza, given my IP address and order history, there is only one result that needs to show up. With the number highlighted for calling. Think about all the one result searches you need in a day, week, month. I suppose that doesn’t work for dating services but no one search solution will fit all use cases. Entirely different market from Google, paid for by vendors.

Source documents for your topic map:

Antitrust: Commission probes allegations of antitrust violations by Google (2010)

Antitrust: Commission sends Statement of Objections to Google on comparison shopping service (April 15, 2015)

Antitrust: Commission opens formal investigation against Google in relation to Android mobile operating system

Council Regulation (EC) No 1/2003 of 16 December 2002 on the implementation of the rules on competition laid down in Articles 81 and 82 of the Treaty (Text with EEA relevance) (in English, as of today) The canonical link: http://eur-lex.europa.eu/legal-content/EN/TXT/?uri=CELEX:32003R0001.

Nonsensical ‘Unbiased Search’ Proposal

Tuesday, December 2nd, 2014

Forget EU’s Toothless Vote To ‘Break Up’ Google; Be Worried About Nonsensical ‘Unbiased Search’ Proposal by Mike Masnick.

Mike uncovers (in plain sight) the real danger of the recent EU proposal to “break up” Google.

Reading the legislation (which I neglected to do), Mike writes:

But within the proposal, a few lines down, there was something that might be even more concerning, and more ridiculous, even if it generated fewer (actually, almost no) headlines. And it’s that, beyond “breaking up” search engines, the resolution also included this bit of nonsense, saying that search engines need to be “unbiased”:

Stresses that, when operating search engines for users, the search process and results should be unbiased in order to keep internet searches non-discriminatory, to ensure more competition and choice for users and consumers and to maintain the diversity of sources of information; notes, therefore, that indexation, evaluation, presentation and ranking by search engines must be unbiased and transparent; calls on the Commission to prevent any abuse in the marketing of interlinked services by search engine operators;

But what does that even mean? Search is inherently biased. That’s the point of search. You want the best results for what you’re searching for, and the job of the search engine is to rank results by what it thinks is the best. An “unbiased” search engine isn’t a search engine at all. It just returns stuff randomly.

See Mike’s post for additional analysis of this particular mummer’s farce.

Another example of why the Internet should be governed by a new structure, staffed by people with the technical knowledge to make sensible decisions. By “new structure” I mean one separate from and not subject to any existing government. Including the United States, where the head of the NSA thinks local water supplies are controlled over the Internet (FALSE).

I first saw this in a tweet by Joseph Esposito.

EU commits €14.4m to support open data across Europe

Thursday, November 6th, 2014

EU commits €14.4m to support open data across Europe by Samuel Gibbs.

From the post:

The European Union has committed €14.4m (£11m) towards open data with projects and institutions lead by the Open Data Institute (ODI), Southampton University, the Open University and Telefonica.

The funding, announced today at the ODI Summit being held in London, is the largest direct investment into open data startups globally and will be used to fund three separate schemes covering startups, open data research and a new training academy for data science.

“This is a decisive investment by the EU to create open data skills, build capabilities, and provide fuel for open data startups across Europe,” said Gavin Starks, chief executive of the ODI a non-for-profit organisation based in London co-founded by inventor of the world wide web Sir Tim Berners-Lee. “It combines three key drivers for open adoption: financing startups, deepening our research and evidence, and training the next generation of data scientists, to exploit emerging open data ecosystems.”

Money from the €14.4m will be divided into three sections. Through the EU’s €80 billion Horizon 2020 research and innovation funding, €7.8m will be used to fund the 30-month Open Data Incubator for Europe (ODInE) for open data startups modelled on the ODI’s UK open data startup incubator that has been running since 2012.

Take a look at Open Data Institute’s Startup page

BTW, on the list of graduates, the text of the links for Provenance and Mastodon C are correct but the underlying hyperlinks, http://theodi.org/start-ups/www.provenance.it and http://theodi.org/start-ups/www.mastodonc.com, respectively, are incorrect.

With the correct underlying hyperlinks:

Mastodon C

Provenance
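I have not seen the page source, so this is only a guess at the cause: a link written without its scheme (`www.provenance.it` rather than `http://www.provenance.it`) is resolved as a path relative to the page it sits on. The sketch below reproduces that with Python’s standard `urljoin`. The trailing slash on the base URL and the `http://` form of the corrected link are my assumptions.

```python
from urllib.parse import urljoin

# Page the links appear on; the trailing slash is assumed.
BASE = "http://theodi.org/start-ups/"

# An href with no scheme is treated as a path relative to the page.
broken = urljoin(BASE, "www.provenance.it")

# With the scheme present the href is absolute and the base is ignored.
fixed = urljoin(BASE, "http://www.provenance.it")

print(broken)
print(fixed)
```

The first value printed matches the incorrect Provenance hyperlink reported above, which is at least consistent with a missing scheme in the page markup.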

I did not check the links for the current startups. I did run the W3C Link Checker on http://theodi.org/start-ups and got some odd results. If you are interested, see what you think.

Sorry, I got diverted by the issues with the Open Data Institute site.

Among other highlights from the article:

A further €3.7m will be used to fund 15 researchers into open data posed with the question “how can we answer complex questions with web data?”.

You can puzzle over that one on your own.

EC [WPA] Brain Project Update

Friday, October 3rd, 2014

Electronic Brain by 2023: E.U.’s Human Brain Project ramps up by R. Colin Johnson.

From the post:

The gist of the first year’s report is that all the pieces are assembled — all personal are hired, laboratories throughout the region engaged, and the information and communications (ICT) is in place to allow the researchers and their more than 100 academic and corporate partners in more than 20 countries to effectively collaborate and share data. Already begun are projects that reconstruct the brain’s functioning at several different biological scales, the analysis of clinical data of diseases of the brain, and the development of computing systems inspired by the brain.

The agenda for the first two and a half years (the ramp-up phase) has also been set whereby the HBP will amass all known strategic data about brain functioning, develop theoretical frameworks that fit that data, and develop the necessary infrastructure for developing six ICT platforms during the following “operational” phase circa 2017.

“Getting ready” is a fair summary of HBP Achievements Year One.

The report fails to mention the concerns of scientists threatening to boycott the project, but given the response of the EC to that letter, which could be summarized as: “…we have decided to spend the money, get in line or get out of the way,” a further response was unlikely.

No, the EC Brain Project is more in line with the WPA projects of the Depression era in the United States. WPA projects were employment projects first, and the results of those projects were strictly a secondary concern.

No doubt some new results will come from the EU Brain Project, simply because it isn’t possible to employ that many researchers and not have some publishable results. Particularly if self-published by the project itself.

One can only hope that the project will publish a bibliography of “all known strategic data about brain functioning” as part of its research results. Just so outsiders can gauge the development of “…theoretical frameworks that fit that data.”

One suspects for less than the conference and travel costs built into this project, the EC could have purchased a site license for the entire EU to most if not all European scientific publishers. That would do more to advance scientific research in the EU than attempting to duplicate the unknown.

InterActive Terminology for Europe (IATE)

Saturday, July 12th, 2014

InterActive Terminology for Europe (IATE)

From the about page:

IATE (= “Inter-Active Terminology for Europe”) is the EU’s inter-institutional terminology database. IATE has been used in the EU institutions and agencies since summer 2004 for the collection, dissemination and shared management of EU-specific terminology. The project partners are:

  • European Commission
  • Parliament
  • Council
  • Court of Justice
  • Court of Auditors
  • Economic & Social Committee
  • Committee of the Regions
  • European Central Bank
  • European Investment Bank
  • Translation Centre for the Bodies of the EU

The project was launched in 1999 with the objective of providing a web-based infrastructure for all EU terminology resources, enhancing the availability and standardisation of the information.

IATE incorporates all of the existing terminology databases of the EU’s translation services into a single new, highly interactive and accessible interinstitutional database. The following legacy databases have been imported into IATE, which now contains approximately 1.4 million multilingual entries:

  • Eurodicautom (Commission),
  • TIS (Council),
  • Euterpe (EP),
  • Euroterms (Translation Centre),
  • CDCTERM (Court of Auditors),

For more information, please download the IATE brochure.

I first saw this at: IATE terminology database available for download; contains very large number of legal terms in multiple languages.

I am sure IATE “…contains very large number of legal terms in multiple languages” but I would not trust the mapping for legal purposes until it has been verified.

Download at: http://iate.europa.eu/tbxPageDownload.do

Covering the European Elections with Linked Data

Wednesday, June 25th, 2014

Covering the European Elections with Linked Data by Basile Simon.

From the post:

What we wanted to do was:

  • to use Linked Data in a news context (something that the Vote2014 team was trying to do with Paul’s new model, article above),
  • to provide some background on this important event for the UK and Europe,
  • and to offer alternative coverage of the election (sort of).

In the end, we built an experimental dashboard for the elections, and eventually discovered some potentially editorially challenging stuff in our data—detailed below—which led us to decide not to release the experiment to the public. Despite being unable to release the project, this one or two weeks rush taught us lots, and we are today coming up with improvements to our data model, following the questions raised by our findings. Before we get to the findings, though, I’ll walk through the process of making the dashboard.

If you are thinking about covering the U.S. mid-term elections this Fall, you need to read Basile’s post.

Not only will you be inspired in many ways but you will gain insight into what it will take to have a quality interface ready by election time. It is a non-trivial task but apparently a very exciting one.

Perhaps you can provide an alternative to the mind numbing stalling until enough results are in for the elections to be called.

Follow the Money (OpenTED)

Tuesday, May 20th, 2014

Opening Up EU Procurement Data by Friedrich Lindenberg.

From the post:

What is the next European dataset that investigative journalists should look at? Back in 2012 at the DataHarvest conference, Brigitte, investigative superstar from FarmSubsidy and co-host of the conference, had a clear answer: let’s open up TED (Tenders Electronic Daily). TED is the EU’s shared procurement mechanism, and is at the heart of the EU contracting process. Opening it up would shine a light on the key questions of who receives public money, and what they receive it for.

Her suggestion triggered a two-year project, OpenTED, which, as of last week, has finally matured into a useful resource for journalists and researchers. While gaps remain, we hope it will now start to be used by journalists, NGOs, analysts and citizens to get information on everything from large scale trends to local municipal developments.

OpenTED

TED collects tender notices for large public projects so that companies from all EU countries can bid on those contracts. For journalists, there are many exciting questions such a database would be able to answer: What major projects are being announced? Who is winning the contracts for these projects, and is that decision made prudently and impartially? Who are the biggest suppliers in a particular country or industry?

A data dictionary for the project remains unfinished and there are plenty of other opportunities to contribute to this project.

The phrase “large public project” means projects with budgets in excess of €200,000. If experience in the United States holds true for the EU, there can be a lot of FGC (Fraud, Greed, Corruption) in under €200,000 contracts.

If you are looking for volunteer opportunities, the data needs to be used and explored, a data dictionary remains unfinished, current code can be improved and I assume documentation would be appreciated.

Certainly the type of project that merits widespread public support.

I find the project interesting because once you connect the players based on this data set, folding in other sets of connections, such as school, social, club, agency, employer, will improve the value of the original data set. Topic maps of course being my preferred method for the folding.

I first saw this in a tweet by ePSIplatform.

Feathers, Gossip and the European Union Court of Justice (ECJ)

Wednesday, May 14th, 2014

It is a common comment that the United States Supreme Court has difficulty with technology issues. Not terribly surprising since digital technology evolves several orders of magnitude faster than legal codes and customs.

But even if judicial digital illiteracy isn’t surprising, judicial theological illiteracy should be.

I am referring, of course, to the recent opinion by the European Court of Justice that there is a right to be “forgotten” in the records of the search giant Google.

In the informal press release about its decision, the ECJ states:

Finally, in response to the question whether the directive enables the data subject to request that links to web pages be removed from such a list of results on the grounds that he wishes the information appearing on those pages relating to him personally to be ‘forgotten’ after a certain time, the Court holds that, if it is found, following a request by the data subject, that the inclusion of those links in the list is, at this point in time, incompatible with the directive, the links and information in the list of results must be erased. The Court observes in this regard that even initially lawful processing of accurate data may, in the course of time, become incompatible with the directive where, having regard to all the circumstances of the case, the data appear to be inadequate, irrelevant or no longer relevant, or excessive in relation to the purposes for which they were processed and in the light of the time that has elapsed. The Court adds that, when appraising such a request made by the data subject in order to oppose the processing carried out by the operator of a search engine, it should in particular be examined whether the data subject has a right that the information in question relating to him personally should, at this point in time, no longer be linked to his name by a list of results that is displayed following a search made on the basis of his name. If that is the case, the links to web pages containing that information must be removed from that list of results, unless there are particular reasons, such as the role played by the data subject in public life, justifying a preponderant interest of the public in having access to the information when such a search is made. (The press release version, The official judgement).

Which doesn’t sound unreasonable, particularly if you are a theological illiterate.

One contemporary retelling of a story about St. Philip Neri goes as follows:

The story is often told of the most unusual penance St. Philip Neri assigned to a woman for her sin of spreading gossip. The sixteenth-century saint instructed her to take a feather pillow to the top of the church bell tower, rip it open, and let the wind blow all the feathers away. This probably was not the kind of penance this woman, or any of us, would have been used to!

But the penance didn’t end there. Philip Neri gave her a second and more difficult task. He told her to come down from the bell tower and collect all the feathers that had been scattered throughout the town. The poor lady, of course, could not do it-and that was the point Philip Neri was trying to make in order to underscore the destructive nature of gossip. When we detract from others in our speech, our malicious words are scattered abroad and cannot be gathered back. They continue to dishonor and divide many days, months, and years after we speak them as they linger in people’s minds and pass from one tale-bearer to the next. (From The Feathers of Gossip: How our Words can Build Up or Tear Down by Edward P. Sri)*

The problem with “forgetting” is the same one as the gossip penitent. Information is copied and replicated by sites for their own purposes. Nothing Google can do will impact those copies. Even if Google removes all of its references from a particular source, the information could be re-indexed in the future from new sources.

This decision is a “feel good” one for privacy advocates. But the ECJ should have recognized the gossip folktale parallel and decided that effective relief is impossible. Ordering an impossible solution diminishes the stature of the court and the seriousness with which its decisions are regarded.

Not to mention the burden this will place on Google and other search result providers, with no guarantee that the efforts will be successful.

Sometimes the best solution is to simply do nothing at all.

* There isn’t a canonical form for this folktale, which has been told and re-told by many cultures.

Search Gets Smarter with Identifiers

Wednesday, March 19th, 2014

Search Gets Smarter with Identifiers

From the post:

The future of computing is based on Big Data. The vast collections of information available on the web and in the cloud could help prevent the next financial crisis, or even tell you exactly when your bus is due. The key lies in giving everything – whether it’s a person, business or product – a unique identifier.

Imagine if everything you owned or used had a unique code that you could scan, and that would bring you a wealth of information. Creating a database of billions of unique identifiers could revolutionise the way we think about objects. For example, if every product that you buy can be traced through every step in the supply chain you can check whether your food has really come from an organic farm or whether your car is subject to an emergency recall.

….

The difficulty with using big data is that the person or business named in one database might have a completely different name somewhere else. For example, news reports talk about Barack Obama, The US President, and The White House interchangeably. For a human being, it’s easy to know that these names all refer to the same person, but computers don’t know how to make these connections. To address the problem, Okkam has created a Global Open Naming System: essentially an index of unique entities like people, organisations and products, that lets people share data.

“We provide a very fast and effective way of discovering data about the same entities across a variety of sources. We do it very quickly,” says Paolo Bouquet. “And we do it in a way that it is incremental so you never waste the work you’ve done. Okkam’s entity naming system allows you to share the same identifiers across different projects, different companies, different data sets. You can always build on top of what you have done in the past.”

The benefits of a unique name for everything

http://www.okkam.org/

The community website: http://community.okkam.org/ reports 8.5+ million entities.
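To make the quoted idea concrete, here is a minimal sketch of a naming index: several surface names resolve to one shared identifier. The identifier string and the alias table are made up for illustration. Nothing here is Okkam's actual data model or API.

```python
# Hypothetical identifier and alias table, for illustration only.
ALIASES = {
    "barack obama": "entity:0001",
    "the us president": "entity:0001",
    "the white house": "entity:0001",
}


def resolve(name):
    """Return the shared identifier for a name, or None if it is unknown."""
    return ALIASES.get(name.strip().lower())


def same_entity(a, b):
    """True only when both names resolve to the same known identifier."""
    id_a, id_b = resolve(a), resolve(b)
    return id_a is not None and id_a == id_b


print(same_entity("Barack Obama", "The White House"))
```

Note that the sketch only works for names someone has already entered in the table, and only for the sense of the name they had in mind.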

When the EU/CORDIS show up late for a party, it’s really late.

A multi-lingual organization like the EU (kudos on their efforts in that direction) should know that uniformity of language or identifiers is found only in dystopian fiction.

I prefer the language and cultural richness of Europe over the sterile uniformity of American fast food chains. Same issue.

You?

I first saw this in a tweet by Stefano Bertolo.

CORDIS – EU research projects under FP7 (2007-2013)

Monday, March 10th, 2014

CORDIS – EU research projects under FP7 (2007-2013)

Description:

This dataset contains projects funded by the European Union under the seventh framework programme for research and technological development (FP7) from 2007 to 2013. Grant information is provided for each project, including reference, acronym, dates, funding, programmes, participant countries, subjects and objectives. A smaller file is also provided without the texts for objectives.

The column separator is the “;” character.

The “Achievements” column is blank for all 22,653 projects/rows.
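If you want to repeat the blank-column check yourself, a minimal sketch with Python's standard csv module follows. The sample rows and the "Reference" and "Acronym" header names are assumptions for illustration; only the ";" separator and the "Achievements" column name come from the dataset description.

```python
import csv
import io


def blank_column_count(csv_text, column, delimiter=";"):
    """Return (blank_rows, total_rows) for one column of delimited text."""
    reader = csv.DictReader(io.StringIO(csv_text), delimiter=delimiter)
    blank = total = 0
    for row in reader:
        total += 1
        # A missing or whitespace-only value counts as blank.
        if not (row.get(column) or "").strip():
            blank += 1
    return blank, total


# Made-up sample rows, not taken from the CORDIS file.
sample = (
    "Reference;Acronym;Achievements\n"
    "000001;ALPHA;\n"
    "000002;BETA;\n"
)
print(blank_column_count(sample, "Achievements"))
```

For the real file, read its contents into a string (or pass an open file to csv.DictReader directly) and compare the two counts.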

Can you suggest other sources with machine-readable data on the results from EU research projects under FP7 (2007-2013)?

Thanks!

I first saw this in a tweet by Stefano Bertolo.

Advertising RDF and Linked Data:… [Where’s the beef?]

Saturday, February 1st, 2014

Advertising RDF and Linked Data: SPARQL Queries on EU Data

From the webpage:

This is a collection of SPARQL queries on EU data that shows benefits of converting it to RDF and linking it, i.e. queries that reveal non-trivial information that would have been hard to reconstruct by hunting it down over separate/unlinked data sources.

At first I thought this would be a cool demonstration of the use of SPARQL, with the queries as links and more fully set forth below.

Nada. The non-working hyperlinks in the list of queries I suspect were meant to be internal links to the fuller exposition of the queries.

Then when I get to the queries, the only one that promises:

Link to query result: http://www4.wiwiss.fu-berlin.de/eures/sparql

Returns a 404.

The other links appear to lead to webpages that present a SPARQL query which, if I had a SPARQL client, I could paste in to see the result.

I would mirror the question:

Effort of obtaining those results without RDFizing and linking:

with:

Effort to see “…benefits of converting [EU data] to RDF and linking it” without a SPARQL client, very high/impossible.
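For readers without a SPARQL client, the SPARQL protocol does allow a query to be sent as a plain HTTP GET with the query text in a `query` parameter. A minimal sketch follows; the endpoint is a placeholder and the query is a generic one, neither is taken from the EU page, and whether any given endpoint answers is another matter.

```python
from urllib.parse import urlencode


def sparql_get_url(endpoint, query):
    """Build the GET URL that carries a SPARQL query in the `query` parameter."""
    return endpoint + "?" + urlencode({"query": query})


# Placeholder endpoint and generic query, both assumptions for illustration.
ENDPOINT = "http://example.org/sparql"
QUERY = "SELECT ?s ?p ?o WHERE { ?s ?p ?o } LIMIT 5"

url = sparql_get_url(ENDPOINT, QUERY)
print(url)

# To run it against a live endpoint, something like this should do:
#   import urllib.request
#   req = urllib.request.Request(
#       url, headers={"Accept": "application/sparql-results+json"})
#   print(urllib.request.urlopen(req).read().decode("utf-8"))
```

That is still more work than clicking a link, which is the point of the complaint.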

That’s not just a criticism of RDF. Topic maps made a different mistake but it had the same impact.

The question for any user is “where’s the beef?” What am I gaining? Now, not some unknown number of tomorrows from now. Today!

PS: The EU data cloud has dropped the “Linked Open Data Around-the-Clock” moniker I reported in September of 2011. Same place, different branding. I suspect that is why governments like the web so much. Implementing newspeak policy is just a save away.

Legivoc – connecting laws in a changing world

Thursday, December 26th, 2013

Legivoc – connecting laws in a changing world by Hughes-Jehan Vibert, Pierre Jouvelot, Benoît Pin.

Abstract:

On the Internet, legal information is a sum of national laws. Even in a changing world, law is culturally specific (nation-specific most of the time) and legal concepts only become meaningful when put in the context of a particular legal system. Legivoc aims to be a semantic interface between the subject of law of a State and the other spaces of legal information that it will be led to use. This project will consist of setting up a server of multilingual legal vocabularies from the European Union Member States legal systems, which will be freely available, for other uses via an application programming interface (API).

And I thought linking all legal data together was ambitious!

So long as the EU was composed of civil law jurisdictions, I would not have taken odds on the success of the project but it could have some useful results.

Once you add in common law jurisdictions like the United Kingdom, the project may still have some useful results but there isn’t going to be a mapping across all the languages.

Part of the difficulty will be language but part of it will lie in the most basic assumptions of both systems.

In civil law, the drafters of legal codes attempt to systematically set out a set of principles that take each other into account and represent a blueprint for an ordered society.

Common law, on the other hand, has at its core court decisions that determine the results between two parties. And those decisions can be relied upon by other parties.

Between civil and common law jurisdictions, some laws/concepts may be more mappable than others. Modern labor law for example, may be new enough for semantic accretions to not prevent a successful mapping.

Older laws, property and inheritance laws, for example, are usually the most distinctive in any jurisdiction. Those are likely to prove impossible to map or reconcile.

Still, it will be an interesting project, particularly if they disclose the basis for any possible mapping, as opposed to simply declaring a mapping.

Both would be useful, but the former is robust in the face of changing law and the latter is brittle.

Think Tank Review

Saturday, December 7th, 2013

Think Tank Review by Central Library of the General Secretariat of the EU Council.

The title could mean a number of things so when I saw it at Full Text Reports, I followed it.

From the first page:

Welcome to issue 8 of the Think Tank Review compiled by the Council Library.* It references papers published in October 2013. As usual, we provide the link to the full text and a short abstract.

The current Review and past issues can be downloaded from the Intranet of the General Secretariat of the Council or requested to the Library.

A couple of technical points: the Think Tank Review will soon be made available – together with other bibliographic and research products from the Library – on our informal blog at http://www.councillibrary.wordpress.com. A Beta version is already online for you to comment.

More broadly, in the next months we will be looking for ways to disseminate the contents of the Review in a more sophisticated way than the current – admittedly spartan – collection of links cast in a pdf format. We will look at issues such as indexing, full text search, long-term digital preservation, ease of retrieval and readability on various devices. Ideas from our small but faithful community of readers are welcome. You can reach us at central.library@consilium.europa.eu.

I’m not a policy wonk so scanning the titles didn’t excite me but it might you or (more importantly) one of your clients.

It seemed like an odd enough resource that you may not encounter it by chance.

BARTOC launched : A register for vocabularies

Friday, November 15th, 2013

BARTOC launched : A register for vocabularies by Sarah Dister

From the post:

Looking for a classification system, controlled vocabulary, ontology, taxonomy, thesaurus that covers the field you are working in? The University Library of Basel in Switzerland recently launched a register containing the metadata of 600 controlled and structured vocabularies in 65 languages. Its official name: the Basel Register of Thesauri, Ontologies and Classifications (BARTOC).

High quality search

All items in BARTOC are indexed with Eurovoc, EU’s multilingual thesaurus, and classified using Dewey Decimal Classification (DDC) numbers down to the third level, allowing a high quality subject search. Other search characteristics are:

  • The search interface is available in 20 languages.
  • A Boolean operators field is integrated into the search box.
  • The advanced search allows you to refine your search by Field type, Language, DDC, Format and Access.
  • In the results page you can refine your search further by using the facets on the right side.

A great step towards bridging vocabularies but at a much higher (more general) level than any enterprise or government department.

Free Access to EU Satellite Data

Thursday, November 14th, 2013

Free Access to EU Satellite Data (Press Release, Brussels, 13 November 2013).

From the release:

The European Commission will provide free, full and open access to a wealth of important environmental data gathered by Copernicus, Europe’s Earth observation system. The new open data dissemination regime, which will come into effect next month, will support the vital task of monitoring the environment and will also help Europe’s enterprises, creating new jobs and business opportunities. Sectors positively stimulated by Copernicus are likely to be services for environmental data production and dissemination, as well as space manufacturing. Indirectly, a variety of other economic segments will see the advantages of accurate earth observation, such as transport, oil and gas, insurance and agriculture. Studies show that Copernicus – which includes six dedicated satellite missions, the so-called Sentinels, to be launched between 2014 and 2021 – could generate a financial benefit of some € 30 billion and create around 50.000 jobs by 2030. Moreover, the new open data dissemination regime will help citizens, businesses, researchers and policy makers to integrate an environmental dimension into all their activities and decision making procedures.

To make maximum use of this wealth of information, researchers, citizens and businesses will be able to access Copernicus data and information through dedicated Internet-based portals. This free access will support the development of useful applications for a number of different industry segments (e.g. agriculture, insurance, transport, and energy). Other examples include precision agriculture or the use of data for risk modelling in the insurance industry. It will fulfil a crucial role, meeting societal, political and economic needs for the sustainable delivery of accurate environmental data.

More information on the Copernicus web site at: http://copernicus.eu

The “€ 30 billion” financial benefit seems a bit soft after looking at the study reports on the economic value of Copernicus.

For example, if Copernicus is used to monitor illegal dumping (D. Drimaco, Waste monitoring service to improve waste management practices and detect illegal landfills), how is a financial benefit calculated for illegal dumping prevented?

If you are the Office of Management and Budget (U.S.), you could simply make up the numbers and report them in near indecipherable documents. (Free Sequester Data Here!)

I don’t doubt there will be economic benefits from Copernicus but questions remain: how much and for whom?

I first saw this in a tweet by Stefano Bertolo.

Eurostat regional yearbook 2013 [PDF as Topic Map Interface?]

Sunday, October 13th, 2013

Eurostat regional yearbook 2013

From the webpage:

Statistical information is an important tool for understanding and quantifying the impact of political decisions in a specific territory or region. The Eurostat regional yearbook 2013 gives a detailed picture relating to a broad range of statistical topics across the regions of the Member States of the European Union (EU), as well as the regions of EFTA and candidate countries. Each chapter presents statistical information in maps, figures and tables, accompanied by a description of the main findings, data sources and policy context. These regional indicators are presented for the following 11 subjects: economy, population, health, education, the labour market, structural business statistics, tourism, the information society, agriculture, transport, and science, technology and innovation. In addition, four special focus chapters are included in this edition: these look at European cities, the definition of city and metro regions, income and living conditions according to the degree of urbanisation, and rural development.

The Statistical Atlas is an interactive map viewer, which contains statistical maps from the Eurostat regional yearbook and provides the possibility to download these maps as high-resolution PDFs.

PDF version of the Eurostat regional yearbook 2013

But this isn’t a dead PDF file:

Under each table, figure or map in all Eurostat publications you will find hyperlinks with Eurostat online data codes, allowing easy access to the most recent data in Eurobase, Eurostat’s online database. A data code leads to either a two- or three-dimensional table in the TGM (table, graph, map) interface or to an open dataset which generally contains more dimensions and longer time series using the Data Explorer interface (3). In the Eurostat regional yearbook, these online data codes are given as part of the source below each table, figure and map.

In the PDF version of this publication, the reader is led directly to the freshest data when clicking on the hyperlinks for Eurostat online data codes. Readers of the printed version can access the freshest data by typing a standardised hyperlink into a web browser, for example:

http://ec.europa.eu/eurostat/product?code=<data_code>&mode=view, where <data_code> is to be replaced by the online data code in question.
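The standardised hyperlink is easy to generate for any data code. A minimal sketch, using the base URL and the `code` and `mode` parameters from the quoted pattern; the data code in the example is a made-up placeholder, not a real Eurostat code.

```python
from urllib.parse import urlencode

# Base URL taken from the pattern quoted in the yearbook.
BASE = "http://ec.europa.eu/eurostat/product"


def eurostat_link(data_code):
    """Return the standardised hyperlink for one online data code."""
    return BASE + "?" + urlencode({"code": data_code, "mode": "view"})


# "example_code" is a placeholder; substitute a code printed under a table.
print(eurostat_link("example_code"))
```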

A great data collection for anyone interested in the EU.

Take particular note of how delivery in PDF format does not preclude accessing additional information.

I assume that would extend to topic map-based content as well.

Where there is a tradition of delivery of information in a particular form, why would you want to change it?

Or to put it differently, what evidence is there of a pay-off from another form of delivery?

Noting that I don’t consider hyperlinks to be substantively different from other formal references.

Formal references are a staple of useful writing, albeit hyperlinks (can) take less effort to follow.

AT4AM: The XML Web Editor Used By…

Saturday, August 17th, 2013

AT4AM: The XML Web Editor Used By Members Of European Parliament

From the post:

AT4AM – Authoring Tool for Amendments – is a web editor provided to Members of European Parliament (MEPs) that has greatly improved the drafting of amendments at European Parliament since its introduction in 2010.

The tool, developed by the Directorate for Innovation and Technological Support of European Parliament (DG ITEC) has replaced a system based on a collection of macros developed in MS Word and specific ad hoc templates.

Moving beyond guessing the semantics of an author depends upon those semantics being documented at the point of creation.

Having said that, I think we all acknowledge that for the average user, RDF and its kin were DOA.

Interfaces such as AT4AM, if they can be extended to capture the semantics of their authors, would be a step in the right direction.

BTW, see the AT4AM homepage, complete with live demo.

DCAT Application Profile for Data Portals in Europe – Final Draft

Wednesday, May 22nd, 2013

DCAT Application Profile for Data Portals in Europe – Final Draft

From the post:

The DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe. Its basic use case is to enable a cross-data portal search for data sets and make public sector data better searchable across borders and sectors. This can be achieved by the exchange of descriptions of data sets among data portals.

This final draft is open for public review until 10 June 2013. Members of the public are invited to download the specification and post their comments directly on this page. To be able to do so you need to be registered and logged in.

If you are interested in integration of data from European data portals, it is worth the time to register, etc.

Not all the data you are going to need to integrate a data set but at least a start in the right direction.

Pan-European open data…

Wednesday, March 13th, 2013

Pan-European open data available online from EuroGeographics

From the post:

Data compiled from national mapping supplied by 45 European countries and territories can now be downloaded for free at http://www.eurogeographics.org/form/topographic-data-eurogeographics.

From today (8 March 2013), the 1:1 million scale topographic dataset, EuroGlobalMap will be available free of charge for any use under a new open data licence. It is produced using authoritative geo-information provided by members of EuroGeographics, the Association for European Mapping, Cadastre and Land Registry Authorities.

….

“World leaders acknowledge the need for further mainstream sustainable development at all levels, integrating economic, social and environmental aspects and recognising their inter-linkages,” she said. [EuroGeographics’ President, Ingrid Vanden Berghe]

“Geo-information is key. It provides a vital link among otherwise unconnected information and enables the use of location as the basis for searching, cross-referencing, analysing and understanding Europe-wide data.”

Geographic location is a common binding point for information.

Interesting to think about geographic steganography. Right latitude but wrong longitude, or other variations.

Six Degrees of Francis Bacon…

Friday, March 8th, 2013

Six Degrees of Francis Bacon, a 17th century social network by Nathan Yau.

From the post:

Network of Francis Bacon

Nathan points us to a project to determine the relationships of Francis Bacon:

Six Degrees of Francis Bacon.

Imagine that instead of collecting “door pass” data in the Man Bites Dog story about the influence of special interests in the EU Parliament, the study collected financial, social, education, and other relationships with members of the EU Parliament and the favors it bestows.

Same outcome? Or different?