Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

June 9, 2015

An identifier is not a string…

Filed under: Topic Maps — Patrick Durusau @ 6:49 pm

Deborah A. Lapeyre tweets:

An identifier is not a string, it is an association between a string and a thing #dataverse2015

I assume she is here: Dataverse Community Meeting 2015.

I'll be generous and assume that Deborah was just reporting something said by a speaker. 😉

What else would an identifier (or symbol) be if it wasn’t just a string?

What happens if we use a symbol (read word) in a conversation and the other person says: “I don’t understand.”

Do we:

  1. Skip the misunderstood part of the conversation?
  2. Repeat the misunderstood part of the conversation but louder?
  3. Expand or use different words for the misunderstood part of the conversation?

Are you betting on #3?

If you have ever played 20 questions, then you know that discovering what a symbol means involves listing other symbols and their values while you try to puzzle out the original symbol.

Think of topic maps as being the reverse of twenty questions. We start with the answer and we want to make sure everyone gets the same answer. So, how do you do that? You list questions and their answers (key/value pairs) for the answer.

It is a way of making the communication more reliable because if you don’t immediately recognize the answer, then you can consult the questions and their answers to make sure you understand.

Additional people can add their questions and answers to the answer so someone working in another language, for instance, can know you are talking about an answer they recognize.

True enough, you can claim an association between a string and “a thing” but then you are into all sorts of dubious and questionable metaphysics and epistemology. You are certainly free to list such an association between a string and a thing but that is only one question/answer among many.

You do realize of course that all the keys and values are answers in their own right and could also be described with a list of key/value pairs.
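To make that concrete, here is a minimal sketch in plain Python (my own illustration, not any topic map syntax or standard), showing an answer described by question/answer pairs, pairs added for another language, and a value that is itself describable the same way:

# An "answer" (subject) described by question/answer (key/value) pairs.
# Anyone who doesn't recognize the subject can consult the pairs, and
# anyone can add more pairs, including pairs in another language.
lucene = {
    "name": "Lucene",
    "type": "search engine library",
    "implemented in": "Java",
    "homepage": "http://lucene.apache.org/",
    "type (fr)": "bibliothèque de moteur de recherche",
}

# Every value above is an answer in its own right and could carry
# its own list of question/answer pairs.
java = {
    "name": "Java",
    "type": "programming language",
    "first appeared": "1995",
}

# Two descriptions identify the same answer when enough of their
# question/answer pairs line up.
def same_answer(a, b, keys=("name", "type")):
    return all(a.get(k) == b.get(k) for k in keys)

The keys chosen for comparison ("name", "type") are arbitrary here; the point is only that identification rests on matching pairs rather than on a single string.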

I think I like the reverse of twenty-questions better than my earlier identifier explanation. You can play as short or as long a game as you choose.

Does that work for you?

Fast Track to the Corporate Wish List [Is There A Hacker In The House?]

Filed under: Government,Government Data,Law,Politics — Patrick Durusau @ 6:19 pm

Fast Track to the Corporate Wish List by David Dayen.

From the post:

Some time in the next several days, the House will likely vote on trade promotion authority, enabling the Obama administration to proceed with its cherished Trans-Pacific Partnership (TPP). Most House Democrats want no part of the deal, which was crafted by and for corporations. And many Tea Party Republicans don’t want to hand the administration any additional powers, even in service of a victory dearly sought by the GOP’s corporate allies. The vote, which has been repeatedly delayed as both the White House and House GOP leaders try to round up support, is expected to be extremely close.

The Obama administration entered office promising to renegotiate unbalanced trade agreements, which critics believe have cost millions of manufacturing jobs in the past 20 years. But they’ve spent more than a year pushing the TPP, a deal with 11 Pacific Rim nations that mostly adheres to the template of corporate favors masquerading as free trade deals. Of the 29 TPP chapters, only five include traditional trade measures like reducing tariffs and opening markets. Based on leaks and media reports—the full text remains a well-guarded secret—the rest appears to be mainly special-interest legislation.

Pharmaceutical companies, software makers, and Hollywood conglomerates get expanded intellectual property enforcement, protecting their patents and their profits. Some of this, such as restrictions on generic drugs, is at the expense of competition and consumers. Firms get improved access to poor countries with nonexistent labor protections, like Vietnam or Brunei, to manufacture their goods. TPP provides assurances that regulations, from food safety to financial services, will be “harmonized” across borders. In practice, that means a regulatory ceiling. In one of the most contested provisions, corporations can use the investor-state dispute settlement (ISDS) process, and appeal to extra-judicial tribunals that bypass courts and usual forms of due process to seek monetary damages equaling “expected future profits.”

How did we reach this point—where “trade deals” are Trojan horses for fulfilling corporate wish lists, and where all presidents, Democrat or Republican, ultimately pay fealty to them? One place to look is in the political transfer of power, away from Congress and into a relatively obscure executive branch office, the Office of the United States Trade Representative (USTR).

USTR has become a way station for hundreds of officials who casually rotate between big business and the government. Currently, Michael Froman, former Citigroup executive and chief of staff to Robert Rubin, runs USTR, and his actions have lived up to the agency’s legacy as the white-shoe law firm for multinational corporations. Under Froman’s leadership, more ex-lobbyists have funneled through USTR, practically no enforcement of prior trade violations has taken place, and new agreements like TPP are dubiously sold as progressive achievements, laced with condescension for anyone who disagrees.

David does a great job of sketching the background for both the Trans-Pacific Partnership and the U.S. Trade Representative.

Given the hundreds of people, nation states and corporations that have access to the Trans-Pacific Partnership text, don't you wonder why it remains secret?

I don’t think President Obama and his business cronies realize that secrecy of an agreement that will affect the vast majority of American citizens strikes at the legitimacy of government itself. True enough, corporations that own entire swaths of Congress are going to get more benefits than the average American. Those benefits are out in the open and citizens can press for benefits as well.

The benefits that accrue to corporations under the Trans-Pacific Partnership will be gained in secret, with little or no opportunity for the average citizen to object. There is something fundamentally unfair about the secret securing of benefits for corporations.

I hope that Obama doesn’t complain about “illegal” activity that foils his plan to secretly favor corporations. I won’t be listening. Will you?

Congress.gov Webinar 11 June 2015 2PM-3PM

Filed under: Government,Law,Law - Sources — Patrick Durusau @ 5:51 pm

Congress.gov Webinar

The Law Library of Congress is putting on a webinar about Congress.gov this coming Thursday, 11 June 2015, 2PM-3PM.

Whether you believe laws really matter or you just need to find laws/action for rhetoric, this is likely to be a very good webinar!

See you there!

SciGraph

Filed under: Graphs,Neo4j,Ontology — Patrick Durusau @ 5:41 pm

SciGraph

From the webpage:

SciGraph aims to represent ontologies and data described using ontologies as a Neo4j graph. SciGraph reads ontologies with owlapi and ingests ontology formats available to owlapi (OWL, RDF, OBO, TTL, etc). Have a look at how SciGraph translates some simple ontologies.

Goals:

  • OWL 2 Support
  • Provide a simple, usable, Neo4j representation
  • Efficient, parallel ontology ingestion
  • Provide basic “vocabulary” support
  • Stay domain agnostic

Non-goals:

  • Create ontologies based on the graph
  • Reasoning support

Some applications of SciGraph:

  • the Monarch Initiative uses SciGraph for both ontologies and biological data modeling [repaired link] [Monarch enables navigation across a rich landscape of phenotypes, diseases, models, and genes for translational research.]
  • SciCrunch uses SciGraph for vocabulary and annotation services [biomedical but also has US patents?]
  • CINERGI uses SciGraph for vocabulary and annotation services [Community Inventory of EarthCube Resources for Geosciences Interoperability, looks very ripe for a topic map discussion]

If you are interested in representation, modeling or data integration with ontologies, you definitely need to take a look at SciGraph.
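For a sense of what the Neo4j representation buys you, here is a hedged sketch using the official Neo4j Python driver. The node label (:Class), relationship type (:subClassOf), and property names are assumptions for illustration only; the actual schema depends on how SciGraph maps OWL constructs into the graph, so check the translation examples linked above before relying on any of these names.

from neo4j import GraphDatabase  # pip install neo4j

# Hypothetical connection details for a Neo4j instance loaded by SciGraph.
driver = GraphDatabase.driver("bolt://localhost:7687", auth=("neo4j", "password"))

# Assumed mapping: OWL classes as (:Class {iri, label}) nodes and
# rdfs:subClassOf as [:subClassOf] relationships.
query = """
MATCH (child:Class)-[:subClassOf]->(parent:Class {label: $label})
RETURN child.iri AS iri, child.label AS label
"""

with driver.session() as session:
    for record in session.run(query, label="phenotypic abnormality"):
        print(record["iri"], record["label"])

driver.close()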

Enjoy!

Titan 0.9.0-M2 Release

Filed under: Graphs,TinkerPop,Titan — Patrick Durusau @ 4:02 pm

Titan 0.9.0-M2 Release.

From Dan LaRocque:

Aurelius is pleased to release Titan 0.9.0-M2. 0.9.0-M2 is an experimental release intended for development use.

This release uses TinkerPop 3.0.0.M9-incubating, compared with 3.0.0.M6 in Titan 0.9.0-M1. Source written against Titan 0.5.x and earlier will generally require modification to compile against Titan 0.9.0-M2. As TinkerPop 3 requires a Java 8 runtime, so too does Titan 0.9.0-M2.

While 0.9.0-M1 came out with a separate console and server zip archive, 0.9.0-M2 is a single zipfile with both components. The zipfile is still only packaged with Hadoop 1 to match TP3’s Hadoop support.

http://s3.thinkaurelius.com/downloads/titan/titan-0.9.0-M2-hadoop1.zip

Documentation:

Manual: http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/
Javadoc: http://titan.thinkaurelius.com/javadoc/0.9.0-M2/

The upgrade instructions and changelog for 0.9.0-M2 are in the usual places.

http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/upgrade.html
http://s3.thinkaurelius.com/docs/titan/0.9.0-M2/changelog.html

I have to limit my reading of people who pretend that C/C+ level hacks (like the OPM breach) are "…the work of most sophisticated state-sponsored cyber intrusion entities."

Enjoy!

Studying Law Studying Surveillance

Filed under: Cybersecurity,Privacy — Patrick Durusau @ 3:19 pm

Studying Law Studying Surveillance by Julie Cohen.

Abstract:

The dialogue between law and Surveillance Studies has been complicated by a mutual misrecognition that is both theoretical and temperamental. Legal scholars are inclined to consider surveillance simply as the (potential) subject of regulation, while scholarship in Surveillance Studies often seems not to grapple with the ways in which legal processes and doctrines are sites of contestation over both the modalities and the limits of surveillance. Put differently, Surveillance Studies takes notice of what law does not—the relationship between surveillance and social shaping—but glosses over what legal scholarship rightly recognizes as essential—the processes of definition and compromise that regulators and other interested parties must navigate, and the ways that legal doctrines and constructs shape those processes. This article explores the fault lines between law and Surveillance Studies and considers the potential for more productive confrontation and dialogue in ways that leverage the strengths of each tradition.

Quite an interesting read but to be honest, I would rather confront surveillance studies on its running failure to produce results than on theory questions.

When I say, “its running failure to produce results,” I have to acknowledge that drone strikes and cruise missiles have been used to settle private scores between citizens in Afghanistan and elsewhere, but that seems like a very poor rate of return. And we shouldn’t forget the mentally disturbed and wannabe terrorists that the FBI assists, one assumes on the basis of surveillance evidence.

What I suspect the surveillance camp has yet to comprehend is that 24 x 7 total surveillance of even a smallish group of people is going to take the collective bandwidth of at least three to four times the number of people under surveillance, to say nothing of the infrastructure needed to keep all that watching coordinated.

With the limited surveillance data that is being captured now, the surveillance community has demonstrated its inability to do much worthwhile with the data. The recent story of the TSA being unable to identify 73 TSA employees as having links to terrorism is yet another case in point. The surveillance community is unable to effectively share data with agencies that need to have it.

I would start any dialogue or debate about surveillance by putting the burden of proof squarely on the shoulders of the surveillance community. What evidence do they have that surveillance works at all? Or that particular procedures, such as the bulk collection of phone metadata, are effective? The latest review of the phone records program suggests that, for all of the hand wringing over it, it has yet to be useful.

No doubt there is the potential for it to be useful, but that could be said about almost any human activity. We need some basis beyond paranoia and/or the need to sustain agency budgets to support surveillance programs.

It’s not that liberal theory isn’t important for the law, it is, but if there is no factual basis for even evaluating surveillance, then why trouble ourselves?

I first saw this in a tweet by Bruce Schneier.

Apache Lucene 5.2, Solr 5.2 Available

Filed under: Lucene,Solr — Patrick Durusau @ 1:39 pm

From the news:

Lucene can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/java/5.2.0 and Solr can be downloaded from http://www.apache.org/dyn/closer.cgi/lucene/solr/5.2.0

See the Lucene CHANGES.txt and Solr CHANGES.txt files included with the release for a full list of details.

Enjoy!

PS: Also see the Reference Guide for Solr 5.2.

Bogus OPM – China Claims Debunked

Filed under: Cybersecurity,Security — Patrick Durusau @ 1:03 pm

Summary:

Claim 1: Stolen, signed security certificates, require a level of sophistication not observed outside nation-state cyber forces. (Bogus)

Claim 2: Mimikatz is a classic [tactics, techniques and procedures] of Deep Panda. (Bogus)

The details:

Start with Hacking as Offensive Counterintelligence by John Schindler, which reads in part:

The IC is pointing the finger at China, tentatively, apparently at hacking entities that have a “close relationship” with Chinese intelligence. The case for official Chinese culpability is growing.

The “is growing” hyperlink takes us to a tweet by Bill Gertz (that got 8 retweets and 2 favorites) that reads:

Week in Cyber Threat Space: New technical details reveal PLA link to OPM hack http://flashcritic.com/technical-forensics-of-opm-hack-reveal-pla-links-to-cyber-attacks-targeting-americans/

Which takes us to: Technical forensics of OPM hack reveal PLA links to cyber attacks targeting Americans by Bill Gertz.

Evidence for Chinese groups being involved?

Sakula is a Remote Access Tool, or RAT, that employs the use of stolen, signed security certificates, a technique requiring a level of sophistication not observed outside nation-state cyber forces.

The domain names used by the hackers in the OPM attack included OPMsecurity.org and opm-learning.org.

Claim 1:

“…stolen, signed security certificates, a technique requiring a level of sophistication not observed outside nation-state cyber forces.”

Really?

A casual web search uncovers: Bogus AV program uses 12 stolen digital certificates to make the malware look legit by Jeremy Kirk.

Which reads in part:

The samples of Antivirus Security Pro collected by Microsoft used stolen certificates issued “by a number of different CAs to software developers in various locations around the world,” the company wrote.

The certificates were issued to developers in the Netherlands, U.S., Russia, Germany, Canada and the U.K. by CAs such as VeriSign, Comodo, Thawte and DigiCert, according to a chart.

Using stolen certificates is not a new tactic, but it is usually considered difficult to accomplish since hackers have to either breach an organization or an entity that issues the certificates.

One of the certificates was issued just three days before Microsoft picked up samples of Antivirus Security Pro using it, indicating "that the malware's distributors are regularly stealing new certificates, rather than using certificates from an older stockpile."

The claim that:

“…stolen, signed security certificates, a technique requiring a level of sophistication not observed outside nation-state cyber forces.”

is: BOGUS!

Claim 2:

Another method used by the Chinese in the OPM data breach was Mimikatz, software that allows remote users to learn network administrator log-in credentials through a relatively simple process.

“Mimikatz is a classic [tactics, techniques and procedures] of Deep Panda,” said a security analyst familiar with details of the attack. “This allows the actors to dump password hashes, perform pass the hash and ‘golden ticket’ attacks in the victim environment.”

A “classic…of Deep Panda?”

Mimikatz is the latest, and one of the best, tool to gather credential data from Windows systems. In fact I consider Mimikatz to be the “swiss army knife” of Windows credentials – that one tool that can do everything. Since the author of Mimikatz, Benjamin Delpy, is French most of the resources describing Mimikatz usage is in French, at least on his blog. The Mimikatz GitHub repository is in English and includes useful information on command usage.

Mimikatz is a Windows x32/x64 program coded in C by Benjamin Delpy (@gentilkiwi) in 2007 to learn more about Windows credentials (and as a Proof of Concept). There are two optional components that provide additional features, mimidrv (driver to interact with the Windows kernal) and mimilib (AppLocker bypass, Auth package/SSP, password filter, and sekurlsa for WinDBG). Mimikatz requires administrator or SYSTEM and often debug rights in order to perform certain actions and interact with the LSASS process (depending on the action requested).

From Mimikatz and Active Directory Kerberos Attacks by Sean Metcalf.

OK, I guess “classic” is the right term for a program that is (3 months on the Internet = 1 Web year) thirty-two (32) Web years old.

I don’t see any basis for claiming that Mimikatz is unique to “Deep Panda,” assuming “Deep Panda” is something other than marketing hype. Mimikatz is available from Github and is used by thousands if not tens of thousands of users.

The claim that:

“Mimikatz is a classic [tactics, techniques and procedures] of Deep Panda,”

is: BOGUS!

The article does have one useful bit of information related to the OPM hack:

The OPM hack involved the compromise of administrator-level access that allowed the hackers to download information, and potentially to alter or corrupt data within the system.

Do you remember item 4 on page 40 of the United States Office of Personnel Management, Agency Financial Report, 2014:

The password length setting for privileged user accounts did not meet minimum OPM password length requirements.

So the OPM hack did not occur because:

the group [was] among the most sophisticated state-sponsored cyber intrusion entities.

but rather because “privileged users” failed to “meet minimum OPM password length requirements.”

No nation states required for the OPM hack, just poorly skilled network administrators.

June 8, 2015

Reporters Need To Learn Basic Data Skills [and skepticism]

Filed under: Journalism,News,Reporting — Patrick Durusau @ 10:42 am

It’s Time For Every Journalist To Learn Basic Data Skills by Marta Kang.

From the post:

First came the web. Then came social media. Now journalists face a new challenge on the horizon: big data.

It used to be that data journalism lived in a corner of the newsroom, in the care of investigative or business reporters. But in recent years, big data has amassed at such a rate that it can no longer be the responsibility of a few.

Numbers, Numbers On Every Beat

In 2013, IBM researchers found that 90 percent of the world’s data had been created in the previous two years. We’d suddenly gained a quantitative understanding of our world! This new knowledge base empowers us to predict the spread of disease, analyze years of government spending, and even understand how an extra cup of coffee might affect one’s sleep quality. We’ve essentially gained countless new perspectives — bird’s-eye views, granular views, inward views of ourselves — as long as we know how to make sense of the numbers.

Many news outlets have already taken to using data to drive a range of stories, from the profound to the surprising. ProPublica and NPR calculated how much limbs are worth in each state to highlight the dramatic disparity in workers’ comp benefits across the U.S. The Washington Post analyzed 30 years of groundhog forecasts and found that “a groundhog is just a groundhog,” and not a weatherman, alas.

Still, conversations about data-driven journalism have mostly focused on large-scale projects by industry powerhouses and new outlets like FiveThirtyEight and Vox.

But the job shouldn’t be left to big newsrooms with dedicated teams. In this era of big data, every journalist must master basic data skills to make use of all sources available to them.

“There’s so much data available now, and there’s basically data on every single beat, and you have reporters getting spreadsheets all the time,” Chad Skelton, a data journalist with The Vancouver Sun, told me.

I am very sympathetic to Marta's agenda of motivating journalists to learn basic data skills. Reporters can hardly work in the public interest if they have to accept untested claims about data from one source or another. They need the skills not only to use data but to question data when a story sounds too good to be true.

For example, when the news broke about the data breach at the U.S. Office of Personnel Management (OPM), how many reports headlined that the Inspector General for the OPM had requested at least two of their computer systems be shut down because they “could potentially have national security implications?” U.S. Was Warned of System Open to Cyberattacks

Oh, yes, that was the zero (0) number, the empty set, nil.

Several days after the drum beat that China was responsible for the breach, with no evidence proffered to support that accusation, the full brokenness of the OPM systems is coming to light.

Not that reporters have to have every detail on the first report, but finding an Inspector General’s report on security (obviously relevant to a data breach story) isn’t an exercise in data sleuthing.

Knowing how broken the systems were reported to be by the OPM's own Inspector General increases the range of suspects to high school hackers, college CS students, professional hackers, other nation states, and everyone in between.

The nearest physical analogy I can think of would be to have a pallet of $100 bills in the middle of Times Square and when a majority of those go missing, inventing a security fantasy that only a bank could steal money in those quantities.

Anybody with a computer could have broken into OPM systems and it is disingenuous to posture and pretend otherwise.

Let’s teach journalist basic data skills but let’s also teach them to not repeat gibberish from government spokespersons. In many cases it isn’t news even if they say it. In the case of data breaches, it is more often than not noise.

Five ways to spot a phishing email [There’s a sucker born every 0.108 seconds]

Filed under: Cybersecurity,Security — Patrick Durusau @ 8:23 am

Five ways to spot a phishing email by Kevin Wright.

I know none of my readers really need this sort of reminder/advice, but it's a good summary to pass on to others.

Kevin’s list reads (see his post for details):

  1. An odd from name/domain
  2. Spelling and grammar
  3. Over-the-top calls to action
  4. Mismatched and masked links
  5. A request for personal information

I am guessing from #2, Spelling and grammar, that Kevin hasn’t read many government reports or regulations. Few enough spelling errors but the grammar is atrocious. 😉
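Item #4, mismatched and masked links, is the one that lends itself most readily to automation. Here is a minimal sketch (my own illustration, not code from Kevin's post) that flags anchor tags whose visible text shows one domain while the underlying href points somewhere else:

import re
from html.parser import HTMLParser
from urllib.parse import urlparse

class LinkChecker(HTMLParser):
    """Collect (href, link text) pairs from an HTML email body."""
    def __init__(self):
        super().__init__()
        self.links, self._href, self._text = [], None, []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self._href = dict(attrs).get("href", "")
            self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.links.append((self._href, "".join(self._text).strip()))
            self._href = None

def suspicious_links(html_body):
    """Flag links whose visible text shows one domain but whose href points elsewhere."""
    parser = LinkChecker()
    parser.feed(html_body)
    flagged = []
    for href, text in parser.links:
        shown = re.search(r"[\w.-]+\.[a-z]{2,}", text.lower())
        if shown and shown.group(0) not in urlparse(href).netloc.lower():
            flagged.append((text, href))
    return flagged

# Example: the displayed text claims one site, the href goes to another.
print(suspicious_links('<a href="http://evil.example.net/login">www.mybank.com</a>'))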

BTW, Kevin reports that the stats on phishing emails are grim:

Phishing emails are inexpensive to deploy, and a staggering 156 million are sent every day. Of these, 8 million are opened and 800,000 individuals click on the phishing links inside.

Or to update the alleged P.T. Barnum phrase, “There’s a sucker born every minute,” for the Internet Age:

There’s a sucker born every 0.108 seconds.

June 7, 2015

Debug like a doctor [Not Like the FBI]

Filed under: Cybersecurity,Security — Patrick Durusau @ 9:15 pm

Debug like a doctor by Connor Mendenhall.

The crux of Connor's post, which he then explains very well, is:

Differential diagnosis is a systematic method used by doctors to match sets of symptoms with their likely causes. A good differential diagnosis consists of four distinct steps:

  1. List all the observed symptoms.
  2. List possible causes for the observed symptoms.
  3. Rank the list of causes in order of urgency.
  4. Conduct test to rule out causes in priority order.

You can contrast that with the FBI method of investigating data breaches:

  1. Get incomplete/incoherent account of data loss, requiring data loss updates after months of investigation.
  2. Leak to news media anonymous accusations that China is responsible for the data breach.

A lack of cybersecurity talent requires a coarsening of some steps of investigation, but I think it has been taken too far. Take the Office of Personnel Management breach, where the estimate of data loss worsens day by day.

Take a tip from the big data people, start with the data and not with the result you want.

Signatures, patterns and trends: Timeseries data mining at Etsy

Filed under: Cybersecurity,Pattern Recognition,Streams,Time Series — Patrick Durusau @ 8:56 pm

From the description:

Etsy loves metrics. Everything that happens in our data centres gets recorded, graphed and stored. But with over a million metrics flowing in constantly, it’s hard for any team to keep on top of all that information. Graphing everything doesn’t scale, and traditional alerting methods based on thresholds become very prone to false positives.

That’s why we started Kale, an open-source software suite for pattern mining and anomaly detection in operational data streams. These are big topics with decades of research, but many of the methods in the literature are ineffective on terabytes of noisy data with unusual statistical characteristics, and techniques that require extensive manual analysis are unsuitable when your ops teams have service levels to maintain.

In this talk I’ll briefly cover the main challenges that traditional statistical methods face in this environment, and introduce some pragmatic alternatives that scale well and are easy to implement (and automate) on Elasticsearch and similar platforms. I’ll talk about the stumbling blocks we encountered with the first release of Kale, and the resulting architectural changes coming in version 2.0. And I’ll go into a little technical detail on the algorithms we use for fingerprinting and searching metrics, and detecting different kinds of unusual activity. These techniques have potential applications in clustering, outlier detection, similarity search and supervised learning, and they are not limited to the data centre but can be applied to any high-volume timeseries data.

Blog post: https://codeascraft.com/2013/06/11/introducing-kale/

Signatures, patterns and trends? Sounds relevant to monitoring network traffic patterns. Yes?

Good focus on anomaly detection, pointing out that many explanations are overly simplistic.
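To see why fixed thresholds generate so many false positives on noisy operational data, compare them with even a simple adaptive baseline. The sketch below is my own illustration, not Kale's algorithm: it flags points that stray several standard deviations from a rolling window of recent values.

from collections import deque
from statistics import mean, stdev

def rolling_zscore_anomalies(series, window=60, threshold=3.0):
    """Yield (index, value) pairs that deviate from the recent baseline.

    Unlike a fixed threshold, the baseline adapts to the metric's own
    recent level and variance, so daily cycles and slow drift do not
    constantly trip the alarm.
    """
    history = deque(maxlen=window)
    for i, value in enumerate(series):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) / sigma > threshold:
                yield i, value
        history.append(value)

# Toy example: a flat-ish metric with one spike at index 150.
metric = [100 + (i % 5) for i in range(200)]
metric[150] = 500
print(list(rolling_zscore_anomalies(metric)))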

Use case is one (1) million incoming metrics.

Looking forward to seeing this released as open source!

June 6, 2015

MorganaXProc

Filed under: XML,XProc — Patrick Durusau @ 7:35 pm

MorganaXProc

From the webpage:

MorganaXProc is an implementation of W3C’s XProc: An XML Pipeline Language written in Java™.

I first saw this in a tweet by Norm Walsh (think XML Calabash, also an implementation of XProc). We could use more people like Norm.

The New 'China Syndrome' – Saving Face By Blaming China

Filed under: Cybersecurity,Security — Patrick Durusau @ 1:23 pm

The original China Syndrome was a movie about a cover-up of safety hazards at a nuclear power plant, starring Jane Fonda, Michael Douglas, and Jack Lemmon. The idea was that if a nuclear reactor were to melt down, the molten core would be on its way to China, hence "China syndrome."

There is a new "China Syndrome" that is the current darling of the U.S. government and its toady press following. The new "China Syndrome" blames China for every breach in cybersecurity in the United States, particularly of U.S. government sites. The latest round of these specious accusations surrounds the 2015 data breach at the U.S. Office of Personnel Management.

The Wall Street Journal was the first to repeat unsubstantiated claims by U.S. government sources pointing at hackers in China as the source of the attack. From there it has flared into a general parroting contest in the media to see who can repeat the claim the most often. No one from the press, it appears, has obtained any evidence to substantiate such a claim. Nor are they likely to, since the hack was either months or a year ago (accounts differ).

Even more disturbing, CNN has reported as “news” the following eleven steps (the story headline says ten, but we already knew CNN has trouble with numbers beyond two) to hack the U.S. government:

  1. Find Agency X
  2. Spam
  3. Get a federal worker to reply
  4. Focus on Agency X
  5. Find more points of entry
  6. Spread
  7. Discover vulnerabilities
  8. Become an admin
  9. Create new users
  10. Exploit fake users
  11. Avoid detection

And…

In April, the U.S. government learned of the ten-step plan to hack it. For two months, the federal government didn’t reveal the information publicly because they had not yet cleaned up the entire system. Nor did federal officials want the Chinese to know they were onto them.

Really? And this “ten step” plan differs from hacking anyone else how? I suspect this is representative of the level of government understanding of cybersecurity. Now you know why the U.S. government is cyberinsecure. Yes?

Did you know that for at least the past two years privileged users at OPM have not followed rules on password length? Or that staff who no longer work at the agency may still have valid access to data? Or that users may have greater access than necessary for their positions? See U.S. Office of Personnel Management Data Breach for details and sources.

Here’s a five step plan to hack the OPM:

  1. Locate likely privileged user on LinkedIn or other social network site. (count on LinkedIn today is over 3,000)
  2. Locate network address for OPM login
  3. Brute force short password
  4. Exploit user’s access
  5. Avoid detection

Or if that seems like too much work:

  1. Locate likely privileged user on LinkedIn or other social network site. (count on LinkedIn today is over 3,000)
  2. Steal user’s laptop
  3. Use password saved in browser to log in to the OPM network
  4. Exploit user’s access
  5. Avoid detection

Either way works.

You don't have to be a nation state to breach U.S. government security, and pretending otherwise annoys potential friends (China) and prevents us from addressing known security issues. Like management incompetence at OPM. How difficult is it to enforce password restrictions? If a privileged user can't log on because they can't use a secure password, then you know they have outlived their usefulness at the agency.

PS: China or at least hackers in China could have been responsible for the OPM hack but then anything is possible. Saving face instead of addressing management and security issues is a very poor cybersecurity strategy.

Cheap Bastards

Filed under: Cybersecurity,Security — Patrick Durusau @ 12:36 pm

Have you heard the story about the couple who celebrated their 50th wedding anniversary at an elegant restaurant with their three sons? The father was chagrined to find that none of the sons had given them presents for the occasion. After the meal was over, the father said: “There’s something your mother and I have been meaning to tell you for years. We were never married.” The sons gasped and the youngest son blurted: “You mean we are all bastards?” “Yes,” said the father, “and cheap ones too!”

That story came to mind when I read the Tesla bug bounty program award list:

• XSS: $200–$500

• CSRF: $100–$500

• SQL: $500–$1,000

• Command injection: $1,000

• Business logic issues: $100–$300

• Horizontal privilege escalation: $500

• Vertical privilege escalation: $500–$1,000

• Forceful browsing/Insecure direct object references: $100–$500

• Security misconfiguration: Up to $200

• Sensitive data exposure: Up to $300

Given the education, experience, training, expertise, equipment, and resources needed to be a first-class hacker, how are you going to make a living at $300 for "sensitive data exposure?"

A hack may only take you a few seconds but it isn’t like you are doing piece work in a garment factory. The time it takes to perform a hack shouldn’t be the measure for payment.

If you are seriously interested in improving cybersecurity, unlike the leadership at the U.S. Office of Personnel Management, then don’t be a “cheap bastard,” when it comes to cybersecurity. If that seems unclear, you know where to find me for further details.

June 5, 2015

‘prevent encryption above all else’

Filed under: Cybersecurity,Security — Patrick Durusau @ 6:39 pm

‘prevent encryption above all else’ by Andrea Peterson.

From the post:

The debate over encryption erupted on Capitol Hill again Wednesday, with an FBI official testifying that law enforcement’s challenge is working with tech companies “to build technological solutions to prevent encryption above all else.”

At first glance the comment from Michael B. Steinbach, assistant director in the FBI’s Counterterrorism Division, might appear to go further than FBI Director James B. Comey. Encryption, a technology widely used to secure digital information by scrambling data so only authorized users can decode it, is “a good thing,” Comey has said, even if he wants the government to have the ability get around it.

But Steinbach’s testimony also suggests he meant that companies shouldn’t put their customers’ access to encryption ahead of national security concerns — rather than saying the government’s top priority should be preventing the use of the technology that secures basically everything people do online.

That seems plain enough doesn’t it?

BTW, Michael Steinbach isn’t just a name in a report. He can be identified with this photo:

[Photo: Michael B. Steinbach]

The picture in Andrea’s post is much more recent but probably also subject to copyright restrictions. I got mine off the FBI website at: http://www.fbi.gov/about-us/executives/steinbach.

Whatever the government says, the goal is to leave you ripe for the rape of identification measures. And that’s only the beginning.

Have You Ever Pwned an F-35?

Filed under: Cybersecurity,Security — Patrick Durusau @ 4:26 pm

[Photo: Lockheed Martin F-35 Lightning II]

Wikipedia reports the weapon systems of the Lockheed Martin F-35 Lightning II as follows:

Not all options are available on any one flight.

Sean Lyngaas reports in Untold lines of code make Pentagon weapons vulnerable:

Weapons systems remain vulnerable to hacking despite the billions of dollars the Defense Department spends annually on cybersecurity, Pentagon officials have acknowledged. Frank Kendall, the department’s top acquisition official, is taking a stab at the problem through his latest round of guidance, but he appears to be up against formidable foes in the scope of the threat and the cost of addressing it.

There are nine million lines of code in the F-35 joint strike fighter jet, plus 15 million lines in support systems, according to Richard Stiennon, chief research analyst at IT-Harvest. Cleaning up all the code in the weapons systems being produced for DOD would cost hundreds of billions of dollars alone, reckoned Stiennon, who is writing a book on cyber warfare. “In other words, if we ever go to war with a sophisticated adversary, or have a battle, they could pull out their cyber weapons and make us look pretty foolish,” he said.

Stiennon of IT-Harvest said cyber vulnerabilities have been baked into the defense acquisition system. “The Pentagon made a mistake common to many manufacturers,” he wrote in an op-ed in November 2014. “They assumed that because their systems were proprietary and distribution was controlled there would be no hacking, no vulnerabilities discovered, and no patch-management cycles to fix them. This is security by obscurity, an approach that always fails over time.”

Let’s see, 9 million lines of code in the F-35 plus 15 million in support systems, what, 24 million lines of code?

Is anyone giving odds on the first zero-day bug being a buffer overflow condition?

Welcome to the Internet of Things! Where potentially hackable things include the F-35 with the weapons systems listed above.

The data scientists keep wailing about a shortage of data scientists. Much more likely to have a shortage of cracker talent.

Better to break an F-35 yourself while it is sitting on the ground, unarmed, than for that to happen in the air while carrying a nuke.

Top cracker talent is going to start attracting baseball like salaries. What did they used to say: “The future is so bright I have to wear shades?”

PS: You do realize that cracking an F-35 without permission of its owner and/or as part of a country’s military is likely a crime in most jurisdictions? Just checking.


Editorial correction:

The original lead sentence read:

The Wikipedia article Lockheed Martin F-35 Lightning II reports that you could gain control over:

which to one close reader implied that Wikipedia stated that hacks of an F-35 would give control over the weapons systems listed.

To clarify that the only reliance on Wikipedia was for the list of weapons systems, the lead sentence now reads:

Wikipedia reports the weapon systems of the Lockheed Martin F-35 Lightning II as follows:

I amended the paragraph that starts: “Welcome to the Internet of Things!” to read:

Welcome to the Internet of Things! Where potentially hackable things include the F-35 with the weapons systems listed above.

U.S. Office of Personnel Management Data Breach

Filed under: Cybersecurity,Government,Security — Patrick Durusau @ 3:13 pm

APNewsBreak: Massive breach of federal personnel data by Ken Dilanian and Ricardo Alonso-Zaldivar.

From the post:

Hackers broke into the U.S. government personnel office and stole identifying information of at least 4 million federal workers.

The Department of Homeland Security said in a statement Thursday that at the beginning of May, data from the Office of Personnel Management and the Interior Department was compromised.

One of the usual cyber suspects, China, has been accused of being responsible for the breach. Which China denies.

China and North Korea are accused of cybercrimes on a regular basis, in part due to the inability of most Americans to find either one on a map.

I don’t doubt that governments around the world engage in a variety of cyber activities, some offensive and some defensive. Including China and North Korea. But given the revelations of Edward Snowden and the crimes committed against allies, non-allies and its own people by the United States government, that same government has no high moral ground for accusing others without public proof.

No doubt the accused in many cases could return the favor with evidence of further indiscretions by the United States. Fewer tantrums and more funding of computer security research would be a step in a better direction.

In case you are interested, the Congressional Budget Justification Performance Budget Fiscal Year 2015, reads in part:

During FY 2013, OPM made significant strides in addressing the management challenges identified by the OIG. A detail accounting of OPM’s FY 2013 actions to address the management challenges can be found in OPM’s FY 2013 Agency Financial Report at https://www.opm.gov/about-us/budget-performance/performance/2013-agency-financial-report.pdf. Below is a table briefly describing the top management challenges and how the fiscal year 2015 budget request addresses each management challenge.

Isn’t that odd? That the 2015 Justification skips over 2014 to say it is improving from the 2013 financial report?

From the United States Office of Personnel Management, Agency Financial Report, 2014:

During FY 2014, the Office of Chief Information Officer (OCIO) continued to make progress in centralizing security program functions in an effort to address deficiencies noted in its security program. However, we continue to observe control weaknesses as follows:

1. The current authentication guidance regarding two-factor authentication has not been fully applied.

2. Access rights in OPM systems are not documented and mapped to personnel roles and functions to ensure that personnel access is limited only to the functions needed to perform their job responsibilities.

3. The information security control monitoring program was not fully effective in detecting information security control weaknesses. We noted access rights in OPM systems were:

  • Granted to new users without following the OPM access approval process and quarterly reviews to confirm access approval were not consistently performed.
  • Not revoked immediately upon user separation and quarterly reviews to confirm access removal were not consistently performed.

4. The password length setting for privileged user accounts did not meet minimum OPM password length requirements.

(at page 40)

For more than two (2) years the Office of Chief Information Officer (OCIO) has been sitting on implementing two-factor authentication, and privileged user passwords did not meet "minimum OPM password length requirements."

Wow! I think we now know who was responsible for this data breach, even if we don't know who carried out the data breach. Yes?

This isn’t some highly sophisticated hack. Some former employee or current one with a weak password could be responsible for this data breach. A data breach that exposed all present and past federal employees. That’s a high impact breach, don’t you think?

Could be some junior high or high school hacking club. Looking for low-hanging fruit. They found it at the U.S. Office of Personnel Management. Maybe OPM will tighten their security up in another year or two.

BTW, it isn’t like the government lacks for good advice on being secure:

FIPS PUB 200 Minimum Security Requirements for Federal Information and Information Systems

NIST Special Publication 800-53 Revision 4 Security and Privacy Controls for Federal Information Systems and Organizations

Also be aware of the NIST Cybersecurity Framework page.

There are other relevant NIST publications and agency specific ones but those should give you an idea of what is already known to government security experts.

“At least they don’t seal the fire exits”…

Filed under: Diversity,Politics,Researchers — Patrick Durusau @ 10:36 am

“At least they don’t seal the fire exits” Or why unpaid internships are BS by Auriel M. V. Fournier.

From the post:

I’m flipping through a job board, scanning for post docs, dreamily reading field technician posts and there they are

Unpaid internship in Amazing Place A

Unpaid technician working with Cool Species B

Some are obvious, and put their unpaid status it in the title, others you have to dig through the fine print, before you are hit you over the head with what a ‘unique oppurtunity this internship is’ how rare the animal or system, and how you should smile and love that you are not going to get paid, and might even have to pay them for the pleasure of working for them.

Every time I see one of these posts my skin crawls, my heart races, my eyes narrow. These jobs anger me, at my core, and I think we as a scientific community need to stop doing this to ourselves and our young scientists.

We get up and talk about how we need diversity in our field (whatever field it is, for me its wildlife ecology) how we need people from all backgrounds, cultures, creeds and races. Then we create positions that only those who come from means, and continue to have them can take. We are shooting ourselves in the foot by excluding people from getting into science. How is someone who has student loans (most students do), someone who has no financial support, someone with a child, or a sick parent, no family to buy a plane ticket for them, or any other kind of life situation supposed to take these positions? How?
….

Take the time to read Auriel’s post, whether you use unpaid internships or not. It’s not long and worth the read. I will wait for you to come back before continuing….back so soon?

Abstract just a little bit from Auriel’s post and think about her main point separate and apart from the specifics of unpaid internships. Is it that unpaid work can be undertaken only by those who can survive without getting paid for that work? Yes?

If you agree with that, how many unpaid editors, unpaid journal board members, unpaid peer reviewers, unpaid copy editors, unpaid program unit chairs, unpaid presenters, unpaid organizational officers, etc., do you think exist in academic circles?

Hmmm, do you think the people in all those unpaid positions still have to make ends meet at the end of the month? Take care of expenses out of their own pockets for travel and other expenses? Do you think the utility company cares whether you have done a good job as a volunteer peer reviewer this past month?

The same logic that Auriel uses in her post applies to all those unpaid positions as well. Not that academic groups can make all unpaid volunteer positions paid but any unpaid or underpaid position means you have made choices about who can hold those positions.

Global marine data to become unified, accessible

Filed under: Data Integration,Oceanography — Patrick Durusau @ 10:09 am

Global marine data to become unified, accessible

From the post:

An international project aims to enable the next great scientific advances in global marine research by making marine data sets more easily accessible to researchers worldwide.

Currently different data formats between research centres pose a challenge to oceanographic researchers, who need unified data sets to get the most complete picture possible of the ocean. This project, called ODIP II, aims to solve this problem using NERC’s world-class vocabulary server to ‘translate’ between these different data semantics. The vocabulary server, which is effectively now an international standard for a service of this kind, was developed by the British Oceanographic Data Centre (BODC); a national facility operated as part of the National Oceanography Centre (NOC).

That sounds promising, at least until you read:

By the time ODIP II is complete, in May 2018, it aims to have developed a means of seamlessly sharing and managing marine data between the EU, the USA and Australia, by co-ordinating the existing regional marine e-infrastructures.

I’ve never been really strong on geography but the last time I looked, “global” included more than the EU, USA and Australia.

Let’s be very generous and round the EU, USA and Australia population total up to 1 billion.

That leaves 6 billion people and hundreds of countries unaccounted for. Don’t know but some of those countries might have marine data. Won’t know if we don’t ask.

Still a great first step, but let’s not confuse the world with ourselves and what we know.

Twitter As Censor

Filed under: Government,Politics,Twitter — Patrick Durusau @ 9:30 am

Twitter shut down a site that saved politicians’ deleted tweets by Colin Lecher

Colin reports that Politwoops (Sunlight Foundation) was shut down by Twitter. Its crime? It saved tweets that politicians deleted. Horrors. Public statements that remain public statements. Can't imagine why anyone would think that was reasonable.

No appeal, no coherent explanation, no review of the history of a discussion that has been going on since 2012. See Colin’s post for more details.

The Sunlight Foundation has its own reasons for "honoring" the Twitter decision. However, I think the Twitter decision merits a more pointed response.

Script to detect deleted tweets? Anyone have a script they can post to search for deleted tweets? Assuming the starting point is an archive of tweets and the script checks to see if any have been deleted.
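For anyone who wants a starting point, here is a minimal sketch in Python against Twitter's v1.1 statuses/lookup endpoint. The bearer token and tweet IDs are placeholders; you would feed it IDs harvested from a politician's timeline before the tweets can be deleted, and adapt it to whatever archive format you start from.

import requests

API = "https://api.twitter.com/1.1/statuses/lookup.json"
BEARER_TOKEN = "..."  # application-only token for your registered Twitter app

def deleted_tweets(archived_ids):
    """Return the archived tweet IDs that no longer resolve on Twitter."""
    missing = []
    # statuses/lookup accepts up to 100 IDs per request.
    for i in range(0, len(archived_ids), 100):
        batch = archived_ids[i:i + 100]
        resp = requests.get(
            API,
            params={"id": ",".join(batch)},
            headers={"Authorization": f"Bearer {BEARER_TOKEN}"},
        )
        resp.raise_for_status()
        still_there = {tweet["id_str"] for tweet in resp.json()}
        missing.extend(tid for tid in batch if tid not in still_there)
    return missing

# Usage: placeholder IDs standing in for tweets harvested earlier.
print(deleted_tweets(["593897123456789012", "593897123456789013"]))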

Polypoops or some similar title: a Reddit sub? For users who detect deleted tweets to post them. Assuming that any site hosting edgy porn won't be overly troubled by embarrassing politicians.

You may protest that such activities may be seen by Twitter as violating its “terms of service.” To be honest, I am not overly concerned with Twitter’s “dog in the manger” strategies when it comes to Twitter content.

[Image: The Dog in the Manger]

Bounty for Internal Twitter Decision Making on Politwoops: If you are good with writing/managing Kickstarter campaigns, what do you think about a bounty for internal Twitter decision making documentation on the Politwoops issue? What do you think it would take? How would you authenticate a response?

Twitter management is within its legal rights to make arbitrary and capricious decisions about their terms of service.

The community is within its rights to make decisions as well.

The question is whether Twitter management wants to pull back its corporate hand or a nub.

June 4, 2015

Apache Spark on HDP: Learn, Try and Do

Filed under: Hortonworks,Spark — Patrick Durusau @ 2:45 pm

Apache Spark on HDP: Learn, Try and Do by Jules S. Damji.

I wanted to leave you with something fun to enjoy this evening. I am off to read a forty-eight (48) page bill that would make your ninth (9th) grade English teacher hurl. It’s really grim stuff that boils down to a lot of nothing but you have to parse through it to make that evident. More on that tomorrow.

From the post:

Not a day passes without someone tweeting or re-tweeting a blog on the virtues of Apache Spark.

At a Memorial Day BBQ, an old friend proclaimed: “Spark is the new rub, just as Java was two decades ago. It’s a developers’ delight.”

Spark as a distributed data processing and computing platform offers much of what developers’ desire and delight—and much more. To the ETL application developer Spark offers expressive APIs for transforming data; to the data scientists it offers machine libraries, MLlib component; and to data analysts it offers SQL capabilities for inquiry.

In this blog, I summarize how you can get started, enjoy Spark’s delight, and commence on a quick journey to Learn, Try, and Do Spark on HDP, with a set of tutorials.

I don’t know which is more disturbing. That Spark was being discussed at a Memorial Day BBQ or that anyone was sober enough to remember it. Life seems to change when you are older than the average cardiologist.

Sorry! Where were we, oh, yes, Saptak Sen has collected a set of tutorials to introduce you to Spark on the HDP Sandbox.

Near the bottom of the page, Apache Zeppelin (incubating) is mentioned along with Spark. Could use it to enable exploration of a data set. Could also use it so that users “discover” on their own that your analysis of the data is indeed correct. 😉
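If you want a taste before working through the tutorials, here is a minimal PySpark sketch touching the three audiences the post mentions: an RDD transformation for the ETL developer, a SQL query for the analyst, and an MLlib call for the data scientist. The HDFS path and sample values are made up for illustration; on HDP you would run something like this from pyspark or a Zeppelin notebook.

from pyspark import SparkContext
from pyspark.sql import SQLContext
from pyspark.mllib.linalg import Vectors
from pyspark.mllib.stat import Statistics

sc = SparkContext(appName="spark-on-hdp-sketch")
sqlContext = SQLContext(sc)

# ETL-style RDD transformation: word counts over a (hypothetical) log file.
counts = (sc.textFile("hdfs:///tmp/sample.log")
            .flatMap(lambda line: line.split())
            .map(lambda word: (word, 1))
            .reduceByKey(lambda a, b: a + b))
print(counts.take(5))

# Analyst-style SQL: register a DataFrame and query it.
df = sqlContext.createDataFrame([("alice", 34), ("bob", 45)], ["name", "age"])
df.registerTempTable("people")
print(sqlContext.sql("SELECT name FROM people WHERE age > 40").collect())

# Data-science-style MLlib: summary statistics over an RDD of vectors.
vectors = sc.parallelize([Vectors.dense([1.0, 2.0]),
                          Vectors.dense([3.0, 4.0]),
                          Vectors.dense([5.0, 6.0])])
summary = Statistics.colStats(vectors)
print(summary.mean(), summary.variance())

sc.stop()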

Due diligence means not only seeing the data as processed but also the data from which it was drawn, what pre-processing was done on that data, the circumstances under which the "original" data came into being, and the algorithms applied at all stages, to name only a few considerations.

The demonstration of a result merits, “that’s interesting” until you have had time to verify it. “Trust” comes after verification.

Open Review: Grammatical theory:…

Filed under: Grammar,Linguistics,Open Access,Peer Review — Patrick Durusau @ 2:22 pm

Open Review: Grammatical theory: From transformational grammar to constraint-based approaches by Stefan Müller (Author).

From the webpage:

This book is currently at the Open Review stage. You can help the author by making comments on the preliminary version: Part 1, Part 2. Read our user guide to get acquainted with the software.

This book introduces formal grammar theories that play a role in current linguistics or contributed tools that are relevant for current linguistic theorizing (Phrase Structure Grammar, Transformational Grammar/Government & Binding, Mimimalism, Generalized Phrase Structure Grammar, Lexical Functional Grammar, Categorial Grammar, Head-Driven Phrase Structure Grammar, Construction Grammar, Tree Adjoining Grammar, Dependency Grammar). The key assumptions are explained and it is shown how each theory treats arguments and adjuncts, the active/passive alternation, local reorderings, verb placement, and fronting of constituents over long distances. The analyses are explained with German as the object language.

In a final part of the book the approaches are compared with respect to their predictions regarding language acquisition and psycholinguistic plausibility. The nativism hypothesis that claims that humans posses genetically determined innate language-specific knowledge is examined critically and alternative models of language acquisition are discussed. In addition this more general part addresses issues that are discussed controversially in current theory building such as the question whether flat or binary branching structures are more appropriate, the question whether constructions should be treated on the phrasal or the lexical level, and the question whether abstract, non-visible entities should play a role in syntactic analyses. It is shown that the analyses that are suggested in the various frameworks are often translatable into each other. The book closes with a section that shows how properties that are common to all languages or to certain language classes can be captured.

(emphasis in the original)

Part of walking the walk of open access means participating in open reviews as your time and expertise permits.

Even if grammar theory isn’t your field, professionally speaking, it will be good mental exercise to see another view of the world of language.

I am intrigued by the suggestion “It shows that the analyses that are suggested in the various frameworks are often translatable into each other.” Shades of the application of category theory to linguistics? Mappings of identifications?

The Archive Is Closed [Library of Congress Twitter Archive]

Filed under: Library,Tweets,Twitter — Patrick Durusau @ 1:57 pm

The Archive Is Closed by Scott McLemee.

From the post:

Five years ago, this column looked into scholarly potential of the Twitter archive the Library of Congress had recently acquired. That potential was by no means self-evident. The incensed “my tax dollars are being used for this?” comments practically wrote themselves, even without the help of Twitter bots.

For what — after all — is the value of a dead tweet? Why would anyone study 140-character messages, for the most part concerning mundane and hyperephemeral topics, with many of them written as if to document the lowest possible levels of functional literacy?

As I wrote at the time, papers by those actually doing the research treated Twitter as one more form of human communication and interaction. The focus was not on the content of any specific message, but on the patterns that emerged when they were analyzed in the aggregate. Gather enough raw data, apply suitable methods, and the results could be interesting. (For more detail, see the original discussion.)

The key thing was to have enough tweets on hand to grind up and analyze. So, yes, an archive. In the meantime, the case for tweet preservation seems easier to make now that elected officials, religious leaders and major media outlets use Twitter. A recent volume called Twitter and Society (Peter Lang, 2014) collects papers on how politics, journalism, the marketplace and (of course) academe itself have absorbed the impact of this high-volume, low-word-count medium.

As for the Library of Congress archive, Scott reports:


The Library of Congress finds itself in the position of someone who has agreed to store the Atlantic Ocean in his basement. The embarrassment is palpable. No report on the status of the archive has been issued in more than two years, and my effort to extract one elicited nothing but a statement of facts that were never in doubt.

“The library continues to collect and preserve tweets,” said Gayle Osterberg, the library’s director of communications, in reply to my inquiry. “It was very important for the library to focus initially on those first two aspects — collection and preservation. If you don’t get those two right, the question of access is a moot point. So that’s where our efforts were initially focused and we are pleased with where we are in that regard.”

That’s as helpful as the responses I get about the secret ACM committee that determines the fate of feature requests for the ACM digital library. You can’t contact them directly nor can you find any record of their discussions/decisions.

Let’s hope greater attention and funding can move the Library of Congress Twitter Archive towards public access, for all the reasons enumerated by Scott.

One does have to wonder, given the role of the U.S. government in pushing for censorship of Twitter accounts, will the Library of Congress archive be complete and free from censorship? Or will it have dark spots depending upon the whims and caprices of the current regime?

Search by Number (Citation) [US Congress, One Pager on Congressional Citations]

Filed under: Government,Law — Patrick Durusau @ 1:19 pm

Search by Number (Citation)

From the webpage:

Retrieve legislation (amendments, bills, laws, and resolutions) or committee reports by specifying the congress number, the document type abbreviation, and the document number (e.g. 114hr1, 113s.rpt.25, 104PL104). Citations work with or without spaces and periods, and in upper or lowercase. All supported citation formats are listed at Search by Number.

Alternatively, specify a congress using the checkboxes below and search legislation within that congress (e.g. H.R. 202, s744, ha70, S.Amdt.250, pl113-2).

A new way to search for legislation, reports, etc. from the U.S. Congress.

Especially helpful when reading articles whose authors haven’t mastered the art of hyperlinking to congressional materials.

BTW, from Congress.gov Advanced Search, I created a one-pager on congressional citations. I had to use 10pt type to get it all onto one page, but I can post the source if you want to produce it in a different size.
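
If you want to handle those citations programmatically, say to check a batch of them before pasting them into the search box, here is a minimal sketch. It assumes only what the quoted description says: spaces and periods are optional and case doesn't matter. The regular expression and the helper name are my own, not anything published by Congress.gov.

import re

# Minimal sketch: split a congress-prefixed citation such as "114hr1",
# "113 S. Rpt. 25" or "104PL104" into (congress, document type, number).
# Spaces, periods, and case are ignored, matching the search box's behavior.
CITATION = re.compile(r"^(\d{1,3})([a-z]+?)(\d+(?:-\d+)?)$")

def parse_citation(raw):
    compact = raw.replace(" ", "").replace(".", "").lower()
    match = CITATION.match(compact)
    if match is None:
        return None
    congress, doc_type, number = match.groups()
    return int(congress), doc_type, number

for citation in ("114hr1", "113 S. Rpt. 25", "104PL104"):
    print(citation, "->", parse_citation(citation))
# 114hr1 -> (114, 'hr', '1')
# 113 S. Rpt. 25 -> (113, 'srpt', '25')
# 104PL104 -> (104, 'pl', '104')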

Don’t Give Out Cellphone Advice!

Filed under: Government,Law — Patrick Durusau @ 10:43 am

It’s dangerous to give out advice on cellphones these days.

David Wright has been charged with “Conspiring to Obstruct National Security Investigation.”

David Wright and Ussamah Abdullah Rahim discussed beheading a particular individual. The conversations went on long enough for Rahim to grow impatient and decide that, instead of the beheading, he would shoot some police officers. In a conversation about the new plan, Wright advised Rahim to destroy his cellphone so the evidence on it would be lost. (The conversation was being monitored by the FBI. Rahim was subsequently killed in an encounter with police officers.)

As I recall, technically speaking, beheading in most jurisdictions is considered to be murder. So Wright and company were engaged in a conspiracy to commit murder long before any discussion of the cellphone.

Law enforcement agencies should not pad their statistics with marginal national security arrests. It devalues any legitimate warning or advice they may have in the future.

Reputation instead of obligation:…

Filed under: Open Access,Open Data,Transparency — Patrick Durusau @ 10:16 am

Reputation instead of obligation: forging new policies to motivate academic data sharing by Sascha Friesike, Benedikt Fecher, Marcel Hebing, and Stephanie Linek.

From the post:

Despite strong support from funding agencies and policy makers academic data sharing sees hardly any adoption among researchers. Current policies that try to foster academic data sharing fail, as they try to either motivate researchers to share for the common good or force researchers to publish their data. Instead, Sascha Friesike, Benedikt Fecher, Marcel Hebing, and Stephanie Linek argue that in order to tap into the vast potential that is attributed to academic data sharing we need to forge new policies that follow the guiding principle reputation instead of obligation.

In 1996, leaders of the scientific community met in Bermuda and agreed on a set of rules and standards for the publication of human genome data. What became known as the Bermuda Principles can be considered a milestone for the decoding of our DNA. These principles have been widely acknowledged for their contribution towards an understanding of disease causation and the interplay between the sequence of the human genome. The principles shaped the practice of an entire research field as it established a culture of data sharing. Ever since, the Bermuda Principles are used to showcase how the publication of data can enable scientific progress.

Considering this vast potential, it comes as no surprise that open research data finds prominent support from policy makers, funding agencies, and researchers themselves. However, recent studies show that it is hardly ever practised. We argue that the academic system is a reputation economy in which researchers are best motivated to perform activities if those pay in the form of reputation. Therefore, the hesitant adoption of data sharing practices can mainly be explained by the absence of formal recognition. And we should change this.

(emphasis in the original)

Understanding what motivates researchers to share data is an important step towards encouraging data sharing.

But at the same time, would we say that every researcher is as good as every other researcher at preparing data for sharing? At documenting data for sharing? At doing any number of tasks that aren’t really research but are just as important for sharing data?

Rather than focusing exclusively on researchers, funders should fund projects to include data-sharing specialists who have the skills and interest to share data effectively as part of a project’s output. Their reputations would be tied more closely to the successful sharing of data, and researchers would gain reputation for the high-quality data that is shared. That is a much better fit for the authors’ recommendation.

Or to put it differently, lecturing researchers on how they should spend their limited time and resources to satisfy your goals isn’t going to motivate anyone. “Pay the man!” (Richard Pryor in Silver Streak)

How I tracked FBI aerial surveillance

Filed under: Clojure,Government,Privacy — Patrick Durusau @ 9:42 am

How I tracked FBI aerial surveillance by John Wiseman.

John gives full details of how he scooped the AP by 25 days on FBI aerial surveillance. Not to mention that he links to instructions for building a similar setup. A setup that uses Clojure! (Plus a software-defined radio for you hobbyists out there.)
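
John's pipeline is in Clojure and works from live ADS-B data. For readers who just want the flavor of the analysis, here is a rough Python sketch of one plausible heuristic for spotting surveillance-style flights: flag aircraft that loiter near a single spot for a long stretch of time. The field names, radius, and time threshold are my assumptions, not John's code.

import math
from collections import defaultdict

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two latitude/longitude points."""
    rlat1, rlon1, rlat2, rlon2 = map(math.radians, (lat1, lon1, lat2, lon2))
    a = (math.sin((rlat2 - rlat1) / 2) ** 2
         + math.cos(rlat1) * math.cos(rlat2) * math.sin((rlon2 - rlon1) / 2) ** 2)
    return 6371.0 * 2 * math.asin(math.sqrt(a))

def loitering_aircraft(reports, radius_km=3.0, min_minutes=30):
    """reports: iterable of (icao24, unix_time, lat, lon) position reports.
    Returns the ICAO addresses whose reports span at least min_minutes and
    stay within radius_km of their own centroid -- a crude loitering test."""
    tracks = defaultdict(list)
    for icao, ts, lat, lon in reports:
        tracks[icao].append((ts, lat, lon))

    flagged = []
    for icao, points in tracks.items():
        points.sort()  # order by timestamp
        duration_min = (points[-1][0] - points[0][0]) / 60.0
        if duration_min < min_minutes:
            continue
        clat = sum(p[1] for p in points) / len(points)
        clon = sum(p[2] for p in points) / len(points)
        if all(haversine_km(lat, lon, clat, clon) <= radius_km
               for _, lat, lon in points):
            flagged.append(icao)
    return flagged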

Assembling a cast of watchers/employees at airports who can photograph people exiting specific planes would be a big step towards matching people up to surveillance flights. Not to mention running photo searches to identify the people themselves.

A goldfish-bowl world isn’t the best choice, but the government has made that choice. It is up to the rest of us to see that they enjoy its full benefits. Perhaps they will choose differently at some point in the future.

Resources for Journalists (GIJN)

Filed under: Journalism,News,Reporting — Patrick Durusau @ 9:22 am

Resources for Journalists – Global Investigative Journalism Network

I have referenced specific posts and resources at the Global Investigative Journalism Network (GIJN) before but their resources page merits separate mention.

I am more familiar with their Data Journalism Resources and Data Journalism Toolkit but they have advice on everything from Archiving Your Work to Covering Street Protests.

I’m not sure whether I can paste the PayPal link, so see their Sponsors and Supporters page if you want to donate to support their work.
