Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 4, 2016

Paul Klee’s Personal Notebooks Online (Art, Design)

Filed under: Art,Design — Patrick Durusau @ 5:33 pm

3,900 Pages of Paul Klee’s Personal Notebooks Are Now Online, Presenting His Bauhaus Teachings (1921-1931)

Two snippets and an image to get you interested:

Paul Klee led an artistic life that spanned the 19th and 20th centuries, but he kept his aesthetic sensibility tuned to the future. Because of that, much of the Swiss-German Bauhaus-associated painter’s work, which at its most distinctive defines its own category of abstraction, still exudes a vitality today.

[Image: Klee-Notebooks-3]

More recently, the Zentrum Paul Klee made available online almost all 3,900 pages of Klee’s personal notebooks, which he used as the source for his Bauhaus teaching between 1921 and 1931. If you can’t read German, his extensively detailed textual theorizing on the mechanics of art (especially the use of color, with which he struggled before returning from a 1914 trip to Tunisia declaring, “Color and I are one. I am a painter”) may not immediately resonate with you. But his copious illustrations of all these observations and principles, in their vividness, clarity, and reflection of a truly active mind, can still captivate anybody — just as his paintings do.

A reminder that design is never a “solved” problem but one that changes as culture does.

I first saw this in a tweet by Alexis Lloyd.

You Can Master the Z Shell (Pointer to How-To)

Filed under: Awk,Linux OS,Shell Scripting — Patrick Durusau @ 4:31 pm

Cutting through the toxic atmosphere created by governments around the world requires the sharpest tools and the development of skill in using them.

Unix shells are like a switchblade knife. Not for every job, but if you need immediate results, a shell is hard to beat. While you are opening an application, loading files, finding appropriate settings, etc., a quick shell command can have you on your way.

Nacho Caballero writes in Master Your Z Shell with These Outrageously Useful Tips:

If you had previously installed Zsh but never got around to exploring all of its magic features, this post is for you.

If you never thought of using a different shell than the one that came by default when you got your computer, I recommend you go out and check the Z shell. Here are some Linux guides that explain how to install it and set it as your default shell. You probably have Zsh installed if you are on a Mac, but there’s nothing like the warm fuzzy feeling of running the latest version (here’s a way to upgrade using Homebrew).

The Zsh manual is a daunting beast. Just the chapter on expansions has 32 subsections. Forget about memorizing this madness in one sitting. Instead, we’ll focus on understanding a few useful concepts, and referencing the manual for additional help.

The three main sections of this post are file picking, variable transformations, and magic tabbing. If you’re pressed for time, read the beginning of each one, and come back later to soak up the details (make sure you stick around for the bonus tips at the end). (emphasis in original)

Would-be authors/editors, want to try your hand at the chapter on expansions? Looking at the documentation for Zsh version 5.2, released December 2, 2015, there are 25 numbered subsections in chapter 14, Expansion.

You will be impressed by the number of modifiers/operators available. If you do write a manual for expansions in Zsh, do distribute it widely.

I hope this doesn’t get overlooked by being included here, but Nacho also wrote: AWK GTF! How to Analyze a Transcriptome Like a Pro – Part 1 (2 and 3). Awk is another switchblade-like tool for your toolkit.

I first saw this in a tweet by Christophe Lalanne.

Requirements – Programming Exercise – @jessitron

Filed under: Programming,Requirements — Patrick Durusau @ 1:36 pm

Jessica Kerr @jessitron posted to Twitter:

Programming exercise:
I give you some requirements
You write the code
A third person tries to guess the requirements based on the code.

Care to try the same exercise on existing business/government processes?

Or return to code that you wrote a year or more ago?

If you aren’t following @jessitron you should be.

Facts Before Policy? – Digital Security – Contacts – Volunteer Opportunity

Filed under: Cybersecurity,Government,Topic Maps — Patrick Durusau @ 12:53 pm

Rep. McCaul, Michael T. [R-TX-10] has introduced H.R.4651 – Digital Security Commission Act of 2016, full text here, a proposal to form the National Commission on Security and Technology Challenges.

From the proposal:

(2) To submit to Congress a report, which shall include, at a minimum, each of the following:

(A) An assessment of the issue of multiple security interests in the digital world, including public safety, privacy, national security, and communications and data protection, both now and throughout the next 10 years.

(B) A qualitative and quantitative assessment of—

(i) the economic and commercial value of cryptography and digital security and communications technology to the economy of the United States;

(ii) the benefits of cryptography and digital security and communications technology to national security and crime prevention;

(iii) the role of cryptography and digital security and communications technology in protecting the privacy and civil liberties of the people of the United States;

(iv) the effects of the use of cryptography and other digital security and communications technology on Federal, State, and local criminal investigations and counterterrorism enterprises;

(v) the costs of weakening cryptography and digital security and communications technology standards; and

(vi) international laws, standards, and practices regarding legal access to communications and data protected by cryptography and digital security and communications technology, and the potential effect the development of disparate, and potentially conflicting, laws, standards, and practices might have.

(C) Recommendations for policy and practice, including, if the Commission determines appropriate, recommendations for legislative changes, regarding—

(i) methods to be used to allow the United States Government and civil society to take advantage of the benefits of digital security and communications technology while at the same time ensuring that the danger posed by the abuse of digital security and communications technology by terrorists and criminals is sufficiently mitigated;

(ii) the tools, training, and resources that could be used by law enforcement and national security agencies to adapt to the new realities of the digital landscape;

(iii) approaches to cooperation between the Government and the private sector to make it difficult for terrorists to use digital security and communications technology to mobilize, facilitate, and operationalize attacks;

(iv) any revisions to the law applicable to wiretaps and warrants for digital data content necessary to better correspond with present and future innovations in communications and data security, while preserving privacy and market competitiveness;

(v) proposed changes to the procedures for obtaining and executing warrants to make such procedures more efficient and cost-effective for the Government, technology companies, and telecommunications and broadband service providers; and

(vi) any steps the United States could take to lead the development of international standards for requesting and obtaining digital evidence for criminal investigations and prosecutions from a foreign, sovereign State, including reforming the mutual legal assistance treaty process, while protecting civil liberties and due process.

Excuse the legalese, but this is clearly an effort that could provide a factual, as opposed to fantasy, basis for further policy on digital security. No one can guarantee a sensible result, but without a factual basis, any legislation is certainly going to be wrong.

For your convenience and possible employment/volunteering, here are the co-sponsors of this bill, with hyperlinks to their congressional homepages:

Now would be a good time to pitch yourself for involvement in this possible commission.

Pay attention to Section 8 of the bill:

SEC. 8. Staff.

(a)Appointment.—The chairman and vice chairman shall jointly appoint and fix the compensation of an executive director and of and such other personnel as may be necessary to enable the Commission to carry out its functions under this Act.

(b)Security clearances.—The appropriate Federal agencies or departments shall cooperate with the Commission in expeditiously providing appropriate security clearances to Commission staff, as may be requested, to the extent possible pursuant to existing procedures and requirements, except that no person shall be provided with access to classified information without the appropriate security clearances.

(c)Detailees.—Any Federal Government employee may be detailed to the Commission on a reimbursable basis, and such detailee shall retain without interruption the rights, status, and privileges of his or her regular employment.

(d)Expert and consultant services.—The Commission is authorized to procure the services of experts and consultants in accordance with section 3109 of title 5, United States Code, but at rates not to exceed the daily rate paid a person occupying a position level IV of the Executive Schedule under section 5315 of title 5, United States Code.

(e)Volunteer services.—Notwithstanding section 1342 of title 31, United States Code, the Commission may accept and use voluntary and uncompensated services as the Commission determines necessary.

I can sense you wondering what:

the daily rate paid a person occupying a position level IV of the Executive Schedule under section 5315 of title 5, United States Code

means, in practical terms.

I was tempted to point you to: 5 U.S. Code § 5315 – Positions at level IV, but that would be cruel and uninformative. 😉

I did track down the Executive Schedule, which lists position level IV:

Annual: $160,300, or about $80.15/hr. (assuming a 2,000-hour work year), and for an 8-hour day, $641.20.
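The arithmetic behind those figures can be checked with a short sketch. It assumes a 2,000-hour work year (50 weeks of 40 hours), which is the divisor that reproduces the $80.15 figure. That divisor is inferred from the numbers and is not stated in the bill or the Executive Schedule; an official pay computation may well use a different one.

```python
from decimal import Decimal

ANNUAL_LEVEL_IV = Decimal("160300")  # annual Level IV rate quoted in this post
HOURS_PER_YEAR = Decimal("2000")     # assumption: 50 weeks x 40 hours
HOURS_PER_DAY = Decimal("8")         # the 8-hour day used in this post

# Hourly rate first, then scale up to a working day.
hourly_rate = ANNUAL_LEVEL_IV / HOURS_PER_YEAR
daily_rate = hourly_rate * HOURS_PER_DAY

print(f"hourly: ${hourly_rate:.2f}")
print(f"daily:  ${daily_rate:.2f}")
```

Changing HOURS_PER_YEAR shows how sensitive the daily cap is to the choice of divisor.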

If you do volunteer or get a paying gig, please remember that omission and/or manipulation of subject identity properties can render otherwise open data opaque.

I don’t know of any primes that are making money off of topic maps currently, so topic maps aren’t likely to gain traction with current primes. On the other hand, new primes can and do occur. Not often, but it happens.

PS: If the extra links to contacts, content, etc. are helpful, please let me know. I started off reading the link-poor RSA 2016: McCaul calls backdoors ineffective, pushes for tech panel to solve security issues. Little more than debris on your information horizon.

March 3, 2016

EFF On First Amendment, Apple, All Writs Act

Filed under: Cybersecurity,Government,Law,Security — Patrick Durusau @ 8:30 pm

Deep Dive: Why Forcing Apple to Write and Sign Code Violates the First Amendment by Andrew Crocker and Jamie Williams.

From the post:

EFF filed an amicus brief today in support of Apple’s fight against a court order compelling the company to create specific software to enable the government to break into an iPhone. The brief is written on behalf of 46 prominent technologists, security researchers, and cryptographers who develop and rely on secure technologies and services that are central to modern life. It explains that the court’s unprecedented order would violate Apple’s First Amendment rights. That’s because the right to free speech prohibits the government from compelling unwilling speakers to speak, and the act of writing and, importantly, signing computer code is a form of protected speech. So by forcing Apple to write and sign an update to undermine the security of its iOS software, the court is also compelling Apple to speak—in violation of the First Amendment. (emphasis in original)

Despite my mentioning A Readers’ Guide to the Apple All Writs Act Cases earlier today, I wanted to call the EFF amicus brief out separately.

Its strong defense of Apple solely on First Amendment grounds merits special mention.

Enabling the government to compel speech, for any reason, should be resisted in the courts, in the streets and by refusing to speak.

Or as one of my least favorite people in history once put it:

[Image: Voennaia_marka_Ni_shagu_nazad!]

(Not one step back)

Yes, it is really that important.

11 Million Pages of CIA Files [+ Allen Dulles, war criminal]

Filed under: Government,Government Data,Topic Maps — Patrick Durusau @ 6:54 pm

11 Million Pages of CIA Files May Soon Be Shared By This Kickstarter by Joseph Cox.

From the post:

Millions of pages of CIA documents are stored in Room 3000. The CIA Records Search Tool (CREST), the agency’s database of declassified intelligence files, is only accessible via four computers in the National Archives Building in College Park, MD, and contains everything from Cold War intelligence, research and development files, to images.

Now one activist is aiming to get those documents more readily available to anyone who is interested in them, by methodically printing, scanning, and then archiving them on the internet.

“It boils down to freeing information and getting as much of it as possible into the hands of the public, not to mention journalists, researchers and historians,” Michael Best, analyst and freedom of information activist told Motherboard in an online chat.

Best is trying to raise $10,000 on Kickstarter in order to purchase the high speed scanner necessary for such a project, a laptop, office supplies, and to cover some other costs. If he raises more than the main goal, he might be able to take on the archiving task full-time, as well as pay for FOIAs to remove redactions from some of the files in the database. As a reward, backers will help to choose what gets archived first, according to the Kickstarter page.

“Once those “priority” documents are done, I’ll start going through the digital folders more linearly and upload files by section,” Best said. The files will be hosted on the Internet Archive, which converts documents into other formats too, such as for Kindle devices, and sometimes text-to-speech for e-books. The whole thing has echoes of Cryptome—the freedom of information duo John Young and Deborah Natsios, who started off scanning documents for the infamous cypherpunk mailing list in the 1990s.

Good news! Kickstarter has announced this project funded!

Additional funding will help make this archive of documents available sooner rather than later.

As opposed to an attempt to boil the ocean of 11 million pages of CIA files, what about smaller topic mapping/indexing projects that focus on bounded sub-sets of documents of interest to particular communities?

I don’t have any interest in the STAR GATE project (clairvoyance, precognition, or telepathy, continued now by the DHS at airport screening facilities) but would be very interested in the records of Allen Dulles, a war criminal of some renown.

Just so you know, Michael has already uploaded documents on Allen Dulles from the CIA Records Search Tool (CREST):

History of Allen Welsh Dulles as CIA Director – Volume I: The Man

History of Allen Welsh Dulles as CIA Director – Volume II: Coordination of Intelligence

History of Allen Welsh Dulles as CIA Director – Volume III: Covert Activities

History of Allen Welsh Dulles as CIA Director – Volume IV: Congressional Oversight and Internal Administration

History of Allen Welsh Dulles as CIA Director – Volume V: Intelligence Support of Policy

To describe Allen Dulles as a war criminal is no hyperbole. The overthrow of President Jacobo Arbenz Guzman of Guatemala (think United Fruit Company) and the removal of Mohammad Mossadeq, prime minister of Iran (think Shah of Iran), are only two of his crimes, the full extent of which will probably never be known.

Files are being uploaded to That 1 Archive.

Scripting FOIA Requests

Filed under: Government,Government Data — Patrick Durusau @ 4:17 pm

An Activist Wrote a Script to FOIA the Files of 7,000 Dead FBI Officials by Joseph Cox.

From the post:

One of the best times to file a Freedom of Information request with the FBI is when someone dies; after that, any files that the agency holds on them can be requested. Asking for FBI files on the deceased is therefore pretty popular, with documents released on Steve Jobs, Malcolm X and even the Insane Clown Posse.

One activist is turning this back onto the FBI itself, by requesting files on nearly 7,000 dead FBI employees en masse, and releasing a script that allows anyone else to do the same.

“At the very least, it’ll be like having an extensive ‘Who’s Who in the FBI’ to consult, without worrying that anyone in there is still alive and might face retaliation for being in law enforcement,” Michael Best told Motherboard in an online chat. “For some folks, they’ll probably show allegations of wrongdoing while others probably highlight some of the FBI’s best and brightest.”

On Monday, Best will file FOIAs for FBI records and files relating to 6,912 employees named in the FBI’s own “Dead List,” a list of people that the FBI understands to be deceased. A recent copy of the list, which includes special agents and section chiefs, was FOIA’d by MuckRock editor JPat Brown in January.

Points to remember:

  • Best’s script works for any FOIA office that accepts email requests (not just the FBI, be creative)
  • Get 3 or more people to file the same FOIA requests
  • Publicize your spreadsheet of FOIA targets
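The idea behind a batch of email FOIA requests can be sketched in a few lines. This is not Best’s script, only a minimal illustration of the approach; the function name, the names in the list and every email address are hypothetical placeholders, and the sketch builds the messages without sending anything.

```python
from email.message import EmailMessage


def build_foia_request(subject_name, foia_address, requester_address):
    """Return one email message requesting records on a deceased person."""
    msg = EmailMessage()
    msg["To"] = foia_address
    msg["From"] = requester_address
    msg["Subject"] = f"FOIA request: records on {subject_name}"
    msg.set_content(
        "This is a request under the Freedom of Information Act.\n\n"
        f"I request all records and files relating to {subject_name}, "
        "who is understood to be deceased.\n"
    )
    return msg


# Hypothetical inputs; a real run would read names from a spreadsheet.
names = ["Jane Example", "John Placeholder"]
messages = [
    build_foia_request(name, "foia@agency.example", "requester@example.org")
    for name in names
]

for message in messages:
    print(message["Subject"])
```

Keeping message construction separate from delivery makes it easy to review every request before anything goes out.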

Don’t forget the need to scan, OCR and index (topic map) the results of your FOIA requests.

Information that cannot be found may as well still be concealed by the FBI (and others).

A Readers’ Guide to the Apple All Writs Act Cases

Filed under: Cybersecurity,Government,Law,Security — Patrick Durusau @ 3:54 pm

A Readers’ Guide to the Apple All Writs Act Cases

From the post:

The last few weeks and months have been awash in media coverage of two cases before magistrate judges involving the federal government seeking to use the All Writs Act to compel Apple’s cooperation with ongoing criminal investigations. The older case, in the Eastern District of New York, involves a drug case where the phone’s owner has pleaded guilty to the charges against him. The more recent case, in the Central District of California, involves an iPhone used by Syed Farook, one of the alleged San Bernardino shooters. While the two cases involve different phone models, operating systems, alleged crimes, and legal postures, they touch on similar questions related to the scope of the All Writs Act.

In an attempt to create a one-stop shop for our coverage and the related documents and some useful sources, we’ve compiled this readers’ guide. We will update it as the cases progress to include the latest filings and posts, so check back for more as things unfold.

Just Security has started a “one-stop shop” for its coverage and official documents in the Apple All Writs Act cases.

Considering how seldom news sources point to rulings, briefs, etc., this will get a lot of hits in the coming months.

It does not include coverage from other professional sources, such as LawFare – Hard National Security Choices. Two items by Robert Chesney from LawFare that you may find of interest:

A Primer on Apple’s Brief in the San Bernadino iPhone Fight

Apple v. FBI Primer #2: On Judge Orenstein’s Ruling in the Queens Meth Case

If anyone has collected the professional legal commentary sites posting on the Apple All Writs Act cases, I would appreciate a pointer.

March 2, 2016

Justifying the Investigatory Powers Bill – Despite a Lack of Evidence

Filed under: Government,Security — Patrick Durusau @ 8:45 pm

In UK Parliament Reports on the Draft Investigatory Powers Bill, I pointed to a number of UK Parliament reports that leave little doubt about the excesses of the proposed Investigatory Powers Bill.

Undeterred by those objections, the UK government pressed ahead with reams of poor writing to distract citizens from the lack of justification for any part of the proposed Investigatory Powers Bill.

Here are links to the latest effort at obfuscation:

Investigatory Powers Bill 2015-16 – The cancer on the body politic in question. (258 pages)

Overarching Documents:

Investigatory Powers Bill: government response to pre-legislative scrutiny (web) Ref: ISBN 9781474129541, Cm 9219, same document but for printing: Investigatory Powers Bill: government response to pre-legislative scrutiny (print) Ref: ISBN 9781474129534, Cm 9219 (102 pages)

Operational case for bulk powers (47 pages)

Operational case for the retention of internet connection records (31 pages)

Comparison of internet connection records in the Investigatory Powers Bill with Danish internet session logging legislation (8 pages)

Delegated powers and regulatory reform committee: memorandum by the Home Office (31 pages)

Investigatory Powers Bill: codes of practice

National security notices: draft code of practice (19 pages)

Interception of communications: draft code of practice (101 pages)

Security and intelligence agencies’ retention and use of bulk personal datasets: draft code of practice (38 pages)

Equipment interference: draft code of practice (83 pages)

Communications data: draft code of practice (118 pages)

Bulk acquisition: draft code of practice (50 pages)

A grand total of 886 pages, none of which are relevant without a justification for the powers sought.
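For anyone who wants to check the total, here is a quick sketch that sums the page counts listed above. The document labels are shortened for readability; the counts are the ones given in this post.

```python
# Page counts as listed in this post, in order of appearance.
page_counts = {
    "Investigatory Powers Bill 2015-16": 258,
    "Government response to pre-legislative scrutiny": 102,
    "Operational case for bulk powers": 47,
    "Operational case for retention of internet connection records": 31,
    "Comparison with Danish internet session logging legislation": 8,
    "Delegated powers memorandum by the Home Office": 31,
    "National security notices: draft code of practice": 19,
    "Interception of communications: draft code of practice": 101,
    "Bulk personal datasets: draft code of practice": 38,
    "Equipment interference: draft code of practice": 83,
    "Communications data: draft code of practice": 118,
    "Bulk acquisition: draft code of practice": 50,
}

total_pages = sum(page_counts.values())
print(f"{len(page_counts)} documents, {total_pages} pages")
```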

I used to think the British educational system was the best in the world, bar none, but this batch of documents may force me to rethink that assessment.

For example:

The Operational case for bulk powers reports on the need for cyber security (page 16):


4.14. The cyber security of the UK is of growing importance to our national security, economy and society. The levels of cyber-attacks by criminals and hostile states have grown considerably; the number of nationally-significant cyber incidents dealt with by the security and intelligence agencies, for example, doubled between 2014 and 2015. Terrorists are increasingly seeking cyber capabilities in order to threaten the critical national infrastructure of the UK. The scale of the challenge is daunting: one recent cybercrime attack alone infected around 150,000 users in the UK.

4.15. The scale of the internet limits the utility of targeted powers and make bulk capabilities critical to the UK’s efforts to detect and defend against such attacks. 95% of the cyber-attacks on the UK detected by the security and intelligence agencies over the last six months were only discovered through the collection and analysis of bulk data. These have included numerous attacks against government networks and every major UK commercial sector. The security and intelligence agencies routinely share this unique intelligence with their partners in UK industry, enabling them to protect their businesses and customers from cyber-attacks.

I was quite amazed to learn users can be infected by cyber attacks:

The scale of the challenge is daunting: one recent cybercrime attack alone infected around 150,000 users in the UK.

It’s a good thing the UK still has the National Health Service. 😉

I could have sworn that computer systems and not people were infected by cybercrime. But that’s unlikely to be what the authors meant. Making it sound like people were being injured creates a sense of urgency.

Along the same lines, consider that 95% of cyber-attacks go unnoticed, save for bulk data collection:

95% of the cyber-attacks on the UK detected by the security and intelligence agencies over the last six months were only discovered through the collection and analysis of bulk data.

If 95% of cyber-attacks are so trivial and non-threatening that victims are unaware of the attacks, where is the sense of urgency?

I concede that being charged with making a case out of non-existent evidence is a challenge for any writer. I offer this collection of documents as proof of that proposition.

Best wishes to everyone in the UK who is trying to stop this slide into madness.

Muting users on Twitter – Achtung! State, DoD, Other US Censors

Filed under: Censorship,Government,Twitter — Patrick Durusau @ 5:12 pm

The Twitter Help Center has a great webpage titled: Muting users on Twitter.

From that page:

Mute is a feature that allows you to remove an account’s Tweets from your timeline without unfollowing or blocking that account. Muted accounts will not know that you’ve muted them and you can unmute them at any time. To access a list of accounts you have muted, visit your muted accounts settings on twitter.com or your app settings on Twitter for iOS or Android.

Instead of leaning on Twitter to close accounts, the State Department, Department of Defense and others can compile Twitter Mute Lists that have the Twitter accounts that any reasonable person should mute.

The Catholic News Service used to publish movie ratings in Our Sunday Visitor and while the rating system has changed since I last saw it (think 1960’s), it was a great way to pick out movies.

I think most of the ones I saw were either condemned or in some similar category. 😉

A Twitter mute list from State, DoD and others would save me the time of searching for offensive content to view. I am sure that is true for others as well.

Oh, not to mention that people who are offended can choose to not view such content. Sorry, almost got carried away there.

How’s that for a solution to “propaganda” on Twitter? If it offends you, don’t look. Leave the rest of us the hell alone.

Graph Encryption: Going Beyond Encrypted Keyword Search [Subject Identity Based Encryption]

Filed under: Cryptography,Cybersecurity,Graphs,Subject Identity,Topic Maps — Patrick Durusau @ 4:49 pm

Graph Encryption: Going Beyond Encrypted Keyword Search by Xianrui Meng.

From the post:

Encrypted search has attracted a lot of attention from practitioners and researchers in academia and industry. In previous posts, Seny already described different ways one can search on encrypted data. Here, I would like to discuss search on encrypted graph databases which are gaining a lot of popularity.

1. Graph Databases and Graph Privacy

As today’s data is getting bigger and bigger, traditional relational database management systems (RDBMS) cannot scale to the massive amounts of data generated by end users and organizations. In addition, RDBMSs cannot effectively capture certain data relationships; for example in object-oriented data structures which are used in many applications. Today, NoSQL (Not Only SQL) has emerged as a good alternative to RDBMSs. One of the many advantages of NoSQL systems is that they are capable of storing, processing, and managing large volumes of structured, semi-structured, and even unstructured data. NoSQL databases (e.g., document stores, wide-column stores, key-value (tuple) store, object databases, and graph databases) can provide the scale and availability needed in cloud environments.

In an Internet-connected world, graph database have become an increasingly significant data model among NoSQL technologies. Social networks (e.g., Facebook, Twitter, Snapchat), protein networks, electrical grid, Web, XML documents, networked systems can all be modeled as graphs. One nice thing about graph databases is that they store the relations between entities (objects) in addition to the entities themselves and their properties. This allows the search engine to navigate both the data and their relationships extremely efficiently. Graph databases rely on the node-link-node relationship, where a node can be a profile or an object and the edge can be any relation defined by the application. Usually, we are interested in the structural characteristics of such a graph databases.

What do we mean by the confidentiality of a graph? And how to do we protect it? The problem has been studied by both the security and database communities. For example, in the database and data mining community, many solutions have been proposed based on graph anonymization. The core idea here is to anonymize the nodes and edges in the graph so that re-identification is hard. Although this approach may be efficient, from a security point view it is hard to tell what is achieved. Also, by leveraging auxiliary information, researchers have studied how to attack this kind of approach. On the other hand, cryptographers have some really compelling and provably-secure tools such as ORAM and FHE (mentioned in Seny’s previous posts) that can protect all the information in a graph database. The problem, however, is their performance, which is crucial for databases. In today’s world, efficiency is more than running in polynomial time; we need solutions that run and scale to massive volumes of data. Many real world graph datasets, such as biological networks and social networks, have millions of nodes, some even have billions of nodes and edges. Therefore, besides security, scalability is one of main aspects we have to consider.

2. Graph Encryption

Previous work in encrypted search has focused on how to search encrypted documents, e.g., doing keyword search, conjunctive queries, etc. Graph encryption, on the other hand, focuses on performing graph queries on encrypted graphs rather than keyword search on encrypted documents. In some cases, this makes the problem harder since some graph queries can be extremely complex. Another technical challenge is that the privacy of nodes and edges needs to be protected but also the structure of the graph, which can lead to many interesting research directions.

Graph encryption was introduced by Melissa Chase and Seny in [CK10]. That paper shows how to encrypt graphs so that certain graph queries (e.g., neighborhood, adjacency and focused subgraphs) can be performed (though the paper is more general as it describes structured encryption). Seny and I, together with Kobbi Nissim and George Kollios, followed this up with a paper last year [MKNK15] that showed how to handle more complex graphs queries.

Apologies for the long quote but I thought this topic might be new to some readers. Xianrui goes on to describe a solution for efficient queries over encrypted graphs.

Chase and Kamara remark in Structured Encryption and Controlled Disclosure, CK10:


To address this problem we introduce the notion of structured encryption. A structured encryption scheme encrypts structured data in such a way that it can be queried through the use of a query-specific token that can only be generated with knowledge of the secret key. In addition, the query process reveals no useful information about either the query or the data. An important consideration in this context is the efficiency of the query operation on the server side. In fact, in the context of cloud storage, where one often works with massive datasets, even linear time operations can be infeasible. (emphasis in original)

With just a little nudging, their:

A structured encryption scheme encrypts structured data in such a way that it can be queried through the use of a query-specific token that can only be generated with knowledge of the secret key.

could be re-stated as:

A subject identity encryption scheme leaves out merging data in such a way that the resulting topic map can only be queried with knowledge of the subject identity merging key.

You may have topics that represent diagnoses such as cancer, AIDS, sexual contacts, but if none of those can be associated with individuals who are also topics in the map, there is no more disclosure than census results for a metropolitan area and a list of the citizens therein.

That is, you are missing the critical merging data that would link up (associate) any diagnosis with a given individual.

Multi-property subject identities would make the problem even harder, to say nothing of conferring properties on the basis of supplied properties as part of the merging process.

One major benefit of a subject identity based approach is that without the merging key, any data set, however sensitive the information, is just a data set, until you have the basis for solving its subject identity riddle.

PS: With the usual caveats of not using social security numbers, birth dates and the like as your subject identity properties. At least not in the map proper. I can think of several ways to generate keys for merging that would be resistant to even brute force attacks.

Ping me if you are interested in pursuing that on a data set.
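
A minimal sketch of the idea, not a vetted design. It assumes each diagnosis topic carries only an opaque token, computed with HMAC-SHA256 from a secret merging key and a person's identifier. The names merge_token, publish and merge, along with the sample records, are hypothetical and exist only for illustration.

```python
import hashlib
import hmac
import secrets


def merge_token(merging_key: bytes, subject_id: str) -> str:
    """Derive an opaque token for a subject from the secret merging key."""
    return hmac.new(merging_key, subject_id.encode("utf-8"), hashlib.sha256).hexdigest()


def publish(merging_key: bytes, records):
    """Split (person, diagnosis) records into two lists with no visible link.

    The people list holds names only. Each diagnosis topic holds the
    diagnosis plus a token that cannot be recomputed without the key.
    """
    people = sorted({person for person, _ in records})
    diagnoses = [
        {"diagnosis": diagnosis, "token": merge_token(merging_key, person)}
        for person, diagnosis in records
    ]
    return people, diagnoses


def merge(merging_key: bytes, people, diagnoses):
    """Recompute each person's token with the supplied key and attach matches."""
    by_token = {}
    for topic in diagnoses:
        by_token.setdefault(topic["token"], []).append(topic["diagnosis"])
    return {
        person: by_token.get(merge_token(merging_key, person), [])
        for person in people
    }


if __name__ == "__main__":
    key = secrets.token_bytes(32)  # hypothetical merging key, kept out of the map
    records = [("Person A", "diagnosis X"), ("Person B", "diagnosis Y")]
    people, diagnoses = publish(key, records)
    print(merge(key, people, diagnoses))                      # links recovered
    print(merge(secrets.token_bytes(32), people, diagnoses))  # nothing merges
```

Without the key, the published lists are just a roster and a set of diagnoses. The sketch says nothing about other leaks, such as list ordering or rare diagnoses in a small population.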

‘Hack The Pentagon’ Bug Bounty Program (Have a good idea, then f*ck it up)

Filed under: Cybersecurity,Government,Security — Patrick Durusau @ 3:53 pm

U.S. Announces ‘Hack The Pentagon’ Bug Bounty Program by Bill Chappell.

From the post:

Announcing what it calls “the first cyber bug bounty program in the history of the federal government,” the Department of Defense says it’s inviting hackers to test the security of its Web pages and networks.

The contest is only for “vetted hackers,” the DoD says, which means that anyone hoping to find vulnerabilities in its systems will first need to pass a background check. Participants could win money and recognition for their work, the agency says.

The pilot program is slated to begin in April. And if you’re wondering whether the hackers might disrupt a critical piece of the Department of Defense’s infrastructure, the agency says that hackers will target a predetermined system that’s not part of its critical operations.

According to a list published by the Defense Department, it currently manages 488 websites, which are devoted to everything from the 111th Attack Wing and other military units to the Yellow Ribbon Reintegration Program.

The “Hack the Pentagon” initiative is the work of the Defense Digital Service, a DoD unit that was launched last fall as part of the White House’s U.S. Digital Service.

A sad story. A Pentagon bug bounty program, even if limited to only parts of the DoD’s infrastructure, could pull cyber talent from around the world.

End result: Better security for the Pentagon and bug reports on commonly used elements of web infrastructure.

However, the Pentagon wants only “vetted hackers.”

A pool of non-threatening or at least docile talent that is willing to find but also conceal vulnerabilities.

The bug bounty program is a great idea; “vetted hackers” is the perfect way to diminish its value, both to the Pentagon and to the general public.

What this program needs is an anonymous rewards program like Crime Stoppers.

That would attract the best talent, which in turn would increase the security of Pentagon systems.

Or, is that the point of this program?

Won’t know that until the list of “vetted hackers” is published. Anyone at Lloyd’s giving odds on the same names appearing on current DoD contracts?

Fearing Cyber-Terrorism (Ethical Data Science Anyone?)

Filed under: Ethics,Government — Patrick Durusau @ 10:22 am

Discussions of the ethics of data science are replete with examples of not discriminating against individuals based on race (a crime in some contexts), violation of privacy expectations, etc.

What I have not seen, perhaps poor searching on my part, are discussions of the ethical obligation of data scientists to persuade would-be clients that their fears are baseless and/or to refuse to participate in projects based on fear mongering.

Here’s a recent example of the type of fear mongering I have in mind:

Cyberterrorism Is the Next ‘Big Threat,’ Says Former CIA Chief


The cyberwar could get much hotter soon, in the estimation of former CIA counter-intelligence director Barry Royden, a 40-year intel veteran, who told Business Insider the threat of cyberterrorism is pervasive, evasive, and so damned invasive that, sooner or later, someone will give into temptation, pull the trigger, and unleash chaos.

Ooooh, chaos. That sounds serious, except that it is the product of paranoid fantasy and a desire to game the appropriations process.

Consider that in 2004, Gabriel Weimann, United States Institute of Peace, debunked cyberterrorism in Cyberterrorism How Real Is the Threat?.

Fast forward eight years and you find Peter W. Singer (Brookings) writing in The Cyber Terror Bogeyman:

We have let our fears obscure how terrorists really use the Internet.

About 31,300. That is roughly the number of magazine and journal articles written so far that discuss the phenomenon of cyber terrorism.

Zero. That is the number of people who have been hurt or killed by cyber terrorism at the time this went to press.

In many ways, cyber terrorism is like the Discovery Channel’s “Shark Week,” when we obsess about shark attacks despite the fact that you are roughly 15,000 times more likely to be hurt or killed in an accident involving a toilet. But by looking at how terror groups actually use the Internet, rather than fixating on nightmare scenarios, we can properly prioritize and focus our efforts. (emphasis in original)

That’s a data point isn’t it?

The quantity of zero. Yes?

In terms of data science narrative, the:

…we obsess about shark attacks despite the fact that you are roughly 15,000 times more likely to be hurt or killed in an accident involving a toilet.

is particularly impressive. Anyone with a data set of the ways people have been injured or killed in cases involving a toilet?

The ethical obligation of data scientists comes into focus when:

The Military Cyber Spending reserved by the Pentagon for cyber operations next year is $5 Billion, part of the comprehensive $496 billion fiscal 2015 budget

What are the ethics of taking $millions for work that you know is unnecessary and perhaps even useless?

Do you humor the client, and in the case of government, loot the public till?

Does it make a difference (ethically speaking) that someone else will take the money if you don’t?

Any examples of data scientists not taking on work based on the false threat of cyber-terrorism?

PS: Just in case anyone brings up the Islamic State, the bogeyman of the month of late, point them to: ISIS’s Cyber Caliphate hacks the wrong Google. The current cyber abilities of the Islamic State make them more of a danger to themselves than anyone else. (That’s a factual observation and not an attempt to provide “material support or resources” to the Islamic State.)

March 1, 2016

Avoid “Complete,” “Data Science,” in Titles

Filed under: Data Science — Patrick Durusau @ 10:06 pm

A Complete Tutorial to learn Data Science in R from Scratch by Manish Saraswat.

This is a useful tutorial but it:

  1. Is NOT complete
  2. Does NOT cover all of Data Science

But, this tutorial was tweeted and has been retweeted at least seven times that I know of, possibly more.

Using vague and/or inaccurate terms in titles makes tutorials more difficult to find.

That alone should be reason enough to use better titles.

A more accurate title would be:

R for Predictive Modeling, From Installation to Modeling

That captures the use of R, that the main focus is on predictive modeling and that it will start with the installation of R and proceed to modeling.

Not a word said about all of “data science,” or being “complete,” whatever that means in a discipline with daily advances on multiple fronts.

Just a little effort on the part of authors could improve the lives of all of us desperately searching to find their work.

Yes?

Advice on Reading Academic Papers [Comments on Reading Case Law/Statutes]

Filed under: Government,Law,Law - Sources,Literature,Reading — Patrick Durusau @ 6:55 pm

Advice on Reading Academic Papers by Aaron Massey.

From the post:

Graduate students must learn to read academic papers, but in virtually all cases, these same students are not formally taught how to best read academic papers. It is not the same process used to read a newspaper, magazine, or novel. The process of learning how to read academic papers properly can not only be painful, but also waste quite a bit of time. Here are my quick tips on reading papers of all stripes:

Less detailed than How to read and understand a scientific paper…., which includes a worked example, and not as oriented to CS as How to Read a Paper.

In addition to four other guides, Aaron includes this link which returns (as of today), some 384,000,000 “hits” on the search string: “how to read a scientific paper.”

There appears to be no shortage of advice on “how to read a scientific paper.” 😉

Just for grins, a popular search engine returns these results:

“how to read case law” returns 2,070 “hits,” which dwindles down to 80 when similar materials are removed.

Isn’t that interesting? Case law, which in many cases determines who pays, who goes to jail, who wins, has such poor coverage in reading helps?

“how to read statutes” returns 2,500 “hits,” which dwindles down to 97 when similar materials are omitted.

Beyond the barriers of legal “jargon,” be aware that even ordinary words may not have expected meanings in both case law and statutes.

For best and safest results, always consult licensed legal counsel.

That perpetuates the legal guild but its protective mechanisms are harsh and pitiless. Consider yourself forewarned.

Failure Is Not An Option [Really?]

Filed under: Cybersecurity,Design,Government,Politics,Security — Patrick Durusau @ 3:22 pm

Slogans such as this one distort policy discussions, planning and implementation on a variety of issues.

failure-option-02

The issue here is cybersecurity but it could be sexual harassment, rape, terrorist acts (other than the first two), fraud, hunger, suicide, etc.

Take it as a given there are no, repeat no, “no sparrow shall fall” systems.

Sorry to disappoint you but even with unlimited resources, which no project has, that’s not possible.

Every discussion of cybersecurity or other policy issue MUST include the issue of how much security (risk if you prefer) can be obtained for N resources?

More likely than not you are always going to want more security than you have resources to obtain, but acknowledging that up front enables you to prepare for what happens when security fails.

Which it is going to do. No ifs, ands or buts, all security systems fail. Some more often than others but they all fail.

I don’t consider Roswell to be a counter-example. The information, such as does exist, isn’t important enough for the effort required to obtain it. Some secrets remain secrets out of disinterest.

Realizing failure is not only an option but a certainty, designers don’t have to waste time on plausible deniability and/or responsibility for all breaches. Congress allocated $N resources and for $N resources, you get the rot-13 cipher level of security.

As opposed to the VA routine where Congress allocates $N resources to the VA but expects $N³ care for veterans. Why is anyone surprised the VA provided $N level of care and created mechanisms to deny $N³ care?

Of course, cheating and lying aren’t the best options for dealing with a shortfall in funding but that mirrors the VA funders so that isn’t surprising either.

Be up front with clients and say:

  • Yes, failure is not only an option, it’s going to happen.
  • Anyone who says differently hopes you manage by bumper stickers.
  • Evaluate what $N resources can buy you against risk R.
  • Plan your response to failure (as opposed to the post-failure blame game)

Such an approach will make you a novelty among consultants/contractors.
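
To make the "what can $N buy against risk R" conversation concrete, here is a rough sketch under stated assumptions. The Option class, the option names and every number in it are invented for illustration. The only idea taken from the points above is that the failure probability is never zero, so the expected loss is always part of the bill.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Option:
    name: str
    cost: float                # resources spent on security (the $N)
    breach_probability: float  # estimated yearly chance of failure, never 0
    breach_loss: float         # estimated loss when (not if) it fails

    def expected_loss(self) -> float:
        """Risk R expressed as probability times loss."""
        return self.breach_probability * self.breach_loss

    def total_exposure(self) -> float:
        """What the client pays up front plus what failure is expected to cost."""
        return self.cost + self.expected_loss()


def best_option(options, budget):
    """Pick the affordable option with the lowest total exposure, if any."""
    affordable = [o for o in options if o.cost <= budget]
    if not affordable:
        return None
    return min(affordable, key=lambda o: o.total_exposure())


if __name__ == "__main__":
    # Hypothetical estimates; real ones come from the client's own data.
    options = [
        Option("rot-13 level", 10_000, 0.90, 1_000_000),
        Option("baseline", 100_000, 0.30, 1_000_000),
        Option("hardened", 500_000, 0.05, 1_000_000),
    ]
    for o in options:
        print(o.name, o.cost, o.expected_loss(), o.total_exposure())
    print(best_option(options, budget=150_000))
```

Every row has a non-zero expected loss, which is the number to plan the failure response around.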
