Leak Publication: Sharing, Crediting, and Re-Using Leaks

March 22nd, 2017

If you substitute “leak” for “data” in this essay by Daniella Lowenberg, does it work for leaks as well?

Data Publication: Sharing, Crediting, and Re-Using Research Data by Daniella Lowenberg.

From the post:

In the most basic terms- Data Publishing is the process of making research data publicly available for re-use. But even in this simple statement there are many misconceptions about what Data Publications are and why they are necessary for the future of scholarly communications.

Let’s break down a commonly accepted definition of “research data publishing”. A Data Publication has three core features: 1 – data that are publicly accessible and are preserved for an indefinite amount of time, 2 – descriptive information about the data (metadata), and 3 – a citation for the data (giving credit to the data). Why are these elements essential? These three features make research data reusable and reproducible- the goal of a Data Publication.

As much as I admire the work of the International Consortium of Investigative Journalists (ICIJ), especially its Panama Papers project, sharing data beyond the confines of their community isn’t a value, much less a goal.

Like all secret keepers (government, industry, organizations), ICIJ has “reasons” for its secrecy, but none that I find any more or less convincing than those offered by other secret keepers.

Every secret keeper has an agenda their secrecy serves. Agendas that don’t include a public empowered to make judgments about their secret keeping.

The ICIJ proclaims Leak to Us.

A good place to leak, but include with your leak a demand, an unconditional demand, that your leak be released in its entirety within a year or two of its first publication.

Help enable the public to watch all secrets and secret keepers, not just those some secret keepers choose to expose.

When To Worry About CIA’s Zero-Day Exploits

March 22nd, 2017

Chris McNab’s Alexsey’s TTPs (Tactics, Techniques, and Procedures) post on Alexsey Belan provides a measure for when to worry about Zero-Day exploits held by the CIA.

McNab lists:

  • Belan’s 9 offensive characteristics
  • 5 defensive controls
  • WordPress hack – 12 steps
  • LinkedIn targeting – 11 steps
  • Third victim – 11 steps

McNab observes:


Consider the number of organizations that provide services to their users and employees over the public Internet, including:

  • Web portals for sales and marketing purposes
  • Mail access via Microsoft Outlook on the Web and Google Mail
  • Collaboration via Slack, HipChat, SharePoint, and Confluence
  • DevOps and support via GitHub, JIRA, and CI/CD utilities

Next, consider how many enforce 2FA across their entire attack surface. Large enterprises often expose domain-joined systems to the Internet that can be leveraged to provide privileged network access (via Microsoft IIS, SharePoint, and other services supporting NTLM authentication).

Are you confident 2FA is being enforced over your entire attack surface?

If not, don’t worry about potential CIA-held Zero-Day exploits.

You’re in danger from script kiddies, not the CIA (necessarily).

Alexsey Belan made the Most Wanted list at the FBI.

Crimes listed:

Conspiring to Commit Computer Fraud and Abuse; Accessing a Computer Without Authorization for the Purpose of Commercial Advantage and Private Financial Gain; Damaging a Computer Through the Transmission of Code and Commands; Economic Espionage; Theft of Trade Secrets; Access Device Fraud; Aggravated Identity Theft; Wire Fraud

His FBI poster runs two pages but you could edit off the bottom of the first page to make it suitable for framing.

😉

Try hanging that up in your local university computer lab to test their support for free speech.

CIA Documents or Reports of CIA Documents? Vault7

March 22nd, 2017

As I tool up to analyze the 1134 non-duplicate/artifact HTML files in Vault 7: CIA Hacking Tools Revealed, it occurred to me those aren’t “CIA documents.”

Take Weeping Angel (Extending) Engineering Notes as an example.

Caveat: My range of experience with “CIA documents” is limited to those obtained by Michael Best and others using Freedom of Information Act requests. But that should be sufficient to identify “CIA documents.”

Some things I notice about Weeping Angel (Extending) Engineering Notes:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. reported text
  5. More Wikileaks navigation
  6. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

I’m going to say that the 1134 non-duplicate/artifact HTML files in Vault7, Part1, are reports of portions (which portions is unknown) of some unknown number of CIA documents.

A distinction that influences searching, indexing, concordances, word frequency, just to name a few.

What I need is the reported text, minus:

  1. A Wikileaks header with donation button.
  2. “Vault 7: CIA Hacking Tools Revealed”
  3. Wikileaks navigation
  4. More Wikileaks navigation
  5. Ads for Wikileaks, Tor, Tails, Courage, bitcoin

Check in tomorrow when I boil 1134 reports of CIA documents to get something better suited for text analysis.
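In the meantime, here is a hedged sketch of the sort of extraction I have in mind, once the files are tidied and gathered into a collection (the div id below is hypothetical; inspect a page and adjust before relying on the output):

(: keep only the reported text, drop the Wikileaks chrome; 'release-content' is an assumed id :)
for $doc in collection('collection.xml')
return element report {
  attribute src { tokenize(document-uri($doc), '/')[last()] },
  $doc//*:div[@id = 'release-content']
}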

XQuery 3.1 and Company! (Deriving New Versions?)

March 22nd, 2017

XQuery 3.1: An XML Query Language W3C Recommendation 21 March 2017

Hurray!

Related reading of interest:

XML Path Language (XPath) 3.1

XPath and XQuery Functions and Operators 3.1

XQuery and XPath Data Model 3.1

These recommendations are subject to licenses that read in part:

No right to create modifications or derivatives of W3C documents is granted pursuant to this license, except as follows: To facilitate implementation of the technical specifications set forth in this document, anyone may prepare and distribute derivative works and portions of this document in software, in supporting materials accompanying software, and in documentation of software, PROVIDED that all such works include the notice below. HOWEVER, the publication of derivative works of this document for use as a technical specification is expressly prohibited.

You know I think the organization of XQuery 3.1 and friends could be improved but deriving and distributing “improved” versions is expressly prohibited.

Hmmm, but we are talking about XML and languages to query and transform XML.

Consider the potential of a query that calls XQuery 3.1: An XML Query Language and the materials cited in it, then returns a version of XQuery 3.1 that has definitions from other standards off-set in the XQuery 3.1 text.

Or one that inserts examples or other materials into the text.
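As a hedged sketch of what such a query might look like (the cross-reference and @id conventions are assumptions about the W3C HTML, which would need to be tidied into XML first, much like the Vault 7 files):

(: for every reference from XQuery 3.1 into XDM 3.1, pull the target definition and return it as an aside :)
let $xquery := doc('https://www.w3.org/TR/xquery-31/')
let $xdm    := doc('https://www.w3.org/TR/xpath-datamodel-31/')
for $ref in $xquery//*:a[contains(@href, 'xpath-datamodel')][contains(@href, '#')]
let $target := substring-after($ref/@href, '#')
let $dfn    := ($xdm//*[@id = $target])[1]
where exists($dfn)
return element aside { attribute source { 'XDM 3.1' }, normalize-space($dfn) }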

For decades XML enthusiasts have bruited about dynamic texts but have produced damned few of them (as in zero) for their own standards.

Let’s use the “no derivatives” language of the W3C as an incentive to create not another static document but a dynamic one that can grow or contract according to the wishes of its reader.

Suggestions for first round features?

Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 2 – The PDF Files)

March 21st, 2017

You may want to read Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 1) before reading this post. In Part 1, I walk you through obtaining a copy of Wikileaks’ Vault 7: CIA Hacking Tools Revealed so you can follow and check my analysis and conclusions.

Fact checking applies to every source, including this blog.

I proofed my listing of the 357 PDF files in the first Vault 7 release and report an increase in arguably CIA files and a slight decline in public documents. An increase from 114 to 125 for the CIA and a decrease from 109 to 98 for public documents.

  1. Arguably CIA – 125
  2. Public – 98
  3. Wikileaks placeholders – 134

The listings to date:

  1. CIA (maybe)
  2. Public documents
  3. Wikileaks placeholders

For public documents, I created hyperlinks whenever possible. (Stating a fact and having evidence for it are two different things.) Vendor documentation that was not marked with a security classification I counted as public.

All I can say for the Wikileaks placeholders, some 134 of them, is to ignore them unless you like mining low grade ore.

I created notes in the CIA listing to help narrow your focus down to highly relevant materials.

I have more commentary in the works but I wanted to release these listings in case they help others make efficient use of their time.

Enjoy!

PS: A question I want to start addressing this week is how the dilution of a leak impacts its use.

Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 1)

March 20th, 2017

Executive Summary:

If you reported Vault 7: CIA Hacking Tools Revealed as containing:

8,761 documents and files from an isolated, high-security network situated inside the CIA’s Center for Cyber Intelligence in Langley, Virgina…. (Vault 7: CIA Hacking Tools Revealed)

you failed to check your facts.

I detail my process below but in terms of numbers:

  1. Of 7809 HTML files, 6675 are duplicates or Wikileaks artifacts
  2. Of 357 PDF files, 134 are Wikileaks artifacts (for materials not released). Of the remaining 223 PDF files, 109 of them are public information, the GNU Make Manual for instance. Out of the 357 pdf files, Wikileaks has delivered 114 arguably from the CIA and some of those are doubtful. (Part 2, forthcoming)

Wikileaks haters will find little solace here. My criticisms of Wikileaks are for padding the leak and not enabling effective use of the leak. Padding the leak is obvious from the inclusion of numerous duplicate and irrelevant documents. Effective use of the leak is impaired by the padding but also by teases of what could have been released but wasn’t.

Getting Started

To start on common ground, fire up a torrent client, obtain and decompress: Wikileaks-Year-Zero-2017-v1.7z.torrent.

Decompression requires this password: SplinterItIntoAThousandPiecesAndScatterItIntoTheWinds

The root directory is year0.

When I run a recursive ls from above that directory:

ls -R year0 | wc -l

My system reports: 8820

Change to the year0 directory and ls reveals:

bootstrap/ css/ highlighter/ IMG/ localhost:6081@ static/ vault7/

Checking the files in vault7:

ls -R vault7/ | wc -l

returns: 8755

Change to the vault7 directory and ls shows:

cms/ files/ index.html logo.png

The directory files/ has only one file, org-chart.png, an organization chart of the CIA, but its sub-departments are listed with acronyms and “???.” Did the author of the chart not know the names of those departments? I point that out as the first of many file anomalies.

Some 7809 HTML files are found under cms/.

The cms/ directory has a sub-directory files, plus main.css and 7809 HTML files (including the index.html file).

Duplicated HTML Files

I discovered duplication of the HTML files quite by accident. I had prepared the files with Tidy for parsing with Saxon and compressed a set of those files for uploading.

The 7808 files I compressed started at 296.7 MB.

The compressed size, using 7z, was approximately 3.5 MB.

That’s almost two orders of magnitude of compression. 7z is good, but it’s not quite that good. 😉

Checking my file compression numbers

You don’t have to take my word for the file compression experience. If you select all the page_*, space_* and user_* HTML files in a file browser, it should report a total size of 296.7 MB.

Create a sub-directory of year0/vault7/cms/, say mkdir htmlfiles, and then:

cp *.html htmlfiles

Then: cd htmlfiles

and,

7z a compressedhtml.7z *.html

Run: ls -l compressedhtml.7z

Result: 3488727 Mar 16 16:31 compressedhtml.7z
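As a quick sanity check on that ratio, a one-line XQuery (treating 296.7 MB as mebibytes):

round((296.7 * 1024 * 1024) div 3488727)  (: roughly 89, i.e. close to two orders of magnitude :)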

Tom Harris, in How File Compression Works, explains that:


Most types of computer files are fairly redundant — they have the same information listed over and over again. File-compression programs simply get rid of the redundancy. Instead of listing a piece of information over and over again, a file-compression program lists that information once and then refers back to it whenever it appears in the original program.

If you don’t agree the HTML files are highly repetitive, check the next section where one source of duplication is demonstrated.

Demonstrating Duplication of HTML files

Let’s start with the same file as we look for a source of duplication. Load Cinnamon Cisco881 Testing at Wikileaks into your browser.

Scroll to near the bottom of the file, where you see the “Previous versions” listing.

Yes! There are 136 prior versions of this alleged CIA file in the directory.

Cinnamon Cisco881 Testing has the most prior versions, but all of the HTML files have prior versions.

Are we now in agreement that duplicated versions of the HTML pages exist in the year0/vault7/cms/ directory?

Good!

Now we need to count how many duplicated files there are in year0/vault7/cms/.

Counting Prior Versions of the HTML Files

You may or may not have noticed but every reference to a prior version takes the form:

<a href="filename.html">integer</a>

That’s going to be an important fact, but let’s clean up the HTML so we can process it with XQuery/Saxon.

Preparing for XQuery

Before we start crunching the HTML files, let’s clean them up with Tidy.

Here’s my Tidy config file:

output-xml: yes
quote-nbsp: no
show-warnings: no
show-info: no
quiet: yes
write-back: yes

In htmlfiles I run:

tidy -config tidy.config *.html

Tidy reports two errors:


line 887 column 1 - Error: <declarations> is not recognized!
line 887 column 15 - Error: <string> is not recognized!

Grepping for “declarations”:

grep "declarations" *.html

Returns:

page_26345506.html:<declarations><string name="½ö"></string></declarations><p>›<br>

The string element is present as well so we open up the file and repair it with XML comments:

<!-- <declarations><string name="½ö"></string></declarations><p>›<br> -->
<!-- prior line commented out to avoid Tidy error, pld 14 March 2017-->

Rerun Tidy:

tidy -config tidy.config *.html

Now Tidy returns no errors.

XQuery Finds Prior Versions

Our files are ready to be queried but 7809 is a lot of files.

There are a number of solutions but a simple one is to create an XML collection of the documents and run our XQuery statements across the files as a set.

Here’s how I created a collection file for these files:

I did an ls in the directory and piped that to collection.xml. Opening the file, I deleted index.html, started each entry with <doc href=", ended each one with "/>, inserted <collection> before the first entry and </collection> after the last entry, and then saved the file.

Your version should look something like:

<collection>
  <doc href="page_10158081.html"/>
  <doc href="page_10158088.html"/>
  <doc href="page_10452995.html"/>
...
  <doc href="user_7995631.html"/>
  <doc href="user_8650754.html"/>
  <doc href="user_9535837.html"/>
</collection>
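If your XQuery processor supports directory URIs for uri-collection() (Saxon does, via a ?select= query string), you can generate the collection file instead of hand-editing it. A minimal sketch, with a hypothetical path:

(: emit a <collection> document; save the output as collection.xml; index.html is excluded as before :)
element collection {
  for $uri in uri-collection('file:///path/to/htmlfiles?select=*.html')
  let $name := tokenize($uri, '/')[last()]
  where $name ne 'index.html'
  order by $name
  return element doc { attribute href { $name } }
}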

The prior versions in Cinnamon Cisco881 Testing from Wikileaks have this appearance in HTML source:

<h3>Previous versions:</h3>
<p>| <a href="page_17760540.html">1</a> <span class="pg-tag"><i>empty</i></span>
| <a href="page_17760578.html">2</a> <span class="pg-tag"></span>

...

| <a href="page_23134323.html">135</a> <span class="pg-tag">[Xetron]</span>
| <a href="page_23134377.html">136</a> <span class="pg-tag">[Xetron]</span>
|</p>
</div>

You will need to spend some time with the files (I have obviously) to satisfy yourself that <a> elements that contain only numbers are exclusively used for prior references. If you come across any counter-examples, I would be very interested to hear about them.

To get a file count on all the prior references, I used:

let $count := count(collection('collection.xml')//a[matches(.,'^\d+$')])
return $count

Run that script to find: 6514 previous editions of the base files

Unpacking the XQuery

Rest assured that’s not how I wrote the first XQuery on this data set! 😉

Without exploring all the by-ways and alleys I traversed, I will unpack that query.

First, the goal of the query is to identify every <a> element that contains only digits, recalling that previous version links have only digits in their <a> elements.

A shout out to Jonathan Robie, Editor of XQuery, for reminding me that string expressions match substrings unless they have beginning and ending anchors. Here:

'^\d+$'

The \d matches only digits, the + enables matching 1 or more digits, and the beginning ^ and ending $ eliminate any <a> elements that might start with one or more digits but also contain text, like links to files, etc.

Expanding out a bit more, in [matches(.,'^\d+$')], the [ ] enclose a predicate that consists of the matches function, which takes two arguments. The . here represents the content of an <a> element, followed by a comma as a separator and then the regex that provides the pattern to match against.

Although talked about as a “code smell,” the //a in //a[matches(.,'^\d+$')] enables us to pick up the <a> elements wherever they are located. We did have to repair these HTML files and I don’t want to spend time debugging ephemeral HTML.

Almost there! The collection file, along with the collection function, collection('collection.xml') enables us to apply the XQuery to all the files listed in the collection file.

Finally, we surround all of the foregoing with the count function: count(collection('collection.xml')//a[matches(.,'^\d+$')]) and declare a variable to capture the result of the count function: let $count :=

So far so good? I know, tedious for XQuery jocks but not all news reporters are XQuery jocks, at least not yet!

Then we produce the results: return $count.

But 6514 files aren’t 6675 files, you said 6675 files

Yes, you’re right! Thanks for paying attention!

I said at the top, 6675 are duplicates or Wikileaks artifacts.

Where are the others?

If you look at User #71477, which has the file name user_40828931.html, you will find it’s not a CIA document but part of Wikileaks administration for these documents. There are 90 such pages.

If you look at Marble Framework, which has the file name space_15204359.html, you will find it’s not a CIA document but a form of indexing created by Wikileaks. There are 70 such pages.

Don’t forget the index.html page.

When added together, 6514 (duplicates), 90 (user pages), 70 (space pages), index.html, I get 6675 duplicates or Wikileaks artifacts.

What’s your total?
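If you want to check the user and space page counts against the same collection, here is a minimal sketch (it assumes the Wikileaks file naming is consistent):

(: returns two numbers: count of user_* pages, count of space_* pages :)
let $docs := collection('collection.xml')
return (
  count($docs[matches(document-uri(.), 'user_\d+\.html$')]),
  count($docs[matches(document-uri(.), 'space_\d+\.html$')])
)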


Tomorrow:

In Fact Checking Wikileaks’ Vault 7: CIA Hacking Tools Revealed (Part 2), I look under year0/vault7/cms/files to discover:

  1. Arguably CIA files (maybe) – 114
  2. Public documents – 109
  3. Wikileaks artifacts – 134

I say “Arguably CIA” because there are file artifacts and anomalies that warrant your attention in evaluating those files.

UK Proposes to Treat Journalists As Spies (Your Response Here)

March 19th, 2017

UK’s proposed Espionage Act will treat journalists like spies by Roy Greenslade.

From the post:

Journalists in Britain are becoming increasingly alarmed by the government’s apparent determination to prevent them from fulfilling their mission to hold power to account. The latest manifestation of this assault on civil liberties is the so-called Espionage Act. If passed by parliament, it could lead to journalists who obtain leaked information, along with the whistle blowers who provide it to them, serving lengthy prison sentences.

In effect, it would equate journalists with spies, and its threat to press freedom could not be more stark. It would not so much chill investigative journalism as freeze it altogether.

The proposal is contained in a consultation paper, “Protection of Official Data,” which was drawn up by the Law Commission. Headed by a senior judge, the commission is ostensibly independent of government. Its function is to review laws and recommend reforms to ensure they are fairer and more modern.

But fairness is hardly evident in the proposed law. Its implications for the press were first highlighted in independent news website The Register by veteran journalist Duncan Campbell, who specializes in investigating the U.K. security services.

Comments on the public consultation document can be registered here.

Greenslade reports criticism of the proposal earned this response from the government:


In response, both Theresa May’s government and the Law Commission stressed that it was an early draft of the proposed law change. Then the commission followed up by extending the public consultation period by a further month, setting a deadline of May 3.

Early draft, last draft or the final form from parliament, journalists should treat the proposed Espionage Act as a declaration of war on the press.

Being classified as spies, journalists should start acting as spies. Spies that offer no quarter and who take no prisoners.

Develop allies in other countries who are willing to publish information detrimental to your government.

The government has chosen a side and it’s not yours. What more need be said?

Congress API Update

March 18th, 2017

Congress API Update by Derek Willis.

From the post:

When we took over projects from the Sunlight Foundation last year, we inherited an Application Programming Interface, or API, that overlapped with one of our own.

Sunlight’s Congress API and ProPublica’s Congress API are similar enough that we decided to try to merge them together rather than run them separately, and to do so in a way that makes as few users change their code as possible.

Today we’ve got an update on our progress.

Users of the ProPublica Congress API can now access additional fields in responses for Members, Bills, Votes and Nominations. We’ve updated our documentation to provide examples of those responses. These aren’t new responses but existing ones that now include some new attributes brought over from the Sunlight API. Details on those fields are here.

We plan to fold in Sunlight fields and responses for Committees, Hearings, Floor Updates and Amendments, though that work isn’t finished yet.

The daily waves of bad information on congressional legislation will not be stopped by good information.

However, good information can be used to pick meaningful fights, rather than debating brain farts of 140 characters or less.

Your choice.

RegexBuddy (Think Occur Mode for Emacs)

March 18th, 2017

RegexBuddy

From the webpage:

RegexBuddy is your perfect companion for working with regular expressions. Easily create regular expressions that match exactly what you want. Clearly understand complex regexes written by others. Quickly test any regex on sample strings and files, preventing mistakes on actual data. Debug without guesswork by stepping through the actual matching process. Use the regex with source code snippets automatically adjusted to the particulars of your programming language. Collect and document libraries of regular expressions for future reuse. GREP (search-and-replace) through files and folders. Integrate RegexBuddy with your favorite searching and editing tools for instant access.

Learn all there is to know about regular expressions from RegexBuddy’s comprehensive documentation and regular expression tutorial.

I was reminded of RegexBuddy when I stumbled on the RegexBuddy Manual in a search result.

The XQuery/XPath regex treatment is far briefer than I would like but at 500+ pages, it’s an impressive bit of work. Even without a copy of RegexBuddy, working through the examples will make you a regex terrorist.

The only unfortunate aspect, for *nix users, is that you need to run RegexBuddy in a Windows VM. 🙁

If you are comfortable with Emacs, Windows or otherwise, then the Occur mode comes to mind. It doesn’t have the visuals of RegexBuddy but then you are accustomed to a power-user environment.

In terms of productivity, it’s hard to beat regexes. I passed along a one liner awk regex tip today to extract content from a “…pile of nonstandard multiply redundant JavaScript infested pseudo html.”

I’ve seen the HTML in question. The description seems a bit generous to me. 😉
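If you would rather stay in XQuery than reach for awk, analyze-string() does the same sort of extraction. A small sketch with made-up input:

(: pull href values out of messy markup treated as a plain string :)
let $junk := '<p ><a href="report.html" >Report</a><br><a href="data.csv">Data</a></p>'
for $m in analyze-string($junk, 'href="([^"]+)"')//*:match
return string($m/*:group[@nr = '1'])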

Try your hand at regexes and see if your productivity increases!

Balisage Papers Due in 3 Weeks!

March 16th, 2017

Apologies for the sudden lack of posting but I have been working on a rather large data set with XQuery and checking forwards and backwards to make sure it can be replicated. (I hate “it works on my computer.”)

Anyway, Tommie Usdin dropped an email bomb today with a reminder that Balisage papers are due on April 7, 2017.

From her email:

Submissions to “Balisage: The Markup Conference” and pre-conference symposium:
“Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions”
are due on April 7.

It is time to start writing!

Balisage: The Markup Conference 2017
August 1 — 4, 2017, Rockville, MD (a suburb of Washington, DC)
July 31, 2017 — Symposium Up-Translation and Up-Transformation
https://www.balisage.net/

Balisage: where serious markup practitioners and theoreticians meet every August. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

• Web application development with XML
• Informal data models and consensus-based vocabularies
• Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
• Performance issues in parsing, XML database retrieval, or XSLT processing
• Development of angle-bracket-free user interfaces for non-technical users
• Semistructured data and full text search
• Deployment of XML systems for enterprise data
• Web application development with XML
• Design and implementation of XML vocabularies
• Case studies of the use of XML for publishing, interchange, or archiving
• Alternatives to XML
• the role(s) of XML in the application lifecycle
• the role(s) of vocabularies in XML environments

Detailed Call for Participation: http://balisage.net/Call4Participation.html
About Balisage: http://balisage.net/Call4Participation.html

pre-conference symposium:
Up-Translation and Up-Transformation: Tasks, Challenges, and Solutions
Chair: Evan Owens, Cenveo
https://www.balisage.net/UpTransform/index.html

Increasing the granularity and/or specificity of markup is an important task in many content and information workflows. Markup transformations might involve tasks such as high-level structuring, detailed component structuring, or enhancing information by matching or linking to external vocabularies or data. Enhancing markup presents secondary challenges including lack of structure of the inputs or inconsistency of input data down to the level of spelling, punctuation, and vocabulary. Source data for up-translation may be XML, word processing documents, plain text, scanned & OCRed text, or databases; transformation goals may be content suitable for page makeup, search, or repurposing, in XML, JSON, or any other markup language.

The range of approaches to up-transformation is as varied as the variety of specifics of the input and required outputs. Solutions may combine automated processing with human review or could be 100% software implementations. With the potential for requirements to evolve over time, tools may have to be actively maintained and enhanced. This is the place to discuss goals, challenges, solutions, and workflows for significant XML enhancements, including approaches, tools, and techniques that may potentially be used for a variety of other tasks.

For more information: info@balisage.net or +1 301 315 9631

I’m planning on posting tomorrow one way or the other!

While you wait for that, get to work on your Balisage paper!

Pre-Installed Malware – Espionage Potential

March 14th, 2017

Malware found pre-installed on dozens of different Android devices by David Bisson.

From the post:

Malware in the form of info-stealers, rogue ad networks, and even ransomware came pre-installed on more than three dozen different models of Android devices.

Researchers with Check Point spotted the malware on 38 Android devices owned by a telecommunications company and a multinational technology company.

See David’s post for the details but it raises the intriguing opportunity to supply government and corporate offices with equipment with malware pre-installed.

No more general or targeted phishing schemes, difficult attempts to breach physical security and/or to avoid anti-virus or security programs.

The “you leak, we print” model of the news media makes it unlikely news organizations will want to get their skirts dirty pre-installing malware on hardware.

News organizations consider themselves “ethical” in publishing stolen information but are unwilling to steal it themselves, because stealing is “unethical.”

There’s some nuance in there I am missing, perhaps that being proven to have stolen carries a prison sentence in most places. Odd how ethics correspond to self-interest, isn’t it?

If you are interested in the number of opportunities for malware on computers in 2017, check out Computers Sold This Year. It reports as of today over 41 million computers sold this year alone.

News organizations don’t have the skills to create a malware network but if information were treated as having value, separate from the means of its acquisition, a viable market would not be far behind.

New Wiper Malware – A Path To Involuntary Transparency

March 14th, 2017

From Shamoon to StoneDrill – Advanced New Destructive Malware Discovered in the Wild by Kaspersky Lab

From the press release:

The Kaspersky Lab Global Research and Analysis Team has discovered a new sophisticated wiper malware, called StoneDrill. Just like another infamous wiper, Shamoon, it destroys everything on the infected computer. StoneDrill also features advanced anti-detection techniques and espionage tools in its arsenal. In addition to targets in the Middle East, one StoneDrill target has also been discovered in Europe, where wipers used in the Middle East have not previously been spotted in the wild.

Besides the wiping module, Kaspersky Lab researchers have also found a StoneDrill backdoor, which has apparently been developed by the same code writers and used for espionage purposes. Experts discovered four command and control panels which were used by attackers to run espionage operations with help of the StoneDrill backdoor against an unknown number of targets.

Perhaps the most interesting thing about StoneDrill is that it appears to have connections to several other wipers and espionage operations observed previously. When Kaspersky Lab researchers discovered StoneDrill with the help of Yara-rules created to identify unknown samples of Shamoon, they realised they were looking at a unique piece of malicious code that seems to have been created separately from Shamoon. Even though the two families – Shamoon and StoneDrill – don’t share the exact same code base, the mind-set of the authors and their programming “style” appear to be similar. That’s why it was possible to identify StoneDrill with the Shamoon-developed Yara-rules.

Code similarities with older known malware were also observed, but this time not between Shamoon and StoneDrill. In fact StoneDrill uses some parts of the code previously spotted in the NewsBeef APT, also known as Charming Kitten – another malicious campaign which has been active in the last few years.

For details beyond the press release, see: From Shamoon to StoneDrill: Wipers attacking Saudi organizations and beyond by Costin Raiu, Mohamad Amin Hasbini, Sergey Belov, Sergey Mineev or the full report, same title, version 1.05.

Wipers can impact corporate and governmental operations but they may be hiding crimes and misdeeds at the same time.

Of greater interest are the espionage operations enabled by StoneDrill.

If you are interested in planting false flags, pay particular attention to the use of language analysis in the full report.

Taking a clue from Lakoff on framing, would your opinion of StoneDrill change if instead of “espionage” it was described as a “corporate/government transparency” tool?

I don’t recall anyone saying that transparency is by definition voluntary.

Perhaps that’s the ticket. Malware can bring about involuntary transparency.

Yes?

Less Than Accurate Cybersecurity News Headline – From Phys.org No Less

March 13th, 2017

Skimming through my Twitter stream I encountered a tweet of the headline “New technique completely protects internet pictures and videos from cyberattacks.”

That sounds important and it’s from Phys.org.

Who describe themselves in 100 words:

Phys.org™ (formerly Physorg.com) is a leading web-based science, research and technology news service which covers a full range of topics. These include physics, earth science, medicine, nanotechnology, electronics, space, biology, chemistry, computer sciences, engineering, mathematics and other sciences and technologies. Launched in 2004, Phys.org’s readership has grown steadily to include 1.75 million scientists, researchers, and engineers every month. Phys.org publishes approximately 100 quality articles every day, offering some of the most comprehensive coverage of sci-tech developments world-wide. Quancast 2009 includes Phys.org in its list of the Global Top 2,000 Websites. Phys.org community members enjoy access to many personalized features such as social networking, a personal home page set-up, RSS/XML feeds, article comments and ranking, the ability to save favorite articles, a daily newsletter, and other options.

So I bit and visited New technique completely protects internet pictures and videos from cyberattacks, which reads in part:

A Ben-Gurion University of the Negev (BGU) researcher has developed a new technique that could provide virtually 100 percent protection against cyberattacks launched through internet videos or images, which are a growing threat.

“Any downloaded or streamed video or picture is a potential vehicle for a cyberattack,” says Professor Ofer Hadar, chair of BGU’s Department of Communication Systems Engineering. “Hackers like videos and pictures because they bypass the regular data transfer systems of highly secure systems, and there is significant space in which to implant malicious code.”

“Preliminary experimental results show that a method based on a combination of Coucou Project techniques results in virtually 100 percent protection against cyberattacks,” says Prof. Hadar. “We envision that firewall and antivirus companies will be able to utilize Coucou protection applications and techniques in their products.”

The Coucou Project receives funding from the BGU Cyber Security Research Center and the BaseCamp Innovation Center at the Advanced Technologies Park adjacent to BGU, which is interested in developing the protective platform into a commercial enterprise.

Summary: Cyberattackers using internet videos or images are in little danger of being thwarted any time soon.

First, Professor Hadar’s technique would need to be verified by other researchers. (Possibly it has been, but no publications are cited.)

Second, the technique must not introduce additional cybersecurity weaknesses.

Third, vendors have to adopt and implement the techniques.

Fourth, users must upgrade to new software that incorporates the new techniques.

A more accurate headline reads:

New Technique In Theory Protects Pictures and Videos From Cyberattacks

Yes?

Notes to (NUS) Computer Science Freshmen…

March 13th, 2017

Notes to (NUS) Computer Science Freshmen, From The Future

From the intro:

Early into the AY12/13 academic year, Prof Tay Yong Chiang organized a supper for Computer Science freshmen at Tembusu College. The bunch of seniors who were gathered there put together a document for NUS computing freshmen. This is that document.

Feel free to create a pull request to edit or add to it, and share it with other freshmen you know.

There is one sad note:


The Art of Computer Programming (a review of everything in Computer Science; pretty much nobody, save Knuth, has finished reading this)

When you think about the amount of time Knuth has spent researching, writing and editing The Art of Computer Programming (TAOCP), it doesn’t sound unreasonable to expect others, a significant number of others, to have read it.

Any online reading groups focused on TAOCP?

AI Brain Scans

March 13th, 2017

‘AI brain scans’ reveal what happens inside machine learning


The ResNet architecture is used for building deep neural networks for computer vision and image recognition. The image shown here is the forward (inference) pass of the ResNet 50 layer network used to classify images after being trained using the Graphcore neural network graph library

Credit Graphcore / Matt Fyles

The image is great eye candy, but if you want to see images annotated with information, check out: Inside an AI ‘brain’ – What does machine learning look like? (Graphcore)

From the product overview:

Poplar™ is a scalable graph programming framework targeting Intelligent Processing Unit (IPU) accelerated servers and IPU accelerated server clusters, designed to meet the growing needs of both advanced research teams and commercial deployment in the enterprise. It’s not a new language, it’s a C++ framework which abstracts the graph-based machine learning development process from the underlying graph processing IPU hardware.

Poplar includes a comprehensive, open source set of Poplar graph libraries for machine learning. In essence, this means existing user applications written in standard machine learning frameworks, like Tensorflow and MXNet, will work out of the box on an IPU. It will also be a natural basis for future machine intelligence programming paradigms which extend beyond tensor-centric deep learning. Poplar has a full set of debugging and analysis tools to help tune performance and a C++ and Python interface for application development if required.

The IPU-Appliance for the Cloud is due out in 2017. I have looked at Graphcore but came up dry on the Poplar graph libraries and/or an emulator for the IPU.

Perhaps those will both appear later in 2017.

Optimized hardware for graph calculations sounds promising but rapidly processing nodes that may or may not represent the same subject seems like a defect waiting to make itself known.

Many approaches rapidly process uncertain big data but being no more ignorant than your competition is hardly a selling point.

Creating A Social Media ‘Botnet’ To Skew A Debate

March 10th, 2017

New Research Shows How Common Core Critics Built Social Media ‘Botnets’ to Skew the Education Debate by Kevin Mahnken.

From the post:

Anyone following education news on Twitter between 2013 and 2016 would have been hard-pressed to ignore the gradual curdling of Americans’ attitudes toward the Common Core State Standards. Once seen as an innocuous effort to lift performance in classrooms, they slowly came to be denounced as “Dirty Commie agenda trash” and a “Liberal/Islam indoctrination curriculum.”

After years of social media attacks, the damage is impressive to behold: In 2013, 83 percent of respondents in Education Next’s annual poll of Americans’ education attitudes felt favorably about the Common Core, including 82 percent of Republicans. But by the summer of 2016, support had eroded, with those numbers measuring only 50 percent and 39 percent, respectively. The uproar reached such heights, and so quickly, that it seemed to reflect a spontaneous populist rebellion against the most visible education reform in a decade.

Not so, say researchers with the University of Pennsylvania’s Consortium for Policy Research in Education. Last week, they released the #commoncore project, a study that suggests that public animosity toward Common Core was manipulated — and exaggerated — by organized online communities using cutting-edge social media strategies.

As the project’s authors write, the effect of these strategies was “the illusion of a vociferous Twitter conversation waged by a spontaneous mass of disconnected peers, whereas in actuality the peers are the unified proxy voice of a single viewpoint.”

Translation: A small circle of Common Core critics were able to create and then conduct their own echo chambers, skewing the Twitter debate in the process.

The most successful of these coordinated campaigns originated with the Patriot Journalist Network, a for-profit group that can be tied to almost one-quarter of all Twitter activity around the issue; on certain days, its PJNET hashtag has appeared in 69 percent of Common Core–related tweets.

The team of authors tracked nearly a million tweets sent during four half-year spans between September 2013 and April 2016, studying both how the online conversation about the standards grew (more than 50 percent between the first phase, September 2013 through February 2014, and the third, May 2015 through October 2015) and how its interlocutors changed over time.

Mahnken talks as though creating a ‘botnet’ to defeat adoption of the Common Core State Standards is a bad thing.

I never cared for #commoncore because testing makes money for large and small testing vendors. It has no other demonstrated impact on the educational process.

Let’s assume you want to build a championship high school baseball team. To do that, various officious intermeddlers, who have no experience with baseball, fund creation of the Common Core Baseball Standards.

Every three years, every child is tested against the Common Core Baseball Standards and their performance recorded. No funds are allocated for additional training for gifted performers, equipment, baseball fields, etc.

By the time these students reach high school, will you have the basis for a championship team? Perhaps, but if you do, it’s due to random chance and not the Common Core Baseball Standards.

If you want a championship high school baseball team, you fund training, equipment, and baseball fields, in addition to spending money on the best facilities for your hoped-for championship high school team. Consistently and over time you spend money.

The key to better education results isn’t testing, but funding based on the education results you hope to achieve.

I do commend the #commoncore project website for being an impressive presentation of Twitter data, even though it is clearly a propaganda machine for pro Common Core advocates.

The challenge here is to work backwards from what was observed by the project to the principles and tactics that made #stopcommoncore so successful. That is, we know it succeeded, at least to some degree, but how do we replicate that success on other issues?

Replication is how science demonstrates the reliability of a technique.

Looking forward to hearing your thoughts, suggestions, etc.

Enjoy!

Eight Simple Rules for Doing Accurate Journalism [+ One]

March 10th, 2017

Eight Simple Rules for Doing Accurate Journalism by Craig Silverman.

From the post:

It’s a cliché to say clichés exist for a reason. As journalists, we’re supposed to avoid them like the, um, plague. But it’s useful to have a catchy phrase that can stick in someone’s mind, particularly if you’re trying to spread knowledge or change behaviour.

This week I began cataloguing some of my own sayings about accuracy — you can consider them aspiring clichés — and other phrases I find helpful or instructive in preparation for a workshop I’m giving with The Huffington Post’s Mandy Jenkins at next week’s Online News Association conference. Our session is called B.S. Detection for Online Journalists. The goal is to equip participants with tools, tips, and knowledge to get things right, and weed out misinformation and hoaxes before they spread them.

So, with apologies to Bill Maher, I offer some new, some old, and some wonderfully clichéd rules for doing accurate journalism. Keep these in your head and they’ll help you do good work.

The problem of verification, if journal retractions are any guide, isn’t limited to those writing under deadline pressure. Verification is neglected by those who spend months word-smithing texts.

I like Silverman’s post but I would ask:

Why do you say that?

However commonplace or bizarre a statement may be, always challenge the speaker for their basis for making it.

Take former CIA Director Michael Hayden‘s baseless notion that:

“…but this group of millennials and related groups simply have different understandings of the words loyalty, secrecy, and transparency than certainly my generation did.”

As Zaid Jilani goes on to demonstrate, Hayden’s opinion isn’t rooted in fact but prejudice.

The question at that point is whether Hayden’s prejudice is newsworthy enough to be reported. Having ascertained that Hayden is just grousing, why not leave the interview on the cutting room floor?

Journalists have no obligation to repeat the prejudices of current or former government officials as being worthy of notice.

XQuery Ready CIA Vault7 Files

March 10th, 2017

I have extracted the HTML files from WikiLeaks Vault7 Year Zero 2017 V 1.7z, processed them with Tidy (see note on correction below), and uploaded the “tidied” HTML files to: Vault7-CIA-Clean-HTML-Only.

Beyond the usual activities of Tidy, I did have to correct the file page_26345506.html by creating a comment around one line of content:

<!-- <declarations><string name="½ö"></string></declarations><p>›<br> -->

Otherwise, the files are only corrected HTML markup with no other changes.

The HTML compresses well, 7811 files coming in at 3.4 MB.

Demonstrate the power of basic XQuery skills!
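If you want a starter query for the collection, here is a minimal sketch (the collection file is built as described in the fact-checking posts):

(: list every page by its <title>, one line per file :)
for $doc in collection('collection.xml')
let $title := normalize-space(($doc//*:title)[1])
order by $title
return concat($title, ' :: ', tokenize(document-uri($doc), '/')[last()])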

Enjoy!

Unicode 10.0 Beta Review

March 9th, 2017

Unicode 10.0 Beta Review

In today’s mail:

The Unicode Standard is the foundation for all modern software and communications around the world, including all modern operating systems, browsers, laptops, and smart phones—plus the Internet and Web (URLs, HTML, XML, CSS, JSON, etc.). The Unicode Standard, its associated standards, and data form the foundation for CLDR and ICU releases. Thus it is important to ensure a smooth transition to each new version of the standard.

Unicode 10.0 includes a number of changes. Some of the Unicode Standard Annexes have modifications for Unicode 10.0, often in coordination with changes to character properties. In particular, there are changes to UAX #14, Unicode Line Breaking Algorithm, UAX #29, Unicode Text Segmentation, and UAX #31, Unicode Identifier and Pattern Syntax. In addition, UAX #50, Unicode Vertical Text Layout, has been newly incorporated as a part of the standard. Four new scripts have been added in Unicode 10.0, including Nüshu. There are also 56 additional emoji characters, a major new extension of CJK ideographs, and 285 hentaigana, important historic variants for Hiragana syllables.

Please review the documentation, adjust your code, test the data files, and report errors and other issues to the Unicode Consortium by May 1, 2017. Feedback instructions are on the beta page.

See http://unicode.org/versions/beta-10.0.0.html for more information about testing the 10.0.0 beta.

See http://unicode.org/versions/Unicode10.0.0/ for the current draft summary of Unicode 10.0.0.

It’s not too late for you to contribute to the Unicode party! There is plenty of reviewing to do and by no means has all the work been done!

For this particular version, comments are due by May 1, 2017.

Enjoy!

Smile! You May Be On A Candid Camera!

March 9th, 2017

Hundreds of Thousands of Vulnerable IP Cameras Easy Target for Botnet, Researcher Says by Chris Brook.

From the post:

A researcher claims that hundreds of thousands of shoddily made IP cameras suffer from vulnerabilities that could make them an easy target for attackers looking to spy, brute force them, or steal their credentials.

Researcher Pierre Kim disclosed the vulnerabilities Wednesday and gave a comprehensive breakdown of the affected models in an advisory on his GitHub page.

Kim, a gifted security researcher who has discovered a number of backdoors in routers, estimates there are at least 18,000 vulnerable cameras in the United States alone. That figure may be as high as 200,000 worldwide.

For all of the pissing and moaning in Chris’ post, I don’t see the problem.

Governments, corporations, web hosts either have us under surveillance or their equipment is down for repairs.

Equipment that isn’t under their direct control, such as “shoddily made IP cameras,” provides an opportunity for citizens to return the surveillance favor.

That is, to perform surveillance on those who accept surveillance of the “masses” but find surveillance of their own activities oddly objectionable.

Think of it this way:

The US government has to keep track of approximately 324 million people, give or take. With all the sources of information on every person, that’s truly a big data problem.

Turn that problem around and consider that Congress has only 535 members.

That’s more of a laptop sized data problem, albeit that they are clever about covering their tracks. Or think they are at any rate.

No, the less security that exists in general the more danger there is for highly visible individuals.

Think about who is more vulnerable before you complain about a lack of security.

The security the government is trying to protect isn’t for you. I promise. (The hoarding of cyber exploits by the CIA is only one such example.)

How Bad Is Wikileaks Vault7 (CIA) HTML?

March 9th, 2017

How bad?

Unless you want to hand-correct 7809 HTML files for use with XQuery, grab the latest copy of Tidy.

It’s not the worst HTML I have ever seen, but put that in the context of having seen a lot of really poor HTML.

I’ve “tidied” up a test collection and will grab a fresh copy of the files before producing and releasing a clean set of the HTML files.

Producing a document collection for XQuery processing. Working towards something suitable for application of NLP and other tools.

That CIA exploit list in full: … [highlights]

March 8th, 2017

That CIA exploit list in full: The good, the bad, and the very ugly by Iain Thomson.

From the post:

We’re still going through the 8,761 CIA documents published on Tuesday by WikiLeaks for political mischief, although here are some of the highlights.

First, though, a few general points: one, there’s very little here that should shock you. The CIA is a spying organization, after all, and, yes, it spies on people.

Two, unlike the NSA, the CIA isn’t mad keen on blanket surveillance: it targets particular people, and the hacking tools revealed by WikiLeaks are designed to monitor specific persons of interest. For example, you may have seen headlines about the CIA hacking Samsung TVs. As we previously mentioned, that involves breaking into someone’s house and physically reprogramming the telly with a USB stick. If the CIA wants to bug you, it will bug you one way or another, smart telly or no smart telly. You’ll probably be tricked into opening a dodgy attachment or download.

That’s actually a silver lining to all this: end-to-end encrypted apps, such as Signal and WhatsApp, are so strong, the CIA has to compromise your handset, TV or computer to read your messages and snoop on your webcam and microphones, if you’re unlucky enough to be a target. Hacking devices this way is fraught with risk and cost, so only highly valuable targets will be attacked. The vast, vast majority of us are not walking around with CIA malware lurking in our pockets, laptop bags, and living rooms.

Thirdly, if you’ve been following US politics and WikiLeaks’ mischievous role in the rise of Donald Trump, you may have clocked that Tuesday’s dump was engineered to help the President pin the hacking of his political opponents’ email server on the CIA. The leaked documents suggest the agency can disguise its operations as the work of a foreign government. Thus, it wasn’t the Russians who broke into the Democrats’ computers and, by leaking the emails, helped swing Donald the election – it was the CIA all along, Trump can now claim. That’ll shut the intelligence community up. The President’s pet news outlet Breitbart is already running that line.

Iain does a good job of picking out some of the more interesting bits from the CIA (alleged) file dump. No, you will have to read Iain’s post for those.

I mention Iain’s post primarily as a way to entice you into reading all the files in hopes of discovering more juicy tidbits.

Read the files. Your security depends on the indifference of the CIA and similar agencies. Is that your model for privacy?

Gap Analysis Resource – Electrical Grid

March 8th, 2017

Electricity – Federal Efforts to Enhance Grid Resilience Government Accounting Office (GAO) (January 2017)

What GAO Found

The Department of Energy (DOE), the Department of Homeland Security (DHS), and the Federal Energy Regulatory Commission (FERC) reported implementing 27 grid resiliency efforts since 2013 and identified a variety of results from these efforts. The efforts addressed a range of threats and hazards—including cyberattacks, physical attacks, and natural disasters—and supported different types of activities (see table). These efforts also addressed each of the three federal priorities for enhancing the security and resilience of the electricity grid: (1) developing and deploying tools and technologies to enhance awareness of potential disruptions, (2) planning and exercising coordinated responses to disruptive events, and (3) ensuring actionable intelligence on threats is communicated between government and industry in a time-sensitive manner. Agency officials reported a variety of results from these efforts, including the development of new technologies—such as a rapidly-deployable large, highpower transformer—and improved coordination and information sharing between the federal government and industry related to potential cyberattacks.

(table omitted)

Federal grid resiliency efforts were fragmented across DOE, DHS, and FERC and overlapped to some degree but were not duplicative. GAO found that the 27 efforts were fragmented in that they were implemented by three agencies and addressed the same broad area of national need: enhancing the resilience of the electricity grid. However, DOE, DHS, and FERC generally tailored their efforts to contribute to their specific missions. For example, DOE’s 11 efforts related to its strategic goal to support a more secure and resilient U.S. energy infrastructure. GAO also found that the federal efforts overlapped to some degree but were not duplicative because none had the same goals or engaged in the same activities. For example, three DOE and DHS efforts addressed resiliency issues related to large, high-power transformers, but the goals were distinct—one effort focused on developing a rapidly deployable transformer to use in the event of multiple large, high-power transformer failures; another focused on developing next-generation transformer components with more resilient features; and a third focused on developing a plan for a national transformer reserve. Moreover, officials from all three agencies reported taking actions to coordinate federal grid resiliency efforts, such as serving on formal coordinating bodies that bring together federal, state, and industry stakeholders to discuss resiliency issues on a regular basis, and contributing to the development of federal plans that address grid resiliency gaps and priorities. GAO found that these actions were consistent with key practices for enhancing and sustaining federal agency coordination.
…(emphasis in original)

A high level view of efforts to “protect” the electrical grid (grid) in the United States.

Most of the hazards (massive solar flares like the 1859 Carrington Event, or a nuclear EMP) would easily overwhelm many if not all current measures to harden the grid.

Still, participants get funded to talk about hazards and dangers they can’t prevent nor easily remedy.

What dangers do you want to protect the grid against?

Headless Raspberry Pi Hacking Platform Running Kali Linux

March 8th, 2017

Set Up a Headless Raspberry Pi Hacking Platform Running Kali Linux by Sadmin.

From the post:

The Raspberry Pi is a credit card-sized computer that can crack Wi-Fi, clone key cards, break into laptops, and even clone an existing Wi-Fi network to trick users into connecting to the Pi instead. It can jam Wi-Fi for blocks, track cell phones, listen in on police scanners, broadcast an FM radio signal, and apparently even fly a goddamn missile into a helicopter.

The key to this power is a massive community of developers and builders who contribute thousands of builds for the Kali Linux and Raspberry Pi platforms. For less than a tank of gas, a Raspberry Pi 3 buys you a low-cost, flexible cyberweapon.

Of course, it’s important to compartmentalize your hacking and avoid using systems that uniquely identify you, like customized hardware. Not everyone has access to a supercomputer or gaming tower, but fortunately one is not needed to have a solid Kali Linux platform.

With over 10 million units sold, the Raspberry Pi can be purchased in cash by anyone with $35 to spare. This makes it more difficult to determine who is behind an attack launched from a Raspberry Pi, as it could just as likely be a state-sponsored attack flying under the radar or a hyperactive teenager in high school coding class.

Blogging while I wait for the Wikileaks Vault7 Part 1 files to load into an XML database. The rhyme or reason (or the lack thereof) behind Wikileaks releases continues to escape me.
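
While that load runs, here is a first-pass sketch of the kind of thing I mean (my own throwaway approach, not anything from Wikileaks): walk an extracted copy of the dump, pull the title out of each HTML page, and use that as a starting point for a saner organization. The DUMP_DIR path is an assumption; point it at wherever you extracted the archive.

import os
from html.parser import HTMLParser

DUMP_DIR = "vault7/year0"   # hypothetical extraction directory

class TitleGrabber(HTMLParser):
    """Collect the text of the first <title> element encountered."""
    def __init__(self):
        super().__init__()
        self.in_title = False
        self.title = ""

    def handle_starttag(self, tag, attrs):
        if tag == "title" and not self.title:
            self.in_title = True

    def handle_endtag(self, tag):
        if tag == "title":
            self.in_title = False

    def handle_data(self, data):
        if self.in_title:
            self.title += data

# Walk the extracted dump and print path plus page title, tab separated.
for root, _dirs, files in os.walk(DUMP_DIR):
    for name in files:
        if not name.endswith(".html"):
            continue
        path = os.path.join(root, name)
        grabber = TitleGrabber()
        with open(path, encoding="utf-8", errors="replace") as handle:
            grabber.feed(handle.read())
        print(f"{path}\t{grabber.title.strip()}")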

Within a day or so I will drop what I think is a more useful organization of that information.

While you wait, this is a particularly good post on using a Raspberry Pi “for reconnaissance and attacking Wi-Fi networks” in the author’s words.
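
One practical headless detail before you get to the fun parts: after first boot you still have to find the Pi on the local network to SSH in. A minimal sketch, not from Sadmin's post, that sweeps an address range for hosts answering on port 22 (the 192.168.1.0/24 subnet is my assumption; adjust for your network):

import socket

SUBNET = "192.168.1."   # hypothetical home subnet
PORT = 22
TIMEOUT = 0.3           # seconds per host; raise on slow networks

def ssh_hosts(subnet=SUBNET, port=PORT, timeout=TIMEOUT):
    """Return addresses in subnet .1-.254 with an open SSH port."""
    found = []
    for host in range(1, 255):
        addr = f"{subnet}{host}"
        try:
            with socket.create_connection((addr, port), timeout=timeout):
                found.append(addr)
        except OSError:
            pass  # closed, filtered, or nothing at this address
    return found

if __name__ == "__main__":
    for addr in ssh_hosts():
        print(f"SSH open: {addr}")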

Although a Raspberry Pi is easy to conceal, both on your person and on location, the purpose of such a device isn’t hard to discern.

If you are carrying a Raspberry Pi, avoid being searched until after you can dispose of it. Make sure that your fingerprints or biological trace evidence is not on it.

I say “your fingerprints or biological trace evidence” because it would be amusing if fingerprints or biological trace evidence implicated some resident of the facility where it is found.

The consequences of being suspected of possessing a Kali Linux-equipped Raspberry Pi versus being proven to have possessed one may differ by years.

Go carefully.

Confirmation: Internet of Things As Hacking Avenue

March 7th, 2017

I mentioned the Internet of Things (IoT) in Reading the Unreadable SROM: Inside the PSOC4 [Hacking Leader In Internet of Things Suppliers] as a growing (“Compound Annual Growth Rate (CAGR) of 33.3%”) source of cyber insecurity.

Today, Bill Brenner writes:

WikiLeaks’ release of 8,761 pages of internal CIA documents makes this much abundantly clear: the agency has built a monster hacking operation – possibly the biggest in the world – on the backs of the many internet-connected household gadgets we take for granted.

That’s the main takeaway among security experts Naked Security reached out to after the leak went public earlier Tuesday.

I appreciate the confirmation!

Yes, the IoT can be and is being used for government surveillance.

At the same time, the IoT is a tremendous opportunity to level the playing field against corporations and governments alike.

If the IoT isn’t being used against corporations and governments, whose fault is that?

That’s my guess too.

You can bulk download the first drop from: https://archive.org/details/wikileaks.vault7part1.tar.

Vault 7: CIA Hacking Tools In Bulk Download

March 7th, 2017

If you want to avoid mirroring Vault 7: CIA Hacking Tools Revealed for yourself, check out: https://archive.org/details/wikileaks.vault7part1.tar.

As for why Wikileaks doesn’t offer bulk access to its data sets, you would have to ask Wikileaks.
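
If you would rather script the mirror than click through, archive.org exposes a metadata endpoint for every item. A minimal sketch (listing only; bolt on your own download loop) that prints direct download URLs for the files in this item:

import json
from urllib.request import urlopen

ITEM = "wikileaks.vault7part1.tar"
META_URL = f"https://archive.org/metadata/{ITEM}"

# Fetch the item's metadata, which includes a "files" array.
with urlopen(META_URL) as response:
    metadata = json.load(response)

# Print a direct download URL and reported size for each file.
for entry in metadata.get("files", []):
    name = entry["name"]
    size = entry.get("size", "?")
    print(f"https://archive.org/download/{ITEM}/{name}\t{size} bytes")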

Enjoy!

Wikileaks Armed – You’re Not

March 7th, 2017

Vault 7: CIA Hacking Tools Revealed (Wikileaks).

Very excited to read:

Today, Tuesday 7 March 2017, WikiLeaks begins its new series of leaks on the U.S. Central Intelligence Agency. Code-named “Vault 7” by WikiLeaks, it is the largest ever publication of confidential documents on the agency.

The first full part of the series, “Year Zero”, comprises 8,761 documents and files from an isolated, high-security network situated inside the CIA’s Center for Cyber Intelligence in Langley, Virginia. It follows an introductory disclosure last month of CIA targeting French political parties and candidates in the lead up to the 2012 presidential election.

Recently, the CIA lost control of the majority of its hacking arsenal including malware, viruses, trojans, weaponized “zero day” exploits, malware remote control systems and associated documentation. This extraordinary collection, which amounts to more than several hundred million lines of code, gives its possessor the entire hacking capacity of the CIA. The archive appears to have been circulated among former U.S. government hackers and contractors in an unauthorized manner, one of whom has provided WikiLeaks with portions of the archive.

Very disappointed to read:

Wikileaks has carefully reviewed the “Year Zero” disclosure and published substantive CIA documentation while avoiding the distribution of ‘armed’ cyberweapons until a consensus emerges on the technical and political nature of the CIA’s program and how such ‘weapons’ should be analyzed, disarmed and published.

Wikileaks has also decided to redact and anonymise some identifying information in “Year Zero” for in depth analysis. These redactions include tens of thousands of CIA targets and attack machines throughout Latin America, Europe and the United States. While we are aware of the imperfect results of any approach chosen, we remain committed to our publishing model and note that the quantity of published pages in “Vault 7” part one (“Year Zero”) already eclipses the total number of pages published over the first three years of the Edward Snowden NSA leaks.

For all of the fretting over the “…extreme proliferation risk in the development of cyber ‘weapons’…”, the bottom line is that Wikileaks and its agents are armed with CIA cyber weapons and you are not.

Assange/Wikileaks have cast their vote in favor of arming themselves and protecting the CIA and others.

Responsible leaking of cyber weapons means arming everyone equally.

Continuing Management Fail At Twitter

March 6th, 2017

Twitter management continues to fail.

Consider the censoring of the account of Lauri Love (a rumored hacker).

Competent management at Twitter would be licensing the rights to create shareable mutes/filters for all posts from Lauri Love.

The FBI, Breitbart, US State Department, and others would vie for users of their filters, which block “dangerous and/or seditious content.”

Filters would be licensed in increments, depending on how many shares you want to enable.

Twitter with no censorship at all would drive the market for such filters.

Licensing filters by number of shares provides a steady revenue stream, and Twitter could shed its censorship-prone barnacles. More profit, reduced costs, what’s not to like?
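
To make the licensing mechanics concrete, here is a toy sketch of the model described above (the class, account names, and numbers are mine, not any Twitter API): a filter sold in share increments that stops sharing once the licensed count is exhausted.

from dataclasses import dataclass, field

@dataclass
class ShareableFilter:
    owner: str               # e.g. an agency or publisher licensing the filter
    blocked_accounts: set    # accounts the filter mutes for its users
    licensed_shares: int     # how many shares this license increment covers
    recipients: list = field(default_factory=list)

    def share(self, recipient: str) -> bool:
        """Grant the filter to one more user if the license still allows it."""
        if len(self.recipients) >= self.licensed_shares:
            return False     # licensee must buy the next increment
        self.recipients.append(recipient)
        return True

# A hypothetical filter muting one account, licensed for two shares.
fbi_filter = ShareableFilter(owner="FBI",
                             blocked_accounts={"LauriLove"},
                             licensed_shares=2)
print(fbi_filter.share("user1"))  # True
print(fbi_filter.share("user2"))  # True
print(fbi_filter.share("user3"))  # False: time to license the next increment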

PS: I ask nothing for this suggestion. Getting Twitter out of the censorship game on behalf of governments is benefit enough for me.

Reading the Unreadable SROM: Inside the PSOC4 [Hacking Leader In Internet of Things Suppliers]

March 6th, 2017

Reading the Unreadable SROM: Inside the PSOC4 by Elliot Williams.

From the post:

Wow. [Dmitry Grinberg] just broke into the SROM on Cypress’ PSoC 4 chips. The supervisory read-only memory (SROM) in question is a region of proprietary code that runs when the chip starts up, and in privileged mode. It’s exactly the kind of black box that’s a little bit creepy and a horribly useful target for hackers if the black box can be broken open. What’s inside? In the manual it says “The user has no access to read or modify the SROM code.” Nobody outside of Cypress knows. Until now.

This matters because the PSoC 4000 chips are among the cheapest ARM Cortex-M0 parts out there. Consequently they’re inside countless consumer devices. Among [Dmitry]’s other tricks, he’s figured out how to write into the SROM, which opens the door for creating an undetectable rootkit on the chip that runs out of each reset. That’s the scary part.

The cool parts are scattered throughout [Dmitry]’s long and detailed writeup. He also found that the chips that have 8 K of flash actually have 16 K, and access to the rest of the memory is enabled by setting a single bit. This works because flash is written using routines that live in SROM, rather than the usual hardware-level write-to-register-and-wait procedure that we’re accustomed to with other micros. Of course, because it’s all done in software, you can brick the flash too by writing the wrong checksums. [Dmitry] did that twice. Good thing the chips are inexpensive.

We should all commend Dmitry Grinberg on his choice of the leading Internet of Things (IoT) supplier as his target.

Cyber-insecurity grows with every software security solution, but consider the growth projected for the IoT:

The Internet of Things market size is estimated to grow from USD 157.05 Billion in 2016 to USD 661.74 Billion by 2021, at a Compound Annual Growth Rate (CAGR) of 33.3% from 2016 to 2021. (Internet of Things (IoT) Market)

Insecurity growing at a “Compound Annual Growth Rate (CAGR) of 33.3%” is impressive to say the least. Not to mention all the legacy insecurities that have never been patched or where patches have not been installed.
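
A quick sanity check on that projection, since CAGR claims are easy to wave around: does USD 157.05 billion in 2016 to USD 661.74 billion in 2021 actually work out to 33.3% compounded?

# Verify the quoted IoT market figures against the claimed 33.3% CAGR.
start, end, years = 157.05, 661.74, 5

cagr = (end / start) ** (1 / years) - 1
print(f"Implied CAGR: {cagr:.1%}")          # prints roughly 33.3%

# Project each year forward from the 2016 base at that rate.
value = start
for year in range(2016, 2022):
    print(f"{year}: {value:,.2f} billion USD")
    value *= 1 + cagr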

Few will duplicate Dmitry’s investigation, but no doubt tools will soon bring the fruits of his labor to a broader market.

Responsible Disclosure

The comments on Dmitry’s work have the obligatory complaints about public disclosure of these flaws.

Every public disclosure is a step towards transparency of both corporations and governments.

I see no cause for complaint.

You?

Enjoy the Projects gallery as well.

Why I Love XML (and Good Thing, It’s Everywhere) [Needs Subject Identity Too]

March 5th, 2017

Why I Love XML (and Good Thing, It’s Everywhere) by Lee Pollington.

Lee makes a compelling argument for XML as the underlying mechanism for data integration when saying:

…Perhaps the data in your relational databases is structured. What about your knowledge management systems, customer information systems, document systems, CMS, mail, etc.? How do you integrate that data with structured data to get a holistic view of all your data? What do you do when you want to bring a group of relational schemas from different systems together to get that elusive 360 view – which is being demanded by the world’s regulators of banks? Mergers and acquisitions drive this requirement too. How do you search across that data?

Sure there are solution stack answers. We’ve all seen whiteboards with ever growing number of boxes and those innocuous puny arrows between them that translate to teams of people, buckets of code, test and operations teams. They all add up to ever-increasing costs, complexity, missed deadlines & market share loss. Sound overly dramatic? Gartner calculated a worldwide spend of $5 Billion on data integration software in 2015. How much did you spend … would you know where to start calculating that cost?

While pondering what you spend on a yearly basis for data integration, contemplate two more questions from Lee:

…So take a moment to think about how you treat the data format that underpins your intellectual property? First-class citizen or after-thought?…

If you are treating your XML elements as first-class citizens, do tell me: have you created subject identity tests for those subjects?

So that a programmer new to your years of legacy XML will understand that <MFBM>, <MBFT> and <MBF> elements are all expressed in units of 1,000 board feet.

Yes?
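
If not, a subject identity test doesn’t have to be elaborate. Here is a minimal sketch (the element names come from the board-feet example above; everything else, including the canonical name and attributes, is illustrative) that normalizes the three synonymous elements to one subject on the way in:

from xml.etree import ElementTree as ET

# One subject, several surface names, all meaning units of 1,000 board feet.
BOARD_FEET_THOUSANDS = {"MFBM", "MBFT", "MBF"}

SAMPLE = """<shipments>
  <lot><MFBM>12</MFBM></lot>
  <lot><MBFT>7</MBFT></lot>
  <lot><MBF>3</MBF></lot>
</shipments>"""

def normalize(doc: str) -> ET.Element:
    """Rewrite synonymous elements to one canonical name with explicit units."""
    root = ET.fromstring(doc)
    for parent in root.iter():
        for child in list(parent):
            if child.tag in BOARD_FEET_THOUSANDS:
                child.set("originalName", child.tag)
                child.set("unit", "thousand-board-feet")
                child.tag = "volume"
    return root

print(ET.tostring(normalize(SAMPLE), encoding="unicode"))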

Reducing the cost of data integration tomorrow, next year, and five years after that requires investment in the here and now.

Perhaps that is why data integration costs continue to climb.

Why pay for today what can be put off until tomorrow? (Future conversion costs are a line item in some future office holder’s budget.)