MS Streamlines Malware Delivery

June 27th, 2017

Malware delivery takes a giant leap forward with the MS Fall Creators Update:

If new malware is detected on any computer running Windows 10 in the world, Microsoft said it will be able to develop a signature for it and protect all the other users worldwide. The first victim will be safe as well because the virus will be set off in a virtual sandbox on the cloud, not on the person’s device.

Microsoft sees artificial intelligence as the next solution for security as attacks get more sophisticated.

“If we’re going to stay on top of anything that is changing that fast, you have to automate,” Lefferts said.

About 96 percent of detected cyberattacks are brand new, he noted.

With Microsoft’s current researchers working at their fastest pace, it can take a few hours to develop protections from the first moment they detect malware.

It’s during those few hours when people are really hit by malware. Using cloud data from Microsoft Office to develop malware signatures is crucial, for example, because recent attacks relied on Word vulnerabilities.

Two scenarios immediately come to mind:

1. The “malware” detection is “false,” the file/operation/URL is benign but now 400 million computers see it as “malware,” or,
2. Due to MTM attacks, false reports are sent to Windows computers on a particular sub-net.

Global security decision making is a great leap, but the question is in what direction?

PS: Did you notice the claim “96 percent of detected cyberattacks are brand news…?” I ask because that’s inconsistent with the documented long lives of cyber exploits, Website Security Statistics Report 2015 (WhiteHat Security).

Re-imagining Legislative Data – Semantic Integration Alert

June 27th, 2017

From the post:

This morning, on Tuesday, June 27, 2017, Library of Congress Chief Information Officer Bernard A. Barton, Jr., is scheduled to make an in-person announcement to the attendees of the 2017 Legislative Data & Transparency Conference in the CVC. Mr. Barton will deliver a short announcement about the Library’s intention to launch a legislative data App Challenge later this year. This pre-launch announcement will encourage enthusiasts and professionals to bring their app-building skills to an endeavor that seeks to create enhanced access and interpretation of legislative data.

The themes of the challenge are INNOVATE, INTEGRATE, and LEGISLATE. Mr. Barton’s remarks are below:

Here in America, innovation is woven into our DNA. A week from today our nation celebrates its 241st birthday, and those years have been filled with great minds who surveyed the current state of affairs, analyzed the resources available to them, and created devices, systems, and ways of thinking that created a better future worldwide.

The pantheon includes Benjamin Franklin, George Washington Carver, Alexander Graham Bell, Bill Gates, and Steve Jobs. It includes first-generation Americans like Nikolai Tesla and Albert Einstein, for whom the nation was an incubator of innovation. And it includes brilliant women such as Grace Hopper, who led the team that invented the first computer language compiler, and Shirley Jackson, whose groundbreaking research with subatomic particles enabled the inventions of solar cells, fiber-optics, and the technology the brings us something we use every day: call waiting and caller ID.

For individuals such as these, the drive to innovate takes shape through understanding the available resources, surveying the landscape for what’s currently possible, and taking it to the next level. It’s the 21st Century, and society benefits every day from new technology, new generations of users, and new interpretations of the data surrounding us. Social media and mobile technology have rewired the flow of information, and some may say it has even rewired the way our minds work. So then, what might it look like to rewire the way we interpret legislative data?

It can be said that the legislative process – at a high level – is linear. What would it look like if these sets of legislative data were pushed beyond a linear model and into dimensions that are as-yet-unexplored? What new understandings wait to be uncovered by such interpretations? These understandings could have the power to evolve our democracy.

That’s a pretty grand statement, but it’s not without basis. The sets of data involved in this challenge are core to a legislative process that is centuries old. It’s the source code of America government. An informed citizenry is better able to participate in our democracy, and this is a very real opportunity to contribute to a better understanding of the work being done in Washington. It may even provide insights for the people doing the work around the clock, both on the Hill, and in state and district offices. Your innovation and integration may ultimately benefit the way our elected officials legislate for our future.

Improve the future, and be a part of history.

The 2017 Legislative Data App Challenge will launch later this summer. Over the next several weeks Information will be made available at loc.gov/appchallenge, and individuals are invited to connect via appchallenge@loc.gov.

I mention this as a placeholder only because Pull’s post is general enough to mean several things, their opposites or something entirely different.

The gist of the post is that later this summer (2017), a challenge involving an “app” will be announced. The “app” will access/deliver/integrate legislative data. Beyond that, no further information is available at this time.

Impact of Microsoft Leaks On Programming Practice

June 27th, 2017

Mohit Kumar’s great graphic:

leads for his story: Microsoft’s Private Windows 10 Internal Builds and Partial Source Code Leaked Online.

The use of MS source code for discovery of vulnerabilities is obvious.

Less obvious questions:

• Do programmers follow leaked MS source code?
• Do programmers following leaked MS source code commit similar vulnerability errors?

Evidence for a public good argument for not spreading leaked MS source code anyone?

Alert! IAB workshop on Explicit Internet Naming Systems

June 26th, 2017

From the post:

Internet namespaces rely on Internet connected systems sharing a common set of assumptions on the scope, method of resolution, and uniqueness of the names. That set of assumption allowed the creation of URIs and other systems which presumed that you could authoritatively identify a service using an Internet name, a service port, and a set of locally-significant path elements.

There are now multiple challenges to maintaining that commonality of understanding.

• Some naming systems wish to use URIs to identify both a service and the method of resolution used to map the name to a serving node. Because there is no common facility for varying the resolution method in the URI structure, those naming systems must either mint new URI schemes for each resolution service or infer the resolution method from a reserved name or pattern. Both methods are currently difficult and costly, and the effort thus scales poorly.
• Users’ intentions to refer to specific names are now often expressed in voice input, gestures, and other methods which must be interpreted before being put into practice. The systems which carry on that interpretation often infer which intent a user is expressing, and thus what name is meant, by contextual elements. Those systems are linked to existing systems who have no access to that context and which may thus return results or create security expectations for an unintended name.
• Unicode allows for both combining characters and composed characters when local language communities have different practices. When these do not have a single normalization, context is required to determine which to produce or assume in resolution. How can this context be maintained in Internet systems?

While any of these challenges could easily be the topic of a stand-alone effort, this workshop seeks to explore whether there is a common set of root problems in the explicitness of the resolution context, heuristic derivation of intent, or language matching. If so, it seeks to identify promising areas for the development of new, more explicit naming systems for the Internet.

We invite position papers on this topic to be submitted by July 28, 2017 to ename@iab.org. Decisions on accepted submissions will be made by August 11, 2017.

Proposed dates for the workshop are September 28th and 29th, 2017 and the proposed location is in the Pacific North West of North America. Finalized logistics will be announced prior to the deadline for submissions.

When I hear “naming” and “Internet” in the same sentence, the line, “Oh no, no, please God help me!,” from Black Sabbath‘s Black Sabbath:

Well, except that the line needs to read:

Oh no, no, please God help us!

since any proposal is likely to impact users across the Internet.

The most frightening part of the call for proposals reads:

While any of these challenges could easily be the topic of a stand-alone effort, this workshop seeks to explore whether there is a common set of root problems in the explicitness of the resolution context, heuristic derivation of intent, or language matching. If so, it seeks to identify promising areas for the development of new, more explicit naming systems for the Internet.

Are we doing a clean reboot on the problem of naming? “…[A] common set of root problems….[?]”

Research on and design of “more” explicit naming systems for the Internet could result in proposals subject to metric evaluations. Looking for common “root problems” in naming systems, is a recipe for navel gazing.

June 25th, 2017

From the post:

Today we are rolling out two new features on Facebook to improve the experience of sharing, discovering and clicking .onion links to Tor hidden services especially for people who are not on Tor.

First, Facebook can now show previews for .onion links. Hidden service owners can use Open Graph tags to customise these previews, much like regular websites do.

Second, people who are not using Tor and click on .onion links will now see a message informing them that the link they clicked may not work. The message enables people to find out more about Tor and – for hidden services which have opted in – helps visit the service’s equivalent regular website. For people who are already using Tor, we send them straight through to the hidden service without showing any message.

This is a very bad plan!

If you are:

not using Tor and click on .onion links will now see a message informing them that the link they clicked may not work.

Accessing .onion links on Facebook, without using Tor, in the words of Admiral Ackbar, “It’s a trap!”:

Consumer Warning: Stale Passwords For Sale

June 25th, 2017

The important take away: the passwords are from a 2012 LinkedIn breach.

Unless you like paying for and mining low grade ore, considering passing on this offer.

Claims of stolen government passwords don’t make someone trustworthy. 😉

Statistical Functions for XQuery 3.1 (see OpenFormula)

June 24th, 2017

From the webpage:

Loosely inspired by the JavaScript simple-statistics project. The goal of this module is to provide a basic set of statistical functions for doing data analysis in XQuery 3.1.

Functions are intended to be implementation-agnostic.

Unit tests were written using the unit testing module of BaseX.

OpenFormula (Open Document Format for Office Applications (OpenDocument)) defines eighty-seven (87) statistical functions.

There are fifty-five (55) financial functions defined by OpenFormula, just in case you are interested.

See Through Walls With WiFi!

June 22nd, 2017

Drones that can see through walls using only Wi-Fi

From the post:

A Wi-Fi transmitter and two drones. That’s all scientists need to create a 3D map of the interior of your house. Researchers at the University of California, Santa Barbara have successfully demonstrated how two drones working in tandem can ‘see through’ solid walls to create 3D model of the interiors of a building using only, and we kid you not, only Wi-Fi signals.

As astounding as it sounds, researchers Yasamin Mostofi and Chitra R. Karanam have devised this almost superhero-level X-ray vision technology. “This approach utilizes only Wi-Fi RSSI measurements, does not require any prior measurements in the area of interest and does not need objects to move to be imaged,” explains Mostofi, who teaches electrical and computer engineering at the University.

For the paper and other details, see: 3D Through-Wall Imaging With Unmanned Aerial Vehicles and WiFi.

Before some contractor creates the Stingray equivalent for law enforcement, researchers and electronics buffs need to create new and improved versions for the public.

Government and industry offices are more complex than this demo but the technology will continue to improve.

I don’t have the technical ability to carry out the experiment but wondering if measurement of a strong signal from any source as it approaches a building and then its exit on the far side would serve the same purpose?

Reasoning that government/industry buildings may become shielded to some signals but in an age of smart phones, not all.

Enjoy!

.Rddj (data journalism with R)

June 21st, 2017

From the webpage:

The R Project is a great software environment for doing all sorts of data-driven journalism. It can be used for any of the stages of a typical data project: data collection, cleaning, analysis and even (interactive) visualization. And it’s all reproducible and transparent! Sure, it requires a fair amount of scripting, yet…

Do not fear! With this hand-curated (and opinionated) list of resources, you will be guided through the thick jungle of countless R packages, from learning the basics of R’s syntax, to scraping HTML tables, to a guide on how to make your work comprehensible and reproducible.

Some more efforts at persuasion: As I work in the media, I know how a lot of journalists are turned off by everything that doesn’t have a graphical interface with buttons to click on. However, you don’t need to spend days studying programming concepts in order to get started with R, as most operations can be carried out without applying scary things such as loops or conditionals – and, nowadays, high-level abstrations like dplyr make working with data a breeze. My advice if you’re new to data journalism or data processing in general: Better learn R than Excel, ’cause getting to know Excel (and the countless other tools that each do a single thing) doesn’t come for free, either.

This list is (partially) inspired by R for Journalists by Ed Borasky, which is another great resource for getting to know R.

… (emphasis in original)

The topics are familiar:

• RStudio
• Syntax and basic R programming
• Collecting Data (from the Web)
• Data cleaning and manipulation
• Text mining / natural language processing
• Exploratory data analysis and plotting
• Interactive data visualization
• Publication-quality graphics
• Reproducibility
• Examples of using R in (data) journalism
• What makes this list of resources different from search results?

Hand curation.

How much of a difference?

Compare the search results of “R” + any of these categories to the resources here.

Bookmark .Rddj for data journalism and R, then ping me with the hand curated list of resources you are creating.

Save yourself and the rest of us from search. Thanks!

Storyzy A.I. Fights Fake Quotes (Ineffective Against Trump White House)

June 21st, 2017

In the battle against fake news, Storyzy A.I. fights fake quotes

From the post:

The Quote Verifier launched today by Storyzy takes the battle against fake news to a whole new automated level by conveniently flagging fake quotes on social networks and search engines with +50,000 new authentic quotes added daily.

Storyzy aims to help social networks and search engines by spotting fake quotes. To fulfill this ambition Storyzy developed a tool (currently available in Beta version) that verifies whether a quote is authentic or not by checking if a person truly said that or not.
… (emphasis in original)

A tool for your short-list of verification tools to use on a daily basis.

It’s ineffective against the Trump White House because accurate quotes can still be “false.”

“Truthful quotes,” as per Trump White House policy, issue only from the President and must reflect what he meant to say. Subject to correction by the President.

A “truthful quote,” consists of three parts:

1. Said by the President
2. Reflects what he meant to say
3. Includes any subsequent correction by the President (one or more)

There is a simply solution to avoiding “false” quotes from President Trump:

Never quote him or his tweets at all.

Quote his lackeys, familiars and sycophants, but not him.

A Dictionary of Victorian Slang (1909)

June 20th, 2017

Passing English of the Victorian era, a dictionary of heterodox English, slang and phrase (1909) by J. Reeding Ware.

Quoted from the Preface:

HERE is a numerically weak collection of instances of ‘Passing English’. It may be hoped that there are errors on every page, and also that no entry is ‘quite too dull’. Thousands of words and phrases in existence in 1870 have drifted away, or changed their forms, or been absorbed, while as many have been added or are being added. ‘Passing English’ ripples from countless sources, forming a river of new language which has its tide and its ebb, while its current brings down new ideas and carries away those that have dribbled out of fashion. Not only is ‘Passing English’ general ; it is local ; often very seasonably local. Careless etymologists might hold that there are only four divisions of fugitive language in London west, east, north and south. But the variations are countless. Holborn knows little of Petty Italia behind Hatton Garden, and both these ignore Clerkenwell, which is equally foreign to Islington proper; in the South, Lambeth generally ignores the New Cut, and both look upon Southwark as linguistically out of bounds; while in Central London, Clare Market (disappearing with the nineteenth century) had, if it no longer has, a distinct fashion in words from its great and partially surviving rival through the centuries the world of Seven Dials, which is in St Giles’s St James’s being ractically in the next parish. In the East the confusion of languages is a world of ‘ variants ‘ there must be half-a-dozen of Anglo-Yiddish alone all, however, outgrown from the Hebrew stem. ‘Passing English’ belongs to all the classes, from the peerage class who have always adopted an imperfection in speech or frequency of phrase associated with the court, to the court of the lowest costermonger, who gives the fashion to his immediate entourage.

A healthy reminder that language is no more fixed and unchanging than the people who use it.

Enjoy!

My Last Index (Is Search A Form of Discrimination?)

June 20th, 2017

From the post:

A casual reader of authors’ acknowledgment pages will encounter expressions of familial gratitude that paper over years of spousal neglect and missed cello recitals. A keen reader of those pages may happen upon animals that were essential to an author’s well-being—supportive dogs, diverting cats, or, in one instance, “four very special squirrels.” But even an assiduous reader of acknowledgments could go a lifetime without coming across a single shout-out to a competent indexer.

That is mostly because the index gets constructed late in the book-making process. But it’s also because most readers pay no mind to indexes, especially at this moment in time when they are being supplanted by Amazon and Google. More and more, when I want to track down an errant tidbit of information about a book, I use Amazon’s “Search inside this book” function, which allows interested parties to access a book’s front cover, copyright, table of contents, first pages (and sometimes more), and index. But there’s no reason to even use the index when you can “Look Inside!” to find anything you need.

I had plenty of time to ponder the unsung heroism of indexers when I was finishing my latest book. Twice before, I had assembled an indexer’s tools of trade: walking down the stationery aisles of a college book store, pausing to consider the nib and color of my Flair pens, halting before the index cards. But when I began work on this index, I was overcome with thoughts of doom that Nancy Mulvany, author of Indexing Books, attributes to two factors that plague self-indexing authors: general fatigue and too much self-involvement. “Intense involvement with one’s book,” Mulvany writes, “can make it very difficult to anticipate the index user’s needs accurately.”

Perhaps my mood was dire because I’d lost the services of my favorite proofreader, a woman who knew a blackberry from a BlackBerry, and who could be counted on to fix my flawed French. Perhaps it was because I was forced to notice how often I’d failed to include page citations in my bibliography entries, and how inconsistently I’d applied the protocol for citing Web sites—a result of my failure to imagine a future index user so needy as to require the exact date of my visit to theirvingsociety.org.uk. Or perhaps it was because my daughter was six months away from leaving home for college and I was missing her in advance.

Perhaps for all of those reasons, I could only see my latest index as a running commentary on the fragility of all human endeavor. And so I started reading indexes while reluctantly compiling my own.

A highly instructive tale on the importance of indexing (and hiring a professional indexer) that includes this reference to Jonathan Swift:

Jonathan Swift, in his 1704 A Tale of a Tub, describes two means of using books: “to serve them as men do lords—learn their titles exactly and then brag of their acquaintance,” or “the choicer, the profounder and politer method, to get a thorough insight into the index, by which the whole book is governed and turned, like fishes by the tail.”

In full context, the Swift passage is even more amusing:

The whole course of things being thus entirely changed between us and the ancients, and the moderns wisely sensible of it, we of this age have discovered a shorter and more prudent method to become scholars and wits, without the fatigue of reading or of thinking. The most accomplished way of using books at present is twofold: either first to serve them as some men do lords, learn their titles exactly, and then brag of their acquaintance; or, secondly, which is indeed the choicer, the profounder, and politer method, to get a thorough insight into the index by which the whole book is governed and turned, like fishes by the tail. For to enter the palace of learning at the great gate requires an expense of time and forms, therefore men of much haste and little ceremony are content to get in by the back-door. For the arts are all in a flying march, and therefore more easily subdued by attacking them in the rear. Thus physicians discover the state of the whole body by consulting only what comes from behind. Thus men catch knowledge by throwing their wit on the posteriors of a book, as boys do sparrows with flinging salt upon their tails. Thus human life is best understood by the wise man’s rule of regarding the end. Thus are the sciences found, like Hercules’ oxen, by tracing them backwards. Thus are old sciences unravelled like old stockings, by beginning at the foot. (The Tale of a Tub by Jonathan Swift)

Searching, as opposed to indexing (good indexing at any rate), is the equivalent of bragging of the acquaintance of a lord. Yes, you did find term A or term B in the text, but you don’t know what other terms appear in the text, nor do you know what other statements were made about term A or term B.

Search is at best a partial solution and one that varies based on the skill of the searcher.

Indexing, on the other hand, can reflect an accumulation of insights, made equally available to all readers.

Is search a form of discrimination?

Is search a type of access with disproportionate (read disadvantageous) impact on some audiences and not others?

Any research on the social class, racial, ethnic impact of search you would suggest?

Manning Leaks — No Real Harm (Database of Government Liars Anyone?)

June 20th, 2017

From the post:

In the seven years since WikiLeaks published the largest leak of classified documents in history, the federal government has said they caused enormous damage to national security.

But a secret, 107-page report, prepared by a Department of Defense task force and newly obtained by BuzzFeed News, tells a starkly different story: It says the disclosures were largely insignificant and did not cause any real harm to US interests.

Regarding the hundreds of thousands of Iraq-related military documents and State Department cables provided by the Army private Chelsea Manning, the report assessed “with high confidence that disclosure of the Iraq data set will have no direct personal impact on current and former U.S. leadership in Iraq.”

The 107 page report, redacted, runs 35 pages. Thanks to BuzzFeed News for prying that much of a semblance of the truth out of the government.

It is further proof that US prosecutors and other federal government representatives lie to the courts, the press and the public, whenever its suits their purposes.

Anyone with transcripts from the original Manning hearings, should identify statements by prosecutors at variance with this report, noting the prosecutor’s name, rank and recording the page/line reference in the transcript.

That individual prosecutors and federal law enforcement witnesses lie is a commonly known fact. What I haven’t seen, is a central repository of all such liars and the lies they have told.

I mention a central repository because to say one or two prosecutors have lied or been called down by a judge grabs a headline, but showing a pattern over decades of lying by the state, that could move to an entirely different level.

Judges, even conservative ones (especially conservative ones?), don’t appreciate being lied to by anyone, including the state.

The state has chosen lying as its default mode of operation.

Let’s help them wear that banner.

Interested?

Concealed Vulnerability Survives Reboots – Consumers Left in Dark

June 19th, 2017

From the post:

Until now, all malware targeting IoT devices survived only until the user rebooted his equipment, which cleared the device’s memory and erased the malware from the user’s equipment.

Intense Internet scans for vulnerable targets meant that devices survived only minutes until they were reinfected again, which meant that users needed to secure devices with unique passwords or place behind firewalls to prevent exploitation.

New vulnerability allows for permanent Mirai infections

While researching the security of over 30 DVR brands, researchers from Pen Test Partners have discovered a new vulnerability that could allow the Mirai IoT worm and other IoT malware to survive between device reboots, permitting for the creation of a permanent IoT botnet.

“We’ve […] found a route to remotely fix Mirai vulnerable devices,” said Pen Test Partners researcher Ken Munro. “Problem is that this method can also be used to make Mirai persistent beyond a power off reboot.”

Understandably, Munro and his colleagues decided to refrain from publishing any details about this flaw, fearing that miscreants might weaponize it and create non-removable versions of Mirai, a malware known for launching some of the biggest DDoS attacks known today.

Do security researchers realize concealing vulnerabilities prevents market forces from deciding the fate of insecure systems?

Should security researchers marketing vulnerabilities to manufacturers be more important than the operation market forces on their products?

More important than your right to choose products based on the best and latest information?

Market forces are at work here, but they aren’t ones that will benefit consumers.

E-Cigarette Can Hack Your Computer (Is Nothing Sacred?)

June 19th, 2017

Kavita Iyer has the details on how an e-cigarette can be used to hack your computer at: Know How E-Cigarette Can Be Used By Hackers To Target Your Computer.

I’m guessing you aren’t so certain that expensive e-cigarette you “found” is harmless after all?

Malware in e-cigarettes seems like a stretch given the number of successful phishing emails every year.

But, a recent non-smoker maybe the security lapse you need.

Key DoD Officials – September 1947 to June 2017

June 19th, 2017

While looking for a particular Department of Defense official, I stumbled on: Department of Defense Key Officials September 1947–June 2017.

Yes, almost seventy (70) years worth of key office holders at the DoD. It’s eighty (80) pages long, produced by the Historical Office of the Secretary of Defense.

One potential use, aside from giving historical military fiction a ring of authenticity, would be to use this as a starting set of entities to trace through the development of the military/industrial complex.

Everyone, including me, refers to the military/industrial complex as though it is a separate entity, over there somewhere.

But as everyone discovered with the Panama Papers, however tangled and corrupt even world-wide organizations can be, we have the technology to untangle those knots and to shine bright lights into obscure corners.

Interested?

June 19th, 2017

For your Monday amusement: Pentagon Official: DoD will be audit ready by end of September by Eric White.

From the post:

In today’s Federal Newscast, the Defense Department’s Comptroller David Norquist said the department has been properly preparing for its deadline for audit readiness.

The Pentagon’s top financial official said DoD will meet its deadline to be “audit ready” by the end of September. DoD has been working toward the deadline for the better part of seven years, and as the department pointed out in its most recent audit readiness update, most federal agencies haven’t earned clean opinions until they’ve been under full-scale audits for several years. But newly-confirmed comptroller David Norquist said now’s the time to start. He said the department has already contracted with several outside accounting firms to perform the audits, both for the Defense Department’s various components and an overarching audit of the entire department.

I’m reminded of the alleged letter by the Duke of Wellington to Whitehall:

Gentlemen,

Whilst marching from Portugal to a position which commands the approach to Madrid and the French forces, my officers have been diligently complying with your requests which have been sent by H.M. ship from London to Lisbon and thence by dispatch to our headquarters.

We have enumerated our saddles, bridles, tents and tent poles, and all manner of sundry items for which His Majesty’s Government holds me accountable. I have dispatched reports on the character, wit, and spleen of every officer. Each item and every farthing has been accounted for, with two regrettable exceptions for which I beg your indulgence.

Unfortunately the sum of one shilling and ninepence remains unaccounted for in one infantry battalion’s petty cash and there has been a hideous confusion as the the number of jars of raspberry jam issued to one cavalry regiment during a sandstorm in western Spain. This reprehensible carelessness may be related to the pressure of circumstance, since we are war with France, a fact which may come as a bit of a surprise to you gentlemen in Whitehall.

This brings me to my present purpose, which is to request elucidation of my instructions from His Majesty’s Government so that I may better understand why I am dragging an army over these barren plains. I construe that perforce it must be one of two alternative duties, as given below. I shall pursue either one with the best of my ability, but I cannot do both:

1. To train an army of uniformed British clerks in Spain for the benefit of the accountants and copy-boys in London or perchance.

2. To see to it that the forces of Napoleon are driven out of Spain.

Wellington

The primary function of any military organization is suppression of the currently designated “enemy.”

Congress should direct the Department of Homeland Security (DHS) to auditing the DoD.

Instead of chasing fictional terrorists, DHS staff would be chasing known to exist dollars and alleged expenses.

TensorFlow 1.2 Hits The Streets!

June 17th, 2017

TensorFlow 1.2

I’m not copying the features and improvement here, better that you download TensorFlow 1.2 and experience them for yourself!

The incomplete list of models at TensorFlow Models:

• attention_ocr: a model for real-world image text extraction.
• autoencoder: various autoencoders.
• cognitive_mapping_and_planning: implementation of a spatial memory based mapping and planning architecture for visual navigation.
• compression: compressing and decompressing images using a pre-trained Residual GRU network.
• differential_privacy: privacy-preserving student models from multiple teachers.
• im2txt: image-to-text neural network for image captioning.
• inception: deep convolutional networks for computer vision.
• learning_to_remember_rare_events: a large-scale life-long memory module for use in deep learning.
• lm_1b: language modeling on the one billion word benchmark.
• namignizer: recognize and generate names.
• neural_gpu: highly parallel neural computer.
• neural_programmer: neural network augmented with logic and mathematic operations.
• next_frame_prediction: probabilistic future frame synthesis via cross convolutional networks.
• object_detection: localizing and identifying multiple objects in a single image.
• real_nvp: density estimation using real-valued non-volume preserving (real NVP) transformations.
• resnet: deep and wide residual networks.
• skip_thoughts: recurrent neural network sentence-to-vector encoder.
• slim: image classification models in TF-Slim.
• street: identify the name of a street (in France) from an image using a Deep RNN.
• swivel: the Swivel algorithm for generating word embeddings.
• syntaxnet: neural models of natural language syntax.
• textsum: sequence-to-sequence with attention model for text summarization.
• transformer: spatial transformer network, which allows the spatial manipulation of data within the network.
• tutorials: models described in the TensorFlow tutorials.
• video_prediction: predicting future video frames with neural advection.

And your TensorFlow model is ….?

Enjoy!

If You Don’t Think “Working For The Man” Is All That Weird

June 17th, 2017

From the post:

Financial services jobs go in and out of fashion. In 2001 equity research for internet companies was all the rage. In 2006, structuring collateralised debt obligations (CDOs) was the thing. In 2010, credit traders were popular. In 2014, compliance professionals were it. In 2017, it’s all about machine learning and big data. If you can get in here, your future in finance will be assured.

J.P. Morgan’s quantitative investing and derivatives strategy team, led Marko Kolanovic and Rajesh T. Krishnamachari, has just issued the most comprehensive report ever on big data and machine learning in financial services.

Titled, ‘Big Data and AI Strategies’ and subheaded, ‘Machine Learning and Alternative Data Approach to Investing’, the report says that machine learning will become crucial to the future functioning of markets. Analysts, portfolio managers, traders and chief investment officers all need to become familiar with machine learning techniques. If they don’t they’ll be left behind: traditional data sources like quarterly earnings and GDP figures will become increasingly irrelevant as managers using newer datasets and methods will be able to predict them in advance and to trade ahead of their release.

At 280 pages, the report is too long to cover in detail, but we’ve pulled out the most salient points for you below.

How important is Sarah’s post and the report by J.P. Morgan?

Let put it this way: Sarah’s post is the first business type post I have saved as a complete webpage so I can clean it up and print without all the clutter. This year. Perhaps last year as well. It’s that important.

Sarah’s post is a quick guide to the languages, talents and tools you will need to start “working for the man.”

It that catches your interest, then Sarah’s post is pure gold.

Enjoy!

PS: I’m still working on a link for the full 280 page report. The switchboard is down for the weekend so I will be following up with J.P. Morgan on Monday next.

The Quartz Directory of Essential Data (Directory of Directories Is More Accurate)

June 17th, 2017

The Quartz Directory of Essential Data

From the webpage:

A curated list of useful datasets published by important sources. Please remember that “important” does not mean “correct.” You should vet these data as you would with any human source.

Switch to the “Data” tab at the bottom of this spreadsheet and use Find (⌘ + F) to search for datasets on a particular topic.

Note: Just because data is useful, doesn’t mean it’s easy to use. The point of this directory is to help you find data. If you need help accessing or interpreting one of these datasets, please reach out to your friendly Quartz data editor, Chris.

Slack: @chris
Email: c@qz.com

A directory of 77 data directories. The breath of organizing topics, health, trade, government, for example, creates a need for repeated data mining by every new user.

A low/no-friction method for creating more specific and re-usable directories has remained elusive.

June 17th, 2017

From the post:

An archive worth knowing about: The Library of Congress and Boston’s WGBH have joined forces to create The American Archive of Public Broadcasting and “preserve for posterity the most significant public television and radio programs of the past 60 years.” Right now, they’re overseeing the digitization of approximately 40,000 hours of programs. And already you can start streaming “more than 7,000 historic public radio and television programs.”

The collection includes local news and public affairs programs, and “programs dealing with education, environmental issues, music, art, literature, dance, poetry, religion, and even filmmaking.” You can browse the complete collection here. Or search the archive here. For more on the archive, read this About page.

If you’d like to support Open Culture and our mission, please consider making a donation to our site. It’s hard to rely 100% on ads, and your contributions will help us provide the best free cultural and educational materials.

Hopeful someone is spinning cable/television content 24 x 7 to archival storage. The ability to research and document, reliably, patterns in shows, advertisements, news reporting, etc., is more important than any speculative copyright interest.

June 17th, 2017

From the post:

BEFORE THE BOOKS ARRIVED, Adam Gopnik, in an effort to be polite, almost contradicted the essential insight of his life. An essayist, critic, and reporter at The New Yorker for the last 31 years, he was asked whether there is an imperative for busy, ambitious journalists to read books seriously—especially with journalism, and not just White House reporting, feeling unusually high-stakes these days—when the doorbell rang in his apartment, a block east of Central Park. He came back with a shipment and said, “It would be,” pausing to think of and lean into the proper word, “brutally unkind and unrealistic to say, Oh, all of you should be reading Stendhal. You’ll be better BuzzFeeders for it.” For the part about the 19th-century French novelist, he switched from his naturally delicate voice to a buffoonish, apparently bookish, baritone.

Then, as he tore open the packaging of two nonfiction paperbacks (one, obscure research for an assignment on Ernest Hemingway; the other, a new book on Adam Smith, a past essay subject) and sat facing a wall-length bookcase and sliding ladder in his heavenly, all-white living room, Gopnik took that back. His instinct was to avoid sermonizing about books, particularly to colleagues with grueling workloads, because time for books is a privilege of his job. And yet, to achieve such an amazingly prolific life, the truth is he simply read his way here.

I spoke with a dozen accomplished journalists of various specialties who manage to do their work while reading a phenomenal number of books, about and beyond their latest project. With journalists so fiercely resented after last year’s election for their perceived elitist detachment, it might seem like a bizarre response to double down on something as hermetic as reading—unless you see books as the only way to fully see the world.

Being well-read is a transcendent achievement similar to training to run 26.2 miles, then showing up for a marathon in New York City and finding 50,000 people there. It is at once superhuman and pedestrian.

… (emphasis in original)

A deeply inspirational and instructive essay on serious readers and the benefits that accrue to them. Very much worth two or more slow reads, plus looking up the authors, writers and reporters who are mentioned.

Earlier this year I began the 2017 Women of Color Reading Challenge. I have not discovered any technical insights into data science or topic maps, but I am gaining, incrementally for sure, a deeper appreciation for how race and gender shapes a point of view.

Or perhaps more accurately, I am encountering points of view different enough from my own that I recognize them as being different. That in and of itself, the encountering of different views, is one reason I aspire to become a “serious reader.”

You?

OpSec Reminder

June 17th, 2017

Catalin Cimpanu covers a hack of the DoD’s Enhanced Mobile Satellite Services (EMSS) satellite phone network in 2014 in British Hacker Used Home Internet Connection to Hack the DoD in 2014.

The details are amusing but the most important part of Cimpanu’s post is a reminder about OpSec:

In a statement released yesterday, the NCA said it had a solid case against Caffrey because they traced back the attack to his house, and found the stolen data on his computer. Furthermore, officers found an online messaging account linked to the hack on Caffrey’s computer.

Caffrey’s OpSec stumbles:

1. Connection traced to his computer (No use of Tor or VPN)
2. Data found on his hard drive (No use of encryption and/or storage elsewhere)
3. Online account used in hack operated from his computer (Again, no use of Tor or VPN)

I’m sure the hack was a clever one but Caffrey’s OpSec was less so. Decidedly less so.

FOIA Success Prediction

June 16th, 2017

From the post:

Many journalists know the feeling: There could be a cache of documents that might confirm an important story. Your big scoop hinges on one question: Will the government official responsible for the records respond to your FOIA request?

Now, thanks to a new project from a data storage and analysis company, some of the guesswork has been taken out of that question.

Want to know the chances your public records request will get rejected? Plug it into FOIA Predictor, a probability analysis web application from Data.World, and it will provide an estimation of your success based on factors including word count, average sentence length and specificity.

Accuracy?

Best way to gauge that is experience with your FOIA requests.

Try starting at MuckRock.com.

Enjoy!

Man Bites Dog Or Shoots Member of Congress – Novelty Rules the News

June 15th, 2017

The need for “novelty” in a 24 x 7 news cycle, identified by Lewis and Marwick in Megyn Kelly fiasco is one more instance of far right outmaneuvering media comes to the fore in coverage of the recent shooting reported in Capitol Hill shaken by baseball shooting.

Boiled down to the essentials, James Hodgkinson, 66, of Illinois, who is now dead, wounded “House Majority Whip Steve Scalise (R-La.) and four others in the Washington suburb of Alexandria, Va.,” on June 14, 2017. The medical status of the wounded vary from critical to released.

That’s all the useful information, aside from identification of the victims, that can be wrung from that story.

Not terribly useful information, considering Hodgkinson is dead and so not a candidate for a no-fly/sell list.

But you will read column inch after column inch of non-informative comments by and between special interest groups, “experts,” and even experienced political reporters, on a one-off event.

A per capita murder rate of 5 per 100,000, works out to 50 murderers per million people. Approximately 136 million people voted in the 2016 election so 50 x 136 means 6800 people who will commit murder this year voted in the 2016 election. (I’m assuming 1 murderer per murder, which isn’t true but it does simplify the calculation.)

One of those 6800 people (I could have used shootings per capita for an even larger number) shot a member of Congress.

Will this story, plus or minus hand wringing, accusations, counter-accusations, etc., change your routine tomorrow? Next week? Your plans for this year?

All I see is novelty and no news.

You?

PS: Identifying the “novelty” of this story did not require a large research/fact-checking budget. What it did require is a realization that everyone is talking about the shooting of a member of congress means only “everyone is talking about….” Whether that is just a freakish event or genuine news, requires deeper inquiry.

One nutter shoots a member of Congress, man bites dog, novelty, not news. Organization succeeds in killing 3rd member of Congress, that looks like news. Pattern, behavior, facts, goals, etc.

The Media and Far Right Trolls – Mutual Reinforcing Exploitation (MRE)

June 15th, 2017

The Columbia Journalism Review (CJR) normally has great headlines but the editors missed a serious opportunity with: Megyn Kelly fiasco is one more instance of far right outmaneuvering media by Becca Lewis and Alice Marwick.

Lewis and Marwick capture the essential facts and then lose their key insights in order to portray “the media” (whoever that is) as a victim of far right trolls.

Indeed, research suggests that even debunking falsehoods can reinforce and amplify them. In addition, if a media outlet declines to cover a story that has widely circulated in the far-right and mainstream conservative press, it is accused of lying and promoting a liberal agenda. Far-right subcultures are able to exploit this, using the media to spread ideas and target potential new recruits.

A number of factors make the mainstream media susceptible to manipulation from the far-right. The cost-cutting measures instituted by traditional newspapers since the 1990s have resulted in less fact-checking and investigative reporting. At the same time, there is a constant need for novelty to fill a 24/7 news cycle driven by cable networks and social media. Many of those outlets have benefited from the new and increased partisanship in the country, meaning there is now more incentive to address memes and half-truths, even if it’s only to shoot them down.

Did you catch them? The key insights/phrases?

1. “…declines to cover a story that has widely circulated…it is accused of lying and promoting a liberal agenda…”
2. “…less fact-checking and investigative reporting…”
3. “…constant need for novelty to fill a 24/7 news cycle driven by cable networks and social media…”

Declining to Cover a Story

“Far-right subcultures” don’t exploit “the media” with just any stories, they are “…widely circulated…” stories. That is “the media” is being exploited over stories it carries out of fear of losing click-through advertising revenue. If a story is “widely circulated,” it attracts reader interest, page-views, click-throughs and hence, is news.

Less Fact-Checking and Investigative Reporting

Lewis and Marwick report the decline in fact-checking and investigative reporting as fact but don’t connect it to “the media” carrying stories promoted by “far-right subcultures.” Even if fact-checking and investigative reporting were available in abundance, for every story, given enough public interest (read “…widely circulated…”), is any editor going to decline a story of wide spread interest? (BTW, who chose to reduce fact-checking and investigative reporting? It wasn’t “far-right subcultures” choosing for “the media.”

Constant Need for Novelty

The “…constant need for novelty…” and its relationship to producing income for “the media” is best captured by the following dialogue from Santa Claus (1985)

How can I tell all the people
How do I do that?
In my line,
television works best.
Oh, I know! Those little picture
box thingies? Can we get on those?
With enough money, a horse in a
hoop skirt can get on one of those.

In the context of Lewis and Marwick, far-right subculture news is the “horse in a hoop skirt” of the dialogue. It’s a “horse in a hoop skirt” that is generating page-views and click-through rates.

I’m partial to my headline but the CJR aims at a more literary audience, I would suggest:

The Media and Far Right Trolls – Imitating Alessandro and Napoleone

Alessandro and Napoleone, currently residents of Hell, are described in Canto 32 of the Inferno (Dante, Ciardi translation) as follows:

and at my feet I saw two clamped together

together, “Who are you,” I said, “who lie
so tightly breast to breast?” They strained their necks

the tears their eyes had managed to contain
up to that time gushed out, and the cold froze them
between the lids, sealing them shut again

tighter than any clamp grips wood to wood,
like billy-goats in a sudden savage mood.

“The media” now reports its “butting heads” with “far-right subcultures,” generating more noise, in addition to reports of non-fact-checked but click-stream revenue producing right-wing fantasies.

Tails 3.0 is out (Don’t be a Bank or the NHS, Upgrade Today)

June 13th, 2017

Tails 3.0 is out

From the webpage:

We are especially proud to present you Tails 3.0, the first version of Tails based on Debian 9 (Stretch). It brings a completely new startup and shutdown experience, a lot of polishing to the desktop, security improvements in depth, and major upgrades to a lot of the included software.

Debian 9 (Stretch) will be released on June 17. It is the first time that we are releasing a new version of Tails almost at the same time as the version of Debian it is based upon. This was an important objective for us as it is beneficial to both our users and users of Debian in general and strengthens our relationship with upstream:

• Our users can benefit from the cool changes in Debian earlier.
• We can detect and fix issues in the new version of Debian while it is still in development so that our work also benefits Debian earlier.

This release also fixes many security issues and users should upgrade as soon as possible.

Upgrade today, not tomorrow, not next week. Today!

Don’t be like banks and NHS and run out-dated software.

• barring civil liability for
• decriminalizing
• prohibiting insurance coverage for damages due to

hacking of out-dated software.

Management will develop an interest in software upgrade policies.

Power Outage Data – 15 Years Worth

June 13th, 2017

From the post:

This database details 15 years of power outages across the United States, compiled and standardized from annual data available at from the Department of Energy.

For an explanation of what it means, how it came about, and how we got here, listen to this conversation between Inside Energy Reporter Dan Boyce and Data Journalist Jordan Wirfs-Brock:

You can also view the data as a Google Spreadsheet (where you can download it as a CSV). This version of the database also includes information about the amount of time it took power to be restored, the demand loss in megawatts, the NERC region, (NERC refers to the North American Electricity Reliability Corporation, formed to ensure the reliability of the grid) and a list of standardized tags.

The data set isn’t useful for tactical information, the submissions are too general to replicate the events leading up to an outage.

On the other hand, identifiable outage events, dates, locations, etc., do make recovery of tactical data from grid literature a manageable search problem.

Enjoy!

Electric Grid Threats – Squirrels 952 : CrashOverride 1 (maybe)

June 13th, 2017

If you are monitoring cyberthreats to the electric grid, compare the teaser document, Crash Override: Analysis of the Treat to Electric Grid Operators from Dragos, Inc. to the stats at CyberSquirrel1.com:

I say a “teaser” documents because the modules of greatest interest include: “This module was unavailable to Dragos at the time of publication” statements (4 out of 7) and:

If you are a Dragos, Inc. customer, you will have already received the more concise and technically in-depth intelligence report. It will be accompanied by follow-on reports, and the Dragos team will keep you up-to-date as things evolve.

If you have a copy of Dragos customer data on CrashOverride, be a dear and publish a diff against this public document.

Inquiring minds want to know. 😉

If you are planning to mount/defeat operations against an electric grid, a close study CyberSquirrel1.com cases will be instructive.

Creating and deploying grid damaging malware remains a challenging task.

Training an operative to mimic a squirrel, not so much.

FreeDiscovery

June 12th, 2017

FreeDiscovery: Open Source e-Discovery and Information Retrieval Engine

From the webpage:

FreeDiscovery is built on top of existing machine learning libraries (scikit-learn) and provides a REST API for information retrieval applications. It aims to benefit existing e-Discovery and information retrieval platforms with a focus on text categorization, semantic search, document clustering, duplicates detection and e-mail threading.

In addition, FreeDiscovery can be used as Python package and exposes several estimators with a scikit-learn compatible API.

Python 3.5+ required.

Homepage has command line examples, with a pointer to: http://freediscovery.io/doc/stable/examples/ for more examples.

The additional examples use a subset of the TREC 2009 legal collection. Cool!

I saw this in a tweet by Lynn Cherny today.

Enjoy!