Another Word For It – Patrick Durusau on Topic Maps and Semantic Diversity

June 4, 2015

TPP – Just One of Many Government Secrets

Filed under: Business Intelligence,Government,Transparency — Patrick Durusau @ 8:22 am

The Trans-Pacific Partnership is just one of many government secrets.

Reading Army ELA: Weapon Of Mass Confusion? by Kevin McLaughlin, I discovered yet another.

From the post:


As DISA and VMware work on a new JELA proposal, sources familiar with the matter said the relationship between the two is coming under scrutiny from other enterprise vendors. What’s more, certain details of the JELA remain shrouded in secrecy.

DISA’s JELA document contains several large chunks of redacted text, including one entire section titled “Determination Of Fair And Reasonable Cost.”

In other parts, DISA has redacted specific figures, such as VMware’s percentage of the DOD’s virtualized environments and the total amount the DOD has invested in VMware software licenses. The redacted portions have fueled industry speculation about why these and other aspects of the contract were deemed unfit for the eyes of the public.

DISA’s rationale for awarding the Army ELA and DOD JELA to VMware without opening it up to competition is also suspect, one industry executive who’s been tracking both deals told CRN. “Typically, the justification for sole-sourcing contracts to a vendor is that they only cover maintenance, yet these contracts obviously weren’t maintenance-only,” said the source.

The situation is complex, but essentially the Army signed a contract with VMware under which it downloaded entire suites of software when it only wanted one particular part of a suite, yet was billed for maintenance on the entire suite.

That appears to be what was specified in the VMware ELA, which should be a motivation for using topic maps in connection with government contracts.

Did that go by a little fast? The jump from the VMware ELA to topic maps?

Think about it. The “Army” didn’t really sign a contract with “VMware” any more than “VMware” signed a contract with the “Army.”

No, someone in particular, a nameable individual or group of nameable individuals, had meetings, reviews, and ultimately decided to agree to the contract between the “Army” and “VMware.” All of those individuals had roles in the proceedings that resulted in the ELA in question.

Yet, when it comes time to discuss the VMware ELA, the best we can do is talk about it as though these large organizations acted on their own. The only named individual who might be in some way responsible for the mess is the Army’s current CIO, Lt. Gen. Robert S. Ferrell, and he got there after the original agreement but before its later extension.

Because topic maps don’t require us to plot the domain before we start uncovering relationships and roles, they could easily construct a history of contacts (email, phone, physical meetings), aligned with documents (drafts, amendments), for all the individuals on all sides of this sorry episode.

Topic maps can’t guarantee that the government, the DOD in this case, won’t make ill-advised contracts in the future. No software can do that. What topic maps can do is trace responsibility for such contracts to named individuals. Having an accountable government means having accountable government employees.

PS: How does your government, agency, enterprise track responsibility?

PPS: Topic maps could also trace, given the appropriate information, who authorized the redactions to the DISA JELA. That person should be the first one put on a transport to Syria as a permanent advisor. Seriously. Remember what I said about accountable government requiring accountable employees.

Cultural Heritage Markup (Pre-Balisage)

Filed under: Archives,Cultural Anthropology,History — Patrick Durusau @ 7:39 am

Cultural Heritage Markup, Balisage, Monday, August 10, 2015.

Do you remember visiting your great-aunt’s house? Where everything looked like museum pieces and the smell was worse than your room ever got? And all the adults had strained smiles and said how happy they were to be there?

Well, cultural heritage markup isn’t like that. We have maiden aunts and Norwegian bachelor uncles to take care of all the real cultural heritage stuff. This pre-Balisage workshop is about working with markup and is a lot more fun!

Hugh Cayless, Duke University, introduces the workshop:

Cultural heritage materials are remarkable for their complexity and heterogeneity. This often means that when you’ve solved one problem, you’ve solved one problem. Arrayed against this difficulty, we have a nice big pile of tools and technologies with an alphabet soup of names like XML, TEI, RDF, OAIS, SIP, DIP, XIP, AIP, and BIBFRAME, coupled with a variety of programming languages or storage and publishing systems. All of our papers today address in some way the question of how you deal with messy, complex, human data using the available toolsets and how those toolsets have to be adapted to cope with our data. How do you avoid having your solution dictated by the tools available? How do you know when you’re doing it right? Our speakers are all trying, in various ways, to reconfigure their tools or push past those tools’ limitations, and they are going to tell us how they’re doing it.

A large number of your emails, tweets, webpages, etc. are destined to be “cultural heritage” (phone calls too if the NSA has anything to say about it) so you better get on the cultural heritage markup train today!

June 3, 2015

Experiment proves Reality does not exist until it is Measured [Nor Do Topics]

Filed under: Physics,Topic Maps — Patrick Durusau @ 3:35 pm

Experiment proves Reality does not exist until it is Measured

From the post:

The bizarre nature of reality as laid out by quantum theory has survived another test, with scientists performing a famous experiment and proving that reality does not exist until it is measured.

Physicists at The Australian National University (ANU) have conducted John Wheeler’s delayed-choice thought experiment, which involves a moving object that is given the choice to act like a particle or a wave. Wheeler’s experiment then asks — at which point does the object decide?

Common sense says the object is either wave-like or particle-like, independent of how we measure it. But quantum physics predicts that whether you observe wave like behavior (interference) or particle behavior (no interference) depends only on how it is actually measured at the end of its journey. This is exactly what the research team found.

“It proves that measurement is everything. At the quantum level, reality does not exist if you are not looking at it,” said Associate Professor Andrew Truscott from the ANU Research School of Physics and Engineering.

The results are more of an indictment of “common sense” than startling proof that “reality does not exist if you are not looking at it.”

In what sense would “reality” exist if you weren’t looking at it?

It is well known that what we perceive as motion, distance, and sensation are all constructs assembled by our brains based upon input from our senses. Change those senses or fool them and the “displayed” results are quite different.

If you doubt either of those statements, try your hand at the National Geographic BrainGames site.

Topics, as you recall, represent all the information we know about a particular subject.

So, in what sense does a topic not exist until we look at it?

Assuming that you have created your topic map in a digital computer, where would you point to show me your topic map? The whole collection of topics? Or a single topic, for that matter?

In order to point to a topic, you have to query the topic map. That is, you have to ask to “see” the topic in question.

When displayed, that topic may have information that you don’t remember entering. In fact, you may be able to prove you never entered some of the information displayed. Yet, the information is now being displayed before you.

Part of the problem arises because, for convenience’s sake, we often think of computers as storing information as we would write it down on a piece of paper. But the act of displaying information by a computer is a transformation of its stored information into a format that is easier for us to perceive.

A transformation process underlies the display of a topic, depending upon the merging rules for your topic map. It is always possible to ask a topic map to return a set of topics that match a merging criterion, but that again is you “looking at” a requested set of topics and not in any way “the way the topics are in reality.”

One of the long-standing problems in semantic interoperability is the insistence of every solution that it has the answer, if only everyone else would listen and abandon their own solutions.

Yes, that would work, but after more than 6,000 years of recorded history with different systems for semantics (languages, both natural and artificial), that has never happened. I take that as some evidence that a universal solution isn’t going to happen.

What I am proposing is that topics, in a topic map, have the shape and content designed by an author and/or requested by a user. That is, the result of a topic map is always a question of “what did you ask?” and not some preordained answer.
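To make the “topic on demand” idea concrete, here is a minimal sketch (a toy of my own, not any particular topic map engine’s API) in which a topic only takes shape when it is queried, by merging property sets that satisfy a stated merging rule:

from collections import defaultdict

# Toy records; the merging rule below is an assumption made for this sketch.
records = [
    {"name": "pentane", "cas": "109-66-0", "formula": "C5H12"},
    {"label": "n-Pentane", "cas": "109-66-0", "boiling_point_c": "36.1"},
    {"name": "pentane", "chemspider": "7712", "formula": "C5H12"},
]

def same_subject(a, b):
    """Merging rule: a shared CAS number, or a shared (name, formula) pair."""
    if a.get("cas") and a.get("cas") == b.get("cas"):
        return True
    return (a.get("name") is not None
            and (a.get("name"), a.get("formula")) == (b.get("name"), b.get("formula")))

def topic_for(key, value):
    """The 'topic' exists only as the answer to this query."""
    seeds = [r for r in records if r.get(key) == value]
    members = [r for r in records if any(same_subject(r, s) for s in seeds)]
    topic = defaultdict(set)
    for record in members:
        for prop, val in record.items():
            topic[prop].add(val)
    return dict(topic)

print(topic_for("cas", "109-66-0"))

Until topic_for() is called there is no “pentane topic” anywhere in the system, only records that a merging rule can pull together on request.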

As I said, that isn’t likely to come up early in your use of topic maps but it could be invaluable for maintenance and processing of a topic map.

I am working on several examples to illustrate this idea and hope to have one or more of them posted tomorrow.

Clear Practice Lock

Filed under: Humor,Security — Patrick Durusau @ 2:48 pm

Clear Practice Lock

Violet Blue mentioned wanting one of these on Twitter the other day.

[Image: clear practice lock]

From the description:

Aspiring lockpickers often find themselves in a frustrating situation: Armed with all the practical knowledge that the internet and locksport community can provide, they still have to overcome the frustrations of picking an actual lock. In those cases, the Clear Practice Lock is an invaluable tool for novices, allowing them to see their developing skills in action.

The bible of the lock is made of clear plastic, allowing the lockpicker to observe the manipulations of the pins while practicing his/her lockpicking techniques.

  • Uses standard pins and an SC1 keyway
  • Clear plastic bible
  • Includes matching key
  • Great for teaching & learning locksmithing and lockpicking

This would be a real conversation starter at security conferences.

Not to mention a distraction while you pwn someone’s laptop. 😉

Mapping the History of L.A.’s Notorious Sprawl

Filed under: Mapping,Maps — Patrick Durusau @ 2:27 pm

Mapping the History of L.A.’s Notorious Sprawl by Betsy Mason.

From the post:

[Image: Built:LA interactive map]

(Apologies for the distortion; the map really needs a full page and should be seen in its interactive form.)


THE SPRAWLING BUILTSCAPE of Los Angeles always seems to have people there riled up in one way or another. Lately there are rumblings about “classic” L.A. homes being displaced by bigger, more modern houses, changing the face of established neighborhoods. Even people with enormous mansions are complaining about the enormouser mansions people are building next door. And this is just one of the ongoing storylines in an ever-morphing city.

Now, urban designer Omar Ureta has created an interactive map to help tell some of these stories. His Built:LA project shows the ages of almost every existing building in the city, and can break them down by decade to reveal how the city has grown over time (works best in Chrome or Firefox).

“There’s so much discussion going on right now in how L.A. is urbanizing, I wanted to create a tool that could contribute to the dialogue,” Ureta, who moved to L.A. nine years ago from the Inland Empire, told me in an email. “I’m excited that the map is actually making people ask more questions about their neighborhood, their city and the whole region.”

Ureta’s combining of data from a variety of sources enables users to peel back layers of construction in L.A. It makes me curious about forward-looking “what-if” maps based on local histories of development.

This project should be an inspiration for either historical or future-projecting maps of urban construction.

Neural Networks and Deep Learning

Filed under: Deep Learning,Machine Learning — Patrick Durusau @ 2:02 pm

Neural Networks and Deep Learning by Michael Nielsen.

From the webpage:

Neural Networks and Deep Learning is a free online book. The book will teach you about:

  • Neural networks, a beautiful biologically-inspired programming paradigm which enables a computer to learn from observational data
  • Deep learning, a powerful set of techniques for learning in neural networks

Neural networks and deep learning currently provide the best solutions to many problems in image recognition, speech recognition, and natural language processing. This book will teach you the core concepts behind neural networks and deep learning.

The book is currently an incomplete beta draft. More chapters will be added over the coming months. For now, you can:

Michael starts off with a task that we all mastered as small children, recognizing handwritten digits. Along the way, you will learn not just the mechanics of how the characters are recognized but why neural networks work the way they do.

Great introductory material to pass along to a friend.

Financial sector takes up to 176 days to patch security flaws

Filed under: Cybersecurity,Security — Patrick Durusau @ 1:42 pm

Financial sector takes up to 176 days to patch security flaws by Charles Oborne.

From the post:

The financial industry takes an average of 176 days to patch security problems, a new analysis of reported vulnerabilities reveals.

Cybersecurity threat prediction and remediation firm NopSec has released an analysis of over 65,000 security vulnerabilities recorded across two decades. The report, titled “2015 State of Vulnerability Risk Management,” reveals that key security issues and known vulnerabilities are being overlooked by the enterprise — and it takes far too long to patch problems as they surface.

The report analyzed over 65,000 vulnerabilities logged within the National Vulnerability Database, a US government repository of standards-based vulnerability management data which includes security related software flaws, misconfigurations, product names, and impact metrics.

Direct link to: 2015 State of Vulnerability Risk Management.

Digging into the report you will find this jewel:

Most alarming was the financial industry with over 30% of vulnerabilities taking more than a year to fix from the time they were detected.

This is another example of a report that lacks “business intelligence.” Let’s assume that we want to fix all vulnerabilities within 30 days of discovery. We could just adopt that as corporate policy, but what would it take for that to become corporate reality?

Among other things (this isn’t exhaustive), we need to determine whether we need additional personnel for remediation (plus the payroll to pay them), the probable costs of software, training, and the like for remediation, and the costs we currently incur without those additional personnel and expenses.

It could take two years or more to fix some vulnerabilities but if the cost of fixing them exceeds the cost of having the vulnerability, can you guess which strategy most business leaders will choose?
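A back-of-the-envelope comparison, with every figure invented purely for illustration, shows how that decision tends to get made:

# All numbers below are hypothetical, chosen only to illustrate the trade-off.
cost_to_fix_now = 250000          # staff, tooling and training to remediate within 30 days
annual_breach_probability = 0.05  # estimated chance the vulnerability is exploited in a year
breach_cost = 2000000             # estimated loss if it is exploited
years_deferred = 2                # how long the fix would otherwise be put off

expected_loss_if_deferred = annual_breach_probability * breach_cost * years_deferred

print("Cost to fix now:           ${:,.0f}".format(cost_to_fix_now))
print("Expected loss if deferred: ${:,.0f}".format(expected_loss_if_deferred))
print("Fix now" if cost_to_fix_now < expected_loss_if_deferred else "Defer (on these numbers)")

On those made-up numbers deferral wins, which is exactly the calculation the report never engages with.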

Vulnerabilities don’t exist in a vacuum, separate and apart from all other enterprise activities and goals. Like you, I would have them outrank other priorities, but someone has to make sure the business turns a profit, not simply that it is secure. Unprofitable and secure is a bad state.

The Empire Strikes Back Apple…

Filed under: Cybersecurity,Mac OS X — Patrick Durusau @ 1:14 pm

The Empire Strikes Back Apple – how your Mac firmware security is completely broken

From the post:

If you are a rootkits fan the latest Chaos Communication Congress (CCC) in 2014 brought us two excellent presentations, Thunderstrike by Trammell Hudson and Attacks on UEFI security, inspired by Darth Venami’s misery and Speed Racer by Rafal Wojtczuk and Corey Kallenberg.

The first one was related to the possibility to attack EFI from a Thunderbolt device, and the second had a very interesting vulnerability regarding the UEFI boot script table. The greatest thing about the second vulnerability is that it allows to unlock flash protections by modifying the boot script executed after a S3 suspend-resume cycle.

Dmytro Oleksiuk aka Cr4sh released proof of concept code regarding this attack against an Intel DQ77KB motherboard. His very interesting blog post is “Exploiting UEFI boot script table vulnerability”. You should definitely read it.

And a bit further on:


What is that hole after all? Is Dark Jedi hard to achieve on Macs?
No, it’s extremely easy because Apple does all the dirty work for you. What the hell am I talking about?
Well, Apple’s S3 suspend-resume implementation is so f*cked up that they will leave the flash protections unlocked after a suspend-resume cycle. !?#$&#%&!#%&!#

And you ask, what the hell does this mean? It means that you can overwrite the contents of your BIOS from userland and rootkit EFI without any other trick other than a suspend-resume cycle, a kernel extension, flashrom, and root access.

Wait, am I saying Macs EFI can be rootkitted from userland without all the tricks from Thunderbolt that Trammell presented? Yes I am! And that is one hell of a hole :-).

(emphasis in the original)

The post continues with a detailed explanation of the security hole, which may not exist on “mid/late 2014 machines and newer,” according to the post.

A very impressive amount of work and knowledge of Mac OS X went into this report.

I have never seen a Mac in a bank, which, as Willie Sutton observed, is “where the money is.”

Perhaps times are changing. It has been reported that despite a declining PC market, Apple sales continue to rise. (1st quarter of 2015)

Perhaps Macs are evolving into a worthwhile target. Unlike turning off someone’s iPhone, which is annoying at worst, a song that encrypts local storage when played would be a far more serious problem.

W3C Validation Tools – New Location

Filed under: CSS3,HTML,HTML5,W3C,WWW — Patrick Durusau @ 10:52 am

W3C Validation Tools

The W3C graciously hosts the following free validation tools:

CSS Validator – Checks your Cascading Style Sheets (CSS)

Internationalization Checker – Checks level of internationalization-friendliness.

Link Checker – Checks your web pages for broken links.

Markup Validator – Checks the markup of your Web documents (HTML or XHTML).

RSS Feed Validator – Validator for syndicated feeds. (RSS and Atom feeds)

RDF Validator – Checks and visualizes RDF documents.

Unicorn – Unified validator. HTML, CSS, Links & Mobile.

Validator.nu – Checks HTML5.

I mention that these tools are free to emphasize there is no barrier to their use.

Just as you wouldn’t submit a research paper with pizza grease stains on it, use these tools to proof draft standards before you submit them for review.
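They can also be scripted into a review workflow. A minimal sketch (assuming the Nu HTML checker’s HTTP interface at validator.w3.org/nu, which accepts a POSTed document and can return JSON, and the Python requests library):

import requests

html = "<!DOCTYPE html><html><head><title>Draft</title></head><body><p>Hello</p></body></html>"

# POST the document to the Nu checker and ask for machine-readable output.
response = requests.post(
    "https://validator.w3.org/nu/?out=json",
    data=html.encode("utf-8"),
    headers={"Content-Type": "text/html; charset=utf-8"},
)

for message in response.json().get("messages", []):
    print(message.get("type"), "-", message.get("message"))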

NoSQL Now! 2015

Filed under: Conferences,NoSQL — Patrick Durusau @ 10:30 am

NoSQL Now! 2015

[Image: NoSQL Now! 2015 banner]

There is a strong graph track but if your interests lie elsewhere, you won’t be disappointed!

BTW, register by July 17, 2015 for a 20% discount off the standard price. (That gets the full event below $500. For three days in San Jose? That’s a real bargain.)

Foreign Intelligence Gathering Laws

Filed under: Government,Intelligence,Law,Law - Sources,Privacy — Patrick Durusau @ 10:14 am

Foreign Intelligence Gathering Laws by Peter Roudik, Director of Legal Research, Law Library of Congress.

From the description:

This report contains information on laws regulating the collection of intelligence in the European Union, United Kingdom, France, Netherlands, Portugal, Romania, and Sweden. The report details how EU Members States control activities of their intelligence agencies and what restrictions are imposed on information collection. All EU Member States follow EU legislation on personal data protection, which is a part of the common European Union responsibility.

To the extent that you think intelligence services obey laws, or if you need statute and case citations for rhetorical purposes, this report will be quite handy for the countries covered.

Whether you are in the United States or one of the countries listed in this report or elsewhere, your default assumption should be that you are under surveillance and the record light is on.

Journal of Cybersecurity

Filed under: Cybersecurity — Patrick Durusau @ 9:58 am

Journal of Cybersecurity

From the description:

The newly launched fully open access Journal of Cybersecurity publishes accessible articles describing original research in the inherently interdisciplinary world of computer, systems, and information security. The journal is premised on the belief that computer science-based approaches, while necessary, are not sufficient to tackle cybersecurity challenges. Instead, scholarly contributions from a range of disciplines are needed to understand the human aspects of cybersecurity. Journal of Cybersecurity provides a hub around which the interdisciplinary cybersecurity community can form, and is committed to providing quality conceptual and empirical research, as well as scholarship, that is grounded in real-world implications and solutions.

From the first editorial:

The Journal of Cybersecurity (JCS) is a new, open-access publication from Oxford University Press, developed specifically to deliver a venue that can bridge the many different disciplines and specialisms in information security. Future successes in cybersecurity policy and practice will depend on dialogue, knowledge transfer, and collaboration.

JCS publishes accessible articles describing original research in the inherently interdisciplinary world of computer, systems, and information security. JCS is premised on the belief that computer science-based approaches, while necessary, are not sufficient to tackle cybersecurity challenges. Instead, scholarly contributions from a range of disciplines are needed to understand the human aspects of cybersecurity. JCS provides a hub around which the interdisciplinary cybersecurity community can form. JCS is committed to providing quality empirical research, as well as scholarship, that is grounded in real-world implications and solutions. It will appeal to academics and researchers in security and related fields, senior security managers in industry, and policy-makers in government.

JCS will initially publish research on the following aspects of cybersecurity: anthropological and cultural studies; computer science and security, including mathematical and systems perspectives; security and crime science; cryptography and associated topics; security economics; human factors and psychology; law and regulation; political and policy perspectives; strategy and international relations; and privacy.

I first saw this in A New Journal — Dedicated to Cybersecurity by Susan Landau. Editors in chief are David Pym and Tyler Moore.

Take note that the Journal of Cybersecurity is an open-access journal. Rather unlike the FBI cybersecurity database where you can obtain access only if the FBI decides you are trustworthy. Given their history, you have to wonder what standard they are using for “trustworthy.” We used to call that “security by obscurity.” Never worked, at least not well.

Glad to see JCS is taking an open approach to cybersecurity. The more we all know, the more we can protect ourselves and others.

PS: Yes, I am aware of the identifier conflict of JCS as in Journal of Cybersecurity and JCS as in the Journal of Cuneiform Studies. I know that was the first thing that came to mind when you saw JCS.

Personally I would use a scoping property set to cybersecurity for one and ancient near eastern studies for the other, just as a precaution against confused search results.

Another difference is that the Journal of Cybersecurity is open access and the Journal of Cuneiform Studies is not. You wouldn’t want just anyone reading: Back to the Cedar Forest: The Beginning and End of Tablet V of the Standard Babylonian Epic of Gilgameš (pp. 69-90) by F. N. H. Al-Rawi and A. R. George.

I suppose some organizations learn more quickly than others.

June 2, 2015

New Testament Virtual Manuscript Room

Filed under: Bible,Humanities,Manuscripts — Patrick Durusau @ 6:39 pm

New Testament Virtual Manuscript Room

From the webpage:

This site is devoted to the study of Greek New Testament manuscripts. The New Testament Virtual Manuscript Room is a place where scholars can come to find the most exhaustive list of New Testament manuscript resources, can contribute to marking attributes about these manuscripts, and can find state of the art tools for researching this rich dataset.

While our tools are reasonably functional for anonymous users, they provide additional features and save options once a user has created an account and is logged in on the site. For example, registered users can save transcribed pages to their personal account and create personalized annotations to images.

A close friend has been working on this project for several years. Quite remarkable, although I would prefer it to feature Hebrew (and older) texts. 😉

USA Freedom Act Passes:… [Questions for the EFF]

Filed under: Government,Privacy — Patrick Durusau @ 6:07 pm

USA Freedom Act Passes: What We Celebrate, What We Mourn, and Where We Go From Here

From the post:

The Senate passed the USA Freedom Act today by 67-32, marking the first time in over thirty years that both houses of Congress have approved a bill placing real restrictions and oversight on the National Security Agency’s surveillance powers. The weakening amendments to the legislation proposed by NSA defender Senate Majority Leader Mitch McConnell were defeated, and we have every reason to believe that President Obama will sign USA Freedom into law. Technology users everywhere should celebrate, knowing that the NSA will be a little more hampered in its surveillance overreach, and both the NSA and the FISA court will be more transparent and accountable than they were before the USA Freedom Act.

It’s no secret that we wanted more. In the wake of the damning evidence of surveillance abuses disclosed by Edward Snowden, Congress had an opportunity to champion comprehensive surveillance reform and undertake a thorough investigation, like it did with the Church Committee. Congress could have tried to completely end mass surveillance and taken numerous other steps to rein in the NSA and FBI. This bill was the result of compromise and strong leadership by Sens. Patrick Leahy and Mike Lee and Reps. Robert Goodlatte, Jim Sensenbrenner, and John Conyers. It’s not the bill EFF would have written, and in light of the Second Circuit’s thoughtful opinion, we withdrew our support from the bill in an effort to spur Congress to strengthen some of its privacy protections and out of concern about language added to the bill at the behest of the intelligence community.

Even so, we’re celebrating. We’re celebrating because, however small, this bill marks a day that some said could never happen—a day when the NSA saw its surveillance power reduced by Congress. And we’re hoping that this could be a turning point in the fight to rein in the NSA.

For years, the larger EFF community has proven itself capable of fighting bad legislation that would hamper rights and freedoms online, with the clearest example being the 2012 annihilation of the Internet blacklist legislation SOPA. Lawmakers have feared that technology users—organized, politically-savvy, articulate, and educated about the law and its effects on tech—would strike out to stop their misguided legislative efforts. But for all our many victories in stopping bad legislation, we have struggled to pass bills that would better protect our freedoms. Passing a bill is far more difficult than simply killing a bad bill, and takes more sustained pressure from the public, a massive publicity campaign around a central issue, deep connections to lawmakers, and the coordination of diverse groups from across the political spectrum.

The USA Freedom Act shows that the digital rights community has leveled up. We’ve gone from just killing bad bills to passing bills that protect people’s rights.

Surprising that Congress went as far as it did, but I have some questions for the EFF.

What makes you think the NSA or any other part of the intelligence apparatus will follow the law passed by Congress?

We already know that the director of national intelligence was willing to lie to Congress and that other illegal activity has been concealed from Congress (and the public) for years.

Why does the EFF suddenly have confidence that known lawbreakers, who proclaim they aren’t lying this time, should be taken at face value?

It is altogether possible that the telcos will indeed store data and the NSA and others will play their parts in requesting data, all while still collecting all data, legal and illegal.

I’m not a conspiracy theorist but we do have documented cases of the intelligence community doing that very thing. “It’s for the good of the country” and all that delusional crap.

I see nothing to celebrate unless and until Congress defunds and dismantles all of the current intelligence apparatus under the supervision of random members of the public and press.

Did the US intelligence apparatus foresee the overthrow of the Shah of Iran? The assassination of Sadat? Or Rabin? Or the fall of the Berlin Wall? 9/11? The American public should be asking what we are getting for all the effort of the US intelligence services. So far: loss of privacy, a lot of insecurity, and paper waste.

(No, I don’t credit stories whispered in secret by Saigon warriors, not at all.)


Update: There are a variety of others who have reached the same conclusion:

Why It’s Quite Premature to Celebrate the Death of the Surveillance State by Norman Solomon.

USA Freedom Act gives NSA everything it wants — and less by Joshua Kopstein. (a “nothingburger for the privacy community”)

How The USA Freedom Act is Actually Reducing Freedom in America by Virgil Vaduva.

Side by Side with Elasticsearch and Solr: Performance and Scalability

Filed under: ElasticSearch,Solr — Patrick Durusau @ 3:43 pm

Side by Side with Elasticsearch and Solr: Performance and Scalability by Mick Emmett.

From the post:

Back by popular demand! Sematext engineers Radu Gheorghe and Rafal Kuc returned to Berlin Buzzwords on Tuesday, June 2, with the second installment of their “Side by Side with Elasticsearch and Solr” talk. (You can check out Part 1 here.)

Elasticsearch and Solr Performance and Scalability

This brand new talk — which included a live demo, a video demo and slides — dove deeper into how Elasticsearch and Solr scale and perform. And, of course, they took into account all the goodies that came with these search platforms since last year. Radu and Rafal showed attendees how to tune Elasticsearch and Solr for two common use-cases: logging and product search. Then they showed what numbers they got after tuning. There was also some sharing of best practices for scaling out massive Elasticsearch and Solr clusters; for example, how to divide data into shards and indices/collections that account for growth, when to use routing, and how to make sure that coordinated nodes don’t become unresponsive.

Video is coming soon, and in the meantime please enjoy the slides:

After you see the presentation and slides (parts 1 and 2), you will understand the “popular demand” for these authors.

The best comparison of Elasticsearch and Solr that you will see this year. (Unless the presenters update their presentation before the end of the year.)

Relevant Search

Filed under: ElasticSearch,Relevance,Search Engines,Solr — Patrick Durusau @ 3:21 pm

Relevant Search – With examples using Elasticsearch and Solr by Doug Turnbull and John Berryman.

From the webpage:

Users expect search to be simple: They enter a few terms and expect perfectly-organized, relevant results instantly. But behind this simple user experience, complex machinery is at work. Whether using Solr, Elasticsearch, or another search technology, the solution is never one size fits all. Returning the right search results requires conveying domain knowledge and business rules in the search engine’s data structures, text analytics, and results ranking capabilities.

Relevant Search demystifies relevance work. Using Elasticsearch, it teaches you how to return engaging search results to your users, helping you understand and leverage the internals of Lucene-based search engines. Relevant Search walks through several real-world problems using a cohesive philosophy that combines text analysis, query building, and score shaping to express business ranking rules to the search engine. It outlines how to guide the engineering process by monitoring search user behavior and shifting the enterprise to a search-first culture focused on humans, not computers. You’ll see how the search engine provides a deeply pluggable platform for integrating search ranking with machine learning, ontologies, personalization, domain-specific expertise, and other enriching sources.

  • Creating a foundation for Lucene-based search (Solr, Elasticsearch) relevance internals
  • Bridging the field of Information Retrieval and real-world search problems
  • Building your toolbelt for relevance work
  • Solving search ranking problems by combining text analysis, query building, and score shaping
  • Providing users relevance feedback so that they can better interact with search
  • Integrating test-driven relevance techniques based on A/B testing and content expertise
  • Exploring advanced relevance solutions through custom plug-ins and machine learning

Now imagine relevancy searching where a topic map contains multiple subject identifications for a single subject, from different perspectives.
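For a taste of what “score shaping” looks like in practice, here is a minimal sketch of an Elasticsearch query that boosts title matches and folds a popularity signal into the score (the index name, field names and weights are hypothetical, and the elasticsearch-py client is assumed):

from elasticsearch import Elasticsearch

es = Elasticsearch()

# Hypothetical "products" index with title, description and popularity fields.
query = {
    "query": {
        "function_score": {
            "query": {
                "multi_match": {
                    "query": "wireless noise cancelling headphones",
                    "fields": ["title^3", "description"],  # title matches count three times as much
                }
            },
            # Score shaping: blend text relevance with a popularity signal.
            "field_value_factor": {
                "field": "popularity",
                "modifier": "log1p",
                "factor": 0.5,
            },
            "boost_mode": "sum",
        }
    }
}

results = es.search(index="products", body=query)
for hit in results["hits"]["hits"]:
    print(hit["_score"], hit["_source"].get("title"))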

Relevant Search is in early release but the sooner you participate, the fewer errata there will be in the final version.

Statistical and Mathematical Functions with DataFrames in Spark

Filed under: Data Frames,Python,Spark — Patrick Durusau @ 2:59 pm

Statistical and Mathematical Functions with DataFrames in Spark by Burak Yavuz and Reynold Xin.

From the post:

We introduced DataFrames in Spark 1.3 to make Apache Spark much easier to use. Inspired by data frames in R and Python, DataFrames in Spark expose an API that’s similar to the single-node data tools that data scientists are already familiar with. Statistics is an important part of everyday data science. We are happy to announce improved support for statistical and mathematical functions in the upcoming 1.4 release.

In this blog post, we walk through some of the important functions, including:

  1. Random data generation
  2. Summary and descriptive statistics
  3. Sample covariance and correlation
  4. Cross tabulation (a.k.a. contingency table)
  5. Frequent items
  6. Mathematical functions

We use Python in our examples. However, similar APIs exist for Scala and Java users as well.

You do know you have to build Spark yourself to get these features before the 1.4 release, yes? For that: https://github.com/apache/spark/tree/branch-1.4.

Have you ever heard the expression “used in anger?”

That’s what Spark and its components deserve, to be “used in anger.”
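To get started, here is a brief sketch of a few of the listed functions in PySpark (assuming a Spark 1.4 build and an existing SQLContext named sqlContext):

from pyspark.sql.functions import rand, randn

# A DataFrame with an id column plus uniformly and normally distributed columns.
df = sqlContext.range(0, 1000).select(
    "id",
    rand(seed=10).alias("uniform"),
    randn(seed=27).alias("normal"),
)

# Summary and descriptive statistics
df.describe("uniform", "normal").show()

# Sample covariance and correlation
print(df.stat.cov("uniform", "normal"))
print(df.stat.corr("uniform", "normal"))

# Cross tabulation of two derived boolean columns
labeled = df.select(
    (df.uniform > 0.5).alias("high_uniform"),
    (df.normal > 0).alias("positive_normal"),
)
labeled.stat.crosstab("high_uniform", "positive_normal").show()

# Frequent items: values appearing in at least 40% of rows
print(df.stat.freqItems(["id"], 0.4).collect())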

Enjoy!

Data Science on Spark

Filed under: BigData,Machine Learning,Spark — Patrick Durusau @ 2:43 pm

Databricks Launches MOOC: Data Science on Spark by Ameet Talwalkar and Anthony Joseph.

From the post:

For the past several months, we have been working in collaboration with professors from the University of California Berkeley and University of California Los Angeles to produce two freely available Massive Open Online Courses (MOOCs). We are proud to announce that both MOOCs will launch in June on the edX platform!

The first course, called Introduction to Big Data with Apache Spark, begins today [June 1, 2015] and teaches students about Apache Spark and performing data analysis. The second course, called Scalable Machine Learning, will begin on June 29th and will introduce the underlying statistical and algorithmic principles required to develop scalable machine learning pipelines, and provides hands-on experience using Spark. Both courses will be freely available on the edX MOOC platform, and edX Verified Certificates are also available for a fee.

Both courses are available for free on the edX website, and you can sign up for them today:

  1. Introduction to Big Data with Apache Spark
  2. Scalable Machine Learning

It is our mission to enable data scientists and engineers around the world to leverage the power of Big Data, and an important part of this mission is to educate the next generation.

If you believe in the wisdom of crowds, note that some 80K students had enrolled as of yesterday.

So, what are you waiting for?

😉

$100,000 reward for leaking the Trans-Pacific Partnership ‘TPP’

Filed under: Government,Politics,Transparency — Patrick Durusau @ 12:26 pm

Wikileaks is raising a $100,000 bounty for the missing twenty-six (26) chapters of the TPP. (Hope they get the revised versions after the meeting in Lima this summer.)

Donate Today!

As of 13:23 on 2 June 2015, $23,927.17 had been raised from 68 people, or about 24% of the goal. No guarantees, but it sounds like a good plan.

Spatial Humanities Workshop

Filed under: Humanities,Mapping,Maps,Spatial Data — Patrick Durusau @ 9:51 am

Spatial Humanities Workshop by Lincoln Mullen.

From the webpage:

Scholars in the humanities have long paid attention to maps and space, but in recent years new technologies have created a resurgence of interest in the spatial humanities. This workshop will introduce participants to the following subjects:

  • how mapping and spatial analysis are being used in humanities disciplines
  • how to find, create, and manipulate spatial data
  • how to create historical layers on interactive maps
  • how to create data-driven maps
  • how to tell stories and craft arguments with maps
  • how to create deep maps of places
  • how to create web maps in a programming language
  • how to use a variety of mapping tools
  • how to create lightweight and semester-long mapping assignments

The seminar will emphasize the hands-on learning of these skills. Each day we will pay special attention to preparing lesson plans for teaching the spatial humanities to students. The aim is to prepare scholars to be able to teach the spatial humanities in their courses and to be able to use maps and spatial analysis in their own research.

Ahem, the one thing Lincoln forgets to mention is that he is a major player in spatial humanities. His homepage is an amazing place.

The seminar materials don’t disappoint. It would be better to be at the workshop, but in lieu of attending, working through these materials will leave you well grounded in spatial humanities.

Identifiers as Shorthand for Identifications

Filed under: Identification,Identifiers,Topic Maps — Patrick Durusau @ 9:33 am

I closed Identifiers vs. Identifications? saying:

Many questions remain, such as how to provide for collections of sets “of properties which provide clues for establishing identity?,” how to make those collections extensible?, how to provide for constraints on such sets?, where to record “matching” (read “merging”) rules?, what other advantages can be offered?

In answering those questions, I think we need to keep in mind that identifiers and identifications lie along a continuum that runs from where we “know” what is meant by an identifier to where we ourselves need a full identification to know what is being discussed. A useful answer won’t be one or the other, but a pairing that suits a particular circumstance and use case.

You can also think of identifiers as a form of shorthand for an identification. If we were working together in a fairly small office, you would probably ask, “Is Patrick in?” rather than listing all the properties that would serve as an identification for me. So all the properties that make up an identification are unspoken but invoked by the use of the identifier.

That works quite well in a small office because, to some varying degree, we all share the identifications that are represented by the identifiers we use in everyday conversation.

That sharing of identifications behind identifiers doesn’t happen in information systems, unless we have explicitly added identifications behind those identifiers.

One problem we need to solve is how to associate an identification with an identifier or identifiers. Looking only slightly ahead, we could use an explicit mechanism like a TMDM association, if we wanted to be able to talk about the subject of the relationship between an identifier and the identification that lies behind it.

But we are not compelled to talk about such a subject and could declare by rule that within a container, an identifier is a shorthand for properties of an identification in the same container. That assumes the identifier is distinguished from the properties that make up the identification. I don’t think we need to reinvent the notions of essential vs. accidental properties but merging rules should call out what properties are required for merging.

The wary reader will have suspected before now that many (if not all) of the terms in such a container could be considered as identifiers in and of themselves. Suddenly they are trying to struggle uphill from a swamp of subject recursion. It is “elephants all the way down.”

Have no fear! Just as we can avoid using TMDM associations to mark the relationship between an identifier and the properties making up an identification, we need only use containers for identifiers and identifications when and where we choose.

In some circumstances we may use bare identifiers, sans any identifications, and yet add identifications when circumstances warrant.

No single level (bare identifiers, an identification, an identification that explodes other identifiers, and so on) is right for every purpose. Each may be appropriate for some particular purpose.

We need to allow for downward expansion in the form of additional containers alongside the containers we author, as well as extension of containers to add sub-containers for identifiers and identifications we did not, or chose not to, author.

I do have an underlying assumption that may reassure you about the notion of downward expansion of identifier/identification containers:

Processing of one or more containers of identifiers and identifications can choose the level of identifiers + identifications to be processed.

For some purposes I may only want to choose “top level” identifiers and identifications or even just parts of identifications. For example, think of the simple mapping of identifiers that happens in some search systems. You may examine the identifications for identifiers and then produce a bare mapping of identifiers for processing purposes. Or you may have rules for identifications that produce a mapping of identifiers.

Let’s assume that I want to create a set of the identifiers for Pentane and so I query for the identifiers that have the molecular property C5H12. Some of the identifiers (with their scopes) returned will be: Beilstein Reference 969132, CAS Registry Number 109-66-0, ChEBI CHEBI:37830, ChEMBL ChEMBL16102, ChemSpider 7712, DrugBank DB03119.

Each one of those identifiers may have other properties in their associated identifications, but there is no requirement that I produce them.

I mentioned that identifiers have scope. If you perform a search on “109-66-0” (CAS Registry Number) or 7712 (ChemSpider) you will quickly find garbage. Some identifiers are useful only with particular data sources or in circumstances where the data source is identified. (The idea of “universal” identifiers is a recurrent human fiction. See The Search for the Perfect Language, Eco.)

Which means, of course, we will need to capture the scope of identifiers.
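To make that concrete, here is a minimal sketch of such a container (a toy data structure of my own, not a proposed syntax), holding one identification for Pentane along with its scoped identifiers:

# Toy container: one identification (a property set) plus the scoped
# identifiers that serve as shorthand for it.
pentane = {
    "identification": {
        "name": "Pentane",
        "molecular-formula": "C5H12",
    },
    "identifiers": [
        {"value": "969132",      "scope": "Beilstein Reference"},
        {"value": "109-66-0",    "scope": "CAS Registry Number"},
        {"value": "CHEBI:37830", "scope": "ChEBI"},
        {"value": "ChEMBL16102", "scope": "ChEMBL"},
        {"value": "7712",        "scope": "ChemSpider"},
        {"value": "DB03119",     "scope": "DrugBank"},
    ],
}

def identifiers_for(containers, prop, value):
    """Return (scope, identifier) pairs for containers whose
    identification carries the requested property value."""
    return [
        (i["scope"], i["value"])
        for c in containers
        if c["identification"].get(prop) == value
        for i in c["identifiers"]
    ]

print(identifiers_for([pentane], "molecular-formula", "C5H12"))

Processing can stop at the bare (scope, identifier) pairs, as the mapping example above suggests, or reach into the identification when more than a shorthand is needed.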

50 Lies Programmers Believe

Filed under: Humor — Patrick Durusau @ 6:26 am

50 Lies Programmers Believe by Tom Morris.

From the post:

1. The naming convention for the majority of the people in my country is the paradigm case and nobody really does anything differently.
… (49 others follow)

I think this is the short list. 😉

I first saw this in a tweet by Andrea Mostosi.

June 1, 2015

Which Malware Lures Work Best?

Filed under: Cybersecurity,Malware,Security — Patrick Durusau @ 6:04 pm

Which Malware Lures Work Best? Measurements from a Large Instant Messaging Worm by Tyler Moore and Richard Clayton.

Abstract:

Users are inveigled into visiting a malicious website in a phishing or malware-distribution scam through the use of a ‘lure’ – a superficially valid reason for their interest. We examine real world data from some ‘worms’ that spread over the social graph of Instant Messenger users. We find that over 14 million distinct users clicked on these lures over a two year period from Spring 2010. Furthermore, we present evidence that 95% of users who clicked on the lures became infected with malware. In one four week period spanning May–June 2010, near the worm’s peak, we estimate that at least 1.67 million users were infected. We measure the extent to which small variations in lure URLs and the short pieces of text that accompany these URLs affects the likelihood of users clicking on the malicious URL. We show that the hostnames containing recognizable brand names were more effective than the terse random strings employed by URL shortening systems; and that brief Portuguese phrases were more effective in luring in Brazilians than more generic ‘language independent’ text.

Slides

How better to learn what to teach users to avoid than by watching users choose malware?

Although since the highly trained professionals at the TSA miss 95% of test explosives and guns, I’m not sure that user training is the answer to malware URLs.

Perhaps detection and automated following of all links in messages/emails, from a computer set up to detect malware? I’m not sure how you would get a warning to users.
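As a rough illustration of the triage step (hostname heuristics only, with an invented shortener list; a real deployment would go on to fetch each link in a sandboxed machine), here is a sketch that pulls URLs out of a message and notes which hosts are shorteners rather than the brand-name hostnames the paper found more effective as lures:

import re
from urllib.parse import urlparse

# A small, illustrative list of URL-shortener hosts.
SHORTENERS = {"bit.ly", "goo.gl", "t.co", "tinyurl.com", "ow.ly"}

URL_RE = re.compile(r"https?://[^\s<>\"]+")

def triage(message):
    """Extract URLs from a message and note which hosts are shorteners."""
    findings = []
    for url in URL_RE.findall(message):
        host = urlparse(url).netloc.lower()
        findings.append((url, "shortener" if host in SHORTENERS else "named host"))
    return findings

sample = "check out these photos http://bit.ly/xyz123 and http://photos.example-brand.com/album"
for url, kind in triage(sample):
    print(kind, url)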

Still, I like the idea of seeing what users do rather than speculating about what they might do, the latter technique being a favorite of our national security apparatus, mostly because it drives budgets.

Identifiers vs. Identifications?

Filed under: Duke,Topic Maps,XML — Patrick Durusau @ 3:50 pm

One problem with topic map rhetoric has been its focus on identifiers (the flat ones):

[Image: a flat identifier]

rather than saying topics are managing subject identifications, that is, making explicit what is represented by an expectant identifier:

[Image: an expectant identifier, showing the identification it carries]

For processing purposes it is handy to map between identifiers, to query by identifiers, and to access by identifiers, to mention only a few tasks, all of them machine facing.

However efficient it may be to use flat identifiers (even by humans), having access to the bundle of properties thought to identify a subject is useful as well.

Topic maps already capture identifiers but their syntaxes need to be extended to support the capturing of subject identifications along with identifiers.

Years of reading have gone into this realization about identifiers and their relationship to identifications, but I would be remiss if I didn’t call out the work of Lars Marius Garshol on Duke.

From the GitHub page:

Duke is a fast and flexible deduplication (or entity resolution, or record linkage) engine written in Java on top of Lucene. The latest version is 1.2 (see ReleaseNotes).

Duke can find duplicate customer records, or other kinds of records in your database. Or you can use it to connect records in one data set with other records representing the same thing in another data set. Duke has sophisticated comparators that can handle spelling differences, numbers, geopositions, and more. Using a probabilistic model Duke can handle noisy data with good accuracy.

In an early post on Duke, Lars observes:


The basic idea is almost ridiculously simple: you pick a set of properties which provide clues for establishing identity. To compare two records you compare the properties pairwise and for each you conclude from the evidence in that property alone the probability that the two records represent the same real-world thing. Bayesian inference is then used to turn the set of probabilities from all the properties into a single probability for that pair of records. If the result is above a threshold you define, then you consider them duplicates.

Bayesian identity resolution

Only two quibbles with Lars on that passage:

I would delete “same real-world thing” and substitute, “any subject you want to talk about.”

I would point out that Bayesian inference is only one means of determining if two or more sets of properties represent the same subject. Defining sets of matching properties comes to mind, as does inferencing based on relationships (associations). “Ask Steve” is another.
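For the Bayesian route, a minimal sketch of the pairwise, per-property scheme Lars describes (the comparators, probabilities and threshold are invented for illustration; this is not Duke’s actual API):

def compare_name(a, b):
    # Probability the records match, judging from this property alone.
    return 0.9 if a.lower() == b.lower() else 0.2

def compare_city(a, b):
    return 0.8 if a.lower() == b.lower() else 0.4

COMPARATORS = {"name": compare_name, "city": compare_city}

def combine(probabilities, prior=0.5):
    """Naive Bayesian combination of per-property match probabilities."""
    odds = prior / (1 - prior)
    for p in probabilities:
        odds *= p / (1 - p)
    return odds / (1 + odds)

def same_subject(rec1, rec2, threshold=0.8):
    probs = [
        comparator(rec1[prop], rec2[prop])
        for prop, comparator in COMPARATORS.items()
        if prop in rec1 and prop in rec2
    ]
    return combine(probs) >= threshold

print(same_subject({"name": "J. Smith", "city": "Oslo"},
                   {"name": "j. smith", "city": "Oslo"}))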

But, I have never heard a truer statement from Lars than:

The basic idea is almost ridiculously simple: you pick a set of properties which provide clues for establishing identity.

Many questions remain, such as how to provide for collections of sets “of properties which provide clues for establishing identity?,” how to make those collections extensible?, how to provide for constraints on such sets?, where to record “matching” (read “merging”) rules?, what other advantages can be offered?

In answering those questions, I think we need to keep in mind that identifiers and identifications lie along a continuum that runs from where we “know” what is meant by an identifier to where we ourselves need a full identification to know what is being discussed. A useful answer won’t be one or the other, but a pairing that suits a particular circumstance and use case.

The Silence of Attributes

Filed under: Topic Maps,XML — Patrick Durusau @ 3:22 pm

I was reading Relax NG by Eric van der Vlist when I ran into this wonderful explanation of the silence of attributes:

Attributes are generally difficult to extend. When choosing from among elements and attributes, people often base their choice on the relative ease of processing, styling, or transforming. Instead, you should probably focus on their extensibility.

Independent of any XML schema language, when you have an attribute in an instance document, you are pretty much stuck with it. Unless you replace it with an element, there is no way to extend it. You can’t add any child elements or attributes to it because it is designed to be a leaf node and to remain a leaf node. Furthermore, you can’t extend the parent element to include a second instance of an attribute with the same name. (Attributes with duplicate names are forbidden by XML 1.0.) You are thus making an impact not only on the extensibility of the attribute but also on the extensibility of the parent element.

Because attributes can’t be annotated with new attributes and because they can’t be duplicated, they can’t be localized like elements through duplication with different values of xml:lang attributes. Because attributes are more difficult to localize, you should avoid storing any text targeted at human consumers within attributes. You never know whether your application will become international. These attributes would make it more difficult to localize. (At page 200)

Let’s think of “localization” as “use a local identifier” and re-read that last paragraph (with apologies to Eric):

Because attributes can’t be annotated with new attributes and because they can’t be duplicated, they can’t use local identifiers like elements through duplication with different values of xml:lang attributes. Because attributes are more difficult to localize, you should avoid storing any identifiers targeted at human consumers within attributes. You never know whether your application will become international. These attributes would make it more difficult to use local identifiers.

As a design principle, the use of attributes prevents us from “localizing” to an identifier that a user might recognize.

What is more, identifiers stand in the place of, or evoke, the properties that we would list as being “how” we identified a subject, even though we happily use an identifier as a shorthand for that set of properties.

While we should be able to use identifiers for subjects, we should also be able to provide the properties we see those identifiers as representing.

20+ Free Graph Databases

Filed under: Graph Databases — Patrick Durusau @ 3:05 pm

20+ Free Graph Databases

The original post has prose descriptions of each database but for my purposes a list of links is more than sufficient:

Like any other software, you should evaluate these graphs databases against your requirements and data. Do not assume that the terminology used in graph database documentation (such as it is) matches your understanding of graph terminology or indeed that used anywhere in published graph literature.

Having said that, some of the graph software listed above is truly extraordinary.

Portfolio of the Week – Josemi Benítez

Filed under: Graphics,Journalism,Visualization — Patrick Durusau @ 2:50 pm

Portfolio of the Week – Josemi Benítez by Tiago Veloso.

From the post:

It’s an indisputable fact that Spain has produced some of the most inspiring visual journalists of the last two decades, and we are quite happy to present you today the work of another one of those talented designers: Josemi Benítez, who has been responsible for the graphics section of the newspaper El Correo (Bilbao) for seven years.

Josemi began in the world of storytelling through images and texts working as a freelance artist for several advertising agencies, while getting degrees in Journalism and Advertising at the Universidad del País Vasco. In 1999 he began working as a Web designer in the Bilbao newspaper, helping to create elcorreo.com.

In 2002, Josemi returned to the paper version of the newspaper, coinciding with a key moment of the graphic evolution of the El Correo. Infographics gained space and a new style of graphics was buzzing. Since then, his work has been awarded a number of times by the Society for News Design and Malofiej News Design awards. In addition to his work at El Correo, he also taught infographic design at the University of Navarra and the Master in Multimedia El Correo-UPV / EHU.

Here are the works Josemi sent us:

I hesitate to call the examples infographics because they are more nearly works of communicative art. Select several for full size viewing and see if you agree.

Time spent with these images, incorporating their techniques into your own work, would be time well spent.

Memantic Is Online!

Filed under: Bioinformatics,Biomedical,Medical Informatics — Patrick Durusau @ 2:34 pm

Memantic

I first blogged about the Memantic paper: Memantic: A Medical Knowledge Discovery Engine in March of this year and am very happy to now find it online!

From the about page:

Memantic captures relationships between medical concepts by mining biomedical literature and organises these relationships visually according to a well-known medical ontology. For example, a search for “Vitamin B12 deficiency” will yield a visual representation of all related diseases, symptoms and other medical entities that Memantic has discovered from the 25 million medical publications and abstracts mentioned above, as well as a number of medical encyclopaedias.

The user can explore a relationship of interest (such as the one between “Vitamin B12 deficiency” and “optic neuropathy”, for instance) by clicking on it, which will bring up links to all the scientific texts that have been discovered to support that relationship. Furthermore, the user can select the desired type of related concepts — such as “diseases”, “symptoms”, “pharmacological agents”, “physiological functions”, and so on — and use it as a filter to make the visualisation even more concise. Finally, the related concepts can be semantically grouped into an expandable tree hierarchy to further reduce screen clutter and to let the user quickly navigate to the relevant area of interest.

Concisely organising related medical entities without duplication

Memantic first presents all medical terms related to the query concept and then groups publications by the presence of each such term in addition to the query itself. The hierarchical nature of this grouping allows the user to quickly establish previously unencountered relationships and to drill down into the hierarchy to only look at the papers concerning such relationships. Contrast this with the same search performed on Google, where the user normally gets a number of links, many of which have the same title; the user has to go through each link to see if it contains any novel information that is relevant to their query.

Keeping the index of relationships up-to-date

Memantic perpetually renews its index by continuously mining the biomedical literature, extracting new relationships and adding supporting publications to the ones already discovered. The key advantage of Memantic’s user interface is that novel relationships become apparent to the user much quicker than on standard search engines. For example, Google may index a new research paper that exposes a previously unexplored connection between a particular drug and the disease that is being searched for by the user. However, Google may not assign that paper the sufficient weight for it to appear in the first few pages of the search results, thus making it invisible to the people searching for the disease who do not persevere in clicking past those initial pages.
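
The grouping idea is simple enough to sketch in a few lines of code. The sketch below uses entirely made-up data (the publication IDs, concept names and semantic types are hypothetical) and is not Memantic’s actual pipeline, only the shape of the result described above: every concept that co-occurs with the query gets its own bucket of supporting papers, and the buckets are hung off a small type hierarchy so duplicate hits collapse into one place.

```python
from collections import defaultdict

# Toy corpus: each "publication" is tagged with the medical concepts it mentions.
# (Hypothetical data for illustration only; Memantic mines such relationships
# from the biomedical literature and a medical ontology.)
publications = {
    "PMID:1001": {"Vitamin B12 deficiency", "Optic neuropathy"},
    "PMID:1002": {"Vitamin B12 deficiency", "Anemia, Megaloblastic"},
    "PMID:1003": {"Vitamin B12 deficiency", "Optic neuropathy", "Paresthesia"},
    "PMID:1004": {"Vitamin B12 deficiency", "Paresthesia"},
}

# Minimal concept hierarchy, standing in for an ontology's semantic types.
concept_type = {
    "Optic neuropathy": "Disease or Syndrome",
    "Anemia, Megaloblastic": "Disease or Syndrome",
    "Paresthesia": "Sign or Symptom",
}

def related_concepts(query):
    """Group publications by each concept that co-occurs with the query,
    then organize those concepts under their semantic type."""
    by_concept = defaultdict(list)
    for pmid, concepts in publications.items():
        if query in concepts:
            for other in concepts - {query}:
                by_concept[other].append(pmid)

    tree = defaultdict(dict)
    for concept, pmids in by_concept.items():
        tree[concept_type.get(concept, "Other")][concept] = sorted(pmids)
    return tree

for semantic_type, branch in related_concepts("Vitamin B12 deficiency").items():
    print(semantic_type)
    for concept, pmids in branch.items():
        print(f"  {concept}: {', '.join(pmids)}")
```

Swap the toy dictionary for concept annotations mined from the literature and the type lookup for a real ontology, and you have the skeleton of the concise, deduplicated grouping the about page describes.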

To get a real feel for what the site is capable of, you need to create an account (free) and try it for yourself.

I am not a professional medical researcher, but I was able to duplicate some prior research I have done on edge-case conditions fairly quickly. Whether that was due to the interface and its techniques or to my knowledge of the subject area is hard to say.

The interface alone is worth the visit.

Do give Memantic a spin! I think you will like what you find.

Architectural Patterns for Near Real-Time Data Processing with Apache Hadoop

Filed under: Architecture,BigData,Data Streams,Hadoop — Patrick Durusau @ 1:35 pm

Architectural Patterns for Near Real-Time Data Processing with Apache Hadoop by Ted Malaska.

From the post:

Evaluating which streaming architectural pattern is the best match to your use case is a precondition for a successful production deployment.

The Apache Hadoop ecosystem has become a preferred platform for enterprises seeking to process and understand large-scale data in real time. Technologies like Apache Kafka, Apache Flume, Apache Spark, Apache Storm, and Apache Samza are increasingly pushing the envelope on what is possible. It is often tempting to bucket large-scale streaming use cases together but in reality they tend to break down into a few different architectural patterns, with different components of the ecosystem better suited for different problems.

In this post, I will outline the four major streaming patterns that we have encountered with customers running enterprise data hubs in production, and explain how to implement those patterns architecturally on Hadoop.

Streaming Patterns

The four basic streaming patterns (often used in tandem) are:

  • Stream ingestion: Involves low-latency persisting of events to HDFS, Apache HBase, and Apache Solr.
  • Near Real-Time (NRT) Event Processing with External Context: Takes actions like alerting, flagging, transforming, and filtering of events as they arrive. Actions might be taken based on sophisticated criteria, such as anomaly detection models. Common use cases, such as NRT fraud detection and recommendation, often demand low latencies under 100 milliseconds.
  • NRT Event Partitioned Processing: Similar to NRT event processing, but deriving benefits from partitioning the data—like storing more relevant external information in memory. This pattern also requires processing latencies under 100 milliseconds.
  • Complex Topology for Aggregations or ML: The holy grail of stream processing: gets real-time answers from data with a complex and flexible set of operations. Here, because results often depend on windowed computations and require more active data, the focus shifts from ultra-low latency to functionality and accuracy.

In the following sections, we’ll get into recommended ways for implementing such patterns in a tested, proven, and maintainable way.

Great post on patterns for near real-time data processing.
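
If you want a concrete picture of the second pattern, here is a toy, framework-free sketch of near real-time event processing with external context. Everything in it is hypothetical (the account IDs, the watch list, the thresholds); in a real deployment the events would arrive via Kafka or Flume and the logic would run in Spark Streaming or Storm, as the post describes. The point is only the shape of the pattern: a per-event decision made against context already cached in memory, so the decision fits inside the roughly 100 millisecond budget mentioned above.

```python
import time
from dataclasses import dataclass

# Toy illustration of the "NRT event processing with external context" pattern.
# The watch list stands in for external context (e.g., output of an anomaly
# detection model) cached in memory for fast per-event lookups.

@dataclass
class Event:
    account: str
    amount: float

WATCH_LIST = {"acct-042", "acct-077"}   # hypothetical flagged accounts
LATENCY_BUDGET_MS = 100                 # the post's target for fraud-style use cases

def process(event: Event) -> str:
    """Alert, flag, or pass an event based on the cached external context."""
    start = time.perf_counter()
    if event.account in WATCH_LIST:
        action = "ALERT"
    elif event.amount > 10_000:
        action = "FLAG"
    else:
        action = "PASS"
    elapsed_ms = (time.perf_counter() - start) * 1000
    assert elapsed_ms < LATENCY_BUDGET_MS, "blew the latency budget"
    return action

stream = [Event("acct-001", 25.0), Event("acct-042", 9.99), Event("acct-003", 50_000.0)]
for e in stream:
    print(e.account, process(e))
```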

What I have always wondered is how much of a use case there is for “near real-time processing” of data. If human decision makers are in the loop, that is, outside of ecommerce and algorithmic trading, what is the value-add of “near real-time processing” of data?

For example, Kai Wähner in Real-Time Stream Processing as Game Changer in a Big Data World with Hadoop and Data Warehouse gives the following as common use cases for “near real-time processing” of data:

  • Network monitoring
  • Intelligence and surveillance
  • Risk management
  • E-commerce
  • Fraud detection
  • Smart order routing
  • Transaction cost analysis
  • Pricing and analytics
  • Market data management
  • Algorithmic trading
  • Data warehouse augmentation

Ecommerce, smart order routing and algorithmic trading all fall into the no-human-in-the-loop category, so those may well need real-time processing.

But take network monitoring as an example. From the news reports I understand that hackers had free run of the Sony network for months. You have to have network monitoring at all before real-time network monitoring can be useful.

I would probe to make sure that “real-time” is actually necessary for the use cases at hand before simply assuming it. In smaller organizations, demands for access to data and “real-time” results are more often a symptom of control issues than of any actual use case for the data.

TSA Failure Rate: 95% on Guns and Explosives

Filed under: Government,Politics,Security — Patrick Durusau @ 1:09 pm

Airport Security Fails to Detect 95% of Fake Explosives, Weapons by Lauren Walker.

From the post:

As air travelers, we take off our shoes, remove our belts, have our bodies scanned and condense our liquids into mini bottles—and it all may be for naught.

An internal investigation of the Transportation Security Administration (TSA) found that undercover investigators were able to smuggle fake explosives and weapons through checkpoints in 95 percent of trials, which they conducted at dozens of America’s busiest airports. Officials did not disclose when the testing took place, other than to say it ended recently.

In the trials, Department of Homeland Security (DHS) investigators posed as ordinary passengers. They carried out 70 tests, all of which included trying to sneak a banned item through security. Officials briefed on the results told ABC News that TSA agents failed 67 of the 70 tests.

When I wrote You Are Safer Than You Think, I didn’t know the TSA was still a leaky sieve for guns and explosives.

So 95% of the over 10.3 billion passengers since 9/11 could have smuggled explosives onto airplanes but chose not to do so.

After over 10.3 billion passengers and not one terrorist caught, with a history of a 95% failure rate (the same result as in 2009, with no improvement since), and knowing that drug and arms smuggling continues despite the TSA, isn’t it time to save everyone the time and expense of the TSA?

Fiscal Year 2015 Budget for the TSA -> $7.3 billion
