Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 11, 2015

Legislating Cyber-Security

Filed under: Cybersecurity,Security — Patrick Durusau @ 6:57 pm

Germany passes strict cyber-security law to protect ‘critical infrastructure’

From the post:

In the wake of ever-increasing cyber-security threats, Germany has passed legislation ordering that over 2,000 essential service providers implement new minimum information security standards or face penalties if they fail to do so within two years.

The law passed its final hurdle in the upper house of the German parliament, the Bundesrat, on Friday after having passed the lower house in June.

The law will affect institutions listed as “critical infrastructure,” such as transportation, health, water utilities, telecommunications providers, as well as finance and insurance firms. It gives companies two years to introduce cyber security measures or face fines of up to €100,000 ($111,000).

The Bundesrat-approved IT security law obliges firms and federal agencies to certify for minimum cyber-security standards and obtain Federal Office of Information Security (BSI) clearance. The companies must also notify the Office of suspected cyber-attacks on their systems.

This is where Dogbert would say: “There are times when no snide comment seems adequate.”

July 10, 2015

SparkHub

Filed under: Spark — Patrick Durusau @ 4:16 pm

SparkHub: A Community Site for Apache Spark

The start of a community site for Apache Spark.

I say the “start of” because under videos you will find only Spark Summit 2015 and Spark Summit East 2015.

No promises on accuracy but searching for “Apache Spark” at YouTube results in 4,900 “hits” as of today (10 July 2015).

Make no mistake, Databricks (the force behind SparkHub) has a well deserved and formidable reputation when it comes to Spark.

The site was released only today but, even so, one would expect more diverse content than it currently offers.

Given the widespread interest in Spark (widespread, if mailing list traffic is any indication), a curated site that does more than list videos, articles and posts by title would be a real community asset.

Think of a Spark site that identifies issues by time marks in videos and links those snippets to bug/improvement issues and any discussion.

Make Spark discussions a seamless web as opposed to a landscape of dead-ends, infinitely deep potholes, and diamonds that, once discovered, are covered back up again.

It doesn’t have to be that way.

American Right To Be A Google Censor?

Filed under: Censorship,Search Engines — Patrick Durusau @ 9:12 am

The “right to be forgotten” intellectual confusion has spread from the EU to the United States. John Zorabedian reports in: Do Americans have the same right as Europeans to be “forgotten” by Google? that Consumer Watchdog has filed a complaint with the FTC, seeking the mis-named “right to be forgotten” for Americans.

The “right to be forgotten” is deeply problematic for many reasons, among which are:

  1. If enforced, the offending link is removed from Google’s search results, but the original and presumably offending source material persists. At best, the right is “a right to not be found in Google search results.”
  2. As “a right to not be found in Google search results,” it is a remarkably limited right, since it works only in jurisdictions that establish it.
  3. As “a right to not be found in Google search results,” it could lead to varying results as the rules for being “forgotten” vary from one jurisdiction to another.
  4. As “a right to not be found in Google search results,” if given extra-territorial reach, it would lead to world-wide censorship of Google search results. (The EU may be concerned with the sensitivities of Balkan war criminals but many outside the EU are not.)
  5. As “a right to not be found in Google search results,” it is on its face limited to Google, opening up the marketplace for sites that remember forgotten results and for plugins that supplement Google search results with them.
  6. As “a right to not be found in Google search results,” it imposes an administrative overhead on Google that is not imposed on other search engines, not to mention additional judicial proceedings if Google’s denial of a request leads to litigation to force removal.

At this point, the “complaint” of Consumer Watchdog is nothing more than a letter to the FTC. It appears nowhere in official FTC listings. Hopefully it will stay that way.

The European Bioinformatics Institute

Filed under: Bioinformatics — Patrick Durusau @ 8:19 am

The European Bioinformatics Institute

From the webpage:

EMBL-EBI provides freely available data from life science experiments, performs basic research in computational biology and offers an extensive user training programme, supporting researchers in academia and industry.

I have mentioned the European Bioinformatics Institute before in connection with specific projects but never in a stand-alone entry. The Institute is part of the European Molecular Biology Laboratory.

More from the webpage:

Our mission

  • To provide freely available data and bioinformatics services to all facets of the scientific community
  • To contribute to the advancement of biology through basic investigator-driven research
  • To provide advanced bioinformatics training to scientists at all levels
  • To help disseminate cutting-edge technologies to industry
  • To coordinate biological data provision throughout Europe

This should be high on your list of bioinformatics bookmarks!

July 9, 2015

Biomedical citation maker

Filed under: Bioinformatics — Patrick Durusau @ 3:29 pm

Biomedical citation maker

I encountered this yesterday while writing about PMID, PMCID and DOI identifiers.

From the webpage:

This page creates biomedical citations to use when writing on websites. In addition, the citation maker can optionally quote the conclusion of an article and help present numeric results from clinical studies that report diagnosis and treatment.

There is also a bookmarklet if you are interested.

Automatic formatting of citations is the only way to ensure quality citation practices. Variations in citations can cause others to miss the cited article or related materials.

I don’t regard correct citations as an excessive burden. Some authors do.
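As a toy illustration of why automated formatting helps, here is a sketch of a formatter for roughly NLM/PubMed-style citations, mirroring the citation shape used in PubMed (the function name and field set are my own, not the tool's):

```python
def nlm_citation(author, title, journal, year, volume, issue, pages, pmid):
    """Format a citation in roughly NLM/PubMed style (illustrative only)."""
    return (f"{author}. {title}. {journal}. {year};"
            f"{volume}({issue}):{pages}. PubMed PMID: {pmid}.")

# Rebuilds the Diehl example cited elsewhere on this blog.
print(nlm_citation("Diehl SJ",
                   "Incorporating health literacy into adult basic education: "
                   "from life skills to life saving",
                   "N C Med J", "2007 Sep-Oct", "68", "5", "336-9",
                   "18183754"))
```

A single template like this removes the punctuation and ordering variations that cause citations to be missed.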

Royal Albert Hall – Performance History & Archive

Filed under: History,Music — Patrick Durusau @ 3:19 pm

Royal Albert Hall – Performance History & Archive

From the webpage:

THE ROYAL ALBERT HALL’S HISTORY IS NOW AT YOUR FINGERTIPS!

Search our Performance Database to find out about your favourite artist or explore 30,000+ events from 1871 to last night.

Search the Archive to discover items in the Hall’s unique archive collections which chart the history of the building, organisation and events.

Another extraordinary resource from the UK. It is almost enough to make you forget that David Cameron is also a product of the UK.

Digital Bodleian

Filed under: History,Library — Patrick Durusau @ 2:20 pm

I know very little of what there is to be known about the Bodleian Library but as soon as I saw Digital Bodleian, I had to follow the link.

As of today, there are 115,179 images and more are on their way. Check the collections frequently and for new collections as well.

One example that is near and dear to me:

Exploring Egypt in the 19th Century

The popup reads:

A complete facsimile of publications from the early-nineteenth-century expeditions to Egypt by Champollion and Rosellini.

The growth of “big data” isn’t just from the production of new data but from the digitization of existing collections as well.

Now the issue is how to collate copies of inscriptions by Champollion in these works with much later materials, so that a scholar finding one such resource is automatically made aware of the others.

That may not sound like a difficult task but given the amount of material published every year, it remains a daunting one.

Have You Really Tried? (FBI to Encryption Experts Opposing Back Doors)

Filed under: Cybersecurity — Patrick Durusau @ 1:56 pm

The FBI director thinks tech experts who can’t comply with his impossible demands just aren’t trying hard enough by Rob Price.

In discussing the FBI’s request for back doors, FBI Director James Comey says:

… “a whole lot of good people have said it’s too hard … maybe that’s so … But my reaction to that is: I’m not sure they’ve really tried.”

If you are interested in knowing which experts and how hard they have tried, see Keys Under Doormats: Mandating insecurity by requiring government access to all data and communications by Harold Abelson, Ross Anderson, Steven M. Bellovin, Josh Benaloh, Matt Blaze, Whitfield Diffie, John Gilmore, Matthew Green, Susan Landau, Peter G. Neumann, Ronald L. Rivest, Jeffrey I. Schiller, Bruce Schneier, Michael Specter, and Daniel J. Weitzner.

I don’t normally list more than three (3) co-authors but this is an exceptional case. You will recognize a number of the names listed as co-authors.

The most troubling aspect of this story is Director Comey considering his dislike for the uniform answer from experts as sufficient grounds to assume it isn’t true.

That is true scientific illiteracy.

OpenSSL – Alternative chains certificate forgery

Filed under: Cybersecurity — Patrick Durusau @ 1:23 pm

Alternative chains certificate forgery (CVE-2015-1793)

From the post:

Severity: High

During certificate verification, OpenSSL (starting from version 1.0.1n and 1.0.2b) will attempt to find an alternative certificate chain if the first attempt to build such a chain fails. An error in the implementation of this logic can mean that an attacker could cause certain checks on untrusted certificates to be bypassed, such as the CA flag, enabling them to use a valid leaf certificate to act as a CA and “issue” an invalid certificate.

This issue will impact any application that verifies certificates including SSL/TLS/DTLS clients and SSL/TLS/DTLS servers using client authentication.

This issue affects OpenSSL versions 1.0.2c, 1.0.2b, 1.0.1n and 1.0.1o.

OpenSSL 1.0.2b/1.0.2c users should upgrade to 1.0.2d
OpenSSL 1.0.1n/1.0.1o users should upgrade to 1.0.1p

This issue was reported to OpenSSL on 24th June 2015 by Adam Langley/David Benjamin (Google/BoringSSL). The fix was developed by the BoringSSL project.

This is to close the loop on: Mystery Patch for OpenSSL.
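For readers who want to check whether their Python runtime is linked against one of the affected OpenSSL builds, here is a small sketch. The parsing assumes the usual "OpenSSL x.y.zs date" banner exposed by the standard library's `ssl.OPENSSL_VERSION`:

```python
import ssl

# Versions named in the advisory as affected by CVE-2015-1793.
AFFECTED = {"1.0.1n", "1.0.1o", "1.0.2b", "1.0.2c"}

def parse_version(banner):
    """Extract 'x.y.zs' from a banner like 'OpenSSL 1.0.2d 9 Jul 2015'."""
    parts = banner.split()
    if len(parts) >= 2 and parts[0] == "OpenSSL":
        return parts[1]
    return None  # e.g. LibreSSL or an unexpected banner

version = parse_version(ssl.OPENSSL_VERSION)
if version in AFFECTED:
    print("OpenSSL %s: affected by CVE-2015-1793, upgrade" % version)
else:
    print("OpenSSL %s: not in the advisory's affected list" % version)
```

This only inspects the library Python was built against; other applications on the same host may link a different OpenSSL.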

How Do You Define Cyber Attack?

Filed under: Cybersecurity,Security — Patrick Durusau @ 12:40 pm

I ask because Teri Robinson in Cyber attack on U.S. power grid could rack up $1 trillion in losses, study says, cites:

“Business Blackout: the insurance implications of a cyber attack on the U.S. power grid,” a study (PDF) from the Centre for Risk Studies at Cambridge University and insurer Lloyd’s of London, found that such an attack would have an impact on multiple types of insurance.

A very stylish report, complete with black pages trimmed in red rather than blank pages. It evaluates a fanciful scenario against the US power grid, with no basis offered for evaluating the feasibility of the attack as described.

I asked about how you define “cyber attack” because Appendix A lists the obligatory “Cyber attacks against Industrial Control Systems since 1999.” (starts on page 45)

If you review that list carefully, out of an alleged fourteen (14) “cyber attacks,” four (4) of them were physical attacks on infrastructure, eight (8) of them, including insiders, were true cyber attacks and the other two (2) were misc.

Eight (8) cyber attacks against control systems shouldn’t peg the threat meters for anyone.

But to return to Teri’s article for a moment, she closes with these observations:

And a risk that has American voters worried. A Morning Consult poll found that 32 percent of voters consider cyber attacks a major threat, putting them just behind terrorism, which, at 36 percent, was the top threat.

Among GOP voters, terrorism garnered 45 percent of the vote with cyber attacks getting just 25 percent. Democrats believe cyber attacks (38 percent) to be the bigger threat, while terrorism claimed 31 percent of the vote.

The FBI, DHS, along with the national news media have certainly done well at selling cyber and terrorist attacks as clear and present dangers.

All the more amazing due to the lack of any terrorist attacks worthy of mention since 9/11 and the complete lack of cyber attacks against control systems in public utilities.

Remember, no electricity means no Beavis and Butt-Head. Be careful what you wish for.

PS: Don’t waste time and money on cyber nightmare scenarios against the U.S. power grid. Yes, they need good cyber security like everyone else, but they have far worse issues of physical security. Congress has published maps of the critical infrastructures and details on non-cybersecurity issues. Check with your local government documents librarian.

July 8, 2015

Here’s Why Elon Musk Is Wrong About AI

Filed under: Artificial Intelligence,Machine Learning — Patrick Durusau @ 4:46 pm

Here’s Why Elon Musk Is Wrong About AI by Chris V. Nicholson.

From the post:

Nothing against Elon Musk, but the campaign he’s leading against AI is an unfortunate distraction from the true existential threats to humanity: global warming and nuclear proliferation.

Last year was the hottest year on record. We humans as a whole are just a bunch of frogs in a planet-sized pot of boiling water. We’re cooking ourselves with coal and petroleum, pumping carbon dioxide into the air. Smart robots should be the least of our worries.

Pouring money into AI ethics research is the wrong battle to pick because a) it can’t be won, b) it shouldn’t be fought, and c) to survive, humans must focus on other, much more urgent, issues. In the race to destroy humanity, other threats are much better bets than AI.

Not that I disagree with Nicholson, there are much more important issues to worry about than rogue AI, but that overlooks one critical aspect of the argument by Musk.

Musk has said to the world that he’s worried about AI and, more importantly, he has $7 Million+ for anyone who worries about it with him.

Your choices are:

  1. Ignore Musk because building an artificial intelligence when we don’t understand human intelligence seems too remote to be plausible, or
  2. Agree with Musk and if you are in a research group, take a chance on a part of $7 Million in grants.

I am firmly in the #1 camp because I have better things to do with my time than attending UFO-type meetings. Unfortunately, there are a lot of people in the #2 camp. It just depends on how much money is being offered.

There are any number of research projects that legitimately push the boundaries of knowledge. Unfortunately the government and others also fund projects that are wealth re-distribution programs for universities, hotels, transportation, meeting facilities and the like.

PS: There is a lot of value in the programs being explored under the misnomer of “artificial intelligence.” I don’t have an alternative moniker to suggest but it needs one.

Pentagon Contractors Rank Below Retailers and Banks…

Filed under: Government,Politics — Patrick Durusau @ 4:18 pm

Pentagon Contractors Rank Below Retailers and Banks When It Comes To Cybersecurity by Aliya Sternstein.

My attention was drawn to this story because of a tweet by Eric Clay that reads:

As hard as it is to believe, #Pentagon contractors rank below retailers and banks when it comes to #cybersecurity.

I don’t find that hard to believe at all. Do you?

Just last week we were reading about VMware/Carahsoft ponying up $75.5 Million for fraudulent billing of the government, and they remain a government contractor.

Oh to be a government contractor! You can deliver planes that burst into flames at random (F-35) and the US government will help you foist them off on unsuspecting “allies,” you can fraudulently bill the government and remain a government contractor, and you can even fail catastrophically, think Virtual Case File, but still remain a federal contractor.

There are thousands of stories just like the ones I pointed out, some larger, some smaller, but it is a pattern of non-accountability that has been in place for decades.

How about, as a start towards accountability for government contractors, liability of the contractors, their shareholders and principal officers for failure to perform: a liability that is non-dischargeable in bankruptcy and that can be satisfied out of retirement accounts and investments, save for Social Security?

Unless and until the government stops being a large cookie jar, “no one will notice if I take just one,” for its contractors, don’t expect the quality of work in cybersecurity or elsewhere to improve. Quality of work is not a value in the present contracting system.

PS: I am sure there are contractors and individual people who work for contractors who do very high quality work. The problem is they are the anomalies and not the rule.

TinkerPop3

Filed under: Graphs,TinkerPop,Titan — Patrick Durusau @ 3:35 pm

TinkerPop3: Taking graph databases and graph analytics to the next level by Matthias Broecheler.

Abstract:

Apache TinkerPop is an open source graph computing framework which includes the graph traversal language Gremlin and a number of graph utilities that speed up the development of graph based applications. Apache TinkerPop provides an abstraction layer on top of popular graph databases like Titan, OrientDB, and Neo4j as well as scalable computation frameworks like Hadoop and Spark allowing developers to build graph applications that run on multiple platforms avoiding vendor lock-in.

This talk gives an overview of the new features introduced in TinkerPop3 with a deep-dive into query language design, query optimization, and the convergence of OLTP and OLAP in graph processing. A demonstration of TinkerPop3 with the scalable Titan graph database illustrates how these concepts work in practice.

It’s not all the information you will need about TinkerPop3 but should be enough to get you interested in learning more, a lot more.

I had a conversation recently on how to process topic maps with graphs, at least if you were willing to abandon the side-effects detailed in the Topic Maps Data Model (TMDM). More on that to follow.
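As a rough illustration of what a Gremlin-style chained traversal does, here is a toy Python analogue over an adjacency dict. This is a sketch of the step-chaining idea only, not the actual TinkerPop API (real Gremlin runs against TinkerPop-enabled stores like Titan or Neo4j):

```python
class Traversal:
    """Toy Gremlin-style chained traversal over an adjacency dict."""

    def __init__(self, graph, frontier):
        self.graph = graph
        self.frontier = list(frontier)

    def out(self):
        # Step to every vertex reachable by one outgoing edge,
        # returning a new traversal so steps can be chained.
        nxt = [w for v in self.frontier for w in self.graph.get(v, [])]
        return Traversal(self.graph, nxt)

    def values(self):
        return self.frontier

graph = {"a": ["b", "c"], "b": ["c"], "c": []}
t = Traversal(graph, ["a"])
print(t.out().out().values())  # ['c'], via a -> b -> c
```

Each step produces a new traversal over a new frontier, which is the shape Gremlin queries such as `g.V().out().out()` take.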

The Nation has a new publishing model

Filed under: Publishing — Patrick Durusau @ 2:56 pm

Introducing the New TheNation.com by Richard Kim.

From the post:

…on July 6, 2015—exactly 150 years after the publication of our first issue—we’re relaunching TheNation.com. The new site, created in partnership with our friends at Blue State Digital and Diaspark, represents our commitment to being at the forefront of independent journalism for the next generation. The article page is designed with the Nation ambassador in mind: Beautiful, clear fonts (Mercury and Knockout) and a variety of image fields make the articles a joy to read—on desktop, tablet, and mobile. Prominent share tools, Twitter quotes, and a “highlight to e-mail/tweet” function make it easy to share them with others. A robust new taxonomy and a continuous scroll seamlessly connect readers to related content. You’ll also see color-coded touts that let readers take action on a particular issue, or donate and subscribe to The Nation.

I’m not overly fond of paywalls as you know but one part of the relaunch merits closer study. Comments on articles are going to be open to subscribers only.

It will be interesting to learn what The Nation’s experience with subscriber-only comments turns out to be. Hopefully their tracking will be granular enough to determine what portion of subscribers signed up simply so they could comment.

There are any number of fields where opinions run hot enough that, even with open content, paying to have comments displayed could be a viable publication model.

Imagine a publicly accessible topic map on the candidates for the US presidential election next year. If it had sufficient visibility, the publication of any report would spawn automatic responses from others. Responses that would not appear without paying for access to publish the comment.

Viable economic model?

Suggestions?

The Mote in Your Neighbor’s Eye

Filed under: Journalism,News,Reporting — Patrick Durusau @ 2:42 pm

The Washington Post featured this headline recently: Russia is seeing conspiracies in Armenia where none exist.

So, given the past decade of false terror warnings from the FBI and the Department of Homeland Security, as reported by Adam Johnson in FBI and Media Still Addicted to Ginning Up Terrorist Hysteria – But They Have Never Been Right, why hasn’t the Washington Post run the headline:

USA is seeing terrorists where none exist

How does that go? Maybe the Washington Post should take the plank out of its eye so it can see more clearly the world around it.

PS: Will there be a terrorist attack in the United States someday? Sure, but warning about one for hundreds if not thousands of days in between demonstrates a lack of good judgment.

PMID-PMCID-DOI Mappings (monthly update)

Filed under: Bioinformatics,Medical Informatics,PubMed — Patrick Durusau @ 2:22 pm

PMID-PMCID-DOI Mappings (monthly update)

Dario Taraborelli tweets:

All PMID-PMCID-DOI mappings known by @EuropePMC_news, refreshed monthly ftp://ftp.ebi.ac.uk/pub/databases/pmc/DOI/

The file is listed at 150MB but be aware that it decompresses to 909MB+, approximately 25.6 million lines.

In case you are unfamiliar with PMID/PMCID:

PMID and PMCID are not the same thing.

PMID is the unique identifier number used in PubMed. They are assigned to each article record when it enters the PubMed system, so an in press publication will not have one unless it is issued as an electronic pre-pub. The PMID# is always found at the end of a PubMed citation.

Example of PMID#: Diehl SJ. Incorporating health literacy into adult basic education: from life skills to life saving. N C Med J. 2007 Sep-Oct;68(5):336-9. Review. PubMed PMID: 18183754.

PMCID is the unique identifier number used in PubMed Central. People are usually looking for this number in order to comply with the NIH Public Access Regulations. We have a webpage that gathers information to guide compliance. You can find it here: http://guides.hsl.unc.edu/NIHPublicAccess (broken link) [updated link: https://publicaccess.nih.gov/policy.htm]

A PMCID# is assigned after an author manuscript is deposited into PubMed Central. Some journals will deposit for you. Is this your publication? What is the journal?

PMCID#s can be found at the bottom of an article citation in PubMed, but only for articles that have been deposited in PubMed Central.

Example of a PMCID#: Ishikawa H, Kiuchi T. Health literacy and health communication. Biopsychosoc Med. 2010 Nov 5;4:18. PubMed PMID: 21054840; PubMed Central PMCID: PMC2990724.

From: how do I find the PMID (is that the same as the PMCID?) for in press publications?

If I were converting this into a topic map, I would use the PMID, PMCID, and DOI entries as subject identifiers. (PMIDs and PMCIDs can be expressed as hrefs.)
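If you want to work with the mappings programmatically, something along these lines should do; note that the sample rows, column names, and layout below are illustrative, so check the header of the actual file from the FTP site before relying on them:

```python
import csv
import io

# Hypothetical sample in the shape of a PMID-PMCID-DOI mapping file.
# Empty fields mean no PMCID/DOI is known for that PMID.
sample = """PMID,PMCID,DOI
18183754,,
21054840,PMC2990724,10.1186/1751-0759-4-18
"""

def load_mappings(fileobj):
    """Map each PMID to its PMCID and DOI (empty string when absent)."""
    out = {}
    for row in csv.DictReader(fileobj):
        out[row["PMID"]] = {"PMCID": row["PMCID"], "DOI": row["DOI"]}
    return out

def pmid_href(pmid):
    """Express a PMID as an href, usable as a subject identifier."""
    return "http://www.ncbi.nlm.nih.gov/pubmed/%s" % pmid

mappings = load_mappings(io.StringIO(sample))
print(mappings["21054840"]["PMCID"])   # PMC2990724
print(pmid_href("18183754"))
```

For the real 909MB+ file you would stream it line by line rather than build the whole dict in memory.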

#HackingTeam – Data Dump Mirrors

Filed under: Cybersecurity,Security — Patrick Durusau @ 10:44 am

Although most of my readers probably have the original torrent, I thought a list of mirrors for the #HackingTeam data dump could be useful:

I have personally verified that all the mirrors exist and that they list what appears to be the hacked content. The accuracy of these mirrors and your safety in examining their contents remain solely your affair.

Full mirror of torrent

https://aedv23ynnl27e7vt.onion.cab/ || http://aedv23ynnl27e7vt.onion/

One site, http://ht.musalbas.com, has already received a DMCA complaint (reported on Twitter by Zach Whittaker).

As a public service to the broader community, please mirror these archives in whole or in part. If you can’t host a mirror, reach out to the mirroring sites to express your interest and support for their efforts.

Do you think this hack may be an effort to push the OPM hack to the back pages of the news? 😉 No, I’m not that paranoid nor do I think the US government is that well organized.

Enjoy!

PS: If you do index/analyze part of the data dump, be sure to post the results and munged data to public repositories.

Avoid Password Embarrassment

Filed under: Cybersecurity,Security — Patrick Durusau @ 10:18 am

Silkie Carlo posted this image on Twitter as useful for a “how to make a password” discussion:

password

You only have two (2) options to avoid password embarrassment:

  1. Never get hacked. (the worst strategy)
  2. Use strong passwords along with a routine of changing them.

If you need advice on what makes a strong password, see the FAQ for cryptsetup.
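If you would rather generate strong passwords than invent them, Python's secrets module (added in 3.6) provides a CSPRNG suitable for the job; a minimal sketch:

```python
import secrets
import string

def strong_password(length=20):
    """Random password drawn from letters, digits and punctuation,
    using the OS cryptographic random source via the secrets module."""
    alphabet = string.ascii_letters + string.digits + string.punctuation
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(strong_password())
```

Pair this with a password manager; randomly generated passwords are only practical if you don't have to memorize them.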

If your own cybersecurity isn’t enough of a motivation for using strong passwords, do you want your name, along with a weak password to come up for years in discussions of weak passwords?

It is a form of fame but I would prefer to avoid the honor.

You?

PS: Embarrassment is perhaps the only known downside to having a weak password, for a user. “Privileged users” had weak passwords at OPM. Ditto for Sony. Now at Hacking Team. Have I missed reports of punitive dismissals?

The theory seems to be that everyone is stupid and therefore individuals should not be penalized for being stupid in particular instances. It may be true that everyone is stupid about some things, but the parameters for strong passwords are known. Stupidity should not be tolerated for problems with known solutions.

Mystery Patch for OpenSSL

Filed under: Cybersecurity,Security — Patrick Durusau @ 9:50 am

Get ready. Mystery high severity bug in OpenSSL to be patched on Thursday by Graham Cluley.

Graham passes on the news that a security fix for a “high” severity bug will be patched in OpenSSL this coming Thursday (9 July 2015).

He also supports the lack of information about the nature of the bug, to avoid giving hackers a chance to exploit it.

When the patch is released, assuming the reaction is similar to Heartbleed, there will still be sites three weeks later that are not patched.

Speculations on the latest “high” severity bug in OpenSSL?

July 7, 2015

Google Study: Most Security Questions Easy To Hack [+ security insight about Google]

Filed under: Cybersecurity,Security — Patrick Durusau @ 3:58 pm

Google Study: Most Security Questions Easy To Hack by Shirley Siluk.

From the post:

There’s a big problem with the security questions often used to help people log into Web sites, or remember or access lost passwords — questions with answers that are easy to remember are also easy for hackers to guess. That’s the key finding of a study that Google recently presented at the International World Wide Web Conference in Florence, Italy.

Google said it analyzed hundreds of millions of secret questions and answers that users had employed to recover access to their accounts. It then calculated how easily hackers could guess the answers to those questions.

In many cases, the answers were relatively easy to hit upon because of unique cultural factors, according to the study. For English speakers, for example, hackers had a 19.7 percent chance of guessing — in just one guess — the right answer to the question, “What is your favorite food?” (Answer: pizza.)

‘Neither Secure nor Reliable’

Google undertook the study because, “despite the prevalence of security questions, their safety and effectiveness have rarely been studied in depth,” noted Anti-Abuse Research Lead Elie Bursztein and Software Engineer Ilan Caron. The conclusion reached after looking at all those millions of questions and answers? “(S)ecret questions are neither secure nor reliable enough to be used as a standalone account recovery mechanism,” Bursztein and Caron said Thursday in a post on Google’s Online Security Blog.

Shirley goes on to give examples of how the answers to some security questions are culturally determined but also quotes suggestions for making your answers to secret questions more secure.

What is the one insight into Google security that you can draw from this article?

Google stored the answers to secret questions as clear text.

Yes?

Otherwise, how did they develop the statistics about secret answer usage?
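To see where a figure like the 19.7 percent single-guess success rate comes from, here is a toy calculation (the counts are invented, not Google's data): with one guess, an attacker's best strategy is the most common answer, so the success rate is simply that answer's share of the population.

```python
from collections import Counter

# Toy answer distribution for "What is your favorite food?":
# 1,000 users, with a long tail of 603 distinct one-off answers.
answers = (["pizza"] * 197 + ["pasta"] * 120 + ["sushi"] * 80
           + ["answer%d" % i for i in range(603)])

counts = Counter(answers)
best_guess, hits = counts.most_common(1)[0]
rate = hits / len(answers)
print(best_guess, rate)  # pizza 0.197
```

Note that this kind of statistic only needs answer frequencies, which is exactly why the question of how Google stored the answers matters.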

Another answer isn’t clear from: Secrets, Lies, and Account Recovery: Lessons from the Use of Personal Knowledge Questions at Google by Joseph Bonneau, Elie Bursztein, Ilan Caron, Rob Jackson, and Mike Williamson.

Abstract:

We examine the first large real-world data set on personal knowledge questions’ security and memorability from their deployment at Google. Our analysis confirms that secret questions generally offer a security level that is far lower than user-chosen passwords. It turns out to be even lower than proxies such as the real distribution of surnames in the population would indicate. Surprisingly, we found that a significant cause of this insecurity is that users often don’t answer truthfully. A user survey we conducted revealed that a significant fraction of users (37%) who admitted to providing fake answers did so in an attempt to make them “harder to guess” although on aggregate this behavior had the opposite effect as people “harden” their answers in a predictable way.

On the usability side, we show that secret answers have surprisingly poor memorability despite the assumption that reliability motivates their continued deployment. From millions of account recovery attempts we observed a significant fraction of users (e.g., 40% of our English-speaking US users) were unable to recall their answers when needed. This is lower than the success rate of alternative recovery mechanisms such as SMS reset codes (over 80%).

Comparing question strength and memorability reveals that the questions that are potentially the most secure (e.g., what is your first phone number) are also the ones with the worst memorability. We conclude that it appears next to impossible to find secret questions that are both secure and memorable. Secret questions continue to have some use when combined with other signals, but they should not be used alone, and best practice should favor more reliable alternatives.

Google has moved on to more secure methods for account recovery but the existence of the secret answer data, even from 2013, remains a danger for some users on the Internet.

Ancient [?] Craft of Information Visualization

Filed under: History,Mapping,Maps — Patrick Durusau @ 2:35 pm

Vintage Infodesign [125]: More examples of the ancient craft of information visualization by Tiago Veloso.

From the post:

To open this week’s edition of Vintage InfoDesign, we picked some of the maps published in the 1800s/early 1900s about the Battle of Waterloo. As we showed you before, on June 18th several newspapers marked the 200th anniversary of Napoleon’s final attempt to rule Europe with stunning pieces of infographic design, and since we haven’t featured any “oldies” related to this topic, we thought it would be interesting to do some Internet “digging”.

Hope you enjoy our findings, and feel free to leave the links to other charts and maps about Waterloo, in the comments section.

I’m not entirely comfortable with using the term “ancient” to describe maps depicting the Battle of Waterloo. I think of the fall of the last native Egyptian dynasty, in about 343 BCE, as marking the close of “ancient” history.

What Lies Beneath: A Deep Dive into Clojure’s data structures

Filed under: Clojure,Functional Programming,Programming — Patrick Durusau @ 2:03 pm

What Lies Beneath: A Deep Dive into Clojure’s data structures by Mohit Thatte. (slides)

From the description:

Immutable, persistent data structures are at the heart of Clojure’s philosophy. It is instructive to see how these are implemented, to appreciate the trade-offs between persistence and performance. Lets explore the key ideas that led to effective, practical implementations of these data structures. There will be animations that should help clarify key concepts!

Video: Running time a little over thirty-five minutes (don’t leave for coffee).

Closes with a great reading list.

You may also want to review: Purely functional data structures demystified (slides).

Description:

I spoke about ‘Purely Functional Data structures’ at Functional Conference 2014 in Bangalore. These are my slides with an extra section on further study.

This talk is based on Chris Okasaki’s book, Purely Functional Data Structures. The gist is that immutable and persistent data structures can be designed without sacrificing performance.

Computer trivia question from the first video: Why are the colors red and black used for red-black trees?
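The core trick behind Clojure’s persistent structures, structural sharing, can be sketched in a few lines of Python. This is a toy illustration only, not Clojure’s actual implementation (Clojure’s vectors and maps use 32-way trees/HAMTs), but the principle is the same: an “update” returns a new structure that reuses as much of the old one as possible.

```python
# Toy persistent (immutable) singly-linked list with structural sharing.
class Cons:
    __slots__ = ("head", "tail")

    def __init__(self, head, tail=None):
        self.head = head
        self.tail = tail  # None marks the empty list

def conj(lst, x):
    """Return a *new* list with x prepended; lst itself is untouched."""
    return Cons(x, lst)

def to_list(lst):
    """Walk the linked list into an ordinary Python list for display."""
    out = []
    while lst is not None:
        out.append(lst.head)
        lst = lst.tail
    return out

base = conj(conj(None, 2), 1)   # the list (1 2)
extended = conj(base, 0)        # the list (0 1 2)

print(to_list(base))            # [1, 2] -- unchanged by the "update"
print(to_list(extended))        # [0, 1, 2]
print(extended.tail is base)    # True: the old list is shared, not copied
```

The last line is the point of the trade-off the talk discusses: persistence costs almost nothing here because nothing is copied, only shared.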

Today’s Special on Universal Languages

Filed under: Language,Logic — Patrick Durusau @ 1:16 pm

I have often wondered about the fate of the Loglan project, but never seriously enough to track down any potential successor.

Today I encountered a link to Lojban, which is described by Wikipedia as follows:

Lojban (pronounced [ˈloʒban]) is a constructed, syntactically unambiguous human language based on predicate logic, succeeding the Loglan project. The name “Lojban” is a compound formed from loj and ban, which are short forms of logji (logic) and bangu (language).

The Logical Language Group (LLG) began developing Lojban in 1987. The LLG sought to realize Loglan’s purposes, and further improve the language by making it more usable and freely available (as indicated by its official full English title, “Lojban: A Realization of Loglan”). After a long initial period of debating and testing, the baseline was completed in 1997, and published as The Complete Lojban Language. In an interview in 2010 with the New York Times, Arika Okrent, the author of In the Land of Invented Languages, stated: “The constructed language with the most complete grammar is probably Lojban—a language created to reflect the principles of logic.”

Lojban was developed to be a worldlang; to ensure that the gismu (root words) of the language sound familiar to people from diverse linguistic backgrounds, they were based on the six most widely spoken languages as of 1987—Mandarin, English, Hindi, Spanish, Russian, and Arabic. Lojban has also taken components from other constructed languages, notably the set of evidential indicators from Láadan.

I mention this just in case someone proposes to you that a universal language would increase communication and decrease ambiguity, resulting in better, more accurate communication in all fields.

Yes, yes it would. And several already exist. Including Lojban. Their language can take its place alongside other universal languages, i.e., it can increase the number of languages that make up the present matrix of semantic confusion.

In case it isn’t obvious: what part of “new languages increase the potential for semantic confusion” seems unclear?

Google search poisoning – old dogs learn new tricks

Filed under: Search Analytics,Search Engines,Searching — Patrick Durusau @ 12:24 pm

Google search poisoning – old dogs learn new tricks by Dmitry Samosseiko.

From the post:

These days, every company knows that having its website appear at the top of Google’s results for relevant keyword searches makes a big difference in traffic and helps the business. Numerous search engine optimization (SEO) techniques have existed for years and provided marketers with ways to climb up the PageRank ladder.

In a nutshell, to be popular with Google, your website has to provide content relevant to specific search keywords and also to be linked to by a high number of reputable and relevant sites. (These act as recommendations, and are rather confusingly known as “back links,” even though it’s not your site that is doing the linking.)

Google’s algorithms are much more complex than this simple description, but most of the optimization techniques still revolve around those two goals. Many of the optimization techniques that are being used are legitimate, ethical and approved by Google and other search providers. But there are also other, and at times more effective, tricks that rely on various forms of internet abuse, with attempts to fool Google’s algorithms through forgery, spam and even hacking.

One of the techniques used to mislead Google’s page indexer is known as cloaking. A few days ago, we identified what we believe is a new type of cloaking that appears to work very well in bypassing Google’s defense algorithms.
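Classic cloaking is simple to describe: serve one page to the crawler and another to everyone else. Here is a minimal, hypothetical sketch of the traditional user-agent/referrer variant (the function names and page strings are illustrative; the technique Dmitry describes is a newer, more evasive twist on this idea):

```python
# Hypothetical sketch of user-agent cloaking: the server decides which
# page to return based on who appears to be asking. Real cloakers also
# key on crawler IP ranges, cookies, JavaScript capabilities, etc.

CRAWLER_MARKERS = ("googlebot", "bingbot")

def serve_page(user_agent: str, referer: str = "") -> str:
    ua = user_agent.lower()
    if any(marker in ua for marker in CRAWLER_MARKERS):
        # The indexer sees keyword-rich, "legitimate" content.
        return "optimized-content-for-indexing"
    if "google." in referer:
        # A visitor arriving from search results gets the payload.
        return "spam-or-malware-landing-page"
    # Direct visitors (e.g., reviewers checking a reported URL) see a
    # harmless page, which is what makes cloaking hard to catch.
    return "innocuous-page"

print(serve_page("Mozilla/5.0 (compatible; Googlebot/2.1)"))
print(serve_page("Mozilla/5.0", referer="https://www.google.com/"))
print(serve_page("Mozilla/5.0"))
```

The three-way split is why reports like Dmitry’s matter: neither the indexer nor a casual checker ever sees what a search-referred victim sees.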

Dmitry reports that Google was notified of this new form of cloaking, so it may not work for much longer.

I first read about this in Notes from SophosLabs: Poisoning Google search results and getting away with it by Paul Ducklin.

I’m not sure I would characterize this as “poisoning Google search.” Altering a Google search result, to be sure, but “poisoning” implies that standard Google search results represent some “standard” of search results. Google search results are the outcome of undisclosed algorithms run on undisclosed content, subject to undisclosed processing of the scores from processing content with algorithms, and output with more undisclosed processing of the results.

Just putting it into large containers, I see four large boxes of undisclosed algorithms and content, all of which impact the results presented as Google Search results. Are Google Search results the standard output from four or more undisclosed processing steps of unknown complexity?

That doesn’t sound like much of a standard to me.

You?

July 6, 2015

Which Functor Do You Mean?

Filed under: Homonymous,Names,Subject Identity — Patrick Durusau @ 8:34 pm

Peteris Krumins calls attention to the classic confusion of names that topic maps address in On Functors.

From the post:

It’s interesting how the term “functor” means completely different things in various programming languages. Take C++ for example. Everyone who has mastered C++ knows that you call a class that implements operator() a functor. Now take Standard ML. In ML functors are mappings from structures to structures. Now Haskell. In Haskell functors are just homomorphisms over containers. And in Prolog functor means the atom at the start of a structure. They all are different. Let’s take a closer look at each one.

Peteris has said twice in the first paragraph that each of these “functors” is different. Don’t rush to his 2010 post to point out they are different. That was the point of the post. Yes?

Exercise: All of these uses of functor could be scoped by language. What properties of each “functor” would you use to distinguish them beside their language of origin?

Twelve Tips for Getting Started With Data Journalism

Filed under: Journalism,News,Reporting — Patrick Durusau @ 8:24 pm

Twelve Tips for Getting Started With Data Journalism by Nils Mulvad and Helena Bengtsson.

No mention of Python or R, no instructions for NoSQL or SQL databases, no data cleaning exercises, and yet probably the best advice you will find for data journalism (or data science for that matter).

The essential insight of these twelve tips is that the meaning of the data, which implies answering “why does this matter?”, is the task of data journalism/science.

Anyone with sufficient help can generate graphs, produce charts, apply statistical techniques to data sets, but if it is all just technique, no one is going to care.

The twelve tips offered here are good for a daily read with your morning coffee!

Highly recommended!

Hacking Team Customers

Filed under: Cybersecurity,Security — Patrick Durusau @ 3:49 pm

The recent Hacking Team hack generated a rash of self-righteous tweets about the company’s sales to “repressive” governments.

Before you get overly excited about the sins of the Hacking Team, consider this graphic of arms sales by the United States and Russia:

[Graphic: US vs. Russia arms sales comparison]

From US/Russia Arms Sales Race by Allan Smith and Skye Gould.

Buying arms is a good indication of the intent to repress someone, so I don’t see many places that don’t have repressive governments.

Speaking of repression, this is the best visualization I have seen to date of the Greek debt crisis:

http://demonocracy.info/infographics/eu/debt_greek/debt_greek.html

It’s a very large visualization so I won’t attempt to replicate it here.

The German government is trying to repress the Greek people for the foreseeable future in order to collect on its debt. Think of it as international loan sharking. If you don’t pay, we will break your legs. Or in this particular case, austerity measures that will blight the lives of millions. Sounds repressive to me.

We can debate repression one way or the other but the important resource for US citizens is: Office of Foreign Assets Control – Sanctions Programs and Information. Sanctions programs, well, carry sanctions for violating their terms. On you.

I have serious questions about the sanctions list both in terms of who is included and who is not. However, unless you have a large appetite for risk, you had best follow its guidance (or your government’s similar list).

Our Uncritical National Media

Filed under: Journalism,News,Reporting — Patrick Durusau @ 2:38 pm

FBI and Media Still Addicted to Ginning Up Terrorist Hysteria – But They Have Never Been Right by Adam Johnson is a stunning indictment of our national media as “uncritical” of government terrorist warnings.

I say “uncritical” because despite forty (40) false terrorist warnings in a row, there has been no, repeat no, terrorist attack in the United States related to those warnings. Not one.

The national media, say the New York Times of my youth, would have “broken” the news of a terrorist warning, but then it would have sought information to verify that warning. That is: why is the government issuing a warning today and not yesterday, or next week?

Failing to find such evidence, which it would have in the past forty (40) cases, it would have pressed, investigated and mocked the government until its thin tissue of lies was plain for all to see.

How many times does a government source have to misrepresent facts before your report starts with:

Just in from the habitual liars at the Department of Homeland Security…

and includes a back story on how the Department of Homeland Security has never been right on one of its warnings, nor has its Transportation Safety Administration (TSA) ever caught a terrorist.

Instead, as Adam reports, this is what we get:

On Monday, several mainstream media outlets repeated the latest press release by the FBI that the country was under a new “heightened terror alert” from “ISIL-inspired attacks” “leading up to the July 4th weekend.” One of the more sensational outlets, CNN, led with the breathless warning on several of its cable programs, complete with a special report by The Lead’s Jim Sciutto in primetime:

The threat was given extra credence when former CIA director—and consultant at DC PR firm Beacon Global Strategies—Michael Morell went on CBS This Morning (6/29/15) and scared the ever-living bejesus out of everyone by saying he “wouldn’t be surprised if we were sitting [in the studio] next week discussing an attack on the US.” The first piece of evidence Morell used to justify his apocalyptic posture, the “50 ISIS arrests,” was accompanied by a scary map on the CBS jumbotron showing “ISIS arrests” all throughout the US:

But one key detail is missing from this graphic: None of these “ISIS arrests” involved any actual members of ISIS, only members of the FBI—and their network of informants—posing as such. (The one exception being the man arrested in Arizona, who, while having no contact with ISIS, was also not prompted by the FBI.) So even if one thinks the threat of “lone wolf” attacks is a serious one, it cannot be said these are really “ISIS arrests.” Perhaps on some meta-level, it shows an increase of “radicalization,” but it’s impossible to distinguish between this and simply more aggressive sting operations by the FBI.

I would think that competent, enterprising reporters could have ferreted out all the material that Adam mentions in his post. They could have made the case for the groundless nature of the 4th of July security warning.

But no member of the national media did.

In the aftermath of yet another bogus terror warning, the national media should say why it dons pom-poms to promote every terror alert from the FBI or DHS, instead of serving the public’s interest with critical investigation of alleged terror threats.

July 5, 2015

New Android Malware Sample Found Every 18 Seconds

Filed under: Cybersecurity,Security — Patrick Durusau @ 10:42 am

More than 440K new Android malware strains found in Q1, study finds by Terri Robinson.

From the post:

More than 440,000 new strains of Android malware were discovered by security experts at G DATA analyzing data for the first quarter of 2015.

That the company’s Q1 2015 Mobile Malware Report found so many strains of malware, representing a 6.4 percent jump from the quarter before, is not surprising, considering half of U.S. consumers use a smartphone or tablet to do their banking and 78 percent of those on the Internet make purchases online, giving cybercriminals a large pool of potential victims as well as the opportunity for significant financial gain.

“Mobile banking has become a very profitable target of opportunity,” Andy Hayter, security evangelist at G DATA, told SCMagazine.com in an email correspondence. “With mobile banking applications being new, bad guys are taking advantage, and targeting these apps since the majority of those using them are unaware that you should protect your mobile device from malware.”

The uptick represents 4,900 new Android malware files each day of the quarter, up 400 files daily from those recorded in the second half of 2014. About 200 new malware samples were identified every hour, meaning that a new malware sample was discovered every 18 seconds.
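The quoted rates are easy to sanity-check (assuming a 90-day quarter):

```python
# Sanity-check the reported Q1 2015 malware rates: 440,000+ new strains.
new_strains = 440_000
days_in_quarter = 90

per_day = new_strains / days_in_quarter       # new samples per day
per_hour = per_day / 24                       # new samples per hour
seconds_between = 24 * 3600 / per_day         # seconds between samples

print(round(per_day))          # ~4889, matching the "4,900 per day" figure
print(round(per_hour))         # ~204, i.e. roughly 200 new samples an hour
print(round(seconds_between))  # ~18 seconds between new samples
```

Note that the per-hour rate, not a per-day rate, is what yields one sample every 18 seconds.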

You know the problem. Apps want to work across multiple versions of Android and with third-party sites (like banks), which multiplies the number of security steps that have to be done right with each targeted interaction.

What are the odds of all those security steps being done right? With a new malware sample every 18 seconds, I would say the odds are heavily stacked against encountering a secure app. Could happen but on the order of the moon becoming a black hole, spontaneously.

Take the same security precautions with your smart phone as you would with any other network connected device. Keep your OS/apps updated on a regular basis. Off-load data not needed for immediate access. Data not on your phone can’t be stolen if your phone is compromised.

July 4, 2015

Our World in Data

Filed under: History,Visualization — Patrick Durusau @ 4:03 pm

Our World in Data by Max Roser.

Visualizations of War & Violence, Global Health, Africa, World Poverty and World Hunger & Food Provision.

An author chooses their time period but I find limiting the discussion of world poverty to the last 2,000 years problematic. Obtaining even projected data would be problematic but we know there were civilizations, particularly in the Ancient Near East and in Pre-Columbian America that had rather high standards of living. For that matter, for the time period given, the poverty map skips over the Roman Empire at its height, saying “we know that every country was extremely poor compared to modern living standards.”

The Romans had public bath houses, running water, roads that we still use today, public entertainment, libraries, etc. I am not sure how they were “extremely poor compared to modern living standards.”

It is also problematic (slide 12) when Max says that:

Before modern economic growth the huge majority lived in extreme poverty and only a tiny elite enjoyed a better standard of living.

There are elites in every society that live better than most, but that doesn’t automatically imply that 84% to 94% of the world population was living in poverty. You don’t sustain a society such as the Aztecs or the Incas with only 6 to 16% of the population living outside poverty.

I am deeply doubtful of Max’s conclusion that in terms of poverty the world is becoming more “equal.”

Part of that skepticism is from being aware of statistics like:

“With less than 5 percent of world population, the U.S. uses one-third of the world’s paper, a quarter of the world’s oil, 23 percent of the coal, 27 percent of the aluminum, and 19 percent of the copper,” he reports. “Our per capita use of energy, metals, minerals, forest products, fish, grains, meat, and even fresh water dwarfs that of people living in the developing world.”
Use It and Lose It: The Outsize Effect of U.S. Consumption on the Environment

Considering that many of those resources are not renewable, there is a natural limit to how much improvement can or will take place outside of the United States. When renewable resources become more practical than they are today, they will only supplement the growing consumption of energy in the United States, not replace it.

Max provides access to his data sets if you are interested in exploring the data further. I would be extremely careful with his World Bank data because the World Bank does have an agenda to show the benefits of development across the world.

Considering the impact of consumption on the environment, the World Bank’s pursuit of a global consumption economy may be one of the more ill-fated schemes of all time.

If you are interested in this type of issue, the National Geographic’s Greendex may be of interest.
