Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

February 4, 2016

Beating Body Scanners

Filed under: Government,Security — Patrick Durusau @ 9:10 pm

Just on the off chance that some government mandates wholly ineffectual full body scanners for security purposes, Jonathan Corbett has two videos that demonstrate the ease with which such scanners can be defeated, completely.

Oh, I forgot, the US government has mandated such scanners!

Jonathan maintains a great site at: http://professional-troublemaker.com/. You can follow him @_JonCorbett.

Jon is right about the scanners being ineffectual but being effective wasn’t part of the criteria for purchasing the systems. Scanners were purchased to give the impression of frenzied activity, even if it was totally ineffectual.

What would happen if a terrorist did attack an airport, through one of the hundreds of daily lapses in security? What would the government say if it weren’t engaged in non-stop but meaningless activity?

Someone would say, falsely, that it was inaction on the part of government that enabled the attack.

Stuff and nonsense.

“Terrorist” attacks, actually violence committed by criminals by another name, can and will happen no matter what measures the government takes, short of an all-nude policy beginning at the perimeter of the airport and a prohibition on shipping anything larger than a clear quart zip-lock bag, whether with passengers or as cargo.

Even then it isn’t hard to imagine several dozen ways to carry out “terrorist” attacks at any airport.

The sooner government leaders begin to educate their citizens that some risks are simply unavoidable, the sooner money can stop being wasted on visible but ineffectual efforts like easily defeated body scanners.

Comodo Chromodo browser – Danger! Danger! – Discontinue Use

Filed under: Cybersecurity,Security — Patrick Durusau @ 8:19 pm

Comodo Chromodo browser does not enforce same origin policy and is based on an outdated version of Chromium

From the overview:

Comodo Chromodo browser, version 45.8.12.392, 45.8.12.391, and possibly earlier, does not enforce same origin policy, which allows for the possibility of cross-domain attacks by malicious or compromised web hosts. Chromodo is based on an outdated release of Chromium with known vulnerabilities.
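For anyone fuzzy on what “same origin policy” means: two URLs count as the same origin only when scheme, host and port all match, and a browser that fails to enforce this lets a script from one site read data belonging to another. A minimal sketch of the origin check in Python (my illustration of the concept, not Chromodo’s code):

from urllib.parse import urlsplit

def origin(url):
    # An origin is the (scheme, host, port) triple.
    parts = urlsplit(url)
    port = parts.port or {"http": 80, "https": 443}.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(a, b):
    # All three components must match; anything else is cross-origin.
    return origin(a) == origin(b)

print(same_origin("https://bank.example/account", "https://bank.example/help"))  # True
print(same_origin("https://bank.example/account", "https://evil.example/"))  # False
print(same_origin("https://bank.example/", "http://bank.example/"))  # False, scheme differs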

Solution

The CERT/CC is currently unaware of a practical solution to this problem and recommends the following workarounds.

Disable JavaScript

Disabling JavaScript may mitigate cross-domain scripting attacks. For instructions, refer to Comodo’s help page.

Note that disabling JavaScript may not protect against known vulnerabilities in the version of Chromium on which Chromodo is based. For this reason, users should prioritize implementing the following workaround.

Discontinue use

Until these issues are addressed, consider discontinuing use of Chromodo.

“Discontinue use” is about as extreme a workaround as I can imagine.

Too bad the Comodo site doesn’t say anything about refunds and/or compensation for damaged customers.

Would you say that without any penalty, there is no incentive for Comodo to produce better software?

Or to put it differently, where is the downside to Comodo producing buggy software?

Where does that impact their bottom line?

I first saw this in a tweet by SecuriTay.

Toneapi helps your writing pack an emotional punch [Not For The Ethically Sensitive]

Filed under: Natural Language Processing,Sentiment Analysis — Patrick Durusau @ 8:04 pm

Toneapi helps your writing pack an emotional punch by Martin Bryant.

From the post:

Language analysis is a rapidly developing field and there are some interesting startups working on products that help you write better.

Take Toneapi, for example. This product from Northern Irish firm Adoreboard is a Web-based app that analyzes (and potentially improves) the emotional impact of your writing.

Paste in some text, and it will offer a detailed visualization of your writing.

If you aren’t overly concerned about manipulating, sorry, persuading your readers to your point of view, you might want to give Toneapi a spin. Martin reports that IBM’s Watson has Tone Analyzer and you should also consider Textio and Relative Insight.

Before this casts an Orwellian pall over your evening/day, remember that focus groups and message testing have been staples of advertising for decades.

What these software services do is make a crude form of that capability available to the average citizen.

Some people have a knack for emotional language, like Donald Trump, but I can’t force myself to write in incomplete sentences or with one-syllable words. Maybe there’s an app for that? Suggestions?

The Ethical Data Scientist

Filed under: Data Science,Ethics — Patrick Durusau @ 7:42 pm

The Ethical Data Scientist by Cathy O’Neil.

From the post:

….
After the financial crisis, there was a short-lived moment of opportunity to accept responsibility for mistakes within the financial community. One of the more promising pushes in this direction was when quant and writer Emanuel Derman and his colleague Paul Wilmott wrote the Modeler’s Hippocratic Oath, which nicely sums up the list of responsibilities any modeler should be aware of upon taking on the job title.

The ethical data scientist would strive to improve the world, not repeat it. That would mean deploying tools to explicitly construct fair processes. As long as our world is not perfect, and as long as data is being collected on that world, we will not be building models that are improvements on our past unless we specifically set out to do so.

At the very least it would require us to build an auditing system for algorithms. This would be not unlike the modern sociological experiment in which job applications sent to various workplaces differ only by the race of the applicant—are black job seekers unfairly turned away? That same kind of experiment can be done directly to algorithms; see the work of Latanya Sweeney, who ran experiments to look into possible racist Google ad results. It can even be done transparently and repeatedly, and in this way the algorithm itself can be tested.

The ethics around algorithms is a topic that lives only partly in a technical realm, of course. A data scientist doesn’t have to be an expert on the social impact of algorithms; instead, she should see herself as a facilitator of ethical conversations and a translator of the resulting ethical decisions into formal code. In other words, she wouldn’t make all the ethical choices herself, but rather raise the questions with a larger and hopefully receptive group.

First, the link for the Modeler’s Hippocratic Oath takes you to a splash page at Wiley for Derman’s book: My Life as a Quant: Reflections on Physics and Finance.

The Financial Modelers’ Manifesto (PDF) and The Financial Modelers’ Manifesto (HTML), are valid links as of today.

I commend the entire text of The Financial Modelers’ Manifesto to you for repeated reading but for present purposes, let’s look at the Modelers’ Hippocratic Oath:

~ I will remember that I didn’t make the world, and it doesn’t satisfy my equations.

~ Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.

~ I will never sacrifice reality for elegance without explaining why I have done so.

~ Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.

~ I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.

It may just be me but I don’t see a charge being laid on data scientists to be the ethical voices in organizations using data science.

Do you see that charge?

To put it more positively, aren’t other members of the organization, accountants, engineers, lawyers, managers, etc., all equally responsible for spurring “ethical conversations?” Why is this a peculiar responsibility for data scientists?

I take a legal ethics view of the employer – employee/consultant relationship. The client is the ultimate arbiter of the goal and means of a project, once advised of their options.

Their choice may or may not be mine but I haven’t ever been hired to play the role of Jiminy Cricket.

[Image: Jiminy Cricket]

It’s heady stuff to be responsible for bringing ethical insights to the clueless but sometimes the clueless have ethical insights on their own, or not.

Data scientists can and should raise ethical concerns but no more or less than any other member of a project.

As you can tell from reading this blog, I have very strong opinions on a wide variety of subjects. That said, unless a client hires me to promote those opinions, the goals of the client, by any legal means, are my only concern.

PS: Before you ask, no, I would not work for Donald Trump. But that’s not an ethical decision. That’s simply being a good citizen of the world.

Spontaneous Preference for their Own Theories (SPOT effect) [SPOC?]

Filed under: Ontology,Programming,Semantics — Patrick Durusau @ 5:04 pm

The SPOT Effect: People Spontaneously Prefer their Own Theories by Aiden P. Gregg, Nikhila Mahadevan, and Constantine Sedikides.

Abstract:

People often exhibit confirmation bias: they process information bearing on the truth of their theories in a way that facilitates their continuing to regard those theories as true. Here, we tested whether confirmation bias would emerge even under the most minimal of conditions. Specifically, we tested whether drawing a nominal link between the self and a theory would suffice to bias people towards regarding that theory as true. If, all else equal, people regard the self as good (i.e., engage in self-enhancement), and good theories are true (in accord with their intended function), then people should regard their own theories as true; otherwise put, they should manifest a Spontaneous Preference for their Own Theories (i.e., a SPOT effect). In three experiments, participants were introduced to a theory about which of two imaginary alien species preyed upon the other. Participants then considered in turn several items of evidence bearing on the theory, and each time evaluated the likelihood that the theory was true versus false. As hypothesized, participants regarded the theory as more likely to be true when it was arbitrarily ascribed to them as opposed to an “Alex” (Experiment 1) or to no one (Experiment 2). We also found that the SPOT effect failed to converge with four different indices of self-enhancement (Experiment 3), suggesting it may be distinctive in character.

I can’t give you the details on this article because it is fire-walled.

But the catch phrase, “Spontaneous Preference for their Own Theories (i.e., a SPOT effect)” certainly fits every discussion of semantics I have ever read or heard.

With a little funding you could prove the corollary, Spontaneous Preference for their Own Code (the SPOC effect) among programmers. 😉

There are any number of formulations for how to fight confirmation bias but Jeremy Dean puts it this way:


The way to fight the confirmation bias is simple to state but hard to put into practice.

You have to try and think up and test out alternative hypotheses. Sounds easy, but it’s not in our nature. It’s no fun thinking about why we might be misguided or have been misinformed. It takes a bit of effort.

It’s distasteful reading a book which challenges our political beliefs, or considering criticisms of our favourite film or, even, accepting how different people choose to live their lives.

Trying to be just a little bit more open is part of the challenge that the confirmation bias sets us. Can we entertain those doubts for just a little longer? Can we even let the facts sway us and perform that most fantastical of feats: changing our minds?

I wonder if that includes imagining using JSON? (shudder) 😉

Hard to do, particularly when we are talking about semantics and what we “know” to be the best practices.

Examples of trying to escape the confirmation bias trap and the results?

Perhaps we can encourage each other.

SQL Injection Hall-Of-Shame / Internet-of-Things Hall-Of-Shame

Filed under: Cybersecurity,Security — Patrick Durusau @ 4:13 pm

SQL Injection Hall-Of-Shame by Arthur Hicken.

From the webpage:

In this day and age it’s ridiculous how frequently large organizations are falling prey to SQL Injection which is almost totally preventable as I’ve written previously.

Note that this is a work in progress. If I’ve missed something you’re aware of please let me know in the comments at the bottom of the page.

Don’t let this happen to you! For some simple tips see the OWASP SQL Injection Prevention Cheat Sheet. For more security info check out the security resources page and the book SQL Injection Attacks and Defense or Basics of SQL injection Analysis, Detection and Prevention: Web Security for more info.
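For readers wondering why SQL injection is “almost totally preventable”: the fix is to pass user input as a bound parameter instead of splicing it into the query string. A minimal sketch with Python’s sqlite3 (the same pattern exists in every mainstream database driver):

import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "user")])

payload = "alice' OR '1'='1"  # classic injection input

# Vulnerable: the payload becomes SQL, and the OR clause matches every row.
print(conn.execute("SELECT * FROM users WHERE name = '%s'" % payload).fetchall())

# Safe: the driver binds the payload as a plain string; no row is named that.
print(conn.execute("SELECT * FROM users WHERE name = ?", (payload,)).fetchall())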

IOT HALL-OF-SHAME

With the rise of internet enabled devices in the Internet of Things or IoT the need for software security is becoming even more important. Unfortunately many device makers seem to put security on the back burner or not even understand the basics of cybersecurity.

I am maintaining here a list of known hacks for “things”. The list is short at the moment but will grow, and is often more generic than it could be. It’s kind of in reverse-chronological order, based on the date that the hack was published. Please assist – if you’re aware of additional thing-hacks please let me know in the comments at the bottom of the page.

I assume you find “wall-of-shame” efforts as entertaining as I do.

I am aware of honor-shame debates from a biblical studies perspective, on which see: Complete Bibliography of Honor-Shame Resources

“Complete” is a relative term when used regarding any bibliography in biblical studies and this appears to have at least one resource from 2011, but none later. You can run the references forward to collect more recent literature.

But the question with shaming techniques is: are they effective?

As a case in point, consider Researchers find it’s terrifyingly easy to hack traffic lights where the post points out:

In fact, the most upsetting passage in the entire paper is the dismissive response issued by the traffic controller vendor when the research team presented its findings. According to the paper, the vendor responsible stated that it “has followed the accepted industry standard and it is that standard which does not include security.”

We can entertain ourselves by shaming vendors all day but only the “P” word will drive greater security.

“P” as in penalty.

Vormetric found that to be the case in What Drives Compliance? Hint: The P Word Missing From Cybersecurity Discussions.

Be entertained by wall-of-shame efforts but lobby for compliance enforced by penalties. (Known to anthropologists as a fear culture.)

Truthful Paedophiles On The Darknet?

Filed under: Government,Privacy,Tor — Patrick Durusau @ 3:13 pm

There is a credibility flaw in Cryptopolitik and the Darknet by Daniel Moore & Thomas Rid that I overlooked yesterday (The Dark Web, “Kissing Cousins,” and Pornography). Perhaps it was just too obvious to attract attention.

Moore and Rid write:

The pornographic content was perhaps the most distressing. Websites dedicated to providing links to videos purporting to depict rape, bestiality and paedophilia were abundant. One such post at a supposedly nonaffiliated content-sharing website offered a link to a video of ‘a 12 year old girl … getting raped at school by 4 boys’.52 Other examples include a service that sold online video access to the vendor’s own family members:

My two stepsisters … will be pleased to show you their little secrets. Well, they are rather forced to show them, but at least that’s what they are used to.53

Several communities geared towards discussing and sharing illegitimate fetishes were readily available, and appeared to be active. Under the shroud of anonymity, various users appeared to seek vindication of their desires, providing words of support and comfort for one another in solidarity against what was seen as society’s unjust discrimination against non-mainstream sexual practices. Users exchanged experiences and preferences, and even traded content. One notable example from a website called Pedo List included a commenter freely stating that he would ‘Trade child porn. Have pics of my daughter.’54 There appears to be no fear of retribution or prosecution in these illicit communities, and as such users apparently feel comfortable enough to share personal stories about their otherwise stifled tendencies. (page 23)

Despite their description of hidden services as dens of iniquity and crime, those who use them are suddenly paragons of truthfulness, at least when it suits the authors’ purpose?

Doesn’t crediting the content of the Darknet as truthful, as opposed to being wishful, fantasy, or even police officers posing to investigate (some would say entrap) others, strain the imagination?

Some of the content is no doubt truthful but policy arguments need to be based on facts, not a collection of self-justifying opinions from like minded individuals.

A quick search on the string (without quotes):

police officers posing as children sex rings

Returns 9.7 million “hits.”

It isn’t possible to say how many of those police officers appeared in the postings collected by Moore & Rid.

But in science, there is this thing called the burden of proof. That is, simply asserting a conclusion, even citing equally non-evidence-based conclusions, isn’t sufficient to prove a claim.

Moore & Rid had the burden to prove that the Darknet is a wicked place that poses all sorts of dangers and hazards.

As I pointed out yesterday, The Dark Web, “Kissing Cousins,” and Pornography, their “proof” is non-replicable conclusions about a small part of the Darkweb.

Earlier today I realized their conclusions depend upon a truthful criminal element using the Darkweb.

What do you think about the presumption that criminals are truthful?

Sounds doubtful to me!

February 3, 2016

The Dark Web, “Kissing Cousins,” and Pornography

Filed under: Cybersecurity,Government,Privacy,Security,Tor — Patrick Durusau @ 7:57 pm

Dark web is mostly illegal, say researchers by Lisa Vaas.

You can tell where Lisa comes out on the privacy versus law enforcement issue by the slant of her conclusion:

Users, what’s your take: are hidden services worth the political firestorm they generate? Are they worth criminals escaping justice?

Illegal is a slippery concept.

Marriage of first “kissing” cousins is “illegal” in:

Arkansas, Delaware, Idaho, Iowa, Kansas, Kentucky, Louisiana, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, South Dakota, Texas, Washington, West Virginia, and Wyoming.

Marriage of first “kissing” cousins is legal in:

Alabama, Alaska, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Hawaii, Maryland, Massachusetts, New Jersey, New Mexico, New York, North Carolina (first cousins but not double first), Rhode Island, South Carolina, Tennessee, Vermont, and Virginia.

There are some other nuances I didn’t capture and for those see: State Laws Regarding Marriages Between First Cousins.

If you read Cryptopolitik and the Darknet by Daniel Moore & Thomas Rid carefully, you will spot a number of problems with their methodology and reasoning.

First and foremost, no definitions were offered for their taxonomy (at page 20):

  • Arms
  • Drugs
  • Extremism
  • Finance
  • Hacking
  • Illegitimate pornography
  • Nexus
  • Other illicit
  • Social
  • Violence
  • Other
  • None

Readers and other researchers are left to wonder what was included or excluded from each of those categories.

In science, that would be called an inability to replicate the results. As if this were science.

Moore & Rid recite anecdotal accounts of particular pornography sites, calculated to shock the average reader, but that’s not the same thing as enabling replication of their research. Or a fair characterization of all the pornography encountered.

They presumed that text was equivalent to image content, so they discarded all images (pages 19-20). Which left them unable to test that presumption. Hmmm, untested assumptions in science?

Classification on this undisclosed basis identified 122 sites (page 21) as pornographic out of the initial set of 5,205 sites.

If you accept Tor’s estimate of 30,000 hidden services that announce themselves every day, Moore & Rid have found that illegal pornography (whatever that means) is:

122 / 30,000 ≈ 0.0041

That is, Moore & Rid have established that “illegal” porn is about 0.41% of the Dark Net.

I should be grateful Moore & Rid have so carefully documented the tiny part of the Dark Web concerned with their notion of “illegal” pornography.

But, when you encounter “reasoning” such as:


The other quandary is how to deal with darknets. Hidden services have already damaged Tor, and trust in the internet as a whole. To save Tor – and certainly to save Tor’s reputation – it may be necessary to kill hidden services, at least in their present form. Were the Tor Project to discontinue hidden services voluntarily, perhaps to improve the reputation of Tor browsing, other darknets would become more popular. But these Tor alternatives would lack something precious: a large user base. In today’s anonymisation networks, the security of a single user is a direct function of the number of overall users. Small darknets are easier to attack, and easier to de-anonymise. The Tor founders, though exceedingly idealistic in other ways, clearly appreciate this reality: a better reputation leads to better security.85 They therefore understand that the popularity of Tor browsing is making the bundled-in, and predominantly illicit, hidden services more secure than they could be on their own. Darknets are not illegal in free countries and they probably should not be. Yet these widely abused platforms – in sharp contrast to the wider public-key infrastructure – are and should be fair game for the most aggressive intelligence and law-enforcement techniques, as well as for invasive academic research. Indeed, having such clearly cordoned-off, free-fire zones is perhaps even useful for the state, because, conversely, a bad reputation leads to bad security. Either way, Tor’s ugly example should loom large in technology debates. Refusing to confront tough, inevitable political choices is simply irresponsible. The line between utopia and dystopia can be disturbingly thin. (pages 32-33)

it’s hard to say nothing and see public discourse soiled with this sort of publication.

First, there is no evidence presented that hidden services have damaged Tor and/or trust in the Internet as a whole. Even the authors concede that Tor is the most popular option for anonymous browsing and hidden services. That doesn’t sound like damage to me. You?

Second, the authors dump all hidden services in the “bad, very bad” basket, despite their own research classifying only about 0.41% of the Dark Net as illicit pornography. They use stock “go to” examples to shock readers in place of evidence and reasoning.

Third, the charge that Tor has “[r]efused to confront tough, inevitable political choices is simply irresponsible” is false. Demonstrably false because the authors point out that Tor developers made a conscious choice to not take political considerations into account (page 25).

Since Moore & Rid disagree with that choice, they resort to name calling, terming the decision “simply irresponsible.” Moore & Rid are entitled to their opinions but they aren’t going to persuade even a semi-literate audience with name calling.

Take Cryptopolitik and the Darknet as an example of how to not write a well researched and reasoned paper. Although, that isn’t a bar to publication as you can see.

Cheating Cheaters [Honeypots for Government Agencies?]

Filed under: Cybersecurity,Games — Patrick Durusau @ 4:36 pm

Video Game Cheaters Outed By Logic Bombs by timothy.

From the post:

A Reddit user decided to tackle the issue of cheaters within Valve’s multiplayer shooter Counter Strike: Global Offensive in their own unique way: by luring them towards fake “multihacks” that promised a motherlode of cheating tools, but in reality, were actually traps designed to cause the users who installed them to eventually receive bans. The first two were designed as time bombs, which activated functions designed to trigger bans after a specific time of day. The third, which was downloaded over 3,500 times, caused instantaneous bans.

I wonder if anyone is running honeypots for intelligence agencies?

Or fake jihad sites for our friends in law enforcement?

Sort of a Spy vs. Spy situation, yes?

[Image: Spy vs. Spy]

Beware of cyber-dueling with government: you aren’t wearing protective gear and the tips aren’t blunted.

Unpublished Black History Photos (NYT)

Filed under: History,Journalism,News,Reporting — Patrick Durusau @ 4:09 pm

The New York Times is unearthing unpublished photos from its archives for Black History Month by Shan Wang.

From the post:

In this black and white photo taken by a New York Times staff photographer, two unidentified second graders at Princeton’s Nassau Street Elementary School stand in front of a classroom blackboard. Some background text accompanies the image, pointing to a 1964 Times article about school integration and adding that the story “offered a caveat that still resonates, noting that in the search for a thriving and equal community, ‘good schooling is not enough.’”

Times readers wrote in to ask specifically about the second graders in the photo, so the Times updated the post with a comment form asking readers to share anything they might know about the girl and boy depicted.

Great background on the Unpublished Black History project at the Times.

Public interfaces enable contribution of information on selected images along with comments.

Unlike the US Intelligence community, the Times is willing to admit that its prior conduct may not reflect its values, then or now.

If a private, for-profit organization can be that honest, what’s the deal with government agencies?

Must be that accountability thing that Republicans are always trying to foist off onto public school teachers and public school teachers alone.

No accountability for elected officials and/or their appointees and cronies.

They are deadly serious about crypto backdoors [And of the CIA and Chinese Underwear]

Filed under: Cryptography,Cybersecurity,Government,Security — Patrick Durusau @ 3:25 pm

They are deadly serious about crypto backdoors by Robert Graham.

From the post:

Julian Sanchez (@normative) has an article questioning whether the FBI is serious about pushing crypto backdoors, or whether this is all a ploy pressuring companies like Apple to give them access. I think they are serious — deadly serious.

The reason they are only half-heartedly pushing backdoors at the moment is that they believe we, the opposition, aren’t serious about the issue. After all, the 4rth Amendment says that a “warrant of probable cause” gives law enforcement unlimited power to invade our privacy. Since the constitution is on their side, only irrelevant hippies could ever disagree. There is no serious opposition to the proposition. It’ll all work itself out in the FBI’s favor eventually. Among the fascist class of politicians, like the Dianne Feinsteins and Lindsay Grahams of the world, belief in this principle is rock solid. They have absolutely no doubt.

But the opposition is deadly serious. By “deadly” I mean this is an issue we are willing to take up arms over. If congress were to pass a law outlawing strong crypto, I’d move to a non-extradition country, declare the revolution, and start working to bring down the government. You think the “Anonymous” hackers were bad, but you’ve seen nothing compared to what the tech community would do if encryption were outlawed.

On most policy questions, there are two sides to the debate, where reasonable people disagree. Crypto backdoors isn’t that type of policy question. It’s equivalent to techies what trying to ban guns would be to the NRA.

What he says.

Crypto backdoors are a choice between a policy that benefits government at the expense of everyone (crypto backdoors) versus a policy that benefits everyone at the expense of the government (no crypto backdoors). It’s really that simple.

When I say crypto backdoors benefit the government, I mean that quite literally. Collecting data, via crypto backdoors and otherwise, enables government functionaries to pretend to be engaged in meaningful responses to serious issues.

Collecting and shoveling data from desk to desk is about as useless an activity as can be imagined.

Basis for that claim? Glad you asked!

If you haven’t read: Chinese Underwear and Presidential Briefs: What the CIA Told JFK and LBJ About Mao by Steve Usdin, do so.

Steve covers the development of the “presidential brief” and its long failure to provide useful information about China and Mao in particular. The CIA long opposed declassification of historical presidential briefs based on the need to protect “sources and methods.”

The presidential briefs for the Kennedy and Johnson administrations have been released and here is what Steve concludes:

In any case, at least when it comes to Mao and China, the PDBs released to date suggest that the CIA may have fought hard to keep these documents secret not to protect “sources and methods,” but rather to conceal its inability to recruit sources and failure to provide sophisticated analyses.

Past habits of the intelligence community explain rather well why they have no, repeat no, examples of how strong encryption has interfered with national security. There are none.

The paranoia about “crypto backdoors” is another way to engage in “known to be useless” action. It puts butts in seats and inflates agency budgets.


Unlike Robert, should Congress ban strong cryptography, I won’t be moving to a non-extradition country. Some of us need to be here when local police come to their senses and defect.

Google Paywall Loophole Going Bye-Bye [Fair Use Driving Pay-Per-View Traffic]

Filed under: Cybersecurity,Fair Use,Intellectual Property (IP) — Patrick Durusau @ 2:51 pm

The Wall Street Journal tests closing the Google paywall loophole by Lucia Moses.

From the post:

The Wall Street Journal has long had a strict paywall — unless you simply copy and paste the headline into Google, a favored route for those not wanting to pony up $200 a year. Some users have noticed in recent days that the trick isn’t working.

A Journal spokesperson said the publisher was running a test to see if doing so would entice would-be subscribers to pay up. The rep wouldn’t elaborate on how long and extensive the experiment was and if permanently closing the loophole was a possible outcome.

“We are experimenting with a number of different trial mechanics at the moment to provide a better subscription taster for potential new customers,” the rep said. “We are a subscription site and we are always looking at better ways to optimize The Wall Street Journal experience for our members.”

The Wall Street Journal can deprive itself of the benefits of “fair use” if it wants to, but is that a sensible position?

Fair Use Benefits the Wall Street Journal

Rather than a total ban on copying, what if the amount of an article that can be copied is set by algorithm? Such that at a minimum, the first two or three paragraphs of any story can be copied, whether you arrive from Google or directly on the WSJ site.
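As a sketch of how simple such an algorithm could be, here is a version in Python (my illustration with made-up thresholds, not anything the Journal has proposed):

def fair_use_excerpt(paragraphs, min_free=3, max_fraction=0.2):
    # Always release at least min_free paragraphs; for long articles,
    # release up to max_fraction of the whole. Both knobs are hypothetical.
    cutoff = max(min_free, int(len(paragraphs) * max_fraction))
    return paragraphs[:cutoff]

# A 30-paragraph story: readers arriving from Google or a blog see 6 paragraphs.
story = ["paragraph %d" % i for i in range(1, 31)]
print(len(fair_use_excerpt(story)))  # 6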

Think about it. Wall Street Journal readers aren’t paying to skim the lead paragraphs in the WSJ. They are paying to see the full story and analysis in particular subject areas.

Bloggers, such as myself, cannot drive content seekers to the WSJ because the first sentence or two isn’t enough for readers to develop an interest in the WSJ report.

If I could quote the first 2 or 3 paragraphs, add in some commentary and perhaps other links, then a visitor to the WSJ is visiting to see the full content the Wall Street Journal has to offer.

The story lead is acting, as it should, to drive traffic to the Wall Street Journal, possibly from readers who won’t otherwise think of the Wall Street Journal. Some of my readers on non-American/European continents for example.

Bloggers Driving Readers to Wall Street Journal Pay-Per-View Content

Developing algorithmic fair use as I describe it would enlist an army of bloggers in spreading notice of pay-per-view content of the Wall Street Journal, at no expense to the Wall Street Journal. As a matter of fact, bloggers would be alerting readers to pay-per-view WSJ content at the blogger’s own expense.

It may just be me but if someone were going to drive viewers to pay-per-view content on my site, at their own expense, with fair use of content, I would be insane to prevent that. But, I’m not the one grasping at dimes while $100 bills are flying overhead.

Close the Loophole, Open Up Fair Use

Full disclosure, I don’t have any evidence for fair use driving traffic to the Wall Street Journal because that evidence doesn’t exist. The Wall Street Journal would have to enable fair use and track appearance of fair use content and the traffic originating from it. Along with conversions from that additional traffic.

Straightforward data analytics, but it won’t happen by itself. When the WSJ succeeds with such a model, you can be sure that other paywall publishers will be quick to follow suit.

Caveat: Yes, there will be people who will only ever consume the free use content. And your question? If they aren’t ever going to be paying customers and the same fair use is delivering paying customers, will you lose the latter in order to spite the former?

Isn’t that like cutting off your nose to spite your face?

Historical PS:

I once worked for a publisher that felt a “moral obligation,” their words, not mine, to prevent anyone from claiming a missing journal issue to which they might not be entitled. Yeah. Journal issues that were as popular as the Watchtower is among non-Jehovah’s Witnesses. Cost to the publisher, about $3.00 per issue, cost to verify entitlement, a full time position at the publisher.

I suspect claims ran less than 200 per year. My suggestion was to answer any request with thanks, here’s your missing copy. End of transaction. Track claims only to prevent abuse. Moral outrage followed.

Is morality the basis for your pay-per-view access policy? I thought pay-per-view was a way to make money.

Pass this post along to the WSJ if you know anyone there. Free suggestion. Perhaps they will be interested in other, non-free suggestions.

Gremlin Users – Beware the Double-Underscore!

Filed under: Graphs,Gremlin,Language Design — Patrick Durusau @ 2:04 pm

A user recently posted this example from the Gremlin documentation:

g.V().hasLabel('person').choose(values('age')).option(27, _.in()).option(32, _.out()).values('name')

which returned:

“No such property: _ for class: Script121”

Marko Rodriguez responded:

Its a double underscore, not a single underscore.

__ vs. _
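For beginners keeping score, the working form of the documentation example is the same traversal with the double underscore in both options:

g.V().hasLabel('person').choose(values('age')).option(27, __.in()).option(32, __.out()).values('name')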

I mention this to benefit beginning Gremlin users who haven’t developed an underscore stutter but also as a plea for sanity in syntax design.

It is easy to type two successive underscores, but the obviousness of a double underscore versus a single underscore depends on local typography.

To say nothing of the fact that what might be obvious to the eyes of a twenty-something may not be as obvious to the eyes of a fifty-something+.

In syntax design, answer the question:

Do you want to be clever or clear?

Reverse Image Search (TinEye) [Clue to a User Topic Map Interface?]

TinEye was mentioned in a post I wrote in 2015, Baltimore Burning and Verification, but I did not follow up at the time.

Unlike some US intelligence agencies, TinEye has a cool logo:

[Image: TinEye logo]

Free registration enables you to share search results with others, an important feature for news teams.

I only tested the plugin for Chrome, but it offers useful result options:

[Image: TinEye extension result options]

Once installed, use by hovering over an image in your browser, right “click” and select “Search image on TinEye.” Your results will be presented as set under options.

Clue to User Topic Map Interface

That is a good example of how one version of a topic map interface should work. Select some text, right “click” and “Search topic map ….(preset or selection)” with configurable result display.

That puts you into interaction with the topic map, which can offer properties to enable you to refine the identification of a subject of interest and then a merged presentation of the results.

As with a topic map, all sorts of complicated things are happening in the background with the TinEye extension.

But as a user, I’m interested in the results that TinEye presents, not how it got them.

I used to say “more interested” to indicate I might care how useful results came to be assembled. That’s a pretension that isn’t true.

It might be true in some particular case, but for the vast majority of searches, I just want the (uncensored Google) results.

US Intelligence Community Logo for Same Capability

I discovered the most likely intelligence community logo for a similar search program:

[Image: Peeping Tom]

The answer to the age-old question of “who watches the watchers?” is us. Which watchers are you watching?

LispNYC Videos

Filed under: Clojure,Lisp — Patrick Durusau @ 9:36 am

LispNYC Videos

Videos recorded at some LispNYC meetings have been posted online.

Not enough to get you through the 2016 election cycle but a good start!

Enjoy!

February 2, 2016

Balisage 2016, 2–5 August 2016 [XML That Makes A Difference!]

Filed under: Conferences,XLink,XML,XML Data Clustering,XML Schema,XPath,XProc,XQuery,XSLT — Patrick Durusau @ 9:47 pm

Call for Participation

Dates:

  • 25 March 2016 — Peer review applications due
  • 22 April 2016 — Paper submissions due
  • 21 May 2016 — Speakers notified
  • 10 June 2016 — Late-breaking News submissions due
  • 16 June 2016 — Late-breaking News speakers notified
  • 8 July 2016 — Final papers due from presenters of peer reviewed papers
  • 8 July 2016 — Short paper or slide summary due from presenters of late-breaking news
  • 1 August 2016 — Pre-conference Symposium
  • 2–5 August 2016 — Balisage: The Markup Conference

From the call:

Balisage is the premier conference on the theory, practice, design, development, and application of markup. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

  • Web application development with XML
  • Informal data models and consensus-based vocabularies
  • Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
  • Performance issues in parsing, XML database retrieval, or XSLT processing
  • Development of angle-bracket-free user interfaces for non-technical users
  • Semistructured data and full text search
  • Deployment of XML systems for enterprise data
  • Design and implementation of XML vocabularies
  • Case studies of the use of XML for publishing, interchange, or archiving
  • Alternatives to XML
  • the role(s) of XML in the application lifecycle
  • the role(s) of vocabularies in XML environments

Full papers should be submitted by the deadline given below. All papers are peer-reviewed — we pride ourselves that you will seldom get a more thorough, skeptical, or helpful review than the one provided by Balisage reviewers.

Whether in theory or practice, let’s make Balisage 2016 the one people speak of in hushed tones at future markup and information conferences.

Useful semantics continues to flounder about, cf. Vice-President Biden’s interest in “one cancer research language.” Easy enough to say. How hard could it be?

Documents are commonly thought of and processed as if from BOM to EOF is the definition of a document. Much to our impoverishment.

Silo dissing has gotten popular. What if we could have our silos and eat them too?

Let’s set our sights on a Balisage 2016 where non-technicals come away saying “I want that!”

Have your first drafts done well before the end of February, 2016!

Google to deliver wrong search results to would-be jihadis [, gays, unwed mothers, teenagers, Muslims]

Filed under: Censorship,Government,Privacy,Security — Patrick Durusau @ 8:52 pm

Google to deliver wrong search results to would-be jihadis by David Barrett.

From the post:

Jihadi sympathisers who type extremism-related words into Google will be shown anti-radicalisation links instead, under a pilot scheme announced by the internet giant.

The new technology means people at risk of radicalisation will be presented with internet links which are the exact opposite of what they were searching for.

Dr Anthony House, a senior Google executive, revealed the pilot scheme in evidence to MPs scrutinising the role of internet companies in combating extremism.

It isn’t hard to see where this slippery road leads.

If any of the current Republican candidates are elected to the U.S. presidency, Google will:

Respond to gay sex or gay related searches with links for praying yourself straight.

Unwed mothers requesting abortion services will have their personal information forwarded to right-to-birth organizations and sent graphic anti-abortion images by email.

Teenagers seeking birth control advice will only see – Abstinence or Hell!

Muslims, well, unless Trump has deported all of them, will see anti-Muslim links.

Unlike bad decisions by government, Google can effectively implement demented schemes such as this one.

Censoring search results to favor any side, policy, or position is just that: censorship.

If you forfeit the rights of others, you have no claim to rights yourself.

Your call.

“A little sinister!!” (NRO’s Octopus Logo)

Filed under: Government,Security — Patrick Durusau @ 8:14 pm

“A little sinister!!” The story behind National Reconnaissance Office’s octopus logo by JPat Brown.

From the post:

When the National Reconnaissance Office (NRO) announced the upcoming launch of their NROL-39 mission back in December 2013, they didn’t get quite the response they had hoped.

[Image: NROL-39 octopus mission logo]

That might have had something to do with the mission logo being a gigantic octopus devouring the Earth.

The logo was widely lampooned as emblematic of the intelligence community’s tone-deafness to public sentiment. Incidentally, an octopus enveloping the planet also so happens to be the logo of SPECTRE, the international criminal syndicate that James Bond is always thwarting. So there’s that.

Privacy and security researcher Runa Sandvik wanted to know who approved this and why, so she filed a FOIA with the NRO for the development materials that went into the logo. A few months later, the NRO delivered.

This is a great read and one you need to save to your local server. Especially for days when you think the U.S. government is conspiring against its citizens. It should be so well-organized.

All sorts of government outrages are the product of the same decision-making process as this lame looking octopus.

At the very least they could have gotten John Romita Jr. to do something a bit more creative:

[Image: Doctor Octopus. Fair use.]

More than “a little sinister” but why not be honest?

Bumping into Stallman, again [Stallmanism]

Filed under: Open Source,Software — Patrick Durusau @ 7:42 pm

Bumping into Stallman, again by Frederick Jacobs.

From the post:

This is the second time I’m talking at the same conference as Richard Stallman, after the Ind.ie Tech Summit in Brighton, this time was at the Fri Software Days in Fribourg, Switzerland.

One day before my presentation, I got an email from the organizers, letting me know that Stallman would like me to rename the title of my talk to remove any mentions of “Open Source Software” and replace them with “Free Software”.

The email read like this:

Is it feasible to remove the terms “Open-Source” from the title of your presentation and replace them by “Free-libre software”? It’s the wish of M. Stallman, that will probably attend your talk.

Frederick didn’t change his title or presentation, while at the same time handling the issue much better than I would have.

Well, after I got through laughing my ass off that Stallman would presume to dictate word usage to anyone.

Word usage, for any stallmanists in the crowd, is an empirical question of how many people use a word with a common meaning.

At least if you want to be understood by others.

Sunlight launches Hall of Justice… [ Topic Map “like” features?]

Filed under: Government,Government Data,Topic Maps — Patrick Durusau @ 6:53 pm

Sunlight launches Hall of Justice, a massive data inventory on criminal justice across the U.S. by Josh Stewart.

From the post:

Today, Sunlight is launching Hall of Justice, a robust, searchable data inventory of nearly 10,000 datasets and research documents from across all 50 states, the District of Columbia and the federal government. Hall of Justice is the culmination of 18 months of work gathering data and refining technology.

The process was no easy task: Building Hall of Justice required manual entry of publicly available data sources from a multitude of locations across the country.

Sunlight’s team went from state to state, meeting and calling local officials to inquire about and find data related to criminal justice. Some states like California have created a data portal dedicated to making criminal justice data easily accessible to the public; others had their data buried within hard to find websites. We also found data collected by state departments of justice, police forces, court systems, universities and everything in between.

“Data is shaping the future of how we address some of our most pressing problems,” said John Wonderlich, executive director of the Sunlight Foundation. “This new resource is an experiment in how a robust snapshot of data can inform policy and research decisions.”

In addition to being a great data collection, the Hall of Justice attempts to deliver topic map like capability for searches:

The resource attempts to consolidate different terminology across multiple states, which is far from uniform or standardized. For example, if you search solitary confinement you will return results for data around solitary confinement, but also for the terms “segregated housing unit,” “SHU,” “administrative segregation” and “restrictive housing.” This smart search functionality makes finding datasets much easier and accessible.

[Image: search results for “solitary confinement”]
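Under the hood, that sort of “smart search” usually comes down to expanding the query against a hand-built synonym table before matching. A toy sketch in Python (my guess at the mechanics; Sunlight hasn’t published theirs):

SYNONYMS = {
    "solitary confinement": ["segregated housing unit", "SHU",
                             "administrative segregation", "restrictive housing"],
}

def expand(query):
    # The query plus any equivalent terms from the synonym table.
    return [query] + SYNONYMS.get(query.lower(), [])

def search(query, documents):
    # A document matches if it contains any expansion of the query.
    terms = [t.lower() for t in expand(query)]
    return [d for d in documents if any(t in d.lower() for t in terms)]

docs = ["Report on restrictive housing in state prisons",
        "Juvenile arrest statistics, 2013"]
print(search("solitary confinement", docs))  # matches the first report only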

Looking at all thirteen results for a search on “solitary confinement,” I don’t see the mapping in question. Or certainly no mapping based on characteristics of the subject, “solitary confinement.”

The closest is Georgia’s 2013 Juvenile Justice Reform, which uses the word “restrictive,” as in:

Create a two-class system within the Designated Felony Act. Designated felony offenses are divided into two classes, based on severity—Class A and Class B—that continue to allow restrictive custody while also adjusting available sanctions to account for both offense severity and risk level.

Restrictive custody is what jail systems are about so that doesn’t trip the wire for “solitary confinement.”

Of course, the links are to entire reports/documents/data sets so each researcher will have to extract and collate content individually. When that happens, a means to contribute that collation/mapping to the Hall of Justice would be a boon for other researchers. (Can you say “topic map?”)

As I write this, you will need to prefer Mozilla over Chrome, at least on Ubuntu.

Trigger Warning: If you are sensitive to traumatic events and/or reports of traumatic events, you may want to ask someone less sensitive to review these data sources.

The only difference between a concentration camp and American prisons is the lack of mass gas chambers. Every horror and abuse that you can imagine and some you probably can’t, are visited on people in U.S. prisons everyday.

As Joan Baez sings in Prison Trilogy:

And we’re gonna raze, raze the prisons

To the ground

Help us raze, raze the prisons

To the ground

Sunlight’s Hall of Justice is a great step forward in documenting the chambers of horror we call American prisons.

Are you ready?

All The Pubs In Britain & Ireland & Nothing Else

Filed under: Mapping,Maps — Patrick Durusau @ 2:44 pm

All The Pubs In Britain & Ireland & Nothing Else by Ramiro Gómez.

From the post:

The map above is elegant in its simplicity. It shows Great Britain and Ireland drawn from pubs. Each blue dot represents a single pub using data extracted from OpenStreetMap with the Matplotlib Basemap Toolkit.

Interestingly, if the same map had been drawn using the number of pubs from 1980 it would have looked quite different.

In total, the map has 29,195 pub locations across both the UK and Ireland. However, the UK alone has lost 21,000 pubs since 1980 according to the Institute of Economic Affairs, with half of these occurring since 2006.

Therefore, a map from 1980 might have had nearly twice as many dots as the one above and possibly not all in the same places. Going back even further, there were a reported 99,000 pubs in the UK in 1905.

See Ramiro’s post for the map but more importantly, book travel to the UK to help stem the loss of pubs!

How many of the 29,195 pubs in Britain and Ireland have you visited?

Microsoft Quantum Challenge [Deadline April 29, 2016.]

Filed under: Challenges,Contest,Quantum — Patrick Durusau @ 2:22 pm

Microsoft Quantum Challenge

From the webpage:

Join students from around the world to investigate and solve problems facing the quantum universe using Microsoft’s simulator, LIQUi|>.

Win big prizes, or the opportunity to interview for internships at Microsoft Research.

Objectives of the Quantum Challenge

The Quantum Architectures and Computing Group (QuArC) is seeking exceptional students!

We want to find students who are eager to expand their knowledge of quantum computing, and who can translate thoughts into programs. Thereby we will expand the use of Microsoft’s Quantum Simulator LIQUi|>.

How to enter

First of all, REGISTER for the Challenge so that you can receive updates about the contest.

In the challenge you will use the LIQUi|> simulator to solve a novel problem and then report on your findings. So, think of a project. Then, download the simulator from GitHub and work with it to solve your problem. Finally, write a report about your findings and submit it. Your report submission will enter you into the Challenge.

In the report, present a description of the project including goals, methods, challenges, and any result obtained using LIQUi|>. You do not need to submit circuits and the software you develop, however, sample input and output for LIQUi|> must be submitted to show you used the simulator in the project. Your entry must consist of six pages or less, in PDF format.

The Challenge is open to students at colleges and universities world-wide (with a few restrictions) and aged 18+. NO PURCHASE NECESSARY. For full details, see the Official Rules

The prizes

The Quantum Challenge is your chance to win a big prize!

  • First Prize:  $5,000
  • Second Prizes:   Four at $2,500
  • Honorary Mention: Certificates will be presented to runner-up entries

Extra – visits or internship interviews

As a result of the challenge, some entrants could be invited to visit the QuArC team at Microsoft Research in Redmond, or have an opportunity to interview for internships at Microsoft Research. Internships are highly prestigious and involve working with the QuArC team for 12 weeks on cutting edge research.

If you are young enough to enter, just a word of warning about the “big prize.” $5,000 today isn’t a “big prize.” Maybe a nice weekend if you keep it low key but only just.

Interaction with the QuArC team, either by winning or in online discussions is the real prize.

Besides, who needs $5,000 if you can break quantum encrypted bank transfer orders? 😉

How to Build a TimesMachine [New York Times from 1851-2002]

Filed under: History,News,Search Engines — Patrick Durusau @ 1:59 pm

How to Build a TimesMachine by Jane Cotler and Evan Sandhaus.

From the post:

At the beginning of this year, we quietly expanded TimesMachine, our virtual microfilm reader, to include every issue of The New York Times published between 1981 and 2002. Prior to this expansion, TimesMachine contained every issue published between 1851 and 1980, which consisted of over 11 million articles spread out over approximately 2.5 million pages. The new expansion adds an additional 8,035 complete issues containing 1.4 million articles over 1.6 million pages.

[Image: The Time Machine]

Creating and expanding TimesMachine presented us with several interesting technical challenges, and in this post we’ll describe how we tackled two. First, we’ll discuss the fundamental challenge with TimesMachine: efficiently providing a user with a scan of an entire day’s newspaper without requiring the download of hundreds of megabytes of data. Then, we’ll discuss a fascinating string matching problem we had to solve in order to include articles published after 1980 in TimesMachine.

It’s not all the extant Hebrew Bible witnesses, both images and transcription, or all extant cuneiform tablets with existing secondary literature, but if you are interested in more recent events, what a magnificent resource!

Tesseract-ocr gets a shout-out and link for its use on the New York Times archives.

The string matching solution for search shows the advantages of finding a “nearly perfect” solution.
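Their post has the details, but the shape of the problem is aligning clean article text against noisy OCR from scanned pages. A toy sketch of that kind of fuzzy matching with Python’s difflib (illustrative only, not the Times’ implementation):

from difflib import SequenceMatcher

def similarity(a, b):
    # Rough similarity between clean article text and noisy OCR text.
    return SequenceMatcher(None, a.lower(), b.lower(), autojunk=False).ratio()

article = "the mayor announced the budget on friday"
pages = {
    "page-12": "...the rnayor annouuced the budget on fr1day, citing...",
    "page-13": "...box scores and classified advertisements...",
}
# Pick the scanned page whose OCR looks most like the article text.
print(max(pages, key=lambda p: similarity(article, pages[p])))  # page-12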

February 1, 2016

How’s Your Proofing Skill?

Filed under: Proofing — Patrick Durusau @ 5:20 pm

How would you rate your proofreading skills?

It’s important to get things right in writing as well as code.

Where do you think you fall on a scale of 1 to 10, with 10 being nothing gets by you?

Think of the last time you saw a sign for a 7-Eleven store. If you don’t know what that is, try this link for images of 7-Eleven stores. Thousands of them.

Any comments?

This image is spoiler space:

[Image: spoiler space]

As is this one:

[Image: random bitmap, more spoiler space]

Have no fear, that’s not the image from Snow Crash. 😉

Here is an annotated 7-Eleven image posted by Michael G. Munz on Twitter.

[Image: annotated 7-Eleven sign]

So, how did you do?

Were you paying close enough attention?

Personal confession: I grew up in the land of 7-Eleven and never gave it a second thought. Ever.

In case you want to explore more, see 7-Eleven’s corporate site.

Where Does Your Dope Come From? [Interviewing Tips]

Filed under: Journalism,Mapping,Maps,News,Reporting — Patrick Durusau @ 4:35 pm

Visualizing Mexico’s Drug Cartels: A Roundup of Maps by Aleszu Bajak.

From the post:

With the big news this week of the arrest of Joaquín “El Chapo” Guzmán, the head of Mexico’s largest drug cartel, most of the attention is being paid to actor Sean Penn’s Rolling Stone interview with the kingpin in his mountain hideout in Mexico.

But where’s the context? How powerful is the Sinaloa cartel that he has run for decades and the other Mexican drug cartels, for that matter? Storybench has been sifting through a wealth of graphics on the workings of the drug trade in Mexico and its impact on the United States that help readers begin to understand the bigger picture of this complex drug war. So now that you’ve read your fill on Sean Penn’s (and Rolling Stone’s) editorial shortcomings, check out these impressive visualizations taken from news organizations, non-profits and government agencies.

Bajak presents a stunning array of maps that visualize the influence of Mexican drug cartels.

One of the most interesting has the title: United States: Areas of Influence of Major Mexican Transnational Criminal Organizations.

[Image: DEA map of Mexican cartel areas of influence in the United States]

(You will need to select the image to get a usefully sized version.)

All of the maps are interesting and some possibly more useful than others, say if you are planning on expanding the drug trade in one area but not another.

What I found missing was a map of all the organizations profiting from the war on drugs. Yes?

Location and approximate incomes of drug cartels, agencies, law enforcement offices, government budgets, etc.

The war on drugs isn’t just about the income (and failure to pay taxes) of the drug cartels, it is also about the allocation of personnel and budgets in law enforcement organizations, prisons that house drug offenders, etc.

One persuasive graphic would be the economic impact on government organizations if the drug trade stopped tomorrow and drug offense prisoners were released from jail.

There is a symbiotic relationship in the war on drugs. Government agents limit available competition and help keep prices artificially high. Drug cartels provide a desired product and a rationale for the existence of police and related agencies.

A rather cozy, if adversarial, arrangement. (A topic map could clarify the benefits to both sides but truth telling isn’t a characteristic of either side.)

PS: Do read the piece on what Sean Penn should have done for his interview with El Chapo. It makes a good checklist of what reporters don’t do when interviewing political candidates or sitting members of government.

They want to be asked back if you know what I mean.

3 Decades of High Quality News! (code reuse)

Filed under: Archives,Journalism,News,Reporting,Topic Maps — Patrick Durusau @ 4:06 pm

‘NewsHour’ archives to be digitized and available online by Dru Sefton.

From the post:

More than three decades of NewsHour are heading to an online home, the American Archive of Public Broadcasting.

Nearly 10,000 episodes that aired from 1975 to 2007 will be archived through a collaboration among AAPB; WGBH in Boston; WETA in Arlington, Va.; and the Library of Congress. The organizations jointly announced the project Thursday.

“The project will take place over the next year and a half,” WGBH spokesperson Emily Balk said. “The collection should be up in its entirety by mid-2018, but AAPB will be adding content from the collection to its website on an ongoing, monthly basis.”

Looking forward to that collection!

Useful on its own, but even more so if you had an indexical object that could point to a subject in a PBS news episode and at the same time, point to episodes on the same subject from other TV and radio news archives, not to mention the same subject in newspapers and magazines.

Oh, sorry, that would be a topic in ISO 13250-2 parlance and the more general concept of a proxy in ISO 13250-5. Thought I should mention that before someone at IBM runs off to patent another pre-existing idea.

I don’t suppose padding patent statistics hurts all that much, considering that the Supremes are poised to invalidate process and software patents in one fell swoop.

Hopefully economists will be ready to measure the amount of increased productivity (legal worries about and enforcement of process/software patents aren’t productive activities) from foreclosing even the potential of process or software patents.

Copyright is more than sufficient to protect source code, as if any programmer is going to use another programmer’s code. They say that scientists would rather use another scientist’s toothbrush than his vocabulary.

Copying another programmer’s code (code re-use) is more akin to sharing a condom. It’s just not done.

