Clojure for Data Science [Caution: Danger of Buyer’s Regret]

February 6th, 2016

Clojure for Data Science by Mike Anderson.

From the webpage:

Presentation given at the Jan 2016 Singapore Clojure Users’ Group

You will have to work at the presentation because there is no accompanying video, but the effort will be well spent.

Before you review these slides or pass them on to others, take fair warning that you may experience “buyer’s regret” with regard to your current programming language/paradigm (if not already Clojure).

However powerful and shiny your present language seems now, its luster will be dimmed after scanning over these slides.

Don’t say you weren’t warned ahead of time!

BTW, if you search for “clojure for data science” (with the quotes) you will find among other things:

Clojure for Data Science Progressing by Henry Garner (Packt)

Repositories for the Clojure for Data Science Processing book.

@cljds Clojure Data Science twitter feed (Henry Garner). VG!

Clojure for Data Science Some 151 slides by Henry Garner.

Plus:

Planet Clojure, a metablog that collects posts from other Clojure blogs.

As a close friend says from time to time, “clojure for data science,”

G*****s well.;-)

Enjoy!

Between the Words [Alternate Visualizations of Texts]

February 6th, 2016

Between the Words – Exploring the punctuation in literary classics by Nicholas Rougeux.

From the webpage:

Between the Words is an exploration of visual rhythm of punctuation in well-known literary works. All letters, numbers, spaces, and line breaks were removed from entire texts of classic stories like Alice’s Adventures in Wonderland, Moby Dick, and Pride and Prejudice—leaving only the punctuation in one continuous line of symbols in the order they appear in texts. The remaining punctuation was arranged in a spiral starting at the top center with markings for each chapter and classic illustrations at the center.
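The stripping step described above is easy to try on your own texts. What follows is a minimal sketch, not Rougeux's actual method: it assumes "punctuation" means any character in a Unicode punctuation category (P*), so symbols such as $ or + are dropped along with letters, digits, spaces and line breaks. The function name punctuation_only is made up for illustration.

```python
import unicodedata


def punctuation_only(text: str) -> str:
    """Return only the punctuation characters of text, in original order.

    Assumption: "punctuation" means Unicode general categories starting
    with "P" (Po, Pd, Ps, Pe, Pi, Pf, Pc). Letters, digits, whitespace
    and symbols (category S*) are all discarded.
    """
    return "".join(
        ch for ch in text if unicodedata.category(ch).startswith("P")
    )


if __name__ == "__main__":
    sample = "Call me Ishmael. Some years ago--never mind how long precisely"
    print(punctuation_only(sample))
```

Feed it the full text of a novel and you have the one continuous line of symbols; arranging that line in a spiral is a separate layout problem.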

The posters are 24″ x 36″.

Some small images to illustrate the concept:

[Images: A Christmas Carol, A Tale of Two Cities, Alice’s Adventures in Wonderland]

I’m not an art critic but I can say that unusual or unexpected visualizations of data can lead to new insights. Or should I say different insights than you may have previously held.

Seeing this visualization reminded me of a presentation too many years ago at Cambridge that argued the cantillation marks (think, crudely, “accents”) in the Hebrew Bible were a reliable guide to clause boundaries and reading.

FYI, the versification and divisions in the oldest known witnesses to the Hebrew Bible were added centuries after the text stabilized. There are generally accepted positions on the text but at best, they are just that, generally accepted positions.

Any number of alternative presentations of texts suggest themselves.

I haven’t performed the experiment, but for numeric data, reordering the data so as to force a re-casting of formulas could be a way to explore presumptions that are glossed over by the “usual form.”

Not unlike copying a text by hand as opposed to typing or photocopying the text. Each step of performing the task with less deliberation increases the odds you will miss some decision that you are making unconsciously.

If you like these posters or know an English major/professor who may, pass this site along to them. (I have no interest, financial or otherwise, in this site but I like to encourage creative thinking.)

I first saw this in a tweet by Christopher Phipps.

Finding Roman Roads

February 6th, 2016

You (yes, you) can find Roman roads using data collected by lasers by Barbara Speed.

Barbara reports that using Lidar data available from the UK Survey portal, David Ratledge was able to discover a Roman road between Ribchester and Lancaster.

She closes with:


The Environment Agency is planning to release 11 Terabytes (for Luddites: that’s an awful lot of data) worth of LIDAR information as part of the Department for Environment, Food and Rural Affairs’ open data initiative, available through this portal. Which means that any of us could download it and dig about for more lost roads.

That seems a bit thin on the advice side, if you are truly interested in using the data to find Roman roads and other sites.

An article posted under ‘Lost’ Roman road is discovered, doesn’t provide more on the technique but does point to Roman Roads in Lancashire. Interesting site but no help on using the data.

I can’t comment on the ease of use or documentation but LiDAR tools are available at: Free LiDAR tools.

See also my post on the OpenTopography Project.

How To Profit from Human Trafficking – Become a Trafficker or NGO

February 6th, 2016

Special Report: Money and Lies in Anti-Human Trafficking NGOs by Anne Elizabeth Moore.

From the post:

The United States’ beloved – albeit disgraced – anti-trafficking advocate Somaly Mam has been waging a slow but steady return to glory since a Newsweek cover story in May 2014 led to her ousting from the Cambodian foundation that bore her name. The allegations in the article were not new; they’d been reported and corroborated in bits and pieces for years. The magazine simply pointed out that Mam’s personal narrative as a survivor of sex trafficking and the similar stories that emerged from both clients and staff at the non-governmental organization (NGO) she founded to assist survivors of sex trafficking, were often unverifiable, if not outright lies.

Panic ensued. Mam had helped establish, for US audiences, key plot points in the narrative of trafficking and its future eradication. Her story is that she was forced into labor early in life by someone she called “Grandfather,” who then sold off her virginity and forced her into a child marriage. Later she says she was sold to a brothel where she watched several contemporaries die in violence. Childhood friends and even family members couldn’t verify Mam’s recollection of events for Newsweek, but Mam has suggested that her story is typical of trafficking victims.

Mam has also cultivated a massive global network of anti-trafficking NGOs, funders and supporters, who have based their missions, donations and often life’s work on her emotional – but fabricated – tale. Some distanced themselves from the Cambodian activist last spring, including her long-time supporter at The New York Times, Nicholas Kristof, while others suggested that even if untrue, Mam’s stories were told in support of a worthy cause and were therefore true enough.

Moore characterizes NGOs organized to stop human trafficking as follows:


Considering their common mythical enemy – the nameless and faceless men portrayed in TV dramas who trade in nubile human girl stock – one would hope anti-trafficking organizations would unite in an effort to be less shady. With names reliant on metaphors of recovery, light and sanctuary, anti-trafficking groups project an image of transparency. Yet these groups have shown a remarkable lack of fiscal accountability and organizational consistency, often even eschewing an open acknowledgement of board members, professional affiliates and funding relationships. The problems with this evasion go beyond ethical considerations: A certain level of budgetary disclosure, for example, is a legal requirement for tax-exempt 501(c)(3) organizations. Yet anti-trafficking groups fold, move, restructure and reappear under new names with alarming frequency, making them almost as difficult to track as their supposed foes.

It is a very compelling article that will leave you with more questions about the finances of NGOs “opposing” human trafficking than answers.

The lack of answers isn’t Moore’s fault; the NGOs in question were designed to make obtaining answers difficult, if not impossible.

After you read the article, more than once to get the full impact, how would you:

  1. Track organizations in the article that: “…fold, move, restructure and reappear under new names with alarming frequency…”?
  2. How would you gather and share data on those organizations?
  3. How would you map what data is available on funding to Moore’s report?
  4. How would you make Moore’s snapshot of data subject to updating by later reporters?
  5. How would you track the individuals involved in the NGOs you track?

The answers to those questions are applicable to human traffickers as well.

Consider it to be a “two-for.”

The Vietnam War: A Non-U.S. Photo Essay

February 6th, 2016

1965-1975 Another Vietnam by Alex Q. Arbuckle.

From the post:

For much of the world, the visual history of the Vietnam War has been defined by a handful of iconic photographs: Eddie Adams’ image of a Viet Cong fighter being executed, Nick Ut’s picture of nine-year-old Kim Phúc fleeing a napalm strike, Malcolm Browne’s photo of Thích Quang Duc self-immolating in a Saigon intersection.

Many famous images of the war were taken by Western photographers and news agencies, working alongside American or South Vietnamese troops.

But the North Vietnamese and Viet Cong had hundreds of photographers of their own, who documented every facet of the war under the most dangerous conditions.

Almost all were self-taught, and worked for the Vietnam News Agency, the National Liberation Front, the North Vietnamese Army or various newspapers. Many sent in their film anonymously or under a nom de guerre, viewing themselves as a humble part of a larger struggle.

A timely reminder that Western media and government approved photographs are evidence for only one side of any conflict.

Efforts by Twitter and Facebook to censor any narrative other than a Western one on the Islamic State should be very familiar to anyone who remembers the “Western view only” from media reports in the 1960’s.

Censorship, whether during Vietnam or in opposition to the Islamic State, doesn’t make the “other” narrative go away. It cannot deny the facts known to residents in a war zone.

The only goal that censorship achieves, and not always, is to keep the citizens of the censoring powers in ignorance. So much for freedom of speech. You can’t talk about what you don’t know about.

The essay uses images from Another Vietnam: Pictures of the War from the Other Side. I checked at National Geographic, the publisher, and it isn’t listed in their catalog. Used/new the book is about $160.00 and contains 180 never before published photographs.

Questions come to mind:

Where are the other North Vietnam/Viet Cong photos now? Shouldn’t those be documented, digitized and placed online?

Where are the Islamic State’s photos and videos that are purged from Twitter and Facebook?

The media is repeating the same mistake with the Islamic State that it made during Vietnam.

No reader can decide between competing narratives in the face of only one narrative.

Nor can they avoid making the same mistakes as have been made in the past.

Vietnam is a very good example of such a mistake.

Replacing the choices of other cultures with our own is a mission doomed to failure (and defeat).

I first saw this in a tweet by Lars Marius Garshol.

Are You A Scientific Twitter User or Polluter?

February 6th, 2016

Realscientists posted this image to Twitter:

[Image: science]

Self-Scoring Test:

In the last week, how often have you retweeted without “read[ing] the actual paper” pointed to by a tweet?

How many times did you retweet in total?

Formula: (retweets w/o reading / retweets in total) x 100 = % of retweets w/o reading.
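If you would rather not do the arithmetic by hand, here is a small sketch of the self-scoring formula. The function name and the input checks are my own additions for illustration; the only thing taken from the test above is the ratio itself.

```python
def unread_retweet_percent(unread: int, total: int) -> float:
    """Percentage of retweets made without reading the linked paper.

    unread: retweets sent without reading the paper pointed to
    total:  all retweets in the same period (must be at least 1)
    """
    if total <= 0:
        raise ValueError("total retweets must be at least 1")
    if not 0 <= unread <= total:
        raise ValueError("unread retweets must be between 0 and total")
    return 100.0 * unread / total


if __name__ == "__main__":
    # Example week: 12 retweets, 3 of them sent without reading.
    print(unread_retweet_percent(3, 12))
```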

No scale with superlatives because I don’t have numbers to establish a baseline for the “average” Twitter user.

I do know that I see click-bait, out-dated and factually wrong material retweeted by people who know better. That’s Twitter pollution.

Ask yourself: Am I a scientific Twitter user or a polluter?

Your call.

Is Twitter A Global Town Censor? (Data Project)

February 5th, 2016

Twitter Steps Up Efforts to Thwart Terrorists’ Tweets by Mike Isaac.

From the post:

For years, Twitter has positioned itself as a “global town square” that is open to discourse from all. And for years, extremist groups like the Islamic State have taken advantage of that stance, using Twitter as a place to spread their messages.

Twitter on Friday made clear that it was stepping up its fight to stem that tide. The social media company said it had suspended 125,000 Twitter accounts associated with extremism since the middle of 2015, the first time it has publicized the number of accounts it has suspended. Twitter also said it had expanded the teams that review reports of accounts connected to extremism, to remove the accounts more quickly.

“As the nature of the terrorist threat has changed, so has our ongoing work in this area,” Twitter said in a statement, adding that it “condemns the use of Twitter to promote terrorism.” The company said its collective moves had already produced results, “including an increase in account suspensions and this type of activity shifting off Twitter.”

The disclosure follows intensifying pressure on Twitter and other technology companies from the White House, presidential candidates like Hillary Clinton and government agencies to take more action to combat the digital practices of terrorist groups. The scrutiny has grown after mass shootings in Paris and San Bernardino, Calif., last year, because of concerns that radicalizations can be accelerated by extremist postings on the web and social media.

Just so you know what the Twitter rule is:

Violent threats (direct or indirect): You may not make threats of violence or promote violence, including threatening or promoting terrorism. (The Twitter Rules)

Here’s your chance to engage in real data science and help decide whether Twitter has changed from global town square to global town censor.

Here’s the data gathering project:

Monitor all the Twitter streams for Republican and Democratic candidates for the U.S. presidency for tweets advocating violence/terrorism.

File requests with Twitter for those accounts to be replaced.

FYI: When you report a message (Reporting a Tweet or Direct Message for violations), it will disappear from your Messages inbox.

You must copy every tweet you report (accounts disappear as well) if you want to keep a record of your report.

Keep track of your reports and the tweet you copied before reporting.

Post the record of your reports and the tweets reported, plus any response from Twitter.

Suggestions on how to format these reports?
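To start the discussion, here is one possible format: one JSON object per line (JSON Lines), appended to a plain text file that is easy to share and merge. This is only a sketch. Every field name below is my own invention, not anything Twitter defines, and the sample values are placeholders.

```python
import json
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class TweetReport:
    """One reported tweet. All field names are hypothetical suggestions."""
    reported_at: str            # ISO 8601 time the report was filed
    account: str                # account that posted the tweet
    tweet_url: str              # link to the tweet as it was before reporting
    tweet_text: str             # full copy of the tweet, made BEFORE reporting
    rule_cited: str             # the Twitter rule named in the report
    twitter_response: Optional[str] = None  # filled in if Twitter replies


def to_json_line(report: TweetReport) -> str:
    """Serialize a report as a single line of JSON with stable key order."""
    return json.dumps(asdict(report), ensure_ascii=False, sort_keys=True)


def append_report(path: str, report: TweetReport) -> None:
    """Append a report to a JSON Lines file, creating the file if needed."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(to_json_line(report) + "\n")


if __name__ == "__main__":
    example = TweetReport(
        reported_at="2016-02-05T18:30:00Z",
        account="@example_candidate",
        tweet_url="https://twitter.com/example_candidate/status/0",
        tweet_text="(copy of the tweet text goes here)",
        rule_cited="Violent threats (direct or indirect)",
    )
    print(to_json_line(example))
```

One record per line means reports from many volunteers can be concatenated without editing, and a later response from Twitter can be logged as a new line rather than by rewriting an old one.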

Or would you rather not know what Twitter is deciding for you?

How much data needs to be collected to move onto part 2 of the project – data analysis?


Suggestions on who at Twitter to contact for a listing of the 125,000 accounts that were silenced along with the Twitter history for each one? (Or the entire history of silenced accounts at Twitter? Who gets censored by topic, race, gender, location, etc., are all open questions.)

That could change the Twitter process from a black box to having marginally more transparency. You would have to guess at why any particular account was silenced.

If Twitter wants to take credit for censoring public discourse then the least it can do is be honest about who was censored and what they were saying to be censored.

Yes?

Ethical Data Scientists: Will You Support A False Narrative – “Community of Hope?”

February 5th, 2016

Google executive Anthony House advocates a false narrative, a “community of hope” as a counter to truthful content from the Islamic State:

We should get the bad stuff down [online], but it’s also extremely important that people are able to find good information, that when people are feeling isolated, that when they go online, they find a community of hope, not a community of harm. (Google plans to fight extremist propaganda with AdWords)

Islamic State media is offering a community of hope. One based on facts, not a fantasy of Western planners.

The more immediate, but no less intractable, challenge is to change the reality on the ground in Syria and Iraq, so that ISIS’s narrative of Sunni Muslim persecution at the hands of the Assad regime and Iranian-backed Shiite militias commands less resonance among Sunnis. One problem in countering that narrative is that some of it happens to be true: Sunni Muslims are being persecuted in Syria and Iraq. This blunt empirical fact, just as much as ISIS’s success on the battlefield, and the rhetorical amplification and global dissemination of that success via ISIS propaganda, helps explain why ISIS has been so effective in recruiting so many foreign fighters to its cause. (Why It’s So Hard to Stop ISIS Propaganda)

Persecution of Sunni Muslims isn’t the only fact in the Islamic State narrative. Consider the following:

  • Muslim governments exist at the sufferance of the West. Ex. Afghanistan, Iran, Libya, Syria
  • Existing “Muslim” leaders are vassals of the West.
  • For more than a century the West has dictated the fate of Muslims in the Middle East.
  • The West supports oppression of the Palestinian people.
  • The West opposes democratic results in Muslim countries that don’t accord with its wishes.

We might disagree on the phrasing of those facts but can an ethical data scientist say they are not true?

Whatever the motivation of the West in each case, the West wants to decide the fate of Muslims.

Is the “community of hope” Google portrays to be based on false hopes or new realities on the ground?

There’s a question for all the “ethical” data scientists at Google.

Will you support a false narrative by Google for a “community of hope” to deter terrorism?

Beating Body Scanners

February 4th, 2016

Just on the off chance that some government mandates wholly ineffectual full body scanners for security purposes, Jonathan Corbett has two videos that demonstrate the ease with which such scanners can be defeated, completely.

Oh, I forgot, the US government has mandated such scanners!

Jonathan maintains a great site at: http://professional-troublemaker.com/. You can follow him @_JonCorbett.

Jon is right about the scanners being ineffectual but being effective wasn’t part of the criteria for purchasing the systems. Scanners were purchased to give the impression of frenzied activity, even if it was totally ineffectual.

What would happen if a terrorist did attack an airport, through one of the hundreds of daily lapses in security? What would the government say if it weren’t engaged in non-stop but meaningless activity?

Someone would say, falsely, that it was inaction on the part of government that enabled the attack.

Stuff and nonsense.

“Terrorist” attacks, actually violence committed by criminals by another name, can and will happen no matter what measures are taken by the government. Short of having an all-nude policy beginning at the perimeter of the airport and prohibiting anything larger than a clear quart zip lock bag being shipped. With passengers or as cargo.

Even then it isn’t hard to imagine several dozen ways to carry out “terrorist” attacks at any airport.

The sooner government leaders begin to educate their citizens that some risks are simply unavoidable, the sooner money can stop being wasted on visible but ineffectual efforts like easily defeated body scanners.

Comodo Chromodo browser – Danger! Danger! – Discontinue Use

February 4th, 2016

Comodo Chromodo browser does not enforce same origin policy and is based on an outdated version of Chromium

From the overview:

Comodo Chromodo browser, version 45.8.12.392, 45.8.12.391, and possibly earlier, does not enforce same origin policy, which allows for the possibility of cross-domain attacks by malicious or compromised web hosts. Chromodo is based on an outdated release of Chromium with known vulnerabilities.

Solution

The CERT/CC is currently unaware of a practical solution to this problem and recommends the following workarounds.

Disable JavaScript

Disabling JavaScript may mitigate cross-domain scripting attacks. For instructions, refer to Comodo’s help page.

Note that disabling JavaScript may not protect against known vulnerabilities in the version of Chromium on which Chromodo is based. For this reason, users should prioritize implementing the following workaround.

Discontinue use

Until these issues are addressed, consider discontinuing use of Chromodo.

Discontinue use is about as extreme a workaround as I can imagine.
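For anyone unsure what is being skipped when a browser “does not enforce same origin policy”: two URLs share an origin only when scheme, host and port all match, and scripts from one origin are normally kept away from data belonging to another. The sketch below shows just that comparison. It is an illustration of the concept under my own simplifications (default ports for http and https only), not how Chromium or Chromodo implements the check, and the helper names are made up.

```python
from urllib.parse import urlsplit

# Assumption: only http and https default ports are handled in this sketch.
DEFAULT_PORTS = {"http": 80, "https": 443}


def origin(url: str) -> tuple:
    """Return the (scheme, host, port) triple that defines a URL's origin."""
    parts = urlsplit(url)
    scheme = parts.scheme.lower()
    host = (parts.hostname or "").lower()
    # Fall back to the scheme's default port when none is written out.
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(scheme)
    return (scheme, host, port)


def same_origin(url_a: str, url_b: str) -> bool:
    """True only if scheme, host and port are all identical."""
    return origin(url_a) == origin(url_b)


if __name__ == "__main__":
    print(same_origin("https://example.com/a", "https://example.com:443/b"))
    print(same_origin("https://example.com/", "https://evil.example.net/"))
```

A browser that skips this check lets a page from the second site read what you have open on the first, which is why the advisory treats the flaw as serious.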

Too bad the Comodo site doesn’t say anything about refunds and/or compensation for damaged customers.

Would you say that without any penalty, there is no incentive for Comodo to produce better software?

Or to put it differently, where is the downside to Comodo producing buggy software?

Where does that impact their bottom line?

I first saw this in a tweet by SecuriTay.

Toneapi helps your writing pack an emotional punch [Not For The Ethically Sensitive]

February 4th, 2016

Toneapi helps your writing pack an emotional punch by Martin Bryant.

From the post:

Language analysis is a rapidly developing field and there are some interesting startups working on products that help you write better.

Take Toneapi, for example. This product from Northern Irish firm Adoreboard is a Web-based app that analyzes (and potentially improves) the emotional impact of your writing.

Paste in some text, and it will offer a detailed visualization of your writing.

If you aren’t overly concerned about manipulating, sorry, persuading your readers to your point of view, you might want to give Toneapi a spin. Martin reports that IBM’s Watson has Tone Analyzer and you should also consider Textio and Relative Insight.

Before this casts an Orwellian pall over your evening/day, remember that focus groups and testing messages have been the staple of advertising for decades.

What these software services do is make a crude form of that capability available to the average citizen.

Some people have a knack for emotional language, like Donald Trump, but I can’t force myself to write in incomplete sentences or with one-syllable words. Maybe there’s an app for that? Suggestions?

The Ethical Data Scientist

February 4th, 2016

The Ethical Data Scientist by Cathy O’Neil.

From the post:

….
After the financial crisis, there was a short-lived moment of opportunity to accept responsibility for mistakes with the financial community. One of the more promising pushes in this direction was when quant and writer Emanuel Derman and his colleague Paul Wilmott wrote the Modeler’s Hippocratic Oath, which nicely sums up the list of responsibilities any modeler should be aware of upon taking on the job title.

The ethical data scientist would strive to improve the world, not repeat it. That would mean deploying tools to explicitly construct fair processes. As long as our world is not perfect, and as long as data is being collected on that world, we will not be building models that are improvements on our past unless we specifically set out to do so.

At the very least it would require us to build an auditing system for algorithms. This would be not unlike the modern sociological experiment in which job applications sent to various workplaces differ only by the race of the applicant—are black job seekers unfairly turned away? That same kind of experiment can be done directly to algorithms; see the work of Latanya Sweeney, who ran experiments to look into possible racist Google ad results. It can even be done transparently and repeatedly, and in this way the algorithm itself can be tested.

The ethics around algorithms is a topic that lives only partly in a technical realm, of course. A data scientist doesn’t have to be an expert on the social impact of algorithms; instead, she should see herself as a facilitator of ethical conversations and a translator of the resulting ethical decisions into formal code. In other words, she wouldn’t make all the ethical choices herself, but rather raise the questions with a larger and hopefully receptive group.

First, the link for the Modeler’s Hippocratic Oath takes you to a splash page at Wiley for Derman’s book: My Life as a Quant: Reflections on Physics and Finance.

The Financial Modelers’ Manifesto (PDF) and The Financial Modelers’ Manifesto (HTML), are valid links as of today.

I commend the entire text of The Financial Modelers’ Manifesto to you for repeated reading but for present purposes, let’s look at the Modelers’ Hippocratic Oath:

~ I will remember that I didn’t make the world, and it doesn’t satisfy my equations.

~ Though I will use models boldly to estimate value, I will not be overly impressed by mathematics.

~ I will never sacrifice reality for elegance without explaining why I have done so.

~ Nor will I give the people who use my model false comfort about its accuracy. Instead, I will make explicit its assumptions and oversights.

~ I understand that my work may have enormous effects on society and the economy, many of them beyond my comprehension.

It may just be me but I don’t see a charge being laid on data scientists to be the ethical voices in organizations using data science.

Do you see that charge?

To put it more positively, aren’t other members of the organization, accountants, engineers, lawyers, managers, etc., all equally responsible for spurring “ethical conversations?” Why is this a peculiar responsibility for data scientists?

I take a legal ethics view of the employer – employee/consultant relationship. The client is the ultimate arbiter of the goal and means of a project, once advised of their options.

Their choice may or may not be mine but I haven’t ever been hired to play the role of Jiminy Cricket.

[Image: Jiminy Cricket]

It’s heady stuff to be responsible for bringing ethical insights to the clueless but sometimes the clueless have ethical insights on their own, or not.

Data scientists can and should raise ethical concerns but no more or less than any other member of a project.

As you can tell from reading this blog, I have very strong opinions on a wide variety of subjects. That said, unless a client hires me to promote those opinions, the goals of the client, by any legal means, are my only concern.

PS: Before you ask, no, I would not work for Donald Trump. But that’s not an ethical decision. That’s simply being a good citizen of the world.

Spontaneous Preference for their Own Theories (SPOT effect) [SPOC?]

February 4th, 2016

The SPOT Effect: People Spontaneously Prefer their Own Theories by Aiden P. Gregg, Nikhila Mahadevan, and Constantine Sedikides.

Abstract:

People often exhibit confirmation bias: they process information bearing on the truth of their theories in a way that facilitates their continuing to regard those theories as true. Here, we tested whether confirmation bias would emerge even under the most minimal of conditions. Specifically, we tested whether drawing a nominal link between the self and a theory would suffice to bias people towards regarding that theory as true. If, all else equal, people regard the self as good (i.e., engage in self-enhancement), and good theories are true (in accord with their intended function), then people should regard their own theories as true; otherwise put, they should manifest a Spontaneous Preference for their Own Theories (i.e., a SPOT effect). In three experiments, participants were introduced to a theory about which of two imaginary alien species preyed upon the other. Participants then considered in turn several items of evidence bearing on the theory, and each time evaluated the likelihood that the theory was true versus false. As hypothesized, participants regarded the theory as more likely to be true when it was arbitrarily ascribed to them as opposed to an “Alex” (Experiment 1) or to no one (Experiment 2). We also found that the SPOT effect failed to converge with four different indices of self-enhancement (Experiment 3), suggesting it may be distinctive in character.

I can’t give you the details on this article because it is fire-walled.

But the catch phrase, “Spontaneous Preference for their Own Theories (i.e., a SPOT effect)” certainly fits every discussion of semantics I have ever read or heard.

With a little funding you could prove the corollary, Spontaneous Preference for their Own Code (the SPOC effect) among programmers. ;-)

There are any number of formulations for how to fight confirmation bias but Jeremy Dean puts it this way:


The way to fight the confirmation bias is simple to state but hard to put into practice.

You have to try and think up and test out alternative hypothesis. Sounds easy, but it’s not in our nature. It’s no fun thinking about why we might be misguided or have been misinformed. It takes a bit of effort.

It’s distasteful reading a book which challenges our political beliefs, or considering criticisms of our favourite film or, even, accepting how different people choose to live their lives.

Trying to be just a little bit more open is part of the challenge that the confirmation bias sets us. Can we entertain those doubts for just a little longer? Can we even let the facts sway us and perform that most fantastical of feats: changing our minds?

I wonder if that includes imagining using JSON? (shudder) ;-)

Hard to do, particularly when we are talking about semantics and what we “know” to be the best practices.

Examples of trying to escape the confirmation bias trap and the results?

Perhaps we can encourage each other.

SQL Injection Hall-Of-Shame / Internet-of-Things Hall-Of-Shame

February 4th, 2016

SQL Injection Hall-Of-Shame by Arthur Hicken.

From the webpage:

In this day and age it’s ridiculous how frequently large organizations are falling prey to SQL Injection which is almost totally preventable as I’ve written previously.

Note that this is a work in progress. If I’ve missed something you’re aware of please let me know in the comments at the bottom of the page.

Don’t let this happen to you! For some simple tips see the OWASP SQL Injection Prevention Cheat Sheet. For more security info check out the security resources page and the book SQL Injection Attacks and Defense or Basics of SQL injection Analysis, Detection and Prevention: Web Security for more info.
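On the “almost totally preventable” point: the core defense in the OWASP cheat sheet is the parameterized query, where user input is passed as data and never spliced into the SQL text. The sketch below uses Python's built-in sqlite3 module and an in-memory database; the table, its rows and the function names are invented purely for the demonstration.

```python
import sqlite3

# Throwaway in-memory database with made-up rows for the demonstration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, secret TEXT)")
conn.executemany(
    "INSERT INTO users VALUES (?, ?)",
    [("alice", "a-secret"), ("bob", "b-secret")],
)


def find_unsafe(name: str) -> list:
    """VULNERABLE: builds the SQL text by pasting user input into it."""
    query = f"SELECT name FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()


def find_safe(name: str) -> list:
    """Parameterized: the ? placeholder keeps the input as plain data."""
    return conn.execute(
        "SELECT name FROM users WHERE name = ?", (name,)
    ).fetchall()


if __name__ == "__main__":
    attack = "nobody' OR '1'='1"
    print(len(find_unsafe(attack)))  # the injected OR clause matches every row
    print(len(find_safe(attack)))    # no user has that literal name
```

The fix is a one-line change, which is what makes the length of the hall-of-shame list so hard to excuse.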

IOT HALL-OF-SHAME

With the rise of internet enabled devices in the Internet of Things or IoT the need for software security is becoming even more important. Unfortunately many device makers seem to put security on the back burner or not even understand the basics of cybersecurity.

I am maintaining here a list of known hacks for “things”. The list is short at the moment but will grow, and is often more generic than it could be. It’s kind of in reverse-chronological order, based on the date that the hack was published. Please assist – if you’re aware of additional thing-hacks please let me know in the comments at the bottom of the page.

I assume you find “wall-of-shame” efforts as entertaining as I do.

I am aware of honor-shame debates from a biblical studies perspective, on which see: Complete Bibliography of Honor-Shame Resources

“Complete” is a relative term when used regarding any bibliography in biblical studies and this appears to have at least one resource from 2011, but none later. You can run the references forward to collect more recent literature.

But the question with shaming techniques is: are they effective?

As a case in point, consider Researchers find it’s terrifyingly easy to hack traffic lights where the post points out:

In fact, the most upsetting passage in the entire paper is the dismissive response issued by the traffic controller vendor when the research team presented its findings. According to the paper, the vendor responsible stated that it “has followed the accepted industry standard and it is that standard which does not include security.”

We can entertain ourselves by shaming vendors all day but only the “P” word will drive greater security.

“P” as in penalty.

Vormetric found that to be the case in What Drives Compliance? Hint: The P Word Missing From Cybersecurity Discussions.

Be entertained by wall-of-shame efforts but lobby for compliance enforced by penalties. (Known to anthropologists as a fear culture.)

Truthful Paedophiles On The Darknet?

February 4th, 2016

There is a credibility flaw in Cryptopolitik and the Darknet by Daniel Moore & Thomas Rid that I overlooked yesterday (The Dark Web, “Kissing Cousins,” and Pornography). Perhaps it was just too obvious to attract attention.

Moore and Rid write:

The pornographic content was perhaps the most distressing. Websites dedicated to providing links to videos purporting to depict rape, bestiality and paedophilia were abundant. One such post at a supposedly nonaffiliated content-sharing website offered a link to a video of ‘a 12 year old girl … getting raped at school by 4 boys’.52 Other examples include a service that sold online video access to the vendor’s own family members:

My two stepsisters … will be pleased to show you their little secrets. Well, they are rather forced to show them, but at least that’s what they are used to.53

Several communities geared towards discussing and sharing illegitimate fetishes were readily available, and appeared to be active. Under the shroud of anonymity, various users appeared to seek vindication of their desires, providing words of support and comfort for one another in solidarity against what was seen as society’s unjust discrimination against non-mainstream sexual practices. Users exchanged experiences and preferences, and even traded content. One notable example from a website called Pedo List included a commenter freely stating that he would ‘Trade child porn. Have pics of my daughter.’54 There appears to be no fear of retribution or prosecution in these illicit communities, and as such users apparently feel comfortable enough to share personal stories about their otherwise stifled tendencies. (page 23)

Despite their description of hidden services as dens of iniquity and crime, those who use them are suddenly paragons of truthfulness, at least when it suits the authors’ purpose?

Doesn’t crediting the content of the Darknet as truthful, as opposed to being wishful, fantasy, or even police officers posing to investigate (some would say entrap) others, strain the imagination?

Some of the content is no doubt truthful but policy arguments need to be based on facts, not a collection of self-justifying opinions from like minded individuals.

A quick search on the string (without quotes):

police officers posing as children sex rings

Returns 9.7 million “hits.”

How many of those police officers appeared in the postings collected by Moore & Rid it isn’t possible to say.

But in science, there is this thing called the burden of proof. That is, simply asserting a conclusion, even citing equally non-evidence-based conclusions, isn’t sufficient to prove a claim.

Moore & Rid had the burden to prove that the Darknet is a wicked place that poses all sorts of dangers and hazards.

As I pointed out yesterday, The Dark Web, “Kissing Cousins,” and Pornography, their “proof” is non-replicable conclusions about a small part of the Darkweb.

Earlier today I realized their conclusions depend upon a truthful criminal element using the Darkweb.

What do you think about the presumption that criminals are truthful?

Sounds doubtful to me!

The Dark Web, “Kissing Cousins,” and Pornography

February 3rd, 2016

Dark web is mostly illegal, say researchers by Lisa Vaas.

You can tell where Lisa comes out on the privacy versus law enforcement issue by the slant of her conclusion:

Users, what’s your take: are hidden services worth the political firestorm they generate? Are they worth criminals escaping justice?

Illegal is a slippery concept.

Marriage of first “kissing” cousins is “illegal” in:

Arkansas, Delaware, Idaho, Iowa, Kansas, Kentucky, Louisiana, Michigan, Minnesota, Mississippi, Missouri, Montana, Nebraska, Nevada, New Hampshire, North Dakota, Ohio, Oklahoma, Oregon, Pennsylvania, South Dakota, Texas, Washington, West Virginia, and Wyoming.

Marriage of first “kissing” cousins is legal in:

Alabama, Alaska, California, Colorado, Connecticut, District of Columbia, Florida, Georgia, Hawaii, Maryland, Massachusetts, New Jersey, New Mexico, New York, North Carolina (first cousins but not double first), Rhode Island, South Carolina, Tennessee, Vermont, and Virginia.

There are some other nuances I didn’t capture and for those see: State Laws Regarding Marriages Between First Cousins.

If you read Cryptopolitik and the Darknet by Daniel Moore & Thomas Rid carefully, you will spot a number of problems with their methodology and reasoning.

First and foremost, no definitions were offered for their taxonomy (at page 20):

  • Arms
  • Drugs
  • Extremism
  • Finance
  • Hacking
  • Illegitimate pornography
  • Nexus
  • Other illicit
  • Social
  • Violence
  • Other
  • None

Readers and other researchers are left to wonder what was included or excluded from each of those categories.

In science, that would be called an inability to replicate the results. As if this were science.

Moore & Rid recite anecdotal accounts of particular pornography sites, calculated to shock the average reader, but that’s not the same thing as enabling replication of their research. Or a fair characterization of all the pornography encountered.

They presumed that text was equivalent to image content, so they discarded all images (pages 19-20). Which left them unable to test that presumption. Hmmm, untested assumptions in science?

The classification, whose basis is unknown, identified 122 sites (page 21) as pornographic out of an initial set of 5,205 sites.

If you accept Tor’s estimate of 30,000 hidden services that announce themselves every day, Moore & Rid have found that illegal pornography (whatever that means) is:

122 / 30000 = 0.004066667

Moore & Rid have established that “illegal” porn is about 0.41% (a fraction of 0.004066667) of the Dark Net.
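To make the arithmetic explicit, here is a minimal Python sketch that converts the ratio into a percentage. The 122 and 30,000 figures are the ones quoted above; the second line, using the 5,205-site initial set as the denominator, is added only for comparison, and the function name is purely illustrative.

```python
def share_as_percent(part, whole):
    """Return part/whole expressed as a percentage."""
    if whole <= 0:
        raise ValueError("whole must be positive")
    return 100.0 * part / whole


porn_sites = 122        # sites classified as pornographic (page 21)
tor_estimate = 30_000   # Tor's estimate of hidden services announced daily
initial_set = 5_205     # initial set of sites examined by Moore & Rid

print(f"{share_as_percent(porn_sites, tor_estimate):.2f}% of Tor's estimated hidden services")
print(f"{share_as_percent(porn_sites, initial_set):.2f}% of the initial set")
```

Note that a fraction of 0.004066667 is roughly 0.41 percent; either way, it is a small slice.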

I should be grateful Moore & Rid have so carefully documented the tiny part of the Dark Web concerned with their notion of “illegal” pornography.

But, when you encounter “reasoning” such as:


The other quandary is how to deal with darknets. Hidden services have already damaged Tor, and trust in the internet as a whole. To save Tor – and certainly to save Tor’s reputation – it may be necessary to kill hidden services, at least in their present form. Were the Tor Project to discontinue hidden services voluntarily, perhaps to improve the reputation of Tor browsing, other darknets would become more popular. But these Tor alternatives would lack something precious: a large user base. In today’s anonymisation networks, the security of a single user is a direct function of the number of overall users. Small darknets are easier to attack, and easier to de-anonymise. The Tor founders, though exceedingly idealistic in other ways, clearly appreciate this reality: a better reputation leads to better security.85 They therefore understand that the popularity of Tor browsing is making the bundled-in, and predominantly illicit, hidden services more secure than they could be on their own. Darknets are not illegal in free countries and they probably should not be. Yet these widely abused platforms – in sharp contrast to the wider public-key infrastructure – are and should be fair game for the most aggressive intelligence and law-enforcement techniques, as well as for invasive academic research. Indeed, having such clearly cordoned-off, free-fire zones is perhaps even useful for the state, because, conversely, a bad reputation leads to bad security. Either way, Tor’s ugly example should loom large in technology debates. Refusing to confront tough, inevitable political choices is simply irresponsible. The line between utopia and dystopia can be disturbingly thin. (pages 32-33)

it’s hard to say nothing and see public discourse soiled with this sort of publication.

First, there is no evidence presented that hidden services have damaged Tor and/or trust in the Internet as a whole. Even the authors concede that Tor is the most popular option for anonymous browsing and hidden services. That doesn’t sound like damage to me. You?

Second, the authors dump all hidden services in the “bad, very bad” basket, despite their own research classifying only about 0.41% of the Dark Net as illicit pornography. They use stock “go to” examples to shock readers in place of evidence and reasoning.

Third, the implied charge that Tor is “[r]efusing to confront tough, inevitable political choices” is false. Demonstrably false, because the authors point out that Tor developers made a conscious choice not to take political considerations into account (page 25).

Since Moore & Rid disagree with that choice, they resort to name calling, terming the decision “simply irresponsible.” Moore & Rid are entitled to their opinions but they aren’t going to persuade even a semi-literate audience with name calling.

Take Cryptopolitik and the Darknet as an example of how to not write a well researched and reasoned paper. Although, that isn’t a bar to publication as you can see.

Cheating Cheaters [Honeypots for Government Agencies?]

February 3rd, 2016

Video Game Cheaters Outed By Logic Bombs by timothy.

From the post:

A Reddit user decided to tackle the issue of cheaters within Valve’s multiplayer shooter Counter Strike: Global Offensive in their own unique way: by luring them towards fake “multihacks” that promised a motherlode of cheating tools, but in reality, were actually traps designed to cause the users who installed them to eventually receive bans. The first two were designed as time bombs, which activated functions designed to trigger bans after a specific time of day. The third, which was downloaded over 3,500 times, caused instantaneous bans.

I wonder if anyone is running honeypots for intelligence agencies?

Or fake jihad sites for our friends in law enforcement?

Sort of a Spy vs. Spy situation, yes?

spion-mot-spion

When cyber-dueling with government, remember that you aren’t wearing protective gear and the tips aren’t blunted.

Unpublished Black History Photos (NYT)

February 3rd, 2016

The New York Times is unearthing unpublished photos from its archives for Black History Month by Shan Wang.

From the post:

In this black and white photo taken by a New York Times staff photographer, two unidentified second graders at Princeton’s Nassau Street Elementary School stand in front of a classroom blackboard. Some background text accompanies the image, pointing to a 1964 Times article about school integration and adding that the story “offered a caveat that still resonates, noting that in the search for a thriving and equal community, ‘good schooling is not enough.’”

Times readers wrote in to ask specifically about the second graders in the photo, so the Times updated the post with a comment form asking readers to share anything they might know about the girl and boy depicted.

Great background on the Unpublished Black History project at the Times.

Public interfaces enable contribution of information on selected images along with comments.

Unlike the US Intelligence community, the Times is willing to admit that its prior conduct may not reflect (then) or current values.

If a private, for-profit organization can be that honest, what’s the deal with government agencies?

Must be that accountability thing that Republicans are always trying to foist off onto public school teachers and public school teachers alone.

No accountability for elected officials and/or their appointees and cronies.

They are deadly serious about crypto backdoors [And of the CIA and Chinese Underwear]

February 3rd, 2016

They are deadly serious about crypto backdoors by Robert Graham.

From the post:

Julian Sanchez (@normative) has an article questioning whether the FBI is serious about pushing crypto backdoors, or whether this is all a ploy pressuring companies like Apple to give them access. I think they are serious — deadly serious.

The reason they are only half-heartedly pushing backdoors at the moment is that they believe we, the opposition, aren’t serious about the issue. After all, the 4rth Amendment says that a “warrant of probable cause” gives law enforcement unlimited power to invade our privacy. Since the constitution is on their side, only irrelevant hippies could ever disagree. There is no serious opposition to the proposition. It’ll all work itself out in the FBI’s favor eventually. Among the fascist class of politicians, like the Dianne Feinsteins and Lindsay Grahams of the world, belief in this principle is rock solid. They have absolutely no doubt.

But the opposition is deadly serious. By “deadly” I mean this is an issue we are willing to take up arms over. If congress were to pass a law outlawing strong crypto, I’d move to a non-extradition country, declare the revolution, and start working to bring down the government. You think the “Anonymous” hackers were bad, but you’ve seen nothing compared to what the tech community would do if encryption were outlawed.

On most policy questions, there are two sides to the debate, where reasonable people disagree. Crypto backdoors isn’t that type of policy question. It’s equivalent to techies what trying to ban guns would be to the NRA.

What he says.

Crypto backdoors are a choice between a policy that benefits government at the expense of everyone (crypto backdoors) versus a policy that benefits everyone at the expense of the government (no crypto backdoors). It’s really that simple.

When I say crypto backdoors benefit the government, I mean that quite literally. Collecting data, via crypto backdoors and otherwise, enables government functionaries to pretend to be engaged in meaningful responses to serious issues.

Collecting and shoveling data from desk to desk is about as useless an activity as can be imagined.

Basis for that claim? Glad you asked!

If you haven’t read: Chinese Underwear and Presidential Briefs: What the CIA Told JFK and LBJ About Mao by Steve Usdin, do so.

Steve covers the development of the “presidential brief” and its long failure to provide useful information about China and Mao in particular. The CIA long opposed declassification of historical presidential briefs based on the need to protect “sources and methods.”

The presidential briefs for the Kennedy and Johnson administrations have been released and here is what Steve concludes:

In any case, at least when it comes to Mao and China, the PDBs released to date suggest that the CIA may have fought hard to keep the these documents secret not to protect “sources and methods,” but rather to conceal its inability to recruit sources and failure to provide sophisticated analyses.

Past habits of the intelligence community explain rather well why they have no, repeat no, examples of how strong encryption has interfered with national security. There are none.

The paranoia about “crypto backdoors” is another way to engage in “known to be useless” action. It puts butts in seats and inflates agency budgets.


Unlike Robert, should Congress ban strong cryptography, I won’t be moving to a non-extradition country. Some of us need to be here when local police come to their senses and defect.

Google Paywall Loophole Going Bye-Bye [Fair Use Driving Pay-Per-View Traffic]

February 3rd, 2016

The Wall Street Journal tests closing the Google paywall loophole by Lucia Moses.

From the post:

The Wall Street Journal has long had a strict paywall — unless you simply copy and paste the headline into Google, a favored route for those not wanting to pony up $200 a year. Some users have noticed in recent days that the trick isn’t working.

A Journal spokesperson said the publisher was running a test to see if doing so would entice would-be subscribers to pay up. The rep wouldn’t elaborate on how long and extensive the experiment was and if permanently closing the loophole was a possible outcome.

“We are experimenting with a number of different trial mechanics at the moment to provide a better subscription taster for potential new customers,” the rep said. “We are a subscription site and we are always looking at better ways to optimize The Wall Street Journal experience for our members.”

The Wall Street Journal can deprive itself of the benefits of “fair use” if it wants to, but is that a sensible position?

Fair Use Benefits the Wall Street Journal

Rather than a total ban on copying, what if the amount of an article that can be copied is set by algorithm? Such that at a minimum, the first two or three paragraphs of any story can be copied, whether you arrive from Google or directly on the WSJ site.
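As a rough illustration of what “set by algorithm” could mean, here is a minimal Python sketch that returns the first few paragraphs of a story as the freely quotable portion. Everything in it is an assumption for illustration: the function name, the blank-line paragraph convention, and the default of three paragraphs. It says nothing about how the WSJ actually serves its content.

```python
def fair_use_excerpt(article_text, max_paragraphs=3):
    """Return the first `max_paragraphs` paragraphs of an article.

    Paragraphs are assumed (hypothetically) to be separated by blank lines.
    """
    if max_paragraphs < 1:
        raise ValueError("max_paragraphs must be at least 1")
    # Split on blank lines and drop empty fragments.
    paragraphs = [p.strip() for p in article_text.split("\n\n") if p.strip()]
    return "\n\n".join(paragraphs[:max_paragraphs])


story = "Lead paragraph.\n\nSecond paragraph.\n\nThird paragraph.\n\nSubscriber-only analysis."
print(fair_use_excerpt(story, max_paragraphs=2))
```

The same excerpt would be served whether a reader arrives from Google or directly, with the remainder staying behind the paywall.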

Think about it. Wall Street Journal readers aren’t paying to skim the lead paragraphs in the WSJ. They are paying to see the full story and analysis in particular subject areas.

Bloggers, such as myself, cannot drive content seekers to the WSJ because the first sentence or two isn’t enough for readers to develop an interest in the WSJ report.

If I could quote the first 2 or 3 paragraphs, add in some commentary and perhaps other links, then a visitor to the WSJ is visiting to see the full content the Wall Street Journal has to offer.

The story lead is acting, as it should, to drive traffic to the Wall Street Journal, possibly from readers who won’t otherwise think of the Wall Street Journal. Some of my readers on non-American/European continents for example.

Bloggers Driving Readers to Wall Street Journal Pay-Per-View Content

Developing algorithmic fair use as I describe it would enlist an army of bloggers in spreading notice of pay-per-view content of the Wall Street Journal, at no expense to the Wall Street Journal. As a matter of fact, bloggers would be alerting readers of pay-per-view WSJ content, at the blogger’s own expense.

It may just be me but if someone were going to drive viewers to pay-per-view content on my site, at their own expense, with fair use of content, I would be insane to prevent that. But, I’m not the one grasping at dimes while $100 bills are flying overhead.

Close the Loophole, Open Up Fair Use

Full disclosure, I don’t have any evidence for fair use driving traffic to the Wall Street Journal because that evidence doesn’t exist. The Wall Street Journal would have to enable fair use and track appearance of fair use content and the traffic originating from it. Along with conversions from that additional traffic.

Straightforward data analytics but it won’t happen by itself. When the WSJ succeeds with such a model, you can be sure that other paywall publishers will be quick to follow suit.

Caveat: Yes, there will be people who will only ever consume the free use content. And your question? If they aren’t ever going to be paying customers and the same fair use is delivering paying customers, will you lose the latter in order to spite the former?

Isn’t that like cutting off your nose to spite your face?

Historical PS:

I once worked for a publisher that felt a “moral obligation,” their words, not mine, to prevent anyone from claiming a missing journal issue to which they might not be entitled. Yeah. Journal issues that were as popular as the Watchtower is among non-Jehovah’s Witnesses. Cost to the publisher, about $3.00 per issue, cost to verify entitlement, a full time position at the publisher.

I suspect claims ran less than 200 per year. My suggestion was to answer any request with thanks, here’s your missing copy. End of transaction. Track claims only to prevent abuse. Moral outrage followed.

Is morality the basis for your pay-per-view access policy? I thought pay-per-view was a way to make money.

Pass this post along to the WSJ if you know anyone there. Free suggestion. Perhaps they will be interested in other, non-free suggestions.

Gremlin Users – Beware the Double-Underscore!

February 3rd, 2016

A user recently posted this example from the Gremlin documentation:

g.V().hasLabel('person').choose(values('age')).option(27,_.in()).option(32,_.out()).values('name')

which returned:

“No such property: _ for class: Script121”

Marko Rodriguez responded:

Its a double underscore, not a single underscore.

__ vs. _
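Since the difference is hard to see on screen, one option is to let a script look for it. The following is a minimal Python sketch, not part of Gremlin or TinkerPop, that scans the text of a Gremlin script for a lone underscore used where the double underscore was probably intended. The function name and the regular expression are assumptions made for illustration.

```python
import re

# A lone "_" followed by "." and not preceded by an identifier character.
# Because "\w" includes "_", the second underscore of "__." is not reported.
SINGLE_UNDERSCORE = re.compile(r"(?<!\w)_\.")


def single_underscore_positions(script):
    """Return the character offsets of suspicious single-underscore uses."""
    return [m.start() for m in SINGLE_UNDERSCORE.finditer(script)]


bad = "g.V().hasLabel('person').choose(values('age')).option(27,_.in()).option(32,_.out()).values('name')"
good = bad.replace(",_.", ",__.")

print(single_underscore_positions(bad))   # two offsets, one per lone underscore
print(single_underscore_positions(good))  # empty list
```

This is only a text check; it knows nothing about Groovy scoping and would miss a variable legitimately named `_`.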

I mention this to benefit beginning Gremlin users who haven’t developed an underscore stutter but also as a plea for sanity in syntax design.

It is easy to type two successive underscores, but how obvious a double underscore is versus a single underscore depends on local typography.

To say nothing of the fact that what is obvious to the eyes of a twenty-something may not be as obvious to the eyes of a fifty-something+.

In syntax design, answer the question:

Do you want to be clever or clear?

Reverse Image Search (TinEye) [Clue to a User Topic Map Interface?]

February 3rd, 2016

TinEye was mentioned in a post I wrote in 2015, Baltimore Burning and Verification, but I did not follow up at the time.

Unlike some US intelligence agencies, TinEye has a cool logo:

TinEye

Free registration enables you to share search results with others, an important feature for news teams.

I only tested the plugin for Chrome, but it offers useful result options:

tineye-options

Once installed, use by hovering over an image in your browser, right “click” and select “Search image on TinEye.” Your results will be presented as set under options.

Clue to User Topic Map Interface

That is a good example of how one version of a topic map interface should work. Select some text, right “click” and “Search topic map ….(preset or selection)” with configurable result display.

That puts you into interaction with the topic map, which can offer properties to enable you to refine the identification of a subject of interest and then a merged presentation of the results.

As with a topic map, all sorts of complicated things are happening in the background with the TinEye extension.

But as a user, I’m interested in the results that TinEye presents, not how it got them.

I used to say “more interested” to indicate I might care how useful results came to be assembled. That’s a pretension that isn’t true.

It might be true in some particular case, but for the vast majority of searches, I just want the (uncensored Google) results.

US Intelligence Community Logo for Same Capability

I discovered the most likely intelligence community logo for a similar search program:

peeping-tom_2734636b

The answer to the age-old question of “who watches the watchers?” is us. Which watchers are you watching?

LispNYC Videos

February 3rd, 2016

LispNYC Videos

Videos recorded at some LispNYC meetings have been posted online.

Not enough to get you through the 2016 election cycle but a good start!

Enjoy!

Balisage 2016, 2–5 August 2016 [XML That Makes A Difference!]

February 2nd, 2016

Call for Participation

Dates:

  • 25 March 2016 — Peer review applications due
  • 22 April 2016 — Paper submissions due
  • 21 May 2016 — Speakers notified
  • 10 June 2016 — Late-breaking News submissions due
  • 16 June 2016 — Late-breaking News speakers notified
  • 8 July 2016 — Final papers due from presenters of peer reviewed papers
  • 8 July 2016 — Short paper or slide summary due from presenters of late-breaking news
  • 1 August 2016 — Pre-conference Symposium
  • 2–5 August 2016 — Balisage: The Markup Conference

From the call:

Balisage is the premier conference on the theory, practice, design, development, and application of markup. We solicit papers on any aspect of markup and its uses; topics include but are not limited to:

  • Web application development with XML
  • Informal data models and consensus-based vocabularies
  • Integration of XML with other technologies (e.g., content management, XSLT, XQuery)
  • Performance issues in parsing, XML database retrieval, or XSLT processing
  • Development of angle-bracket-free user interfaces for non-technical users
  • Semistructured data and full text search
  • Deployment of XML systems for enterprise data
  • Design and implementation of XML vocabularies
  • Case studies of the use of XML for publishing, interchange, or archiving
  • Alternatives to XML
  • the role(s) of XML in the application lifecycle
  • the role(s) of vocabularies in XML environments

Full papers should be submitted by the deadline given below. All papers are peer-reviewed — we pride ourselves that you will seldom get a more thorough, skeptical, or helpful review than the one provided by Balisage reviewers.

Whether in theory or practice, let’s make Balisage 2016 the one people speak of in hushed tones at future markup and information conferences.

Useful semantics continues to flounder about, cf. Vice-President Biden’s interest in “one cancer research language.” Easy enough to say. How hard could it be?

Documents are commonly thought of and processed as if from BOM to EOF is the definition of a document. Much to our impoverishment.

Silo dissing has gotten popular. What if we could have our silos and eat them too?

Let’s set our sights on a Balisage 2016 where non-technicals come away saying “I want that!”

Have your first drafts done well before the end of February, 2016!

Google to deliver wrong search results to would-be jihadis[, gays, unwed mothers, teenagers, Muslims]

February 2nd, 2016

Google to deliver wrong search results to would-be jihadis by David Barrett.

From the post:

Jihadi sympathisers who type extremism-related words into Google will be shown anti-radicalisation links instead, under a pilot scheme announced by the internet giant.

The new technology means people at risk of radicalisation will be presented with internet links which are the exact opposite of what they were searching for.

Dr Anthony House, a senior Google executive, revealed the pilot scheme in evidence to MPs scrutinising the role of internet companies in combating extremism.

It isn’t hard to see where this slippery road leads.

If any of the current Republican candidates are elected to the U.S. presidency, Google will:

Respond to gay sex or gay related searches with links for praying yourself straight.

Unwed mothers requesting abortion services will have their personal information forwarded to right-to-birth organizations and sent graphic anti-abortion images by email.

Teenagers seeking birth control advice will only see – Abstinence or Hell!

Muslims, well, unless Trump has deported all of them, will see anti-Muslim links.

Unlike bad decisions by government, Google can effectively implement demented schemes such as this one.

Censoring of search results to favor any side, policy, or position is just that: censorship.

If you forfeit the rights of others, you have no claim to rights yourself.

Your call.

“A little sinister!!” (NRO’s Octopus Logo)

February 2nd, 2016

“A little sinister!!” The story behind National Reconnaissance Office’s octopus logo by JPat Brown.

From the post:

When the National Reconnaissance Office (NRO) announced the upcoming launch of their NROL-39 mission back in December 2013, they didn’t get quite the response they had hoped.

nrol-2

That might have had something to do with the mission logo being a gigantic octopus devouring the Earth.

The logo was widely lampooned as emblematic of the intelligence community’s tone-deafness to public sentiment. Incidentally, an octopus enveloping the planet also so happens to be the logo of SPECTRE, the international criminal syndicate that James Bond is always thwarting. So there’s that.

Privacy and security researcher Runa Sandvik wanted to know who approved this and why, so she filed a FOIA with the NRO for the development materials that went into the logo. A few months later, the NRO delivered.

This is a great read and one you need to save to your local server. Especially for days when you think the U.S. government is conspiring against its citizens. It should be so well-organized.

All sorts of government outrages are the product of the same decision-making process as this lame-looking octopus.

At the very least they could have gotten John Romita Jr. to do something a bit more creative:

Doctoroctopus Fair use.

More than “a little sinister” but why not be honest?

Bumping into Stallman, again [Stallmanism]

February 2nd, 2016

Bumping into Stallman, again by Frederick Jacobs.

From the post:

This is the second time I’m talking at the same conference as Richard Stallman, after the Ind.ie Tech Summit in Brighton, this time was at the Fri Software Days in Fribourg, Switzerland.

One day before my presentation, I got an email from the organizers, letting me know that Stallman would like me to rename the title of my talk to remove any mentions of “Open Source Software” and replace them with “Free Software”.

The email read like this:

Is it feasible to remove the terms “Open-Source” from the title of your presentation and replace them by “Free-libre software”? It’s the wish of M. Stallman, that will probably attend your talk.

Frederick didn’t change his title or presentation, while at the same time handling the issue much better than I would have.

Well, after I got through laughing my ass off that Stallman would presume to dictate word usage to anyone.

Word usage, for any stallmanists in the crowd, is an empirical question of how many people use a word with a common meaning.

At least if you want to be understood by others.

Sunlight launches Hall of Justice… [ Topic Map “like” features?]

February 2nd, 2016

Sunlight launches Hall of Justice, a massive data inventory on criminal justice across the U.S. by Josh Stewart.

From the post:

Today, Sunlight is launching Hall of Justice, a robust, searchable data inventory of nearly 10,000 datasets and research documents from across all 50 states, the District of Columbia and the federal government. Hall of Justice is the culmination of 18 months of work gathering data and refining technology.

The process was no easy task: Building Hall of Justice required manual entry of publicly available data sources from a multitude of locations across the country.

Sunlight’s team went from state to state, meeting and calling local officials to inquire about and find data related to criminal justice. Some states like California have created a data portal dedicated to making criminal justice data easily accessible to the public; others had their data buried within hard to find websites. We also found data collected by state departments of justice, police forces, court systems, universities and everything in between.

“Data is shaping the future of how we address some of our most pressing problems,” said John Wonderlich, executive director of the Sunlight Foundation. “This new resource is an experiment in how a robust snapshot of data can inform policy and research decisions.”

In addition to being a great data collection, the Hall of Justice attempts to deliver topic map like capability for searches:

The resource attempts to consolidate different terminology across multiple states, which is far from uniform or standardized. For example, if you search solitary confinement you will return results for data around solitary confinement, but also for the terms “segregated housing unit,” “SHU,” “administrative segregation” and “restrictive housing.” This smart search functionality makes finding datasets much easier and accessible.
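The kind of terminology consolidation described in that passage can be sketched as a simple query expansion. The following Python is a minimal illustration only: the synonym table holds just the terms quoted above, and the function names and sample documents are hypothetical. It is not a description of how Hall of Justice is actually implemented.

```python
import re

# Hypothetical synonym table, populated only with the terms quoted above.
SYNONYMS = {
    "solitary confinement": [
        "segregated housing unit",
        "SHU",
        "administrative segregation",
        "restrictive housing",
    ],
}


def expand_query(query):
    """Return the query plus any known alternate terms for the same subject."""
    cleaned = query.strip()
    return [cleaned] + SYNONYMS.get(cleaned.lower(), [])


def search(query, documents):
    """Return documents containing the query or any synonym as a whole phrase."""
    patterns = [
        re.compile(r"\b" + re.escape(term) + r"\b", re.IGNORECASE)
        for term in expand_query(query)
    ]
    return [doc for doc in documents if any(p.search(doc) for p in patterns)]


docs = [
    "Report on administrative segregation in state prisons",
    "Designated felony offenses continue to allow restrictive custody",
    "The shutters were closed",
]
print(search("solitary confinement", docs))
```

Whole-phrase matching is a deliberate choice here: “restrictive custody” does not match “restrictive housing,” and “SHU” does not match “shutters.”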

solitary

Looking at all thirteen results for a search on “solitary confinement,” I don’t see the mapping in question. Or certainly no mapping based on characteristics of the subject, “solitary confinement.”

The closest match is Georgia’s 2013 Juvenile Justice Reform, which uses the word “restrictive,” as in:

Create a two-class system within the Designated Felony Act. Designated felony offenses are divided into two classes, based on severity—Class A and Class B—that continue to allow restrictive custody while also adjusting available sanctions to account for both offense severity and risk level.

Restrictive custody is what jail systems are about so that doesn’t trip the wire for “solitary confinement.”

Of course, the links are to entire reports/documents/data sets so each researcher will have to extract and collate content individually. When that happens, a means to contribute that collation/mapping to the Hall of Justice would be a boon for other researchers. (Can you say “topic map?”)

As I write this, you will need to prefer Mozilla over Chrome, at least on Ubuntu.

Trigger Warning: If you are sensitive to traumatic events and/or reports of traumatic events, you may want to ask someone less sensitive to review these data sources.

The only difference between a concentration camp and American prisons is the lack of mass gas chambers. Every horror and abuse that you can imagine, and some you probably can’t, are visited on people in U.S. prisons every day.

Sunlight’s Hall of Justice is a great step forward in documenting the chambers of horror we call American prisons.

As Joan Baez says in Prison Trilogy:

And we’re gonna raze, raze the prisons

To the ground

Help us raze, raze the prisons

To the ground

Are you ready?

All The Pubs In Britain & Ireland & Nothing Else

February 2nd, 2016

All The Pubs In Britain & Ireland & Nothing Else by Ramiro Gómez.

From the post:

The map above is elegant in its simplicity. It shows Great Britain and Ireland drawn from pubs. Each blue dot represents a single pub using data extracted from OpenStreetMap with the Matplotlib Basemap Toolkit.

Interestingly, if the same map had been drawn using the number of pubs from 1980 it would have looked quite different.

In total, the map has 29,195 pub locations across both the UK and Ireland. However, the UK alone has lost 21,000 pubs since 1980 according to the Institute of Economic Affairs, with half of these occurring since 2006.

Therefore, a map from 1980 might have had nearly twice as many dots as the one above and possibly not all in the same places. Going back even further, there were a reported 99,000 pubs in the UK in 1905.
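The “nearly twice as many dots” estimate can be checked with a few lines of Python. This is a rough sketch using only the figures quoted above. It assumes, for simplicity, that the 1980 total is today's count plus the UK closures. It ignores pubs opened since 1980 and any Irish closures, and it compares UK-only figures against a map total that covers both the UK and Ireland.

```python
# Figures quoted in the post.
pubs_on_map = 29_195        # pub locations on the map, UK and Ireland
uk_pubs_lost = 21_000       # UK pubs lost since 1980 (Institute of Economic Affairs)
uk_pubs_1905 = 99_000       # reported UK pubs in 1905

# Simplifying assumption: 1980 total = current map total + UK closures.
pubs_1980_estimate = pubs_on_map + uk_pubs_lost

ratio_1980 = pubs_1980_estimate / pubs_on_map
ratio_1905 = uk_pubs_1905 / pubs_on_map

if __name__ == "__main__":
    print(f"Estimated 1980 dots: {pubs_1980_estimate:,}")
    print(f"1980 vs today: {ratio_1980:.2f}x")
    print(f"1905 vs today: {ratio_1905:.2f}x")
```

Under these assumptions the 1980 map would have roughly 1.7 times as many dots, so “nearly twice” is a generous rounding.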

See Ramiro’s post for the map but more importantly, book travel to the UK to help stem the loss of pubs!

How many of the 29,195 pubs in the UK and Ireland have you visited?

Microsoft Quantum Challenge [Deadline April 29, 2016.]

February 2nd, 2016

Microsoft Quantum Challenge

From the webpage:

Join students from around the world to investigate and solve problems facing the quantum universe using Microsoft’s simulator, LIQUi|>.

Win big prizes, or the opportunity to interview for internships at Microsoft Research.

Objectives of the Quantum Challenge

The Quantum Architectures and Computing Group QuArC is seeking exceptional students!

WE want to find students who are eager to expand their knowledge of quantum computing, and who can translate thoughts into programs. Thereby we will expand the use of Microsoft’s Quantum Simulator LIQUi|>.

How to enter

First of all, REGISTER for the Challenge so that you can receive updates about the contest.

In the challenge you will use the LIQUi|> simulator to solve a novel problem and then report on your findings. So, think of a project. Then, download the simulator from GitHub and work with it to solve your problem. Finally, write a report about your findings and submit it. Your report submission will enter you into the Challenge.

In the report, present a description of the project including goals, methods, challenges, and any result obtained using LIQUi|>. You do not need to submit circuits and the software you develop, however, sample input and output for LIQUi|> must be submitted to show you used the simulator in the project. Your entry must consist of six pages or less, in PDF format.

The Challenge is open to students at colleges and universities world-wide (with a few restrictions) and aged 18+. NO PURCHASE NECESSARY. For full details, see the Official Rules

The prizes

The Quantum Challenge is your chance to win a big prize!

  • First Prize: $5,000
  • Second Prizes: Four at $2,500
  • Honorary Mention: Certificates will be presented to runner-up entries

Extra – visits or internship interviews

As a result of the challenge, some entrants could be invited to visit the QuArC team at Microsoft Research in Redmond, or have an opportunity to interview for internships at Microsoft Research. Internships are highly prestigious and involve working with the QuArC team for 12 weeks on cutting edge research.

If you are young enough to enter, just a word of warning about the “big prize.” $5,000 today isn’t a “big prize.” Maybe a nice weekend if you keep it low key but only just.

Interaction with the QuArC team, either by winning or in online discussions, is the real prize.

Besides, who needs $5,000 if you can break quantum encrypted bank transfer orders? ;-)