Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 19, 2017

Permitted Trump Protesters Will Be Ignored

Filed under: Government,Politics,Protests — Patrick Durusau @ 2:52 pm

I wish my headline were some of the “fake news” Democrats complain about, but Alexandra Rosemann proves the truth of that headline in:

Ignoring anti-Trumpers: Why we can expect media blackout of protests against Trump’s inauguration.

Not ignored by just anybody, ignored by the media.

On Jan. 20 — 16 years ago — thousands of protesters lined the inauguration parade route of the incoming Republican president. “Not my president,” they chanted. But despite the enormity of the rally, it was largely ignored. Instead, pundits marveled over how George W. Bush “filled out the suit” and confirmed authority.

“The inauguration of George W. Bush was certainly a spectacle on Inauguration Day,” marvels Robin Andersen, the director of Peace and Justice studies at Fordham University, in the 2001 short documentary “Not My President: Voices From the Counter Coup.”

It’s nearly impossible not to anticipate the eerie parallels between George W. Bush’s inauguration and that of Donald Trump.

“Forty percent of the public still believed that Bush had not been legitimately elected, yet there’s almost no discussion of these electoral problems or the constitutional crisis,” Andersen explains in the film. “Instead, Bush undergoes a kind of transformation where he fills out the suit and becomes a leader. Forgotten are any of the questions about his ability, his experience or his mangling of the English language. His transformation is almost magical,” she adds.

Andersen estimated the inauguration protests, which occurred throughout the country, garnered approximately 10 minutes of total coverage on all the major networks.

“When we did see images of protesters, there was no explanation as to why. We were asked to be passive spectators in this ritual of legitimation when the real democratic issues that should have been being discussed were ignored,” Andersen says in the film, reflecting on the “real democracy” in the streets of Washington, D.C.

Your choice. Ten minutes of coverage out of over 24 hours of permitted protesting, or the media covering a 24-hour blockade of the DC Beltway.

Which one do you think draws more attention to your issues?

A new president will be inaugurated on January 20, 2017, but it’s your choice whether it’s him, his wife and a few cronies in attendance or hundreds of thousands.

See protests for more ideas on that possibility.

Empirical Analysis Of Social Media

Filed under: Government,Politics,Social Media — Patrick Durusau @ 11:01 am

How the Chinese Government Fabricates Social Media Posts for Strategic Distraction, not Engaged Argument by Gary King, Jennifer Pan, and Margaret E. Roberts. American Political Science Review, 2017. (Supplementary Appendix)

Abstract:

The Chinese government has long been suspected of hiring as many as 2,000,000 people to surreptitiously insert huge numbers of pseudonymous and other deceptive writings into the stream of real social media posts, as if they were the genuine opinions of ordinary people. Many academics, and most journalists and activists, claim that these so-called “50c party” posts vociferously argue for the government’s side in political and policy debates. As we show, this is also true of the vast majority of posts openly accused on social media of being 50c. Yet, almost no systematic empirical evidence exists for this claim, or, more importantly, for the Chinese regime’s strategic objective in pursuing this activity. In the first large scale empirical analysis of this operation, we show how to identify the secretive authors of these posts, the posts written by them, and their content. We estimate that the government fabricates and posts about 448 million social media comments a year. In contrast to prior claims, we show that the Chinese regime’s strategy is to avoid arguing with skeptics of the party and the government, and to not even discuss controversial issues. We infer that the goal of this massive secretive operation is instead to regularly distract the public and change the subject, as most of the these posts involve cheerleading for China, the revolutionary history of the Communist Party, or other symbols of the regime. We discuss how these results fit with what is known about the Chinese censorship program, and suggest how they may change our broader theoretical understanding of “common knowledge” and information control in authoritarian regimes.

I differ from the authors on some of their conclusions but this is an excellent example of empirical as opposed to wishful analysis of social media.

Wishful analysis of social media includes the farcical claims that social media is an effective recruitment tool for terrorists. Too often claimed to dignify with a citation but never with empirical evidence, only an author’s repetition of the common “wisdom.”

In contrast, King et al. are careful to say what their analysis does and does not support, finding that, in a number of cases, the evidence contradicts commonly held thinking about the role of the Chinese government in social media.

One example I found telling was the lack of evidence that anyone is paid for pro-government social media comments.

In the authors’ words:


We also found no evidence that 50c party members were actually paid fifty cents or any other piecemeal amount. Indeed, no evidence exists that the authors of 50c posts are even paid extra for this work. We cannot be sure of current practices in the absence of evidence but, given that they already hold government and Chinese Communist Party (CCP) jobs, we would guess this activity is a requirement of their existing job or at least rewarded in performance reviews.
… (at pages 10-11)

Here I differ from the authors’ “guess”:

…this activity is a requirement of their existing job or at least rewarded in performance reviews.

Kudos to the authors for labeling this a “guess,” although one expects the mainstream press and members of Congress to take it as written in stone.

However, the authors presume positive posts about the government of China can only result from direct orders or pressure from superiors.

That’s a major weakness in this paper and similar analyses of social media postings.

The simpler explanation of pro-government posts is that a poster is reporting the world as they see it. (Think Occam’s Razor.)

As for sharing them with the so-called “propaganda office,” perhaps they are attempting to curry favor. The small number of posters makes it difficult to credit their motives (unknown) and behavior (partially known) as representative for the estimated 2 million posters.

Moreover, out of a population that nears 1.4 billion, the existence of 2 million individuals with a positive view of the government isn’t difficult to credit.
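A quick back-of-the-envelope check, using only the figures quoted above. The variable names are mine, and the posts-per-poster number assumes the suspected 2 million headcount and the authors’ 448 million estimate describe the same group, which the paper does not claim:

```python
# Figures quoted in this post; nothing here comes from the paper's data files.
suspected_posters = 2_000_000          # "as many as 2,000,000 people"
population = 1_400_000_000             # "a population that nears 1.4 billion"
estimated_posts_per_year = 448_000_000 # "about 448 million social media comments a year"

# Share of the population that the suspected posters would represent.
share_of_population = suspected_posters / population

# Assumption: every suspected poster contributes equally to the yearly estimate.
posts_per_poster = estimated_posts_per_year / suspected_posters

print(f"Posters as share of population: {share_of_population:.2%}")
print(f"Posts per poster per year (under the assumption above): {posts_per_poster:.0f}")
```

Roughly one person in seven hundred, in other words.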

This is an excellent paper that will repay a close reading several times over.

Take it also as a warning about ideologically based assumptions that can mar or even invalidate otherwise excellent empirical work.

PS:

Additional reading:

From Gary King’s webpage on the article:

This paper follows up on our articles in Science, “Reverse-Engineering Censorship In China: Randomized Experimentation And Participant Observation”, and the American Political Science Review, “How Censorship In China Allows Government Criticism But Silences Collective Expression”.

GNU Unifont Glyphs [Good News/Bad News]

Filed under: Fonts,Unicode — Patrick Durusau @ 9:43 am

GNU Unifont Glyphs 9.0.06.

From the webpage:

GNU Unifont is part of the GNU Project. This page contains the latest release of GNU Unifont, with glyphs for every printable code point in the Unicode 9.0 Basic Multilingual Plane (BMP). The BMP occupies the first 65,536 code points of the Unicode space, denoted as U+0000..U+FFFF. There is also growing coverage of the Supplemental Multilingual Plane (SMP), in the range U+010000..U+01FFFF, and of Michael Everson’s ConScript Unicode Registry (CSUR).
… (red highlight in original)

That’s the good news.

The bad news is shown by the coverage mapping:

0.0%  U+012000..U+0123FF  Cuneiform*
0.0%  U+012400..U+01247F  Cuneiform Numbers and Punctuation*
0.0%  U+012480..U+01254F  Early Dynastic Cuneiform*
0.0%  U+013000..U+01342F  Egyptian Hieroglyphs*
0.0%  U+014400..U+01467F  Anatolian Hieroglyphs*

These scripts will require a 32-by-32 pixel grid:

*Note: Scripts such as Cuneiform, Egyptian Hieroglyphs, and Bamum Supplement will not be drawn on a 16-by-16 pixel grid. There are plans to draw these scripts on a 32-by-32 pixel grid in the future.
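To get a feel for how much drawing is waiting, the ranges in the coverage listing can be turned into counts. A minimal sketch, assuming the ranges are inclusive and copied exactly as quoted above; `block_size` and `plane` are my own helper names, and the counts are code point positions in each range, not necessarily assigned characters:

```python
# Inclusive code point ranges, copied from the coverage listing quoted above.
BLOCKS = {
    "Cuneiform": (0x12000, 0x123FF),
    "Cuneiform Numbers and Punctuation": (0x12400, 0x1247F),
    "Early Dynastic Cuneiform": (0x12480, 0x1254F),
    "Egyptian Hieroglyphs": (0x13000, 0x1342F),
    "Anatolian Hieroglyphs": (0x14400, 0x1467F),
}


def block_size(first, last):
    """Number of code point positions in an inclusive range."""
    return last - first + 1


def plane(code_point):
    """Unicode plane number: each plane spans 0x10000 (65,536) code points."""
    return code_point >> 16


sizes = {name: block_size(first, last) for name, (first, last) in BLOCKS.items()}
total_positions = sum(sizes.values())

for name, (first, last) in BLOCKS.items():
    print(f"plane {plane(first)}  U+{first:06X}..U+{last:06X}  {sizes[name]:5d}  {name}")
print(f"total positions at 0.0% coverage: {total_positions}")
```

All five ranges sit in plane 1, the SMP, which matches the range U+010000..U+01FFFF given on the Unifont page.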

One additional resource on creating cuneiform fonts:

Creating cuneiform fonts with MetaType1 and FontForge by Karel Píška:

Abstract:

A cuneiform font collection covering Akkadian, Ugaritic and Old Persian glyph subsets (about 600 signs) has been produced in two steps. With MetaType1 we generate intermediate Type 1 fonts, and then construct OpenType fonts using FontForge. We describe cuneiform design and the process of font development.

On creating fonts more generally with FontForge, see: Design With FontForge.

Enjoy!

January 18, 2017

Do You Have Big Brass Ones*? FOIA The President

Filed under: FOIA,Government,MuckRock,Politics — Patrick Durusau @ 5:20 pm

Join our project to FOIA the Trump administration by Michael Morisy.

From the post:

Since June 2015, MuckRock users have been filing FOIA requests regarding a possible Trump presidency. In fact, so far there’s been over 160 public Trump-related requests filed through the site, all of which you can browse here.

We’ve also put together a number of guides and articles on the upcoming administration, ranging from what you can and can’t file regarding Trump to deep dives into what’s already out there:

We’ve launched a new project page for users to showcase their requests, find new documents regarding the Trump administration, or get inspiration for their own requests, and we’ve created a special Slack channel for you to join in and strategize on future requests, or help share big league FOIA stories that shed light on the President Elect’s team.

We’ve had a few users join us there already and they’ve helped file some really fun requests, so we’re excited about what else the transparency community can come up with.

An effort worthy of both your time and support!

Once answered, remember that availability isn’t the same thing as meaningful access.

OCR, indexing, entity extraction, in short any skill you have is important in this effort.

* No longer a gender specific reference as you well know.

PS: I’ve signed up and need suggestions on what to ask for. Suggestions?

The CIA’s Secret History Is Now Online [Indexing, Mapping, NLP Anyone?]

Filed under: Government,Government Data,Politics — Patrick Durusau @ 3:59 pm

The CIA’s Secret History Is Now Online by Jason Leopold.

From the post:

Decades ago, the CIA declassified a 26-page secret document cryptically titled “clarifying statement to Fidel Castro concerning assassination.”

It was a step toward greater transparency for one of the most secretive of all federal agencies. But to find out what the document actually said, you had to trek to the National Archives in College Park, Maryland, between the hours of 9 a.m. and 4:30 p.m. and hope that one of only four computers designated by the CIA to access its archives would be available.

But today the CIA posted the Castro record on its website along with more than 12 million pages of the agency’s other declassified documents that have eluded the public, journalists, and historians for nearly two decades. You can view the documents here.

The title of the Castro document, as it turns out, was far more interesting than the contents. It includes a partial transcript of a 1977 transcript between Barbara Walters and Fidel Castro in which she asked the late Cuban dictator whether he had “proof” of the CIA’s last attempt to assassinate him. The transcript was sent to Adm. Stansfield Turner, the CIA director at the time, by a public affairs official at the agency with a note highlighting all references to CIA.

But that’s just one of the millions documents, which date from the 1940s to 1990s, are wide-ranging, covering everything from Nazi war crimes to mind-control experiments to the role the CIA played in overthrowing governments in Chile and Iran. There are also secret documents about a telepathy and precognition program known as Star Gate, files the CIA kept on certain media publications, such as Mother Jones, photographs, more than 100,000 pages of internal intelligence bulletins, policy papers, and memos written by former CIA directors.

Michael Best, @NatSecGeek has pointed out the “CIA de-OCRed at least some of the CREST files before they uploaded them.”

Spy agency class petty. Grant public access but force the restoration of text search.

The restoration of text search work is underway so next steps will be indexing, NLP, mapping, etc.

A great set of documents to get ready for future official and unofficial leaks of CIA documents.

Enjoy!

PS: Curious if any of the search engine vendors will use CREST as demonstration data? Non-trivial size, interesting search issues, etc.

Ask at the next search conference.

Resistance Manual / Indivisible

Filed under: Government,Politics,Protests — Patrick Durusau @ 1:40 pm

Resistance Manual

An essential reference for the volatile politics of the Trump presidency.

Indivisible

Four former congressional staffers banded together to write: “A practical guide to resisting the Trump Agenda.”

Both are shaped by confidence in current political and social mechanisms, to say nothing of a faith in non-violence.

Education is seen as the key to curing bigotry/prejudice and moving towards a more just society.

You will not find links to:

Steal this Book or the Anarchist Cookbook, 2000 edition for example.

There are numerous examples cited as “successful” non-violent protests. The elimination of de jure segregation in the American South. (Resource includes oral histories of the time.)

But de facto segregation in schools is larger than it was in the 1960s.

How do you figure that into the “success” of non-violent protests?

Read both Resistance Manual and Indivisible for what may be effective techniques.

But ask yourself, do non-violent protests comfort the victims of violence?

Or just the non-violent protesters?

Quantum Computer Resistant Encryption

Filed under: Cryptography,Quantum,Security — Patrick Durusau @ 10:37 am

Irish Teen Introduces New Encryption System Resistant to Quantum Computers by Joseph Young.

From the post:


… a 16-year-old student was named as Ireland’s top young scientist and technologist of 2017, after demonstrating the application of qCrypt, which offers higher levels of protection, privacy and encryption in comparison to other innovative and widely-used cryptographic systems.

BT Young Scientist Judge John Dunnion, the associate professor at University of College Dublin, praised Curran’s project that foresaw the impact quantum computing will have on current cryptographic and encryption methods.

“qCrypt is a novel distributed data storage system that provides greater protection for user data than is currently available. It addresses a number of shortfalls of current data encryption systems; in particular, the algorithm used in the system has been demonstrated to be resistant to attacks by quantum computers in the future,” said Dunnion.

While it may be too early to predict whether technologies like qCrypt can protect existing encryption methods and data protection systems from quantum computers, Curran and the judges of the competition saw promising potential in the technology.

Word is spreading rapidly.

qCrypt has a place-holder website, Post-Quantum Cryptography for the Masses.

A YouTube video:

Shane’s GitHub repository (no qCrypt, yet)

Not to mention Shane’s website.

qCrypt has the potential to provide safety from government surveillance for everyone, everywhere.

Looking forward to this!

Top considerations for creating bioinformatics software documentation

Filed under: Bioinformatics,Documentation,Software — Patrick Durusau @ 9:33 am

Top considerations for creating bioinformatics software documentation by Mehran Karimzadeh and Michael M. Hoffman.

Abstract

Investing in documenting your bioinformatics software well can increase its impact and save your time. To maximize the effectiveness of your documentation, we suggest following a few guidelines we propose here. We recommend providing multiple avenues for users to use your research software, including a navigable HTML interface with a quick start, useful help messages with detailed explanation and thorough examples for each feature of your software. By following these guidelines, you can assure that your hard work maximally benefits yourself and others.

Introduction

You have written a new software package far superior to any existing method. You submit a paper describing it to a prestigious journal, but it is rejected after Reviewer 3 complains they cannot get it to work. Eventually, a less exacting journal publishes the paper, but you never get as many citations as you expected. Meanwhile, there is not even a single day when you are not inundated by emails asking very simple questions about using your software. Your years of work on this method have not only failed to reap the dividends you expected, but have become an active irritation. And you could have avoided all of this by writing effective documentation in the first place.

Academic bioinformatics curricula rarely train students in documentation. Many bioinformatics software packages lack sufficient documentation. Developers often prefer spending their time elsewhere. In practice, this time is often borrowed, and by ducking work to document their software now, developers accumulate ‘documentation debt’. Later, they must pay off this debt, spending even more time answering user questions than they might have by creating good documentation in the first place. Of course, when confronted with inadequate documentation, some users will simply give up, reducing the impact of the developer’s work.
… (emphasis in original)

Take to heart the authors’ observation on automatic generation of documentation:


The main disadvantage of automatically generated documentation is that you have less control of how to organize the documentation effectively. Whether you used a documentation generator or not, however, there are several advantages to an HTML web site compared with a PDF document. Search engines will more reliably index HTML web pages. In addition, users can more easily navigate the structure of a web page, jumping directly to the information they need.

I would replace “…less control…” with “…virtually no meaningful control…” over the organization of the documentation.

Think about it for a second. You write short comments, sometimes even incomplete sentences as thoughts occur to you in a code or data context.

An automated tool gathers those comments, even incomplete sentences, rips them out of their original context and strings them one after the other.

Do you think that provides a meaningful narrative flow for any reader? Including yourself?
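A toy illustration of the problem. This is not how any particular documentation generator works; `generate_docs` and the three stub functions are invented for this sketch, and only the standard library’s `inspect.getdoc` is real:

```python
import inspect


# Three stub functions with the kind of terse notes written while coding.
def load_reads(path):
    """Parse reads. Assumes FASTQ."""


def trim(reads):
    """Drop low quality tails, see note in load_reads."""


def align(reads):
    """TODO: explain scoring."""


def generate_docs(functions):
    """Naive generator: list each function name followed by its docstring."""
    return "\n".join(f"{f.__name__}: {inspect.getdoc(f)}" for f in functions)


docs = generate_docs([load_reads, trim, align])
print(docs)
```

Every line of the result is accurate, and none of it tells a new user where to start or how the steps fit together.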

Your documentation doesn’t have to be great literature but as Karimzadeh and Hoffman point out, good documentation can make the difference between use and adoption and your hard work being ignored.

Ping me if you want to take your documentation to the next level.

January 17, 2017

Online tracking: A 1-million-site measurement and analysis [Leaving False Trails]

Filed under: Cybersecurity,Privacy,Web Browser — Patrick Durusau @ 5:35 pm

Online tracking: A 1-million-site measurement and analysis by Steven Englehardt and Arvind Narayanan.

From the webpage:

Tracking Results

During our January 2016 measurement of the top 1 million sites, our tool made over 90 million requests, assembling the largest dataset (to our knowledge) used for studying web tracking. With this scale we can answer many web tracking questions: Who are the largest trackers? Which sites embed the largest number of trackers? Which tracking technologies are used, and who is using them? and many more.

Findings

The total number of third parties present on at least two first parties is over 81,000, but the prevalence quickly drops off. Only 123 of these 81,000 are present on more than 1% of sites. This suggests that the number of third parties that a regular user will encounter on a daily basis is relatively small. The effect is accentuated when we consider that different third parties may be owned by the same entity. All of the top 5 third parties, as well as 12 of the top 20, are Google-owned domains. In fact, Google, Facebook, and Twitter are the only third-party entities present on more than 10% of sites.
… (emphasis in original)
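To make the drop-off concrete, here is a minimal sketch of how prevalence could be computed from crawl records. The data is made up, `third_party_prevalence` is a hypothetical helper and not part of OpenWPM, and the denominator counts only sites that appear in the records:

```python
from collections import defaultdict


def third_party_prevalence(observations):
    """Map each third party to the fraction of first-party sites embedding it.

    `observations` is an iterable of (first_party, third_party) pairs.
    """
    sites = set()
    seen_on = defaultdict(set)
    for first_party, third_party in observations:
        sites.add(first_party)
        seen_on[third_party].add(first_party)
    return {tp: len(fps) / len(sites) for tp, fps in seen_on.items()}


# Toy records, invented for illustration only.
toy = [
    ("news.example", "tracker-a.example"),
    ("shop.example", "tracker-a.example"),
    ("blog.example", "tracker-a.example"),
    ("wiki.example", "tracker-a.example"),
    ("news.example", "tracker-b.example"),
    ("shop.example", "tracker-b.example"),
    ("blog.example", "tracker-c.example"),
]

prevalence = third_party_prevalence(toy)
for third_party, fraction in sorted(prevalence.items(), key=lambda kv: -kv[1]):
    print(f"{third_party}: {fraction:.0%} of sites")

# The share quoted in the findings: 123 of roughly 81,000 third parties.
share_quoted = 123 / 81_000
print(f"Third parties on more than 1% of sites: {share_quoted:.2%}")
```

Put another way, about 0.15% of the third parties seen on two or more sites reach the 1% threshold.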

Impressive research based upon an impressive tool, OpenWPM.

The GitHub page for OpenWPM reads in part:

OpenWPM is a web privacy measurement framework which makes it easy to collect data for privacy studies on a scale of thousands to millions of site. OpenWPM is built on top of Firefox, with automation provided by Selenium. It includes several hooks for data collection, including a proxy, a Firefox extension, and access to Flash cookies. Check out the instrumentation section below for more details.

Just a point of view but I’m more interested in specific privacy tracking data for some given set of servers than general privacy statistics.

Specific privacy tracking data that enables planning the use of remote browsers to leave false trails.

Kudos to the project, however you choose to use the software.

The Political Librarian (volume 2, issue 2)

Filed under: Government,Library,Politics — Patrick Durusau @ 5:16 pm

The Political Librarian

From the webpage:

The Political Librarian is dedicated to expanding the discussion of, promoting research on, and helping to re-envision locally focused advocacy, policy, and funding issues for libraries.

We want to bring in a variety of perspectives to the journal and do not limit our contributors to just those working in the field of library and information science. We seek submissions from researchers, practitioners, community members, or others dedicated to furthering the discussion, promoting research, and helping to re-envision tax policy and public policy on the extremely local level.

Grab the entire volume 2, issue 2 (December 2016) for reading while stopped on the DC Beltway, January 20, 2017.

Libraries need your help to survive and prosper during the rapidly approaching winter of ignorance.

#DisruptJ20 – 3 inch resolution aerial imagery Washington, DC @J20protests

Filed under: Geographic Data,Image Understanding,MapBox,Mapping,Maps — Patrick Durusau @ 4:22 pm

3 inch imagery resolution for Washington, DC by Jacques Tardie.

From the post:

We updated our basemap in Washington, DC with aerial imagery at 3 inch (7.5 cm) resolution. The source data is openly licensed by DC.gov, thanks to the District’s open data initiative.

If you aren’t familiar with Mapbox, there is no time like the present!

If you are interested in just the 3 inch resolution aerial imagery, see: http://opendata.dc.gov/datasets?keyword=imagery.

Enjoy!

Raw SIGINT Locations Expanded

Filed under: Cybersecurity,Government,Intelligence,Privacy — Patrick Durusau @ 3:30 pm

President Obama has issued new rules for sharing information under Executive Order 12333, with the ungainly title: (U) Procedures for the Availability or Dissemination of Raw Signals Intelligence Information by the National Security Agency Under Section 2.3 of Executive Order 12333 (Raw SIGINT Availability Procedures).

Kate Tummarello, in Obama Expands Surveillance Powers On His Way Out, sees a threat to “innocent persons”:

With mere days left before President-elect Donald Trump takes the White House, President Barack Obama’s administration just finalized rules to make it easier for the nation’s intelligence agencies to share unfiltered information about innocent people.

New rules issued by the Obama administration under Executive Order 12333 will let the NSA—which collects information under that authority with little oversight, transparency, or concern for privacy—share the raw streams of communications it intercepts directly with agencies including the FBI, the DEA, and the Department of Homeland Security, according to a report today by the New York Times.

That’s a huge and troubling shift in the way those intelligence agencies receive information collected by the NSA. Domestic agencies like the FBI are subject to more privacy protections, including warrant requirements. Previously, the NSA shared data with these agencies only after it had screened the data, filtering out unnecessary personal information, including about innocent people whose communications were swept up the NSA’s massive surveillance operations.

As the New York Times put it, with the new rules, the government claims to be “reducing the risk that the N.S.A. will fail to recognize that a piece of information would be valuable to another agency, but increasing the risk that officials will see private information about innocent people.”

All of which is true, but the new rules have other impacts as well.

Who is an “IC element?”

The new rules make numerous references to an “IC element,” but come up short in defining it:

L. (U) IC element is as defined in section 3.5(h) of E.O. 12333.
(emphasis in original)

Great.

Searching for E.O. 12333 isn’t enough. You need Executive Order 12333 United States Intelligence Activities (As amended by Executive Orders 13284 (2003), 13355 (2004) and 13470 (2008)). The National Archives version of Executive Order 12333 is not amended and hence is misleading.

From the amended E.O. 12333:

3.5 (h) Intelligence Community and elements of the Intelligence Community 
        refers to:
(1) The Office of the Director of National Intelligence;
(2) The Central Intelligence Agency;
(3) The National Security Agency;
(4) The Defense Intelligence Agency;
(5) The National Geospatial-Intelligence Agency;
(6) The National Reconnaissance Office; 
(7) The other offices within the Department of Defense for the collection of 
    specialized national foreign intelligence through reconnaissance programs;
(8) The intelligence and counterintelligence elements of the Army, the Navy,
    the Air Force, and the Marine Corps;
(9) The intelligence elements of the Federal Bureau of Investigation;
(10) The Office of National Security Intelligence of the Drug Enforcement
     Administration;
(11) The Office of Intelligence and Counterintelligence of the Department
      of Energy;
(12) The Bureau of Intelligence and Research of the Department of State;
(13) The Office of Intelligence and Analysis of the Department of the Treasury;
(14) The Office of Intelligence and Analysis of the Department of Homeland 
     Security;
(15) The intelligence and counterintelligence elements of the Coast Guard; and
(16) Such other elements of any department or agency as may be designated by 
     the President, or designated jointly by the Director and the head of the 
     department or agency concerned, as an element of the Intelligence Community. 

The Office of the Director of National Intelligence has an incomplete list of IC elements:

Air Force Intelligence
Army Intelligence
Central Intelligence Agency
Coast Guard Intelligence
Defense Intelligence Agency
Department of Energy
Department of Homeland Security
Department of State
Department of the Treasury
Drug Enforcement Administration
Federal Bureau of Investigation
Marine Corps Intelligence
National Geospatial-Intelligence Agency
National Reconnaissance Office
National Security Agency
Navy Intelligence

I say “incomplete” because from E.O. 12333, it is missing (with original numbers for reference):

...
(7) The other offices within the Department of Defense for the collection of 
    specialized national foreign intelligence through reconnaissance programs;
(8) The intelligence and counterintelligence elements of ..., and the 
    Marine Corps;
...
(16) Such other elements of any department or agency as may be designated by 
     the President, or designated jointly by the Director and the head of the 
     department or agency concerned, as an element of the Intelligence Community.

Under #7 and #16, there are other IC elements that are unnamed and unlisted by the Office of the DNI. I suspect the Marines were omitted for stylistic reasons.

Where to Find Raw SIGINT?

Identified IC elements are important because the potential presence of “Raw SIGINT,” beyond the NSA, has increased their value as targets.

P. (U) Raw SIGINT is any SIGINT and associated data that has not been evaluated for foreign intelligence purposes and/or minimized.
… (emphasis in original, from the new rules.)

Tummarello is justly concerned about “innocent people” but there are less than innocent people, any number of appointed/elected officials or barons of industry, who may be captured on the flypaper of raw SIGINT.

Happy hunting!

PS:

Warning: It’s very bad OPSEC to keep a trophy chart on your wall. 😉

[Image: IC_Circle-460]

You will, despite this warning, but I had to try.

The original image is here at Wikipedia.

January 16, 2017

Never Allow Your Self-Worth To Depend Upon A Narcissist

Filed under: Journalism,News,Politics,Reporting — Patrick Durusau @ 5:26 pm

The White House press corps has failed, again, in its relationship with President Trump.

The latest debacle is described in Defiant WH Press Corps “won’t go away” if ejected, says Major Garrett.

From the post:

There have been rumblings about kicking the press out of the White House almost since Donald Trump won the presidency, culminating with a report in Esquire last week that the Trump administration has in fact been giving the idea “serious consideration.”

“If they do so, we’ll still cover him. The White House press corps won’t go away,” CBS News Chief White House Correspondent Major Garrett told CBSN’s Josh Elliott Monday. “You can shove us a block away, two blocks away, a mile away. We will be on top of this White House — as we’ve been on top of every White House.”

Mr. Trump and several on his communications team have had a stormy relationship with the press, both during his presidential campaign and during his transition.

“I would not be surprised if they moved us out. I really do think there is something about the Trump administration and those closest to him who want the symbolism of driving reporters out of the White House, moving the elites out farther away from this president,” Garrett said.

Does the self-worth of the White House press corps depend upon where they are located by a known narcissist?

If so, they are in for a long four years.

That is doubly true for Trump’s denigration of reporters and others.

A fundamental truth to remember for the next four years:

Trump’s comments about you, favorable or unfavorable, are smelly noise. They will dissipate, unless repeated over and over, as though it matters if a narcissist denies or affirms your existence.

It doesn’t.

XML.com Relaunch!

Filed under: XML,XML Schema,XPath,XQuery,XSLT — Patrick Durusau @ 4:11 pm

XML.com

Lauren Wood posted this note about the relaunch of XML.com recently:

I’ve relaunched XML.com (for some background, Tim Bray wrote an article here: https://www.xml.com/articles/2017/01/01/xmlcom-redux/). I’m hoping it will become part of the community again, somewhere for people to post their news (submit your news here: https://www.xml.com/news/submit-news-item/) and articles (see the guidelines at https://www.xml.com/about/contribute/). I added a job board to the site as well (if you’re in Berlin, Germany, or able to move there, look at the job currently posted; thanks LambdaWerk!); if your employer might want to post XML-related jobs please email me.

The old content should mostly be available but some articles were previously available at two (or more) locations and may now only be at one; try the archive list (https://www.xml.com/pub/a/archive/) if you’re looking for something. Please let me know if something major is missing from the archives.

XML is used in a lot of areas, and there is a wealth of knowledge in this community. If you’d like to write an article, send me your ideas. If you have comments on the site, let me know that as well.

Just in time as President Trump is about to stir, vigorously, that big pot of crazy known as federal data.

Mapping, processing, transformation demands will grow at an exponential rate.

Notice the emphasis on demand.

Taking two weeks to write custom software to sort files (you know the Weiner/Abedin laptop story, yes?) won’t be acceptable quite soon.

How are your on-demand XML chops?

Defeating Police Formations – Parallel Distributed Protesting

Filed under: Politics,Protests — Patrick Durusau @ 2:58 pm

If you haven’t read FEMA’s Field Force Operations PER-200, then you are unprepared for #DisruptJ20 or any other serious protest effort.

It’s a real snore in parts, but knowing police tactics will:

  1. Eliminate the element of surprise and fear of the unexpected
  2. Enable planning of protective clothing and other measures
  3. Enable planning of protests to eliminate police advantages
  4. Enable protesters to respond with their own formations

among other things.

One Common Police Formation

While reading Field Force Operations PER-200, I encountered several police formations you are likely to see at #DisruptJ20.

The crossbow arrest formation is found at pages 48-49 and illustrated with:

cross-bow-01-460

cross-bow-02-460

cross-bow-03-460

A number of counter tactics suggest themselves, depending upon your views on non-violence. Passive resistance by anyone who is arrested, thereby consuming more police personnel to secure their arrest. Passively preventing the retreat of the arrest team and its security circle. Breaching the skirmish line on either side of the column, just before the column surges forward, exposing the flank of the column.

Requirements for the crossbow arrest formation

What does the crossbow arrest formation require more than anything else?

You peeked! 😉

Yes, the police formations in Field Force Operations PER-200, including the crossbow arrest formation, all require a crowd.

Don’t get me wrong, crowds can be a good thing and sometimes the only solution. Standing Rock is a great example of taking and holding a location against all odds.

But a great tactic for one protest and its goals may be a poor tactic for another protest, depending upon goals, available tactics, resources, etc.

Consider the planned and permitted protests for #DisruptJ20.

All are subject to the police formations detailed by FEMA and the use of “less lethal” force by police forces.

How can #DisruptJ20 demonstrate the anger of the average citizen and at the same time defeat police formations?

Parallel Distributed Protesting

Instead of massing in a crowd, where police formations and “less lethal” force are options, what if protesters stopped, ran out of gas, or had flat tires on the 64-mile DC Beltway?

I mention the length of the Beltway, 64 miles, because it is ten miles longer than the marches from Selma to Montgomery, Alabama. You may remember one of those marches; it’s documented at The incident at the Edmund Pettus Bridge.

On March 7, 1965, Representative John Lewis, Hosea Williams and other protesters marched across the Pettus bridge knowing that brutality and perhaps death awaited them.

Protesters who honor Lewis, Williams and other great civil rights leaders can engage in parallel distributed protesting on January 20, 2017.

Each car slowing, stopping, having a flat tire, is a distributed protest point. With distributed protest points occurring in parallel, the Beltway grinds to a halt. No one enters or leaves Washington, D.C. for a day.

Not the same as the footage from the Pettus Bridge, but shutting down the D.C. Beltway will be a news story for months and years to come.

fox5dc-map-460

Lewis, Williams and others were willing to march into the face of violence and evil. Are you willing to drive to the D.C. Beltway to stop, run out of gas or have a flat tire in their honor?

PS: Beltway blockaders should always be respectful of police officers. They probably don’t like what is happening any more than you do. Besides, their police cruisers are also blocking traffic so their presence is contributing to the gridlock as well.

Password Advice For Leakers

Filed under: Cybersecurity,Journalism — Patrick Durusau @ 11:31 am

What the Most Common Passwords of 2016 List Reveals [Research Study] by Keeper.

As a prospective leaker, if your password is any of the ones listed below, congratulations! (“123456” leads with 17%.)

Your password is in the top 50% of 10 million passwords analyzed by Keeper in 2016.

Extremely plausible evil hackers “discovered” your login and then “cracked” your password.

No longer a “leak,” but a theft and the thief isn’t you. (How’s that for protecting leakers?)

Rank Password
1. 123456
2. 123456789
3. qwerty
4. 12345678
5. 11111
6. 1234567890
7. 1234567
8. password
9. 123123
10. 987654321
11. qwertyuiop
12. mynoob
13. 123321
14. 666666
15. 18atcsk2w
16. 7777777
17. 1q2w3e4r
18. 654321
19. 555555
20. 3rjs1la7qe
21. google
22. 1q2w3e4r5t
23. 123qwe
24. zxcvbnm
25. 1q2w3e

You must follow the leaking instructions at: https://theintercept.com/leak/, but leak only your login, password and network URL.

No guarantees that The Intercept will take the initiative but they aren’t the only game in town.

Highly Effective Gmail Phishing

Filed under: Cybersecurity,Journalism,News,Reporting — Patrick Durusau @ 8:56 am

Wide Impact: Highly Effective Gmail Phishing Technique Being Exploited by Mark Maunder.

From the post:

As you know, at Wordfence we occasionally send out alerts about security issues outside of the WordPress universe that are urgent and have a wide impact on our customers and readers. Unfortunately this is one of those alerts. There is a highly effective phishing technique stealing login credentials that is having a wide impact, even on experienced technical users.

I have written this post to be as easy to read and understand as possible. I deliberately left out technical details and focused on what you need to know to protect yourself against this phishing attack and other attacks like it in the hope of getting the word out, particularly among less technical users. Please share this once you have read it to help create awareness and protect the community.

Mark’s omission of the “technical details” makes this more of an advertisement for phishing with Gmail than a how-to guide.

Still, the observation that even “experienced technical users” are trapped by this technique should encourage journalists in particular to consider adding phishing, voluntary or otherwise, to their data gathering toolkit.

As I pointed out yesterday, Phishing As A Public Service – Leak Access, Not Data, enabling leakers to choose to receive phishing emails can result in greater access to documents by reporters at less risk to leakers.

With the daily hype about data breaches, who can blame some mid-level management type for their computer being breached? Oh, it could result in loss of employment, maybe, but greatly reduces the odds of being fingered as a leaker.

Unlike plain brown paper wrappers with Glenn Greenwald‘s address on them. 😉

If phishing sounds a bit exotic, consider listing software/versions with known vulnerabilities that users can install and then visit a website for an innocent registration that captures their details.

Journalism as active information gathering as opposed to consuming leaks and government hand-outs.

January 15, 2017

Phishing As A Public Service – Leak Access, Not Data

Filed under: Government,Politics,Protests — Patrick Durusau @ 5:23 pm

The Intercept tweeted today:

intercept-460

Kudos to The Intercept for reaching out to (US) federal employees to encourage safe leaking.

On the other hand, have you thought about the allocation of risks for leaking?

Take Edward Snowden for example. If caught, Snowden is going to jail, NOT Glenn Greenwald or other reporters who used the Snowden leak.

The Intercept has a valid point when it says:


Without leaks, journalists would have never connected the Watergate scandal to President Nixon, or discovered that the Reagan White House illegally sold weapons to Iran. In the past 15 years alone, inside sources played a vital role in uncovering secret prisons, abuses at Abu Ghraib, atrocities in Afghanistan and Iraq, and mass surveillance by the NSA.

At least historically speaking. Back in the days when hard copy was the norm.

Hard copy isn’t the norm now and leaking guidelines need to catch up to the present day.

Someone could have leaked a portion of the Office of Personnel Management records but in a modern age, digital was far more powerful. (That was a straight hack but it illustrates the difference between sweaty smuggling of hard copy versus giving others the key to a vault.)

Instead of leaking documents/data, imagine following these instructions:

The best option is to use our SecureDrop server, which has the advantage of allowing us to send messages back to you, while allowing you to remain totally anonymous — even to us, if that is what you prefer.

  • Begin by bringing your personal computer to a Wi-Fi network that isn’t associated with you or your employer, like one at a coffee shop. Download the Tor Browser. (Tor allows you to go online while concealing your IP address from the websites you visit.)
  • You can access our SecureDrop server by going to http://y6xjgkgwj47us5ca.onion/ in the Tor Browser. This is a special kind of URL that only works in Tor. Do NOT type this URL into a non-Tor Browser. It won’t work — and it will leave a record.
  • If that is too complicated, or you don’t wish to engage in back-and-forth communication with us, a perfectly good alternative is to simply send mail to P.O. Box 65679, Washington, D.C., 20035, or to The Intercept, 114 Fifth Avenue, 18th Floor, New York, New York, 10011. Drop it in a mailbox (do not send it from home, work or a post office) with no return address.

And you send the following:

  1. Your email address
  2. Screen shots of legitimate emails you get on a regular basis
  3. What passwords are the most important

That’s it.

The receiver constructs a phishing email and sends it to your address.

Like John Podesta and numerous other public figures, you are taken in by this scam.

Evil doers use your present password for access and you have system recorded evidence that you were duped.

How does that allocation of risk look to you, as a potential leaker?

PS: Some, but not all, journalists will be quick to point out what I suggest is, drum roll, illegal. OK, and the question?

Those journalists are being very brave on behalf of leakers, knowing they will never share the fate of a leaker.

I make an exception for all the very brave journalists writing outside of the United States and a few other areas at great personal risk. But then they are unlikely to be concerned with the niceties of the law when dealing with a rogue government.

Update: Apologies but I forgot to include a link to the original post: Attention Federal Employees: If You See Something, Leak Something.

Thoughts on Blockading Metro Rail Stops

Filed under: Government,Politics,Protests — Patrick Durusau @ 3:38 pm

A recent news report mentioned the potential for blockades of DC Metro Rail stops.

Curbed Washington DC posted a list of those stops, but like many reporters, did not provide links to the stops.

🙁

Here’s their list:

metro-stops-460

Metro Stops with Hyperlinks

Here’s my version, in the same ticket color order:

Presented as the original, the list leaves the impression of more Metro stops than require blockading. The “apparent” count of Metro stops is twelve (12).

Discovering Duplicate Metro Rail Stops

Rearrangement by Metro Rail stops reveals duplicates:

Deduped Metro Stops and Priority Map

If we remove the duplicate stops and sort by stop name, we find only eight (8) Metro Stops for blockading.

  1. Capitol South Green Ticket Holders
  2. Eastern Market Green Ticket Holders
  3. Federal Center SW Orange Ticket Holders, Silver Ticket Holders
  4. Gallery Place-Chinatown Blue Ticket Holders, Red Ticket Holders
  5. Judiciary Square Blue Ticket Holders, Red Ticket Holders
  6. L’Enfant Plaza Orange Ticket Holders, Silver Ticket Holders
  7. NoMa-Gallaudet U Yellow Ticket Holders
  8. Union Station Yellow Ticket Holders

All of this is public information and with a little rearrangement, it becomes easier to focus resources on any potential blockading of those stops.
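The rearrangement is just an inversion of the list: from ticket color to stops into stop to ticket colors. A minimal sketch in Python, assuming the color-to-stop pairs implied by the deduplicated list above (the variable names and the order of the colors are mine, not Curbed’s):

```python
from collections import defaultdict

# Ticket color -> Metro stops. Reconstructed from the deduplicated list
# in this post; the ordering of colors is an assumption.
stops_by_color = {
    "Green": ["Capitol South", "Eastern Market"],
    "Orange": ["Federal Center SW", "L'Enfant Plaza"],
    "Silver": ["Federal Center SW", "L'Enfant Plaza"],
    "Blue": ["Gallery Place-Chinatown", "Judiciary Square"],
    "Red": ["Gallery Place-Chinatown", "Judiciary Square"],
    "Yellow": ["NoMa-Gallaudet U", "Union Station"],
}

# Invert the mapping: stop -> list of ticket colors that use it.
colors_by_stop = defaultdict(list)
for color, stops in stops_by_color.items():
    for stop in stops:
        colors_by_stop[stop].append(color)

# The "apparent" count repeats shared stops; the unique count does not.
apparent_count = sum(len(stops) for stops in stops_by_color.values())
unique_count = len(colors_by_stop)

print(f"apparent: {apparent_count}, unique: {unique_count}")
for stop in sorted(colors_by_stop):
    print(stop, "-", ", ".join(colors_by_stop[stop]))
```

Any stop with more than one color attached serves more than one group of ticket holders.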

In terms of priorities, Curbed Washington DC posted a map of the gate locations and guest sections for ticket holders. I took a screen-shot of the center portion:

guest-sections-460

If you are interested in activities around the checkpoints, see the larger map.

So You Want To Blockade A Metro Stop

A map of Union Station reminded me that open street blockading isn’t likely to close a Metro Rail stop.

Why? Even with a large number of hardened protesters, the police can approach you from all sides, driving you in particular directions with “less lethal” weapons.

But the architecture of a Metro Rail stop offers an alternative strategy to open air resistance.

Don’t blockade outside a Metro Rail stop, blockade the stop by occupying stairwells, access points, etc.

Anyone opposing the blockade will seek to restore service and so be less likely to use persistent gases or other irritants in closed spaces.

The other advantage of escalators and stairways is that the police can only approach from in front of or behind you, enabling you to defend the edges of your formation with layers of the most recalcitrant protesters.

I know you intend to peacefully and lawfully assemble only but be aware you may have those in your midst who damage and/or disable turnstiles. Either with some variety of fast acting adhesives or jamming them with thin metallic objects. Although illegal, those acts will also contribute to delaying the restoration of full service.

More thoughts on blockades to reduce the number of people reaching Metro Rail stops tomorrow.

PS: It’s unfortunate the Metro doesn’t use tokens anymore. There are some interesting things that can happen with tokens.

January 14, 2017

New Spaceship Speed in Conway’s Game of Life

Filed under: Cellular Automata,Computer Science — Patrick Durusau @ 5:09 pm

New Spaceship Speed in Conway’s Game of Life by Alexey Nigin.

From the post:

In this article, I assume that you have basic familiarity with Conway’s Game of Life. If this is not the case, you can try reading an explanatory article but you will still struggle to understand much of the following content.

The day before yesterday ConwayLife.com forums saw a new member named zdr. When we the lifenthusiasts meet a newcomer, we expect to see things like “brand new” 30-cell 700-gen methuselah and then have to explain why it is not notable. However, what zdr showed us made our jaws drop.

It was a 28-cell c/10 orthogonal spaceship:

An animated image of the spaceship

… (emphasis in the original)

The mentioned introduction isn’t sufficient to digest the material in this post.

There is a wealth of material available on cellular automata (the Game of Life is one).

LifeWiki is one source and Complex Cellular Automata is another. While they are not exhaustive of all there is to know about cellular automata, gaining familiarity with them will take some time and skill.
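If “spaceship” and “c/10” are new to you: a spaceship is a pattern that reappears, displaced, after a fixed number of generations, and c/10 means it moves one cell every ten generations. Here is a minimal sketch in Python using the well-known five-cell glider (not zdr’s 28-cell ship, which I don’t reproduce here). The glider moves one cell diagonally every four generations, i.e. c/4. The function and variable names are my own.

```python
from collections import Counter

def step(live):
    """Advance a set of live (x, y) cells by one Game of Life generation."""
    # Count live neighbors for every cell adjacent to at least one live cell.
    counts = Counter(
        (x + dx, y + dy)
        for x, y in live
        for dx in (-1, 0, 1)
        for dy in (-1, 0, 1)
        if (dx, dy) != (0, 0)
    )
    # Birth on exactly 3 neighbors, survival on 2 or 3.
    return {cell for cell, n in counts.items()
            if n == 3 or (n == 2 and cell in live)}

# The glider, with y increasing downward:
#   .O.
#   ..O
#   OOO
glider = {(1, 0), (2, 1), (0, 2), (1, 2), (2, 2)}

pattern = glider
for _ in range(4):
    pattern = step(pattern)

# After four generations the same shape sits one cell down and to the right.
shifted = {(x + 1, y + 1) for x, y in glider}
print(pattern == shifted)
```

Checking a c/10 orthogonal ship works the same way: run ten generations and compare against the starting pattern shifted one cell along a single axis.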

Still, I offer this as encouragement that fundamental discoveries remain to be made.

But if and only if you reject conventional wisdom that prevents you from looking.

Looking up words in the OED with XQuery [Details on OED API Key As Well]

Filed under: Dictionary,XML,XQuery — Patrick Durusau @ 4:14 pm

Looking up words in the OED with XQuery by Clifford Anderson.

Clifford has posted a gist of work from the @VandyLibraries XQuery group, looking up words in the Oxford English Dictionary (OED) with XQuery.

To make full use of Clifford’s post, you will need a key for the Oxford Dictionaries API.

If you go straight to the regular Oxford English Dictionary (I’m omitting the URL so you don’t make the same mistake), there is nary a mention of the Oxford Dictionaries API.

The free plan allows 3K queries a month.

Not enough to shut out the outside world for the next four/eight years but enough to decide if it’s where you want to hide.

Application for the free api key was simple enough.

Save that the dumb password checker insisted on one or more special characters, plus one or more digits, plus upper and lowercase. When you get beyond 12 characters the insistence on a special character is just a little lame.
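A back-of-the-envelope way to see why: for a password of uniformly random characters, the search space is measured by length times log2 of the alphabet size. The sketch below is in Python and rests on my own assumptions, not the OED checker’s rules: 62 alphanumeric characters, plus the 32 printable ASCII punctuation characters as “specials,” and truly random character choice (which human-chosen passwords are not).

```python
import math

def entropy_bits(length, alphabet_size):
    """Bits of entropy for `length` characters drawn uniformly at random."""
    return length * math.log2(alphabet_size)

ALNUM = 26 + 26 + 10        # uppercase, lowercase, digits
WITH_SPECIALS = ALNUM + 32  # plus printable ASCII punctuation (assumed set)

short_complex = entropy_bits(8, WITH_SPECIALS)  # short, every class allowed
long_plain = entropy_bits(13, ALNUM)            # longer, no specials at all

print(f"8 chars with specials: {short_complex:.1f} bits")
print(f"13 alphanumeric chars: {long_plain:.1f} bits")
```

Under those assumptions, length does far more work than the extra character class.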

Email response with the key was fast, so I’m in!

What about you?

January 13, 2017

D-Wave Just Open-Sourced Quantum Computing [DC Beltway Parking Lot Distraction]

Filed under: Computer Science,Quantum — Patrick Durusau @ 9:10 pm

D-Wave Just Open-Sourced Quantum Computing by Dom Galeon.

D-Wave has just released a welcome distraction for CS types sitting in the DC Beltway Parking Lot on January 20-21, 2017. (I’m assuming you brought extra batteries for your laptop.) After you run out of gas, your laptop will be running on battery power alone.

Just remember to grab a copy of Qbsolv before you leave for the tailgate/parking lot party on the Beltway.

A software tool known as Qbsolv allows developers to program D-Wave’s quantum computers even without knowledge of quantum computing. It has already made it possible for D-Wave to work with a bunch of partners, but the company wants more. “D-Wave is driving the hardware forward,” Bo Ewald, president of D-Wave International, told Wired. “But we need more smart people thinking about applications, and another set thinking about software tools.”

To that end, D-Wave has open-sourced Qbsolv, making it possible for anyone to freely share and modify the software. D-Wave hopes to build an open source community of sorts for quantum computing. Of course, to actually run this software, you’d need access to a piece of hardware that uses quantum particles, like one of D-Wave’s quantum computers. However, for the many who don’t have that access, the company is making it possible to download a D-Wave simulator that can be used to test Qbsolv on other types of computers.

This open-source Qbsolv joins an already-existing free software tool called Qmasm, which was developed by one of Qbsolv’s first users, Scott Pakin of Los Alamos National Laboratory. “Not everyone in the computer science community realizes the potential impact of quantum computing,” said mathematician Fred Glover, who’s been working with Qbsolv. “Qbsolv offers a tool that can make this impact graphically visible, by getting researchers and practitioners involved in charting the future directions of quantum computing developments.”

D-Wave’s machines might still be limited to solving optimization problems, but it’s a good place to start with quantum computers. Together with D-Wave, IBM has managed to develop its own working quantum computer in 2000, while Google teamed up with NASA to make their own. Eventually, we’ll have a quantum computer that’s capable of performing all kinds of advanced computing problems, and now you can help make that happen.

From the github page:

qbsolv is a metaheuristic or partitioning solver that solves a potentially large quadratic unconstrained binary optimization (QUBO) problem by splitting it into pieces that are solved either on a D-Wave system or via a classical tabu solver.
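If QUBO is a new term: the problem is to pick a 0/1 value for each variable so that the sum of Q[i, j] * x[i] * x[j] is as small as possible. The Python sketch below simply tries every assignment of a three-variable toy problem. It is not qbsolv’s partitioning or tabu method, and the matrix, function names and dictionary layout are illustrative assumptions of mine.

```python
from itertools import product

def qubo_energy(Q, x):
    """Sum of Q[(i, j)] * x[i] * x[j] over the stored entries of Q."""
    return sum(w * x[i] * x[j] for (i, j), w in Q.items())

def brute_force_qubo(Q, n):
    """Try all 2**n binary assignments; return (best_x, best_energy)."""
    best = min(product((0, 1), repeat=n), key=lambda x: qubo_energy(Q, x))
    return best, qubo_energy(Q, best)

# Toy problem: each variable earns -1 when set, but setting two
# neighboring variables together costs +2.
Q = {(0, 0): -1, (1, 1): -1, (2, 2): -1, (0, 1): 2, (1, 2): 2}

best_x, best_energy = brute_force_qubo(Q, 3)
print(best_x, best_energy)
```

Brute force doubles in cost with every added variable, which is why large instances call for heuristics or, per D-Wave, quantum hardware.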

The phrase, “…might still be limited to solving optimization problems…” isn’t as limiting as it might appear.

A recent (2014) survey of quadratic unconstrained binary optimization (QUBO), The Unconstrained Binary Quadratic Programming Problem: A Survey runs some thirty-three pages and should keep you occupied however long you sit on the DC Beltway.

From page 10 of the survey:


Kochenberger, Glover, Alidaee, and Wang (2005) examine the use of UBQP as a tool for clustering microarray data into groups with high degrees of similarity.

Where I read one person’s “similarity” to be another person’s test of “subject identity.”

PS: Enjoy the DC Beltway. You may never see it motionless ever again.

Calling Bullshit in the Age of Big Data (Syllabus)

Filed under: Critical Reading,Journalism,News,Reporting,Research Methods — Patrick Durusau @ 7:33 pm

Calling Bullshit in the Age of Big Data by Carl T. Bergstrom and Jevin West.

From the about page:

The world is awash in bullshit. Politicians are unconstrained by facts. Science is conducted by press release. So-called higher education often rewards bullshit over analytic thought. Startup culture has elevated bullshit to high art. Advertisers wink conspiratorially and invite us to join them in seeing through all the bullshit, then take advantage of our lowered guard to bombard us with second-order bullshit. The majority of administrative activity, whether in private business or the public sphere, often seems to be little more than a sophisticated exercise in the combinatorial reassembly of bullshit.

We’re sick of it. It’s time to do something, and as educators, one constructive thing we know how to do is to teach people. So, the aim of this course is to help students navigate the bullshit-rich modern environment by identifying bullshit, seeing through it, and combatting it with effective analysis and argument.

What do we mean, exactly, by the term bullshit? As a first approximation, bullshit is language intended to persuade by impressing and overwhelming a reader or listener, with a blatant disregard for truth and logical coherence.

While bullshit may reach its apogee in the political sphere, this isn’t a course on political bullshit. Instead, we will focus on bullshit that comes clad in the trappings of scholarly discourse. Traditionally, such highbrow nonsense has come couched in big words and fancy rhetoric, but more and more we see it presented instead in the guise of big data and fancy algorithms — and these quantitative, statistical, and computational forms of bullshit are those that we will be addressing in the present course.

Of course an advertisement is trying to sell you something, but do you know whether the TED talk you watched last night is also bullshit — and if so, can you explain why? Can you see the problem with the latest New York Times or Washington Post article fawning over some startup’s big data analytics? Can you tell when a clinical trial reported in the New England Journal or JAMA is trustworthy, and when it is just a veiled press release for some big pharma company?

Our aim in this course is to teach you how to think critically about the data and models that constitute evidence in the social and natural sciences.

Learning Objectives

Our learning objectives are straightforward. After taking the course, you should be able to:

  • Remain vigilant for bullshit contaminating your information diet.
  • Recognize said bullshit whenever and wherever you encounter it.
  • Figure out for yourself precisely why a particular bit of bullshit is bullshit.
  • Provide a statistician or fellow scientist with a technical explanation of why a claim is bullshit.
  • Provide your crystals-and-homeopathy aunt or casually racist uncle with an accessible and persuasive explanation of why a claim is bullshit.

We will be astonished if these skills do not turn out to be among the most useful and most broadly applicable of those that you acquire during the course of your college education.

A great syllabus and impressive set of readings, although I must confess my disappointment that Is There a Text in This Class? The Authority of Interpretive Communities and Doing What Comes Naturally: Change, Rhetoric, and the Practice of Theory in Literary and Legal Studies, both by Stanley Fish, weren’t on the list.

Bergstrom and West are right about the usefulness of this “class” but I would use Fish and other literary critics to push your sensitivity to “bullshit” a little further than the readings indicate.

All communication is an attempt to persuade within a social context. If you share a context with a speaker, you are far more likely to recognize and approve of their use of “evidence” to make their case. If you don’t share such a context, say a person claiming a particular interpretation of the Bible due to divine revelation, their case doesn’t sound like it has any evidence at all.

It’s a subtle point but one known in the legal, literary and philosophical communities for a long time. That it’s new to scientists and/or data scientists speaks volumes about the lack of humanities education in science majors.

Security Design: Stop Trying to Fix the User (Or Catch Offenders)

Filed under: Cybersecurity,Security — Patrick Durusau @ 4:09 pm

Security Design: Stop Trying to Fix the User by Bruce Schneier.

From the post:

Every few years, a researcher replicates a security study by littering USB sticks around an organization’s grounds and waiting to see how many people pick them up and plug them in, causing the autorun function to install innocuous malware on their computers. These studies are great for making security professionals feel superior. The researchers get to demonstrate their security expertise and use the results as “teachable moments” for others. “If only everyone was more security aware and had more security training,” they say, “the Internet would be a much safer place.”

Enough of that. The problem isn’t the users: it’s that we’ve designed our computer systems’ security so badly that we demand the user do all of these counterintuitive things. Why can’t users choose easy-to-remember passwords? Why can’t they click on links in emails with wild abandon? Why can’t they plug a USB stick into a computer without facing a myriad of viruses? Why are we trying to fix the user instead of solving the underlying security problem?

Traditionally, we’ve thought about security and usability as a trade-off: a more secure system is less functional and more annoying, and a more capable, flexible, and powerful system is less secure. This “either/or” thinking results in systems that are neither usable nor secure.

Non-reliance on users is a good first step.

An even better second step would create financial incentives for Bruce’s first step.

Financial incentives similar to those in products liability cases, where a “reasonable care” standard evolves over time. No product has to be perfect, but there are expectations of how not bad a product must be.

Liability not only for the producer of the software but also enterprises using that software, when third-parties are hurt by data breaches.

Claims about the complexity of software are true, but can you honestly say that software is more complex than drug interactions across an unknown population? Yet, we have products liability standards for those cases.

Without financial incentives, substantial financial incentives, such as with products liability, cybersecurity experts (Bruce excepted) will still be trying to “fix the user” a decade from now.

The romantic quest to capture and punish those guilty of cybercrime hasn’t worked so well. One collection of cybercrime statistics pointed out that detected cybercrime incidents increased by 38% in the last year.

Tell me, do you know of any statistics showing a 38% increase in the arrest and prosecution of cybercriminals in the last year? No? That’s what I thought.

With estimated cybercrime prevention spending at $80 billion this year and an estimated cybercrime cost of $2 trillion by 2019, you don’t seem to be getting very much return on your investment.

We know that fixing users doesn’t work and capturing cybercriminals is a dicey proposition.

Both of those issues can be addressed by establishing incentives for more secure software. (Legal liability takes legislative misjudgment out of the loop, enabling the organic growth of software liability principles.)

Ultrasound Tracking Defeats Tor (Provides Pathway Into Government Offices)

Filed under: Cybersecurity,Government,Security,Tor — Patrick Durusau @ 2:26 pm

Tor users at risk of being unmasked by ultrasound tracking by Danny Bradbury.

How close is your phone to your computer right now?

That close?

You may want to rethink your phone’s location.

From the post:

A new type of attack should make Tor users – and countless dogs around the world – prick up their ears. The attack, revealed at BlackHat Europe in November and at the 33rd Chaos Computer Congress the following month, uses ultrasounds to track users, even if they are communicating over anonymous networks.

The attack uses a technique called ultrasound cross-device tracking (uXDT), which made its way into advertising circles as early as 2012. Marketing companies running uXDT campaigns will play an ultrasonic sound, inaudible to the human ear, in a TV or radio ad, or even in an ad delivered via a computer browser.

Although the user won’t hear it, other devices such as smartphones using uXDT-enabled apps will be listening. When the app hears the signal, it will ping the advertising network with details about itself. What details? Anything it asks the phone for, such as its IP address, geolocation coordinates, telephone number and IMEI (SIM card) code.

That’s creepy enough in marketing. Now, advertisers can tell what TV or radio ads you’ve been listening to, matching them with the universe of other information they have about you from your web searches, social media activity and emails.

In essence the technique uses an ultrasound “beacon” to trigger your phone to “call home.”

Hmmm, betrayed by your own phone.

Danny outlines a number of scenarios of governments using this technique against users.

Ultrasound tracking poses a significant risk for Tor users, but they are security conscious enough to be using Tor.

Consider the flip side of using ultrasound tracking as a pathway into government offices. A phone that can “call home” can certainly listen for keystrokes.

Where do you think most sysadmins keep their phones? 😉

ODI – Access To Legal Data News

Filed under: Law,Law - Sources,Legal Informatics,Open Access,Open Data — Patrick Durusau @ 12:44 pm

Strengthening our legal data infrastructure by Amanda Smith.

Amanda recounts an effort between the Open Data Institute (ODI) and Thomson Reuters to improve access to legal data.

From the post:


Paving the way for a more open legal sector: discovery workshop

In September 2016, Thomson Reuters and the ODI gathered publishers of legal data, policy makers, law firms, researchers, startups and others working in the sector for a discovery workshop. Its aims were to explore important data types that exist within the sector, and map where they sit on the data spectrum, discuss how they flow between users and explore the opportunities that taking a more open approach could bring.

The notes from the workshop explore current mechanisms for collecting, managing and publishing data, benefits of wider access and barriers to use. There are certain questions that remain unanswered – for example, who owns the copyright for data collected in court. The notes are open for comments, and we invite the community to share their thoughts on these questions, the data types discussed, how to make them more open and what we might have missed.

Strengthening data infrastructure in the legal sector: next steps

Following this workshop we are working in partnership with Thomson Reuters to explore data infrastructure – datasets, technologies and processes and organisations that maintain them – in the legal sector, to inform a paper to be published later in the year. The paper will focus on case law, legislation and existing open data that could be better used by the sector.

The Ministry of Justice have also started their own data discovery project, which the ODI have been contributing to. You can keep up to date on their progress by following the MOJ Digital and Technology blog and we recommend reading their data principles.

Get involved

We are looking to the legal and data communities to contribute opinion pieces and case studies to the paper on data infrastructure for the legal sector. If you would like to get involved, contact us.
…(emphasis in original)

Encouraging news, especially for those interested in building value-added tools on top of data that is made available publicly. At least they can avoid the cost of collecting data already collected by others.

Take the opportunity to comment on the notes and participate as you are able.

If you think you have seen use cases for topic maps before, consider that the Code of Federal Regulations (US), as of December 12, 2016, has 54,938 separate, but not unique, definitions of “person.” The impact of each regulation depends upon its definition of that term.

Other terms have similar semantic difficulties in both the Code of Federal Regulations and the US Code.

Cellebrite Hacked (Crowd-Funding for Tools?)

Filed under: Cybersecurity,Security — Patrick Durusau @ 11:07 am

Phone-Hacking Firm Cellebrite Got Hacked; 900GB of Data Stolen by Swati Khandelwal.

From the post:

Israeli firm Cellebrite, the popular company that provides digital forensics tools and software to help law enforcement access mobile phones in investigations, has had 900 GB of its data stolen by an unknown hacker.

But the hacker has not yet publicly released anything from the stolen data archive, which includes its customer information, user databases, and a massive amount of technical data regarding its hacking tools and products.

Instead, attackers are looking for possible opportunities to sell the access to Cellebrite system and data on a few selected IRC chat rooms, the hacker told Joseph Cox, contributor at Motherboard, who was contacted by the hacker and received a copy of the stolen data.

I can understand the hacker’s desire to make money. If the price is a reasonable one, unlike that of TheShadowBrokers, who are still pricing themselves out of a sale (approximately $8,230,000), crowd-funding might be a useful approach to purchasing the tools for public release.

I can’t afford to bid on the tools as an individual, but would contribute to a crowd-funded effort to secure a public release of the tools.

Why? The more hacking tools that are available, the less secure governments become.

People become less secure as well but governments are a far greater threat to people than cyber-criminals will ever be.

Cyber-criminals want your money, governments want your freedom.

Humanities Digital Library [A Ray of Hope]

Filed under: Digital Library,Humanities,Library,Open Access — Patrick Durusau @ 10:16 am

Humanities Digital Library (Launch Event)

From the webpage:

Date
17 Jan 2017, 18:00 to 17 Jan 2017, 19:00

Venue

IHR Wolfson Conference Suite, NB01/NB02, Basement, IHR, Senate House, Malet Street, London WC1E 7HU

Description

6-7pm, Tuesday 17 January 2017

Wolfson Conference Suite, Institute of Historical Research

Senate House, Malet Street, London, WC1E 7HU

www.humanities-digital-library.org

About the Humanities Digital Library

The Humanities Digital Library is a new Open Access platform for peer reviewed scholarly books in the humanities.

The Library is a joint initiative of the School of Advanced Study, University of London, and two of the School’s institutes—the Institute of Historical Research and the Institute of Advanced Legal Studies.

From launch, the Humanities Digital Library offers scholarly titles in history, law and classics. Over time, the Library will grow to include books from other humanities disciplines studied and researched at the School of Advanced Study. Partner organisations include the Royal Historical Society whose ‘New Historical Perspectives’ series will appear in the Library, published by the Institute of Historical Research.

Each title is published as an open access PDF, with copies also available to purchase in print and EPUB formats. Scholarly titles come in several formats—including monographs, edited collections and longer and shorter form works.
(emphasis in the original)

Timely evidence that not everyone in the UK is barking mad! “Barking mad” being the only explanation I can offer for the Investigatory Powers Bill.

I won’t be attending, but if you can, do, and support the Humanities Digital Library after it opens.

The People vs the Snoopers’ Charter [No Input = No Surveillance, Of Gaff Hooks]

Filed under: Government,Privacy — Patrick Durusau @ 9:58 am

The People vs the Snoopers’ Charter

From the webpage:


Ever googled something personal?

Who you text, email or call. Your social media activity. Which websites you visit.

Who you bank with. Where your kids go to school. Your sexual preferences, health worries, religious and political beliefs.

Since November, the Snoopers’ Charter – the Investigatory Powers Act – has let the Government access all this intimate information, building up an incredibly detailed picture of you, your family and friends, your hobbies and habits – your entire life.

And it won’t just be accessed by the Home Secretary. Dozens of agencies – the Department for Work and Pensions, HMRC and 46 others – can now see sensitive details of your personal life.

Over 200,000 people signed a petition to stop the Snoopers’ Charter, the Government didn’t listen so we’re taking them to court and we need your help.

There’s no opt-out and you don’t need to be suspected of anything. It will just happen all the time, to every one of us.

The Investigatory Powers Act lets Government keep records of and monitor your private emails, texts and phone calls – that’s where you are, who you speak to, what you say – and all without any suspicion of wrongdoing.

It forces internet companies like Sky, BT and TalkTalk to log every website you visit or app you have used, creating a vast database of deeply sensitive and revealing information. At a time when companies and governments are under increasingly frequent attack from hackers, this will create a goldmine for criminals and foreign spies.

Your support will help us clear the first hurdle, being granted permission by the Court to proceed with our case against the Government.

It’s time we all took a stand. We’ve told the Government we’ll see them in court and we need your help to make that happen. Please donate whatever you can to fund this vital case.
… (emphasis in original)

In case you are missing the background, see: Investigatory Powers Act 2016, which is now law in the UK.

The text as originally enacted.

The true extent of surveillance in the United States is unknown so it isn’t clear if the UK was playing “catch up” with this draconian measure or trying to beat the United States in a race to the least civil society.

Either way, it is an unfortunate milestone in the legal history of a country that gave us the common law.

[Image: surveillance camera]

From a data science perspective, I would point out that no input = no surveillance.

Your eyes may be better than mine, but in the surveillance camera image, I count at least three vulnerabilities that would render the camera useless.

Ordinary wire cutters:

[Image: wire cutters]

won’t be useful but a gaff hook could be quite effective in creating a no input state.

The same principle applies whether you choose a professionally made gaff hook or some DIY version of the same instrument.

A gaff hook won’t stop surveillance of ISPs, etc., but disabling a surveillance camera could be seen as poking the government in the eye.

That’s an image I can enjoy. You?

PS: I’m not intimate with UK criminal law. Is possession of a gaff hook legal in the UK?

January 12, 2017

Applied Computational Genomics Course at UU: Spring 2017

Filed under: Bioinformatics,Computational Biology,Genomics — Patrick Durusau @ 9:39 pm

Applied Computational Genomics Course at UU: Spring 2017 by Aaron Quinlan.

I initially noticed this resource through posts on the two-part Introduction to Unix (part 1) and Introduction to Unix (part 2).

Both are too elementary for you, but they are something you can pass on to others. They do give you an idea of the Unix skill level required for the rest of the course.

From the GitHub page:

This course will provide a comprehensive introduction to fundamental concepts and experimental approaches in the analysis and interpretation of experimental genomics data. It will be structured as a series of lectures covering key concepts and analytical strategies. A diverse range of biological questions enabled by modern DNA sequencing technologies will be explored including sequence alignment, the identification of genetic variation, structural variation, and ChIP-seq and RNA-seq analysis. Students will learn and apply the fundamental data formats and analysis strategies that underlie computational genomics research. The primary goal of the course is for students to be grounded in theory and leave the course empowered to conduct independent genomic analyses. (emphasis in the original)

I take it successful completion will also enable you to intelligently question genomic analyses by others.

The explosive growth of genomics makes that a valuable skill in public discussions as well as something nice for your toolbox.
