Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

July 28, 2017

Microsoft Fuzzing (Linux Too)

Filed under: Cybersecurity,Fuzzing,Security — Patrick Durusau @ 4:53 pm

Microsoft Security Risk Detection

From the webpage:

What is Microsoft Security Risk Detection?

Security Risk Detection is Microsoft’s unique fuzz testing service for finding security critical bugs in software. Security Risk Detection helps customers quickly adopt practices and technology battle-tested over the last 15 years at Microsoft.

“Million dollar” bugs

Security Risk Detection uses “Whitebox Fuzzing” technology which discovered 1/3rd of the “million dollar” security bugs during Windows 7 development.

Battle tested tech

The same state-of-the-art tools and practices honed at Microsoft for the last decade and instrumental in hardening Windows and Office — with the results to prove it.

Scalable fuzz lab in the cloud

One click scalable, automated, Intelligent Security testing lab in the cloud.

Cross-platform support

Linux Fuzzing is now available. So, whether you’re building or deploying software for Windows or Linux or both, you can utilize our Service.

No bug detection and/or fuzzing technique is 100%.

Here MS says for one product its “Whitebox Fuzzing” was 33% effective against “million dollar” security bugs.

A more meaningful evaluation of “Whitebox Fuzzing” would be to say which of the 806 Windows 7 vulnerabilities listed at CVE Details were detected and which ones were not.

I don’t know your definition of a “million dollar” security bug, so statistics against known bugs would be more meaningful.

Yes?

Open Source GPS Tracking System: Traccar (Super Glue + Burner Phone)

Filed under: Geographic Data,GPS — Patrick Durusau @ 10:33 am

Open Source GPS Tracking System: Traccar

From the post:

Traccar is an open source GPS tracking system for various GPS tracking devices. This Maven Project is written in Java and works on most platforms with installed Java Runtime Environment. System supports more than 80 different communication protocols from popular vendors. It includes web interface to manage tracking devices online… Traccar is the best free and open source GPS tracking system software offers self hosting real time online vehicle fleet management and personal tracking… Traccar supports more than 80 GPS communication protocols and more than 600 models of GPS tracking devices.

(image omitted)

To start using Traccar Server follow instructions below:

  • Download and install Traccar
  • Reboot system, Traccar will start automatically
  • Open web interface (http://localhost:8082)
  • Log in as administrator (user – admin, password – admin) or register a new user
  • Add new device with unique identifier (see section below)
  • Configure your device to use appropriate address and port (see section below)
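
Once the server is up, a quick way to confirm a device identifier is accepted is to send a test position report over Traccar’s OsmAnd-style HTTP protocol. This is only a sketch: it assumes a local install with that protocol on its default port (5055) and a device registered with the identifier "123456"; adjust both for your setup.

import requests

# Minimal position report to a local Traccar server (OsmAnd protocol, default port 5055).
report = {
    "id": "123456",     # the unique identifier entered in the Traccar web interface
    "lat": 38.8977,
    "lon": -77.0365,
    "speed": 0,
}
resp = requests.post("http://localhost:5055", params=report)
print(resp.status_code)  # 200 means the server accepted the position

If the report shows up on the map for that device, the server side is working and any GPS source that can speak one of Traccar’s protocols can be pointed at it.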

With nearly omnipresent government surveillance of citizens, citizens should return the favor by surveillance of government officers.

Super Glue plus a burner phone enables GPS tracking of government vehicles.

For those with greater physical access, introducing a GPS device into vehicle wiring is also an option.

You may want to restrict access to Traccar as public access to GPS location data will alert targets to GPS tracking of their vehicles.

It’s a judgment call whether the loss of future tracking data is offset by the value of the tracking data already accumulated for a specific purpose.

What if you tracked all county police car locations for a year and patterns emerge from that data? What forums are best for summarized (read aggregated) presentation of the data? When/where is it best to release the detailed data? How do you sign released data to verify future analysis is using the same data?
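
On the signing question, the simplest workable answer is to publish a cryptographic digest of the released files along with the release; anyone re-running an analysis can hash their copy and confirm it matches. A minimal sketch (the file name is hypothetical):

import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

print(sha256_of("county_police_gps_2017.csv"))

A detached signature over the same file (for example, gpg --detach-sign) adds attribution on top of integrity checking.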

Hard questions but better hard questions than no tracking data for government agents at all. 😉

Surveillance Industry Index – Update – 223 More Sources/Targets

Filed under: Cybersecurity,Security — Patrick Durusau @ 9:31 am

Surveillance Industry Index

When I last mentioned the Surveillance Industry Index in Vendors, Targets, Both? (August 2, 2016), it listed 2350 vendors.

As of today (28 July 2017), that listing has grown to 2573 vendors, an increase of 223.

Enjoy!

July 27, 2017

Tired of Chasing Ephemera? Open Greek and Latin Design Sprint (bids in August, 2017)

Filed under: Classics,Greek,Humanities,Interface Research/Design,Language — Patrick Durusau @ 3:06 pm

Tired of reading/chasing the ephemera explosion in American politics?

I’ve got an opportunity for you to contribute to a project with texts preserved by hand for thousands of years!

Design Sprint for Perseus 5.0/Open Greek and Latin

From the webpage:

We announced in June that Center for Hellenic Studies had signed a contract with Intrepid.io to conduct a design sprint that would support Perseus 5.0 and the Open Greek and Latin collection that it will include. Our goal was to provide a sample model for a new interface that would support searching and reading of Greek, Latin, and other historical languages. The report from that sprint was handed over to CHS yesterday and we, in turn, have made these materials available, including both the summary presentation and associated materials. The goal is to solicit comment and to provide potential applicants to the planned RFP with access to this work as soon as possible.

The sprint took just over two weeks and was an intensive effort. An evolving Google Doc with commentary on the Intrepid Wrap-up slides for the Center for Hellenic studies should now be visible. Readers of the report will see that questions remain to be answered. How will we represent Perseus, Open Greek and Latin, Open Philology, and other efforts? One thing that we have added and that will not change will be the name of the system that this planned implementation phase will begin: whether it is Perseus, Open Philology or some other name, it will be powered by the Scaife Digital Library Viewer, a name that commemorates Ross Scaife, pioneer of Digital Classics and a friend whom many of us will always miss.

The Intrepid report also includes elements that we will wish to develop further — students of Greco-Roman culture may not find “relevance” a helpful way to sort search reports. The Intrepid Sprint greatly advanced our own thinking and provided us with a new starting point. Anyone may build upon the work presented here — but they can also suggest alternate approaches.

The core deliverables form an impressive list:

At the moment we would summarize core deliverables as:

  1. A new reading environment that captures the basic functionality of the Perseus 4.0 reading environment but that is more customizable and that can be localized efficiently into multiple modern languages, with Arabic, Persian, German and English as the initial target languages. The overall Open Greek and Latin team is, of course, responsible for providing the non-English content. The Scaife DL Viewer should make it possible for us to localize into multiple languages as efficiently as possible.
  2. The reading environment should be designed to support any CTS-compliant collection and should be easily configured with a look and feel for different collections.
  3. The reading environment should contain a lightweight treebank viewer — we don’t need to support editing of treebanks in the reading environment. The functionality that the Alpheios Project provided for the first book of the Odyssey would be more than adequate. Treebanks are available under the label “diagram” when you double-click on a Greek word.
  4. The reading environment should support dynamic word/phrase level alignments between source text and translation(s). Here again, the functionality that the Alpheios Project provided for the first book of the Odyssey would be adequate. More recent work implementing this functionality is visible in Tariq Yousef’s work at http://divan-hafez.com/ and http://ugarit.ialigner.com/.
  5. The system must be able to search for both specific inflected forms and for all forms of a particular word (as in Perseus 4.0) in CTS-compliant epiDoc TEI XML. The search will build upon the linguistically analyzed texts available in https://github.com/gcelano/CTSAncientGreekXML. This will enable searching by dictionary entry, by part of speech, and by inflected form. For Greek, the base collection is visible at the First Thousand Years of Greek website (which now has begun to accumulate a substantial amount of later Greek). CTS-compliant epiDoc Latin texts can be found at https://github.com/OpenGreekAndLatin/csel-dev/tree/master/data and https://github.com/PerseusDL/canonical-latinLit/tree/master/data.
  6. The system should ideally be able to search Greek and Latin that is available only as uncorrected OCR-generated text in hOCR format. Here the results may follow the image-front strategy familiar to academics from sources such as Jstor. If it is not feasible to integrate this search within the three months of core work, then we need a plan for subsequent integration that Leipzig and OGL members can implement later.
  7. The new system must be scalable and updating from Lucene to Elasticsearch is desirable. While these collections may not be large by modern standards, they are substantial. Open Greek and Latin currently has c. 67 million words of Greek and Latin at various stages of post-processing and c. 90 million words of additional translations from Greek and Latin into English, French, German and Italian, while the Lace Greek OCR Project has OCR-generated text for 1100 volumes.
  8. The system must integrate translations and translation alignments into the searching system, so that users can search either in the original or in modern language translations where we provide this data. This goes back to work by David Bamman in the NEH-funded Dynamic Lexicon Project (when he was a researcher at Perseus at Tufts). For more recent examples of this, see http://divan-hafez.com/ and Ugarit. Note that one reason to adopt CTS URNs is to simplify the task of displaying translations of source texts — the system is only responsible for displaying translations insofar as they are available via the CTS API.
  9. The system must provide initial support for a user profile. One benefit of the profile is that users will be able to define their own reading lists — and the Scaife DL Viewer will then be able to provide personalized reading support, e.g., word X already showed up in your reading at places A, B, and C, while word Y, which is new to you, will appear 12 times in the rest of your planned readings (i.e., you should think about learning that word). By adopting the CTS data model, we can make very precise reading lists, defining precise selections from particular editions of particular works. We also want to be able to support an initial set of user contributions that are (1) easy to implement technically and (2) easy for users to understand and perform. Thus we would support fixing residual data entry errors, creating alignments between source texts and translations, improving automated part of speech tagging and lemmatization but users would go to external resources to perform more complex tasks such as syntactic markup (treebanking).
  10. We would welcome bids that bring to bear expertise in the EPUB format and that could help develop a model for representing CTS-compliant Greek and Latin sources in EPUB as a mechanism to make these materials available on smartphones. We can already convert our TEI XML into EPUB. The goal here is to exploit the easiest ways to optimize the experience. We can, for example, convert one or more of our Greek and Latin lexica into the EPUB Dictionary format and use our morphological analyses to generate links from particular forms in a text to the right dictionary entry or entries. Can we represent syntactically analyzed sentences with SVG? Can we include dynamic translation alignments?
  11. Bids should consider including a design component. We were very pleased with the Design Sprint that took place in July 2017 and would like to include a follow-up Design Sprint in early 2018 that will consider (1) next steps for Greek and Latin and (2) generalizing our work to other historical languages. This Design Sprint might well go to a separate contractor (thus providing us also with a separate point of view on the work done so far).
  12. Work must build upon the Canonical Text Services Protocol. Bids should be prepared to build upon https://github.com/Capitains, but should also be able to build upon other CTS servers (e.g., https://github.com/ThomasK81/LightWeightCTSServer and cts.informatik.uni-leipzig.de).
  13. All source code must be available on Github under an appropriate open license so that third parties can freely reuse and build upon it.
  14. Source code must be designed and documented to facilitate actual (not just legally possible) reuse.
  15. The contractor will have the flexibility to get the job done but will be expected to work as closely as possible with, and to draw wherever possible upon the on-going work done by, the collaborators who are contributing to Open Greek and Latin. The contractor must have the right to decide how much collaboration makes sense.
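
Several of these deliverables hinge on the Canonical Text Services (CTS) protocol and its URNs. For readers who haven’t worked with CTS, here is a minimal passage request; the endpoint URL is a placeholder, and the URN addresses Iliad 1.1-1.10 in a Perseus Greek edition:

import requests

CTS_ENDPOINT = "http://example.org/api/cts"   # placeholder; substitute a real Capitains/Nautilus endpoint
params = {
    "request": "GetPassage",
    "urn": "urn:cts:greekLit:tlg0012.tlg001.perseus-grc2:1.1-1.10",
}
resp = requests.get(CTS_ENDPOINT, params=params)
print(resp.text[:500])   # a TEI XML fragment wrapped in a CTS reply

The point of the URN scheme is exactly what item 8 notes: a stable, edition-level citation that any compliant server can resolve, whether it serves source texts or aligned translations.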

You can use your data science skills to sell soap, cars, ED treatments, or even apocalyptically narcissistic politicians, or, you can advance Perseus 5.0.

Your call.

Dimensions of Subject Identification

Filed under: Subject Identifiers,Subject Identity,Topic Maps — Patrick Durusau @ 2:30 pm

This isn’t a new idea, but it occurred to me that introducing readers to “dimensions of subject identification” might be an easier on-ramp for topic maps. It enables us to dodge the sticky issues of “identity” in favor of asking: what do you want to talk about? and how many dimensions do you want/need to identify it?

To start with a classic example, if we only have one dimension and the string “Paris,” ambiguity is destined to follow.

If we add a country dimension, now having two dimensions, “Paris” + “France” can be distinguished from all other uses of “Paris” with the string + country dimension.

The string + country dimension fares less well for “Paris” + country = “United States”:

For the United States you need “Paris” + country + state dimensions, at a minimum, but that leaves you with two instances of Paris in Ohio.

One advantage of speaking of “dimensions of subject identification” is that we can order systems of subject identification by the number of dimensions they offer. Not to mention examining the consequences of the choices of dimensions.

One-dimensional systems, that is, a solitary string such as "Paris," leave users, as we said above, with no means to distinguish one use from another. They are useful and common in CSV files or database tables, but risk ambiguity and can be difficult to communicate accurately to others.

Two-dimensional systems, that is, city = "Paris," enable users to distinguish usages other than the city, but as you can see from the Paris example in the U.S., that may not be sufficient.

Moreover, city itself may be a subject identified by multiple dimensions, as different governmental bodies define “city” differently.

Just as some information systems only use one-dimensional strings for headers, other information systems may use one-dimensional strings for the subject city in city = "Paris." But multiple dimensions of identification can be captured for any of these subjects, separately from those systems.
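
A minimal sketch of the idea, with dimension names chosen only for illustration: each subject is a set of dimension/value pairs, and two records remain indistinguishable only while every dimension they share agrees.

# Illustrative only: subjects identified by dimension/value pairs.
paris_fr = {"string": "Paris", "country": "France"}
paris_tx = {"string": "Paris", "country": "United States", "state": "Texas"}
paris_oh = {"string": "Paris", "country": "United States", "state": "Ohio"}

def indistinguishable(a, b):
    """True when no shared dimension disagrees, i.e. the records cannot be told apart."""
    shared = a.keys() & b.keys()
    return all(a[k] == b[k] for k in shared)

print(indistinguishable(paris_fr, paris_tx))  # False: the country dimension separates them
print(indistinguishable(paris_tx, paris_oh))  # False: the state dimension separates them
# Two records for Paris, Ohio would need a further dimension (e.g., county) to be separated.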

Perhaps the most useful aspect of dimensions of identification is enabling users to ask their information architects what dimensions and values serve to identify subjects in information systems.

Such as the headers in database tables or spreadsheets. 😉

July 26, 2017

Deep Learning for NLP Best Practices

Filed under: Deep Learning,Natural Language Processing,Neural Networks — Patrick Durusau @ 3:18 pm

Deep Learning for NLP Best Practices by Sebastian Ruder.

From the introduction:

This post is a collection of best practices for using neural networks in Natural Language Processing. It will be updated periodically as new insights become available and in order to keep track of our evolving understanding of Deep Learning for NLP.

There has been a running joke in the NLP community that an LSTM with attention will yield state-of-the-art performance on any task. While this has been true over the course of the last two years, the NLP community is slowly moving away from this now standard baseline and towards more interesting models.

However, we as a community do not want to spend the next two years independently (re-)discovering the next LSTM with attention. We do not want to reinvent tricks or methods that have already been shown to work. While many existing Deep Learning libraries already encode best practices for working with neural networks in general, such as initialization schemes, many other details, particularly task or domain-specific considerations, are left to the practitioner.

This post is not meant to keep track of the state-of-the-art, but rather to collect best practices that are relevant for a wide range of tasks. In other words, rather than describing one particular architecture, this post aims to collect the features that underly successful architectures. While many of these features will be most useful for pushing the state-of-the-art, I hope that wider knowledge of them will lead to stronger evaluations, more meaningful comparison to baselines, and inspiration by shaping our intuition of what works.

I assume you are familiar with neural networks as applied to NLP (if not, I recommend Yoav Goldberg’s excellent primer [43]) and are interested in NLP in general or in a particular task. The main goal of this article is to get you up to speed with the relevant best practices so you can make meaningful contributions as soon as possible.

I will first give an overview of best practices that are relevant for most tasks. I will then outline practices that are relevant for the most common tasks, in particular classification, sequence labelling, natural language generation, and neural machine translation.

Certainly a resource to bookmark while you read A Primer on Neural Network Models for Natural Language Processing by Yoav Goldberg, at 76 pages and to consult frequently as you move beyond the primer stage.
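
For readers who haven’t met the “LSTM with attention” baseline the introduction jokes about, here is a minimal PyTorch sketch of the idea for text classification; names and dimensions are illustrative and not taken from Ruder’s post.

import torch
import torch.nn as nn

class LSTMWithAttention(nn.Module):
    def __init__(self, vocab_size, embed_dim, hidden_dim, num_classes):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True, bidirectional=True)
        self.attn = nn.Linear(2 * hidden_dim, 1)   # scores each time step
        self.out = nn.Linear(2 * hidden_dim, num_classes)

    def forward(self, tokens):                      # tokens: (batch, seq_len)
        h, _ = self.lstm(self.embed(tokens))        # (batch, seq_len, 2*hidden)
        weights = torch.softmax(self.attn(h), dim=1)
        context = (weights * h).sum(dim=1)          # attention-weighted summary
        return self.out(context)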

Enjoy and pass it on!

Fancy Airline Lounges W/O Fancy Airline Ticket

Filed under: Cybersecurity,QR Codes,Security — Patrick Durusau @ 2:26 pm

Andy Greenberg posted a hot travel tip last August (2016) in Fake Boarding Pass App Gets Hacker Into Fancy Airline Lounges:

As the head of Poland’s Computer Emergency Response Team, Przemek Jaroszewski flies 50 to 80 times a year, and so has become something of a connoisseur of airlines’ premium status lounges. (He’s a particular fan of the Turkish Airlines lounge in Istanbul, complete with a cinema, putting green, Turkish bakery and free massages.) So when his gold status was mistakenly rejected last year by an automated boarding pass reader at a lounge in his home airport in Warsaw, he applied his hacker skills to make sure he’d never be locked out of an airline lounge again.

The result, which Jaroszewski plans to present Sunday at the Defcon security conference in Las Vegas, is a simple program that he’s now used dozens of times to enter airline lounges all over Europe. It’s an Android app that generates fake QR codes to spoof a boarding pass on his phone’s screen for any name, flight number, destination and class. And based on his experiments with the spoofed QR codes, almost none of the airline lounges he’s tested actually check those details against the airline’s ticketing database—only that the flight number included in the QR code exists. And that security flaw, he says, allows him or anyone else capable of generating a simple QR code to both access exclusive airport lounges and buy things at duty free shops that require proof of international travel, all without even buying a ticket.

See Greenberg’s post for details on prior work with boarding passes.

Caveat: This has not been tested outside of Europe.

Airlines could challenge your right to use a lounge based on your appearance, but an incident or two with legitimate customers being booted should cure them of that pettiness.

Greenberg posted this in August of 2016 and I haven’t seen any updates.

You?

Happy travels!

Weaponry on the Dark Web – Read The Fine Print

Filed under: Dark Web,Journalism,News,Reporting — Patrick Durusau @ 2:01 pm

The NextGov headline screaming: 3D-Printed Gun Designs Are Selling For $12 On The Dark Web is followed by this pic:

But the fine print in the caption reads:

The additive-manufactured RAMBO system includes an NSRDEC-designed standalone kit with printed adjustable buttstock, mounts, grips and other modifications—modifications made possible by the quick turnaround time afforded by 3D printing. // US Army

So….

  1. This is NOT a printable gun from the Dark Web
  2. Printable parts ARE buttstock, mounts, grips, not the gun itself

Just so you know, the RAND paper doesn’t include this image. 😉

In fact, Behind the curtain: The illicit trade of firearms, explosives and ammunition on the dark web by Giacomo Persi Paoli, Judith Aldridge, Nathan Ryan, and Richard Warnes concedes that trading of weapons on the Dark Web is quite small beside non-Dark-Web trafficking.

Missing in the discussion of 3-D weapons plans is a comparison of the danger they pose relative to other technologies.

The Cal Fire map leaves no doubt that $12 or less in gasoline and matches can produce far more damage than any 3-D printed weapon. Without the need for a 3-D printer.

Yes?

All weapons pose some danger. Decision makers need to know the relative dangers of weapons vis-a-vis each other.

A RAND report on the comparative danger of weapons would be far more useful than reports on weapons and their sources in isolation.

#DAPL – Jessica Reznicek and Ruby Montoya Press Release

Filed under: #DAPL,Protests — Patrick Durusau @ 12:37 pm

Unicorn Riot reposted the press release by Jessica Reznicek and Ruby Montoya, detailing damage to the DAPL pipeline (July 24, 2017) at: Two Women Claim Responsibility for Sabotage and Arson Attacks to Stop DAPL.

Media reports quote some of the statement but omit information such as:

This is our statement, we will be representing ourselves, as standby counsel and immediate contact is federal attorney, Bill Quigley, quigley77@gmail.com, 504-710-3074. Media contact, Amber Mae, 618-334-6035, amber@earthdefensecoalition.com; Melissa Fuller 515-490-5705.

Unlike many (most? all?) of your elected representatives, Jessica Reznicek and Ruby Montoya are not seeking personal gain from their actions with regard to DAPL.

Participate and take such part as you deem appropriate.

July 25, 2017

DAPL: Two Heroes, One Establishment Toady (Sierra Club)

Filed under: #DAPL,Protests — Patrick Durusau @ 4:45 pm

Dakota Access protesters claim responsibility for pipeline sabotage by William Petroski.

From the post:

Two Iowa activists with a history of arrests for political dissent are claiming responsibility for repeatedly damaging the Dakota Access Pipeline while the four-state, $3.8 billion project was under construction in Iowa.

Jessica Reznicek, 35, and Ruby Montoya, 27, both of Des Moines, held a news conference Monday outside the Iowa Utilities Board’s offices where they provided a detailed description of their deliberate efforts to stop the pipeline’s completion. They were taken into custody by state troopers immediately afterward when they abruptly began using a crowbar and a hammer to damage a sign on state property.

Both women are involved in Iowa’s Catholic Worker social justice movement and they described their pipeline sabotage as a “direct action” campaign that began on Election Day 2016. They said their first incident of destruction involved burning at least five pieces of heavy equipment on the pipeline route in northwest Iowa’s Buena Vista County.

The two women said they researched how to pierce the steel pipe used for the pipeline and in March they began using oxyacetylene cutting torches to damage exposed, empty pipeline valves. They said they started deliberately vandalizing the pipeline in southeast Iowa’s Mahaska County, delaying completion for weeks.


(image omitted: Jessica Reznicek. Photo: Special to the Register)
Reznicek and Montoya said they subsequently used torches to cause damage up and down the pipeline throughout Iowa and into part of South Dakota, moving from valve to valve until running out of supplies. They said their actions were rarely reported in the media. They also contended the federal government and Dallas-based Energy Transfer Partners, the pipeline developer, withheld vital information from the public.

Two people out of a U.S. population of 326,474,013 delayed the Dakota Access Pipeline for weeks.

BTW, a representative of the “we collect money and talk about the environment” crowd, the Sierra Club, quite naturally denounces effective action against the pipeline.

After all, if anyone actually stopped damage to the environment in any one case, donors would expect effective action in other cases. Leaving the Sierra Club high and dry.

As a reading test, you pick the heroes in Petroski’s report.

We’ll Pay You to #HackTor

Filed under: Cybersecurity,Security,Tor — Patrick Durusau @ 4:02 pm

We’ll Pay You to #HackTor

From the post:

THERE ARE BUGS AMONG US

Millions of people around the world depend on Tor to browse the internet privately and securely every day, so our security is critical. Bugs in our code pose one of the biggest threats to our users’ safety; they allow skilled attackers to bypass Tor’s protections and compromise the safety of Tor users.

We’re constantly looking for flaws in our software and have been fortunate to have a large community of hackers who help us identify and fix serious issues early on, but we think we can do even more to protect our users. That’s why if you can #HackTor and find bugs in our software, we want to reward you.

JOIN OUR FIRST PUBLIC BUG BOUNTY

With support from the Open Technology Fund, we’re launching our first public bug bounty with HackerOne. We’re specifically looking for your help to find bugs in Tor (the network daemon) and Tor Browser. A few of the vulnerabilities we’re looking for include local privilege escalation, unauthorized access of user data, attacks that cause the leakage of crypto material of relays or clients, and remote code execution. In January 2016, we launched a private bug bounty; hackers helped us catch 3 crash/DoS bugs (2 OOB-read bugs + 1 infinite loop bug) and 4 edge-case memory corruption bugs.

Tor users around the globe, including human rights defenders, activists, lawyers, and researchers, rely on the safety and security of our software to be anonymous online. Help us protect them and keep them safe from surveillance, tracking, and attacks. We’ll award up to $4,000 per bug report, depending on the impact and severity of what you find.

HERE’S HOW TO GET STARTED

Sign up for an account at HackerOne. Visit https://hackerone.com/torproject for the complete guidelines, details, terms, and conditions of our bug bounty. Then, start finding and reporting bugs to help keep Tor and Tor Browser safe.

Happy bug hunting!

The pay isn’t great but it’s for a worthy cause.

Any improvement in individual security is a net win for individuals everywhere.

Apologies For Silence – Ergonomics Problem

Filed under: Marketing — Patrick Durusau @ 3:50 pm

Apologies for the sudden silence!

I have had a bad situation, ergonomically speaking, which manifested itself in my left hand.

Have isolated the problem and repairs/exercises are underway. Won’t be back to full speed for several months but will be trying to do better.

Hope you are having a great summer!

July 19, 2017

If You Believe in Parliaments

Filed under: Government,Law,Law - Sources,Legal Informatics,Politics — Patrick Durusau @ 3:41 pm

If you believe in parliaments, other than as examples of how governments don’t “get it,” then the Law Library of Congress, Global Legal Research Center has a treat for you!

Fifty (50) countries and seventy (70) websites surveyed in: Features of Parliamentary Websites in Selected Jurisdictions.

From the summary:

In recent years, parliaments around the world have enhanced their websites in order to improve access to legislative information and other parliamentary resources. Innovative features allow constituents and researchers to locate and utilize detailed information on laws and lawmaking in various ways. These include tracking tools and alerts, apps, the use of open data technology, and different search functions. In order to demonstrate some of the developments in this area, staff from the Global Legal Research Directorate of the Law Library of Congress surveyed the official parliamentary websites of fifty countries from all regions of the world, plus the website of the European Parliament. In some cases, information on more than one website is provided where separate sites have been established for different chambers of the national parliament, bringing the total number of individual websites surveyed to seventy.

While the information on the parliamentary websites is primarily in the national language of the particular country, around forty of the individual websites surveyed were found to provide at least limited information in one or more other languages. The European Parliament website can be translated into any of the twenty-four official languages of the members of the European Union.

All of the parliamentary websites included in the survey have at least basic browse tools that allow users to view legislation in a list format, and that may allow for viewing in, for example, date or title order. All of the substantive websites also enable searching, often providing a general search box for the whole site at the top of each page as well as more advanced search options for different types of documents. Some sites provide various facets that can be used to further narrow searches.

Around thirty-nine of the individual websites surveyed provide users with some form of tracking or alert function to receive updates on certain documents (including proposed legislation), parliamentary news, committee activities, or other aspects of the website. This includes the ability to subscribe to different RSS feeds and/or email alerts.

The ability to watch live or recorded proceedings of different parliaments, including debates within the relevant chamber as well as committee hearings, is a common feature of the parliamentary websites surveyed. Fifty-eight of the websites surveyed featured some form of video, including links to dedicated YouTube channels, specific pages where users can browse and search for embedded videos, and separate video services or portals that are linked to or viewable from the main site. Some countries also make videos available on dedicated mobile-friendly sites or apps, including Denmark, Germany, Ireland, the Netherlands, and New Zealand.

In total, apps containing parliamentary information are provided in just fourteen of the countries surveyed. In comparison, the parliamentary websites of thirty countries are available in mobile-friendly formats, enabling easy access to information and different functionalities using smartphones and tablets.

The table also provides information on some of the additional special features available on the surveyed websites. Examples include dedicated sites or pages that provide educational information about the parliament for children (Argentina, El Salvador, Germany, Israel, Netherlands, Spain, Taiwan, Turkey); calendar functions, including those that allow users to save information to their personal calendars or otherwise view information about different types of proceedings or events (available on at least twenty websites); and open data portals or other features that allow information to be downloaded in bulk for reuse or analysis, including through the use of APIs (application programming interfaces) (at least six countries).

With differing legal vocabularies and local personification of multi-nationals, this is a starting point for transparency based upon topic maps.

I first saw this in a tweet by the Global Investigative Journalism Network (GIJN).

July 15, 2017

Twitter – Government Censor’s Friend

Filed under: Censorship,Free Speech,Twitter — Patrick Durusau @ 3:59 pm

Governments (democratic, non-democratic, kingships, etc.) that keep secrets from the public share a common enemy in Wikileaks.

Wikileaks self-describes in part as:

WikiLeaks is a multi-national media organization and associated library. It was founded by its publisher Julian Assange in 2006.

WikiLeaks specializes in the analysis and publication of large datasets of censored or otherwise restricted official materials involving war, spying and corruption. It has so far published more than 10 million documents and associated analyses.

“WikiLeaks is a giant library of the world’s most persecuted documents. We give asylum to these documents, we analyze them, we promote them and we obtain more.” – Julian Assange, Der Spiegel Interview.

WikiLeaks has contractual relationships and secure communications paths to more than 100 major media organizations from around the world. This gives WikiLeaks sources negotiating power, impact and technical protections that would otherwise be difficult or impossible to achieve.

Although no organization can hope to have a perfect record forever, thus far WikiLeaks has a perfect record in document authentication and resistance to all censorship attempts.

Those same governments share a common ally in Twitter, which has engaged in systematic actions to diminish the presence/influence of Julian Assange on its platform.

Caitlin Johnstone documents Twitter’s intentional campaign against Assange in Twitter Is Using Account Verification To Stifle Leaks And Promote War Propaganda.

Catch Johnstone’s post for the details but then:

  1. Follow @JulianAssange on Twitter (watch for minor variations that are not this account).
  2. Tweet to your followers, at least once a week, urging them to follow @JulianAssange
  3. Investigate and support non-censoring alternatives to Twitter.

You can verify Twitter’s dilution of Julian Assange for yourself.

Type “JulianAssange_” in the Twitter search box (my results):

Twitter was a remarkably good idea, but has long since poisoned itself with censorship and pettiness.

Your suggested alternative?

July 14, 2017

Next Office of Personnel Management (OPM) Leak, When, Not If

Filed under: Cybersecurity,Government,Security — Patrick Durusau @ 4:52 pm

2 Years After Massive Breach, OPM Isn’t Sufficiently Vetting IT Systems by Joseph Marks.

From the post:

More than two years after suffering a massive data breach, the Office of Personnel Management still isn’t sufficiently vetting many of its information systems, an auditor found.

In some cases, OPM is past due to re-authorize IT systems, the inspector general’s audit said. In other cases, OPM did reauthorize those systems but did it in a haphazard and shoddy way during a 2016 “authorization sprint,” the IG said.

“The lack of a valid authorization does not necessarily mean that a system is insecure,” the auditors said. “However, it does mean that a system is at a significantly higher risk of containing unidentified security vulnerabilities.”

The full audit provides more details but suffice it to say OPM security is as farcical as ever.

Do you think use of https://www.opm.gov/ in hacking examples and scripts would call greater attention to flaws at the OPM?

Detecting Leaky AWS Buckets

Filed under: Cybersecurity,Security — Patrick Durusau @ 3:22 pm

Experts Warn Too Often AWS S3 Are Misconfigured, Leak Data by Tom Spring.

From the post:

A rash of misconfigured Amazon Web Services storage servers leaking data to the internet have plagued companies recently. Earlier this week, data belonging to anywhere between six million and 14 million Verizon customers were left on an unprotected server belonging to a partner of the telecommunications firm. Last week, wrestling giant World Wrestling Entertainment accidentally exposed personal data of three million fans. In both cases, it was reported that data was stored on AWS S3 storage buckets.

Reasons why this keeps on happening vary. But, Detectify Labs believes many leaky servers trace back to common errors when it comes to setting up access controls for AWS Simple Storage Service (S3) buckets.

In a report released Thursday, Detectify’s Security Advisor Frans Rosén said network administrators too often gloss over rules for configuring AWS’ Access Control Lists (ACL) and the results are disastrous.

Jump to the report released Thursday for the juicy details.

Any thoughts on the going rate for discovery of leaky AWS buckets?

Could be something, could be nothing.

In any event, you should be checking your own AWS buckets.
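
A quick way to do that check is to look for ACL grants to the global “AllUsers” and “AuthenticatedUsers” groups, the two grantees behind most of the publicly readable buckets in these reports. A sketch using boto3, assuming your AWS credentials are already configured:

import boto3

PUBLIC_GROUPS = (
    "http://acs.amazonaws.com/groups/global/AllUsers",
    "http://acs.amazonaws.com/groups/global/AuthenticatedUsers",
)

s3 = boto3.client("s3")
for bucket in s3.list_buckets()["Buckets"]:
    acl = s3.get_bucket_acl(Bucket=bucket["Name"])
    for grant in acl["Grants"]:
        if grant["Grantee"].get("URI") in PUBLIC_GROUPS:
            # Any hit here means the bucket is readable (or worse) by the world.
            print(bucket["Name"], grant["Permission"])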

Successful Phishing Subject Lines

Filed under: Cybersecurity,Security — Patrick Durusau @ 12:47 pm

Gone Phishing: The Top 10 Attractive Lures by Roy Urrico.

From the post:

The list shows there’s still a lot of room to train employees on how to spot a phishing or spoofed email. Here they are:

  • Security Alert – 21%
  • Revised Vacation and Sick Time Policy – 14%
  • UPS Label Delivery 1ZBE312TNY00015011 – 10%
  • BREAKING: United Airlines Passenger Dies from Brain Hemorrhage – VIDEO – 10%
  • A Delivery Attempt was made – 10%
  • All Employees: Update your Healthcare Info – 9%
  • Change of Password Required Immediately – 8%
  • Password Check Required Immediately – 7%
  • Unusual sign-in activity – 6%
  • Urgent Action Required – 6%

*Capitalization is as it was in the phishing test subject line

A puff piece for KnowBe4 but a good starting point. KnowBe4 has an online phishing test among others. The phishing test requires registration.

Enjoy!

Targets of Government Cybercriminal Units

Filed under: Cybersecurity,Government,Security — Patrick Durusau @ 12:27 pm

The Unfortunate Many: How Nation States Select Targets

From the post:

Key Takeaways

  • It’s safe to assume that all governments are developing and deploying cyber capabilities at some level. It’s also safe to assume most governments are far from open about the extent of their cyber activity.
  • If you take the time to understand why nation states get involved with cyber activity in the first place, you’ll find their attacks are much more predictable than they seem.
  • Each nation state has its own objectives and motivations for cyber activity. Even amongst big players like China, Russia, and the U.S. there’s a lot of variation.
  • Most nation states develop national five-year plans that inform all their cyber activities. Understanding these plans enables an organization to prioritize preparations for the most likely threats.

There’s a name for those who rely on governments, national or otherwise, to protect their cybersecurity: victims.

Recorded Future gives a quick overview of factors that may drive the objectives of government cybercriminal units.

I use “cybercriminal units” to avoid the false dichotomy between alleged “legitimate” government hacking and that of other governments and individuals.

We’re all adults here and realize government is a particular distribution of reward and stripes, nothing more. It has no vision, no goal beyond self-preservation and certainly, beyond your locally owned officials, no interest in you or yours.

That is to say, governments undertake hacking to further a “particular distribution of reward and stripes,” and their choices are no more (or less) legitimate than anyone else’s.

Government choices are certainly no more legitimate than your choices. Although governments claim a monopoly on criminal prosecutions, which accounts for why criminals acting on their behalf are never prosecuted. That monopoly also explains why governments, assuming they have possession of your person, may prosecute you for locally defined “criminal” acts.

Read the Recorded Future post to judge your odds of being a victim of a national government. Then consider which governments should be your victims.

Summer Pocket Change – OrientDB Code Execution

Filed under: Cybersecurity,Security,Software — Patrick Durusau @ 10:32 am

SSD Advisory – OrientDB Code Execution

From the webpage:

Want to get paid for a vulnerability similar to this one?

Contact us at: ssd@beyondsecurity.com

Vulnerability Summary

The following advisory reports a vulnerability in OrientDB which allows users of the product to cause it to execute code.

OrientDB is a Distributed Graph Database engine with the flexibility of a Document Database all in one product. The first and best scalable, high-performance, operational NoSQL database.

Credit

An independent security researcher, Francis Alexander, has reported this vulnerability to Beyond Security’s SecuriTeam Secure Disclosure program.

Vendor response

The vendor has released patches to address this vulnerability.

For more information: https://github.com/orientechnologies/orientdb/wiki/OrientDB-2.2-Release-Notes#security.

Some vulnerabilities require deep code analysis, others, well, just asking the right questions.

If you are looking for summer pocket change, check out default users, permissions, etc. on popular software.
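
For OrientDB specifically, the place to start on your own install is the set of factory-default accounts every new database ships with. A sketch against the REST interface, assuming the default port (2480) and a database named "mydb" (both placeholders; check the response handling against your version’s documentation):

import requests

DEFAULT_ACCOUNTS = [("admin", "admin"), ("reader", "reader"), ("writer", "writer")]

for user, password in DEFAULT_ACCOUNTS:
    # /connect/<database> attempts a login with HTTP basic auth.
    resp = requests.get("http://localhost:2480/connect/mydb", auth=(user, password))
    if resp.ok:
        print(f"default credential still active: {user}/{password}")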

July 13, 2017

Locate Your Representative/Senator In Hell

Filed under: Government,Humanities,Literature,Maps,Politics,Visualization — Patrick Durusau @ 3:38 pm

Mapping Dante’s Inferno, One Circle of Hell at a Time by Anika Burgess.

From the post:

I found myself, in truth, on the brink of the valley of the sad abyss that gathers the thunder of an infinite howling. It was so dark, and deep, and clouded, that I could see nothing by staring into its depths.”

This is the vision that greets the author and narrator upon entry the first circle of Hell—Limbo, home to honorable pagans—in Dante Alighieri’s Inferno, the first part of his 14th-century epic poem, Divine Comedy. Before Dante and his guide, the classical poet Virgil, encounter Purgatorio and Paradiso, they must first journey through a multilayered hellscape of sinners—from the lustful and gluttonous of the early circles to the heretics and traitors that dwell below. This first leg of their journey culminates, at Earth’s very core, with Satan, encased in ice up to his waist, eternally gnawing on Judas, Brutus, and Cassius (traitors to God) in his three mouths. In addition to being among the greatest Italian literary works, Divine Comedy also heralded a craze for “infernal cartography,” or mapping the Hell that Dante had created.
… (emphasis in original)

Burgess has collected seven (7) traditional maps of the Inferno. I take them to be early essays in the art of visualization. They are by no means, individually or collectively, the definitive visualizations of the Inferno.

The chief deficit of all seven, to me, is the narrowness of the circles/ledges. As I read the Inferno, Dante and Virgil are not pressed for space. Expanding and populating the circles more realistically is one starting point.

The Inferno has no shortage of characters in each circle; Dante predicts the fate of Pope Boniface VIII in order to place him in the eighth circle of Hell (among the simoniacs, a subclass of fraud). (Use the online Britannica with caution. Its entry for Boniface VIII doesn’t even mention the Inferno, as of July 13, 2017.)

I would like to think being condemned to Hell by no less than Dante would rate at least a mention in my biography!

Sadly, Dante is no longer around to add to the populace of the Inferno but new visualizations could take the opportunity to update the resident list for Hell!

It’s an exercise in visualization, mapping, 14th century literature, and an excuse to learn the names of your representative and senators.

Enjoy!

July 12, 2017

DigitalGlobe Platform

Filed under: Geospatial Data,GIS,Maps — Patrick Durusau @ 8:04 pm

DigitalGlobe Platform

The Maps API offers:

Recent Imagery

A curated satellite imagery layer of the entire globe. More than 80% of the Earth’s landmass is covered with high-resolution (30 cm-60 cm) imagery, supplemented with cloud-free LandSat 8 as a backdrop.

Street Map

An accurate, seamless street reference map. Based on contributions from the OpenStreetMap community, this layer combines global coverage with essential “locals only” perspectives.

Terrain Map

A seamless, visually appealing terrain perspective of the planet. Shaded terrain with contours guide you through the landscape, and OpenStreetMap reference vectors provide complete locational context.

Prices start at $5/month and go up. (5,000 map views for $5.)

BTW, 30 cm is 11.811 inches, just a little less than a foot.

For planning constructive or disruptive activities, that should be sufficient precision.

I haven’t tried the service personally but the resolution of the imagery compels me to mention it.

Enjoy!

July 11, 2017

Graphing the distribution of English letters towards…

Filed under: Language,Linguistics,Python — Patrick Durusau @ 9:05 pm

Graphing the distribution of English letters towards the beginning, middle or end of words by David Taylor.

From the post:

(partial image)

Some data visualizations tell you something you never knew. Others tell you things you knew, but didn’t know you knew. This was the case for this visualization.

Many choices had to be made to visually present this essentially semi-quantitative data (how do you compare a 3- and a 13-letter word?). I semi-exhaustively explain everything on my other, geekier blog, prooffreaderplus, and provide the code I used; I’ll just repeat the most crucial here:

The counts here were generated from the Brown corpus, which is composed of texts printed in 1961.
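
Taylor provides his own code; purely as a sketch of the underlying counting, here is one way to tally letter positions over NLTK’s copy of the Brown corpus (the ten-bin scheme is my illustration, not Taylor’s):

from collections import defaultdict
import nltk
nltk.download("brown", quiet=True)
from nltk.corpus import brown

BINS = 10
counts = defaultdict(lambda: [0] * BINS)
for word in brown.words():
    letters = [c for c in word.lower() if c.isalpha()]
    n = len(letters)
    if n < 2:
        continue
    for i, c in enumerate(letters):
        counts[c][int(i / n * BINS)] += 1   # relative position mapped to a bin

# Normalize one letter's histogram to see where it tends to fall within a word.
total_e = sum(counts["e"])
print([round(v / total_e, 3) for v in counts["e"]])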

Take Taylor’s post as an inducement to read both Prooffreader Plus and Prooffreader on a regular basis.

Media Verification Assistant + Tweet Verification Assistant

Filed under: Journalism,News,Reporting,Verification — Patrick Durusau @ 7:39 pm

Media Verification Assistant

From the welcome screen:

Who

We are a joint team of engineers and investigators from CERTH-ITI and Deutsche Welle, aiming to build a comprehensive tool for image verification on the Web.

Features

The Media Verification Assistant features a multitude of image tampering detection algorithms plus metadata analysis, GPS Geolocation, EXIF Thumbnail extraction and integration with Google reverse image search.

Alpha

It is constantly being developed, expanded and upgraded -our ambition is to include most state-of-the-art verification technologies currently available on the Web, plus unique implementations of numerous experimental algorithms from the research literature. As the platform is currently in its Alpha stage, errors may occur and some algorithms may not operate as expected.

Feedback

For comments, suggestions and error reports, please contact verifymedia@iti.gr.

Sharing

The source code of the Java back-end is freely distributed at GitHub.

Even in alpha, this is a great project!

Even though images can be easily altered with Photoshop or Gimp, they continue to be admissible in court, so long as a witness testifies the image is a fair and accurate representation of the subject matter.

This project has spawned a related project: Tweet Verification Assistant, which leverages the image algorithms to verify tweets with an image or video.

Another first stop before retweeting or re-publishing an image with a story.
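
If you want a feel for the metadata side of such verification, EXIF extraction is the easy first step; a minimal sketch with Pillow (not part of the Assistant’s codebase, and the file name is hypothetical):

from PIL import Image, ExifTags

img = Image.open("suspect_photo.jpg")
exif = img.getexif()                 # empty if the metadata was stripped
for tag_id, value in exif.items():
    tag = ExifTags.TAGS.get(tag_id, tag_id)
    print(tag, value)

Absent or inconsistent EXIF data proves nothing by itself, which is why the Assistant combines it with tampering-detection algorithms and reverse image search.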

Open Islamicate Texts Initiative (OpenITI)

Filed under: Arabic,Islam,Literature,Text Corpus,Texts — Patrick Durusau @ 4:37 pm

Open Islamicate Texts Initiative (OpenITI)

From the description (Annotation) of the project:

Books are grouped into authors. All authors are grouped into 25 AH periods, based on the year of their death. These repositories are the main working loci—if any modifications are to be added or made to texts or metadata, all has to be done in files in these folders.

There are three types of text repositories:

  • RAWrabicaXXXXXX repositories include raw texts as they were collected from various open-access online repositories and libraries. These texts are in their initial (raw) format and require reformatting and further integration into OpenITI. The overall current number of text files is over 40,000; slightly over 7,000 have been integrated into OpenITI.
  • XXXXAH are the main working folders that include integrated texts (all coming from collections included into RAWrabicaXXXXXX repositories).
  • i.xxxxx repositories are instantiations of the OpenITI corpus adapted for specific forms of analysis. At the moment, these include the following instantiations (in progress):
    • i.cex with all texts split mechanically into 300 word units, converted into cex format.
    • i.mech with all texts split mechanically into 300 word units.
    • i.logic with all texts split into logical units (chapters, sections, etc.); only tagged texts are included here (~130 texts at the moment).
    • i.passim_new_mech with all texts split mechanically into 300 word units, converted for the use with new passim (JSON).
    • [not created yet] i.passim_new_mech_cluster with all text split mechanically into 900 word units (3 milestones) with 300 word overlap; converted for the use with new passim (JSON).
    • i.passim_old_mech with all texts split mechanically into 300 word units, converted for the use with old passim (XML, gzipped).
    • i.stylo includes all texts from OpenITI (duplicates excluded) that are renamed and slightly reformatted (Arabic orthography is simplified) for the use with stylo R-package.

A project/site to join to hone your Arabic NLP and reading skills.

Enjoy!

The Classical Language Toolkit

Filed under: Classics,History,Humanities,Natural Language Processing — Patrick Durusau @ 4:26 pm

The Classical Language Toolkit

From the webpage:

The Classical Language Toolkit (CLTK) offers natural language processing (NLP) support for the languages of Ancient, Classical, and Medieval Eurasia. Greek and Latin functionality are currently most complete.

Goals

  • compile analysis-friendly corpora;
  • collect and generate linguistic data;
  • act as a free and open platform for generating scientific research.

You are sure to find one or more languages of interest.

Collecting, analyzing and mapping Tweets can be profitable and entertaining, but tomorrow or perhaps by next week, almost no one will read them again.

The texts in this project survived by hand preservation for thousands of years. People are still reading them.

How about you?

Truth In Terrorism Labeling (TITL) – A Starter Set

Filed under: Censorship,Facebook,Free Speech,Government,Terrorism — Patrick Durusau @ 3:28 pm

Sam Biddle’s recent post, Facebook’s Tough-On-Terror Talk Overlooks White Extremists, is a timely reminder that “terrorism” and “terrorist” are labels with no agreed-upon meaning.

To illustrate, here are some common definitions with suggestions for specifying the definition in use:

Terrorist/terrorism(Biddle): ISIS, Al Qaeda, and US white extremists. But not Tibetans and Uyghurs.

Terrorist/terrorism(China): From: How China Sees ISIS Is Not How It Sees ‘Terrorism’:

… in Chinese discourse, terrorism is employed exclusively in reference to Tibetans and Uyghurs. Official government statements typically avoid identifying acts of violence with a specific ethnic group, preferring more generic descriptors like “Xinjiang terrorists,“ “East Turkestan terror forces and groups,” the “Tibetan Youth Congress,” or the “Dalai clique.” In online Chinese chat-rooms, however, epithets like “Uyghur terrorist” or “Tibetan splittest” are commonplace and sometimes combine with homophonic racial slurs like “dirty Tibetans” or “raghead Uyghurs.”

Limiting “terrorism” to Tibetans and Uyghurs excludes ISIS, Al Qaeda, and US white extremists from that term.

Terrorist/terrorism(Facebook): ISIS, Al Qaeda, but no US white extremists (following US)

Terrorist/terrorism(Russia): Putin’s Flexible Definition of Terrorism

Who, exactly, counts as a terrorist? If you’re Russian President Vladimir Putin, the definition might just depend on how close or far the “terror” is from Moscow. A court in the Nizhniy Novgorod regional center last week gave a suspended two year sentence to Stanislav Dmitriyevsky, Chair of the local Russian-Chechen Friendship Society, and editor of Rights Defense bulletin. Dmitriyevsky was found guilty of fomenting ethnic hatred, simply because in March 2004, he published an appeal by Chechen rebel leader Aslan Maskhadov — later killed by Russian security services — and Maskhadov’s envoy in Europe, Akhmet Zakayev.

Maskhadov, you see, is officially a terrorist in the eyes of the Kremlin. Hamas, however, isn’t. Putin said so at his Kremlin press-conference on Thursday, where he extended an invitation — eagerly accepted — to Hamas’s leaders to Moscow for an official visit.

In fairness to Putin, as a practical matter, who is or is not a “terrorist” for the US depends on the state of US support: groups the US supports are not terrorists; groups the US does not support likely are.

Terrorist/terrorism(US): Generally ISIS, Al Qaeda, no US white extremists, for details see: Terrorist Organizations.

By appending Biddle, China, Facebook, Russia, or US in parentheses to terrorist or terrorism, you give the reading public some chance to understand your usage of “terrorism/terrorist.”

Otherwise they are nodding along using their definitions of “terrorism/terrorist” and not yours.

Or was that vagueness intentional on your part?

July 7, 2017

New York Times, Fact Checking and DaCosta’s First OpEd

Filed under: Government,Journalism,News,Politics,Reporting,Transparency — Patrick Durusau @ 4:44 pm

Cutbacks on editors/fact-checking at the New York Times came at an unfortunate time for Marc DaCosta’s first OpEd, The President Wants to Keep Us in the Dark (New York Times, 28 June 2017).

DaCosta decries the lack of TV cameras at several recent White House press briefings. Any proof the lack of TV cameras altered the information available to reporters covering the briefings? Here’s DaCosta on that point:


But the truth is that the decision to prevent the press secretary’s comments on the day’s most pressing matters from being televised is an affront to the spirit of an open and participatory government. It’s especially chilling in a country governed by a Constitution whose very First Amendment protects the freedom of the press.

Unfortunately, the slow death of the daily press briefing is only part of a larger assault by the Trump administration on a precious public resource: information.

DaCosta’s implied answer is no: a lack of TV cameras resulted in no diminishing of information from the press conference. But then his hyperbole gland kicks in, and he cites disjointed events claimed to diminish public access to information.

For example, Trump’s non-publication of visitor records:


Immediately after Mr. Trump took office, the administration stopped publishing daily White House visitor records, reversing a practice established by President Obama detailing the six million appointments he and administration officials took at the White House during his eight years in office. Who is Mr. Trump meeting with today? What about Mr. Bannon? Good luck finding out.

Really? Mark J. Rozell summarizes that “detailing [of] the six million appointments he and administration officials took…” this way:


Obama’s action clearly violated his own pledge of transparency and an outpouring of criticism of his action somewhat made a difference. He later reversed his position when he announced that indeed the White House visitor logs would be made public after all.

Unfortunately, the president decided only to release lengthy lists of names, with no mention of the purpose of White House visits or even differentiation between tourists and people consulted on policy development.

This action enabled the Obama White House to appear to be promoting openness while providing no substantively useful information. If the visitor log listed “Michael Jordan,” there was no way to tell if the basketball great or a same-named industry lobbyist was the person at the White House that day and the layers of inquiry required to get that information were onerous. But largely because the president had appeared to have reversed himself in reaction to criticism for lack of transparency, the controversy died down, though it should not have.

Much of the current reaction to President Trump’s decision has contrasted that with the action of his predecessor, and claimed that Obama had set the proper standard by opening the books. The reality is different though, as Obama’s action set no standard at all for transparency.
…(Trump should open White House visitor logs, but don’t flatter Obama, The Hill, 18 April 2017)

That last line on White House visitor records under Obama is worth repeating:

The reality is different though, as Obama’s action set no standard at all for transparency.

Obama-style opaqueness would not answer the questions:

Who is Mr. Trump meeting with today? What about Mr. Bannon? [Questions by DaCosta.]

A fact-checker and/or editor at the New York Times would have known that answer (hint to NYT management).

Even more disappointing is the failure of DaCosta, as the co-founder of Enigma, to bring any data to bear on his claim that White House press briefings are of value.

One way to test the value of White House press briefings is to extract the “facts” announced during the briefing and compare those to media reports in the prior twenty-four hours.
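Here’s a minimal sketch of that comparison, assuming plain-text briefing statements and prior-day reports are already in hand. The sample statements, reports and the 0.5 overlap threshold below are placeholders of mine, not anyone’s production pipeline:

```python
# Sketch: flag briefing statements already reported in the prior 24 hours.
# Uses simple token (Jaccard) overlap; real work would want better matching.

def tokens(text):
    return set(text.lower().split())

def jaccard(a, b):
    a, b = tokens(a), tokens(b)
    return len(a & b) / len(a | b) if a | b else 0.0

def novelty_report(briefing_statements, prior_reports, threshold=0.5):
    """Return (statement, already_reported?) pairs."""
    results = []
    for statement in briefing_statements:
        already = any(jaccard(statement, report) >= threshold
                      for report in prior_reports)
        results.append((statement, already))
    return results

# Hypothetical inputs, for illustration only.
briefing = ["The president did not record his conversations with James Comey."]
prior = ["Trump tweeted he did not record conversations with James Comey."]

for statement, already in novelty_report(briefing, prior):
    print("already reported" if already else "new fact", "->", statement)
```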

If DaCosta thought of such a test, the reason it went unperformed isn’t hard to guess:


The Senate had just released details of a health care plan that would deprive 22 million Americans of health insurance, and President Trump announced that he did not, as he had previously hinted, surreptitiously record his conversations with James Comey, the former F.B.I. director.
… (DaCosta)

First, a presidential press briefing isn’t an organ of the US Senate, and second, Trump had already tweeted the news about not recording his conversations with James Comey. Neither of those “facts” broke at the presidential press briefing.

DaCosta is 0 for 2 for new facts at that press conference.

I offer no defense for the current administration’s lack of transparency, but fact-free and factually wrong claims against it don’t advance DaCosta’s cause:


Differences of belief and opinion are inseparable from the democratic process, but when the facts are in dispute or, worse, erased altogether, public debate risks breaking down. To have a free and democratic society we all need a common and shared context of facts to draw from. Facts or data will themselves never solve any problem. But without them, finding solutions to our common problems is impossible.

We should all expect better of President Trump, the New York Times and Marc DaCosta (@marc_dacosta).

July 6, 2017

Deanonymizing the Past

Filed under: Face Detection,Image Recognition,Neural Networks — Patrick Durusau @ 3:52 pm

What Ever Happened to All the Old Racist Whites from those Civil Rights Photos? by Johnny Silvercloud raises an interesting question but never considers it from a modern technology perspective.

Silvercloud includes this lunch counter image:

(image omitted)

I count almost twenty (20) full or partial faces in this one image. Thousands if not hundreds of thousands of other images from the civil rights era capture similar scenes.

Then it occurred to me: unlike prior generations, with their volumes of photographs populated by anonymous bystanders and perpetrators of infamous acts, we now have the capacity to deanonymize the past.

As a starting point, may I suggest Deep Face Recognition by Omkar M. Parkhi, Andrea Vedaldi, Andrew Zisserman, one of the more popular papers in this area, with 429 citations as of today (06 July 2017).

Abstract:

The goal of this paper is face recognition – from either a single photograph or from a set of faces tracked in a video. Recent progress in this area has been due to two factors: (i) end to end learning for the task using a convolutional neural network (CNN), and (ii) the availability of very large scale training datasets.

We make two contributions: first, we show how a very large scale dataset (2.6M images, over 2.6K people) can be assembled by a combination of automation and human in the loop, and discuss the trade off between data purity and time; second, we traverse through the complexities of deep network training and face recognition to present methods and procedures to achieve comparable state of the art results on the standard LFW and YTF face benchmarks.

That article was written in 2015 so consulting a 2017 summary update posted to Quora is advised for current details.
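To get a feel for how little code such matching takes today, here is a minimal sketch. It assumes the open source face_recognition Python library (a wrapper around dlib) and hypothetical image file names; it illustrates the general approach, not the specific method of the paper above:

```python
# Sketch: match faces in an archival photo against reference portraits of
# known individuals. Assumes: pip install face_recognition
import face_recognition

# Hypothetical files: one archival photo, plus labeled reference portraits.
archive_image = face_recognition.load_image_file("lunch_counter_1963.jpg")
references = {
    "person_a": face_recognition.load_image_file("person_a_portrait.jpg"),
    "person_b": face_recognition.load_image_file("person_b_portrait.jpg"),
}

# One 128-d encoding per known face (assumes one face per reference photo).
known_names = list(references)
known_encodings = [face_recognition.face_encodings(img)[0]
                   for img in references.values()]

# Find and encode every face in the archival photo, then compare.
locations = face_recognition.face_locations(archive_image)
encodings = face_recognition.face_encodings(archive_image, locations)

for location, encoding in zip(locations, encodings):
    distances = face_recognition.face_distance(known_encodings, encoding)
    best = distances.argmin()
    if distances[best] < 0.6:  # commonly used default tolerance
        print(f"Possible match at {location}: {known_names[best]}")
    else:
        print(f"Unidentified face at {location}")
```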

Banks, governments and others are using facial recognition for their own purposes; let’s also use it to hold people responsible for their moral choices.

Moral choices at lunch counters, police riots, soldiers and camp guards from any number of countries and time periods, etc.

Yes?

Kaspersky: Is Source Code Disclosure Meaningful?

Filed under: Cybersecurity,Government,Open Source,Security — Patrick Durusau @ 2:23 pm

Responding to a proposed ban of Kaspersky Labs software, Eugene Kaspersky, chief executive of Kaspersky Lab, is quoted in Russia’s Kaspersky Lab offers up source code for US government scrutiny as saying:

The chief executive of Russia’s Kaspersky Lab says he’s ready to have his company’s source code examined by U.S. government officials to help dispel long-lingering suspicions about his company’s ties to the Kremlin.

In an interview with The Associated Press at his Moscow headquarters, Eugene Kaspersky said Saturday that he’s also ready to move part of his research work to the U.S. to help counter rumors that he said were first started more than two decades ago out of professional jealousy.

“If the United States needs, we can disclose the source code,” he said, adding that he was ready to testify before U.S. lawmakers as well. “Anything I can do to prove that we don’t behave maliciously I will do it.”

Personally I think Kaspersky is about to be victimized by anti-Russia hysteria, where repetition of rumors, not facts, is the coin of the realm.

Is source code disclosure meaningful? It’s a question applicable to Kaspersky disclosures to U.S. government officials, or to Microsoft or Oracle disclosures of source code to foreign governments.

My answer is no, at least if you mean source code disclosure limited to governments or other clients.

Here’s why:

  • Limited competence: For the FBI in particular, source code disclosure is meaningless. Recall the FBI blew away $170 million in the Virtual Case File project with nothing to show and no prospect of a timeline, after four years of effort.
  • Limited resources: Guido Vranken’s The OpenVPN post-audit bug bonanza demonstrates that after two (2) manual audits, vulnerabilities remained to be found in OpenVPN. Any source code given to a government will be reviewed at most once, and then only by a limited number of individuals. Contrast that with OpenVPN, which has been reviewed for years by a large number of people, and yet flaws remain to be discovered.
  • Limited staff: Closely related to my point about limited resources, the people in government who are competent to undertake a software review are already busy with other tasks. Most governments don’t have a corps of idle but competent programmers waiting for source code disclosures to evaluate. Whatever source code review takes place, it will be the minimum required and that only as other priorities allow.

If Kaspersky Labs were to open source but retain copyright on their software, then their source code could be reviewed by:

  • As many competent programmers as are interested
  • On an ongoing basis
  • By people with varying skills and approaches to software auditing

Setting a new standard for security software, open source but copyrighted, would be to the advantage of leaders in Gartner’s Magic Quadrant; others, not so much.

It’s entirely possible for someone to compile the source code and avoid paying a license fee, but seriously, is anyone going to chase pennies on the ground when there are $100 bills blowing overhead? Auditing, code review, transparency, trust: those are the $100 bills. (I know, the RIAA chases pennies, but it’s run by delusional paranoids.)

Three additional reasons for Kaspersky to go open source but copyrighted:

  • Angst among its more poorly managed competitors will soar.
  • An example for governments mandating open source but copyrighted software for domestic sales. (Think China, the EU, Russia.)
  • Front page news featuring Kaspersky Labs as breaking away from the pack.

Entirely possible for Kaspersky to take advantage of the narrow-minded nationalism now so popular in some circles of the U.S. government. Not to mention changing the landscape of security software to its advantage.

Full Fact is developing two new tools for automated fact-checking

Filed under: Journalism,News,Reporting — Patrick Durusau @ 10:13 am

Full Fact is developing two new tools for automated fact-checking by Mădălina Ciobanu.

From the post:

The first tool, Live, is based on the assumption that people, especially politicians, repeat themselves, Babakar explained, so a claim that is knowingly or unknowingly false or inaccurate is likely to be said more than once by different people.

Once Full Fact has fact-checked a claim, it becomes part of their database, and the next step is making sure that data is available every time the same assertion is being made, whether on TV or at a press conference. “That’s when it gets interesting – how can you scale the fact check so that it can be distributed in a much grander way?”

Live will be able to monitor live TV subtitles and eventually perform speech-to-text analysis, taking a live transcript from a radio programme or a press conference and matching it against Full Fact’s database.

The second tool Full Fact is building is called Trends, and it aims to record every time a wrong or false claim is repeated, and by whom, to enable fact-checkers to track who or what is “putting misleading claims out into the world”.

Because part of Full Fact’s remit is also to get corrections on claims they verify, the team wants to be able to measure the impact of their work, by looking at whether a claim has been said again once they have fact-checked it and requested a correction for it.

The work on Live and Trends has just been funded and the tools are scheduled to appear in 2018.

They are hiring, by the way: Automated Factchecking at Full Fact. Full Fact is also a charity, in case you want to donate to support this work.
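To make the Live idea concrete, here is a minimal sketch of matching transcript sentences against a database of already-checked claims. The claims, verdicts and 0.8 similarity cutoff are placeholders of mine, not Full Fact’s implementation:

```python
# Sketch: match incoming transcript sentences against fact-checked claims.
# Fuzzy matching via difflib; a production system would use better NLP.
from difflib import SequenceMatcher

# Hypothetical fact-check database: claim text -> verdict.
fact_checks = {
    "crime has doubled in the last five years": "False: recorded crime fell.",
    "the health plan covers everyone": "Unsupported: no costing published.",
}

def best_match(sentence, database, cutoff=0.8):
    """Return (claim, verdict) for the closest known claim, or None."""
    best_claim, best_score = None, 0.0
    for claim in database:
        score = SequenceMatcher(None, sentence.lower(), claim).ratio()
        if score > best_score:
            best_claim, best_score = claim, score
    if best_score >= cutoff:
        return best_claim, database[best_claim]
    return None

# Hypothetical live transcript line.
line = "Crime has doubled in the last five years."
match = best_match(line, fact_checks)
print(match or "No existing fact check found.")
```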

I wonder how Full Fact would rate stories such as the report from CrowdStrike, a security firm that lives in the back pocket of the Democratic Party (US), claiming Russian hacks of the DNC, a report it later revised.

Personally, since those claims were “confirmed” by a known liar, James Clapper, former Director of National Intelligence, I would downgrade such reports and their repetition by others to latrine gossip.

In case you haven’t read the various reports in detail: no records have been produced, just a lot of “in our experience” language and a positive dearth of facts. That interested “experts” say it is so, in the absence of evidence, doesn’t make their claims facts.

Looking forward to news on these projects as they develop!
