Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 11, 2018

Hiding Places for Bias in Deep Learning

Filed under: Bias,Deep Learning — Patrick Durusau @ 8:17 pm

Are Deep Policy Gradient Algorithms Truly Policy Gradient Algorithms? by Andrew Ilyas, et al.

Abstract:

We study how the behavior of deep policy gradient algorithms reflects the conceptual framework motivating their development. We propose a fine-grained analysis of state-of-the-art methods based on key aspects of this framework: gradient estimation, value prediction, optimization landscapes, and trust region enforcement. We find that from this perspective, the behavior of deep policy gradient algorithms often deviates from what their motivating framework would predict. Our analysis suggests first steps towards solidifying the foundations of these algorithms, and in particular indicates that we may need to move beyond the current benchmark-centric evaluation methodology.

Although the paper is written as an evaluation of the framework for deep policy gradient algorithms, with suggestions for improvement, it isn’t hard to see how the same factors create hiding places for bias in deep learning algorithms.

  • Gradient Estimation: we find that even while agents are improving in terms of reward, the gradient
    estimates used to update their parameters are often virtually uncorrelated with the true gradient.
  • Value Prediction: our experiments indicate that value networks successfully solve the supervised learning task they are trained on, but do not fit the true value function. Additionally, employing a value network as a baseline function only marginally decreases the variance of gradient estimates (but dramatically increases agents’ performance).
  • Optimization Landscapes: we also observe that the optimization landscape induced by modern policy gradient algorithms is often not reflective of the underlying true reward landscape, and that the latter is often poorly behaved in the relevant sample regime.
  • Trust Regions: our findings show that deep policy gradient algorithms sometimes violate theoretically motivated trust regions. In fact, in proximal policy optimization, these violations stem from a fundamental problem in the algorithm’s design.
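
The gradient estimation point deserves a concrete picture. As a toy illustration (mine, not the paper’s experiment), watch how a noisy sample-based gradient estimate only starts to line up with the true gradient once the sample count grows far beyond typical batch sizes:

import numpy as np

rng = np.random.default_rng(0)
true_grad = np.ones(10)  # stand-in for the true policy gradient

def grad_estimate(n_samples, noise=25.0):
    # Toy gradient estimate: the true gradient plus heavy per-sample
    # noise, averaged over a batch of n_samples.
    samples = true_grad + noise * rng.standard_normal((n_samples, true_grad.size))
    return samples.mean(axis=0)

for n in (10, 1_000, 100_000):
    g = grad_estimate(n)
    cos = g @ true_grad / (np.linalg.norm(g) * np.linalg.norm(true_grad))
    print(f"{n:>7} samples: cosine similarity to true gradient = {cos:+.2f}")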

The key take-away is that if you can’t explain the behavior of an algorithm, then how do you detect or guard against bias in such an algorithm? Or as the authors put it:

Deep reinforcement learning (RL) algorithms are rooted in a well-grounded framework of classical RL, and have shown great promise in practice. However, as our investigations uncover, this framework fails to explain much of the behavior of these algorithms. This disconnect impedes our understanding of why these algorithms succeed (or fail). It also poses a major barrier to addressing key challenges facing deep RL, such as widespread brittleness and poor reproducibility (cf. Section 4 and [3, 4]).

Do you plan on offering ignorance about your algorithms as a defense for discrimination?

Interesting.

November 10, 2018

Relational inductive biases, deep learning, and graph networks

Filed under: Deep Learning,Graphs,Networks — Patrick Durusau @ 9:15 pm

Relational inductive biases, deep learning, and graph networks by Peter W. Battaglia, et al.

Abstract:

Artificial intelligence (AI) has undergone a renaissance recently, making major progress in key domains such as vision, language, control, and decision-making. This has been due, in part, to cheap data and cheap compute resources, which have fit the natural strengths of deep learning. However, many defining characteristics of human intelligence, which developed under much different pressures, remain out of reach for current approaches. In particular, generalizing beyond one’s experiences–a hallmark of human intelligence from infancy–remains a formidable challenge for modern AI.

The following is part position paper, part review, and part unification. We argue that combinatorial generalization must be a top priority for AI to achieve human-like abilities, and that structured representations and computations are key to realizing this objective. Just as biology uses nature and nurture cooperatively, we reject the false choice between “hand-engineering” and “end-to-end” learning, and instead advocate for an approach which benefits from their complementary strengths. We explore how using relational inductive biases within deep learning architectures can facilitate learning about entities, relations, and rules for composing them. We present a new building block for the AI toolkit with a strong relational inductive bias–the graph network–which generalizes and extends various approaches for neural networks that operate on graphs, and provides a straightforward interface for manipulating structured knowledge and producing structured behaviors. We discuss how graph networks can support relational reasoning and combinatorial generalization, laying the foundation for more sophisticated, interpretable, and flexible patterns of reasoning. As a companion to this paper, we have released an open-source software library for building graph networks, with demonstrations of how to use them in practice.

Forty pages of very deep sledding.

Just on a quick scan, I do take encouragement from:

An entity is an element with attributes, such as a physical object with a size and mass. (page 4)

Could it be that entities have identities defined by their attributes? Are the attributes and their values recursive subjects?

Only a close read of the paper will tell but I wanted to share it today.

Oh, the authors have released a library for building graph networks: https://github.com/deepmind/graph_nets.
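
Pending that close read, here is a minimal sketch of the paper’s graph formalism (my illustration, not the graph_nets library API): a graph carries global attributes u, node attributes V, and edges E as sender/receiver pairs with their own attributes.

from dataclasses import dataclass, field

@dataclass
class Graph:
    # Graph per the paper's formalism: global attributes u, node
    # attributes V, and edges E as (sender, receiver, attributes) triples.
    u: dict = field(default_factory=dict)
    nodes: list = field(default_factory=list)
    edges: list = field(default_factory=list)

# A toy two-body system: one gravitational relation between sun and earth.
solar = Graph(
    u={"total_energy": -1.0},
    nodes=[{"name": "sun", "mass": 332946.0}, {"name": "earth", "mass": 1.0}],
    edges=[(0, 1, {"distance_au": 1.0})],
)
print(len(solar.nodes), "nodes,", len(solar.edges), "edge")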

PyCoder’s Weekly Archive 2012-2018 [Indexing Data Set?]

Filed under: Indexing,Python,Search Engines,Searching — Patrick Durusau @ 8:53 pm

PyCoder’s Weekly Archive 2012-2018

Python programmers already know about PyCoder’s Weekly but if you don’t, it’s a weekly newsletter with headline Python news, discussions, Python jobs, articles & tutorials, projects & code, and events. Yeah, every week!

I mention it too as a potential indexing data set for search software. My reasoning: you are more likely to devote effort to indexing material of interest than out-of-copyright newspapers. Besides, you will be better able to judge a good search result from a bad one when indexing PyCoder’s Weekly.
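
If you want a head start, here is a minimal inverted-index sketch in Python (the newsletter items are invented placeholders):

from collections import defaultdict

def build_index(docs):
    # Map each token to the set of document ids containing it.
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index

docs = {
    "issue-340": "parsing json with the python standard library",
    "issue-341": "profiling python code with cProfile",
}
index = build_index(docs)
print(sorted(index["python"]))  # ['issue-340', 'issue-341']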

Enjoy!

November 9, 2018

RunCode – (Was Codewarz last year) – Starts Nov 10 0900 (EST)

Filed under: Cybersecurity,Hacking — Patrick Durusau @ 8:47 pm

RunCode.

From the webpage:

Complete challenges to attain points. Attain points to impress your friends. Impress your friends to… lol, you don’t have any friends, what are you talking about!

The competition will begin at Nov 10 0900(EST) and run until Nov 12 0900(EST). The top 10 players will be able to pick a prize out of our prize list. In order to receive the prize you must provide the RunCode team your physical mailing address as we will be shipping you the prize. If you’d rather donate your prize instead of giving us your physical mailing address, we will give the prize of your choice or donate the equivalent monetary amount to a charity you choose. If you’re looking for the list of prizes, they can be found on our twitter. Good luck in the competition, and if you have any questions feel free to reach out to us on our slack chat server for support (you’ll get an email invite to our slack after making an account).

If you’d like to practice on some of our previous challenges. Head over to our main website where we have all of our previous challenges available for you to work on (the logins/accounts for the competition site and the main site are separate).

Sign up! Not many hours left!

I’ve got a full weekend of editing on tap already but registering will give me incentive to at least try some of the challenges.

I first read about the RunCode event in: Codewarz, reloaded: programming contest adds pwning, prizes as RunCode by Sean Gallagher.

November 8, 2018

Shape-Guided Image Generation [Danger! Danger! Sarah Huckabee Sanders]

Filed under: Deep Learning,Image Processing,Image synthesis — Patrick Durusau @ 9:34 pm

A Variational U-Net for Conditional Appearance and Shape Generation by Patrick Esser, Ekaterina Sutter, Björn Ommer.

Abstract:

Deep generative models have demonstrated great performance in image synthesis. However, results deteriorate in case of spatial deformations, since they generate images of objects directly, rather than modeling the intricate interplay of their inherent shape and appearance. We present a conditional U-Net [30] for shape-guided image generation, conditioned on the output of a variational autoencoder for appearance. The approach is trained end-to-end on images, without requiring samples of the same object with varying pose or appearance. Experiments show that the model enables conditional image generation and transfer. Therefore, either shape or appearance can be retained from a query image, while freely altering the other. Moreover, appearance can be sampled due to its stochastic latent representation, while preserving shape. In quantitative and qualitative experiments on COCO [20], DeepFashion [21, 23], shoes [43], Market-1501 [47] and handbags [49] the approach demonstrates significant improvements over the state-of-the-art.

The abstract fails to convey the results described in the paper. See the paper’s animated examples, where the animated versions are generated from the single image on the left.

There is a Github site with training data: https://github.com/CompVis/vunet which carries this short description:

The model learns to infer appearance from a single image and can synthesize images with that appearance in different poses.

My answer to anyone who objects to Sarah Huckabee Sanders or other members of the current regime in Washington being the subjects of this technique: Jim Acosta video.

This is war friends and you don’t win wars by praying for the other side to be more courteous.

November 5, 2018

ꓘamerka —… [On Ubuntu 18.04] 

Filed under: Open Source Intelligence,Privacy,Shodan — Patrick Durusau @ 1:25 pm

ꓘamerka — Build interactive map of cameras from Shodan by Wojciech.

From the post:

This post will be a really quick one; I want to share one of the curiosities I wrote recently. It’s a proof of concept to visualize cameras from the Shodan API on a real map. Some of the cameras are left open with no authentication, so you don’t need any hacking skills to get access, and depending on where a camera is located you can get an interesting view in some cases. With a lot of luck, it can help you with OSINT investigations or geolocating photos. Imagine you have a photo to geolocate and you find an open camera pointing exactly at this place, or somewhere nearby, which can give you a hint.

Source: https://github.com/woj-ciech/kamerka

OK, so I git clone https://github.com/woj-ciech/kamerka.git in a directory.

After changing to the kamerka directory:

pip -r install requirements

Answer:

Usage:
pip [options]

no such option: -r

Syntax error. Try:

pip install -r requirements.txt

Success!

Restriction: Works only with paid Shodan.io accounts.

Oops! I don’t have a commercial Shodan account (at the moment) so I need to break here.

When I obtain a commercial Shodan account I will report further on this script. Thinking Venice Beach would be a nice location to test for cameras. 😉

November 1, 2018

Field Notes: Building Data Dictionaries [Rough-n-Ready Merging]

Filed under: Data Management,Data Provenance,Documentation,Merging,Topic Maps — Patrick Durusau @ 4:33 pm

Field Notes: Building Data Dictionaries by Caitlin Hudon.

From the post:

The scariest ghost stories I know take place when the history of data — how it’s collected, how it’s used, and what it’s meant to represent — becomes an oral history, passed down as campfire stories from one generation of analysts to another like a spooky game of telephone.

These stories include eerie phrases like “I’m not sure where that comes from”, “I think that broke a few years ago and I’m not sure if it was fixed”, and the ever-ominous “the guy who did that left”. When hearing these stories, one can imagine that a written history of the data has never existed — or if it has, it’s overgrown with ivy and tech-debt in an isolated statuary, never to be used again.

The best defense I’ve found against relying on an oral history is creating a written one.

Enter the data dictionary. A data dictionary is a “centralized repository of information about data such as meaning, relationships to other data, origin, usage, and format”, and provides us with a framework to store and share all of the institutional knowledge we have about our data.

Unless you have taken over the administration of an undocumented network, you cannot really appreciate Hudon’s statement:


As part of my role as a lead data scientist at a start-up, building a data dictionary was one of the first tasks I took on (started during my first week on the job).

I have taken over undocumented Novell and custom-written membership systems. They didn’t remain that way but moving to fully documented systems was perilous and time-consuming.

The first task for any such position is to confirm an existing data dictionary and/or build one if it doesn’t exist. No other task, except maybe the paperwork for HR so you can get paid, is more important.

Hudon’s outline of her data dictionary process is as good as any, but it doesn’t allow for variant and/or possibly conflicting data dictionaries. Or for detecting when “variants” are only apparent and not real.

Where Hudon has Field notes, consider inserting structured properties that you can then query for “merging” purposes.

It’s not necessary to work out how to merge all the other fields automatically, especially if you are exploring data or data dictionaries.

Or to put it differently, not every topic map results in a final, publishable, editorial product. Sometimes you only want enough subject identity to improve your data set or results. That’s not a crime.
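
To make the structured-properties suggestion concrete, here is a minimal sketch (Python, with field names invented for illustration): give each dictionary entry identity properties, then group entries whose identities agree to surface variant names for the same subject.

from collections import defaultdict

entries = [
    {"field": "cust_id", "source": "billing", "identity": ("customer", "id")},
    {"field": "customer_id", "source": "crm", "identity": ("customer", "id")},
    {"field": "ship_addr", "source": "crm", "identity": ("customer", "address")},
]

by_subject = defaultdict(list)
for entry in entries:
    by_subject[entry["identity"]].append(entry)  # same identity -> same subject

for identity, group in by_subject.items():
    names = {e["field"] for e in group}
    if len(names) > 1:
        print(identity, "has variant names:", sorted(names))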

October 31, 2018

BaseX 9.1: The Autumn Edition [No Weaponized Email Leaks for Mid-Term Elections to Report]

Filed under: BaseX,XML,XQuery — Patrick Durusau @ 3:41 pm

Christian Grün writes in an email:

Dear XML and XQuery aficionados,

It’s been exactly 5 months ago when BaseX 9 was released, and we are happy to announce version 9.1 of our XML framework, database system
and XQuery 3.1 processor! The latest release is online:

http://basex.org

The most exciting addition is support for WebSockets, which enable you to do bidirectional (full-duplex) client/server communication with
XQuery web applications:

http://docs.basex.org/wiki/WebSockets

Moreover, we have added convenient syntax extensions (ternary if, Elvis operator, if without else) to XQuery. Some of them may be made available in other implementations of XQuery as well (we’ll keep you updated):

http://docs.basex.org/wiki/XQuery_Extensions#Expressions

Other new features are as follows:

XQuery:
– set local locks via pragmas and function annotations
– Database Module: faster processing of value index functions
– Jobs Module: record and return registration times
– ENFORCEINDEX option: support for predicates with dynamic values
– Update Module, update:output: support for caching maps and arrays

GUI:
– Mac, Windows: Improved rendering support for latest Java versions
– XQuery editor: choose and display current query context

Visit http://docs.basex.org to get more information on the added features.

Your feedback is welcome! Have fun,

Christian
BaseX Team

I know of no examples of weaponized email leaks built with BaseX for the mid-term elections, now less than a week away.

That absence is more than a little disappointing: industrial-strength tools such as BaseX are available, and computer security remains at a Hooterville level of robustness.

Despite this missed opportunity, there are elections scheduled (still) for 2020.
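
If the WebSockets support tempts you, here is a minimal Python client sketch using the websockets package (the endpoint path is my invention; see the BaseX WebSockets documentation for the server-side wiring):

import asyncio
import websockets  # pip install websockets

async def main():
    # Hypothetical endpoint; 8984 is the default BaseX HTTP port.
    async with websockets.connect("ws://localhost:8984/ws/echo") as ws:
        await ws.send("Hello, BaseX!")
        print(await ws.recv())

asyncio.run(main())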

ICC Metadata – Vulnerability Pattern?

Filed under: Steganography,Tweets,Twitter — Patrick Durusau @ 2:07 pm

This Tiny Picture on Twitter Contains the Complete Works of Shakespeare by Joseph Cox.

From the post:


The trick works by leveraging how Twitter handles metadata. Buchanan explained that Twitter strips most metadata from images, but the service leaves a particular type called ICC untouched. This is where Buchanan stored his data of choice, including ZIP and RAR archives.

“So basically, I wrote a script which parses a JPG file and inserts a big blob of ICC metadata,” he said. “The metadata is carefully crafted so that all the required ZIP headers are in the right place.” This process was quite fiddly, he added, saying it took a few hours to complete, although he wrote the script itself over a span of a couple of months.

“I was just testing to see how much raw data I could cram into a tweet and then a while later I had the idea to embed a ZIP file,” Buchanan added.

The ICC link points to PhotoME:

PhotoME is a powerful tool to show and edit the meta data of image files. Thanks to the well organised layout and intuitive handling, it’s possible to analyse and modify Exif and IPTC-NAA data as well as analyse ICC profiles – and it’s completely FREE!

Useful link/software but it doesn’t define ICC metadata.

I’m curious because the handling of ICC metadata may be a vulnerability pattern found in other software.

ICC metadata is a color profile defined by the International Color Consortium. The ICC specifications page has links to the widely implemented version 4, Specification ICC.1:2010-12 (Profile version 4.3.0.0); its successor, now in development, Specification ICC.2:2018 (iccMAX); and the previous ICC profile, Specification ICC.1:2001-04.

The member list of the ICC alone testifies to the reach of any vulnerability enabled by ICC metadata. Add to that the implementers of ICC metadata and the images carrying it.

How does your image processing software manage ICC metadata?
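
As a starting point for that audit, here is a minimal Python sketch using Pillow to check whether, and how much, ICC profile data an image carries:

from PIL import Image  # pip install Pillow

def icc_profile_bytes(path):
    # Return the embedded ICC profile as raw bytes, or None if absent.
    with Image.open(path) as img:
        return img.info.get("icc_profile")

profile = icc_profile_bytes("photo.jpg")  # substitute an image of your own
print("no ICC profile" if profile is None else f"{len(profile)} bytes of ICC data")

An outsized byte count is a hint that something besides color management is riding along.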

FeatherCast – Apache Software Foundation Podcast – Follow @FeatherCast

Filed under: Data Science,Podcasting — Patrick Durusau @ 9:25 am

FeatherCast – Apache Software Foundation Podcast

From the about page:

The Apache Software Foundation is a highly diverse organisation, with projects covering a wide range of technologies. Keeping track of them all is no easy task, nor is keeping track of all the news that it generates.

This podcast aims to provide a regular update and insight into the world of the foundation. We’re going to try and bring you interviews from the people who make the decisions and guide the foundation and its projects, giving you the chance to have your questions put to them.

FeatherCast was created by David Reid and Rich Bowen, both of whom are members of the Apache Software Foundation. Over time we have added and lost a number of interviewers. Right now, our active interviewers include Rich, and Sharan Foga.

Like many of you, my first visit to the Apache.org website is lost in the depths of time. It was certainly to explore the HTTP Server Project, which even today appears outside the list of equally important software projects.

Add @FeatherCast to the list of Twitter accounts you follow. The content is about what you would expect from one of the defining forces of the Internet and data science as we know it. That is to say, excellent!

Enjoy and please spread the news about Feathercast!

October 30, 2018

Fake News about Russian Porn Infection

Filed under: Cybersecurity,Hacking,Porn — Patrick Durusau @ 7:49 pm

Porn-Watching Employee Infected Government Networks With Russian Malware, IG Says

From the post:

The agency’s inspector general traced the malicious software to a single unnamed USGS employee, who reportedly used a government-issued computer to visit some 9,000 adult video sites, according to a report published Oct. 17.

Many of the prohibited pages were linked to Russian websites containing malware, which was ultimately downloaded to the employee’s computer and used to infiltrate USGS networks, auditors found. The investigation found the employee saved much of the pornographic material on an unauthorized USB drive and personal Android cellphone, both of which were connected to their computer against agency protocols.

Many people breathed a sigh of relief when it was reported the USGS employee used their computer:

…to visit some 9,000 adult video sites, …

Relief because they themselves hadn’t visited 9,000 adult video sites, and that’s a lot of sites, assuming you have other job duties.

Sorry to disappoint but the IG report says in fact:

…Many of the 9,000 web pages ****** visited routed through websites that originated in Russia and contained malware.

Ah, “9,000 web pages,” not “…9,000 adult video sites.” That’s quite a difference.

More than a few but a much more plausible number.

Aside from poor fact checking, the real lesson here is to realize porn is a great carrier for malware, if you didn’t know that already.

My favorite AI newsletters…

Filed under: Artificial Intelligence,Machine Learning — Patrick Durusau @ 7:10 pm

My favorite AI newsletters, run by people working in the field by Rosie Campbell.

Campbell lists her top five (5) AI newsletters. That’s a manageable number, at least if I discontinue other newsletters that fill my inbox.

Not that my current newsletter subscriptions aren’t valuable, but I’m not the web archive for those mailings and if I lack the time to read them, what’s the point?

It’s not Spring so I need to do some Fall cleaning of my newsletter subscriptions.

Any additions to those suggested by Campbell?

r2con 2018 – videos [Dodging Political Ads]

Filed under: Cybersecurity,Hacking,Radare2 — Patrick Durusau @ 6:56 pm

r2con 2018 – videos

Avoid the flood of political ads this final week before the US mid-term elections! May I suggest the videos from r2con 2018?

Unlike political ads and news coverage laced with false information, r2con videos won’t make you dumber. They may not make you smarter, but you will be better informed about r2 topics.

Should you accidentally encounter political news coverage or a political ad, run to your computer and watch an r2con video. You will feel better.

Enjoy!

Caselaw Access Project – 360 Years of United States Caselaw

Filed under: Law,Law - Sources,Legal Informatics — Patrick Durusau @ 6:41 pm

Caselaw Access Project – 360 Years of United States Caselaw

From the about page:

The Caselaw Access Project (“CAP”) expands public access to U.S. law.

Our goal is to make all published U.S. court decisions freely available to the public online, in a consistent format, digitized from the collection of the Harvard Law Library.

CAP includes all official, book-published United States case law — every volume designated as an official report of decisions by a court within the United States.

Our scope includes all state courts, federal courts, and territorial courts for American Samoa, Dakota Territory, Guam, Native American Courts, Navajo Nation, and the Northern Mariana Islands. Our earliest case is from 1658, and our most recent cases are from 2018.

Each volume has been converted into structured, case-level data broken out by majority and dissenting opinion, with human-checked metadata for party names, docket number, citation, and date.

We also plan to share (but have not yet published) page images and page-level OCR data for all volumes.

On the bright side: 6.4 million unique cases and 40 million scanned pages. On the dark side, access is limited in some situations. See the website for details.

Headnotes for volumes after 1922 are omitted (a symptom of insane copyright laws) but that presents the opportunity/necessity for generating headnotes automatically. A non-trivial exercise but an interesting one.

Take note:


You can report errors of all kinds at our Github issue tracker, where you can also see currently known issues. We particularly welcome metadata corrections, feature requests, and suggestions for large-scale algorithmic changes. We are not currently able to process individual OCR corrections, but welcome general suggestions on the OCR correction process.

What extra features would you like?

October 29, 2018

DeepCreamPy – Decensoring Hentai with Deep Neural Networks

Filed under: Deep Learning,Neural Networks,Porn — Patrick Durusau @ 4:18 pm

DeepCreamPy – Decensoring Hentai with Deep Neural Networks

From the webpage:

This project applies an implementation of Image Inpainting for Irregular Holes Using Partial Convolutions to the problem of hentai decensorship. Using a deep fully convolutional neural network, DeepCreamPy can replace censored artwork in hentai with plausible reconstructions. The user needs to specify the censored regions in each image by coloring those regions green in a separate image editing program like GIMP or Photoshop.

Limitations

The decensorship is intended to work on color hentai images that have minor to moderate censorship of the penis or vagina. If a vagina or penis is completely censored out, decensoring will be ineffective.

It does NOT work with:

  • Black and white/Monochrome image
  • Hentai containing screentones (e.g. printed hentai)
  • Real life porn
  • Censorship of nipples
  • Censorship of anus
  • Animated gifs/videos

… (emphasis in original)

Given the project limitations, there is a great opportunity for a major contribution.

Though I don’t know how “decensored drawings of anuses” would look on a resume. You might need to re-word that part.

What images do you want to decensor?

October 28, 2018

Conway’s Game of Life in R: … [Simple Rules Lead to Complex Behaviors]

Filed under: Cellular Automata,Chaos — Patrick Durusau @ 8:27 pm

Conway’s Game of Life in R: Or On the Importance of Vectorizing Your R Code by John Mount.

From the post:

R is an interpreted programming language with vectorized data structures. This means a single R command can ask for very many arithmetic operations to be performed. This also means R computation can be fast. We will show an example of this using Conway’s Game of Life.

A demonstration of a 10x speed increase from vectorized R code.
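
Mount’s code is in R, of course. As a cross-language illustration of the same idea, here is one vectorized Game of Life step in Python/NumPy with no explicit per-cell loop (my sketch; toroidal edges assumed):

import numpy as np

def life_step(board):
    # One Game of Life step; board is a 2-D array of 0s and 1s.
    # Sum the eight neighbors by shifting the whole board; no loops over cells.
    neighbors = sum(
        np.roll(np.roll(board, dy, axis=0), dx, axis=1)
        for dy in (-1, 0, 1) for dx in (-1, 0, 1)
        if (dy, dx) != (0, 0)
    )
    # A cell is alive next step with exactly 3 neighbors, or 2 if already alive.
    return ((neighbors == 3) | ((board == 1) & (neighbors == 2))).astype(int)

rng = np.random.default_rng(0)
board = rng.integers(0, 2, size=(50, 50))
board = life_step(board)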

Cellular automata (Conway’s Game of Life is one) have a rich history, as well as being part of the focus of the Santa Fe Institute.

Suffice it to say that simple rules lead to complex and unpredictable behaviors.

Keep that in mind when people suggest simple solutions to complex behaviors, such as regulating social media in response to acts of violence.

October 27, 2018

How To Learn Data Science If You’re Broke

Filed under: Data Science — Patrick Durusau @ 8:30 pm

How To Learn Data Science If You’re Broke by Harrison Jansma.

From the post:

Over the last year, I taught myself data science. I learned from hundreds of online resources and studied 6–8 hours every day. All while working for minimum wage at a day-care.

My goal was to start a career I was passionate about, despite my lack of funds.

Because of this choice I have accomplished a lot over the last few months. I published my own website, was posted in a major online data science publication, and was given scholarships to a competitive computer science graduate program.

In the following article, I give guidelines and advice so you can make your own data science curriculum. I hope to give others the tools to begin their own educational journey. So they can begin to work towards a more passionate career in data science.

Great resource to keep bookmarked for people who ask about getting started in data science.

October 26, 2018

Best-First Search [Inspiration for Hackers]

Filed under: D3,Graphics — Patrick Durusau @ 9:09 pm

Best-First Search by Mike Bostock.

Take my first “best-first search” result as encouragement to see this “live code” for yourself!

Best-First Search represents, figuratively speaking, the process of breaching cybersystems of pipeline construction companies, pipeline operators, their lawyers, investors, etc. Magic bullets work but so does following best-first paths until success is achieved.
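
For the algorithm itself, here is a minimal greedy best-first search sketch in Python (heapq keeps the most promising node on top; h is your estimate of distance to the goal):

import heapq
from itertools import count

def best_first_search(start, goal, neighbors, h):
    # Greedy best-first search: always expand the node the heuristic h
    # says is closest to the goal. Returns a path or None.
    tie = count()  # tie-breaker so unequal nodes never get compared
    frontier = [(h(start), next(tie), start)]
    came_from = {start: None}
    while frontier:
        _, _, current = heapq.heappop(frontier)
        if current == goal:  # rebuild the path by walking backwards
            path = []
            while current is not None:
                path.append(current)
                current = came_from[current]
            return path[::-1]
        for nxt in neighbors(current):
            if nxt not in came_from:
                came_from[nxt] = current
                heapq.heappush(frontier, (h(nxt), next(tie), nxt))
    return None

# Toy grid: walk from (0, 0) to (3, 3), Manhattan distance as the heuristic.
goal = (3, 3)
def moves(p):
    steps = ((1, 0), (0, 1), (-1, 0), (0, -1))
    return [(p[0] + dx, p[1] + dy) for dx, dy in steps
            if 0 <= p[0] + dx <= 3 and 0 <= p[1] + dy <= 3]
def manhattan(p):
    return abs(goal[0] - p[0]) + abs(goal[1] - p[1])
print(best_first_search((0, 0), goal, moves, manhattan))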

Good hunting!

October 25, 2018

DMCA Exemptions – 10/26/18 or White Hat Advertising Rules

Filed under: Cybersecurity,Hacking,Intellectual Property (IP) — Patrick Durusau @ 7:57 pm

Beau Woods posted a tweet with the URL for: Exemption to Prohibition on Circumvention of Copyright Protection Systems for Access Control Technologies.

Cutting to the chase:


(i)Computer programs, where the circumvention is undertaken on a lawfully acquired device or machine on which the computer program operates, or is undertaken on a computer, computer system, or computer network on which the computer program operates with the authorization of the owner or operator of such computer, computer system, or computer network, solely for the purpose of good-faith security research and does not violate any applicable law, including without limitation the Computer Fraud and Abuse Act of 1986.

(ii) For purposes of this paragraph (b)(11), “good-faith security research” means accessing a computer program solely for purposes of good-faith testing, investigation, and/or correction of a security flaw or vulnerability, where such activity is carried out in an environment designed to avoid any harm to individuals or the public, and where the information derived from the activity is used primarily to promote the security or safety of the class of devices or machines on which the computer program operates, or those who use such devices or machines, and is not used or maintained in a manner that facilitates copyright infringement.
… (page 65)

I have long puzzled over claims of fearing DMCA enforcement by security researchers. The FBI is busy building illegal silencers for the mentally ill. Or engaging in other illegal, if not insane, activities. When would the FBI find the time to pursue security researchers when fantasies about Russian/Chinese/North Korean election “interference” are rippling through Washington?

Although phrased as “fear of prosecution,” the DMCA issue for white hats was one of advertising. Advertising a hack could annoy a vendor. Annoying a vendor while advertising your identity and location seemed like a bad plan. But with a DMCA exemption, white hats are free to spam the Internet with their latest “research.”

Not that I mind white hats advertising but drawing lines based on the economic interests of stakeholders doesn’t always point to greater freedom. Today it worked in favor of security researchers and possibly consumers, but there’s no guarantee that will always be the result.

CVE-2018–8414: A Case Study in Responsible Disclosure

Filed under: Cybersecurity,Hacking,Reverse Engineering — Patrick Durusau @ 3:21 pm

CVE-2018–8414: A Case Study in Responsible Disclosure by Matt Nelson.

From the post:

The process of vulnerability disclosure can be riddled with frustrations, concerns about ethics, and communication failure. I have had tons of bugs go well. I have had tons of bugs go poorly.

I submit a lot of bugs, through both bounty programs (Bugcrowd/HackerOne) and direct reporting lines (Microsoft). I’m not here to discuss ethics. I’m not here to provide a solution to the great “vulnerability disclosure” debate. I am simply here to share one experience that really stood out to me, and I hope it causes some reflection on the reporting processes for all vendors going forward.

First, I’d like to give a little background on myself and my relationship with vulnerability research.

I’m not an experienced reverse engineer. I’m not a full-time developer. Do I know C/C++ well? No. I’m relatively new to the industry (3 years in). I give up my free time to do research and close my knowledge gaps. I don’t find crazy kernel memory leaks, rather, I find often overlooked user-mode logic bugs (DACL overwrite bugs, anyone?).

Most importantly, I do vulnerability research (VR) as a hobby in order to learn technical concepts I’m interested in that don’t necessarily apply directly to my day job. While limited, my experience in VR comes with the same pains that everyone else has.

I mention this as one data point in the submission of bug reports and as encouragement to engage in bug hunting, even if you aren’t a kernel geek.

If you follow the disclosure “ethics” described in this post, the “us” who benefits includes the CIA, NSA, Saudi Arabia, Israel, and a host of others.

October 24, 2018

Bloomberg’s “China Hack” Conspiracy

Filed under: Cybersecurity,Hacking — Patrick Durusau @ 7:16 pm

Mathew Ingram writes in Pressure increases on Bloomberg to verify its China hack story:

It was a certified bombshell: Bloomberg News reported on October 4 that the Chinese government had been able to infiltrate both Apple and Amazon’s hardware systems by putting hacked microchips into the third-party motherboards they used in their servers. But as the days following the report have turned into weeks, doubts about the validity of the story have continued to grow, while the amount of independent verification and/or supporting material proving such a hack actually occurred remains at zero.

In a column on Tuesday, Washington Post media critic Erik Wemple argued the chorus of voices in opposition to the allegations in the piece—including strenuous and detailed denials from the companies involved—have put the onus on Bloomberg to come up with additional verification, or else risk casting even more doubt on its scoop. “The relentlessness of the denials and doubts from companies and government officials obligate Bloomberg to add the sort of proof that will make believers of its skeptics,” Wemple wrote. “Assign more reporters to the story, re-interview sources, ask for photos and emails. Should it fail in this effort, it’ll need to retract the entire thing.” Wemple also criticized the news outlet for using a photo of a generic microchip on the cover of Bloomberg BusinessWeek magazine, despite the fact that the news outlet has no photos of the actual chip that was allegedly used in the hacks.
… (emphasis in original)

Ingram has collected links to a number of the posts and refutations of the original Bloomberg claims.

But you don’t need the protests of innocence and/or deep technical analysis to be wary of the Bloomberg story.

On the face of the original report, how many people do you think would “know” about the subversion of the motherboards?

  1. Designers of the subversive chip
  2. Motherboard designers to create a motherboard that uses the subversive chip
  3. Development and testing staff for the chip and the motherboards
  4. Users of capabilities offered by the subversive chips
  5. Handlers of the intelligence produced by the subversive chips
  6. Funders for #1 – #5

Would you concede those in the “know” about the chips would have to number in the thousands?

I ask because research on conspiracies estimates that, to keep a secret for five years, the maximum number of participants has an upper limit of 2521 agents. On the Viability of Conspiratorial Beliefs, David Robert Grimes, PLOS, Published: January 26, 2016, https://doi.org/10.1371/journal.pone.0147905.

On the face of it, the ‘China Hack’ more closely resembles the NASA Moon-landing conspiracy than technological legerdemain.

Especially given Bloomberg’s explanation for the absence of any motherboard with the “extra” chip:


In the three years since the briefing in McLean, no commercially viable way to detect attacks like the one on Supermicro’s motherboards has emerged—or has looked likely to emerge. Few companies have the resources of Apple and Amazon, and it took some luck even for them to spot the problem. “This stuff is at the cutting edge of the cutting edge, and there is no easy technological solution,” one of the people present in McLean says. “You have to invest in things that the world wants. You cannot invest in things that the world is not ready to accept yet.”

Failure to detect becomes evidence of the cleverness of these conspirators.

Looks like a conspiracy theory, walks like a conspiracy theory, talks like a conspiracy theory, the absence of evidence proves the conspiracy theory, all suggests Bloomberg’s “China Hack” is a conspiracy theory.

Hacking Rent-A-Spy Vendors (Partial Target List)

Filed under: Cybersecurity,Government,Hacking — Patrick Durusau @ 3:49 pm

Does “hacking” apply to data found in publicly accessible locations? Lorenzo Franceschi-Bicchierai thinks so in Government Spyware Vendor Left Customer, Victim Data Online for Everyone to See.

However you answer that question, the post is an amusing tale of a spyware startup that left 20 gigabytes of data exposed to the public.

And it’s a valuable article, given the targeting data gathered:


Wolf Intelligence is part of the so-called “lawful intercept” industry. This is a relatively unregulated—but legal—part of the surveillance market that provides hacking and spy software to law enforcement and intelligence agencies around the world. Hacking Team, FinFisher, and NSO Group are the more well-known companies in this sector. According to a recent estimate, this market is expected to be worth $3.3 billion in 2022.

These companies generally sell spyware that infects computers and cell phones with the goal of extracting evidence for police or intelligence operations, which can be particularly useful when authorities need to get around encryption and have a warrant to access the content of a target’s communications. But in the past, companies like Hacking Team, FinFisher, and NSO Group have all sold their malware to authoritarian regimes who have used it against human rights defenders, activists, and journalists.

As demand for these technologies has grown, many smaller players have entered the market. Some of them have made embarrassing mistakes that have helped cybersecurity researchers expose them.

You can spend $$$ on R&D developing cutting-edge malware or wait for rent-a-spy vendors and the like to leak it. Rent-a-spy vendors hire from the same gene pool that makes phishing the #1 means of cybersecurity breaches. Picking up malware litter has a higher ROI.

Is anyone keeping a list of rent-a-spy vendors? Pointers? Thanks!

October 21, 2018

Why You Should Start Doing CTFs (Women in RE)

Filed under: Cybersecurity,Hacking — Patrick Durusau @ 3:16 pm

Why You Should Start Doing CTFs by Oryan De Paz.

From the post:

Capture The Flag (CTF) is a competition in the Information Security field. The main idea is to simulate different kinds of attack concepts with various challenges such as Reverse Engineering, Networks and Protocols, Programming, Crypto, Web Security, Exploits, etc.

All these challenges have one goal — capture the flag: solve the puzzle and use your skills in order to find a string that you can eventually type-in as your solution. If the solution is correct — you get the challenge points, which depend on the task difficulty. These days you can find CTF competitions in many of the infosec conferences.

De Paz has five (5) good reasons for doing Capture The Flag (CTF) exercises and pointers to additional resources.

De Paz also names, in a Twitter thread on her post, the reverse engineers who served as guideposts for her journey into CTF.

Great advice and leads to exploring CTF for yourself!

October 16, 2018

There’s a Spectre Haunting the Classics, It’s Called the TLG

Filed under: Classics,Greek,Humanities — Patrick Durusau @ 6:50 pm

Index of Ancient Greek Lexica

Today being National Dictionary Day (a U.S. oddity), I was glad to see a tweet boasting of 28 Greek lexica for online searching.

While it is true that 28 Greek lexica are available for searching, results are freely available for only eight (8) of them; access to the other twenty (20) depends upon a subscription to the TLG project.

Funded entirely with public monies and donations, the TLG created IP agreements with publishers of Greek texts, which succeeded in walling off this collection from the public for decades. Some of the less foul guardians at the TLG have prevailed upon it to offer a limited subset of the corpus for free. How kind.

Advances in digitization and artificial-intelligence-aided transcription promise access to original Greek materials in the not-too-distant future.

I look forward to a future when classicists look puzzled at mention of the TLG and then brighten to say: “Oh, that was when classics resources were limited to the privileged few.”

October 14, 2018

A Map of Every Building in America (NYT)

Filed under: Mapping,Maps — Patrick Durusau @ 6:54 pm

A Map of Every Building in America by Tim Wallace, Derek Watkins and John Schwartz.

From the post:

Most of the time, The New York Times asks you to read something. Today we are inviting you, simply, to look. On this page you will find maps showing almost every building in the United States.

Why did we make such a thing? We did it as an opportunity for you to connect with the country’s cities and explore them in detail. To find the familiar, and to discover the unfamiliar.

So … look. Every black speck on the map below is a building, reflecting the built legacy of the United States.

I’m sure maps of greater value are possible, but this interactive map of buildings by the New York Times sets a high bar.

If that weren’t good enough, Microsoft has released USBuildingFootprints, described as:

This dataset contains 125,192,184 computer generated building footprints in all 50 US states. This data is freely available for download and use.

The datasets are listed by state.

What other data set(s) would take this map from being a curiosity to being actionable?

October 13, 2018

“Oh I wish that I could be Melania Trump [Richard Cory]”

Filed under: Feminism,Politics — Patrick Durusau @ 2:24 pm

Among the shallow outpourings of scorn on Melania Trump, Arwa Mahdawi’s Melania Trump claims of victimhood have a hollow ring is representative of the rest.

Consider this snippet from her post:


In an interview with ABC News, the first lady said, “I support the women and they need to be heard” but added that if they come forward as victims they must “show the evidence”. Unfortunately, Melania did not elaborate on what sort of evidence she considers acceptable. Might she accept, for example, a tape of her husband boasting about grabbing women’s crotches without their consent?

Despite her immense advocacy for women, I’m sorry to report that Melania feels let down by the sisterhood. “I could say I’m the most bullied person in the world,” she said in her interview.

Listen, I support the Melanias and they need to be heard, but if you’re going to come forward as a victim, you must show the evidence. And right now all the evidence seems to point at the first lady being just as morally bankrupt as the president and deserving every ounce of criticism she attracts. If you do feel any spark of sympathy for Melania, I suggest you redirect your attention to the thousands of migrant children the Trump administration has kidnapped.

As for Melania’s “show the evidence” comment, in context she clearly says that the media, emphasis on the media, goes too far when someone says they have been assaulted. Not quite the same impression as you get from Mahdawi’s account.

Melania may have been sexually abused or assaulted and, being unable to “show the evidence,” suffered in silence along with millions of women around the world. If speaking out without evidence makes your life worse, then her advice may not be too far off the mark.

If she has abuse issues in her past, like any other survivor, she has an absolute right to speak or NOT speak about her prior abuse. Neither Mahdawi nor anyone else has the right to demand Melania shed her personal privacy so they can judge her legitimacy.

It’s not clear what Mahdawi finds surprising about:

I could say I’m the most bullied person in the world

A question was asked and Melania answered. What other source of information would you use to judge a person’s view of the world?

Mahdawi’s projection of an imaginary world that Melania occupies reminds me of Richard Cory by Edwin Arlington Robinson, re-written by Paul Simon as Richard Cory, which reads in part:

They say that Richard Cory owns one half of this whole town
with political connections to spread his wealth around
born into society a banker’s only child
He had everything a man could want power, grace and style
But I work in his factory and I curse the life I’m living
and I curse my poverty and I wish that I could be
Oh I wish that I could be, Oh I wish that I could be Richard Cory

oh he surely must be happy with everything he’s got

“Richard Cory went home last night and put a bullet through his head.”

In Mahdawi’s imaginary world projection, Melania is not bullied by Trump and his band of sycophants. Nor has she paid a high price to reach her present position and/or to remain there. Mahdawi is welcome to her fiction, but it’s not a valid basis for judging the words or actions of Melania Trump.

Spend less time fantasizing about the First Lady and more on bringing the Trump administration to an end.

October 12, 2018

EraseIt! Requirements for an iPhone Security App

Filed under: Cybersecurity,Government,Hacking — Patrick Durusau @ 3:40 pm

Joseph Cox writes in: Cops Told ‘Don’t Look’ at New iPhones to Avoid Face ID Lock-Out:


As Apple has improved its security protections against attackers who have physical access to a phone—Touch and Face ID, the Secure Enclave Processor that handles these tools, and robust encryption used by default—law enforcement agencies have come up with varying techniques for getting into devices they seize. In the UK, police officers simulated a mugging to steal a suspect’s phone while he was using it, so it would be unlocked, and the officer repeatedly swiped the screen to make sure the phone did not close itself off again. Police lawyers determined that they would have no legal power to force the suspect to place his finger on the device, so opted for this unusual, albeit novel, approach.

In the US, however, law enforcement agencies have used both technical and legal means to get into devices. Courts have compelled suspects to unlock their device with their face or fingerprint, but the same approach does not necessarily work for demanding a passcode; under the Fifth Amendment, which protects people from incriminating themselves, a passcode may be considered as “testimonial” evidence. A number of warrants have focused on forcing suspects to place their finger onto an iPhone, and, as Forbes noted in its recent report, some warrants now include boilerplate language that would cover unlocking a device with a person’s face as well. Law enforcement agencies across the country have also bought GrayKey, a small and relatively cheap device that has had success in unlocking modern iPhones by churning through different passcode combinations.

Of all the breaches of iPhone security mentioned, GrayKey is the most disturbing. It bypasses the repeated-attempt limitation and can crack a six-digit PIN in 22.2 hours (at worst) and 11.1 hours on average, per estimates tweeted by @matthew_d_green.
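
Those figures are consistent with a rate of roughly 80 ms per guessed passcode across the 10^6 six-digit combinations, a sanity check you can reproduce (the per-guess rate is my back-of-the-envelope inference, not a published GrayKey specification):

SECONDS_PER_GUESS = 0.08  # assumed: roughly 80 ms per attempted passcode
CODES = 10 ** 6           # every six-digit passcode

worst_hours = CODES * SECONDS_PER_GUESS / 3600  # try every code
avg_hours = worst_hours / 2                     # expect success halfway through
print(f"worst case: {worst_hours:.1f} h, average: {avg_hours:.1f} h")
# prints: worst case: 22.2 h, average: 11.1 h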

While mulling over the implications of GrayKey, I found How to Set iPhone to Erase All Data After 10 Failed Passcode Attempts by Leomar Umpad.

The downside: when the door bursts open and a flash-bang grenade goes off, you may be too excited (one word for it) to quickly enter the wrong passcode into your iPhone ten times. Or your freedom of movement may be restricted by armed police officers even after calm is restored.

Your iPhone needs an EraseIt! app that:

  1. Responds to verbal commands
  2. User supplied command starts erasure process
  3. Once started, erasure process disables all input, including the power button
  4. Erases all data (among other things I don’t know, how effective is data erasure in iPhones?)
  5. (Refinement) Writes 0 or 1 to all memory locations until battery failure

Relying on passcodes reminds me of Bruce Schneier’s classification of cryptography in Applied Cryptography (2 ed.):

There are two kinds of cryptography in this world: cryptography that will stop your kid sister from reading your files, and cryptography that will stop major governments from reading your files. This book is about the latter.

Passcodes are the former.

What other requirements would you have for an EraseIt! app?

PS: Go carefully. Most government forces differ from those of Saudi Arabia (Jamal Khashoggi) only in their preference to kill with plausible deniability.

October 11, 2018

Lost Opportunity for Microsoft Edge Remote Execution Bug

Filed under: Cybersecurity,Hacking,Microsoft — Patrick Durusau @ 8:55 pm

Proof-of-concept code published for Microsoft Edge remote code execution bug by Catalin Cimpanu.

From the post:


The proof-of-concept (PoC) code is for a Microsoft Edge vulnerability —CVE-2018-8495— that Microsoft patched this week, part of its October 2018 Patch Tuesday.

The vulnerability was discovered by Kuwaiti security researcher Abdulrahman Al-Qabandi, who reported his findings to Microsoft via Trend Micro’s Zero-Day Initiative program.

Today, after making sure Microsoft had rolled out a fix, Al-Qabandi published in-depth details about the Edge vulnerability on his blog.

Such PoCs are usually quite complex, but Al-Qabandi’s code is only HTML and JavaScript, meaning it could be hosted on any website.

When was the last time you heard of North Korean, Russian or Chinese security researchers (sounds classier than “hackers”) reporting a zero-day exploit to a vendor?

Same here.

Consider the opportunities presented by an HTML and Javascript zero-day with regard to governments, military installations and/or corporate entities.

All of those lost by the use of a zero-day submission process and issuance of a patch by Microsoft.

Follow your own conscience but remember, none of the aforementioned are on your side. Why should you be on theirs?

“I Can See You!” * 9 million (est.)

Filed under: Cybersecurity,Hacking — Patrick Durusau @ 8:04 pm

Millions at risk from default webcam passwords

From the post:


The vulnerability lies in a feature called XMEye P2P Cloud, which is enabled on all Xiongmai devices by default. It lets people access their devices remotely over the internet, so that they can see what’s happening on their IP cameras or set up recording on their DVRs.

Using a variety of apps, users log into their devices via Xiongmai’s cloud infrastructure. This means that they don’t have to set up complex firewall port forwarding or UPnP rules on their home routers, but it also means that it opens up a hole in the user’s network. That places the onus on Xiongmai to make the site secure. But it didn’t.

The article goes on to point out how to locate these insecure devices, which are estimated at a population of 9 million around the world.

Suggestions on AI-assisted recognition software to distinguish baby pics from more interesting content?

Morally Blind Reporting – 32 million Muslim Dead vs. Trade Secrets

Filed under: Government,News,Politics — Patrick Durusau @ 2:17 pm

You don’t need citations from me to know bias in news coverage is all the rage these days. But there is precious little discussion of what is meant by “bias,” other than the speaker knowing it when they see it.

Here’s my example of morally blind (biased) news reporting or the lack thereof:

Yanjun Xu, a high-ranking director in China’s Ministry of State Security (MSS), the country’s counter-intelligence and foreign intelligence agency…” was arrested for alleged economic espionage and attempts to steal trade secrets in the United States.

You will see much hand wringing and protests of how necessary such a step was to protect American companies and their trade secrets. Add in a dash of prejudice against China and indignation that a nation of thieves (the U.S.) should be stolen from by others and you complete the scene.

When you find stories about Yanjun Xu, check the same sources for reporting on U.S. responsibility for 32 million Muslim dead since 9/11.

In any moral calculus worthy of the name “moral,” surely the deaths of millions are more important than the intellectual property rights of U.S. industries. Yes?

The value U.S. news organizations place on Muslim deaths versus theft of trade secrets is made self-evident by their reporting.

I don’t want to re-live the 1960s, when dying people were a daily staple of the evening news (even then it was almost always Americans). However, fair and balanced reporting does not exist when millions perish without every man, woman and child being made aware of it on a daily basis. Along with the lack of even a flimsy excuse for their murders.

The U.S. media can start by televising the nearly daily murder of protesters in Gaza and work their way out from there. Close-ups, talk to families, bring the cruelty the U.S. is financing into our living rooms. Sicken us with our own inhumanity.

PS: Don’t bother commenting the media lacks access, permission, etc. If you want to be butt-puppets of government, say so, don’t sully the title reporter.
