Launch of the PhilMath Archive

May 29th, 2017

Launch of the PhilMath Archive: preprint server specifically for philosophy of mathematics

From the post:

PhilSci-Archive is pleased to announce the launch of the PhilMath-Archive, http://philsci-archive.pitt.edu/philmath.html a preprint server specifically for the philosophy of mathematics. The PhilMath-Archive is offered as a free service to the philosophy of mathematics community. Like the PhilSci-Archive, its goal is to promote communication in the field by the rapid dissemination of new work. We aim to provide an accessible repository in which scholarly articles and monographs can find a permanent home. Works posted here can be linked to from across the web and freely viewed without the need for a user account.

PhilMath-Archive invites submissions in all areas of philosophy of mathematics, including general philosophy of mathematics, history of mathematics, history of philosophy of mathematics, history and philosophy of mathematics, philosophy of mathematical practice, philosophy and mathematics education, mathematical applicability, mathematical logic and foundations of mathematics.

For your reference, the PhilSci-Archive.

Enjoy!

Innovations In Security: Put All Potential Bombs In Cargo

May 29th, 2017

US Wants to Extend Laptop Ban to All International Flights by Catalin Cimpanu.

From the post:

US Secretary of Homeland Security Gen. John Kelly revealed in an interview over the weekend that the US might expand its current laptop ban to all flights into the US in the near future.

“I might,” said Gen. Kelly yesterday on Fox News Sunday. “There’s a real threat. There’s numerous threats against aviation. That’s really the thing they’re really obsessed with, the terrorists, the idea of knocking down an airplane in flight, particularly if it is a US carrier, particularly if it is full of mostly US folks.”

Is there an FOIA exception to obtaining the last fitness report on US Secretary of Homeland Security Gen. John F. Kelly when he was serving with the Marines?

Loading fire-prone laptops, which may potentially also contain bombs, into a plane’s cargo hold for “safety” raises serious questions about Kelly’s mental competence.

Banning laptops could be a ruse to get passengers to use cloud services for their data, making it more easily available to the NSA.

As the general says, there are people obsessed with “the idea of knocking down an airplane in flight,” but those are mostly found in the Department of Homeland Security.

You need not take my word for it. The Wikipedia timeline of airline bombings shows eight such bombings since December of 2001. I find it difficult to credit “obsession” when worldwide there is only about one bomb attack on an airline every two years.
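For anyone who wants to check the arithmetic, here is a minimal sketch. It assumes the window runs from December 2001 to the date of this post (May 2017) and uses the count of eight bombings given above; the variable names are mine.

```python
# Count taken from the Wikipedia timeline as cited in the post.
BOMBINGS = 8

# Assumed window: December 2001 through May 2017 (the date of this post).
start_year, start_month = 2001, 12
end_year, end_month = 2017, 5

months = (end_year - start_year) * 12 + (end_month - start_month)
years = months / 12
years_per_bombing = years / BOMBINGS

print(f"{years:.1f} years, one bombing every {years_per_bombing:.1f} years")
```

Change the window or the count and the rate moves, but not by enough to turn it into an obsession.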

Moreover, the GAO in Airport Perimeter and Access Control Security Would Benefit from Risk Assessment and Strategy Updates (2016) found the TSA has not evaluated the vulnerability at 81% of the 437 commercial airports. US airports are vulnerable and the TSA can’t say which ones or by how much.

If terrorists truly were “obsessed,” in General Kelly’s words, the abundance of vulnerable US airports should see US aircraft dropping like flies. Except they’re not.

PS: Anticipating a complete ban on laptops, now would be a good time to invest in airport laptop rental franchises.

Deep Learning – Dodging The NSA

May 29th, 2017

The $1700 great Deep Learning box: Assembly, setup and benchmarks by Slav Ivanov.

Ivanov’s motivation for local deep learning hardware came from monthly AWS bills.

You may suffer from those or be training on data sets you’d rather not share with the NSA.

For whatever reason, follow these detailed descriptions to build your own deep learning box.

Caution: If a month or more has elapsed since this post and you’re starting to build a system, check all the update links. Hardware and prices change rapidly.

The “blue screen of death” lives! (Humorous HTML Links)

May 29th, 2017

A simple file naming bug can crash Windows 8.1 and earlier by Steve J. Vaughan-Nichols.

From the post:

In a blast from the past, a Russian researcher has uncovered a simple bug in the NTFS file system that consistently crashed Windows Vista to 8.1 PCs.

Like the infamous Windows 95/98 /con/con bug, by simply entering a file name with “$MFT” the file-system bug locks up Windows at best, or dumps it into a “blue screen of death” at worse.

The bug won’t deliver malware but since it works in URLs (except for Chrome), humorous HTML links in emails are the order of the day.

Enjoy!

Combating YouTube Censorship (Carry Banned Videos Yourself)

May 29th, 2017

Memorial Day is always a backwards looking holiday, but reading How Terrorists Slip Beheading Videos Past YouTube’s Censors by Rita Katz felt like time warping to the 1950’s.

Other jihadi propaganda on the video-sharing platform may be visually more low-key, but are just as insidious in their own ways.

There is a grim bit in comedian Dave Chappelle’s new Netflix special about clicking “don’t like” on an Islamic State beheading video.

“How is this guy cutting peoples’ heads off on YouTube?” Chappelle asks, noting the absurdity of it.

Don’t like. Click.

In reality, reports of extremist content littering YouTube aren’t new. But when hundreds of major advertisers began suspending contracts with YouTube and Google in recent months, boycotting the massive video-sharing platform over concerns with such explicit content, things got a lot more real.

Google services—namely YouTube—are the most plentiful and important links used by terrorist organizations to disseminate their propaganda. And despite all of YouTube’s efforts to keep them out thus far, such groups still manage to sneak their media onto its servers.
… (emphasis in original)

Whatever label you want to apply to another group, “terrorist,” “al Qaeda,” etc., censorship is and remains censorship.

Censorship and intimidation were practiced during the Red Scare of the 1940’s/50’s, lives/careers were ruined, and we weren’t one whit safer than without it.

Want to combat YouTube censorship?

When videos are censored by YouTube, carry them on your site.

Suggested header: “Banned on YouTube,” to make them easy to find.

It won’t stop YouTube’s censorship but it can defeat its intended outcome.

Data Journalists! Data Gif Tool (Google)

May 29th, 2017

While not hiding its prior salary discrimination against women, Google has created and released a tool for creating data gifs.

Make your own data gifs with our new tool by Simon Rogers.

From the post:

Data visualizations are an essential storytelling tool in journalism, and though they are often intricate, they don’t have to be complex. In fact, with the growth of mobile devices as a primary method of consuming news, data visualizations can be simple images formatted for the device they appear on.

Enter data gifs.

(gif omitted)

These animations can be used for a variety of sophisticated storytelling approaches among data journalists: one example is Lena Groeger, who has become *the* expert in working with data gifs.

Today we are releasing Data Gif Maker, a tool to help journalists make these visuals, which show share of search interest for two competing topics.

A good way to get your feet wet with simple data gifs.

Don’t be surprised that Google does good things for the larger community while engaging in evil conduct.

Racist sheriffs who used water cannon and dogs on Black children loved their own children and remembered their birthdays. WWII death camp guards attended church and were kind to small animals.

People and their organizations are complicated and the reading public is ill-served by shallow reporting of only one aspect or another as the “true” view.

Ethics, Data Scientists, Google, Wage Discrimination Against Women

May 28th, 2017

Accused of underpaying women, Google says it’s too expensive to get wage data by Sam Levin.

From the post:

Google argued that it was too financially burdensome and logistically challenging to compile and hand over salary records that the government has requested, sparking a strong rebuke from the US Department of Labor (DoL), which has accused the Silicon Valley firm of underpaying women.

Google officials testified in federal court on Friday that it would have to spend up to 500 hours of work and $100,000 to comply with investigators’ ongoing demands for wage data that the DoL believes will help explain why the technology corporation appears to be systematically discriminating against women.

Noting Google’s nearly $28bn annual income as one of the most profitable companies in the US, DoL attorney Ian Eliasoph scoffed at the company’s defense, saying, “Google would be able to absorb the cost as easy as a dry kitchen sponge could absorb a single drop of water.”

Disclosure: I assume Google is resisting disclosure because it in fact has a history of engaging in discrimination against women. It may or may not be discriminating this month/year, but if known, the facts will support the government’s claim. The $100,000 alleged cost is chump change to prove such a charge groundless. Resistance signals the charge has merit.

Levin’s post gives me reason to doubt Google will prevail on this issue or on the merits in general. Read it in full.

My question is what of the ethical obligations of data scientists at Google?

Should data scientists inside Google come forward with the requested information?

Should data scientists inside Google stage a work slowdown to protest Google’s resistance?

Exactly what should ethical data scientists do when their employer is the 500 pound gorilla in their field?

Do you think Google executives need a memo from their data scientists cluing them in on the ethical issues here?

Possibly not, this is old fashioned gender discrimination.

Google’s resistance signals to all of its mid-level managers that gender based discrimination will be defended.

Does that really qualify for “Don’t be evil?”

Thank You, Scott – SNL

May 26th, 2017

I posted this to Facebook; search for “Thanks Scott SNL” to find my post or that of others.

Included this note (with edits):

Appropriate social media warriors (myself included). From sexism and racism to fracking and pipelines, push back in the real world if you [want] change. Push back on social media for a warm but meaningless feeling of solidarity.

For me the “real world,” includes cyberspace, where pushing can have consequences.

You?

Hacking Fingerprints (Yours, Mine, Theirs)

May 25th, 2017

Neural networks just hacked your fingerprints by Thomas McMullan.

From the post:

Fingerprints are supposed to be unique markers of a person’s identity. Detectives look for fingerprints in crime scenes. Your phone’s fingerprint sensor means only you can unlock the screen. The truth, however, is that fingerprints might not be as secure as you think – at least not in an age of machine learning.

A team of researchers has demonstrated that, with the help of neural networks, a “masterprint” can be used to fool verification systems. A masterprint, like a master key, is a fingerprint that can be open many different doors. In the case of fingerprint identification, it does this by tricking a computer into thinking the print could belong to a number of different people.

“Our method is able to design a MasterPrint that a commercial fingerprint system matches to 22% of all users in a strict security setting, and 75% of all users at a looser security setting,” the researchers – Philip Bontrager, Julian Togelius and Nasir Memon – claim in a paper.

The tweet that brought this post to my attention didn’t seem to take this as good news.

But it is, very good news!

Think about it for a moment. Who is most likely to have “strict security settings?”

Your average cubicle dweller/home owner or …, large corporation or government entity?

What is more, if you, as a cubicle dweller are ever accosted for a breach of security, leaking fingerprint protected files, etc., what better defense than known spoofing of fingerprints?

Not that you would be guilty of such an offense, but it’s always nice to have a credible defense in addition to being innocent!

For further details:

DeepMasterPrint: Generating Fingerprints for Presentation Attacks by Philip Bontrager, Julian Togelius, Nasir Memon.

Abstract:

We present two related methods for creating MasterPrints, synthetic fingerprints that a fingerprint verification system identifies as many different people. Both methods start with training a Generative Adversarial Network (GAN) on a set of real fingerprint images. The generator network is then used to search for images that can be recognized as multiple individuals. The first method uses evolutionary optimization in the space of latent variables, and the second uses gradient-based search. Our method is able to design a MasterPrint that a commercial fingerprint system matches to 22% of all users in a strict security setting, and 75% of all users at a looser security setting.

Defeating fingerprints as “conclusive proof” of presence is an important step towards freedom for us all.

Samba Flaw In Linux PCs

May 25th, 2017

Samba Flaw Allows Hackers Access Thousands of Linux PCs Remotely

From the post:

A remote code execution vulnerability in Samba has potentially exposed a large number of Linux and UNIX machines to remote attackers. The code vulnerability (CVE-2017-7494) affects all machines with Samba versions newer than the 3.5.0 released last March 2010, making it a 7-year old flaw in the system.

Samba is a software that runs on most of the operating systems used today like Windows, UNIX, IBM, Linux, OpenVMS, and System 390. Due to its open source nature resulting from the reimplementation of the SMB (Server Message Block) networking protocol, Samba enables non-Windows operating systems like Mac OS X or GNU/Linux to give access to folders, printers, and files with Windows OS.

All affected machines can be remotely controlled by uploading a shared library to a writable program. Another command can then be used to cause the server to execute the code. This allows hackers access Linux PC remotely according to the published advisory by Samba last Wednesday, May 24.

Cited but not linked:

The Rapid7 Community post in particular has good details.
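If you want a quick first check of whether an installed Samba falls in the range described above, something like the following sketch will do. It assumes a plain dotted numeric version string (no distribution suffix), treats 3.5.0 itself as included, and knows nothing about which later releases carry the fix; the function names are mine. Check the Samba advisory for the patched versions.

```python
def parse_version(text):
    """Turn a plain dotted version string such as '4.5.9' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_reported_range(version, first_affected="3.5.0"):
    """True if version is at or after the first release reported as affected.

    This only tests the lower bound quoted in the post. It does not know
    which later releases have been patched.
    """
    return parse_version(version) >= parse_version(first_affected)


for candidate in ("3.4.17", "3.5.0", "4.5.9"):
    print(candidate, in_reported_range(candidate))
```

A True here means "go read the advisory," not "you are vulnerable."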

Not likely a repeat of WannaCry. It’s hard to imagine NHS trusts running Linux.

😉

Banking Malware Tip: Don’t Kill The Goose

May 25th, 2017

Dridex: A History of Evolution by Nikita Slepogin.

From the post:

The Dridex banking Trojan, which has become a major financial cyberthreat in the past years (in 2015, the damage done by the Trojan was estimated at over $40 million), stands apart from other malware because it has continually evolved and become more sophisticated since it made its first appearance in 2011. Dridex has been able to escape justice for so long by hiding its main command-and-control (C&C) servers behind proxying layers. Given that old versions stop working when new ones appear and that each new improvement is one more step forward in the systematic development of the malware, it can be concluded that the same people have been involved in the Trojan’s development this entire time. Below we provide a brief overview of the Trojan’s evolution over six years, as well as some technical details on its latest versions.

Compared to the 2015 GDP of the United States at ~$18 trillion, the ~$40 million damage from Dridex is a rounding error.
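To put a number on “rounding error,” here is a minimal sketch using only the two approximate figures above (~$18 trillion and ~$40 million); the variable names are mine.

```python
# Approximate figures as given in the post.
US_GDP_2015 = 18e12      # ~$18 trillion
DRIDEX_DAMAGE = 40e6     # ~$40 million (2015 estimate)

share = DRIDEX_DAMAGE / US_GDP_2015

# Express the share as a percentage and as "one part in N".
print(f"{share * 100:.5f}% of GDP, about 1 part in {round(1 / share):,}")
```

That works out to roughly two ten-thousandths of one percent, or about one part in 450,000.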

The Dridex authors are not killing the goose that lays golden eggs.

Compare the WannaCry ransomware attack, which provoked a worldwide, all hands on deck response, including Microsoft releasing free patches for unsupported software!

Maybe you can breach an FBI file server and dump its contents to Pastebin. That attracts a lot of attention and is likely to be your only breach of that server.

Strategy is as important in cyberwarfare as in more traditional warfare.

Critical: Draw Coffee Cup In TeX/LaTeX

May 25th, 2017

How to draw a coffee cup.

I’m sure everyone who has ever seen a post, article, book on TeX/LaTeX has lost sleep over how to draw a coffee cup.

Thanks to a tweet from @TeXtip, we can all rest easier. Or at least be bothered by other problems.

@TeXtip points to answers to vexing questions such as how to draw a coffee cup and acts as a reminder to use a little TeX/LaTeX every day.

Enjoy!

Sanborn Fire Insurance Maps Now Online (25K, Goal: ~500K)

May 25th, 2017

Sanborn Fire Insurance Maps Now Online

From the post:

The Library of Congress has placed online nearly 25,000 Sanborn Fire Insurance Maps, which depict the structure and use of buildings in U.S. cities and towns. Maps will be added monthly until 2020, for a total of approximately 500,000.

The online collection now features maps published prior to 1900. The states available include Arizona, Arkansas, Colorado, Delaware, Iowa, Kentucky, Louisiana, Michigan, Nebraska, Nevada, North Dakota, South Dakota, Vermont, Wisconsin and Wyoming. Alaska is also online, with maps published through the early 1960s. By 2020, all the states will be online, showing maps from the late 1880s through the early 1960s.

In collaboration with the Library’s Geography and Map Division, Historical Information Gatherers digitized the Sanborn Fire Insurance Maps during a 16-month period at the Library of Congress. The Library is in the process of adding metadata and placing the digitized, public-domain maps on its website.

The Sanborn Fire Insurance Maps are a valuable resource for genealogists, historians, urban planners, teachers or anyone with a personal connection to a community, street or building. The maps depict more than 12,000 American towns and cities. They show the size, shape and construction materials of dwellings, commercial buildings, factories and other structures. They indicate both the names and width of streets, and show property boundaries and how individual buildings were used. House and block numbers are identified. They also show the location of water mains, fire alarm boxes and fire hydrants.

In the 19th century, specialized maps were originally prepared for the exclusive use of fire insurance companies and underwriters. Those companies needed accurate, current and detailed information about the properties they were insuring. The Sanborn Map Company was created around 1866 in the United States in response to this need and began publishing and registering maps for copyright. The Library of Congress acquired the maps through copyright deposit, and the collection grew to 700,000 individual sheets. The insurance industry eventually phased out use of the maps and Sanborn stopped producing updates in the late 1970s.

The Sanborn Maps Collection.

From the collection page:


Fire insurance maps are distinctive because of the sophisticated set of symbols that allows complex information to be conveyed clearly. In working with insurance maps, it is important to remember that they were made for a very specific use, and that although they are now valuable for a variety of purposes, the insurance industry dictated the selection of information to be mapped and the way that information was portrayed. Knowledge of the keys and colors is essential to proper interpretation of the information found in fire insurance maps.

The collection page relates that the keys and use of the keys change over time, so use of a topic map with scoping topics is highly recommended.

There aren’t many maps for Georgia, but my hometown in Louisiana has good coverage through 1900. I reason that roughly knowing the geography and history of the area will help with map interpretation.

Enjoy!

How Not To Be Wrong

May 25th, 2017

How Not To Be Wrong by Winny de Jong.

From the post:

At the intersection of data and journalism, lots can go wrong. Merely taking precautions might not be enough.

“It’s very well possible that your story is true but wrong,” New York Times data journalist Robert Gebeloff explained at the European Investigative Journalism Conference & Dataharvest, which was recently held in Mechelen, a city 20 minutes outside of Brussels.

“When I work on a big story, I want to know everything about the topic.” To make sure he doesn’t miss out, Gebeloff gets all the data sources he can, examines it in all relevant ways and publishes only what he believes to be true.

The best part of this post is the distillation of Gebeloff’s presentation into a How Not To Be Wrong Checklist.

De Jong’s checklist is remarkably similar to requirements for replication of experiments in science.

It would make a great PDF file to share with data scientists in general.

Leaking Photos Of: “Sophisticated Bomb Parts”

May 24th, 2017

Theresa May to tackle Donald Trump over Manchester bombing evidence by Heather Stewart, Robert Booth and Vikram Dodd.

From the post:


British officials were infuriated on Wednesday when the New York Times published forensic photographs of sophisticated bomb parts that UK authorities fear could complicate the expanding investigation into the lethal blast in which five further arrests have been made in the UK and two more in Libya.

See for yourself: Found at the Scene in Manchester: Shrapnel, a Backpack and a Battery by C. J. Chivers.

Let’s see, remains of a backpack, detonator, metal scrap, battery.

Do you see any sophisticated bomb parts?

Sophistication, skill, encryption, etc., are emphasized after terrorist attacks, I assume to excuse the failure of authorities to prevent such attacks.

That’s more generous than assuming UK authorities are so untrained they consider this a “sophisticated” bomb. Just guessing from the parts, hardly.

“Click Bait” at The Kicker – Covering Manchester

May 24th, 2017

The Kicker: The media’s model for covering terrorist attacks is broken by Pete Vernon.

From the webpage:

ON THE LATEST EPISODE of The Kicker, we run through some of the week’s biggest media stories, including a ratings leaderboard shakeup for cable news, a spurious conspiracy that consumed the right-wing media universe, and a new study that says–surprise–journalists drink too much caffeine and alcohol. Then, we move on to the media coverage of the terrorist attack in Manchester, and tackle why we think the industry’s model for covering terror attacks is broken. Finally, CJR’s David Uberti interviews Clara Jeffery, editor in chief of Mother Jones. They discuss the magazine’s novel approach to funding its political coverage as well as the role Mother Jones played in breaking the Trump-Russia story.

Subscribe via iTunes · Stitcher · RSS Feed · SoundCloud.

The podcast.

Leading with the promise of The media’s model for covering terrorist attacks is broken, I listened to The Kicker today.

If you like podcasts, you will like The Kicker, but it illustrates for me the difficulties associated with podcasts.

First, the podcast covered five separate stories in a little over thirty minutes: cable news ratings, Seth Rich and fake news, the drinking habits of journalists, the media model for terrorist coverage (the story of interest to me), and the role of Mother Jones in the continuing From Russia With Love connection to Donald Trump.

As “click bait” for the podcast, the media reporting on terrorism segment starts at approximately 8:20 and ends at approximately 16:50, some 8 minutes and 30 seconds of coverage, much shorter than the account concerning Mother Jones (16:49 – 31:14).

Second, what discussion occurred included insights such as “…breaking news rooms, larger news rooms, don’t have the privilege of deciding whether to cover a story…?” To be fair, that was followed by discussions of “how to cover stories,” the use of raw/unexplained user video, and the appropriateness of experts discussing politics immediately following such events.

The point that got dropped in the podcast was Christie Chisholm‘s remark:

…breaking news rooms, larger news rooms, don’t have the privilege of deciding whether to cover a story…

Why so?

I may be reading entirely too much into Christie’s comment, but it implies that some news rooms must fill N minutes of coverage on breaking events, whether there is meaningful content to be delivered or not. Yes?

If that is the case, that coverage of breaking events requires wall-to-wall coverage for N minutes, then raw, unexplained video, expert opinions with no facts, reporters asking for each other’s reactions, and the spontaneous speculation and condemnations become easily explainable.

There is too little content and too much media time available to cover it.

Building on Christie’s insight, The Kicker could have created a timeline of “facts” with regard to the explosion in Manchester as a way to illustrate when facts became known about the explosion and contrast that with the drone of factless coverage of the event.

That would have made a rocking podcast and a pointed one at that.

PS: The podcast did discuss other issues with media coverage of Manchester but the lack of depth and time prevented substantive analysis or proposals. Media coverage of terrorist events certainly merits extended treatment by podcast or otherwise.

Music Encoding Initiative

May 24th, 2017

Music Encoding Initiative

From the homepage:

The Music Encoding Initiative (MEI) is an open-source effort to define a system for encoding musical documents in a machine-readable structure. MEI brings together specialists from various music research communities, including technologists, librarians, historians, and theorists in a common effort to define best practices for representing a broad range of musical documents and structures. The results of these discussions are formalized in the MEI schema, a core set of rules for recording physical and intellectual characteristics of music notation documents expressed as an eXtensible Markup Language (XML) schema. It is complemented by the MEI Guidelines, which provide detailed explanations of the components of the MEI model and best practices suggestions.

MEI is hosted by the Akademie der Wissenschaften und der Literatur, Mainz. The Mainz Academy coordinates basic research in musicology through editorial long-term projects. This includes the complete works of from Brahms to Weber. Each of these (currently 15) projects has a duration of at least 15 years, and some (like Haydn, Händel and Gluck) are running since the 1950s. Therefore, the Academy is one of the most prominent institutions in the field of scholarly music editing. Several Academy projects are using MEI already (c.f. projects), and the Academy’s interest in MEI is a clear recommendation to use standards like MEI and TEI in such projects.

This website provides a Gentle Introduction to MEI, introductory training material, and information on projects and tools that utilize MEI. The latest MEI news, including information about additional opportunities for learning about MEI, is displayed on this page.

If you want to become an active MEI member, you’re invited to read more about the community and then join us on the MEI-L mailing list.

Any project that cites and relies upon Standard Music Description Language (SMDL) merits a mention on my blog!

If you are interested in encoding music or just complex encoding challenges in general, MEI merits your attention.

China Draws Wrong Lesson from WannaCry Ransomware

May 23rd, 2017

Chinese state media says US should take some blame for cyberattack

From the post:


China’s cyber authorities have repeatedly pushed for what they call a more “equitable” balance in global cyber governance, criticizing U.S. dominance.

The China Daily pointed to the U.S. ban on Chinese telecommunication provider Huawei Technologies Co Ltd, saying the curbs were hypocritical given the NSA leak.

Beijing has previously said the proliferation of fake news on U.S. social media sites, which are largely banned in China, is a reason to tighten global cyber governance.

The newspaper said that the role of the U.S. security apparatus in the attack should “instill greater urgency” in China’s mission to replace foreign technology with its own.

The state-run People’s Daily compared the cyber attack to the terrorist hacking depicted in the U.S. film “Die Hard 4”, warning that China’s role in global trade and internet connectivity opened it to increased risks from overseas.

China is certainly correct to demand a place at the table for China and other world powers in global cyber governance.

But China is drawing the wrong lesson from the WannaCry ransomware attacks if that is used as a motivation for closed source Chinese software to replace “foreign” technology.

NSA staffers may well be working for Microsoft and/or Oracle, embedding NSA produced code in their products. With closed source code, it isn’t possible to verify the absence of such code or to prevent its introduction.

Sadly, the same is true if closed source code is written by Chinese programmers, some of whom may have agendas, domestic or foreign, of their own.

The only defense to rogue code is to invest in open source projects. Not everyone will read every line of code, but its being available to read is a deterrent to obvious subversion of an application’s security.

China should have “greater urgency” to abandon closed source software, but investing in domestic closed source only replicates the mistake of investing in foreign closed source software.

Open source projects cover every office, business and scientific need.

Chinese government support for Chinese participation in existing and new open source projects can make these projects competitors to closed and potential spyware products.

The U.S. made the closed source mistake for critical cyber infrastructure. China should not make the same mistake.

Fiscal Year 2018 Budget

May 23rd, 2017

Fiscal Year 2018 Budget.

In the best pay-to-play tradition, the Government Printing Office (GPO) has these volumes for sale:

America First: A Budget Blueprint To Make America Great Again By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00719-9 ISBN: 9780160937620. Price: $10.00.

Budget of the United States Government, FY 2018 (Paperback Book) By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00723-7 ISBN: 9780160939228. Price: $38.00.

Appendix, Budget of the United States Government, FY 2018 By: Executive Office of the President, Office of Management and Budget GPO Stock # 041-001-00720-2 ISBN: 9780160939334. Price: $79.00.

Budget of the United States Government, FY 2018 (CD-ROM) By: Executive Office of the President, Office of Management and Budget GPO Stock # 041-001-00722-9 ISBN: 9780160939358. Price: $29.00.

Analytical Perspectives, Budget of the United States Government, FY 2018 By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00721-1 ISBN: 9780160939341. Price: $56.00.

Major Savings and Reforms: Budget of the United States Government, Fiscal Year 2018 By: Executive Office of the President, Office of Management and Budget. GPO Stock # 041-001-00724-5 ISBN: 9780160939457. Price: $35.00.

If someone doesn’t beat me to it (very likely), I will either upload the contents of the CD-ROM or point you to a location that has them.

As citizens, whether you voted or not, you should have the opportunity to verify news accounts, charges and counter-charges with regard to the budget.

C Reference Manual (D.M. Ritchie, 1974)

May 23rd, 2017

C Reference Manual (D.M. Ritchie, 1974)

I mention the C Reference Manual, now forty-three (43) years old, as encouragement to write good documentation.

It may have a longer life than you ever expected!

For example, in 1974 Ritchie writes:

2.2 Identifier (Names)

An identifier is a sequence of letters and digits: the first character must be alphabetic.

Which we find replicated years later in ISO/IEC 8879 : 1986 (SGML):

4.198 name: A name token whose first character is a name start character.

4.201 name start character: A character that can begin a name: letters and others designated by the concrete syntax.

And in production [53]:


name start character =
LC Letter \
UC Letter \
LCNMSTRT \
UCNMSTRT

Where Figure 1 of 9.2.1 SGML Character defines LC Letter as a-z, UC Letter as A-Z, LCNMSTRT as (none), UCNMSTRT as (none), in the concrete syntax.

And in 1997, the letter vs. digit distinction finds its way into Extensible Markup Language (XML) 1.0.


[4] NameChar ::= Letter | Digit | '.' | '-' | '_' | ':' | CombiningChar | Extender
[5] Name ::= (Letter | '_' | ':') (NameChar)*

“Letter” is a link to a production referencing all the qualifying Unicode characters which is too long to include here.

What started off as an arbitrary choice, “alphabetic” characters as name start characters in 1974, is picked up some 12 years later (1986) in ISO/IEC 8879 (SGML), both of which were bound by a restricted character set.

When the opportunity came to abandon the letter versus digit distinction in name start characters (XML 1.0), the result is a larger character repertoire for name start characters, but digits continue as second-class citizens.
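To see the two rules side by side, here is a minimal sketch in Python. It assumes "alphabetic" means ASCII a-z and A-Z, narrows the XML Letter and Digit productions to ASCII, and leaves out CombiningChar and Extender entirely. The function names are mine, not anything from either specification.

```python
import re

# The 1974 rule as quoted: letters and digits, first character alphabetic.
# Assumption: "alphabetic" is narrowed here to ASCII a-z and A-Z.
C_1974_IDENTIFIER = re.compile(r"[A-Za-z][A-Za-z0-9]*")

# XML 1.0 productions [4] and [5], with Letter and Digit narrowed to ASCII
# and CombiningChar and Extender left out (a simplification).
XML_NAME_ASCII = re.compile(r"[A-Za-z_:][A-Za-z0-9._:-]*")


def is_c_1974_identifier(text: str) -> bool:
    """True if text fits the quoted 1974 identifier rule (ASCII only)."""
    return C_1974_IDENTIFIER.fullmatch(text) is not None


def is_xml_name_ascii(text: str) -> bool:
    """True if text fits the ASCII-only reading of production [5]."""
    return XML_NAME_ASCII.fullmatch(text) is not None


if __name__ == "__main__":
    for candidate in ["count", "x2", "2x", "_tmp", "a.b-c"]:
        print(candidate, is_c_1974_identifier(candidate), is_xml_name_ascii(candidate))
```

Under both patterns a leading digit is rejected. The XML pattern admits more start characters ('_' and ':') but still no digits.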

Can you point to an explanation why Ritchie preferred alphabetic characters over digits for name start characters?

The power of algorithms and how to investigate them (w/ resources)

May 23rd, 2017

The power of algorithms and how to investigate them by Katrien Vanherck.

From the post:

Most Americans these days get their main news from Google or Facebook, two tools that rely heavily on algorithms. A study in 2015 showed that the way a search engine like Google selects and prioritises search results on political candidates can have an influence on voters’ preferences.

Similarly, it has been shown that by tweaking the algorithms behind the Facebook newsfeed, the turnout of voters in American elections can be influenced. If Mark Zuckerberg were ever to run for president, he would theoretically have an enormously powerful tool at his disposal. (Note: a recent article in The Guardian investigated the misuse of big data and social media in the context of the Brexit referendum).

Algorithms are everywhere in our everyday life and are exerting a lot of power in our society. They prioritise, classify, connect and filter information, automatically making decisions on our behalf all the time. But as long as the algorithms remain a ‘black box’, we don’t know exactly how these decisions are made.

Are these algorithms always fair? Examples of possible racial bias in algorithms include the risk analysis score that is calculated for prisoners that are up for parole or release (white people appear to get more favourable scores more often) and the service quality of Uber in Washington DC (waiting times are shorter in predominantly white neighbourhoods). Maybe such unfair results are not only due to the algorithms, but the lack of transparency remains a concern.

So what is going on in these algorithms, and how can we make them more accountable?
… (emphasis in original)

A great inspirational keynote but short on details for investigation of algorithms.

Such as failing to mention the algorithms of both Google and Facebook are secret.

Reverse engineering those from results would be a neat trick.

Google would be the easier of the two, since you could script searches domain by domain with a list of search terms to build up a data set of its results. That would not result in the algorithm per se but you could detect some of its contours.
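A rough sketch of what "script searches domain by domain" could look like. It only builds the query grid with the site: search operator and records whatever a caller-supplied fetch function returns. The fetch function is hypothetical: how you obtain ranked URLs for a query (and whether the terms of service allow it) is left to you, and the field names are my own.

```python
from itertools import product
from typing import Callable, Dict, Iterable, List


def build_queries(domains: Iterable[str], terms: Iterable[str]) -> List[Dict[str, str]]:
    """Pair every domain with every search term using the site: operator."""
    return [
        {"domain": domain, "term": term, "query": f"site:{domain} {term}"}
        for domain, term in product(domains, terms)
    ]


def collect_results(
    domains: Iterable[str],
    terms: Iterable[str],
    fetch: Callable[[str], List[str]],
) -> List[Dict[str, object]]:
    """Run each query through fetch (hypothetical) and record ranked URLs."""
    rows = []
    for entry in build_queries(domains, terms):
        # Rank starts at 1 to mirror position on a results page.
        for rank, url in enumerate(fetch(entry["query"]), start=1):
            rows.append({**entry, "rank": rank, "url": url})
    return rows
```

Repeating the same grid over days or from different locations gives you rows you can compare, which is where the contours start to show.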

Google has been accused of liberal bias (Who would Google vote for? An analysis of political bias in internet search engine results), of bias in favor of Hillary Clinton (Google defends its search engine against charges it favors Clinton), and of bias in favor of the right wing (How Google’s search algorithm spreads false information with a rightwing bias).

To the extent you identify Hillary Clinton with the rightwing, those results may be expressions of the same bias.

In any event, you can discern from those studies some likely techniques to use in testing Google search/auto-completion results.

Facebook is harder because you don’t have access to or control over the content it is manipulating for delivery. Although by manipulating social media identities, you could test and compare the content that Facebook delivers.
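One simple way to compare what different test identities are shown is set overlap. The sketch below assumes you have already recorded, by whatever means, a set of item identifiers per identity. The identity labels and item IDs are placeholders, and Jaccard similarity is my choice of measure, not something taken from the keynote.

```python
from itertools import combinations
from typing import Dict, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Share of items two feeds have in common (1.0 = identical sets)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


def compare_feeds(feeds: Dict[str, Set[str]]) -> Dict[Tuple[str, str], float]:
    """Jaccard overlap for every pair of test identities."""
    return {
        (left, right): jaccard(feeds[left], feeds[right])
        for left, right in combinations(sorted(feeds), 2)
    }
```

Low overlap between identities that differ in only one attribute is a hint about what that attribute does to delivery, nothing more.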

Breaking News Consumer’s Handbook

May 22nd, 2017

From a tweet by @onthemedia, see their website: onthemedia.org.

If you follow #2:

2. Don’t trust anonymous sources.

Skip political reports in the New York Times and Washington Post.

Is there a market for delayed news?

I ask because I understand there was an explosion in Manchester Arena in England at 10:35 PM local time. Even as I type this, mis-information is flooding social media channels from any number of sources.

What if there was a news service with a variable delay, say minimum 7 days but maximum of 14 days, that delivered a coherent and summarized version of breaking events?

As opposed to the click-bait teasers that get shared/forwarded/re-tweeted without anyone reading the mis-information behind the click-bait.

Weaponizing GPUs (Terrorism)

May 22nd, 2017

Nvidia reports in: Modeling Cities in 3D Using Only Image Data:

ETH Zurich scientists leveraged deep learning to automatically stitch together millions of public images and video into a three-dimensional, living model of the city of Zurich.

The platform called “VarCity” combines a variety of different image sources: aerial photographs, 360-degree panoramic images taken from vehicles, photos published by tourists on social networks and video material from YouTube and public webcams.

“The more images and videos the platform can evaluate, the more precise the model becomes,” says Kenneth Vanhoey, a postdoc in the group led by Luc Van Gool, a Professor at ETH Zurich’s Computer Vision Lab. “The aim of our project was to develop the algorithms for such 3D city models, assuming that the volume of available images and videos will also increase dramatically in the years ahead.”

Using a cluster of GPUs including Tesla K40s with cuDNN to train their deep learning models, the technology recognizes image content such as buildings, windows and doors, streets, bodies of water, people, and cars. Without human assistance, the 3D model “knows”, for example, what pavements are and – by evaluating webcam data – which streets are one-way only.

The data/information gap between nation states and non-nation state groups grows narrower every day. Here, GPUs and deep learning produce planning data terrorists could only have dreamed about twenty years ago.

Technical advances make precautions such as:

Federal, state, and local law enforcement let people know that if they take pictures or notes around monuments and critical infrastructure facilities, they could be subject to an interrogation or an arrest; in addition to the See Something, Say Something awareness campaign, DHS also has broader initiatives such as the Buffer Zone Protection Program, which teach local police and security how to spot potential terrorist activities. (DHS focus on suspicious activity at critical infrastructure facilities)

sound old fashioned and quaint.

Such measures annoy tourists, but unless potential terrorists are as dumb as the underwear bomber, they accomplish little against a skilled adversary.

I guess that’s the question, isn’t it?

Are you planning to fight terrorists from the shallow end of the gene pool, or someone a little more challenging?

The Secrets of Technical Writing

May 22nd, 2017

The Secrets of Technical Writing by Matthew Johnston.

From the post:

The process of writing code, building apps, or developing websites is always evolving, with improvements in coding tools and practices constantly arriving. But one aspect hasn’t really been brought along for the journey, passed-by in the democratisation of learning that the internet has brought about, and that’s the idea of writing about code.

Technical writing is one of the darkest of dark arts in the domain of code development: you won’t find too many people talking about it, you won’t find too many great examples of it, and even hugely successful tech companies have no idea how to handle it.

So, in an effort to change that, I’m going to share with you what I’ve learnt about technical writing from building Facebook’s Platform docs, providing documentation assistance to their Open Source projects, and creating a large, multi-part tutorial for Facebook’s F8 conference in 2016. When I talk about the struggles of writing docs, I’ve seen it happen at the biggest and best of tech companies, and I’ve experienced how difficult it can be to get it right.

These tips aren’t perfect, they aren’t applicable to everything, and I’m not at an expert-level of technical writing, but I think it’s important to share thoughts on this, and help bring technical writing up to par with the rest of code development.

Note that this is from the perspective of writing technical docs, it can just as easily apply to shorter tutorials, blog posts, presentations, or talks.

The best tip of the lot: start early! Don’t wait until just before launch to hack some documentation together.

If you don’t have the cycles, I know someone who might. 😉

More Dicking With The NSA

May 21st, 2017

Privacy-focused Debian 9 ‘Stretch’ Linux-based operating system Tails 3.0 reaches RC status by Brian Fagioli.

From the post:

If you want to keep the government and other people out of your business when surfing the web, Tails is an excellent choice. The Linux-based operating system exists solely for privacy purposes. It is designed to run from read-only media such as a DVD, so that there are limited possibilities of leaving a trail. Of course, even though it isn’t ideal, you can run it from a USB flash drive too, as optical drives have largely fallen out of favor with consumers.

Today, Tails achieves an important milestone. Version 3.0 reaches RC status — meaning the first release candidate (RC1). In other words, it may soon be ready for a stable release — if testing confirms as much. If you want to test it and provide feedback, you can download the ISO now.

Fagioli covers some of the details but the real story is this:

The sooner testers (that can include you) confirm the stability, etc., of Tails Version 3.0 (RC1), the sooner it can be released for general use.

In part, the release schedule for Tails Version 3.0 (RC1) depends on you.

Your response?

Check Fagioli’s post for links to the release and docs.

immersive linear algebra

May 21st, 2017

immersive linear algebra by J. Ström, K. Åström, and T. Akenine-Möller.

Billed as:

The world’s first linear algebra book with fully interactive figures.

From the preface:

“A picture says more than a thousand words” is a common expression, and for text books, it is often the case that a figure or an illustration can replace a large number of words as well. However, we believe that an interactive illustration can say even more, and that is why we have decided to build our linear algebra book around such illustrations. We believe that these figures make it easier and faster to digest and to learn linear algebra (which would be the case for many other mathematical books as well, for that matter). In addition, we have added some more features (e.g., popup windows for common linear algebra terms) to our book, and we believe that those features will make it easier and faster to read and understand as well.

After using linear algebra for 20 years times three persons, we were ready to write a linear algebra book that we think will make it substantially easier to learn and to teach linear algebra. In addition, the technology of mobile devices and web browsers have improved beyond a certain threshold, so that this book could be put together in a very novel and innovative way (we think). The idea is to start each chapter with an intuitive concrete example that practically shows how the math works using interactive illustrations. After that, the more formal math is introduced, and the concepts are generalized and sometimes made more abstract. We believe it is easier to understand the entire topic of linear algebra with a simple and concrete example cemented into the reader’s mind in the beginning of each chapter.

Please contact us if there are errors to report, things that you think should be improved, or if you have ideas for better exercises etc. We sincerely look forward to hearing from you, and we will continuously improve this book, and add contributing people to the acknowledgement.
… (popups omitted)

Unlike some standards I could mention, but won’t, the authors number just about everything, making it easy to reference equations, illustrations, etc.

Enjoy!

Gotta Minute To Help @WikiCommons?

May 21st, 2017

Wikimedia NYC tweeted and Michael Peter Edson retweeted:

I know. Moving images from one silo to another.

But, it does increase the odds of @WikiCommons users finding the additional images. That’s a good thing.

Take a minute to visit https://metmuseum.org/art/collection, select the public domain facet and grab an image to upload to Wikimedia Commons.

The process is quite painless. I uploaded The Pit of Acheron, or the Birth of the Plagues of England today.

With practice it should take less than a minute but I got diverted looking for more background on the image.

Rowlandson the Caricaturist: A Selection from His Works, with Anecdotal Descriptions of His Famous Caricatures and a Sketch of His Life, Times, and Contemporaries, Volume 1 by Joseph Grego, J. W. Bouton, New York, 1880, page 112:

January 1. 1784. The Pit of Acheron, or the Birth of the Plagues of England. —

The Pit of Acheron, if we may trust the satirist, is not situated at any considerable distance from Westminster; the precincts of that city appear through the smoke of the incantations which are carried on in the Pit. Three weird sisters, like the Witches in ‘Macbeth,’ are working the famous charm; a monstrous cauldron is supported by death’s-heads and harpies; the ingredients of the broth are various; a crucifix, a rosary, Deceit, Loans, Lotteries, and Pride, together with a fox’s head, cards, dice, daggers, and an executioner’s axe, &c., form portions of the accessories employed in these uncanny rites. Three heads are rising from the flames—the good-natured face of Lord North, the spectacled and incisive outline of Burke, and Fox’s ‘gunpowder jowl,’ which is drifting Westminster-wards. One hag, who is dropping Rebellion into the brew, is demanding, ‘Well, sister, what hast thou got for the ingredients of our charm’d pot?’ To this her fellow-witch, who is turning out certain mischievous ingredients which she has collected in her bag, is responding, ‘A beast from Scotland called an Erskine, famous for duplicity, low art, and cunning; the other a monster who’d spurn even at Charter’s Rights.’ Erskine is shot out of the bag, crying, ‘I am like a Proteus, can turn any shape, from a sailor to a lawyer, and always lean to the strongest side!’ The other member, whose tail is that of a serpent, is singing, ‘Over the water and over the lee, thro’ hell I would follow my Charlie.’

I remain uncertain about the facts and circumstances surrounding the Westminster election of 1784 that would further explain this satire. Perhaps another day.

If you can’t wait, consider reading History of the Westminster Election, containing Every Material Occurrence, from its commencement On the First of April to the Close of the Poll, on the 17th of May, to which is prefixed A Summary Account of the Proceedings of the Late Parliament by James Hartley. (562 pages)

Rowlandson was also noted for his erotica: collection of erotica by Rowlandson.

Global Investigative Journalism Network: Russian Feed

May 21st, 2017

Global Investigative Journalism Network has added a Twitter feed in Russian: @gijnRu!

Great way for journalists to learn/reinforce their skills with Russian.

You can rely on The New York Times or the Washington Post as primary sources for the next 1339 days (as of today, Trump presidency) or you can strike out on your own.

As an editor, I would tire pretty quickly of “…as reported in NYT/WaPo….”

You?

Why Terrorism Sells

May 20th, 2017

Daniel Gilbert, Edgar Pierce Professor of Psychology at Harvard University, explains the lack of a focused response on global warming, and incidentally explains the popularity of terrorism in one presentation.

When I say “popularity of terrorism,” I don’t mean terrorism is widespread, but fear of terrorism is, and funding to combat terrorism defies accounting.

Terrorism has four characteristics, all of which global warming lacks:

  • Intentional: We are hard-wired to judge the intent of others.
  • Immoral: Food/sex rules. Killing us falls under “immoral.”
  • Imminent: Clear and present danger. (As in maybe today.)
  • Instantaneous: Bombs, bullets, fast enough to be dangers.

Gilbert’s focus was on climate change but his presentation has helped me understand why terrorism sells.

Here is an image of the human brain Gilbert uses in his presentation:

The part of most brains that is fighting terrorism?

That would be the big dark blue part.

The part capable of recognizing that the odds of death by terrorist and death by asteroid are about the same?

That would be the small red part.

Assuming the small red part, which does planning, etc., isn’t overwhelmed by plotting routes to banks for the money you have earned fighting terrorism.

Why my sales pitch on terrorism fails: I’m pushing against decisions made by the big dark blue part that benefit the small red part (career, success, profit).

Two lessons from Gilbert’s presentation:

First, look for issues/needs with these characteristics:

  • Intentional: We are hard-wired to judge the intent of others.
  • Immoral: Food/sex rules. Killing us falls under “immoral.”
  • Imminent: Clear and present danger. (As in maybe today.)
  • Instantaneous: Bombs, bullets, fast enough to be dangers.

Second, craft a sales pitch to the big dark blue part of the brain that benefits the small red part of the brain (career, success, profit).

If you or a company you know has a pitch man/woman who can handle the fear angle, I’m looking for work.

Just keep me away from your fearful clients. 😉

SketchRNN model released in Magenta [Hieroglyphs/Cuneiform Anyone?]

May 19th, 2017

SketchRNN model released in Magenta by Douglas Eck.

From the post:

Sketch-RNN, a generative model for vector drawings, is now available in Magenta. For an overview of the model, see the Google Research blog from April 2017, Teaching Machines to Draw (David Ha). For the technical machine learning details, see the arXiv paper A Neural Representation of Sketch Drawings (David Ha and Douglas Eck).

To try out Sketch-RNN, visit the Magenta GitHub for instructions. We’ve provided trained models, code for you to train your own models in TensorFlow and a Jupyter notebook tutorial (check it out!)

The code release is timed to coincide with a Google Creative Lab data release. Visit Quick, Draw! The Data for more information. For versions of the data pre-processed to work with Sketch-RNN, please refer to the GitHub repo for more information.

We’ll leave you with a look at yoga poses generated by moving through the learned representation (latent space) of the model trained on yoga drawings. Notice how it gets confused at around 10 seconds when it moves from poses standing towards poses done on a yoga mat. In our arXiv paper A Neural Representation of Sketch Drawings we discuss reasons for this behavior.

The paper, A Neural Representation of Sketch Drawings, mentions:


ShadowDraw [17] is an interactive system that predicts what a finished drawing looks like based on a set of incomplete brush strokes from the user while the sketch is being drawn. ShadowDraw used a dataset of 30K raster images combined with extracted vectorized features. In this work, we use a much larger dataset of vector sketches that is made publicly available.

ShadowDraw is described at: ShadowDraw: Real-Time User Guidance for Freehand Drawing as:


We present ShadowDraw, a system for guiding the freeform drawing of objects. As the user draws, ShadowDraw dynamically updates a shadow image underlying the user’s strokes. The shadows are suggestive of object contours that guide the user as they continue drawing. This paradigm is similar to tracing, with two major differences. First, we do not provide a single image from which the user can trace; rather ShadowDraw automatically blends relevant images from a large database to construct the shadows. Second, the system dynamically adapts to the user’s drawings in real-time and produces suggestions accordingly. ShadowDraw works by efficiently matching local edge patches between the query, constructed from the current drawing, and a database of images. A hashing technique enforces both local and global similarity and provides sufficient speed for interactive feedback. Shadows are created by aggregating the top retrieved edge maps, spatially weighted by their match scores. We test our approach with human subjects and show comparisons between the drawings that were produced with and without the system. The results show that our system produces more realistically proportioned line drawings.
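The aggregation step in that abstract ("aggregating the top retrieved edge maps, spatially weighted by their match scores") can be illustrated with a toy per-pixel weighted average. This is not the ShadowDraw implementation. The grid representation, the normalization by total weight, and the function name are all assumptions made for illustration.

```python
from typing import List

Grid = List[List[float]]


def blend_edge_maps(edge_maps: List[Grid], weight_maps: List[Grid]) -> Grid:
    """Per-pixel weighted average of edge maps; zero where all weights are zero."""
    rows, cols = len(edge_maps[0]), len(edge_maps[0][0])
    shadow = [[0.0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            # Total match weight at this pixel across all retrieved maps.
            total = sum(w[r][c] for w in weight_maps)
            if total > 0:
                weighted = sum(e[r][c] * w[r][c] for e, w in zip(edge_maps, weight_maps))
                shadow[r][c] = weighted / total
    return shadow
```

Each weight map carries the match score for one retrieved image at each location, so a strong local match contributes more to the shadow there than a weak one.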

My first thought was using such techniques to assist in copying hieroglyphs or cuneiform, or perhaps in practicing such glyphs.

OK, that may not have been your first thought but you have to admit it would make a rocking demonstration!