Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

March 24, 2018

The Dark Web = Freedom of Speech

Filed under: Censorship,Free Speech,Privacy — Patrick Durusau @ 4:49 pm

Freedom of speech has never been all that popular in the United States, and recently it has become even less so.

Craigslist personals, some subreddits disappear after FOSTA passage by Cyrus Farivar.

From the post:

In the wake of this week’s passage of the Allow States and Victims to Fight Online Sex Trafficking Act (FOSTA) bill in both houses of Congress on Wednesday, Craigslist has removed its “Personals” section entirely, and Reddit has removed some related subreddits, likely out of fear of future lawsuits.

FOSTA, which awaits the signature of President Donald Trump before becoming law, removes some portions of Section 230 of the Communications Decency Act. The landmark 1996 law shields website operators that host third-party content (such as commenters, for example) from civil liability. The new bill is aimed squarely at Backpage, a notorious website that continues to allow prostitution advertisements and has been under federal scrutiny for years.

I am deeply saddened to report that the House vote was 388 ayes and 25 noes and the Senate vote was 97 to 2.

You can follow the EFF’s lead as they piss and moan about this latest outrage. But all their activity (and fundraising) didn’t prevent its passage. So, what are the odds the EFF will get it repealed? That’s what I thought.

I’m not looking for Craigslist to jump to the Dark Web, but subreddits certainly should be able to make the switch. The more subreddits, new sites, and services that move to the Dark Web, the more its usage and bandwidth will grow. I look forward to the day when the Dark Web is the default configuration of new computers, with the “open” web an optional choice carrying appropriate warnings.

If you are not (yet) a Dark Web jockey, try: How To Access Notorious Dark Web Anonymously (10 Step Guide). Enough to get you started and to demonstrate the potential of the Dark Web.

March 23, 2018

BaseX 9.0 – The Spring Edition – 229 Days to US Mid-Term Elections

Filed under: BaseX,Politics,XML,XQuery — Patrick Durusau @ 7:32 pm

Christian Grün writes:

We are very happy to announce the release of BaseX 9.0!

The new version of our XML database system and XQuery 3.1 processor includes some great new features and a vast number of minor improvements and optimizations. It’s both the usage of BaseX in productive environments as well as the valuable feedback of our open source users that make BaseX better and better, and that allow and motivate us to keep going. Thanks to all of you!

Along with the new release, we invite you to visit our relaunched homepage: http://basex.org/.

Java 8 is now required to run BaseX. The most prominent features of Version 9.0 are:

Sorry! No spoilers here! Grab a copy of BaseX 9.0 and read Christian’s post for the details.

Take the 229 days until the US mid-term elections (November 6, 2018) as fair warning that email leaks are possible (likely?) between now and election day.

The better your skills with BaseX, the better chance you have to interfere with, sorry, participate in the 2018 election cycle.
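By way of illustration, here is a minimal sketch of querying an email archive from Python, assuming the BaseXClient bindings that ship with BaseX and an archive already converted to XML. The database name, filesystem path, and element names below are all invented:

```python
# Minimal sketch, assuming the BaseXClient bindings distributed with
# BaseX and a BaseX server on localhost. Database name, path and
# element names are invented for illustration.
from BaseXClient import BaseXClient

session = BaseXClient.Session('localhost', 1984, 'admin', 'admin')
try:
    # Build a database from a directory of XML-ified emails.
    session.execute('create db leak /data/emails')
    # XQuery 3.1: who sent messages mentioning a given name?
    result = session.execute(
        "xquery for $m in //message[contains(lower-case(body), 'candidate')] "
        "return $m/from/text()")
    print(result)
finally:
    session.close()
```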

Good luck to us all!

March 11, 2018

Phishing, The 43% Option

Filed under: Cybersecurity,Politics,Security — Patrick Durusau @ 2:54 pm

How’s that for a motivational poster?

You can, and some do, spend hours plumbing in the depths of code or chip design for vulnerabilities.

Or, you can look behind door #2, the phishing door, and find 43% of data breaches start with phishing.

Phishing doesn’t have the glamor or prestige of finding a Meltdown or Spectre bug.

But, on the other hand, do you want to breach a congressional email account for the 2018 mid-term election, or for the 2038 election?

Just so you know, no rumors of breached congressional email accounts have surfaced, at least not yet.

Ping me if you see any such news.

PS: The tweet points to: https://qz.com/998949/can-you-outwit-a-hacker/, an ad for AT&T.

Spreading “Fake News,” Science Says It Wasn’t Russian Bots

Filed under: Fake News,Politics,Twitter — Patrick Durusau @ 2:04 pm

The spread of true and false news online by Soroush Vosoughi, Deb Roy, and Sinan Aral. (Science 09 Mar 2018: Vol. 359, Issue 6380, pp. 1146-1151 DOI: 10.1126/science.aap9559)

Abstract:

We investigated the differential diffusion of all of the verified true and false news stories distributed on Twitter from 2006 to 2017. The data comprise ~126,000 stories tweeted by ~3 million people more than 4.5 million times. We classified news as true or false using information from six independent fact-checking organizations that exhibited 95 to 98% agreement on the classifications. Falsehood diffused significantly farther, faster, deeper, and more broadly than the truth in all categories of information, and the effects were more pronounced for false political news than for false news about terrorism, natural disasters, science, urban legends, or financial information. We found that false news was more novel than true news, which suggests that people were more likely to share novel information. Whereas false stories inspired fear, disgust, and surprise in replies, true stories inspired anticipation, sadness, joy, and trust. Contrary to conventional wisdom, robots accelerated the spread of true and false news at the same rate, implying that false news spreads more than the truth because humans, not robots, are more likely to spread it.

Real data science: the team had access to all the Twitter data, not a cherry-picked selection that “of course can’t be shared due to Twitter rules,” as ISIS propaganda scholars like to say.

The paper merits a slow read but highlights for the impatient:

  1. Don’t invest in bots or high-profile Twitter users for the 2018 mid-term elections.
  2. Craft messages with a high novelty factor that disfavor your candidate’s opponents.
  3. Your messages should inspire fear, disgust and surprise.

Democrats working hard to lose the 2018 mid-terms will cry you a river about issues, true facts, engagement, and a host of other ideas used to explain losses to losers.

There’s still time to elect a progressive Congress in 2018.

Are you game?

March 8, 2018

Contesting the Right to Deliver Disinformation

Filed under: Fake News,Journalism,News — Patrick Durusau @ 8:42 pm

Eric Singerman reports on a recent conference titled: Understanding and Addressing the Disinformation Ecosystem.

He summarizes the conference saying:

The problem of mis- and disinformation is far more complex than the current obsession with Russian troll factories. It’s the product of the platforms that distribute this information, the audiences that consume it, the journalist and fact-checkers that try to correct it – and even the researchers who study it.

In mid-December, First Draft, the Annenberg School of Communication at the University of Pennsylvania and the Knight Foundation brought academics, journalists, fact-checkers, technologists and funders together in a two-day workshop to discuss the challenges produced by the current disinformation ecosystem. The convening was intended to highlight relevant research, share best-practices, identify key questions of scholarly and practical concern and outline a potential research agenda designed to answer these questions.

In preparation for the workshop, a number of attendees prepared short papers that could act as starting points for discussion. These papers covered a broad range of topics – from the ways that we define false and harmful content, to the dystopian future of computer-generated visual disinformation.

Download the papers here.

Singerman points out that the very first essay concedes “fake news” isn’t anything new, although I would read Schudson and Zelizer (authors of the first paper) with care. They contend:


Fake news lessened in centrality only in the late 1800s as printed news, particularly in Britain and the United States, came to center on what Jean Chalaby called “fact-centered discursive practices” and people realized that newspapers could compete with one another not simply on the basis of partisan affiliation or on the quality of philosophical and political essays but on the immediacy and accuracy of factual reports (Chalaby 1996).

I’m sorry, that’s just factually incorrect. The 1890s were the age of “yellow journalism,” a statement confirmed by the Digital Public Library of America’s resource collection: Fake News in the 1890s: Yellow Journalism:

Alternative facts, fake news, and post-truth have become common terms in the contemporary news industry. Today, social media platforms allow sensational news to “go viral,” crowdsourced news from ordinary people to compete with professional reporting, and public figures in offices as high as the US presidency to bypass established media outlets when sharing news. However, dramatic reporting in daily news coverage predates the smartphone and tablet by over a century. In the late nineteenth century, the news media war between Joseph Pulitzer’s New York World and William Randolph Hearst’s New York Journal resulted in the rise of yellow journalism, as each newspaper used sensationalism and manipulated facts to increase sales and attract readers.

Many trace the origin of yellow journalism to coverage of the sinking of the USS Maine in Havana Harbor on February 15, 1898, and America’s entry in the Spanish-American War. Both papers’ reporting on this event featured sensational headlines, jaw-dropping images, bold fonts, and aggrandizement of facts, which influenced public opinion and helped incite America’s involvement in what Hearst termed the “Journal’s War.”

The practice, and nomenclature, of yellow journalism actually predates the war, however. It originated with a popular comic strip character known as The Yellow Kid in Hogan’s Alley. Created by Richard F. Outcault in 1895, Hogan’s Alley was published in color by Pulitzer’s New York World. When circulation increased at the New York World, William Randolph Hearst lured Outcault to his newspaper, the New York Journal. Pulitzer fought back by hiring another artist to continue the comic strip in his newspaper.

The period of peak yellow journalism by the two New York papers ended in the late 1890s, and each shifted priorities, but still included investigative exposés, partisan political coverage, and other articles designed to attract readers. Yellow journalism, past and present, conflicts with the principles of journalistic integrity. Today, media consumers will still encounter sensational journalism in print, on television, and online, as media outlets use eye-catching headlines to compete for audiences. To distinguish truth from “fake news,” readers must seek multiple viewpoints, verify sources, and investigate evidence provided by journalists to support their claims.

You can see the evidence relied upon by the DPLA for its claims about yellow journalism here: Fake News in the 1890s: Yellow Journalism.

Why Schudson and Zelizer thought Chalaby, J. “Journalism as an Anglo-American Invention,” European Journal of Communication 11 (3), 1996, 303-326, supported their case isn’t clear.

If you read the Chalaby article, you find it is primarily concerned with contrasting the French press with Anglo-American practices, a comparison in which the French come off a distant second best.

More to the point, neither the New York World, the New York Journal, nor yellow journalism appears anywhere in the Chalaby article. Check for yourself: Journalism as an Anglo-American Invention.

Chalaby does place the origin of “fact-centered discursive practices” in the 1890s, but the absence of any mention of the journalism that led to the Spanish-American War casts doubt on how much we should credit Chalaby’s knowledge of US journalism.

I haven’t checked the other footnotes of Schudson and Zelizer; I leave that as an exercise for interested readers.

I do think Schudson and Zelizer capture the main driver of concern over “fake news” when they say:

First, there is a great anxiety today about the border between professional journalists and others who through digital media have easy access to promoting their ideas, perspectives, factual reports, pranks, inanities, conspiracy theories, fakes and lies….

Despite being framed as a contest between factual reporting and disinformation, the dispute over disinformation/fake news is over the right to profit from disinformation/fake news.

If you need a modern example of yellow journalism, consider the ongoing media frenzy over Russian “interference” in US elections.

How often do you hear reports that include, as context, instances of US-sponsored assassinations, US-funded and -armed government overthrows, and active US military interference with both elections and governments?

What? Some Russians bought Facebook ads and used election hashtags on Twitter? That compares to overthrowing other governments? See The long history of the U.S. interfering with elections elsewhere (the tip of the iceberg).

The constant hyperbole in the “Russian interference” story is a clue that journalists and social media are re-enacting the roles played by the New York World and the New York Journal, which led to the Spanish-American War.

Truth be told, we should thank social media for the free distribution of disinformation, previously available only by subscription.

Discerning what is or is not accurate information falls, as it ever has, on the shoulders of readers.

Confluent: Mapping @apachekafka Connect schema types – to the usual suspects

Filed under: Database,Kafka — Patrick Durusau @ 4:27 pm

Confluent has posted a handy mapping from Kafka Connect schema types to MySQL, Oracle, PostgreSQL, SQLite, SQL Server and Vertica.

The sort of information I will waste 10 to 15 minutes finding every time I need it. Posting it here means I’ll cut the wasted time down to maybe 5 minutes, if I remember I posted about it. 😉
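Better still, the mapping is small enough to keep as a local lookup table. A sketch in Python with an illustrative subset; the values below are typical JDBC sink choices, not a verbatim copy of Confluent’s table, so verify against the original:

```python
# Illustrative subset of a Kafka Connect schema type -> SQL column
# type lookup. Values are typical JDBC sink choices, NOT a verbatim
# copy of Confluent's table; check the original before relying on it.
CONNECT_TO_SQL = {
    "INT8":    {"mysql": "TINYINT",         "postgresql": "SMALLINT"},
    "INT32":   {"mysql": "INT",             "postgresql": "INT"},
    "INT64":   {"mysql": "BIGINT",          "postgresql": "BIGINT"},
    "FLOAT64": {"mysql": "DOUBLE",          "postgresql": "DOUBLE PRECISION"},
    "BOOLEAN": {"mysql": "TINYINT",         "postgresql": "BOOLEAN"},
    "STRING":  {"mysql": "VARCHAR(256)",    "postgresql": "TEXT"},
    "BYTES":   {"mysql": "VARBINARY(1024)", "postgresql": "BYTEA"},
}

def sql_type(connect_type: str, dialect: str) -> str:
    """Return the SQL column type for a Kafka Connect schema type."""
    return CONNECT_TO_SQL[connect_type][dialect]

print(sql_type("INT64", "postgresql"))  # BIGINT
```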

Digital Public Library of America (DPLA) Has New Website!

Filed under: #DAPL,Library,Library Associations — Patrick Durusau @ 4:04 pm

Announcing the Launch of our New Website (the chest-beating announcement)

From the post:

The Digital Public Library of America (DPLA) is pleased to unveil its all-new redesigned website, now live at https://dp.la. Created in collaboration with renowned design firm Postlight, DPLA’s new website is more user-centered than ever before, with a focus on the tools, resources, and information that matter most to DPLA researchers and learners of all kinds. In a shift from the former site structure, content that primarily serves DPLA’s network of partners and others interested in deeper involvement with DPLA can now be found on DPLA Pro.

You can boil the post down to two links: DPLA (DPLA Resources) and DPLA Pro (helping DPLA build and spread resources). What more needs to be said?

Oh, yeah, donate to support the DPLA!

March 6, 2018

Numba Versus C++ – On Wolfram CAs

Filed under: C/C++,Cellular Automata,Programming,Python — Patrick Durusau @ 7:49 pm

Numba Versus C++ by David Butts, Gautham Dharuman, Bill Punch and Michael S. Murillo.

Python is a programming language that first appeared in 1991; soon, it will have its 27th birthday. Python was created not as a fast scientific language, but rather as a general-purpose language. You can use Python as a simple scripting language or as an object-oriented language or as a functional language…and beyond; it is very flexible. Today, it is used across an extremely wide range of disciplines and is used by many companies. As such, it has an enormous number of libraries and conferences that attract thousands of people every year.

But, Python is an interpreted language, so it is very slow. Just how slow? It depends, but you can count on about 10-100 times as slow as, say, C/C++. If you want fast code, the general rule is: don’t use Python. However, a few more moments of thought lead to a more nuanced perspective. What if you spend most of the time coding, and little time actually running the code? Perhaps your familiarity with the (slow) language, or its vast set of libraries, actually saves you time overall? And, what if you learned a few tricks that made your Python code itself a bit faster? Maybe that is enough for your needs? In the end, for true high performance computing applications, you will want to explore fast languages like C++; but, not all of our needs fall into that category.

As another example, consider the fact that many applications use two languages, one for the core code and one for the wrapper code; this allows for a smoother interface between the user and the core code. A common use case is C or C++ wrapped by, of course, Python. As a user, you may not even know that the code you are using is in another language! Such a situation is referred to as the “two-language problem”. This situation is great provided you don’t need to work in the core code, or you don’t mind working in two languages – some people don’t mind, but some do. The question then arises: if you are one of those people who would like to work only in the wrapper language, because it was chosen for its user friendliness, what options are available to make that language (Python in this example) fast enough that it can also be used for the core code?

We wanted to explore these ideas a bit further by writing a code in both Python and C++. Our past experience suggested that while Python is very slow, it could be made about as fast as C using the crazily-simple-to-use library Numba. Our basic comparisons here are: basic Python, Numba and C++. Because we are not religious about Python, and you shouldn’t be either, we invited expert C++ programmers to have the chance to speed up the C++ as much as they could (and, boy could they!).

This webpage is highly annoying, in both Mozilla and Chrome. You’ll have to visit to get the full impact.

It is, however, also a great post on using Numba to obtain much faster results while still using Python. The use of Wolfram CAs (cellular automata) as examples is an added bonus.
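If you want a feel for what the post is benchmarking, here is my own toy version of an elementary Wolfram CA (rule 30), JIT-compiled with Numba. A sketch of the technique, not the authors’ benchmark code:

```python
# Toy elementary cellular automaton (rule 30), JIT-compiled with
# Numba's @njit. A sketch of the technique, not the post's benchmark.
import numpy as np
from numba import njit

@njit
def evolve(rule, steps, width):
    rows = np.zeros((steps, width), dtype=np.uint8)
    rows[0, width // 2] = 1  # single live cell in the middle of row 0
    for t in range(1, steps):
        for i in range(width):
            left = rows[t - 1, (i - 1) % width]
            mid = rows[t - 1, i]
            right = rows[t - 1, (i + 1) % width]
            # The 3-cell neighborhood indexes into the rule's bits.
            rows[t, i] = (rule >> (4 * left + 2 * mid + right)) & 1
    return rows

grid = evolve(30, 200, 401)  # rule, steps, width
print(grid.sum())            # live cells across the whole run
```

The first call pays the JIT compilation cost; subsequent calls run at compiled speed, which is the effect the post measures.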

Enjoy!

March 1, 2018

An Interactive Timeline of the Most Iconic Infographics

Filed under: Graphics,Infographics,Visualization — Patrick Durusau @ 8:26 pm

Map of Firsts: An Interactive Timeline of the Most Iconic Infographics by R. J. Andrews.

Careful with this one!

You might learn some history as well as discovering an infographic for your next project!

Enjoy!

MSDAT: Microsoft SQL Database Attacking Tool

Filed under: Cybersecurity,Database,Security — Patrick Durusau @ 9:30 am

MSDAT: Microsoft SQL Database Attacking Tool

From the webpage:

MSDAT (Microsoft SQL Database Attacking Tool) is an open source penetration testing tool that tests the security of Microsoft SQL Databases remotely.

Usage examples of MSDAT:

  • You have a Microsoft database listening remotely and you want to find valid credentials in order to connect to the database
  • You have a valid Microsoft SQL account on a database and you want to escalate your privileges
  • You have a valid Microsoft SQL account and you want to execute commands on the operating system hosting this DB (xp_cmdshell)

Tested on Microsoft SQL database 2005, 2008 and 2012.

As I mentioned yesterday, you may have to wait a few years until the Office of Personnel Management (OPM) upgrades to a supported version of Microsoft SQL database, but think of the experience you will have gained with MSDAT by that time.

And by the time the OPM upgrades, new critical security flaws will emerge in Microsoft SQL database 2005, 2008 and 2012. Under current management, the OPM is becoming less and less secure over time.

Would it help if I posted a street/aerial view of OPM headquarters in DC? Would that help focus your efforts at dropping infected USB sticks, malware-loaded DVDs and insecure sex toys for OPM management to find?

OPM headquarters is not marked on the standard tourist map for DC. The map does suggest a number of other fertile places for your wares.

February 28, 2018

Liberals Amping Right Wing Conspiracies

Filed under: Fake News,News,Social Media,Social Networks — Patrick Durusau @ 9:19 pm

You read the headline correctly: Liberals Amping Right Wing Conspiracies.

It’s the only reasonable conclusion after reading Molly McKew‘s post: How Liberals Amped up a Paranoid Shooting Conspiracy Theory.

From the post:


This terminology camouflages the war for minds that is underway on social media platforms, the impact that this has on our cognitive capabilities over time, and the extent to which automation is being engaged to gain advantage. The assumption, for example, that other would-be participants in social media information wars who choose to use these same tactics will gain the same capabilities or advantage is not necessarily true. This is a playing field that is hard to level: Amplification networks have data-driven, machine learning components that work better with refinement over time. You can’t just turn one on and expect it to work perfectly.

The vast amounts of content being uploaded every minute cannot possibly be reviewed by human beings. Algorithms, and the poets who sculpt them, are thus given an increasingly outsized role in the shape of our information environment. Human minds are on a battlefield between warring AIs—caught in the crossfire between forces we can’t see, sometimes as collateral damage and sometimes as unwitting participants. In this blackbox algorithmic wonderland, we don’t know if we are picking up a gun or a shield.

McKew has a great description of the amplification in the Parkland shooting conspiracy case, but it’s after the fact and not a basis for predicting the next amplification event.

Any number of research projects suggest themselves:

  • Observing and testing social media algorithms against content
  • Discerning patterns in amplified content
  • Testing refinement of content
  • Building automated tools to apply lessons in amplification

No doubt all those are underway in various guises for any number of reasons. But are you going to share in those results to protect your causes?

Six Degrees of Wikipedia – Eye Candy or Opportunity for Serendipity?

Filed under: Search Interface,Serendipity — Patrick Durusau @ 3:32 pm

Six Degrees of Wikipedia

As the name implies, Six Degrees of Wikipedia finds the shortest path between two Wikipedia pages.
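Under the hood, “shortest path between two pages” is breadth-first search over the link graph. A toy sketch with invented adjacency data, nothing from the site’s actual index:

```python
# Breadth-first search over a toy link graph -- the idea behind
# Six Degrees of Wikipedia. The adjacency data here is invented.
from collections import deque

LINKS = {
    "Topic map": ["Semantic Web", "ISO/IEC 13250"],
    "Semantic Web": ["World Wide Web", "RDF"],
    "World Wide Web": ["Internet"],
    "RDF": ["Semantic Web"],
    "ISO/IEC 13250": ["Topic map"],
    "Internet": [],
}

def shortest_path(start, goal):
    """Return the first (hence shortest) path found by BFS, or None."""
    queue = deque([[start]])
    seen = {start}
    while queue:
        path = queue.popleft()
        if path[-1] == goal:
            return path
        for nxt in LINKS.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None

print(shortest_path("Topic map", "Internet"))
# ['Topic map', 'Semantic Web', 'World Wide Web', 'Internet']
```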

Whatis.com defines serendipity in part as:

In general, serendipity is the act of finding something valuable or delightful when you are not looking for it. In information technology, serendipity often plays a part in the recognition of a new product need or in solving a design problem. Web surfing can be an occasion for serendipity since you sometimes come across a valuable or interesting site when you are looking for something else.

Serendipity requires exposure to things you aren’t looking for; search engines excel at that, but their results are so noisy that serendipity is a rare occurrence.

Six Degrees of Wikipedia may have a different result.

First and foremost, humans created the links, for reasons unknown, that form the six degrees of separation. The resulting six degrees is a snapshot of input from dozens, if not hundreds, of human actors, all of whom had unknown motivations.

Second, the limitation to six degrees results in a graph whose nodes can be absorbed at a glance.

Compare that to the “I can make big and dense graphs” approach so typical in the “analysis” of social media results. (Hint: If any US government agency is asking, “The Russians did it.” is the correct response. Gin up supporting data on your own.)

Six degrees between topics would make a fascinating way to explore a topic map, especially one that merged topics from different domains. Randomly select labels to appear alongside those more familiar to a user. Provoke serendipity!

Covering Human Trafficking … Gulf Arab States (@GIJN)

Filed under: Journalism,News,Reporting — Patrick Durusau @ 2:51 pm

Guide to Covering Human Trafficking, Forced Labor & Undocumented Migration in Gulf Arab Countries by Migrant-Rights.org.

From the post:

Over 11 million migrant workers work in the six Middle Eastern countries — Saudi Arabia, Kuwait, the United Arab Emirates, Qatar, Bahrain and Oman — that make up the political and economic alliance known as the Gulf Cooperation Council (GCC). Migrants comprise an extraordinary 67 percent of the labor force in these countries. Reforms in labor laws, adopted by just a few Gulf countries, are rarely implemented.

Abuse of these workers is widespread, with contract violations, dangerous working conditions and unscrupulous traffickers, brokers and employers. Media outlets, both local and international, have generally not covered this topic closely. Journalists attempting to investigate human trafficking and forced labor in the region have faced a lack of information, restrictions on press freedom and security threats. Some have faced detention and deportation.

For these reasons, GIJN, in collaboration with human rights organizations, is launching this first bilingual guide to teach journalists best practices, tools and steps in reporting on human trafficking and forced labor in the Gulf region…

If you are reporting on any aspect of these issues, see also the GIJN’s global Reporting Guide to Human Trafficking & Slavery.

Be aware that residence in a Gulf Arab State isn’t a requirement for reporting on human trafficking.

The top port of entry for human trafficking in the United States is shown on this excerpt of a Google Map:

That’s right, the Hartsfield-Jackson Atlanta International Airport.

Despite knowing their port of entry, Hartsfield-Jackson has yet to make an arrest for human trafficking. (as of May 3, 2017)

Schemes such as Hartsfield-Jackson Wants Travelers to Be the ‘Eyes and Ears’ Detecting Sex Trafficking may explain their lack of success. Making it everyone’s responsibility means it’s no one’s responsibility.

Improvements aren’t hard to imagine. Separating adults without minors from those traveling with minors would be a first step. Separating minors from their accompanying adults, with native speakers who can speak with the minors privately, plus advertised guarantees of protection in the United States, would be another.

Those who could greatly reduce human trafficking have made a cost/benefit analysis and chosen to allow it to continue, in the Gulf Arab states, the United States and elsewhere.

I’m hopeful you will reach a different conclusion.

Supporting GIJN, Migrant-Rights.org, and your local reporters are all ways to assist in combating human trafficking. Data wranglers of all levels and hackers should volunteer their efforts.

February 27, 2018

Kiddie Hack – OPM

Filed under: Cybersecurity,Government,Security — Patrick Durusau @ 9:24 pm

Is it fair to point out that the Office of Personnel Management (OPM) continues to fail to plan upgrades to its security?

That’s right: it isn’t that OPM security upgrades are failing, but that OPM is failing to plan for security upgrades, three years after 21.5 million current and former federal employees’ data records were stolen from the OPM.

The inspector general report reads in part:


While we believe that the Plan is a step in the right direction toward modernizing OPM’s IT environment, it falls short of the requirements outlined in the Appropriations Act. The Plan identifies several modernization-related initiatives and allocates the $11 million amongst these areas, but the Plan does not identify the full scope of OPM’s modernization effort or contain cost estimates for the individual initiatives or the effort as a whole. All of the other capital budgeting, project planning, and IT security requirements are similarly missing.

At this rate, hackers are stockpiling gear slow enough to work with OPM systems.

Be careful on eBay and other online sources. No doubt the FBI is monitoring purchases of older computer gear.

February 26, 2018

FastPhotoStyle [Re-writing Dickens]

Filed under: Graphics,Visualization — Patrick Durusau @ 9:18 pm

The post leads with three images: a start photo, a style photo, and the result photo (start + style).

Impressive!

There are several other sample transformations at the webpage.

From the webpage:

This code repository contains an implementation of our fast photorealistic style transfer algorithm. Given a content photo and a style photo, the code can transfer the style of the style photo to the content photo. The details of the algorithm behind the code is documented in our arxiv paper. Please cite the paper if this code repository is used in your publications.

Yijun Li (UC Merced), Ming-Yu Liu (NVIDIA), Xueting Li (UC Merced), Ming-Hsuan Yang (NVIDIA, UC Merced), Jan Kautz (NVIDIA), “A Closed-form Solution to Photorealistic Image Stylization,” arXiv preprint arXiv:1802.06474.

Re-writing Dickens:


Marley: Why do you not believe your own eyes?

Scrooge: Software makes them a cheat! A pass of PhotoShop or a round with Gimp, to say nothing of fast photorealistic style transfer algorithms.

Doesn’t have the same ring to it, does it?

Forbes Vouches For Public Data Sources

Filed under: Artificial Intelligence,BigData — Patrick Durusau @ 8:48 pm

For Forbes readers, a demonstration built on one of the data sources in Bernard Marr’s Big Data And AI: 30 Amazing (And Free) Public Data Sources For 2018 (Forbes, Feb. 26, 2018) adds a ring of authenticity to your data. Marr, and by extension Forbes, has vouched for these data sets.

Beats the hell out of opera, medieval boys choirs, or irises for your demonstration. 😉

These data sets show up everywhere, but a reprint from Forbes, left with your (hopefully) future client, sets your data set apart from the others.

Tip: As interesting as it is, I’d skip the CERN Open Data unless you are presenting to physicists. Yes? Hint: Pick something relevant to your audience.

Guide to Searching CIA’s Declassified Archives

Filed under: CIA,Government — Patrick Durusau @ 5:08 pm

The ultimate guide to searching CIA’s declassified archives by Emma Best: “Looking to dig into the Agency’s 70-year history? Here’s where to start.”

From the webpage:

While the Agency deserves credit for compiling a basic guide to searching their FOIA reading room, it still omits information or leaves it spread out across the Agency’s website. In one egregious example, the CIA guide to searching the records lists only three content types that users can search for; a review of the metadata compiled by Data.World reveals an additional ninety content types. This guide will tell you everything you need to know to dive into CREST and start searching like a pro.

Great guide for anyone interested in the declassified CIA archives.

Enjoy!

#7 Believing that information leads to action (Myth of Liberals)

Filed under: Advertising,Persuasion,Politics — Patrick Durusau @ 3:29 pm

Top 10 Mistakes in Behavior Change

Slides from Stanford University’s Persuasive Tech Lab, http://captology.stanford.edu.

A great resource whether you are promoting a product or service, or trying to “interfere” with an already purchased election.

I have a special fondness for mistake #7 on the slides:

Believing that information leads to action

If you want to lose the 2018 mid-terms, or even worse, the presidential election in 2020, keep believing in “educating” voters.

Ping me if you want to be a winning liberal.

Governments Are Secure, But Only By Your Forbearance (happens-before (HB) graphs)

Filed under: Cybersecurity,Security — Patrick Durusau @ 3:05 pm

MeltdownPrime and SpectrePrime: Automatically-Synthesized Attacks Exploiting Invalidation-Based Coherence Protocols by Caroline Trippel, Daniel Lustig, Margaret Martonosi.

Abstract:

The recent Meltdown and Spectre attacks highlight the importance of automated verification techniques for identifying hardware security vulnerabilities. We have developed a tool for synthesizing microarchitecture-specific programs capable of producing any user-specified hardware execution pattern of interest. Our tool takes two inputs: a formal description of (i) a microarchitecture in a domain-specific language, and (ii) a microarchitectural execution pattern of interest, e.g. a threat pattern. All programs synthesized by our tool are capable of producing the specified execution pattern on the supplied microarchitecture.

We used our tool to specify a hardware execution pattern common to Flush+Reload attacks and automatically synthesized security litmus tests representative of those that have been publicly disclosed for conducting Meltdown and Spectre attacks. We also formulated a Prime+Probe threat pattern, enabling our tool to synthesize a new variant of each—MeltdownPrime and SpectrePrime. Both of these new exploits use Prime+Probe approaches to conduct the timing attack. They are both also novel in that they are 2-core attacks which leverage the cache line invalidation mechanism in modern cache coherence protocols. These are the first proposed Prime+Probe variants of Meltdown and Spectre. But more importantly, both Prime attacks exploit invalidation-based coherence protocols to achieve the same level of precision as a Flush+Reload attack. While mitigation techniques in software (e.g., barriers that prevent speculation) will likely be the same for our Prime variants as for original Spectre and Meltdown, we believe that hardware protection against them will be distinct. As a proof of concept, we implemented SpectrePrime as a C program and ran it on an Intel x86 processor, averaging about the same accuracy as Spectre over 100 runs—97.9% for Spectre and 99.95% for SpectrePrime.

A separate paper on the “tool” used in this article is under review, so more joy is on the way!

As a bonus, “happens-before (HB) graphs” are used, enabling exercise of those graph skills you built making cluttered Twitter graphs.

Good hunting!

February 22, 2018

Learning Drawing Skills To Help You Communicate

Filed under: Art,Graphics,Visualization — Patrick Durusau @ 9:19 pm

I sigh with despair every time I see yet another drawing by Julia Evans.

All of it is clever, clear and, without effort on my part, beyond me.

Yeah, it’s the “without effort on my part” that keeps me from learning basic drawing skills.

You’re never going to say of a drawing by me, “There’s a proper Julia Evans!”, but I don’t think basic drawing skills are beyond me, provided I take the time to practice.

How expensive are guidebooks? Does free sound OK?

Both by E.G. Lutz: What to Draw and How to Draw It (1913) and Drawing Made Easy (1935).

BTW, Lutz inspired Walt Disney with: Animated Cartoons: How They Are Made, Their Origin and Development.

I found this at The Public Domain Review. Support for them is always a good idea.

Of course I would rather be exploring nuances of XQuery, but that’s because XQuery is already familiar.

It’s trying the unfamiliar that leads to new skills, hopefully. 😉

Comparing Comprehensive English Grammars?

Filed under: Grammar,Language — Patrick Durusau @ 8:17 pm

Neal Goldfarb, in SCOTUS cites CGEL (Props to Justice Gorsuch and the Supreme Court library), highlights two comprehensive grammars of English.

Both are known by the initials CGEL:

Being the more recent work, the Cambridge Grammar of the English Language lists today for $279.30 (1,860 pages), whereas Quirk’s 1985 Comprehensive Grammar of the English Language can be had for $166.08 (1,779 pages).

Interesting fact: the acronym CGEL was in use for 17 years by the Comprehensive Grammar of the English Language before the Cambridge Grammar of the English Language was published using the same acronym.

Curious how much new information was added by the Cambridge grammar? If you had machine-readable texts of both, excluded the examples, and then calculated the semantic distance between sections covering the same material, you could produce a measurement of the distance between the two texts.

Given the prices of academic texts, standardizing a method of comparison would be a boon to scholars and graduate students!
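One way to standardize the comparison: TF-IDF cosine similarity between sections matched on subject matter. A minimal sketch, assuming you already have machine-readable section texts (the two snippets below are placeholders, not quotations from either grammar):

```python
# Sketch: semantic distance between matched sections of two grammars
# via TF-IDF cosine similarity. Section texts below are placeholders.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

quirk_section = "The passive is formed with be plus the -ed participle ..."
cambridge_section = "Passive clauses pair an auxiliary with a past participle ..."

vec = TfidfVectorizer(stop_words="english")
tfidf = vec.fit_transform([quirk_section, cambridge_section])
score = cosine_similarity(tfidf[0], tfidf[1])[0, 0]
print(f"similarity: {score:.2f}")  # 1.0 = same wording, 0.0 = disjoint
```

Run it section by section, and the low-similarity pairs are where the Cambridge grammar most likely says something new.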

(No comment on the over-writing of the acronym for Quirk’s work by Cambridge.)

Deep Voice – The Empire Grows Steadily Less Secure

Filed under: Artificial Intelligence,Cybersecurity — Patrick Durusau @ 5:17 pm

Baidu AI Can Clone Your Voice in Seconds

From the post:

Baidu’s research arm announced yesterday that its 2017 text-to-speech (TTS) system Deep Voice has learned how to imitate a person’s voice using a mere three seconds of voice sample data.

The technique, known as voice cloning, could be used to personalize virtual assistants such as Apple’s Siri, Google Assistant, Amazon Alexa; and Baidu’s Mandarin virtual assistant platform DuerOS, which supports 50 million devices in China with human-machine conversational interfaces.

In healthcare, voice cloning has helped patients who lost their voices by building a duplicate. Voice cloning may even find traction in the entertainment industry and in social media as a tool for satirists.

Baidu researchers implemented two approaches: speaker adaption and speaker encoding. Both deliver good performance with minimal audio input data, and can be integrated into a multi-speaker generative model in the Deep Voice system with speaker embeddings without degrading quality.

See the post for links to three-second voice clips and other details.

Concerns?


The recent breakthroughs in synthesizing human voices have also raised concerns. AI could potentially downgrade voice identity in real life or with security systems. For example voice technology could be used maliciously against a public figure by creating false statements in their voice. A BBC reporter’s test with his twin brother also demonstrated the capacity for voice mimicking to fool voiceprint security systems.

That’s a concern? 😉

I think cloned voices of battlefield military commanders, cloned politicians’ voices with sex partners, or “known” voices badgering help desk staff into giving up utility plant or other access, those are “concerns.” Or “encouragements,” depending on your interests in such systems.

If You Like “Fake News,” You Will Love “Fake Science”

Filed under: Fake News,Media,Science,Skepticism — Patrick Durusau @ 4:53 pm

Prestigious Science Journals Struggle to Reach Even Average Reliability by Björn Brembs.

Abstract:

In which journal a scientist publishes is considered one of the most crucial factors determining their career. The underlying common assumption is that only the best scientists manage to publish in a highly selective tier of the most prestigious journals. However, data from several lines of evidence suggest that the methodological quality of scientific experiments does not increase with increasing rank of the journal. On the contrary, an accumulating body of evidence suggests the inverse: methodological quality and, consequently, reliability of published research works in several fields may be decreasing with increasing journal rank. The data supporting these conclusions circumvent confounding factors such as increased readership and scrutiny for these journals, focusing instead on quantifiable indicators of methodological soundness in the published literature, relying on, in part, semi-automated data extraction from often thousands of publications at a time. With the accumulating evidence over the last decade grew the realization that the very existence of scholarly journals, due to their inherent hierarchy, constitutes one of the major threats to publicly funded science: hiring, promoting and funding scientists who publish unreliable science eventually erodes public trust in science.

Facts, even “scientific facts,” should be questioned, tested and never blindly accepted.

Knowing a report appears in Nature, or Science, or (zine of your choice), helps you find it. Beyond that, you have to read and evaluate the publication to credit it with more than a place of publication.

Reading beyond abstracts or click-bait headlines, checking footnotes or procedures, do those things very often and you will be in danger of becoming a critical reader. Careful!

February 21, 2018

Self-Inflicted Insecurity in the Cloud – Selling Legal Firm Data

Filed under: Cloud Computing,Cybersecurity — Patrick Durusau @ 11:54 am

The self-inflicted insecurity phrase being “…behind your own firewall….”

You can see the rest of the Oracle huffing and puffing here.

The odds of breaching law firm security are increased by:

  • Changing to an unfamiliar computing environment (the cloud), or
  • Changing to unfamiliar security software (cloud firewalls).

Either one alone is sufficient to increase vulnerability; together, security-breaching errors are nearly certain.

Even with an increase in vulnerability, hackers still face the question of how to monetize law firm data.

The economics and markets for stolen credit card and personal data are fairly well known; see The Underground Economy of Data Breaches by Wade Williamson and Once Stolen, What Do Hackers Do With Your Data?

Dumping law firm data, such as the Panama Papers, generates a lot of PR but doesn’t add anything to your bank account.

Extracting value from law firm data is a variation on e-discovery, a non-trivial process briefly described in The Basics of E-Discovery.

However embarrassing law firm data may be, to its former possessors or their clients, market mechanisms akin to those for credit/personal data have yet to develop.

Pointers to the contrary?

February 20, 2018

The EFF, Privilege, Revolution

Filed under: Cybersecurity,Politics,Privacy — Patrick Durusau @ 8:57 pm

The Revolution and Slack by Gennie Gebhart and Cindy Cohn.

From the post:

The revolution will not be televised, but it may be hosted on Slack. Community groups, activists, and workers in the United States are increasingly gravitating toward the popular collaboration tool to communicate and coordinate efforts. But many of the people using Slack for political organizing and activism are not fully aware of the ways Slack falls short in serving their security needs. Slack has yet to support this community in its default settings or in its ongoing design.

We urge Slack to recognize the community organizers and activists using its platform and take more steps to protect them. In the meantime, this post provides context and things to consider when choosing a platform for political organizing, as well as some tips about how to set Slack up to best protect your community.

Great security advice for organizers and activists who choose to use Slack.

But let’s be realistic about “revolution.” The EFF, and the community organizers and activists who would use Slack, are by definition not revolutionaries.

How else would you explain the pantheon of legal cases pursued by the EFF? When the EFF lost, did it seek remedies by other means? Did it take illegal action to protect/avenge injured innocents?

Privilege is what enables people to say, “I’m using the law to oppose to X,” while other people are suffering the consequences of X.

Privilege holders != revolutionaries.

FYI, any potential revolutionaries: if “on the Internet, no one knows you’re a dog,” it’s also true that “no one knows you are a government agent.”

February 17, 2018

Evidence for Power Laws – “…I work scientifically!”

Filed under: Computer Science,Networks,Scale-Free — Patrick Durusau @ 9:28 pm

Scant Evidence of Power Laws Found in Real-World Networks by Erica Klarreich.

From the post:

A paper posted online last month has reignited a debate about one of the oldest, most startling claims in the modern era of network science: the proposition that most complex networks in the real world — from the World Wide Web to interacting proteins in a cell — are “scale-free.” Roughly speaking, that means that a few of their nodes should have many more connections than others, following a mathematical formula called a power law, so that there’s no one scale that characterizes the network.

Purely random networks do not obey power laws, so when the early proponents of the scale-free paradigm started seeing power laws in real-world networks in the late 1990s, they viewed them as evidence of a universal organizing principle underlying the formation of these diverse networks. The architecture of scale-freeness, researchers argued, could provide insight into fundamental questions such as how likely a virus is to cause an epidemic, or how easily hackers can disable a network.

An informative and highly entertaining read that reminds me of an exchange in The NeverEnding Story between Atreyu and Engywook.

Engywook’s “scientific specie-ality” is the Southern Oracle. From the transcript:

Atreyu: Have you ever been to the Southern Oracle?

Engywook: Eh… what do YOU think? I work scientifically!

In the context of the movie, Engywook’s answer is deeply ambiguous.

Where do you land on the power law question?
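If you want to form an opinion empirically, the place to start is a degree distribution. A sketch with networkx, using synthetic graphs as stand-ins for real-world data:

```python
# Compare degree distributions: preferential attachment (the classic
# scale-free generator) versus a same-sized random graph. Synthetic
# data only; the debate is over what real networks look like.
import networkx as nx

ba = nx.barabasi_albert_graph(10_000, 3, seed=42)      # preferential attachment
er = nx.gnm_random_graph(10_000, 3 * 10_000, seed=42)  # random, same size

for name, g in [("Barabasi-Albert", ba), ("Erdos-Renyi", er)]:
    degrees = [d for _, d in g.degree()]
    print(name, "max degree:", max(degrees))
# The BA graph grows a handful of huge hubs; the random graph does not.
# Whether real networks resemble the former is exactly the question.
```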

Working with The New York Times API in R

Filed under: Journalism,News,R,Reporting — Patrick Durusau @ 8:49 pm

Working with The New York Times API in R by Jonathan D. Fitzgerald.

From the post:

Have you ever come across a resource that you didn’t know existed, but once you find it you wonder how you ever got along without it? I had this feeling earlier this week when I came across the New York Times API. That’s right, the paper of record allows you–with a little bit of programming skills–to query their entire archive and work with the data. Well, it’s important to note that we don’t get the full text of articles, but we do get a lot of metadata and URLs for each of the articles, which means it’s not impossible to get the full text. But still, this is pretty cool.

So, let’s get started! You’re going to want to head over to http://developer.nytimes.com to get an API Key. While you’re there, check out the selection of APIs on offer–there are over 10, including Article Search, Archive, Books, Comments, Movie Reviews, Top Stories, and more. I’m still digging into each of these myself, so today we’ll focus on Article Search, and I suspect I’ll revisit the NYT API in this space many times going forward. Also at NYT’s developer site, you can use their API Tool feature to try out some queries without writing code. I found this helpful for wrapping my head around the APIs.

A great “getting your feet wet” introduction to the New York Times API in R.
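Fitzgerald works in R, but the Article Search endpoint is language-agnostic. A minimal sketch of the same kind of query from Python’s requests, with a placeholder standing in for your own key from developer.nytimes.com:

```python
# Minimal Article Search query; NYT_KEY is a placeholder for a real
# API key from developer.nytimes.com.
import requests

NYT_KEY = "your-api-key-here"
url = "https://api.nytimes.com/svc/search/v2/articlesearch.json"
params = {"q": "topic maps", "api-key": NYT_KEY}

resp = requests.get(url, params=params, timeout=30)
resp.raise_for_status()
for doc in resp.json()["response"]["docs"]:
    print(doc["pub_date"], doc["headline"]["main"], doc["web_url"])
```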

Caution: The line between the New York Times (NYT) and governments is a blurry one. It has cooperated with governments in the past and will do so in the future. If you are betrayed by the NYT, you have no one but yourself to blame.

The same is true for the content of the NYT, past or present. Chance is not the deciding factor on stories being reported in the NYT. It won’t be possible to discern motives in the vast majority of cases but that doesn’t mean they didn’t exist. Treat the “historical” record as carefully as current accounts based on “reliable sources.”

Distributed Systems Seminar [Accounting For Hostile Environments]

Filed under: Distributed Computing,Distributed Consistency,Distributed Systems — Patrick Durusau @ 8:22 pm

Distributed Systems Seminar by Peter Alvaro.

From the webpage:

Description

This graduate seminar will explore distributed systems research, both current and historical, with a particular focus on storage systems and programming models.

Due to fundamental uncertainty in their executions arising from asynchronous communication and partial failure, distributed systems present unique challenges to programmers and users. Moreover, distributed systems are increasingly ubiquitous: nearly all non-trivial systems are now physically distributed. It is no longer possible to relegate responsibility for managing the complexity of distributed systems to a group of expert library or infrastructure writers: all programmers must now be distributed programmers. This is both a crisis and an opportunity.

A great deal of theoretical work in distributed systems establishes important impossibility results, including the famous FLP result, the CAP Theorem, the two generals problem and the impossibility of establishing common knowledge via protocol. These results tell us what we cannot achieve in a distributed system, or more constructively, they tell us about the properties we must trade off for the properties we require when designing or using large-scale systems. But what can we achieve? The history of applied distributed systems work is largely the history of infrastructures — storage systems as well as programming models — that attempt to manage the fundamental complexity of the domain with a variety of abstractions.

This course focuses on these systems, models and languages. We will cover the following topics:

  • Consistency models
  • Large-scale storage systems and data processing frameworks
  • Commit, consensus and synchronization protocols
  • Data replication and partitioning
  • Fault-tolerant design
  • Programming models
  • Distributed programming languages and program analysis
  • Seminal theoretical results in distributed systems

Readings

This course is a research seminar: we will focus primarily on reading and discussing conference papers. We will read 1-2 papers (typically 2) per session; for each paper, you will provide a brief summary (about 1 page). The summary should answer some or all of the following questions:

  • What problem does the paper solve? Is it important?
  • How does it solve the problem?
  • What alternative approaches are there? Are they adequately discussed in the reading?
  • How does this work relate to other research, whether covered in this course or not?
  • What specific research questions, if any, does the paper raise for you?

What a great list of readings!

An additional question for each paper: Does It Account For Hostile Environments?

As Alvaro says: “…nearly all non-trivial systems are now physically distributed.”

That’s a rather large attack surface to leave for unknown others, by unknown means, to secure to an unknown degree, on your behalf.

If you make that choice, add “cyber-victim” to your business cards.

If you aren’t already, you will be soon enough.

February 16, 2018

@GalaxyKate, Generators, Steganographic Fields Forever (+ Secure Message Tip)

Filed under: Graphics,Steganography,Virtualization — Patrick Durusau @ 11:57 am

Before you skip this post as just being about “pretty images,” know that generators span everything from grammars to constraint solvers. Artistry for sure, but exploration can lead to hard-core CS rather quickly.

I stumbled upon @GalaxyKate’s Generative Art & Procedural Content Starter Kit:

Practical Procedural Generation for Everyone: Thirty or so minutes on YouTube, 86,133 views when I checked the link.

So you want to build a generator: An in-depth blog post with lots of content and links.

Encyclopedia of Generativity: As far as I can tell, a one-issue zine by @GalaxyKate, but it will take months to explore.

One resource I found while chasing these links was: Procedural Generation.

Oh, and you owe it to yourself to visit GalaxyKate’s homepage.

The small scale of my blog’s presentation would make any screenshot a pale imitation of what you will find. Great resource!

There’s no shortage of visual content on the Web; one estimate says that in 2017, 74% of all internet traffic was video.

Still, if you practice steganographic concealment of information, you should make the work of the hounds as difficult as possible. Generators are an obvious way of working towards that goal.
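For the curious, the textbook trick is least-significant-bit embedding; a toy sketch over raw pixel values as plain integers. An illustration only; real steganography tools do far more to evade statistical detection:

```python
# Toy least-significant-bit steganography: hide bytes in the low bit
# of pixel values. Illustration only -- trivially detectable as-is.
def embed(pixels, message: bytes):
    bits = [(byte >> i) & 1 for byte in message for i in range(8)]
    assert len(bits) <= len(pixels), "cover too small for message"
    stego = [(p & ~1) | b for p, b in zip(pixels, bits)]
    return stego + list(pixels[len(bits):])

def extract(pixels, length: int):
    bits = [p & 1 for p in pixels[:length * 8]]
    return bytes(sum(bits[i * 8 + j] << j for j in range(8))
                 for i in range(length))

cover = list(range(256)) * 4           # stand-in for image pixel data
stego = embed(cover, b"meet at dawn")  # 12-byte message
print(extract(stego, 12))              # b'meet at dawn'
```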

One secure message tip: Other than for propaganda, which you want discovered and read, omit any greetings, closings, or other rote content, such as blessings, religious quotes, etc.

The famous German Enigma was broken with the help of messages having the same opening text, routine information, the same closing text (Heil Hitler!), and the same message sent in different encodings. See Exploring the Enigma.

Or in other words, Don’t repeat famous cryptographic mistakes!

February 15, 2018

Krita (open source painting program)

Filed under: Art,Graphics,Visualization — Patrick Durusau @ 9:30 am

Krita

Do you know Krita? Not being artistically inclined, I don’t often encounter digital art tools. Judging from the examples, though, I’m missing some great imagery, even if I can’t create the same.

Great graphics can enhance your interfaces, education apps, games, propaganda, etc.
