Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

November 30, 2017

Why “Russian Troll” is NOT a Useful Category/Class

Filed under: Government,Politics,Rhetoric — Patrick Durusau @ 4:39 pm

Caitlin Johnstone makes a great case in Accusing someone of being a ‘Russian troll’ is admitting you have no argument.

From the post:


Bottom line: when a stranger on the internet accuses you of being a Kremlin agent, of being a “useful idiot”, of “regurgitating Kremlin talking points”, this is simply their way of informing you that they have no argument for the actual thing that you are saying. If you’re using hard facts to point out the gaping plot holes in the Russiagate narrative, for example, and all they can do is call your argument Russian propaganda, this means that they have no counter-argument for the hard facts that you are presenting. They are deliberately shutting down the possibility of any dialogue with you because the cognitive dissonance you are causing them is making them uncomfortable.

Yes, paid shills for governments all over the world do indeed exist. But the odds are much greater that the stranger you are interacting with online is simply a normal person who isn’t convinced by the arguments that have been presented by the position you espouse. If your position is defensible you should be able to argue for it normally, regardless of whom you are speaking to.
… (emphasis in original)

Johnstone’s postulate, Russian Troll accusation = no meaningful argument, is a compelling one.

However, as the examples in Johnstone’s post also demonstrate, there is no common set of attributes that trigger its use.

“Russian Troll” is a brimful container of arbitrary whims, caprices and prejudices, which vary from user to user.

Arbitrary usage makes it unsuitable for use as a category or class, since every use is one-off and unique.

I would not treat “Russian Troll” as a topic subject to merging but only as a string. Hopefully the 434K instances of it as a string (today, with quotes) will put users on notice of its lack of meaningful usage.

Will 2018 Be Your First Penetration? [Possession of SANS Posters]

Filed under: Cybersecurity,Security — Patrick Durusau @ 11:21 am

Blueprint: Building A Better Pen Tester Tuesday, January 9th, 2018 at 1:00 PM EST (18:00:00 UTC).

From the post:

Register for this webcast and have (4) printed copies of the *new* SANS Pen Test Poster “Blueprint: Building A Better Pen Tester” mailed to the address on your SANS Portal Account. Don’t have an account? Register today and then join Ed Skoudis, on January 9th at 1pm EST, as he dives into all the tips available on the poster so you’ll know how use it to become a better pen tester. If you’re not a pen tester, this webcast will help you learn many helpful tips to make you a better information security professional and bring additional value and tradecraft to your organization.

Posters will be mailed after the webcast in January 2018.
… (emphasis in original)

It’s never clear if “pen tester” is tongue in cheek or not. Perhaps the ambiguity is intentional.

Either I or Gimp failed to enlarge the posters sufficiently to produce readable text. But, given the reputation of SANS, it’s a nice way to start the new year.

Is possession of SANS posters considered evidence of illegal activity? Any court cases you can cite?

Over Thinking Secret Santa ;-)

Filed under: Graphs,R — Patrick Durusau @ 10:27 am

Secret Santa is a graph traversal problem by Tristan Mahr.

From the post:

Last week at Thanksgiving, my family drew names from a hat for our annual game of Secret Santa. Actually, it wasn’t a hat but you know what I mean. (Now that I think about it, I don’t think I’ve ever seen names drawn from a literal hat before!) In our family, the rules of Secret Santa are pretty simple:

  • The players’ names are put in “a hat”.
  • Players randomly draw a name from a hat, become that person’s Secret Santa, and get them a gift.
  • If a player draws their own name, they draw again.

Once again this year, somebody asked if we could just use an app or a website to handle the drawing for Secret Santa. Or I could write a script to do it I thought to myself. The problem nagged at the back of my mind for the past few days. You could just shuffle the names… no, no, no. It’s trickier than that.

In this post, I describe a couple of algorithms for Secret Santa sampling using R and directed graphs. I use the DiagrammeR package which creates graphs from dataframes of nodes and edges, and I liberally use dplyr verbs to manipulate tables of edges.

If you would like a more practical way to use R for Secret Santa, including automating the process of drawing names and emailing players, see this blog post.
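
Mahr builds his solution in R with DiagrammeR, but the core of the game — redraw until nobody holds their own name — amounts to rejection-sampling a derangement. A minimal Python stand-in (not Mahr’s code; the function name is mine):

```python
import random

def secret_santa(names):
    """Assign each player a recipient so that nobody draws their own
    name (a derangement). Instead of redrawing one name at a time,
    reshuffle the whole hat until the assignment is valid: simple
    rejection sampling, which is unbiased over all derangements."""
    if len(names) < 2:
        raise ValueError("Secret Santa needs at least two players")
    while True:
        recipients = names[:]
        random.shuffle(recipients)
        # Accept only if no player drew their own name.
        if all(g != r for g, r in zip(names, recipients)):
            return dict(zip(names, recipients))

pairs = secret_santa(["Ada", "Ben", "Cleo", "Dev"])
```

Since the fraction of permutations that are derangements approaches 1/e, the loop expects only about 2.7 reshuffles regardless of family size.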

If you haven’t done your family Secret Santa yet, you are almost late! (November 30, 2017)

Enjoy!

November 29, 2017

The Motherboard Guide to Avoiding State Surveillance [Where’s Your Security Cheat Sheet?]

Filed under: Cybersecurity,Journalism,News,Reporting — Patrick Durusau @ 9:50 pm

The Motherboard Guide to Avoiding State Surveillance by Sarah Jeong.

From the post:

In the wake of September 11th, the United States built out a massive surveillance apparatus, undermined constitutional protections, and limited possible recourse to the legal system.

Given the extraordinary capabilities of state surveillance in the US—as well as the capabilities of governments around the world—you might be feeling a little paranoid! It’s not just the NSA—the FBI and even local cops have more tools at their disposal to snoop on people than ever before. And there is a terrifying breadth of passive and unexpected surveillance to worry about: Your social media accounts can be subpoenaed, your emails or calls can be scooped up in bulk collection efforts, and your cell phone metadata can be captured by Stingrays and IMSI catchers meant to target someone else.

Remember, anti-surveillance is not the cure, it’s just one thing you can do to protect yourself and others. You probably aren’t the most at-risk person, but that doesn’t mean you shouldn’t practice better security. Surveillance is a complicated thing: You can practice the best security in the world, but if you’re sending messages to someone who doesn’t, you can still be spied on through their device or through their communications with other people (if they discuss the information you told them, for instance).

That’s why it’s important that we normalize good security practices: If you don’t have that much to be afraid of, it’s all the more important for you to pick up some of these tools, because doing that will normalize the actions of your friends who are, say, undocumented immigrants, or engaged in activism. Trump’s CIA Director thinks that using encryption “may itself be a red flag.” If you have “nothing to hide,” your use of encryption can actually help people at risk by obfuscating that red flag. By following this guide, you are making someone else safer. Think of it as herd immunity. The more people practice good security, the safer everyone else is.

The security tips provided earlier in this guide still apply: If you can protect yourself from getting hacked, you will have a better shot at preventing yourself from being surveilled (when it comes to surveilling iPhones, for instance governments often have few options besides hacking the devices). But tech tools don’t solve all problems. Governments have a weapon in their hands that criminal hackers do not: the power of the law. Many of the tips in this section of the guide will help you not only against legal requests and government hacking, but also against anyone else who may be trying to spy on you.

You don’t have to turn yourself into a security expert. Just start thinking about your risks, and don’t be intimidated by the technology. Security is an ongoing process of learning. Both the threats and the tools developed to address them are constantly changing, which is one of the reasons why privacy and security advice can often seem fickle and contradictory. But the tips below are a good starting point.

Jeong writes a great post but like most of you, what I need is a security cheat sheet so I start off everyday with the same standard security practices.

Read Jeong’s post but think about creating a personalized security cheat sheet that requires your initials at the start of each day and note any security fails on your part for that day.

At the end of each week, review your security fails for patterns and/or improvements.
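
The initials-plus-fails routine could be as small as a script run each morning. A hypothetical Python sketch (the checklist items, function name and log format are my invention, not Jeong’s):

```python
import datetime
import json

# Hypothetical checklist items -- tailor these to your own threat model.
CHECKLIST = [
    "OS, browser and plugin updates applied",
    "Password manager locked before stepping away",
    "VPN/Tor used on untrusted networks",
    "No credentials typed into links arriving by email",
]

def daily_signoff(initials, fails, log_path="security_log.jsonl"):
    """Record today's initials and any security fails as one JSON line,
    so the weekly review can scan the log for patterns."""
    entry = {
        "date": datetime.date.today().isoformat(),
        "initials": initials,
        "checklist": CHECKLIST,
        "fails": fails,
    }
    with open(log_path, "a") as log:
        log.write(json.dumps(entry) + "\n")
    return entry
```

One JSON line per day keeps the weekly pattern review a matter of reading (or grepping) a single file.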

What’s on your security cheat sheet?

Intellectual Property Rights Enforcement – EUC – Government For The Few, The Greedy, The Rich

Filed under: EU,Intellectual Property (IP) — Patrick Durusau @ 8:57 pm

The EU Commission, struggling to justify its existence to the few, presented initiatives on intellectual property rights today.

Two important “take aways” from the news bulletin:

First, the servitude of the EUC to the wealthy isn’t just my opinion, but the EUC admits as much saying:


And yet, according to a recent study, counterfeit and pirated goods account for 2.5% of global trade with a tendency to increase. 5% of all imports into the EU are counterfeit and pirated goods, corresponding to an estimated EUR 85 billion in illegal trade (see also Factsheet – Why Intellectual Property Rights matter).

So the EUC’s efforts today are on behalf of the tiny group of people who control 5% of the imports into the EU?

And the members of that tiny group aren’t even members of the EU?

That’s serious hunting for wealthy people in need of government toadies!

Second, the EUC has created a communication tax on interoperable products by enabling FRAND (Fair, Reasonable and Non Discriminatory) licenses on technologies that should be governed by Open Source standards.

The EUC position can be illustrated by re-casting the familiar Matthew 6:28 verse from:

And why take ye thought for raiment? Consider the lilies of the field, how they grow; they toil not, neither do they spin:

to read:

Consider FRAND owners, how they grow; they toil not, neither do they spin:

FRAND owners are parasites on otherwise vibrant and growing networks of communication. They contribute nothing to the public. What, if anything, they contribute to members of the EUC isn’t known to me.

Amazon Neptune (graph database, preview)

Filed under: Graph Analytics,Graphs,Gremlin,TinkerPop — Patrick Durusau @ 5:54 pm

Amazon Neptune

From the webpage:

Amazon Neptune is a fast, reliable, fully-managed graph database service that makes it easy to build and run applications that work with highly connected datasets. The core of Amazon Neptune is a purpose-built, high-performance graph database engine optimized for storing billions of relationships and querying the graph with milliseconds latency. Amazon Neptune supports popular graph models Property Graph and W3C’s RDF, and their respective query languages Apache TinkerPop Gremlin and SPARQL, allowing you to easily build queries that efficiently navigate highly connected datasets. Neptune powers graph use cases such as recommendation engines, fraud detection, knowledge graphs, drug discovery, and network security.

Amazon Neptune is highly available, with read replicas, point-in-time recovery, continuous backup to Amazon S3, and replication across Availability Zones. Neptune is secure, with support for encryption at rest and in transit. Neptune is fully-managed, so you no longer need to worry about database management tasks such as hardware provisioning, software patching, setup, configuration, or backups.

Sign up for the Amazon Neptune preview here.

I’m skipping the rest of the graph/Amazon promotional material because if you are interested, you know enough about graphs to be bored by the repetition.

Interested in knowing your comments on:


Amazon Neptune provides multiple levels of security for your database, including network isolation using Amazon VPC, encryption at rest using keys you create and control through AWS Key Management Service (KMS), and encryption of data in transit using TLS. On an encrypted Neptune instance, data in the underlying storage is encrypted, as are the automated backups, snapshots, and replicas in the same cluster.

Experiences?

You are placing a great deal of trust in Amazon. Yes?

How Email Really Works

Filed under: Cybersecurity,Humor — Patrick Durusau @ 4:56 pm

There’s truth to both!

HT: @oxpss

November 28, 2017

Onion Deep Web Link Directory

Filed under: Dark Web,Searching — Patrick Durusau @ 2:11 pm

Onion Deep Web Link Directory (http://4bcdydpud5jbn33s.onion/)

Without a .onion address in hand, you will need to consult an .onion link list.

This .onion link list offers:

  • Hidden Service Lists and search engines – 23 links
  • Marketplace financial and drugs – 25 links
  • Hosting – 6 links
  • Blogs – 18 links
  • Forums and Chans – 12 links
  • Email and Messaging – 8 links
  • Political – 11 links
  • Hacking – 4 links
  • Warez – 12 links
  • Erotic 18+ – 7 links
  • Non-English – 18 links

Not an overwhelming number of links but enough to keep you and a Tor browser busy over the coming holiday season.

FYI, adult/erotic content sites are a primary means for the distribution of malware.

Hostile entity rules of engagement apply at all .onion addresses. (Just as no one “knows” you are a dog on the Internet, an entity found at a .onion address, could be the FBI. Act accordingly.)

I first saw this in the Hunchly Daily Hidden Services Report for 2017-11-03.

November 27, 2017

Why Study ARM Exploitation? 100 Billion Chips Shipped, 1 Trillion Projected in 20 Years.

Filed under: ARM,Cybersecurity,Malware — Patrick Durusau @ 10:07 pm

Getting Started With ARM Exploitation by Azeria.

From the post:

Since I published the tutorial series on ARM Assembly Basics, people keep asking me how to get started with exploitation on ARM. Since then, I added some tutorials on how to write ARM Shellcode, an introduction to Memory Corruptions, a detailed guide on how to set up your own ARM lab environment, and some small intro to debugging with GDB. Now it’s time we get to the meat of things and use all this knowledge to start exploiting some binaries.

This first part is aimed at those of you who have no experience with reverse engineering or exploiting ARM binaries. These challenges are relatively easy and are meant to introduce a few core concepts of binary exploitation.

Why Study ARM Exploitation?

Can you name another attack surface that large?

No?

Suggest you follow Azeria and her tutorials. Today.

eXist-db v3.6.0 [Prediction for 2018: Multiple data/document leak tsunamis. Are You Ready?]

Filed under: eXist,Government,Government Data,XML,XPath,XQuery — Patrick Durusau @ 9:28 pm

eXist-db v3.6.0

From the post:

Features

  • Switched Collation support to use ICU4j.
  • Implemented XQuery 3.1 UCA (Unicode Collation Algorithm).
  • Implemented map type parameters for XQuery F&O 3.1 fn:serialize.
  • Implemented declare context item for XQuery 3.0.
  • Implemented XQuery 3.0 Regular Expression’s support for non-capturing groups.
  • Implemented a type-safe DSL for describing and testing transactional operations upon the database.
  • Implemented missing node kind tests in the XQuery parser when using @ on an AbbrevForwardStep.
  • Added AspectJ support to the IntelliJ project files (IntelliJ Ultimate only).
  • Repaired the dependencies in the NetBeans project files.
  • Added support for Travis macOS CI.
  • Added support for AppVeyor Windows CI.
  • Updated third-party dependencies:
    • Apache Commons Codec 1.11
    • Apache Commons Compress 1.15
    • Apache Commons Lang 3.7
    • Eclipse AspectJ 1.9.0.RC1
    • Eclipse Jetty 9.4.7.v20170914
    • EXPath HTTP Client 20171116
    • Java 8 Functional Utilities 1.11
    • JCTools 2.1.1
    • XML Unit 2.4.0

Performance Improvements

  • Compiled XQuery cache is now multi-threaded; concurrency is now per-source.
  • RESTXQ compiled XQuery cache is now multi-threaded; concurrency is now per-query URI.
  • STX Templates Cache is now multithreaded.
  • XML-RPC Server will now use Streaming and GZip compression if supported by the client; enabled in eXist’s Java Admin Client.
  • Reduced object creation overhead in the XML-RPC Server.

Apps

The bundled applications of the Documentation, eXide, and Monex have all been updated to the latest versions.

Prediction for 2018: Multiple data/document leak tsunamis.

Are you prepared?

How are your XQuery skills and tools?

Or do you plan on regurgitating news wire summaries?

Berlusconi Market (Dark Web)

Filed under: Dark Web — Patrick Durusau @ 9:50 am

Berlusconi Market (http://55j6kjwki4vjtmzp.onion/)

This notice on Berlusconi Market brings a smile:

Due to high traffic, we are overwhelmed by tickets, vendor applies ecc. If any order-related error occurs, please contact us. Your money is safe. BM Staff

Hmmm, I’m using an allegedly non-traceable connection to a non-traceable website to connect with a non-traceable vendor, yet, I’m assured my money is safe.

That’s a big ask. 😉

The numbers will change but as of today:

  • Fraud – 676
  • Drugs & Chemicals – 2807
  • Guides & Tutorials – 325
  • Counterfeit Items – 145
  • Digital Products – 347
  • Jewels & Gold – 2
  • Weapons – 75
  • Carded Items – 20
  • Services – 72
  • Software & Malware – 20
  • Security & Hosting – 10
  • Other Listings – 33

Anonymous sources are as trustworthy as any government. Use security precautions suitable for a known hostile entity.

PS: As I cover useful Dark Web sites, I will be giving their .onion addresses. Not listing Dark Web addresses is a juvenile antic at best.

November 26, 2017

New York Times (on Dark Web)

Filed under: Journalism,News,Reporting — Patrick Durusau @ 11:35 am

Have you tried the New York Times (Dark Web) Site Map? AKA spiderbites.nytimes3xbfgragh.onion

Navigation has the usual Tor overhead, but not bad unless you expect an instant response. 😉

What’s your experience like?

I first saw this in the Hunchly Daily Hidden Services Report for 2017-11-01.

November 25, 2017

23 Deep Learning Papers To Get You Started — Part 1 (Reading Slowly)

Filed under: Artificial Intelligence,Deep Learning,Machine Learning — Patrick Durusau @ 9:36 pm

23 Deep Learning Papers To Get You Started — Part 1 by Rupak Kr. Thakur.

Deep Learning has probably been the single-most discussed topic in the academia and industry in recent times. Today, it is no longer exclusive to an elite group of scientists. Its widespread applications warrants that people from all disciplines have an understanding of the underlying concepts, so as to be able to better apply these techniques in their field of work. As a result of which, MOOCs, certifications and bootcamps have flourished. People have generally preferred the hands-on learning experiences. However, there is a considerable population who still give in to the charm of learning the subject the traditional way — through research papers.

Reading research papers can be pretty time-consuming, especially since there are hordes of publications available nowadays, as Andrew Ng said at an AI conference, recently, along with encouraging people to use the existing research output to build truly transformative solutions across industries.

In this series of blog posts, I’ll try to condense the learnings from some really important papers into 15–20 min reads, without missing out on any key formulas or explanations. The blog posts are written, keeping in mind the people, who want to learn basic concepts and applications of deep learning, but can’t spend too much time scouring through the vast literature available. Each part of the blog will broadly cater to a theme and will introduce related key papers, along with suggesting some great papers for additional reading.

In the first part, we’ll explore papers related to CNNs — an important network architecture in deep learning. Let’s get started!

The start of what promises to be a great series on deep learning!

While the posts will extract the concepts and important points of the papers, I suggest you download the papers and map the summaries back to the papers themselves.

It will be good practice in reading original research, not to mention reinforcing what you have learned from the posts.

In my reading, I will be looking for ways to influence deep learning towards one answer or another.

Whatever they may say about “facts” in public, no sane client asks for advice without an opinion on the range of acceptable answers.

Imagine you found ISIS content on Twitter has no measurable impact on ISIS recruiting. Would any intelligence agency ask you for deep learning services again?

A (somewhat) Shallower On-Ramp for Emacs

Filed under: Editor,Emacs,Lisp — Patrick Durusau @ 8:56 pm

Using Emacs – Introduction by Mike Zamansky.

From the webpage:

I’m sure I’ve mentioned that I’ve been an Emacs wonk for decades. Since the mid-80’s in fact. I’ve spent time using other editors, word processors, and development tools but always find my way back.

I recommend that budding computer science students develop a good tool set and encourage them to explore Emacs but while it’s pretty easy to load Emacs and find your way around, particularly if you use the mouse and menus there isn’t a clear path to take you from beginner to using it as an efficient tool let alone customizing it.

Inspired by Mattias Petter Johansson, or MPJ who make a weekly video, I decided to try to create a series of YouTube videos and matching blog posts. I’ll try to post one a week and I’ll try to keep the videos, at least after the first couple to just a few minutes and have them focus on micro-habits – one or two small things that you can bring to your work flow and internalize.

I say “somewhat” shallower because Zamansky presumes you have completed the basic Emacs tutorial (C-h t).

After completing the Emacs tutorial, start the Using Emacs Series of thirty-eight (38) videos.

The season of repetitive Christmas “classics” is upon us, making the Using Emacs Series even more welcome. (An observation for the US. I’m not familiar with mindless holiday television schedules in other countries. Consult your local curmudgeon.)

Daniel’s Hosting Service – Dark Web

Filed under: Dark Web — Patrick Durusau @ 10:42 am

Daniel’s Hosting Service (Onion Address)

Hunchly Daily Hidden Services Report delivers a daily email with hidden services discovered in the last 24 hours.

One of the common entries I have seen in those daily reports reads:

Site hosted by Daniel’s hosting service

which isn’t all that informative. 😉

I decided to check out Daniel’s Hosting Service (Onion Address) and found:

Here you can get yourself a hosting account on my server.

What you will get:

  • Free anonymous webhosting
  • Chose between PHP 7.0, 7.1 or no PHP support
  • Nginx Webserver
  • SQLite support
  • 1 MariaDB (MySQL) database
  • PHPMyAdmin and Adminer for web based database administration
  • Web-based file management
  • FTP access
  • SFTP access
  • No disk quota
  • mail() can send e-mails from your.onion@dhosting4okcs22v.onion (your.onion@hosting.danwin1210.me for clearnet)
  • Webmail and IMAP, POP3 and SMTP access to your mail account
  • Mail sent to anything@your.onion gets automatically redirected to your inbox
  • Your own .onion address
  • On request your own clearnet domain or a free subdomain of danwin1210.me. I can setup an I2P domain as well.
  • There is a missing feature or you need a special configuration? Just contact me and I’ll see what I can do.
  • Empty/Unused accounts will be automatically deleted after a month
  • More to come…

Rules

  • No child pornography!
  • No terroristic propaganda!
  • No illegal content according to German law!
  • No malware! (e.g. botnets)
  • No phishing!
  • No scams!
  • No spam!
  • No shops, markets or any other sites dedicated to making money! (This is a FREE hosting!)
  • No proxy scripts! (You are already using TOR and this will just burden the network)
  • No IP logger or similar de-anonymizer sites!
  • I preserve the right to delete any site for violating these rules and adding new rules at any time.

After reading the rules, I wondered, “Is this the dark web or not?” 😉

The list of sites hosted by Daniel can be found at: http://dhosting4okcs22v.onion/list.php, some 1409 public sites (577 hidden) as of today.

Daniel maintains an “Onion link list,” 5926 links, with this disclaimer:

I’m not responsible for any content of websites linked here. Be careful and use your brain.

Daniel has other resources and a simple registration form for a site.

I haven’t used Daniel’s hosting service but will in the near future to “kick the tires” so to speak.

Caution: This is a “free” dark web hosting service, so ask yourself what economic model is in play? If you are truly hidden from Daniel, how does he make a commodity out of you or your use of his service?

No reflection on Daniel as a person, assuming there is a Daniel person and he isn’t working for some intelligence service.

Intelligence services that steal, kidnap and murder based on whim and caprice are not above mis-representing themselves on the Dark Web.

It’s hard news to take that intelligence services aren’t “playing fair,” but it’s a fact: intelligence services don’t play fair.

Prepare and play accordingly.

November 21, 2017

Maintaining Your Access to Sci-Hub

Filed under: Open Access,Open Data,Open Science,Publishing,Science — Patrick Durusau @ 4:42 pm

A tweet today by @Sci_Hub advises:

Sci-Hub is working. To get around domain names problem, use custom Sci-Hub DNS servers 80.82.77.83 and 80.82.77.84. How to customize DNS in Windows: https://pchelp.ricmedia.com/set-custom-dns-servers-windows/

No doubt, Elsevier will continue to attempt to interfere with your access to Sci-Hub.

Already the largest, most bloated and insecure academic publishing presence on the Internet, Elsevier labors every day to become more of an attractive nuisance.

What corporate strategy is served by painting a flashing target on your Internet presence?

Thoughts?

PS: Do update your DNS entries while pondering that question.

November 20, 2017

What do you mean, “We?”

Filed under: Cybersecurity,Security — Patrick Durusau @ 10:35 am

Prasad Ajgaonkar reports in 94pc of cyber attacks are caused by lack of infosecurity awareness training. Is your organisation safe?:

Do you know that a cyber attack takes place every 10 minutes in India? This rate is higher than that in 2016, where a cyber attack took place once every 12 minutes. A study conducted by Fortinet found that a whopping 94 percent of IT experts believe that information security (InfoSec) practices in Indian organizations are sorely inadequate and completely fail to protect from cyber attacks in today’s world.

It is crucial to be aware that the exorbitantly high cyber attacks in India is a human issue, rather than an IT issue. This means that employees failing to follow InfoSec practices- rather than IT system failures- is the biggest contributor of cyber attacks.

Therefore, it is critical to ensure that all employees at an organisation are vigilant, fully aware of cyber-threats, and trained to follow InfoSec practices at all times.

Focusing on the lack of training for employees, the post suggests this solution:

Story-telling and scenario based training would be an excellent and highly effective way to ensure that employees consistently practice InfoSec measures. An effective InfoSec training programme has the following features:

  1. Educating employees through story-telling and interactive media – …
  2. Continuous top of the mind recall – …
  3. Presenting InfoSec tips, trivia and reminders to employees through mobile phone apps…
  4. Training through scenario-based assessments – …
  5. Training through group discussions – …

I have a simpler explanation for poor cybersecurity practices of employees in India.

The Hindu captured it in one headline: India Inc pay gap: CEOs earn up to 1,200-times of average staff

Many thought the American pay gap reported in CEOs make 271 times the pay of most workers was bad.

Try almost four (4) times the American CEO – worker pay gap.

How much commonality of interest exists between a worker and a CEO who gets $1,200 for every $1 the worker earns?

Conventional training, excluding the use of drugs and/or physical torture, isn’t likely to create a commonality of interest. Yes?

Cybersecurity “solutions” that don’t address the worker to CEO wage gap, are castles made of sand.

SPARQL queries of Beatles recording sessions

Filed under: Music,Music Retrieval,RDF,SPARQL — Patrick Durusau @ 9:58 am

SPARQL queries of Beatles recording sessions – Who played what when? by Bob DuCharme.

From the post:

While listening to the song Dear Life on the new Beck album, I wondered who played the piano on the Beatles’ Martha My Dear. A web search found the website Beatles Bible, where the Martha My Dear page showed that it was Paul.

This was not a big surprise, but one pleasant surprise was how that page listed absolutely everyone who played on the song and what they played. For example, a musician named Leon Calvert played both trumpet and flugelhorn. The site’s Beatles’ Songs page links to pages for every song, listing everyone who played on them, with very few exceptions–for example, for giant Phil Spector productions like The Long and Winding Road, it does list all the instruments, but not who played them. On the other hand, for the orchestra on A Day in the Life, it lists the individual names of all 12 violin players, all 4 violists, and the other 25 or so musicians who joined the Fab Four for that.

An especially nice surprise on this website was how syntactically consistent the listings were, leading me to think “with some curl commands, python scripting, and some regular expressions, I could, dare I say it, convert all these listings to an RDF database of everyone who played on everything, then do some really cool SPARQL queries!”

So I did, and the RDF is available in the file BeatlesMusicians.ttl. The great part about having this is the ability to query across the songs to find out things such as how many different people played a given instrument on Beatles recordings or what songs a given person may have played on, regardless of instrument. In a pop music geek kind of way, it’s been kind of exciting to think that I could ask and answer questions about the Beatles that may have never been answered before.

Will the continuing popularity of the Beatles drive interest in SPARQL? Hard to say but DuCharme gives it a hard push in this post. It will certainly appeal to Beatles devotees.

Is it coincidence that DuCharme posted this on November 19, 2017, the same day as the reported death of Charles Manson? (cf. Helter Skelter)

There’s a logical extension to DuCharme’s RDF file: Charles Manson, the Manson family and the music of that era.

Many foolish things were said about rock-n-roll in the ’60’s that are now being repeated about social media and terrorists. Same rant, same lack of evidence, same intolerance, same ineffectual measures against it. Not only can elders not learn from the past, they can’t wait to repeat it.

Be inventive! Learn from past mistakes so you can make new ones in the future!

So You Want to be a WIZARD [Spoiler Alert: It Requires Work]

Filed under: Computer Science,Programming — Patrick Durusau @ 9:28 am

So You Want to be a WIZARD by Julia Evans.

I avoid using terms like inspirational, transforming, etc. because it is so rare that software, projects, or presentations merit those terms.

Today I am making an exception to that rule to say:

So You Want to be a Wizard by Julia Evans can transform your work in computer science.

Notice the use of “can” in that sentence. No guarantees because unlike many promised solutions, Julia says up front that hard work is required to use her suggestions successfully.

That’s right. If these methods don’t work for you it will be because you did not apply them. (full stop)

No guarantees you will get praise, promotions, recognition, etc., as a result of using Julia’s techniques, but you will be a wizard nonetheless.

One consolation is that wizards rarely notice back-biters, office sycophants, and a range of other toxic co-workers. They are too busy preparing themselves to answer the next technical issue that requires a wizard.

November 19, 2017

Shirriffs and Elephant Poaching

Filed under: Data Science,Environment — Patrick Durusau @ 9:27 am

I asked on Twitter yesterday:

How can data/computer science disrupt, interfere with, burden, expose elephant hunters and their facilitators? Serious question.

@Pembient pointed to Vulcan’s Domain Awareness Tool, described in New Tech Gives Rangers Real-Time Tools to Protect Elephants as:


The Domain Awareness System (DAS) is a tool that aggregates the positions of radios, vehicles, aircraft and animal sensors to provide users with a real-time dashboard that depicts the wildlife being protected, the people and resources protecting them, and the potential illegal activity threatening them.

“Accurate data plays a critical role in conservation,” said Paul Allen. “Rangers deserve more than just dedication and good luck. They need to know in real-time what is happening in their parks.”

The visualization and analysis capabilities of DAS allow park managers to make immediate tactical decisions to then efficiently deploy resources for interdiction and active management. “DAS has enabled us to establish a fully integrated approach to our security and anti-poaching work within northern Kenya,” said Mike Watson, chief executive officer of Lewa Conservancy where the first DAS installation was deployed late last year. “This is making us significantly more effective and coordinated and is showing us limitless opportunities for conservation applications.”

The system has been installed at six protected wildlife conservation sites since November 2016. Working with Save the Elephants, African Parks Network, Wildlife Conservation Society, and the Singita Grumeti Fund as well as the Lewa Conservancy and Northern Rangelands Trust, a total of 15 locations are expected to adopt the system this year.

Which is great and a project that needs support and expansion.

However, the question remains that having “spotted” poachers, where are the resources to physically safeguard elephants and other targets of poachers?

A second link, also suggested by @Pembient, Wildlife Works, Wildlife Works Carbon / Kasigau Corridor, Kenya, another great project, reminds me of the Shirriffs of the Hobbits, who were distinguished from other Hobbits by a feather they wore in their caps:


Physical protection and monitoring – Wildlife Works trained over 120 young people, men and women, from the local communities to be Wildlife Rangers, and they perform daily foot patrols of the forest to ensure that it remains intact. The rangers are unarmed, but have the power of arrest granted by the local community.

Environmental monitoring isn’t like confronting poachers, or ordinary elephant hunters for that matter, who travel in packs, armed with automatic weapons, with dubious regard for lives other than their own.

Great programs, having a real impact, that merit your support, but not quite on point to my question of:

How can data/computer science disrupt, interfere with, burden, expose elephant hunters and their facilitators? Serious question.

Poachers must be stopped with police/military force. DAS and similar information systems have the potential to deploy forces effectively to stop poachers. Assuming adequate forces are available. The estimated loss of 100 elephants per day suggests they are not.

Hunters, on the other hand, are protected by law and tradition in their slaughter of adult elephants, who have no natural predators.

To be clearer, we know the classes of elephant hunters and facilitators exist; how should we go about populating those classes with instances, where each instance has a name, address, employer, website, email, etc.?

And once having that information, what can be done to acknowledge their past, present or ongoing hunting of elephants? Acknowledge it in such a way as to discourage any further elephant hunting by themselves or anyone who reads about them?

Elephants aren’t killed by anonymous labels such as “elephant hunters,” or “poachers,” but by identifiable, nameable, traceable individuals.

Use data science to identify, name and trace those individuals.
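A first step is settling on a record layout for those instances. A minimal sketch, with every field name a placeholder rather than a reference to any real dataset:

```python
from dataclasses import dataclass, field

# Hypothetical record for an instance of the "elephant hunter" or
# "facilitator" class. All field names are assumptions about what such
# a dataset might track, not references to any real data source.
@dataclass
class Individual:
    name: str
    role: str                      # "hunter" or "facilitator"
    address: str = ""
    employer: str = ""
    website: str = ""
    email: str = ""
    sources: list = field(default_factory=list)  # provenance of each fact

# A registry populated from public records, trophy listings, etc.
registry = [
    Individual(name="J. Doe", role="hunter", employer="Example Safaris Ltd",
               sources=["https://example.org/trophy-listing"]),
]

facilitators = [i for i in registry if i.role == "facilitator"]
print(len(registry), len(facilitators))
```

Tracking a source for every asserted fact is what separates naming identifiable individuals from mere labeling.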

November 17, 2017

DHS Algorithms – Putting Discrimination Beyond Discussion

Filed under: Algorithms,Government,Politics — Patrick Durusau @ 10:35 am

Coalition of 100+ tech groups and leaders warn the DHS that “extreme vetting” software will be a worse-than-useless, discriminatory nightmare by Cory Doctorow.

From the post:

In a pair of open letters to The Honorable Elaine C. Duke, Acting Secretary of Homeland Security, a coalition of more than 100 tech liberties groups and leading technology experts urged the DHS to abandon its plan to develop a black-box algorithmic system for predicting whether foreigners coming to the USA to visit or live are likely to be positive contributors or risks to the nation.

The letters warn that algorithmic assessment tools will be prone to religious and racial bias, in which programmers get to decide, without evidence, debate or transparency, what kind of person should be an American — which jobs, attitudes, skills and family types are “American” and which ones are “undesirable.”

Further, the system for predicting terrorist proclivities will draw from an infinitesimal data-set of known terrorists, whose common characteristics will be impossible to divide between correlative and coincidental.

If the Department of Homeland Security (DHS) needed confirmation it’s on the right track, then Doctorow and “the 100 tech liberties groups and leading technology experts” have provided that confirmation.


The letters warn that algorithmic assessment tools will be prone to religious and racial bias, in which programmers get to decide, without evidence, debate or transparency, what kind of person should be an American — which jobs, attitudes, skills and family types are “American” and which ones are “undesirable.”

To discriminate “…without evidence, debate or transparency…” is an obvious, if unstated, goal of the DHS “black-box algorithmic system.”

The claim by Doctorow and others the system will be ineffectual:

…the system for predicting terrorist proclivities will draw from an infinitesimal data-set of known terrorists, whose common characteristics will be impossible to divide between correlative and coincidental

imposes a requirement of effectiveness that has never been applied to the DHS.

Examples aren’t hard to find but consider that since late 2001, the Transportation Security Administration (TSA) has not caught a single terrorist. Let me repeat that: Since late 2001, the Transportation Security Administration (TSA) has not caught a single terrorist. But visit any airport and the non-terrorist catching TSA is in full force.

Since the Naturalization Act of 1790 forward, granting naturalization to “…free white person[s]…,” US immigration policy has been, is and likely will always be, a seething cauldron of discrimination.

That the DHS wants to formalize whim, caprice and discrimination into algorithms “…without evidence, debate or transparency…” comes as no surprise.

That Doctorow and others think pointing out discrimination to those with a history, habit and intent to discriminate is meaningful is surprising.

I’m doubtful that educating present members of Congress about the ineffective and discriminatory impact of the DHS plan will be useful as well. Congress is the source of the current discriminatory laws governing travel and immigration so I don’t sense a favorable reception there either.

Perhaps new members of Congress or glitches in DHS algorithms/operations that lead to unforeseen consequences?

November 16, 2017

Are You A Member of the 300+ Mile High Club? 1,738 Satellite Targets

Filed under: Cybersecurity,Radio,Security — Patrick Durusau @ 5:32 pm

UCS Satellite Database – In-depth details on the 1,738 satellites currently orbiting Earth.

From the post:

Assembled by experts at the Union of Concerned Scientists (UCS), the Satellite Database is a listing of the more than 1000 operational satellites currently in orbit around Earth.

Our intent in producing the database is to create a research tool for specialists and non-specialists alike by collecting open-source information on operational satellites and presenting it in a format that can be easily manipulated for research and analysis.

It is available as both a downloadable Excel file and in a tab-delimited text format. A version is also provided in which the “Name” column contains only the official name of the satellite in the case of government and military satellites, and the most commonly used name in the case of commercial and civil satellites.

Satellites are much easier targets than undersea cables. Specialized equipment is required for both, but undersea cables also require a submarine, while satellites need only a line of sight. Much easier to arrange.

With a high quality antenna and electronic gear, the sky is alive with targets. For extra points, install your antenna at a location remote from you and use an encrypted channel to control it and receive data. (Makes you less obvious than several satellite dishes in the back yard.)
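Working with the tab-delimited UCS download is straightforward. A hedged sketch, using stand-in column headers (check the actual file for the real ones):

```python
import csv
import io

# Inline sample standing in for the UCS tab-delimited download.
# Column names here are placeholders, not the database's real headers.
sample = (
    "Name\tUsers\tPurpose\tApogee_km\n"
    "SatA\tMilitary\tCommunications\t35786\n"
    "SatB\tCivil\tEarth Observation\t700\n"
    "SatC\tCommercial\tCommunications\t35786\n"
)

reader = csv.DictReader(io.StringIO(sample), delimiter="\t")
# e.g. filter for communications satellites near geostationary altitude
geo_comms = [row["Name"] for row in reader
             if row["Purpose"] == "Communications"
             and int(row["Apogee_km"]) > 30000]
print(geo_comms)   # → ['SatA', 'SatC']
```

Swap `io.StringIO(sample)` for an `open()` call on the downloaded file and the same filter runs over all 1,738 entries.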

PS: Follow the UCS Satellite DB on Twitter. Plus, the Union of Concerned Scientists.

10 Papers Every Developer Should Read (At Least Twice) [With Hyperlinks]

Filed under: Computer Science,Programming — Patrick Durusau @ 4:27 pm

10 Papers Every Developer Should Read (At Least Twice) by Michael Feathers

Feathers omits hyperlinks for the 10 papers every developer should read, at least twice.

Hyperlinks eliminate searches by every reader, saving them time and load on their favorite search engine, not to mention providing access more quickly. Feathers’ list with hyperlinks follows.

Most are easy to read but some are rough going – they drop off into math after the first few pages. Take the math to tolerance and then move on. The ideas are the important thing.

See Feathers’ post for his comments on each paper.

Even a shallow web composed of hyperlinks is better than no web at all.

Why You Should Follow Caitlin Johnstone

Filed under: Government,Politics — Patrick Durusau @ 11:55 am

Why Everyone Should Do What WikiLeaks Did

From the post:


WikiLeaks did exactly what I would do, and so should you. We should all be shamelessly attacking the unelected power structure which keeps our planet locked in endless war while promoting ecocidal corporate interests which threaten the very ecosystemic context in which our species evolved. And we should be willing to use any tools at our disposal to do that.

I’ve been quite shameless about the fact that I’m happy to have my ideas advanced by people all across the political spectrum, from far left to far right. I will never have the ear of the US President’s eldest son, but if I did I wouldn’t hesitate to try and use that advantage if I thought I could get him to put our stuff out there. This wouldn’t mean that I support the US president, it would mean that I saw an opening to throw an anti-establishment idea over the censorship fence into mainstream consciousness, and I exploited the partisan self-interest of a mainstream figure to do that.

We should all be willing to do this. We should all get very clear that America’s unelected power establishment is the enemy, and we should shamelessly attack it with any weapons we’ve got. I took a lot of heat for expressing my willingness to have my ideas shared by high profile individuals on the far right, and I see the same outrage converging upon Assange. Assange isn’t going to stop attacking the establishment death machine with every tool at his disposal because of this outrage, though, and neither am I. The more people we have attacking the elites free from any burden of partisan or ideological nonsense, the better.

What she said.

Tools you suggest I should cover?

Caitlin Johnstone at:

Facebook

Medium

Twitter

Shape Searching Dictionaries?

Facebook, despite its spying, censorship, and being a shill for the U.S. government, isn’t entirely without value.

For example, this post by Simon St. Laurent:

Drew this response from Peter Cooper:

Which if you follow the link: Shapecatcher: Unicode Character Recognition you find:

Draw something in the box!

And let shapecatcher help you to find the most similar unicode characters!

Currently, there are 11817 unicode character glyphs in the database. Japanese, Korean and Chinese characters are currently not supported.
(emphasis in original)

I take “Japanese, Korean and Chinese characters are currently not supported.” to mean that Anatolian Hieroglyphs; Cuneiform, Cuneiform Numbers and Punctuation, Early Dynastic Cuneiform, Old Persian, Ugaritic; Egyptian Hieroglyphs; Meroitic Cursive, and Meroitic Hieroglyphs are not supported as well.

But my first thought wasn’t discovery of glyphs in Unicode Code Charts, although useful, but shape searching dictionaries, such as Faulkner’s A Concise Dictionary of Middle Egyptian.

A sample from Faulkner’s (1991 edition):

Or, The Student’s English-Sanskrit Dictionary by Vaman Shivram Apte (1893):

Imagine being able to search by shape for either dictionary! Not just as a glyph but as a set of glyphs, within any entry!

I suspect that’s doable based on Benjamin Milde‘s explanation of Shapecatcher:


Under the hood, Shapecatcher uses so called “shape contexts” to find similarities between two shapes. Shape contexts, a robust mathematical way of describing the concept of similarity between shapes, is a feature descriptor first proposed by Serge Belongie and Jitendra Malik.

You can find an in-depth explanation of the shape context matching framework that I used in my Bachelor thesis (“On the Security of reCAPTCHA”). In the end, it is quite a bit different from the matching framework that Belongie and Malik proposed in 2000, but still based on the idea of shape contexts.

The engine that runs this site is a rewrite of what I developed during my bachelor thesis. To make things faster, I used CUDA to accelerate some portions of the framework. This is a fairly new technology that enables me to use my NVIDIA graphics card for general purpose computing. Newer cards are quite powerful devices!

That was written in 2011 and no doubt shape matching has progressed since then.
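A toy version of the shape-context idea illustrates why it works. This is greatly simplified relative to Belongie and Malik's method (no bin normalization, no bipartite point matching), but it shows the core trick: each point histograms the relative positions of all other points in log-polar bins, so a translated copy of a shape scores a perfect match.

```python
import math
from collections import Counter

def shape_context(points, n_r=3, n_theta=4):
    """One log-polar histogram per point over the other points' offsets."""
    hists = []
    for i, (xi, yi) in enumerate(points):
        h = Counter()
        for j, (xj, yj) in enumerate(points):
            if i == j:
                continue
            r = math.hypot(xj - xi, yj - yi)
            theta = math.atan2(yj - yi, xj - xi) % (2 * math.pi)
            r_bin = min(int(math.log1p(r)), n_r - 1)      # log radius bin
            t_bin = int(theta / (2 * math.pi) * n_theta)  # angle bin
            h[(r_bin, t_bin)] += 1
        hists.append(h)
    return hists

def chi2(h1, h2):
    """Chi-square distance between two sparse histograms."""
    keys = set(h1) | set(h2)
    return 0.5 * sum((h1[k] - h2[k]) ** 2 / (h1[k] + h2[k])
                     for k in keys if h1[k] + h2[k] > 0)

square = [(0, 0), (1, 0), (1, 1), (0, 1)]
shifted = [(x + 5, y + 5) for x, y in square]   # same shape, translated
line = [(0, 0), (1, 0), (2, 0), (3, 0)]

a, b, c = (shape_context(p) for p in (square, shifted, line))
d_same = sum(chi2(x, y) for x, y in zip(a, b))
d_diff = sum(chi2(x, y) for x, y in zip(a, c))
print(d_same, d_diff)   # translation leaves the distance at zero
```

A dictionary search engine would compare a drawn query's histograms against precomputed histograms for every glyph image, ranking entries by total distance.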

No technique will be 100% but even less than 100% accuracy will unlock generations of scholarly dictionaries, in ways not imagined by their creators.

If you are interested, I’m sure Benjamin Milde would love to hear from you.

November 15, 2017

Going Among Capitalists? Don’t Forget Your S8 USB Cable!

Filed under: Cybersecurity,Privacy,Security — Patrick Durusau @ 5:45 pm

Teardown of a consumer voice/location cellular spying device that fits in the tip of a USB cable by Cory Doctorow.

From the post:

Mich from ha.cking bought a $25 “S8 data line locator” device — a cellular spying tool, disguised as a USB cable and marketed to the general public — and did a teardown of the gadget, offering a glimpse into the world of “trickle down surveillance” where the kinds of surveillance tools used by the NSA are turned into products and sold to randos over the internet for $25.

The S8 makes use of the GSM cellular network and takes a regular micro-SIM, and can use any of the international GSM bands. You communicate with it by sending it SMSes or by using a web front-end, which causes it to switch on a hidden mic so you can listen in on its surroundings; it can also give a coarse approximation of its location (based on GSM towers, not GPS, and accurate to within about 1.57km).

For all the technical details see: Inside a low budget consumer hardware espionage implant by mich @0x6d696368.

In some legal jurisdictions use of this cable may be construed as a crime. But, as US torture of prisoners, NSA surveillance, and numerous other crimes by US operatives demonstrate, prosecution of crimes is at the whim and caprice of prosecutors.

Calling something a “crime” is pejorative labeling for media purposes, unless you are a prosecutor deciding on prosecution. Otherwise, it’s just labeling.

How To Keep A Secret, Well, Secret (Brill)

Filed under: Government,Government Data,Politics — Patrick Durusau @ 4:51 pm

Weapons of Mass Destruction: The Top Secret History of America’s Nuclear, Chemical and Biological Warfare Programs and Their Deployment Overseas, edited by Matthew M. Aid, is described as:

At its peak in 1967, the U.S. nuclear arsenal consisted of 31,255 nuclear weapons with an aggregate destructive power of 12,786 megatons – more than sufficient to wipe out all of humanity several hundred times over. Much less known is that hidden away in earth-covered bunkers spread throughout the U.S., Europe and Japan, over 40,000 tons of American chemical weapons were stored, as well as thousands of specially designed bombs that could be filled with even deadlier biological warfare agents.

The American WMD programs remain cloaked in secrecy, yet a substantial number of revealing documents have been quietly declassified since the late 1970s. Put together, they tell the story of how America secretly built up the world’s largest stockpile of nuclear, chemical, and biological weapons. The documents explain the role these weapons played in a series of world crises, how they shaped U.S. and NATO defense and foreign policy during the Cold War, and what incidents and nearly averted disasters happened. Moreover, they shed a light on the dreadful human and ecological legacy left by decades of nuclear, chemical and biological weapons manufacturing and testing in the U.S. and overseas.

This collection contains more than 2,300 formerly classified U.S. government documents, most of them classified Top Secret or higher. Covering the period from the end of World War II to the present day, it provides unique access to previously unpublished reports, memoranda, cables, intelligence briefs, classified articles, PowerPoint presentations, military manuals and directives, and other declassified documents. Following years of archival research and careful selection, they were brought together from the U.S. National Archives, ten U.S. presidential libraries, the NATO Archives in Brussels, the National Archives of the UK, the National Archives of Canada, and the National Archives of the Netherlands. In addition, a sizeable number of documents in this collection were obtained from the U.S. government and the Pentagon using the Freedom of Information Act (FOIA) and Mandatory Declassification Review (MDR) requests.

This collection comes with several auxiliary aids, including a chronology and a historiographical essay with links to the documents themselves, providing context and allowing for easy navigation for both students and scholars.

It’s an online resource of about 21,212 pages.

Although the editor, Aid, and/or Brill did a considerable amount of work assembling these documents, the outright purchase price: €4.066,00, $4,886.00 or the daily access price: $39.95/day, effectively keeps these once secret documents secret.

Of particular interest to historians and arms control experts, I expect those identifying recurrent patterns of criminal misconduct in governments will find the data of interest as well.

It does occur to me that when you look at the Contents tab, http://primarysources.brillonline.com/browse/weapons-of-mass-destruction#content-tab, each year lists the documents in the archive. Lists that could be parsed for recovery of the documents from other sources on the Internet.

You would still have to index (did I hear someone say topic map?) the documents, etc., but as a long term asset for the research community, it would be quite nice.

If you doubt the need for such a project, toss “USAF, Cable, CINCUSAFE to CSAF, May 6, 1954, Top Secret, NARA” into your nearest search engine.

How do you feel about Brill being the arbiter of 20th century history, for a price?

Me too.

From Forever Vulnerable (aka Microsoft) – Seventeen Years of Vulnerability

Filed under: Cybersecurity,Microsoft,Security — Patrick Durusau @ 4:15 pm

A seventeen-year-old vulnerability was patched in the Microsoft Equation Editor yesterday.

For a semi-technical overview, see Office Equation Editor Security Bug Runs Malicious Code Without User Interaction by Catalin Cimpanu.

For all the details and a back story useful for finding vulnerabilities, see: Skeleton in the closet. MS Office vulnerability you didn’t know about by Embedi.

Walking through the steps in the post to “re-discover” this vulnerability is good exercise.

It’s not the fault of Microsoft that its users fail to patch/upgrade Microsoft products. That being said, CVE-2017-11882, with a seventeen-year range, should be added to your evergreen list of Microsoft vulnerabilities.

Call For Cyber Weapons (Arsenal at Black Hat Asia 2018)

Filed under: Conferences,Cybersecurity,Security — Patrick Durusau @ 11:46 am

Welcome to Arsenal at Black Hat Asia 2018 – Call for Tools Open

Deadline: January 10 at 23:59 Pacific

From the webpage:

The Black Hat Arsenal team will be back in Singapore with the very same goal: give hackers & security researchers the opportunity to demo their newest and latest code.

The Arsenal tool demo area is dedicated to researchers and the open source community. The concept is quite simple: we provide the space and you bring your machine to showcase your work and answer questions from delegates attending Black Hat.

Once again, the ToolsWatch (@toolswatch) team will work in conjunction with Black Hat for the special event Black Hat Arsenal Asia 2018.

The 16th session will be held at the Marina Bay Sands in Singapore from March 22-March 23, 2018.

The same rules to consider before applying to Arsenal:

  • Bring your computer (with VGA output), adapter, your tool, your stickers
  • Avoid stodgy presentations. Folks are expecting action, so give’em action.
  • No vendor pitches or gear!
  • Be yourself, be cool, and wear a smile.
  • Hug the folks at Arsenal :)
  • Above all, have tremendous fun!!

For any questions, contact blackhatarsenal@ubm.com.

*Please note: You may use the plain text “Upload File” section if you wish to include whitepapers or research; however, this field is optional and not required.

Not as much advance notice as you have for Balisage 2018 but surely you are building new tools on a regular basis!

As you have learned from tools written by others, come to Arsenal at Black Hat Asia 2018 and enable others to learn from you.

Terminology: I say “weapons” instead of “tools” to highlight the lack of any “us” when it comes to cybersecurity.

Governments and corporations have an interest in personal privacy and security only when it furthers their agendas and none when it doesn’t.

Making governments and corporations more secure isn’t in my interest. Is it in yours? (Governments have declared their lack of interest in your privacy and security by their actions. Nothing more need be said.)

A Docker tutorial for reproducible research [Reproducible Reporting In The Future?]

Filed under: R,Replication,Reporting,Science — Patrick Durusau @ 10:07 am

R Docker tutorial: A Docker tutorial for reproducible research.

From the webpage:

This is an introduction to Docker designed for participants with knowledge about R and RStudio. The introduction is intended to be helping people who need Docker for a project. We first explain what Docker is and why it is useful. Then we go into the details on how to use it for a reproducible transportable project.

Six lessons, instructions for installing Docker, plus zip/tar ball of the materials. What more could you want?

Science has paid lip service to the idea of replication of results for centuries but with the sharing of data and analysis, reproducible research is becoming a reality.

Is reproducible reporting in the near future? Reporters preparing their analysis and releasing raw data and their extraction methods?

Or will selective releases of data, when raw data is released at all, continue to be the norm?

Please let @ICIJorg know how you feel about data hoarding, #ParadisePapers, #PanamaPapers, when data and code sharing are becoming the norm in science.
