Archive for the ‘Software’ Category

Summer Pocket Change – OrientDB Code Execution

Friday, July 14th, 2017

SSD Advisory – OrientDB Code Execution

From the webpage:

Want to get paid for a vulnerability similar to this one?

Contact us at: ssd@beyondsecurity.com

Vulnerability Summary

The following advisory reports a vulnerability in OrientDB which allows users of the product to cause it to execute code.

OrientDB is a Distributed Graph Database engine with the flexibility of a Document Database all in one product. The first and best scalable, high-performance, operational NoSQL database.

Credit

An independent security researcher, Francis Alexander, has reported this vulnerability to Beyond Security’s SecuriTeam Secure Disclosure program.

Vendor response

The vendor has released patches to address this vulnerability.

For more information: https://github.com/orientechnologies/orientdb/wiki/OrientDB-2.2-Release-Notes#security.

Some vulnerabilities require deep code analysis; others, well, just asking the right questions.

If you are looking for summer pocket change, check out default users, permissions, etc. on popular software.
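In that spirit, here is a minimal sketch of checking OrientDB's documented 2.x default accounts over its HTTP API. The host, port, and database name are hypothetical; confirm the endpoint and default credentials against the documentation for your version before relying on any of this:

```python
import base64
import urllib.request

# Documented OrientDB 2.x default accounts (verify against the release
# notes for your version; this list is an assumption, not exhaustive).
DEFAULT_CREDENTIALS = [("admin", "admin"), ("reader", "reader"), ("writer", "writer")]

def basic_auth_header(user, password):
    """Build the HTTP Basic Authorization header value for one credential pair."""
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    return f"Basic {token}"

def probe(base_url, database):
    """Try each default credential against OrientDB's /connect endpoint.

    Returns the pairs that were accepted (HTTP 204). Network failures are
    treated as 'not accepted'. Example: probe("http://localhost:2480", "demo").
    """
    accepted = []
    for user, password in DEFAULT_CREDENTIALS:
        req = urllib.request.Request(f"{base_url}/connect/{database}")
        req.add_header("Authorization", basic_auth_header(user, password))
        try:
            with urllib.request.urlopen(req, timeout=5) as resp:
                if resp.status == 204:
                    accepted.append((user, password))
        except OSError:
            pass  # connection refused, auth rejected, timeout, etc.
    return accepted
```

If `probe()` returns anything on a production host, someone skipped a step during installation.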

Software Is Politics [Proudhon’s Response]

Sunday, February 19th, 2017

Software Is Politics by Richard Pope.

From the post:

If you work in software or design in 2016, you also work in politics. The inability of Facebook’s user interface, until recently, to distinguish between real and fake news is the most blatant example. But there are subtler examples all around us, from connected devices that threaten our privacy to ads targeting men for high-paying jobs.

Digital services wield power. They can’t be designed simply for ease of use—the goal at most companies and organizations. Digital services must be understandable, accountable, and trusted. It is now a commercial as well as a moral imperative.

DESIGN IS POLITICAL

Power and politics are not easy topics for many designers to chew on, but they’re foundational to my career. I worked for the U.K.’s Government Digital Service for five years, part of the team that delivered Gov.uk. I set up the labs team at Consumer Focus, the U.K.’s statutory consumer rights organization, building tools to empower consumers. In 2007, I cofounded the Rewired State series of hackdays that aimed to get developers and designers interested in making government better. I’ve also worked at various commercial startups including moo.com and ScraperWiki.

The last piece of work I did in government was on a conceptual framework for the idea of government as a platform. “Government as a platform” is the idea of treating government like a software stack to make it possible to build well-designed services for people. The work involved sketching some ideas out in code, not to try and solve them upfront, but to try and identify where some of the hard design problems were going to be. Things like: What might be required to enable an end-to-end commercial service for buying a house? Or what would it take for local authorities to be able to quickly spin up a new service for providing parking permits?

With this kind of thinking, you rapidly get into questions of power: What should the structure of government be? Should there be a minister responsible for online payment? Secretary of state for open standards? What does it do to people’s understanding of their government?

Which cuts to the heart of the problem in software design today: How do we build stuff that people can understand and trust, and is accountable when things go wrong? How do we design for recourse?
… (emphasis in original)

The flaw in Pope’s desire for applications that are “…accountable, understandable, and trusted…” by all is that it conceals the choosing of sides.

Or as Craig Gurian in Equally free to sleep under the bridge illustrates by quoting Anatole France:

“In its majestic equality, the law forbids rich and poor alike to sleep under bridges, beg in the streets and steal loaves of bread.”

Applications that are “…accountable, understandable, and trusted…” will have silently chosen sides just as the law does now.

Better to admit to and make explicit the choices of who serves and who eats in the design of applications. At least then disparities are not smothered by the pretense of equality.

Or as Proudhon would say:

What is equality before the law without equality of fortunes? A balance with false weights.

Speak not of “…accountable, understandable, and trusted…” applications in the abstract but for and against whom?

Tooling Up: Adding Windows 10 to Ubuntu

Saturday, February 4th, 2017

In preparation for an exciting year, I have installed/upgraded several programs on Ubuntu but need to:

  • Generate OOXML files with MS Office
  • Run GIS software not otherwise available
  • Test IE/Office/Windows vulnerabilities
  • Use WebEx

That means a copy of Windows 10 to enable access to Office 365.

Abhishek Prakash’s How to Install Windows 10 in VirtualBox in Linux did the trick for me.

One caveat: VirtualBox created an optical drive by default, so when I added the Windows ISO image as a second optical drive, starting the install reported no bootable media. Deleting the default optical drive, leaving only the Windows ISO image, fixed the problem.

The subscription/install of Office 365 went smoothly.

By default, files are stored on OneDrive (1 TB).

Provocative name suggestions for encrypted core dumps?

Other than the glitch with the extra optical drive, it all went smoothly, albeit in Windows fashion, somewhat slowly at times.

Some traditions never change.

😉

Top considerations for creating bioinformatics software documentation

Wednesday, January 18th, 2017

Top considerations for creating bioinformatics software documentation by Mehran Karimzadeh and Michael M. Hoffman.

Abstract

Investing in documenting your bioinformatics software well can increase its impact and save your time. To maximize the effectiveness of your documentation, we suggest following a few guidelines we propose here. We recommend providing multiple avenues for users to use your research software, including a navigable HTML interface with a quick start, useful help messages with detailed explanation and thorough examples for each feature of your software. By following these guidelines, you can assure that your hard work maximally benefits yourself and others.

Introduction

You have written a new software package far superior to any existing method. You submit a paper describing it to a prestigious journal, but it is rejected after Reviewer 3 complains they cannot get it to work. Eventually, a less exacting journal publishes the paper, but you never get as many citations as you expected. Meanwhile, there is not even a single day when you are not inundated by emails asking very simple questions about using your software. Your years of work on this method have not only failed to reap the dividends you expected, but have become an active irritation. And you could have avoided all of this by writing effective documentation in the first place.

Academic bioinformatics curricula rarely train students in documentation. Many bioinformatics software packages lack sufficient documentation. Developers often prefer spending their time elsewhere. In practice, this time is often borrowed, and by ducking work to document their software now, developers accumulate ‘documentation debt’. Later, they must pay off this debt, spending even more time answering user questions than they might have by creating good documentation in the first place. Of course, when confronted with inadequate documentation, some users will simply give up, reducing the impact of the developer’s work.
… (emphasis in original)

Take to heart the authors’ observation on automatic generation of documentation:

The main disadvantage of automatically generated documentation is that you have less control of how to organize the documentation effectively. Whether you used a documentation generator or not, however, there are several advantages to an HTML web site compared with a PDF document. Search engines will more reliably index HTML web pages. In addition, users can more easily navigate the structure of a web page, jumping directly to the information they need.

I would replace “…less control…” with “…virtually no meaningful control…” over the organization of the documentation.

Think about it for a second. You write short comments, sometimes even incomplete sentences as thoughts occur to you in a code or data context.

An automated tool gathers those comments, even incomplete sentences, rips them out of their original context and strings them one after the other.

Do you think that provides a meaningful narrative flow for any reader? Including yourself?
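A toy sketch makes the point. The functions and their docstrings below are hypothetical, but any docstring extractor behaves essentially the same way:

```python
def load_counts(path):
    """Read the raw counts. See the caveats discussed above."""
    raise NotImplementedError  # the body is irrelevant to the point

def normalize_counts(counts):
    """Second pass. Assumes load_counts() has already been applied."""
    raise NotImplementedError

# A naive generator strings the docstrings together, stripped of the code
# context that made "above" and "second pass" meaningful to the author:
generated = "\n\n".join(f.__doc__ for f in (load_counts, normalize_counts))
print(generated)
```

The reader of `generated` has no idea what “above” referred to, because the original context was the source file, not the output document.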

Your documentation doesn’t have to be great literature, but as Karimzadeh and Hoffman point out, good documentation can make the difference between your hard work being adopted and being ignored.

Ping me if you want to take your documentation to the next level.

U.S. Navy As Software Pirates

Monday, November 14th, 2016

Navy denies it pirated 558K copies of software, says contractor consented by David Kravets.

From the post:

In response to a lawsuit accusing the US Navy of pirating more than 558,000 copies of virtual reality software, the Navy conceded Monday that it had installed the software on “hundreds of thousands of computers within its network” without paying the German software maker for it. But the Navy says it did so with the consent of the software producer.

I suspect that “consent” here means that Bitmanagement Software modified its product to remove installation restrictions in hopes the U.S. Navy would become utterly dependent upon the software and only then “notice” the Navy had licensed only 38 copies.

Nice try but sovereigns have been rolling citizens for generations.

The complaint and the government’s answer are both amusing reads.

The lesson here is you are responsible for protecting your property. Especially when exposing it to potential thieves.

Software Heritage – Universal Software Archive – Indexing/Semantic Challenges

Sunday, July 24th, 2016

Software Heritage

From the homepage:

We collect and preserve software in source code form, because software embodies our technical and scientific knowledge and humanity cannot afford the risk of losing it.

Software is a precious part of our cultural heritage. We curate and make accessible all the software we collect, because only by sharing it we can guarantee its preservation in the very long term.
(emphasis in original)

The project has already collected:

Even though we just got started, we have already ingested in the Software Heritage archive a significant amount of source code, possibly assembling the largest source code archive in the world. The archive currently includes:

  • public, non-fork repositories from GitHub
  • source packages from the Debian distribution (as of August 2015, via the snapshot service)
  • tarball releases from the GNU project (as of August 2015)

We currently keep up with changes happening on GitHub, and are in the process of automating syncing with all the above source code origins. In the future we will add many more origins and ingest into the archive software that we have salvaged from recently disappeared forges. The figures below allow to peek into the archive and its evolution over time.

The charters of the planned working groups:

Extending the archive

Evolving the archive

Connecting the archive

Using the archive

On quick review, none of them seemed to me to address the indexing/semantic challenges that searching such an archive will pose.

If you are familiar with the differences in metacharacters between different Unix programs, that is only a taste of the differences that will be faced when searching such an archive.
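For a small taste of that, the same three characters mean different things to a glob matcher and a regular-expression engine:

```python
import fnmatch
import re

pattern = "a.c"  # same three characters, two different pattern languages
candidates = ["a.c", "abc"]

# In shell-style globbing (fnmatch), "." is a literal dot:
glob_matches = [s for s in candidates if fnmatch.fnmatch(s, pattern)]

# In a regular expression, "." matches any single character:
re_matches = [s for s in candidates if re.fullmatch(pattern, s)]

print(glob_matches)  # ['a.c']
print(re_matches)    # ['a.c', 'abc']
```

Now multiply that ambiguity across every pattern language, shell, and editor represented in an archive of all source code ever written.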

Looking forward to learning more about this project!

Formal Methods for Secure Software Construction

Sunday, June 19th, 2016

Formal Methods for Secure Software Construction by Ben Goodspeed.

Abstract:

The objective of this thesis is to evaluate the state of the art in formal methods usage in secure computing. From this evaluation, we analyze the common components and search for weaknesses within the common workflows of secure software construction. An improved workflow is proposed and appropriate system requirements are discussed. The systems are evaluated and further tools in the form of libraries of functions, data types and proofs are provided to simplify work in the selected system. Future directions include improved program and proof guidance via compiler error messages, and targeted proof steps.

Ben chose Idris for this project, saying:

The criteria for selecting a language for this work were expressive power, theorem proving ability (sufficient to perform universal quantification), extraction/compilation, and performance. Idris has sufficient expressive power to be used as a general purpose language (by design) and has library support for many common tasks (including web development). It supports machine verified proof and universal quantification over its datatypes and can be directly compiled to produce efficiently sized executables with reasonable performance (see section 10.1 for details). Because of these characteristics, we have chosen Idris as the basis for our further work. (at page 57)

The other contenders were Coq, Agda, Haskell, and Isabelle.
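To make “universal quantification over its datatypes” concrete without tying the example to any one of those languages, here is the idea as a one-line machine-checked theorem in Lean 4 (Lean is used only for illustration; the thesis itself uses Idris):

```lean
-- The ∀ binds every pair of natural numbers; the kernel checks that the
-- supplied proof term really covers all of them, not just tested cases.
theorem add_comm_all : ∀ n m : Nat, n + m = m + n :=
  Nat.add_comm
```

A unit test samples a few inputs; a proof like this holds for every input, which is the property that makes formal methods attractive for security claims.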

Ben provides examples of using Idris and his Proof Driven Development (PDD), but stops well short of solving the problem of secure software construction.

While waiting upon the arrival of viable methods for secure software construction, shouldn’t formal methods be useful in uncovering and documenting failures in current software?

The reasoning: the greater specificity and exactness of formal methods will draw attention to gaps and failures concealed by custom and practice.

Akin to the human eye eliding over mistakes such as “When the the cat runs.”

The average reader “auto-corrects” for the presence of the second “the” in that sentence, even knowing there are two occurrences of the word “the.”

Perhaps that is a better way to say it: Formal methods avoid the human tendency to auto-correct or elide over unknown outcomes in code.

Learning to like design documents

Saturday, June 4th, 2016

Learning to like design documents by Julia Evans.

From the post:

Hi everyone! Today we’re going to talk about software engineering and process!

A design document is where, before starting to implement a system, you write up a thing explaining what the system is supposed to do first and how you’re planning to accomplish that. I think there are basically two goals:

  • tell people what you’re doing
  • figure out design problems with the system before you’ve been coding for 2 months

I understand that it’s super important to think ahead a lot before huge projects, but a little bit of thinking can be helpful even for smaller projects. I asked some people recently if they write design docs for small projects and some of them said “yeah totally! small ones! it helps! :D”.

I used to get kind of grumpy when someone was like “hey julia can you write a design document for your system?” It would seem like a reasonable idea, though, so I’d try to do it! But the first couple of times I tried to write one I felt like it didn’t actually really help me! I liked the idea in principle, but I didn’t really know how to apply it and I felt like it was hard to get good feedback.

Last week I wrote a design doc and I thought it was sort of helpful. Here are some current thoughts.

Be forewarned that Julia is a gifted writer and you will enjoy her posts more than your design documents. 😉

Still, Julia makes a great case for the use of design documents (a/k/a “documentation”).

Unless your job security is tied up in undocumented, spaghetti COBOL code (or its equivalent in another language), try putting Julia’s advice into action.

If you are looking for really broad but practical reading in programming, check out Julia’s list of all her posts. Pick one at random every week. You won’t be disappointed.

Flawed Input Validation = Flawed Subject Recognition

Friday, May 13th, 2016

In Vulnerable 7-Zip As Poster Child For Open Source, I covered some of the details of two vulnerabilities in 7-Zip.

Both of those vulnerabilities were summarized by the discoverers:

Sadly, many security vulnerabilities arise from applications which fail to properly validate their input data. Both of these 7-Zip vulnerabilities resulted from flawed input validation. Because data can come from a potentially untrusted source, data input validation is of critical importance to all applications’ security.

The first vulnerability is described as:

TALOS-CAN-0094, OUT-OF-BOUNDS READ VULNERABILITY, [CVE-2016-2335]

An out-of-bounds read vulnerability exists in the way 7-Zip handles Universal Disk Format (UDF) files. The UDF file system was meant to replace the ISO-9660 file format, and was eventually adopted as the official file system for DVD-Video and DVD-Audio.

Central to 7-Zip’s processing of UDF files is the CInArchive::ReadFileItem method. Because volumes can have more than one partition map, their objects are kept in an object vector. To start looking for an item, this method tries to reference the proper object using the partition map’s object vector and the “PartitionRef” field from the Long Allocation Descriptor. Lack of checking whether the “PartitionRef” field is bigger than the available amount of partition map objects causes a read out-of-bounds and can lead, in some circumstances, to arbitrary code execution.

(code in original post omitted)

This vulnerability can be triggered by any entry that contains a malformed Long Allocation Descriptor. As you can see in lines 898-905 from the code above, the program searches for elements on a particular volume, and the file-set starts based on the RootDirICB Long Allocation Descriptor. That record can be purposely malformed for malicious purpose. The vulnerability appears in line 392, when the PartitionRef field exceeds the number of elements in PartitionMaps vector.
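The original post’s code is omitted above, but the class of bug is easy to sketch. Purely as an illustration (the names below are mine, not 7-Zip’s identifiers), the missing guard amounts to a bounds check before the index is used:

```python
class MalformedArchiveError(ValueError):
    """Raised when an allocation descriptor references a nonexistent partition."""

def resolve_partition(partition_maps, partition_ref):
    """Look up a partition map object by the PartitionRef index taken from
    a (possibly attacker-controlled) Long Allocation Descriptor.

    The check below is the one the advisory says was missing: without it,
    the equivalent C++ vector access reads out of bounds.
    """
    if not 0 <= partition_ref < len(partition_maps):
        raise MalformedArchiveError(
            f"PartitionRef {partition_ref} out of range "
            f"(have {len(partition_maps)} partition maps)"
        )
    return partition_maps[partition_ref]
```

In Python an out-of-range index raises `IndexError`; in C++ it silently reads adjacent memory, which is what turns a malformed file into a potential code-execution primitive.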

I would describe the lack of a check on the “PartitionRef” field in topic maps terms as allowing a subject, here a string, of indeterminate size. That is, there is no constraint on the size of the subject, which here is a string.

That may seem like an obtuse way of putting it, but consider that such a subject, here a string longer than the “available amount of partition map objects,” can be in association with other subjects, such as the user (subject) who has invoked the application (association) containing the 7-Zip vulnerability (subject).

Err, you don’t allow users with shell access to suid root do you?

If you don’t, at least not running a vulnerable program as root may help dodge that bullet.

Or in topic maps terms, knowing the associations between applications and users may be a window on the severity of vulnerabilities.

Lest you think logging suid is an answer, remember they were logging Edward Snowden’s logins as well.

Suid logs may help for next time, but aren’t preventative in nature.

BTW, if you are interested in the details on buffer overflows, Smashing The Stack For Fun And Profit looks like a fun read.

Cybersecurity Via Litigation

Friday, April 22nd, 2016

Ex-Hacker: If You Get Hacked, Sue Somebody by Frank Konkel.

From the post:

Jeff Moss, the hacker formerly known as Dark Tangent and founder of Black Hat and DEFCON computer security conferences, has a message for the Beltway tech community: If you get owned, sue somebody.

Sue the hackers, the botnet operators that affect your business or the company that developed insecure software that let attackers in, Moss said. The days of software companies having built-in legal “liability protections” are about to come to an end, he argued.

“When the Internet-connected toaster burns down the kitchen, someone is going to get sued,” said Moss, speaking Wednesday at the QTS Information Security and Compliance Forum in Washington, D.C. “The software industry is the only industry with liability protection. Nobody else has liability protection for some weird reason. Do you think that is going to last forever?”

Some customer and their law firm will be the first ones to tag a major software company for damages.

Will that be your company/lawyers?

The only way to dispel the aura of invulnerability from around software companies is by repeated assaults by people damaged by their negligence.

Tort (think liability for civil damages) law has a long and complex history. A history that would not have developed had injured people been content to simply be injured with no compensation.

On torts in general, see: Elements of Torts in the USA by Robert B. Standler.

I tried to find an online casebook that had edited versions of some of the more amusing cases from tort history but to no avail.

You would be very surprised at what conduct has been shielded from legal liability over the years. But times do change and sometimes in favor of the injured party.

If you want to donate a used tort casebook, I’ll post examples of changing liability as encouragement for suits against software vendors. Stripped of all the legalese, facts of cases can be quite amusing/outraging.

EU Too Obvious With Wannabe A Monopoly Antics

Wednesday, April 20th, 2016

If you ever had any doubts (I didn’t) that the EU is as immoral as any other government, recent moves by the EU in the area of software will cure those.

EU hits Google with second antitrust charge by Foo Yun Chee reports:

EU antitrust regulators said that by requiring mobile phone manufacturers to pre-install Google Search and the Google Chrome browser to get access to other Google apps, the U.S. company was harming consumers by stifling competition.

Show of hands. How many of you think the EU gives a sh*t about consumers?

Yeah, that’s what I thought as well.

Or as Chee quotes European Competition Commissioner Margrethe Vestager:

“We believe that Google’s behavior denies consumers a wider choice of mobile apps and services and stands in the way of innovation by other players,” she said.

Hmmm, “other players.” Those don’t sound like consumers, those sound like people who will be charging consumers.

If you need confirmation of that reading, consider Anti-innovation: EU excludes open source from new tech standards by Glyn Moody.

From the post:


“Open” is generally used in the documents to denote “open standards,” as in the quotation above. But the European Commission is surprisingly coy about what exactly that phrase means in this context. It is only on the penultimate page of the ICT Standardisation Priorities document that we finally read the following key piece of information: “ICT standardisation requires a balanced IPR [intellectual property rights] policy, based on FRAND licensing terms.”

It’s no surprise that the Commission was trying to keep that particular detail quiet, because FRAND licensing—the acronym stands for “fair, reasonable, and non-discriminatory”—is incompatible with open source, which will therefore find itself excluded from much of the EU’s grand new Digital Single Market strategy. That’s hardly a “balanced IPR policy.”

Glyn goes on to say that FRAND licensing is the result of lobbying by American technical giants, but that seems unlikely.

The EU has attempted to favor EU-origin “allegedly” competitive software for years.

I say “allegedly” because the EU never points to competitive software in its antitrust proceedings that was excluded, only to the speculation that but for those evil American monopolists, there would be this garden of commercial and innovative European software. You bet.

There is a lot of innovative European software, but it hasn’t been produced in the same mindset that afflicts officials at the EU. They are fixated on an out-dated software sales/licensing model. Consider the rising number of companies based on nothing but open source if you want a sneak peek at the market of the future.

Being mired in market models from the past, the EU sees only protectionism (the Google complaint) and out-dated notions of software licensing (FRAND) as foundations for promoting a software industry in Europe.

Not to mention the provincialism of the EU makes it the enemy of a growing software industry in Europe. Did you know that EU funded startups are limited to hiring EU residents? (Or so I have been told, by EU startups.) It certainly works that way with EU awards.

There is nothing inconsistent with promoting open source and a vibrant EU software industry, so long as you know something about both. Knowing nothing about either has led the EU astray.

20 Slack Apps You’ll Love

Saturday, April 9th, 2016

20 Slack Apps You’ll Love

From the post:

Slack is taking the business world by storm. More and more companies are using this communication tool—and it’s becoming an increasingly robust platform due to all of the integrations being built on top of it. Now, you can do pretty much everything in Slack—from tracking how your customers use your app, to keeping tabs on company finances at a glance, to getting a daily digest of top news from around the web.

Here are 20 of the Product Hunt community’s most-loved Slack integrations. Trust us—once you give some of these a try, you’ll wonder how you ever made it through the day without them.

I tagged this under “information overload” because, even though I use and enjoy Slack, the last thing I need is another app to “manage” information flow.

What I desperately need is a mechanism that filters (no cat pics), promotes important content (not necessarily the most tweeted/reposted), integrates multiple sources/feeds (think no repetition; how many times need I see “bombing in Brussels”? I get it), and provides one-touch access to a history governed by the same rules.
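A few lines are enough to sketch the filter/deduplicate/integrate mechanism described above (a toy, not any product’s algorithm; the normalization is deliberately crude):

```python
import re

def normalize(title):
    """Crude normalization so near-duplicate headlines collide on one key."""
    return re.sub(r"\W+", " ", title.lower()).strip()

def integrate(feeds, blocked=("cat pic",)):
    """Merge several feeds in order, dropping blocked topics and repeats.

    'feeds' is an iterable of iterables of headline strings; the first
    occurrence of a headline wins, later near-duplicates are suppressed.
    """
    seen, merged = set(), []
    for feed in feeds:
        for item in feed:
            key = normalize(item)
            if any(b in key for b in blocked) or key in seen:
                continue
            seen.add(key)
            merged.append(item)
    return merged
```

Real promotion of “important” content would need a ranking signal on top of this, which is exactly the part no app on the list actually solves.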

Although I gather and process large collections of information, I can only work closely with 10 to 20 results at any given point.

More than 20 “hits” is just advertising for the “depth/breadth” of your search mechanism. Or perhaps my lack of skill with your search mechanism.

You are very likely to find a useful app in this collection but if not, don’t despair! The post concludes with a link to a list of over 350 more Slack apps!

Enjoy!

sqlite3 test suite

Sunday, March 20th, 2016

sqlite3 test suite by Nelson Minar.

From the post:

I felt guilty complaining about sqlite3’s source distribution, so I went to look at the real source, what the authors work with. It’s not managed by git but rather in Fossil (an SCM written by the sqlite3 author). Happily the web view is quite good.

One of the miraculous things about sqlite3 is its incredible test suite. There are 683,932 lines of test code. Compare to 273,000 lines of C code for the library and all its extensions. sqlite3 has a reputation for being solid and correct. It’s not an accident.

The test size is overcounted a bit because there’s a lot of test data. For instance the test for the Porter Stemmer is 24k lines of code, but almost all of that is a giant list of words and their correct stemming. Still very useful tests! But not quite as much human effort as it looks on first blush.

Just a quick reminder that test suites have the same mixture of code and data subjects as the code being tested.

So your software passes the test. What was being tested? What was not being tested (the weird-machine inputs)?

If you don’t think that is a serious question, consult the page of SQLite vulnerabilities.
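Python’s standard library bundles SQLite, so sketching the “what was not tested” side is easy: throw malformed statements at it and check that failures stay clean Python-level errors rather than crashes. This is a toy fuzz loop, not SQLite’s actual test methodology:

```python
import random
import sqlite3
import string

def fuzz_sql(n=100, seed=0):
    """Feed n random garbage statements to an in-memory SQLite database.

    The goal is not correct results but confirming that malformed input
    raises a clean exception instead of taking down the process.
    """
    rng = random.Random(seed)
    conn = sqlite3.connect(":memory:")
    outcomes = {"ok": 0, "error": 0}
    for _ in range(n):
        stmt = "".join(rng.choice(string.printable)
                       for _ in range(rng.randint(1, 40)))
        try:
            conn.execute(stmt)
            outcomes["ok"] += 1
        except (sqlite3.Error, sqlite3.Warning):
            outcomes["error"] += 1
    conn.close()
    return outcomes
```

Random printable strings barely scratch the input space; a serious effort would mutate structurally valid SQL and binary database files, which is where the reported vulnerabilities tend to live.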

I saw this in a tweet by Julia Evans.

2015 Open Source Yearbook (without email conscription)

Sunday, March 13th, 2016

Publication of the 2015 Open Source Yearbook is good news!

Five or six “clicks” and having my email conscripted to obtain a copy, not so much.

For your reading pleasure with one-click access:

The 2015 Open Source Yearbook.

Impressive work, but marred by convoluted access and email conscription.

If you want to make a resource “freely” available, do so. Don’t extort contact information for “free” information.

I’m leading conference calls tomorrow or else I would be reading the 2015 Open Source Yearbook during my calls!

Government Source Code Policy

Thursday, March 10th, 2016

Government Source Code Policy

From the webpage:

The White House committed to adopting a Government-wide Open Source Software policy in its Second Open Government National Action Plan that “will support improved access to custom software code developed for the Federal Government,” emphasizing that using and contributing back to open source software can fuel innovation, lower costs, and benefit the public.[1] In support of that commitment, today the White House Office of Management and Budget (OMB) is releasing a draft policy to improve the way custom-developed Government code is acquired and distributed moving forward. This policy is consistent with the Federal Government’s long-standing policy of ensuring that “Federal investments in IT (information technology) are merit-based, improve the performance of our Government, and create value for the American people.”[2]

This policy requires that, among other things: (1) new custom code whose development is paid for by the Federal Government be made available for reuse across Federal agencies; and (2) a portion of that new custom code be released to the public as Open Source Software (OSS).

We welcome your input on this innovative draft policy. We are especially interested in your comments on considerations regarding the release of custom code as OSS. The draft policy proposes a pilot program requiring covered agencies to release at least 20 percent of their newly-developed custom code, in addition to the release of all custom code developed by Federal employees at covered agencies as part of their official duties, subject to certain exceptions as noted in the main body of the policy.[3]

In some absolute sense this is a step forward from the present practices of the government with regard to source code that it develops or pays to have developed.

On the other hand, what’s difficult about saying that all code (not 20%) developed by or at the direction of the federal government be deposited under an Apache license within 90 days of its posting to any source code repository? Make it subject to national security exceptions, with notice given when one is claimed and the decision reviewable in the local DC federal court.

Short, simple, clear time constraints and a defined venue for review.

Anytime someone dodges the easy, obvious solution, there is a reason for that dodging. Not a reason or desire to benefit you. Unless you are the person orchestrating the dodge.

Bumping into Stallman, again [Stallmanism]

Tuesday, February 2nd, 2016

Bumping into Stallman, again by Frederick Jacobs.

From the post:

This is the second time I’m talking at the same conference as Richard Stallman, after the Ind.ie Tech Summit in Brighton, this time was at the Fri Software Days in Fribourg, Switzerland.

One day before my presentation, I got an email from the organizers, letting me know that Stallman would like me to rename the title of my talk to remove any mentions of “Open Source Software” and replace them with “Free Software”.

The email read like this:

Is it feasible to remove the terms “Open-Source” from the title of your presentation and replace them by “Free-libre software”? It’s the wish of M. Stallman, that will probably attend your talk.

Frederick didn’t change his title or presentation, while at the same time handling the issue much better than I would have.

Well, after I got through laughing my ass off that Stallman would presume to dictate word usage to anyone.

Word usage, for any stallmanists in the crowd, is an empirical question of how many people use a word with a common meaning.

At least if you want to be understood by others.

End The Lack Of Diversity On The Internet Today!

Saturday, January 16th, 2016

Julia Evans tweeted earlier today:

“programmers are 0.66% of internet users, and build the software that everyone uses” – @heddle317

The strengths of having diversity on teams, including software teams, are well known and I won’t repeat those arguments here.

See: Why Diverse Teams Create Better Work, Diversity and Work Group Performance, More Diverse Personalities Mean More Successful Teams, Managing Groups and Teams/Diversity, or, How Diversity Makes Us Smarter, for five entry points into the literature on diversity.

With 0.66% of internet users writing software for everyone, do you see the lack of diversity?

One response is to turn people into “Linus Torvalds” so we have a broader diversity of people programming. Good thought but I don’t know of anyone who wants to be a Linus Torvalds. (Sorry Linus.)

There’s a great benefit to having more people master programming but long-term, it’s not a solution to the lack of diversity in the production of software for the Internet.

Even if the number of people writing software for the Internet went up ten-fold, that’s only 6.6% of the population of Internet users. Far too monotone to qualify as any type of diversity.

There is another way to increase diversity in the production of Internet software.

Warnings: You will have to express your intuitive experience in words. You will have to communicate your experiences to programmers. Some programmers will think they know a “better way” for you to experience the interface. Always remember: your experience is the user’s experience, unlike theirs.

You can use software built for the Internet, comment on it, track your comments, and respond to comments from programmers. Programmers won’t seek you or your comments out, so volunteering is the only option.

Programmers have their views, but if software doesn’t meet the need, habits, customs of users, it’s useless.

Programmers can only learn the needs, habits and customs of users from you.

Are you going to help end this lack of diversity and help programmers write better software, or not?

Internet Explorer 8, 9, and 10 – “Really Most Sincerely Dead”

Wednesday, January 6th, 2016

Web developers rejoice; Internet Explorer 8, 9 and 10 die on Tuesday by Owen Williams.

From the post:

Internet Explorer has long been the bane of many Web developers’ existence, but here’s some news to brighten your day: Internet Explorer 8, 9 and 10 are reaching ‘end of life’ on Tuesday, meaning they’re no longer supported by Microsoft.

Three down and one to go, IE 11, if I’m reading Owen’s post correctly. Past IE 11, users will be on Edge in Windows 10.

Oh, the “…really most sincerely dead…” is from the 1939 movie The Wizard of Oz.

Five Key Phases of Software Development – Ambiguity

Saturday, December 26th, 2015

[image: development]

It isn’t clear to me if the answer is wrong because:

  • Failure to follow instructions: No description followed the five (5) stages.
  • The five stages as listed were incorrect?

A popular alternative answer to the same question:

[image: development_life_cycle]

I have heard rumors and exhortations about requirements and documentation/testing but their incidence in practice is said to be low to non-existent.

As far as “designing” the program, isn’t bargaining what “agile programming” is all about? Showing users the latest misunderstanding of their desires and arguing it is in fact better than their original requests? Sounds like bargaining to me.

Anger may be a bit brief for “code the program” but after having lost arguments with users and been told to build the UI in a particular, less-than-best way, isn’t anger a fair description?

Acceptance is a no-brainer for “operate and maintain the system.” If no one is actively trying to change the system, what other name would you have for that state?

On the whole, it was failure to follow instructions and supply a description of each stage that led to the answer being marked as incorrect. 😉

However, should you ever take the same exam, may I suggest that you give the popular alternative, although mythic, answer to such a question.

Like everyone else, software professionals don’t appreciate their myths being questioned or disputed.

I first saw the test results in a tweet by Elena Williams.

Bytes that Rock! Software Awards 2015 (Nominations Open Now – Close 16th November 2015)

Friday, November 13th, 2015

Bytes that Rock! Software Awards 2015 (Nominations Open Now – Close 16th November 2015)

An awards program for excellence in software and blogs!

The only limitation I could find is:

Bytes that Rock recognizes the best software and blogs for their excellence in the past 12 months.

Your game/software/blog may have been excellent three (3) years ago but that doesn’t count. 😉

Subject to that mild limitation, step up and:

Submit a blog, software or game clicking on the categories below!

Software blogs
VideoGame blogs
Security blogs

PC Software
Software UI
Innovative Software
Protection Software
Open Source Software

PC Games
Indie Games
Mods for games

This is not a next week, or after I ask X, or when I get home task.

This is a hit a submit link now task!

You will feel better after having made a nomination. Promise. 😉

[image: BTR_1]

(Select the graphic for a much larger version of the image.)

The Architecture of Open Source Applications

Thursday, November 12th, 2015

The Architecture of Open Source Applications

From the webpage:

Architects look at thousands of buildings during their training, and study critiques of those buildings written by masters. In contrast, most software developers only ever get to know a handful of large programs well—usually programs they wrote themselves—and never study the great programs of history. As a result, they repeat one another’s mistakes rather than building on one another’s successes.

Our goal is to change that. In these two books, the authors of four dozen open source applications explain how their software is structured, and why. What are each program’s major components? How do they interact? And what did their builders learn during their development? In answering these questions, the contributors to these books provide unique insights into how they think.

If you are a junior developer, and want to learn how your more experienced colleagues think, these books are the place to start. If you are an intermediate or senior developer, and want to see how your peers have solved hard design problems, these books can help you too.

Follow us on our blog at http://aosabook.org/blog/, or on Twitter at @aosabook and using the #aosa hashtag.

I happened upon these four books because of a tweet that mentioned: Early Access Release of Allison Kaptur’s “A Python Interpreter Written in Python” Chapter, which I found to be the tenth chapter of “500 Lines.”

OK, but what the hell is “500 Lines?” Poking around a bit I found The Architecture of Open Source Applications.

Which is the source for the material I quote above.

Do you learn from example?

Let me give you the flavor of three of the completed volumes and the “500 Lines” that is in progress:

The Architecture of Open Source Applications: Elegance, Evolution, and a Few Fearless Hacks (vol. 1), from the introduction:

Carpentry is an exacting craft, and people can spend their entire lives learning how to do it well. But carpentry is not architecture: if we step back from pitch boards and miter joints, buildings as a whole must be designed, and doing that is as much an art as it is a craft or science.

Programming is also an exacting craft, and people can spend their entire lives learning how to do it well. But programming is not software architecture. Many programmers spend years thinking about (or wrestling with) larger design issues: Should this application be extensible? If so, should that be done by providing a scripting interface, through some sort of plugin mechanism, or in some other way entirely? What should be done by the client, what should be left to the server, and is “client-server” even a useful way to think about this application? These are not programming questions, any more than where to put the stairs is a question of carpentry.

Building architecture and software architecture have a lot in common, but there is one crucial difference. While architects study thousands of buildings in their training and during their careers, most software developers only ever get to know a handful of large programs well. And more often than not, those are programs they wrote themselves. They never get to see the great programs of history, or read critiques of those programs’ designs written by experienced practitioners. As a result, they repeat one another’s mistakes rather than building on one another’s successes.

This book is our attempt to change that. Each chapter describes the architecture of an open source application: how it is structured, how its parts interact, why it’s built that way, and what lessons have been learned that can be applied to other big design problems. The descriptions are written by the people who know the software best, people with years or decades of experience designing and re-designing complex applications. The applications themselves range in scale from simple drawing programs and web-based spreadsheets to compiler toolkits and multi-million line visualization packages. Some are only a few years old, while others are approaching their thirtieth anniversary. What they have in common is that their creators have thought long and hard about their design, and are willing to share those thoughts with you. We hope you enjoy what they have written.

The Architecture of Open Source Applications: Structure, Scale, and a Few More Fearless Hacks (vol. 2), from the introduction:

In the introduction to Volume 1 of this series, we wrote:

Building architecture and software architecture have a lot in common, but there is one crucial difference. While architects study thousands of buildings in their training and during their careers, most software developers only ever get to know a handful of large programs well… As a result, they repeat one another’s mistakes rather than building on one another’s successes… This book is our attempt to change that.

In the year since that book appeared, over two dozen people have worked hard to create the sequel you have in your hands. They have done so because they believe, as we do, that software design can and should be taught by example—that the best way to learn how think like an expert is to study how experts think. From web servers and compilers through health record management systems to the infrastructure that Mozilla uses to get Firefox out the door, there are lessons all around us. We hope that by collecting some of them together in this book, we can help you become a better developer.

The Performance of Open Source Applications, from the introduction:

It’s commonplace to say that computer hardware is now so fast that most developers don’t have to worry about performance. In fact, Douglas Crockford declined to write a chapter for this book for that reason:

If I were to write a chapter, it would be about anti-performance: most effort spent in pursuit of performance is wasted. I don’t think that is what you are looking for.

Donald Knuth made the same point thirty years ago:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

but between mobile devices with limited power and memory, and data analysis projects that need to process terabytes, a growing number of developers do need to make their code faster, their data structures smaller, and their response times shorter. However, while hundreds of textbooks explain the basics of operating systems, networks, computer graphics, and databases, few (if any) explain how to find and fix things in real applications that are simply too damn slow.

This collection of case studies is our attempt to fill that gap. Each chapter is written by real developers who have had to make an existing system faster or who had to design something to be fast in the first place. They cover many different kinds of software and performance goals; what they have in common is a detailed understanding of what actually happens when, and how the different parts of large applications fit together. Our hope is that this book will—like its predecessor The Architecture of Open Source Applications—help you become a better developer by letting you look over these experts’ shoulders.

500 Lines or Less From the GitHub page:

Every architect studies family homes, apartments, schools, and other common types of buildings during her training. Equally, every programmer ought to know how a compiler turns text into instructions, how a spreadsheet updates cells, and how a database efficiently persists data.

Previous books in the AOSA series have done this by describing the high-level architecture of several mature open-source projects. While the lessons learned from those stories are valuable, they are sometimes difficult to absorb for programmers who have not yet had to build anything at that scale.

“500 Lines or Less” focuses on the design decisions and tradeoffs that experienced programmers make when they are writing code:

  • Why divide the application into these particular modules with these particular interfaces?
  • Why use inheritance here and composition there?
  • How do we predict where our program might need to be extended, and how can we make that easy for other programmers?

Each chapter consists of a walkthrough of a program that solves a canonical problem in software engineering in at most 500 source lines of code. We hope that the material in this book will help readers understand the varied approaches that engineers take when solving problems in different domains, and will serve as a basis for projects that extend or modify the contributions here.

If you answered the question about learning from example with yes, add these works to your read and re-read list.

BTW, for markup folks, check out Parsing XML at the Speed of Light by Arseny Kapoulkine.

Many hours of reading and keyboard pleasure await anyone using these volumes.

Sharing Economy – Repeating the Myth of Code Reuse

Wednesday, September 23rd, 2015

Bitcoin and sharing economy pave the way for new ‘digital state’ by Sophie Curtis.

Sophie quotes Cabinet Office minister Matthew Hancock MP as saying:

For example, he said that Cloud Foundry, a Californian company that provides platform-as-a-service technology, could help to create a code library for digital public services, helping governments around the world to share their software.

“Governments across the world need the same sorts of services for their citizens, and if we write them in an open source way, there’s no need to start from scratch each time,” he said.

“So if the Estonian government writes a program for licensing, and we do loads of licensing in the UK, it means we’ll be able to pull that code down and build the technology cheaper. Local governments deliver loads of services too and they can base their services on the same platforms.”

However, he emphasised that this is about sharing programs, code and techniques – not about sharing data. Citizens’ personal data will remain the responsibility of the government in question, and will not be shared across borders, he said.

I’m guessing that “The Rt Hon Matt Hancock MP” hasn’t read:

The code reuse myth: why internal software reuse initiatives tend to fail by Ben Morris


The largest single barrier to effective code reuse is that it is difficult. It doesn’t just happen by accident. Reusable code has to be specifically designed for a generalised purpose and it is unlikely to appear spontaneously as a natural by-product of development projects.

Reusable components are usually designed to serve an abstract purpose and this absence of tangible requirements can make them unusually difficult to design, develop and test. Their development requires specific skills and knowledge of design patterns that is not commonly found in development teams. Developing for reuse is an art in itself and it takes experience to get the level of abstraction right without making components too specific for general use or too generalised to add any real value.

These design challenges can be exacerbated by organisational issues in larger and more diffused development environments. If you are going to develop common components then you will need a very deep understanding of a range of constantly evolving requirements. As the number of projects and teams involved in reuse grows it can be increasingly difficult to keep track of these and assert any meaningful consistency.

Successful code reuse needs continuous effort to evolve shared assets in step with the changing business and technical landscape. This demands ownership and governance to ensure that assets don’t fall into disrepair after the initial burst of effort that established them. It also requires a mature development process that provides sufficient time to design, test, maintain and enhance reusable assets. Above all, you need a team of skilled architects and developers who are sufficiently motivated and empowered to take a lead in implementing code reuse.

Reuse Myth – can you afford reusable code? by Allan Kelly


In my Agile Business Conference presentation (“How much quality can we afford?”) I talked about the Reuse Myth; this is something I always touch on when I deliver a training course but I’ve never taken time to write it down. Until now.

Let’s take as our starting point Kevlin Henney’s observation that “there is no such thing as reusable code, only code that is reused.” Kevlin (given the opportunity!) goes on to examine what constitutes “reuse” over simple “use.” A good discussion in itself, but right now I want to suggest that an awful lot of code which is “designed for reuse” is never actually re-used.

In effect that design effort is over engineering, waste in other words. One of the reasons developers want to “design for reuse” is not so much because the code will be reused but rather because they desire a set of properties (modularity, high cohesion, low coupling, etc.) which are desirable engineering properties but sound a bit abstract.

In other words, striving for “re-usability” is a developer’s way of striving for well engineered code. Unfortunately in striving for re-usability we lose focus, which brings us to the second consideration… the cost of re-usability.

In Mythical Man Month (1974) Fred Brooks suggests that re-usable code costs three times as much to develop as single use code. I haven’t seen any better estimates so I tend to go with this one. (If anyone has any better estimates please send them over.)

Think about this. This means that you have to use your “reusable” code three times before you break even. And it means you only see a profit (saving) on the fourth reuse.

How much code which is built for reuse is reused four times?
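Kelly’s break-even arithmetic is easy to check. A toy sketch with illustrative unit costs (the normalization is mine, only the 3x premium comes from the Brooks estimate quoted above):

```python
# Back-of-the-envelope check of the reuse break-even point.
# Illustrative unit costs only: single-use code costs 1 unit to write
# (and is rewritten for every use); reusable code costs the 3x premium
# once, after which each reuse is free.
def cost_single_use(uses):
    return uses * 1.0   # rewrite it every time

def cost_reusable(uses):
    return 3.0          # pay the 3x premium once, reuse for free

for uses in range(1, 6):
    saving = cost_single_use(uses) - cost_reusable(uses)
    print(uses, saving)  # negative until use 3, positive from use 4
```

Running it shows a loss at one and two uses, break-even at three, and the first saving at four, which is exactly Kelly’s point.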

Those are two “hits” out of 393,000 that I got this afternoon searching on (with the quotes) “code reuse.”

Let’s take The Rt Hon Matt Hancock MP statement and re-write it a bit:

Hypothetical Statement – Not an actual statement by The Rt Hon Matt Hancock MP, he’s not that well informed:

“So if the Estonian government spends three (3) times as much to write a program for licensing, and we do loads of licensing in the UK, it means we’ll be able to pull that code down and build the technology cheaper. Local governments deliver loads of services too and they can base their services on the same platforms.”

Will the Estonian government, which is like other governments, spend three (3) times as much developing software on the off chance that the UK may want to use it?

Would any government undertake software development on that basis?

Do you have an answer other than NO! to either of those questions?

There are lots of competent computer people in the UK but none of them are advising The Rt Hon Matt Hancock MP. Or he isn’t listening. Amounts to the same thing.

Most Significant Barriers to Achieving a Strong Cybersecurity Posture

Tuesday, September 15th, 2015

Cyber-Security Stat of the Day, is sponsored by Grid Cyber Sec, and is a window into cyber-security practices/thinking.

For September 14, 2015, we find Most Significant Barriers to Achieving a Strong Cybersecurity Posture:

[image: cyber-stat-barriers]

Does the omission of “more secure software” shock you? (You know the difference between “shock” and “surprise.” Yes?)

If we keep layering buggy software on top of buggy software, then we are no smarter than most of the members of Congress who think legislation can determine behavior. Legislation can influence behavior, mostly in ways not intended, but determine it? No.

Buggy software + more buggy software = cyber insecurity.

What’s so hard about that?

BTW, do subscribe to Cyber-Security Stat of the Day. Sometimes funny, sometimes helpful, sometimes dismaying, but it’s never boring.

Sora high performance software radio is now open source

Saturday, July 25th, 2015

Sora high performance software radio is now open source by Jane Ma.

From the post:

Microsoft researchers today announced that their high-performance software radio project is now open sourced through GitHub. The goal for Microsoft Research Software Radio (Sora) is to develop the most advanced software radio possible, capable of implementing the latest wireless communication technology easily and efficiently.

"We believe that a fully open source Sora will better support the research community on more scientific innovation," said Kun Tan, a senior researcher on the software radio project team.

Conventionally, the critical lower layer processing in wireless communication systems, i.e., the physical layer (PHY) and medium access control (MAC), are typically implemented in hardware (ASIC chips), due to high-computational and real-time requirements. However, designing ASIC is very costly and inflexible since ASIC chips are fixed. Once delivered, it cannot be changed or upgraded. The lack of flexibility and programmability makes experimental research in wireless communication very difficult. Software Radio (or SDR), on the contrary, proposes implementing all these low-level PHY and MAC processes through software, which is practical for development, debugging and updating. The challenge, however, is how the software can stay up to date with hardware in terms of performance.

See also: Microsoft's Wireless and Networking research group

Sora was developed to solve this significant challenge. Sora is a fully programmable high-performance software radio that is capable of implementing state-of-the-art wireless technologies (Wi-Fi, LTE, MIMO, etc.). Sora is based on software running on a low-cost, commodity multi-core PC with a general purpose OS, i.e., Windows. A multi-core PC, plugged in to a PCIe radio control board, connecting to a third-party radio front-end with antenna, becomes a powerful software radio platform. The PC interface board transfers the raw wireless (I/Q) signals between the RF front-end and the PC memory through fast DMA. All signals are processed in the software running in the PC.
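As a toy illustration of the “all processing in software” idea (this is not Sora code, just a minimal sketch): once raw I/Q samples sit in ordinary PC memory, a modem is nothing more than functions over complex numbers. Here is BPSK, the simplest case:

```python
# Minimal software-radio sketch: modulate and demodulate BPSK over
# complex baseband (I/Q) samples, entirely in software.
import cmath

def bpsk_modulate(bits, samples_per_symbol=4):
    # bit 1 -> carrier phase 0 (sample +1), bit 0 -> phase pi (sample -1)
    return [cmath.exp(1j * (0 if b else cmath.pi))
            for b in bits for _ in range(samples_per_symbol)]

def bpsk_demodulate(samples, samples_per_symbol=4):
    # integrate each symbol period, then slice on the sign of I
    bits = []
    for i in range(0, len(samples), samples_per_symbol):
        symbol = sum(samples[i:i + samples_per_symbol])
        bits.append(1 if symbol.real > 0 else 0)
    return bits

tx = bpsk_modulate([1, 0, 1, 1, 0])
print(bpsk_demodulate(tx))  # → [1, 0, 1, 1, 0]
```

A real PHY adds filtering, synchronization and channel coding, but the point stands: nothing here requires an ASIC, which is exactly the flexibility argument Sora makes.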

An avalanche of wireless signals will accompany the Internet of Things (IoT). Intercepting all of them with custom hardware would be prohibitively expensive.

Thanks to Microsoft, you can skip the custom hardware step.

Remember: the question is who is listening, not if.

Mars Code

Monday, June 22nd, 2015

Mars Code by Gerard Holzmann, JPL Laboratory for Reliable Software.

Abstract:

On August 5 at 10:18 p.m. PDT, a large rover named Curiosity made a soft landing on the surface of Mars. Given the one-way light-time to Mars, the controllers on Earth learned about the successful touchdown 14 minutes later, at 10:32 p.m. PDT. As can be expected, all functions on the rover, and on the spacecraft that brought it to Mars, are controlled by software. In this talk we review the process that was followed to secure the reliability of this code.

Gerard Holzmann is a senior research scientist and a fellow at NASA’s Jet Propulsion Laboratory, the lab responsible for the design of the Mars Science Laboratory Mission to Mars and its Curiosity Rover. He is best known for designing the Logic Model Checker Spin, a broadly used tool for the logic verification of multi-threaded software systems. Holzmann is a fellow of the ACM and a member of the National Academy of Engineering.

Timemark 8:50 starts the discussion of software environments for testing.

The first slide about software reads:

3.8 million lines
~ 60,000 pages
~ 100 really large books

120 Parallel Threads

2 CPUs (1 spare, not parallel, hardware backup)

5 years development time, with a team of 40 software engineers, < 10 lines of code per hour

1 customer, 1 use: it must work the first time

So how do you make sure you get it right?

Steps they took to make the software right:

  1. adopted a risk-based Coding Standard with tool-based compliance checks (very few rules and every rule had a mission that failed because the rule wasn’t followed)
  2. provided training & Certification for software developers
  3. conducted daily builds integrated with Static Source Code Analysis (with penalties for breaking the build)
  4. used a tool-based Code Review process
  5. performed thorough unit- and (daily) integration testing
  6. did Logic Verification of critical subsystems with a model checker
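Step 1 above is worth dwelling on. A rough sketch of what a tool-based compliance check looks like (hypothetical rules and code, echoing JPL-style restrictions such as no dynamic memory after initialization and no goto, not JPL’s actual checker):

```python
# Toy compliance checker for a risk-based coding standard.
# A real tool would parse the code; this one just pattern-matches
# lines against a small table of named rules.
import re

RULES = {
    "no-malloc": re.compile(r"\b(malloc|free)\s*\("),
    "no-goto": re.compile(r"\bgoto\b"),
}

def check(source):
    """Return (line number, rule name) for every violation found."""
    violations = []
    for lineno, line in enumerate(source.splitlines(), 1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                violations.append((lineno, rule))
    return violations

code = """\
int main(void) {
    char *p = malloc(16);
    goto done;
done:
    return 0;
}
"""
print(check(code))  # → [(2, 'no-malloc'), (3, 'no-goto')]
```

The value is less in any single rule than in the automation: as Holzmann notes, every rule was backed by a mission that failed because the rule wasn’t followed, and the tool makes compliance non-optional.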

He continues to examine each of these areas in detail. Be forewarned: the first level of conformance is compiling with all warnings on and having 0 warnings. The bare minimum.

BTW, there are a number of resources online at the JPL Laboratory for Reliable Software (LaRS).

Share this post with anyone who claims it is too hard to write secure software. It may be, for them, but not for everyone.

jQAssistant 1.0.0 released

Friday, April 24th, 2015

jQAssistant 1.0.0 released by Dirk Mahler.

From the webpage:

We’re proud to announce the availability of jQAssistant 1.0.0 – lots of thanks go to all the people who made this possible with their ideas, criticism and code contributions!

Feature Overview

  • Static code analysis tool using the graph database Neo4j
  • Scanning of software related structures, e.g. Java artifacts (JAR, WAR, EAR files), Maven descriptors, XML files, relational database schemas, etc.
  • Allows definition of rules and automated verification during a build process
  • Rules are expressed as Cypher queries or scripts (e.g. JavaScript, Groovy or JRuby)
  • Available as Maven plugin or CLI (command line interface)
  • Highly extensible by plugins for scanners, rules and reports
  • Integration with SonarQube
  • It’s free and Open Source

Example Use Cases

  • Analysis of existing code structures and matching with proposed architecture and design concepts
  • Impact analysis, e.g. which test is affected by potential code changes
  • Visualization of architectural concepts, e.g. modules, layers and their dependencies
  • Continuous verification and reporting of constraint violations to provide fast feedback to developers
  • Individual gathering and filtering of metrics, e.g. complexity per component
  • Post-Processing of reports of other QA tools to enable refactorings in brown field projects
  • and much more…

Get it!

jQAssistant is available as a command line client from the downloadable distribution

jqassistant.sh scan -f my-application.war
jqassistant.sh analyze
jqassistant.sh server

or as Maven plugin:

<dependency>
    <groupId>com.buschmais.jqassistant.scm</groupId>
    <artifactId>jqassistant-maven-plugin</artifactId>
    <version>1.0.0</version>
</dependency>

For a list of latest changes refer to the release notes, the documentation provides usage information.

Those who are impatient should go for the Get Started page which provides information about the first steps about scanning applications and running analysis.

Your Feedback Matters

Every kind of feedback helps to improve jQAssistant: feature requests, bug reports and even questions about how to solve specific problems. You can choose between several channels – just pick your preferred one: the discussion group, stackoverflow, a Gitter channel, the issue tracker, e-mail or just leave a comment below.

Workshops

You want to get started quickly for an inventory of an existing Java application architecture? Or you’re interested in setting up a continuous QA process that verifies your architectural concepts and provides graphical reports?
The team of buschmais GbR offers individual workshops for you! For getting more information and setting up an agenda refer to http://jqassistant.de (German) or just contact us via e-mail!

Short of widespread censorship, in order for security breaches to fade from the news spotlight, software quality/security must improve.

jQAssistant 1.0.0 is one example of the type of tool required for software quality/security to improve.

Of particular interest is its use of Neo4j, which enables named relationships between materials and your code.

I don’t mean to foster “…everything is a graph…” thinking any more than I would foster “…everything is a set of relational tables…” or “…everything is a key/value pair…,” etc. The question is: “What is the best way, given my requirements and constraints, to achieve objective X?” Whether relationships are explicit (and, if so, what can I say about them?) or implicit depends on my requirements, not those of a vendor.

In the case of recording who wrote the most buffer overflows and where, plus other flaws, tracking named relationships and similar information should be part of your requirements and graphs are a good way to meet that requirement.
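As a hypothetical sketch of that requirement (invented names and a tiny in-memory triple store, not Neo4j or jQAssistant’s actual API), named relationships make the “who wrote the most buffer overflows?” question a one-line query:

```python
# Record named relationships between developers, files and flaws,
# then query across them. All names are made up for illustration.
from collections import Counter

edges = []  # (subject, relationship, object) triples

def relate(subject, relationship, obj):
    edges.append((subject, relationship, obj))

relate("alice", "WROTE", "parser.c")
relate("bob", "WROTE", "net.c")
relate("alice", "WROTE", "util.c")
relate("parser.c", "CONTAINS_FLAW", "buffer-overflow")
relate("net.c", "CONTAINS_FLAW", "buffer-overflow")
relate("util.c", "CONTAINS_FLAW", "buffer-overflow")

def flaw_authors(flaw):
    """Count WROTE edges into files that CONTAINS_FLAW the given flaw."""
    flawed = {s for s, r, o in edges if r == "CONTAINS_FLAW" and o == flaw}
    return Counter(s for s, r, o in edges if r == "WROTE" and o in flawed)

print(flaw_authors("buffer-overflow").most_common(1))  # → [('alice', 2)]
```

In jQAssistant the same idea is expressed as a Cypher query over the scanned code graph; the point here is only that the relationships carry names, so the question can be asked at all.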

On Lemmings and PageRank

Tuesday, March 17th, 2015

Solving Open Source Discovery by Andrew Nesbitt.

From the post:

Today I’m launching Libraries.io, a project that I’ve been working on for the past couple of months.

The intention is to help developers find new open source libraries, modules and frameworks and keep track of ones they depend upon.

The world of open source software depends on a lot of open source libraries. We are standing on the shoulders of giants, which helps us to reach further than we could otherwise.

The problem with platforms like Rubygems and NPM is there are so many libraries, with hundreds of new ones added every day. Trying to find the right library can be overwhelming.

How do you find libraries that help you solve problems? How do you then know which of those libraries are worth using?

Andrew substitutes dependencies for links in a PageRank algorithm and then:

Within Libraries.io I’ve aggregated over 700,000 projects, written in 130 languages from across 22 package managers, including dependencies, releases, license information and source code repository information. This results in a rich index of almost every open source library available for use today.

Follow me on Twitter at @teabass and @librariesio for updates. Discussion on Hacker News: https://news.ycombinator.com/item?id=9211084.
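Substituting dependencies for hyperlinks in PageRank can be sketched as follows (project names are invented, and Libraries.io’s actual ranking may differ; the edge “A depends on B” plays the role of the hyperlink, so widely-depended-upon libraries accumulate rank):

```python
# PageRank over a dependency graph: deps maps each project to the
# projects it depends on. Dangling projects (no dependencies) spread
# their rank evenly, the standard PageRank treatment.
def pagerank(deps, damping=0.85, iterations=50):
    projects = set(deps) | {d for ds in deps.values() for d in ds}
    n = len(projects)
    rank = {p: 1.0 / n for p in projects}
    for _ in range(iterations):
        new = {p: (1 - damping) / n for p in projects}
        for p, ds in deps.items():
            if ds:
                share = damping * rank[p] / len(ds)
                for d in ds:
                    new[d] += share
            else:  # dangling node
                for q in projects:
                    new[q] += damping * rank[p] / n
        rank = new
    return rank

deps = {
    "my-web-app": ["requests", "left-pad"],
    "another-app": ["requests"],
    "requests": [],
    "left-pad": [],
}
ranks = pagerank(deps)
# "requests" has two dependents, "left-pad" one, so "requests" wins.
print(max(ranks, key=ranks.get))  # → requests
```

This also makes the “lemming view” objection below concrete: the score rewards whatever most projects already depend on, not what would serve your particular interests.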

Is Libraries.io going to be useful? Yes!

Is Libraries.io a fun way to explore projects? Yes!

Is Libraries.io a great alternative to current source search options? Yes!

Is Libraries.io the solution to open source discovery? Less clear.

I say that because PageRank, whether using hyperlinks or dependencies, results in a lemming view of the world in question.

Wikipedia reports this is an image of a lemming:

[image: Lemming]

I, on the other hand, bear a passing resemblance to this image:

[image: patrick-photo]

I offer those images as evidence that I am not a lemming! 😉

The opinions and usages of others can be of interest, but I follow work and people of interest to me, not because they are of interest to others. Otherwise I would be following Lady Gaga on Twitter, for example. To save you the trouble of downloading her forty-five million (45M) followers, I hereby attest that I am not one of them.

Make no mistake, Andrew’s work should be used, followed, supported, and improved, but as another view of an important data set, not as the solution.

I first saw this in a tweet by Arfon Smith.

Can Spark Streaming survive Chaos Monkey?

Tuesday, March 17th, 2015

Can Spark Streaming survive Chaos Monkey? by Bharat Venkat, Prasanna Padmanabhan, Antony Arokiasamy, Raju Uppalapati.

From the post:

Netflix is a data-driven organization that places emphasis on the quality of data collected and processed. In our previous blog post, we highlighted our use cases for real-time stream processing in the context of online recommendations and data monitoring. With Spark Streaming as our choice of stream processor, we set out to evaluate and share the resiliency story for Spark Streaming in the AWS cloud environment. A Chaos Monkey based approach, which randomly terminated instances or processes, was employed to simulate failures.

Spark on Amazon Web Services (AWS) is relevant to us as Netflix delivers its service primarily out of the AWS cloud. Stream processing systems need to be operational 24/7 and be tolerant to failures. Instances on AWS are ephemeral, which makes it imperative to ensure Spark’s resiliency.

If Spark were a commercial product, this is where you would see, in bold: not a vendor report, but a report from a customer.

You need to see the post for the details but so you know what to expect:

Component (type): behaviour on component failure (the post also marks whether each failure mode is resilient):

Driver (process): in client mode, the entire application is killed; in cluster mode with supervise, the Driver is restarted on a different Worker node.

Master (process): with a single Master, the entire application is killed; with multiple Masters, a STANDBY master is elected ACTIVE.

Worker process (process): all child processes (executor or driver) are also terminated and a new worker process is launched.

Executor (process): a new executor is launched by the Worker process.

Receiver (thread(s)): same as Executor, since receivers are long-running tasks inside the Executor.

Worker node (node): Worker, Executor and Driver processes run on Worker nodes, and the behavior is the same as killing them individually.

I can think of few things more annoying than software that works only sometimes. If you want users to rely upon you, then your service will have to be reliable.
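To see what a Chaos Monkey style test looks like in miniature, here is a toy sketch: a supervisor keeps a worker process alive while a “monkey” terminates it at fixed points, and we count the restarts. The process layout is illustrative only; the Netflix test targeted actual Spark driver/master/executor processes on AWS, not Python subprocesses:

```python
import multiprocessing as mp
import time

def worker():
    # Stand-in for a long-running process (driver, master, executor, ...).
    time.sleep(60)

def run_chaos_test(iterations=10, kill_at=(2, 6)):
    """Terminate the worker at fixed steps; the supervisor restarts it.
    Returns the number of restarts performed."""
    proc = mp.Process(target=worker, daemon=True)
    proc.start()
    restarts = 0
    for step in range(iterations):
        if step in kill_at:            # the monkey strikes
            proc.terminate()
            proc.join()
        if not proc.is_alive():        # supervisor detects the failure...
            proc = mp.Process(target=worker, daemon=True)
            proc.start()               # ...and launches a replacement
            restarts += 1
        time.sleep(0.05)
    proc.terminate()
    proc.join()
    return restarts

if __name__ == "__main__":
    print(run_chaos_test())  # → 2 (one restart per kill)
```

The resiliency question in the Netflix post is essentially whether each Spark component has a supervisor playing this role, and what state (if any) is lost across the restart.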

A performance post by Netflix is rumored to be in the offing!

Enjoy!

DIY Web Server

Monday, March 16th, 2015

Let’s Build A Web Server. Part 1. by Ruslan Spivak.

From the post:

Out for a walk one day, a woman came across a construction site and saw three men working. She asked the first man, “What are you doing?” Annoyed by the question, the first man barked, “Can’t you see that I’m laying bricks?” Not satisfied with the answer, she asked the second man what he was doing. The second man answered, “I’m building a brick wall.” Then, turning his attention to the first man, he said, “Hey, you just passed the end of the wall. You need to take off that last brick.” Again not satisfied with the answer, she asked the third man what he was doing. And the man said to her while looking up in the sky, “I am building the biggest cathedral this world has ever known.” While he was standing there and looking up in the sky the other two men started arguing about the errant brick. The man turned to the first two men and said, “Hey guys, don’t worry about that brick. It’s an inside wall, it will get plastered over and no one will ever see that brick. Just move on to another layer.”1

The moral of the story is that when you know the whole system and understand how different pieces fit together (bricks, walls, cathedral), you can identify and fix problems faster (errant brick).

What does it have to do with creating your own Web server from scratch?

I believe to become a better developer you MUST get a better understanding of the underlying software systems you use on a daily basis and that includes programming languages, compilers and interpreters, databases and operating systems, web servers and web frameworks. And, to get a better and deeper understanding of those systems you MUST re-build them from scratch, brick by brick, wall by wall. (emphasis in original)

You probably don’t want to try this with an office suite package but for a basic web server this could be fun!
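As a taste of what the series covers, a bare-bones HTTP server needs nothing beyond the sockets API. This sketch is in the spirit of Part 1 but is not Ruslan’s exact code; the port, response body, and the optional max_requests parameter (added here to make it testable) are my own choices:

```python
import socket

def handle_request(conn):
    """Read one HTTP request and answer with a fixed response."""
    conn.recv(1024)  # a real server would parse the request line and headers
    conn.sendall(b"HTTP/1.1 200 OK\r\n"
                 b"Content-Type: text/plain\r\n\r\n"
                 b"Hello, World!")
    conn.close()     # closing the connection marks the end of the body

def serve(port=8888, max_requests=None):
    listen_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listen_socket.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    listen_socket.bind(("127.0.0.1", port))
    listen_socket.listen(1)
    served = 0
    while max_requests is None or served < max_requests:
        conn, _addr = listen_socket.accept()  # blocks until a client connects
        handle_request(conn)
        served += 1
    listen_socket.close()
```

Run serve() and point a browser (or curl) at http://localhost:8888/ to see the response. Everything a framework gives you, from request parsing to concurrency, is what the later installments layer on top of this loop.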

More installments to follow.

Enjoy!

Principles of Model Checking

Tuesday, March 3rd, 2015

Principles of Model Checking by Christel Baier and Joost-Pieter Katoen. Foreword by Kim Guldstrand Larsen.

From the webpage:

Our growing dependence on increasingly complex computer and software systems necessitates the development of formalisms, techniques, and tools for assessing functional properties of these systems. One such technique that has emerged in the last twenty years is model checking, which systematically (and automatically) checks whether a model of a given system satisfies a desired property such as deadlock freedom, invariants, or request-response properties. This automated technique for verification and debugging has developed into a mature and widely used approach with many applications. Principles of Model Checking offers a comprehensive introduction to model checking that is not only a text suitable for classroom use but also a valuable reference for researchers and practitioners in the field.

The book begins with the basic principles for modeling concurrent and communicating systems, introduces different classes of properties (including safety and liveness), presents the notion of fairness, and provides automata-based algorithms for these properties. It introduces the temporal logics LTL and CTL, compares them, and covers algorithms for verifying these logics, discussing real-time systems as well as systems subject to random phenomena. Separate chapters treat such efficiency-improving techniques as abstraction and symbolic manipulation. The book includes an extensive set of examples (most of which run through several chapters) and a complete set of basic results accompanied by detailed proofs. Each chapter concludes with a summary, bibliographic notes, and an extensive list of exercises of both practical and theoretical nature.

The present IT structure has shown itself to be as secure as a sieve. Do you expect the “Internet of Things” to be any more secure?

If you are interested in secure or at least less buggy software, more formal analysis is going to be a necessity. This title will give you an introduction to the field.
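To get a feel for what the book formalizes, here is a toy explicit-state checker: it enumerates the reachable states of a transition system and checks a simple safety property, deadlock freedom (every reachable state has a successor). The lock-ordering model below is hand-built and illustrative, a long way from the book’s automata-based algorithms:

```python
from collections import deque

def check(initial, transitions):
    """transitions maps a state to its successor states.
    Returns (deadlock_free, reachable_states)."""
    reachable = {initial}
    frontier = deque([initial])
    deadlocks = []
    while frontier:
        state = frontier.popleft()
        succs = transitions.get(state, [])
        if not succs:                    # no outgoing transition: deadlock
            deadlocks.append(state)
        for s in succs:
            if s not in reachable:       # breadth-first state exploration
                reachable.add(s)
                frontier.append(s)
    return not deadlocks, reachable

# Two processes taking locks A and B in opposite orders: the classic deadlock.
transitions = {
    "start": ["p1_has_A", "p2_has_B"],
    "p1_has_A": ["p1_has_AB", "deadlock_candidate"],
    "p2_has_B": ["p2_has_BA", "deadlock_candidate"],
    "p1_has_AB": ["start"],    # p1 releases both locks
    "p2_has_BA": ["start"],
    "deadlock_candidate": [],  # each process waits on the other's lock
}
ok, states = check("start", transitions)
print(ok)  # → False: a reachable state has no outgoing transition
```

Real model checkers do vastly better than this brute-force enumeration (symbolic representations, abstraction, partial-order reduction), which is exactly the ground the book covers.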

It dates from 2008 so some updating will be required.

I first saw this in a tweet by Reid Draper.