Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

April 11, 2016

SEMS 2016 (Auditable Spreadsheets – Quick Grab Your Heart Pills)

Filed under: Programming,Spreadsheets,Transparency — Patrick Durusau @ 3:12 pm

3rd International Workshop on Software Engineering Methods in Spreadsheets

July 4, 2016 Vienna, Austria

Abstracts due: April 11th (that’s today!)

Papers due: April 22nd

From the webpage:

SEMS is the #1 venue for academic spreadsheet research since 2014 (SEMS’14, SEMS’15). This year, SEMS’16 is going to be co-located with STAF 2016 in Vienna.

Spreadsheets are heavily used in industry as they are easy to create and evolve through their intuitive visual interface. They are often initially developed as simple tools, but, over time, spreadsheets can become increasingly complex, up to the point they become too complicated to maintain. Indeed, in many ways, spreadsheets are similar to “professional” software: both concern the storage and manipulation of data, and the presentation of results to the user. But unlike in “professional” software, activities like design, implementation, and maintenance in spreadsheets have to be undertaken by end-users, not trained professionals. This makes applying methods and techniques from other software technologies a challenging task.

The role of SEMS is to explore the possibilities of adopting successful methods from other software contexts to spreadsheets. Some, like testing and modeling, have been tried before and can be built upon. For methods that have not yet been tried on spreadsheets, SEMS will serve as a platform for early feedback.

The SEMS program will include an industrial keynote, followed by a brainstorming session about the topic, a discussion panel of industrial spreadsheet usage, presentation of short and long research papers and plenty of lively discussions. The intended audience is a mixture of spreadsheet researchers and professionals.

Felienne Hermans pioneered viewing spreadsheets as programming artifacts, a view that can result in easier maintenance and even, gasp, auditing of spreadsheets.

Inspectors General, GAO and other birds of that feather should sign up for this conference.

Remember topic maps for cumulative and customized auditing data. For example, who, by name, was explaining entries that several years later appear questionable? Topic maps can capture as much or as little data as you require.

Attend, submit an abstract today and a paper in two weeks!

April 6, 2016

Exploratory Programming for the Arts and Humanities

Filed under: Art,Humanities,Programming — Patrick Durusau @ 8:28 pm

Exploratory Programming for the Arts and Humanities by Nick Montfort.

From the webpage:

This book introduces programming to readers with a background in the arts and humanities; there are no prerequisites, and no knowledge of computation is assumed. In it, Nick Montfort reveals programming to be not merely a technical exercise within given constraints but a tool for sketching, brainstorming, and inquiring about important topics. He emphasizes programming’s exploratory potential—its facility to create new kinds of artworks and to probe data for new ideas.

The book is designed to be read alongside the computer, allowing readers to program while making their way through the chapters. It offers practical exercises in writing and modifying code, beginning on a small scale and increasing in substance. In some cases, a specification is given for a program, but the core activities are a series of “free projects,” intentionally underspecified exercises that leave room for readers to determine their own direction and write different sorts of programs. Throughout the book, Montfort also considers how computation and programming are culturally situated—how programming relates to the methods and questions of the arts and humanities. The book uses Python and Processing, both of which are free software, as the primary programming languages.

Full Disclosure: I haven’t seen a copy of Exploratory Programming.

I am reluctant to part with $40.00 US for either print or an electronic version where the major heads in the table of contents read as follows:

1 Modifying a Program

2 Calculating

3 Double, Double

4 Programming Fundamentals

5 Standard Starting Points

6 Text I

7 Text II

8 Image I

9 Image II

10 Text III

11 Statistics and Visualization

12 Animation

13 Sound

14 Interaction

15 Onward

The table of contents shows that more than one hundred of the two hundred and sixty-three pages are spent on introductory computer programming topics.

Text, which has a healthy section on string operations, merits a mere seventy pages. The other one hundred pages are split between visualization, sound, animation, etc.

Compare that table of contents with this one*:

Chapter One – Modular Programming: An Approach

Chapter Two – Data Entry and Text Verification

Chapter Three – Index and Concordance

Chapter Four – Text Criticism

Chapter Five – Improved Searching Techniques

Chapter Six – Morphological Analysis

Which table of contents promises to be more useful for exploration?

Personal computers are vastly more powerful today than when the second table of contents was penned.

Yet, students start off as though they are going to write their own tools from scratch. Unlikely and certainly not the best use of their time.

In-depth coverage of NLTK (the Natural Language Toolkit), applied to historical or contemporary texts, would teach them a useful tool. A tool they could apply to other material.
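To make the "Index and Concordance" idea concrete, here is a minimal keyword-in-context sketch using only the Python standard library. The function name `concordance`, its `width` parameter, and the sample text are illustrative assumptions, not taken from either book; NLTK's `Text.concordance` method offers a ready-made version of the same idea.

```python
import re


def concordance(text, keyword, width=30):
    """Return keyword-in-context lines for each occurrence of keyword."""
    lines = []
    # \b keeps "art" from matching inside "artist"; matching ignores case.
    pattern = re.compile(r"\b" + re.escape(keyword) + r"\b", re.IGNORECASE)
    for match in pattern.finditer(text):
        start = max(0, match.start() - width)
        end = min(len(text), match.end() + width)
        left = text[start:match.start()]
        right = text[match.end():end]
        # Right-align the left context so the keywords line up in a column.
        lines.append(f"{left:>{width}}[{match.group()}]{right}")
    return lines


# Illustrative sample text only.
sample = (
    "Double, double toil and trouble; "
    "Fire burn and cauldron bubble. "
    "Double, double toil and trouble."
)
hits = concordance(sample, "toil", width=15)
for line in hits:
    print(line)
```

Writing this by hand takes a dozen lines; the point of the paragraph above is that a student who learns the existing tool gets this and much more without starting from scratch.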

To cover machine learning, consider Weka. A tool students can learn in class and then apply in new and different situations.

There are tools for image and sound analysis but the important term is tool.

Just as we don’t teach students to make their own paper, we should focus on enabling them to reap the riches that modern software tools offer.

Or to put it another way, let’s stop repeating the past and move forward.

* Oh, the second table of contents? Computer Programs for Literary Analysis, John R. Abercrombie, Philadelphia: University of Pennsylvania Press, ©1984. Yes, 1984.

April 5, 2016

Python Code + Data + Visualization (Little to No Prose)

Filed under: Graphics,Programming,Python,Visualization — Patrick Durusau @ 12:46 pm

Up and Down the Python Data and Web Visualization Stack

Using the “USGS dataset listing every wind turbine in the United States,” this notebook walks you through data analysis and visualization with only code and visualizations.

That’s it.

Aside from very few comments, there is no prose in this notebook at all.

You will either hate it or be rushing off to do a similar notebook on a topic of interest to you.

Looking forward to seeing the results of those choices!

March 23, 2016

Brave Clojure: Become a Better Programmer

Filed under: Clojure,Functional Programming,Programming — Patrick Durusau @ 8:57 pm

Brave Clojure: Become a Better Programmer by Daniel Higginbotham.

From the post:

Next week I’m re-launching www.braveclojure.com as Brave Clojure. The site will continue featuring Clojure for the Brave and True, but I’m expanding its scope a bit. Instead of just housing the book, the purpose of the site will be to help you and the people you cherish become better programmers.

Like many other Clojurists, I fell in love with the language because learning it made me a better programmer. I started learning it because I was a bit bored and burnt out on the languages and tools I had been using. Ruby, Javascript, Objective-C weren’t radically different from each other, and after using them for many years I felt like I was stagnating.

But Clojure, with its radically different approach to computation (and those exotic parentheses) drew me out of my programming funk and made it fun to code again. It gave me new tools for thinking about software, and a concomitant feeling that I had an unfair advantage over my colleagues. So of course the subtitle of Clojure for the Brave and True is learn the ultimate language and become a better programmer.

And, four years since I first encountered Rich Hickey’s fractal hair, I still find Clojure to be an exceptional tool for becoming a better programmer. This is because Clojure is a fantastic tool for exploring programming concepts, and the talented community has created exceptional libraries for such diverse approaches as forward-chaining rules engines and constraint programming and logic programming, just to name a few.

Mark your calendar to help drive the stats for Daniel’s relaunch of www.braveclojure.com as Brave Clojure.

Email, tweet, blog, etc., to help others drive not only the relaunch stats but the stats for following weeks as well.

This could be one of those situations where your early participation and contributions will shape the scope and the nature of this effort.

Enjoy!

March 20, 2016

How-To Maintain Project Delivery Dates – Skip Critical Testing

Filed under: Programming,Project Management,Security — Patrick Durusau @ 4:25 pm

David Willman documents a tried and true way to maintain a project schedule, skipping critical testing, in: Pentagon skips tests on key component of U.S.-based missile defense system.

How critical?

Here’s part of David’s description:

Against the advice of its own panel of outside experts, the U.S. Missile Defense Agency is forgoing tests meant to ensure that a critical component of the nation’s homeland missile defense system will work as intended.

The tests that are being skipped would evaluate the reliability of small motors designed to help keep rocket interceptors on course as they fly toward incoming warheads.

The components, called alternate divert thrusters, are vital to the high-precision guidance required to intercept and destroy an enemy warhead traveling at supersonic speed – a feat likened to hitting one speeding bullet with another.

The interceptors, deployed in underground silos at Vandenberg Air Force Base in Santa Barbara County and at Ft. Greely, Alaska, are the backbone of the Ground-based Midcourse Defense system (GMD) – the nation’s main defense against a sneak attack by North Korea or Iran.

Hmmm, hitting a supersonic target with a supersonic bullet and you don’t test the aiming mechanism that makes them collide?

How critical does that sound?

The consequences of failure, assuming the entire program isn’t welfare for the contractors and their employees, could be a nuke landing on the West Coast of the United States.

Does that make it sound more critical?

Or do we need to guess which city? Losing Los Angeles or San Diego would increase property values in San Jose, so there would be an offset to take into account.

Here’s my advice: Don’t ever skip critical testing or continue to participate in a project that skips critical testing. Walk away.

Not quietly. Tell everyone you know about the skipped testing. NDAs be damned.

No one is well served by skipped testing.

A lack of testing has led to the broken Internet of Things.

Is that what you want?

March 19, 2016

LANGSEC: Taming the Weird Machines (Subject Identities in Code/Data)

Filed under: Cybersecurity,Functional Programming,Programming,Security — Patrick Durusau @ 4:48 pm

LANGSEC: Taming the Weird Machines by Jacob Torrey.

From the post:

Introduction

I want to get some of my opinions on the current state of computer security out there, but first I want to highlight some of the most exciting, and in my views, promising recent developments in security: language-theoretic security (LangSec). Feel free to skip the next few paragraphs of background if you are familiar with the concepts to get to my analysis, otherwise, buckle up for a little ride!

Background

If I were to distill the core of the LangSec movement into a single thesis it would be this: The complexity of our computing systems (both software and hardware) have reached such a degree that data must be treated as formally as code. A concrete example of this is return-oriented programming (ROP), where instead of executing shellcode loaded into memory by the attacker, a number of gadgets are found in existing code (such as libc) and their addresses chained together on the stack and as the ret instruction is repeatedly called, the semantics of the gadgets is executed. This hybrid execution environment of using existing code and driving it with a buffer-overflow of data is one example of a weird machine.

Such weird machines crop up in many sorts of places: viz. the Intel x86 MMU that has been shown to be Turing-complete, the meta-data of ELF executable files that can drive execution in the loading & dynamic-linking stage, etc… This highlights the fact that data can be treated as instructions or code on these weird machines, much like Java byte-code is data to an x86 CPU, it is interpreted as code by the JVM. The JVM is a formal, explicit machine, much like the x86 CPU; weird machines on the other hand are ad hoc, implicit and generally not intentionally created. Many exploits are simply shellcode developed for a weird machine instead of the native CPU.
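The idea that data can drive existing code as if it were a program can be sketched with a toy model. This is only an analogy, not real return-oriented programming: the "gadgets" are ordinary Python functions, the "addresses" are dictionary keys, and every name here (`GADGETS`, `run_chain`, the gadget functions) is hypothetical.

```python
# Toy model only: real ROP chains machine-code gadgets via the `ret`
# instruction. Here "gadgets" are plain functions and "addresses" are
# dictionary keys, all invented for illustration.

def gadget_load_two(state):
    state["acc"] = 2


def gadget_double(state):
    state["acc"] *= 2


def gadget_add_one(state):
    state["acc"] += 1


# The "existing code": written by the program's author for other purposes.
GADGETS = {
    0x10: gadget_load_two,
    0x20: gadget_double,
    0x30: gadget_add_one,
}


def run_chain(stack):
    """Take each 'return address' from the data stack and run its gadget."""
    state = {"acc": 0}
    for address in stack:
        GADGETS[address](state)
    return state["acc"]


# Attacker-controlled data: nothing but a list of numbers, yet it selects
# and sequences the computation, so it behaves as a program.
chain = [0x10, 0x20, 0x20, 0x30]
result = run_chain(chain)
print(result)
```

No new code was supplied by the "attacker" in this sketch; the list of numbers alone decides what gets computed, which is the sense in which the data is code for an implicit machine.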

The “…data must be formally treated as code…” caught my eye as the reverse of “…code-as-data…,” which is a characteristic of Lisp and Clojure.

From a topic map/subject identity perspective, the problem is accepting implied subject identities and therefore implied properties and associations.

Being “implied” and not “explicit,” the interaction of subjects can change when someone, perhaps a hacker (or a fat-fingered user), supplies values that fall within the range of implied subject identities, properties, or associations.

Implied subject identities, properties, or associations, in code or data, reside in the minds of programmers, making detection well nigh impossible. At least prior to some hacker discovering an implied subject identity, property or association.

Avoiding implied subject identities, properties and associations will require work, loathsome to all programmers, but making subject identities explicit, enumerating their properties and allowed associations, in code and data, is a countable activity.

Having made subject identities explicit, capturing those results makes code based on those explicit subject identities more robust. You won’t be piling implied subject identities on top of implied subject identities, or in plainer English, you won’t be writing cybersecurity software.
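One way to picture enumerating subject identities and their allowed associations is the following minimal sketch. The subject types, the relation names, and the functions `parse_subject_type` and `check_association` are all hypothetical; the only point is that anything not explicitly listed is rejected rather than silently accepted.

```python
from enum import Enum


class SubjectType(Enum):
    """The complete, countable list of subject types (hypothetical)."""
    PERSON = "person"
    ACCOUNT = "account"
    DOCUMENT = "document"


# Every permitted association is listed; anything else is rejected.
ALLOWED_ASSOCIATIONS = {
    (SubjectType.PERSON, "owns", SubjectType.ACCOUNT),
    (SubjectType.PERSON, "authored", SubjectType.DOCUMENT),
}


def parse_subject_type(raw):
    """Accept only values that name an enumerated subject type."""
    try:
        return SubjectType(raw)
    except ValueError:
        raise ValueError(f"unknown subject type: {raw!r}") from None


def check_association(source, relation, target):
    """Return True only for explicitly enumerated associations."""
    triple = (parse_subject_type(source), relation, parse_subject_type(target))
    return triple in ALLOWED_ASSOCIATIONS


print(check_association("person", "owns", "account"))
print(check_association("account", "owns", "person"))
```

An input such as `"robot"` raises `ValueError` instead of falling into some implied category, which is the behavior an explicit identity discipline is after.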

PS: Using a subject identity discipline does not mean you must document all of your code using XTM. You could but DSLs designed for your code/data may be more efficient.

March 14, 2016

APL in R “The past isn’t dead. It isn’t even past.”*

Filed under: Arrays,Programming,R — Patrick Durusau @ 8:13 pm

APL in R by Jan de Leeuw and Masanao Yajima.

From the introduction:

APL was introduced by Iverson (1962). It is an array language, with many functions to manipulate multidimensional arrays. R also has multidimensional arrays, but not as many functions to work with them.

In R there are no scalars, there are vectors of length one. For a vector x in R we have dim(x) equal to NULL and length(x) > 0. For an array, including a matrix, we have length(dim(x)) > 0. APL is an array language, which means everything is an array. For each array both the shape ⍴A and the rank ⍴⍴A are defined. Scalars are arrays with shape equal to one, vectors are arrays with rank equal to one.

If you want to evaluate APL expressions using a traditional APL virtual keyboard, we recommend the nice webpage at ngn.github.io/apl/web/index.html. EliStudio at fastarray.appspot.com/default.html is essentially an APL interpreter running in a Qt GUI, using ascii symbols and symbol-pairs to replace traditional APL symbols (Chen and Ching (2013)). Eli does not have nested arrays. It does have ecc, which compiles eli to C.

In 1994 one of us coded most APL array operations in XLISP-STAT. The code is still available at gifi.stat.ucla.edu/apl.
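The shape and rank mentioned in the introduction can be sketched for nested Python lists. This is an illustration under stated assumptions, not the authors' R code: the functions `shape` and `rank` are hypothetical helpers, and how a non-list leaf is treated is a choice of this sketch rather than a claim about APL or R.

```python
def shape(array):
    """Return the shape of a rectangular nested list as a tuple."""
    if not isinstance(array, list):
        return ()  # a non-list leaf has no dimensions in this sketch
    if not array:
        return (0,)
    first = shape(array[0])
    if any(shape(item) != first for item in array[1:]):
        raise ValueError("ragged nested lists are not arrays")
    return (len(array),) + first


def rank(array):
    """Return the number of dimensions, i.e. the length of the shape."""
    return len(shape(array))


vector = [1, 2, 3]
matrix = [[1, 2, 3], [4, 5, 6]]
print(shape(vector), rank(vector))
print(shape(matrix), rank(matrix))
```

The rank is just the shape of the shape, which mirrors the ⍴A and ⍴⍴A notation in the quoted passage.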

I am certain this will be useful for R programmers but, more generally, I am curious whether there is a genealogy of functions across programming languages.

Enjoy!

*Apologies to William Faulkner.

Open Source Clojure Projects

Filed under: Clojure,Functional Programming,Open Source,Programming — Patrick Durusau @ 12:28 pm

Open Source Clojure Projects by Daniel Higginbotham.

Daniel Higginbotham of Clojure for the Brave and True has posted this listing of open source Clojure projects with the blurb:

Looking to improve your skills and work with real code? These projects are under active development and welcome new contributors.

You can see the source at: https://github.com/braveclojure/open-source, where it says:

Pull requests welcome!

Do you know of any other open source Clojure projects that welcome new contributors?

Like yours?

Just by way of example, marked as “beginner friendly,” you will find:

alda – A general purpose music programming language

Avi – A lively vi (a spec & implementation of vim)

clj-rethinkdb – An idiomatic RethinkDB client for Clojure

For the more sure-footed:

ClojureCL – Parallel computations on the GPU with OpenCL 2.0 in Clojure

Enjoy!

March 9, 2016

Program Derivation for Functional Languages – Tuesday, March 29, 2016 Utrecht

Filed under: Functional Programming,Programming,Standards — Patrick Durusau @ 9:21 pm

Program Derivation for Functional Languages by Felienne Hermans.

From the webpage:

Program Derivation for Functional Languages

Program derivation of course was all the rage in the era of Dijkstra, but is it still relevant today in the age of TDD and model checking? Felienne thinks so!

In this session she will show you how to systematically and step-by-step derive a program from a specification. Functional languages especially are very suited to derive programs for, as they are close to the mathematical notation used for proofs.

You will be surprised to know that you already know and apply many techniques for derivation, like Introduce Parameter as supported by Resharper. Did you know that is actually a program derivation technique called generalization?
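As a rough illustration of what "Introduce Parameter" as generalization can look like, here is a small sketch. It is not taken from the talk; the function names are hypothetical, and Python stands in for a functional language.

```python
def sum_of_squares(numbers):
    """Original: the exponent 2 is fixed inside the body."""
    return sum(n ** 2 for n in numbers)


def sum_of_powers(numbers, exponent):
    """Generalized: the constant has become a parameter."""
    return sum(n ** exponent for n in numbers)


def sum_of_squares_derived(numbers):
    """The original, recovered as a special case of the general function."""
    return sum_of_powers(numbers, 2)


data = [1, 2, 3]
print(sum_of_squares(data), sum_of_squares_derived(data), sum_of_powers(data, 3))
```

The derived function agrees with the original on every input, which is the property a derivation step is meant to preserve.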

I don’t normally post about local meetups but as it says in the original post, Felienne is an extraordinary speaker and the topic is an important one.

Personally I am hopeful that at least slides and/or perhaps even video will emerge from this presentation.

If you can attend, please do!

In the meantime, if you need something to tide you over, consider:

A Calculus of Functions for Program Derivation by Richard Bird (1987).

Lectures on Constructive Functional Programming by R.S. Bird (1988).

Richard Bird’s Publication page.

A brief introduction to the derivation of programs by Juris Reinfelds (1986).

March 4, 2016

Requirements – Programming Exercise – @jessitron

Filed under: Programming,Requirements — Patrick Durusau @ 1:36 pm

Jessica Kerr @jessitron posted to Twitter:

Programming exercise:
I give you some requirements
You write the code
A third person tries to guess the requirements based on the code.

Care to try the same exercise on existing business/government processes?

Or return to code that you wrote a year or more ago?

If you aren’t following @jessitron you should be.

February 13, 2016

Clojure for the Brave and True (Google Group)

Filed under: Clojure,Functional Programming,Programming — Patrick Durusau @ 5:02 pm

Clojure for the Brave and True (Google Group) by Daniel Higginbotham.

First there was the website: Clojure for the Brave and True.

Then there was the book: Clojure for the Brave and True.

But before the book there was Daniel’s twitter account: @nonrecursive.

There’s no truth to the rumor of a free print trade publication with the title: Clojure for the Brave and True so please direct your questions and answers to:

Clojure for the Brave and True (Google Group)

Enjoy!

February 6, 2016

Clojure for Data Science [Caution: Danger of Buyer’s Regret]

Filed under: Clojure,Data Science,Functional Programming,Programming — Patrick Durusau @ 10:15 pm

Clojure for Data Science by Mike Anderson.

From the webpage:

Presentation given at the Jan 2016 Singapore Clojure Users’ Group

You will have to work at the presentation because there is no accompanying video, but the effort will be well spent.

Before you review these slides or pass them onto others, take fair warning that you may experience “buyer’s regret” with regard to your current programming language/paradigm (if not already Clojure).

However powerful and shiny your present language seems now, its luster will be dimmed after scanning over these slides.

Don’t say you weren’t warned ahead of time!

BTW, if you search for “clojure for data science” (with the quotes) you will find among other things:

Clojure for Data Science Progressing by Henry Garner (Packt)

Repositories for the Clojure for Data Science Processing book.

@cljds Clojure Data Science twitter feed (Henry Garner). VG!

Clojure for Data Science Some 151 slides by Henry Garner.

Plus:

Planet Clojure, a metablog that collects posts from other Clojure blogs.

As a close friend says from time to time, “‘clojure for data science’ G*****s well.” 😉

Enjoy!

February 4, 2016

Spontaneous Preference for their Own Theories (SPOT effect) [SPOC?]

Filed under: Ontology,Programming,Semantics — Patrick Durusau @ 5:04 pm

The SPOT Effect: People Spontaneously Prefer their Own Theories by Aiden P. Gregg, Nikhila Mahadevan, and Constantine Sedikides.

Abstract:

People often exhibit confirmation bias: they process information bearing on the truth of their theories in a way that facilitates their continuing to regard those theories as true. Here, we tested whether confirmation bias would emerge even under the most minimal of conditions. Specifically, we tested whether drawing a nominal link between the self and a theory would suffice to bias people towards regarding that theory as true. If, all else equal, people regard the self as good (i.e., engage in self-enhancement), and good theories are true (in accord with their intended function), then people should regard their own theories as true; otherwise put, they should manifest a Spontaneous Preference for their Own Theories (i.e., a SPOT effect). In three experiments, participants were introduced to a theory about which of two imaginary alien species preyed upon the other. Participants then considered in turn several items of evidence bearing on the theory, and each time evaluated the likelihood that the theory was true versus false. As hypothesized, participants regarded the theory as more likely to be true when it was arbitrarily ascribed to them as opposed to an “Alex” (Experiment 1) or to no one (Experiment 2). We also found that the SPOT effect failed to converge with four different indices of self-enhancement (Experiment 3), suggesting it may be distinctive in character.

I can’t give you the details on this article because it is behind a paywall.

But the catch phrase, “Spontaneous Preference for their Own Theories (i.e., a SPOT effect)” certainly fits every discussion of semantics I have ever read or heard.

With a little funding you could prove the corollary, Spontaneous Preference for their Own Code (the SPOC effect) among programmers. 😉

There are any number of formulations for how to fight confirmation bias but Jeremy Dean puts it this way:


The way to fight the confirmation bias is simple to state but hard to put into practice.

You have to try and think up and test out alternative hypothesis. Sounds easy, but it’s not in our nature. It’s no fun thinking about why we might be misguided or have been misinformed. It takes a bit of effort.

It’s distasteful reading a book which challenges our political beliefs, or considering criticisms of our favourite film or, even, accepting how different people choose to live their lives.

Trying to be just a little bit more open is part of the challenge that the confirmation bias sets us. Can we entertain those doubts for just a little longer? Can we even let the facts sway us and perform that most fantastical of feats: changing our minds?

I wonder if that includes imagining using JSON? (shudder) 😉

Hard to do, particularly when we are talking about semantics and what we “know” to be the best practices.

Examples of trying to escape the confirmation bias trap and the results?

Perhaps we can encourage each other.

January 30, 2016

Tip #20: Play with Racket [Computer Science for Everyone?]

Filed under: Computer Science,Education,Programming — Patrick Durusau @ 2:23 pm

Tip #20: Play with Racket by Aaron Quint and Michael R. Bernstein.

From the post:

Racket is a programming language in the Lisp tradition that is different from other programming languages in a few important ways. It can be any language you want – because Racket is heavily used for pedagogy, it has evolved into a suite of languages and tools that you can use to explore as many different programming paradigms as you can think of. You can also download it and play with it right now, without installing anything else, or knowing anything at all about computers or programming. Watching Matthias Felleisen’s “big-bang: the world, universe, and network in the programming language” talk will give you an idea of how Racket can be used to help people learn how to think about mathematics, computation, and more. Try it out even if you “hate Lisp” or “don’t know how to program” – it’s really a lot of fun.

Aaron and Michael scooped President Obama’s computer science skills for everyone by a day:

President Barack Obama said Saturday he will ask Congress for billions of dollars to help students learn computer science skills and prepare for jobs in a changing economy.

“In the new economy, computer science isn’t an optional skill. It’s a basic skill, right along with the three R’s,” Obama said in his weekly radio and Internet address….(Obama Wants $4B to Help Students Learn Computer Science)

“Computer science for everyone” is a popular chant, but consider the Insecure Internet of Things (IIoT).

Will minimal computer science skills increase or decrease the level of security for the IIoT?

That’s what I think too.

Removal of IoT components is the only real defense. Expect a vibrant cottage industry to grow up around removing IoT components.

January 16, 2016

End The Lack Of Diversity On The Internet Today!

Filed under: Design,Diversity,Programming,Software,Software Engineering — Patrick Durusau @ 4:04 pm

Julia Evans tweeted earlier today:

“programmers are 0.66% of internet users, and build the software that everyone uses” – @heddle317

The strengths of having diversity on teams, including software teams, are well known and I won’t repeat those arguments here.

See: Why Diverse Teams Create Better Work, Diversity and Work Group Performance, More Diverse Personalities Mean More Successful Teams, Managing Groups and Teams/Diversity, or, How Diversity Makes Us Smarter, for five entry points into the literature on diversity.

With 0.66% of internet users writing software for everyone, do you see the lack of diversity?

One response is to turn people into “Linus Torvalds” so we have a broader diversity of people programming. Good thought but I don’t know of anyone who wants to be a Linus Torvalds. (Sorry Linus.)

There’s a great benefit to having more people master programming but long-term, it’s not a solution to the lack of diversity in the production of software for the Internet.

Even if the number of people writing software for the Internet went up ten-fold, that’s only 6.6% of the population of Internet users. Far too monotone to qualify as any type of diversity.

There is another way to increase diversity in the production of Internet software.

Warnings: You will have to express your intuitive experience in words. You will have to communicate your experiences to programmers. Some programmers will think they know a “better way” for you to experience the interface. Always remember your experience is the user’s experience, unlike theirs.

You can use software built for the Internet, comment on it, track your comments, and respond to comments from programmers. Programmers won’t seek you or your comments out so volunteering is the only option.

Programmers have their views, but if software doesn’t meet the need, habits, customs of users, it’s useless.

Programmers can only learn the needs, habits and customs of users from you.

Are you going to help end this lack of diversity and help programmers write better software, or not?

December 21, 2015

Encapsulation and Clojure – Part I

Filed under: Clojure,Programming — Patrick Durusau @ 6:48 pm

Encapsulation and Clojure – Part I by James Reeves.

From the post:

Encapsulation is a mainstay of object orientated programming, but in Clojure it’s often avoided. Why does Clojure steer clear of a concept that many programming languages consider to be best practice?

Err, because “best practices” may be required to “fix” problems baked into a language?

That would be my best guess.

December 14, 2015

Fixing Bugs In Production

Filed under: Humor,Privacy,Programming,Security — Patrick Durusau @ 8:48 pm

MΛHDI posted this to twitter and it is too good not to share:

Amusing now but what happens when the illusion of “static data” disappears and economic activity data is streamed from every transaction point?

Your code and analysis will need to specify the time boundaries of the data that underlie your analysis. Depending on the level of your analysis, it may quickly become outdated as new data streams in for further analysis.
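A minimal sketch of carrying the time boundaries along with a result, so that an analysis over streaming data states exactly which slice it covers. The `WindowedTotal` class, the `total_in_window` function, and the sample transactions are hypothetical, chosen only to illustrate the point.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class WindowedTotal:
    """An analysis result that records the data window it was computed on."""
    start: datetime  # inclusive
    end: datetime    # exclusive
    total: float
    count: int


def total_in_window(events, start, end):
    """Sum amounts for events with start <= timestamp < end."""
    selected = [amount for ts, amount in events if start <= ts < end]
    return WindowedTotal(start, end, sum(selected), len(selected))


# Illustrative (timestamp, amount) pairs only.
events = [
    (datetime(2015, 12, 1, 9, 0), 20.0),
    (datetime(2015, 12, 1, 17, 30), 5.5),
    (datetime(2015, 12, 2, 8, 15), 12.0),
]
report = total_in_window(events, datetime(2015, 12, 1), datetime(2015, 12, 2))
print(report)
```

Because the window travels with the numbers, a reader of the report can tell at a glance that the third transaction arrived after the boundary and is not reflected in the total.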

To do the level of surveillance that law enforcement longs for in the San Bernardino attack, you would need real time sales transaction data for the last 5 years, plus bank records and “see something say something” reports on 322+ million citizens of the United States.

Now imagine fixing bugs in that production code, when arrest and detention, if not more severe consequences await.

Data Science Lessons [Why You Need To Practice Programming]

Filed under: Data Science,Programming,Python — Patrick Durusau @ 7:30 pm

Data Science Lessons by Shantnu Tiwari.

Shantnu has authored several programming books using Python and has a series of videos (with more forthcoming) on doing data science with Python.

Shantnu had me when he used data from the Hubble Space Telescope in his Introduction to Pandas with Practical examples.

The videos build one upon another and new users will appreciate that not every move is the correct one. 😉

If I had to pick one video to share, of those presently available, it would be:

Why You Need To Practice Programming.

It’s not new advice but it certainly is advice that needs repeating.

This anecdote is told about Pablo Casals (world famous cellist):

When Casals (then age 93) was asked why he continued to practice the cello three hours a day, he replied, “I’m beginning to notice some improvement.”

What are you practicing three hours a day?

November 29, 2015

Idiomatic Python Resources

Filed under: Programming,Python — Patrick Durusau @ 4:57 pm

Idiomatic Python Resources by Andrew Montalenti.

From the post:

Let’s say you’ve just joined my team and want to become an idiomatic Python programmer. Where do you begin?

There are twenty-three resources listed and the benefits of being an idiomatic Python programmer (or an idiomatic programmer in any other language) aren’t limited to employment with Andrew. 😉

One of the advantages to being an idiomatic programmer is that you will be more easily understood by other programmers. Being understood isn’t a bad thing. Really.

Another advantage to being an idiomatic programmer is that it will influence the programmers around you and result in code that is easier for you to understand. Again, understanding isn’t a bad thing.

As if that weren’t enough, perusing the resources that Andrew lists will make you a better programmer overall, which is never a bad thing.

Enjoy!

November 17, 2015

Debugging with the Scientific Method [Debugging Search Semantics]

Filed under: Clojure,Programming,Semantics — Patrick Durusau @ 2:27 pm

Debugging with the Scientific Method by Stuart Halloway.

This webpage points to a video of Stuart’s keynote address at Clojure/conj 2015 with the same title and has pointers to other resources on debugging.

Stuart summarizes the scientific method for debugging in his closing as:


know where you are going

make well-founded choices

write stuff down

Programmers, using Clojure or not, will profit from Stuart’s advice on debugging program code.

A group that Stuart does not mention, those of us interested in creating search interfaces for users, will benefit as well.

We have all had a similar early library experience: facing (in my youth) what seemed like an endless rack of card files, with the desire to find information on a subject.

Of course the first problem, from Stuart’s summary, is that we don’t know where we are going. At best we have an ill-defined topic on which we are supposed to produce a report. Let’s say “George Washington, father of our country” for example. (Yes, U.S. specific but I wasn’t in elementary school outside of the U.S. Feel free to post or adapt this with other examples.)

The first step, with help from a librarian, is to learn the basic author, subject, title organization of the card catalog, and things like why looking for “George Washington” starting with “George” isn’t likely to produce a useful result. Eliding the other details that a librarian would convey, you are somewhat equipped to move to step two.

Understanding the basic organization and mechanics of a library card catalog, you can develop a plan to search for information on George Washington. Such a plan would include excluding works over the reading level of the searcher, for example.

The third step of course is to capture all the information that is found from the resources located by using the library card catalog.

I mention that scenario not just out of nostalgia for card catalogs but to illustrate the difference between a card catalog and its electronic counter-parts, which have an externally defined schema and search interfaces with no disclosed search semantics.

That is to say, if a user doesn’t find an expected result for their search, how do you debug that failure?

You could say the user should have used “term X” instead of “term Y” but that isn’t solving the search problem, that is fixing the user.

Fixing users, as any 12-step program can attest, is a very difficult process, fraught with failure.

Fixing search semantics, debugging search semantics as it were, can fix the search results for a large number of users with little or no effort on their part.

There are any number of examples of debugging or fixing search semantics but the most prominent one that comes to mind is spelling correction by search engines that return results with the “correct” spelling and offer the user an opportunity to pursue their “incorrect” spelling.

At one time search engines returned “no results” in the event of mis-spelled words.
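A minimal sketch of that kind of fix, assuming a tiny in-memory vocabulary. The term list, the function names and the 0.8 cutoff are all mine, invented for illustration, and real search engines certainly do something more sophisticated. Only `difflib.get_close_matches` comes from the Python standard library.

```python
import difflib

# Hypothetical vocabulary of terms known to the search index.
INDEX_TERMS = ["washington", "jefferson", "lincoln", "roosevelt"]


def suggest(query, vocabulary=INDEX_TERMS, cutoff=0.8):
    """Return the query if it is indexed, else the closest indexed term, else None."""
    term = query.lower()
    if term in vocabulary:
        return term
    matches = difflib.get_close_matches(term, vocabulary, n=1, cutoff=cutoff)
    return matches[0] if matches else None


def search(query):
    """Search on the corrected term but keep the original so the user can insist on it."""
    return {"searched_for": suggest(query), "you_typed": query}
```

The point of returning both values is that the fix lives in the search semantics: the user who typed “Washingtin” gets results without being told to go away and spell better.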

I mention spelling correction because you are likely to be debugging search semantics on a less than global scale, but the same principle applies, as does Stuart’s scientific method.

Treat complaints about search results as an opportunity to debug the search semantics of your application. Follow up with users and test your improved search semantics.

Recalling that in all events, some user signs your check, not your application.

November 16, 2015

Recreational Constraint Programmer

Filed under: Automata,Clojure,Constraint Programming,Programming — Patrick Durusau @ 3:52 pm

https://youtu.be/AEhULv4ruL4 [Embedding of the video disabled at the source. Follow the link.]

From the description:

Many of us have hazy memories of finite state machines from computer science theory classes in college. But finite state machines (FSMs) have real, practical value, and it is useful to know how to build and apply them in Clojure. For example, FSMs have long been popular to model game AIs and workflow rules, and FSMs provide the behind-the-scenes magic that powers Java’s regexes and core.async’s go blocks. In this talk, we’ll look at two programming puzzles that, suprisingly, have very elegant solutions when looked at through the lens of FSMs, with code demonstrations using two different Clojure libraries for automata (automat and reduce-fsm), as well as loco, a Clojure constraint solver.

If you have never heard anyone describe themselves as a “recreational constraint programmer,” you really need to see this video!

If you think about having a single representative for a subject as a constraint on a set of topics, the question becomes what properties must each topic have to facilitate that constraint?

Some properties, such as family names, will lead to over-merging of topics and other properties, such as possession of one and only one social security number, will under-merge topics where a person has multiple social security numbers.
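To make over- and under-merging concrete, here is a small sketch. The three topic records and their field names are hypothetical, and grouping on a single property is a deliberate simplification of how topics would actually be merged.

```python
from collections import defaultdict

# Hypothetical topics; t1 and t3 are meant to be the same person with two SSNs.
TOPICS = [
    {"id": "t1", "family_name": "Smith", "ssn": "111-11-1111"},  # person A
    {"id": "t2", "family_name": "Smith", "ssn": "222-22-2222"},  # person B
    {"id": "t3", "family_name": "Smith", "ssn": "333-33-3333"},  # person A again
]


def merge_by(topics, key):
    """Group topic ids by the value of a single property."""
    groups = defaultdict(list)
    for topic in topics:
        groups[topic[key]].append(topic["id"])
    return sorted(groups.values())


by_name = merge_by(TOPICS, "family_name")  # one group of three: over-merged
by_ssn = merge_by(TOPICS, "ssn")           # three groups of one: under-merged
```

Neither property alone yields the two representatives we want (t1 with t3, and t2 by itself), which is the sense in which the choice of properties is the constraint problem.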

The best code demonstration in the video was the generation of a fairly complex cross-word puzzle, sans the clues for each word. I think the clues were left as an exercise for the reader. 😉

Code Repositories:

http://github.com/engelberg/automata

http://github.com/aengelberg/automata

Encouraging enough that you might want to revisit regular expressions.

Enjoy!

November 14, 2015

The 100 Most Used Clojure Expressions

Filed under: Clojure,Education,Programming — Patrick Durusau @ 5:03 pm

The 100 Most Used Clojure Expressions by Eric Normand.

From the post:

Summary: Would you like to optimize your learning of Clojure? Would you like to focus on learning only the most useful parts of the language first? Take this lesson from second language learning: learn the expressions in order of frequency of use.

When I was learning Spanish, I liked to use Anki to drill new vocabulary. It’s a flashcard program. I found that someone had made a set of cards from an analysis of thousands of newspapers. They read in all of the words from the newspapers, counted them up, and figured out what the most common words were. The top 1000 made it into the deck.

It turns out that this is a very good strategy for learning words. Word frequency follows a hockey stick distribution. The most common words are used so much more than the less common words. For instance, the 100 most common English words make up more than 50% of text. If you’ve got limited time, you should learn those most common words first.

People who are trying to learn Clojure have been asking me “how do I learn all of this stuff? There’s so much!” It’s a valid question and I haven’t had a good answer. I remembered the Spanish newspaper analysis and I thought I’d try to do a similar analysis of Clojure expressions.

Is Eric seriously suggesting using lessons learned in another field? 😉

Of course, for a CS conference using the top 100 most common Clojure expressions would have a title similar to:

Use of High Frequency Terminology Repetition: A Small Group Study (maybe 12 participants)

You could, of course, skip waiting for a conference presentation with a title like that one, followed by peer reviewed paper(s), more conference presentations and its final appearance in a collection of potential ways to improve CS instruction.

Let me know if Eric’s suggestion works for you.
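If you want to try the counting on your own code, here is a rough sketch. I am assuming that an “expression” is simply the symbol right after an opening parenthesis, which is my simplification and not necessarily Eric’s method. The regular expression is crude: it will also count matches inside strings and comments.

```python
import re
from collections import Counter

# Crude assumption: the "expression" is whatever symbol follows an opening paren.
HEAD_SYMBOL = re.compile(r'\(\s*([^\s()\[\]{}"]+)')


def most_used(source, n=3):
    """Return the n most frequent head symbols in a string of Clojure source."""
    return Counter(HEAD_SYMBOL.findall(source)).most_common(n)


# A made-up sample, not taken from any real code base.
SAMPLE = """
(defn square [x] (* x x))
(defn cube [x] (* x (square x)))
(map square (filter odd? (range 10)))
"""

top_two = most_used(SAMPLE, 2)
```

Run over a directory of real Clojure files instead of the sample, the same counter gives you a personal study list ordered by frequency.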

Enjoy!

PS: Thanks Eric!

November 13, 2015

Reverse Engineering Challenges

Filed under: Programming,Reverse Engineering,Software Engineering — Patrick Durusau @ 4:42 pm

Reverse Engineering Challenges by Dennis Yurichev.

After the challenge/exercise listing:

About the website

Well, “challenges” is a loud word, these are rather just exercises.

Some exercises were in my book for beginners, some were in my blog, and I eventually decided to keep them all in one single place like this website, so be it.

The source code of this website is also available at GitHub: https://github.com/dennis714/challenges.re. I would love to get any suggestions and notices about misspellings and typos.

Exercise numbers

There is no correlation between exercise number and hardness. Sorry: I add new exercises occasionally and I can’t use some fixed numbering system, so numbers are chaotic and has no meaning at all.

On the other hand, I can assure, exercise numbers will never change, so my readers can refer to them, and they are also referred from my book for beginners.

Duplicates

There are some pieces of code which are really does the same thing, but in different ways. Or maybe it is implemented for different architectures (x86 and Java VM/.NET). That’s OK.

A major resource for anyone interested in learning reverse engineering!

If you are in the job market, Dennis concludes with this advice:

How can I measure my performance?

  • As far as I can realize, If reverse engineer can solve most of these exercises, he is a hot target for head hunters (programming jobs in general).
  • Those who can solve from ¼ to ½ of all levels, perhaps, can freely apply for reverse engineering/malware analysts/vulnerability research job positions.
  • If you feel even first level is too hard for you, you may probably drop the idea to learn RE.

You have a target, the book and the exercises. The rest is up to you.

You do not want to be an edge case [The True Skynet: Your Homogenized Future]

Filed under: Design,Humanities,Identification,Programming — Patrick Durusau @ 1:15 pm

You do not want to be an edge case.

John D. Cook writes:

Hilary Mason made an important observation on Twitter a few days ago:

You do not want to be an edge case in this future we are building.

Systems run by algorithms can be more efficient on average, but make life harder on the edge cases, people who are exceptions to the system developers’ expectations.

Algorithms, whether encoded in software or in rigid bureaucratic processes, can unwittingly discriminate against minorities. The problem isn’t recognized minorities, such as racial minorities or the disabled, but unrecognized minorities, people who were overlooked.

For example, two twins were recently prevented from getting their drivers licenses because DMV software couldn’t tell their photos apart. Surely the people who wrote the software harbored no malice toward twins. They just didn’t anticipate that two drivers licence applicants could have indistinguishable photos.

I imagine most people reading this have had difficulty with software (or bureaucratic procedures) that didn’t anticipate something about them; everyone is an edge case in some context. Maybe you don’t have a middle name, but a form insists you cannot leave the middle name field blank. Maybe there are more letters in your name or more children in your family than a programmer anticipated. Maybe you choose not to use some technology that “everybody” uses. Maybe you happen to have a social security number that hashes to a value that causes a program to crash.

When software routinely fails, there obviously has to have a human override. But as software improves for most people, there’s less apparent need to make provision for the exceptional cases. So things could get harder for edge cases as they get better for more people.

Recent advances in machine learning have led reputable thinkers (Stephen Hawking for example) to envision a future where an artificial intelligence will arise to dispense with humanity.

If you think you have heard that theme before, you have, most recently as Skynet, an entirely fictional creation in the Terminator science fiction series.

Such alarmist claims make good press, but given that no one knows how the human brain works, much less how intelligence arises, the risk is less than that of a rogue black hole or a gamma-ray burst. I don’t lose sleep over either one of those, do you?

The greater “Skynet” threat to people and their cultures is the enforced homogenization of language and culture.

John mentions lacking a middle name but consider the complexities of Japanese names. Due to the creeping infection of Western culture and computer-based standardization, many Japanese list their names in Western order, given name, family name, instead of the Japanese order of family name, given name.
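Software does not have to force that choice. A minimal sketch, assuming a record that stores the parts of a name separately along with the person’s own ordering. The class and field names are hypothetical, and real names have far more variety than two parts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PersonName:
    family: str
    given: str
    family_first: bool = False  # the person's own convention, not the system's

    def display(self):
        """Render the name in the order its owner uses."""
        if self.family_first:
            parts = [self.family, self.given]
        else:
            parts = [self.given, self.family]
        return " ".join(parts)


japanese_order = PersonName("Kurosawa", "Akira", family_first=True).display()
western_order = PersonName("Cook", "John").display()
```

Sorting and searching can still use the family field, so keeping the owner’s order costs the system nothing.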

Even languages can start the slide to being “edge cases,” as you will see from the erosion of Hangul (Korean alphabet) from public signs in Seoul.

Computers could be preserving languages and cultural traditions; they have the capacity and infinite patience.

But they are not being used for that purpose.

Cellphones, for example, are linking humanity into a seething mass of impoverished social interaction. Impoverished social interaction that is creating more homogenized languages, not preserving diverse ones.

Not only should you be an edge case but you should push back against the homogenizing impact of computers. The diversity we lose could well be your own.

November 12, 2015

The Architecture of Open Source Applications

Filed under: Books,Computer Science,Programming,Software,Software Engineering — Patrick Durusau @ 9:08 pm

The Architecture of Open Source Applications

From the webpage:

Architects look at thousands of buildings during their training, and study critiques of those buildings written by masters. In contrast, most software developers only ever get to know a handful of large programs well—usually programs they wrote themselves—and never study the great programs of history. As a result, they repeat one another’s mistakes rather than building on one another’s successes.

Our goal is to change that. In these two books, the authors of four dozen open source applications explain how their software is structured, and why. What are each program’s major components? How do they interact? And what did their builders learn during their development? In answering these questions, the contributors to these books provide unique insights into how they think.

If you are a junior developer, and want to learn how your more experienced colleagues think, these books are the place to start. If you are an intermediate or senior developer, and want to see how your peers have solved hard design problems, these books can help you too.

Follow us on our blog at http://aosabook.org/blog/, or on Twitter at @aosabook and using the #aosa hashtag.

I happened upon these four books because of a tweet that mentioned: Early Access Release of Allison Kaptur’s “A Python Interpreter Written in Python” Chapter, which I found to be the tenth chapter of “500 Lines.”

OK, but what the hell is “500 Lines?” Poking around a bit I found The Architecture of Open Source Applications.

Which is the source for the material I quote above.

Do you learn from example?

Let me give you the flavor of three of the completed volumes and the “500 Lines” that is in progress:

The Architecture of Open Source Applications: Elegance, Evolution, and a Few Fearless Hacks (vol. 1), from the introduction:

Carpentry is an exacting craft, and people can spend their entire lives learning how to do it well. But carpentry is not architecture: if we step back from pitch boards and miter joints, buildings as a whole must be designed, and doing that is as much an art as it is a craft or science.

Programming is also an exacting craft, and people can spend their entire lives learning how to do it well. But programming is not software architecture. Many programmers spend years thinking about (or wrestling with) larger design issues: Should this application be extensible? If so, should that be done by providing a scripting interface, through some sort of plugin mechanism, or in some other way entirely? What should be done by the client, what should be left to the server, and is “client-server” even a useful way to think about this application? These are not programming questions, any more than where to put the stairs is a question of carpentry.

Building architecture and software architecture have a lot in common, but there is one crucial difference. While architects study thousands of buildings in their training and during their careers, most software developers only ever get to know a handful of large programs well. And more often than not, those are programs they wrote themselves. They never get to see the great programs of history, or read critiques of those programs’ designs written by experienced practitioners. As a result, they repeat one another’s mistakes rather than building on one another’s successes.

This book is our attempt to change that. Each chapter describes the architecture of an open source application: how it is structured, how its parts interact, why it’s built that way, and what lessons have been learned that can be applied to other big design problems. The descriptions are written by the people who know the software best, people with years or decades of experience designing and re-designing complex applications. The applications themselves range in scale from simple drawing programs and web-based spreadsheets to compiler toolkits and multi-million line visualization packages. Some are only a few years old, while others are approaching their thirtieth anniversary. What they have in common is that their creators have thought long and hard about their design, and are willing to share those thoughts with you. We hope you enjoy what they have written.

The Architecture of Open Source Applications: Structure, Scale, and a Few More Fearless Hacks (vol. 2), from the introduction:

In the introduction to Volume 1 of this series, we wrote:

Building architecture and software architecture have a lot in common, but there is one crucial difference. While architects study thousands of buildings in their training and during their careers, most software developers only ever get to know a handful of large programs well… As a result, they repeat one another’s mistakes rather than building on one another’s successes… This book is our attempt to change that.

In the year since that book appeared, over two dozen people have worked hard to create the sequel you have in your hands. They have done so because they believe, as we do, that software design can and should be taught by example—that the best way to learn how think like an expert is to study how experts think. From web servers and compilers through health record management systems to the infrastructure that Mozilla uses to get Firefox out the door, there are lessons all around us. We hope that by collecting some of them together in this book, we can help you become a better developer.

The Performance of Open Source Applications, from the introduction:

It’s commonplace to say that computer hardware is now so fast that most developers don’t have to worry about performance. In fact, Douglas Crockford declined to write a chapter for this book for that reason:

If I were to write a chapter, it would be about anti-performance: most effort spent in pursuit of performance is wasted. I don’t think that is what you are looking for.

Donald Knuth made the same point thirty years ago:

We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil.

but between mobile devices with limited power and memory, and data analysis projects that need to process terabytes, a growing number of developers do need to make their code faster, their data structures smaller, and their response times shorter. However, while hundreds of textbooks explain the basics of operating systems, networks, computer graphics, and databases, few (if any) explain how to find and fix things in real applications that are simply too damn slow.

This collection of case studies is our attempt to fill that gap. Each chapter is written by real developers who have had to make an existing system faster or who had to design something to be fast in the first place. They cover many different kinds of software and performance goals; what they have in common is a detailed understanding of what actually happens when, and how the different parts of large applications fit together. Our hope is that this book will—like its predecessor The Architecture of Open Source Applications—help you become a better developer by letting you look over these experts’ shoulders.

500 Lines or Less From the GitHub page:

Every architect studies family homes, apartments, schools, and other common types of buildings during her training. Equally, every programmer ought to know how a compiler turns text into instructions, how a spreadsheet updates cells, and how a database efficiently persists data.

Previous books in the AOSA series have done this by describing the high-level architecture of several mature open-source projects. While the lessons learned from those stories are valuable, they are sometimes difficult to absorb for programmers who have not yet had to build anything at that scale.

“500 Lines or Less” focuses on the design decisions and tradeoffs that experienced programmers make when they are writing code:

  • Why divide the application into these particular modules with these particular interfaces?
  • Why use inheritance here and composition there?
  • How do we predict where our program might need to be extended, and how can we make that easy for other programmers

Each chapter consists of a walkthrough of a program that solves a canonical problem in software engineering in at most 500 source lines of code. We hope that the material in this book will help readers understand the varied approaches that engineers take when solving problems in different domains, and will serve as a basis for projects that extend or modify the contributions here.

If you answered the question about learning from example with yes, add these works to your read and re-read list.

BTW, for markup folks, check out Parsing XML at the Speed of Light by Arseny Kapoulkine.

Many hours of reading and keyboard pleasure await anyone using these volumes.

R for cats

Filed under: Programming,R — Patrick Durusau @ 5:36 pm

An intro to R for new programmers by Scott Chamberlain.

From the webpage:

This is an introduction to R. I promise this will be fun. Since you have never used a programming language before, or any language for that matter, you won’t be tainted by other programming languages with different ways of doing things. This is good – we can teach you the R way of doing things.

Scott says this site is a rip off of JSforcats.com and I suggest we take his word for it.

If being “for cats” interests people who would not otherwise study either language, great.

Enjoy!

November 6, 2015

Learn R From Scratch

Filed under: Programming,R — Patrick Durusau @ 11:52 am

Learn R From Scratch

From the description:

A Channel dedicated to R Programming – The language of Data Science. We notice people learning the language in parts, so the initial lectures are dedicated to teach the language to aspiring Data Science Professionals, in a structured fashion so that you learn the language completely and be able to contribute back to the community. Upon taking the course, you will appreciate the inherent brilliance of R.

If I haven’t missed anything, thirty-seven (37) R videos await your viewing pleasure!

None of the videos are long, the vast majority shorter than four (4) minutes but a skilled instructor can put a lot in a four minute video.

The short length means you can catch a key concept and go on to practice it before it fades from memory. Plus you can find time for a short video when finding time for an hour lecture is almost impossible.

Enjoy!

October 31, 2015

PAPERS ARE AMAZING: Profiling threaded programs with Coz

Filed under: Profiling,Programming — Patrick Durusau @ 12:33 pm

PAPERS ARE AMAZING: Profiling threaded programs with Coz by Julia Evans.

I don’t often mention profiling at all but I mention Julia’s post because:

  1. It reports a non-intuitive insight in profiling threaded programs (at least until you have seen it).
  2. Julia writes a great post on new ideas with perf.

From the post:

The core idea in this paper is – if you have a line of code in a thread, and you want to know if it’s making your program slow, speed up that line of code to see if it makes the whole program faster!

Of course, you can’t actually speed up a thread. But you can slow down all other threads! So that’s what they do. The implemention here is super super super interesting – they use the perf Linux system to do this, and in particular they can do it without modifying the program’s code. So this is a) wizardry, and b) uses perf

Which are both things we love here (omg perf). I’m going to refer you to the paper for now to learn more about how they use perf to slow down threads, because I honestly don’t totally understand it myself yet. There are some difficult details like “if the thread is already waiting on another thread, should we slow it down even more?” that they get into.

The insight that slowing down all but one thread is equivalent to speeding up the thread of interest, for purposes of performance evaluation, sounds obvious when mentioned. But only after it is mentioned.
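A toy model shows why the equivalence holds. I am assuming threads that run fully in parallel, so the program takes as long as its slowest thread. That is my simplification for illustration and not how Coz itself is implemented; the function names are mine as well.

```python
def runtime(thread_times):
    """Wall-clock time of a program whose threads run fully in parallel."""
    return max(thread_times)


def actual_speedup(thread_times, target, saving):
    """Runtime if the target thread really were faster by `saving`."""
    times = list(thread_times)
    times[target] -= saving
    return runtime(times)


def virtual_speedup(thread_times, target, delay):
    """Delay every other thread by `delay`, then subtract the delay from the total."""
    times = [t if i == target else t + delay for i, t in enumerate(thread_times)]
    return runtime(times) - delay


THREADS = [10, 7, 4]  # made-up per-thread running times

# Thread 0 is the bottleneck: both views predict the program drops from 10 to 8.
bottleneck = (actual_speedup(THREADS, 0, 2), virtual_speedup(THREADS, 0, 2))

# Thread 2 is not: both views predict no improvement at all.
bystander = (actual_speedup(THREADS, 2, 2), virtual_speedup(THREADS, 2, 2))
```

In this model the two always agree, because adding a delay to every other thread and then subtracting it from the total is the same as subtracting it from the target thread alone.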

I suspect the ability to have that type of insight isn’t teachable other than by demonstration across a wide range of cases. If you know of other such insights, ping me.

For those interested in “real world” application of insights, Julia mentions the use of this profiler on SQLite and Memcached.

See Julia’s post for the paper and other references.

If you aren’t already checking Julia’s blog on a regular basis you might want to start.

October 29, 2015

Concurrency, Specification & Programming (CS&P 2015)

Filed under: Concurrent Programming,Functional Programming,Programming — Patrick Durusau @ 9:38 am

Concurrency, Specification & Programming, volume 1, Zbigniew Suraj, Ludwik Czaja (Eds.)

Concurrency, Specification & Programming, volume 2, Zbigniew Suraj, Ludwik Czaja (Eds.)

From the preface:

This two-volume book contains the papers selected for presentation at the Concurrency, Specification and Programming (CS&P) Workshop. It is taking place from 28th to 30th September 2015 in Rzeszow, the biggest city in southeastern Poland. CS&P provides an international forum for exchanging scientific, research, and technological achievements in concurrency, programming, artificial intelligence, and related fields. In particular, major areas selected for CS&P 2015 include mathematical models of concurrency, data mining and applications, fuzzy computing, logic and probability in theory of computing, rough and granular computing, unconventional computing models. In addition, three plenary keynote talks were delivered.

Not for the faint of heart but if you are interested in the future of computing, these two volumes should be on your reading list.

October 21, 2015

Clojure for the Brave and True Update!

Filed under: Clojure,Functional Programming,Programming — Patrick Durusau @ 3:08 pm

Clojure for the Brave and True by Daniel Higginbotham.

From the webpage:

Clojure for the Brave and True is now available in print! You can use the coupon code ZOMBIEHUGS to get 30% off at No Starch (plus you’ll get a free sticker), or buy it from Amazon.

The web site has been updated, too! (Don’t forget to force refresh.) One of the reasons I went with No Starch as a publisher was that they supported the idea of keeping the entire book available for free online. It makes me super happy to release the professionally-edited, even better book for free. I hope it makes you laugh, cry, and give up on object-oriented programming forever.

Writing this book was one of the most ambitious projects of my life, and I appreciate all the support I’ve gotten from friends, family, and readers like you. Thank you from the bottom of my crusty heart!

[Update] I got asked for a list of the major differences. Here they are:

  • Illustrations!
  • Almost every chapter now has exercises
  • The first macro chapter, Read and Eval, is massively improved. I’m hoping this will gives readers an excellent conceptual foundation for working with macros
  • There’s now a joke about melting faces
  • There used to be two Emacs chapters (basic emacs and using Emacs for Clojure dev), now there’s just one
  • The concurrency chapter got split into two chapters
  • Appendices on Leiningen and Boot were added
  • The “Do Things” chapter is much friendlier
  • I spend a lot more time explaining some of the more obscure topics, like lazy sequences.
  • Many of the chapters got massive overhauls. The functional programming chapter, for example, was turned completely inside out, and the result is that it’s much, much clearer
  • Overall, everything should be clearer

Daniel has taken the plunge and quit his job to have more time for writing. If you can, buy a print copy and recommend Clojure for the Brave and True to a friend!

We need to encourage people like Daniel and publishers like No Starch. Vote with your feet and your pocket books.

Follow Daniel on twitter @nonrecursive
