Archive for the ‘Programming’ Category

How to get started with Data Science using R

Sunday, November 20th, 2016

How to get started with Data Science using R by Karthik Bharadwaj.

From the post:

R being the lingua franca of data science and is one of the popular language choices to learn data science. Once the choice is made, often beginners find themselves lost in finding out the learning path and end up with a signboard as below.

In this blog post I would like to lay out a clear structural approach to learning R for data science. This will help you to quickly get started in your data science journey with R.

You won’t find anything you don’t already know but this is a great short post to pass onto others.

Point out R skills will help them expose and/or conceal government corruption.

Python Data Science Handbook

Saturday, November 19th, 2016

Python Data Science Handbook (Github)

From the webpage:

Jupyter notebook content for my OReilly book, the Python Data Science Handbook.


See also the free companion project, A Whirlwind Tour of Python: a fast-paced introduction to the Python language aimed at researchers and scientists.

This repository will contain the full listing of IPython notebooks used to create the book, including all text and code. I am currently editing these, and will post them as I make my way through. See the content here:


Useful Listicle: The 5 most downloaded R packages

Tuesday, November 15th, 2016

The 5 most downloaded R packages

From the post:

Curious which R packages your colleagues and the rest of the R community are using? Thanks to you can now see for yourself! aggregates R documentation and download information from popular repositories like CRAN, BioConductor and GitHub. In this post, we’ll take a look at the top 5 R packages with the most direct downloads!

Sorry! No spoiler!

Do check out: aggregates help documentation for R packages from CRAN, BioConductor, and GitHub – the three most common sources of current R documentation. goes beyond simply aggregating this information, however, by bringing all of this documentation to your fingertips via the RDocumentaion package. The RDocumentation package overwrites the basic help functions from the utils package and gives you access to from the comfort of your RStudio IDE. Look up the newest and most popular R packages, search through documentation and post community examples.

As they say:

Create an RDocumentation account today!

I’m always sympathetic to documentation but more so today because I have wasted hours over the past two or three days on issues that could have been trivially documented.

I will be posting “corrected” documentation later this week.

PS: If you have or suspect you have poorly written documentation, I have some time available for paid improvement of the same.

None/Some/All … Are Suicide Bombers & Probabilistic Programming Languages

Tuesday, November 8th, 2016

The Design and Implementation of Probabilistic Programming Languages by Noah D. Goodman and Andreas Stuhlmüller.


Probabilistic programming languages (PPLs) unify techniques for the formal description of computation and for the representation and use of uncertain knowledge. PPLs have seen recent interest from the artificial intelligence, programming languages, cognitive science, and natural languages communities. This book explains how to implement PPLs by lightweight embedding into a host language. We illustrate this by designing and implementing WebPPL, a small PPL embedded in Javascript. We show how to implement several algorithms for universal probabilistic inference, including priority-based enumeration with caching, particle filtering, and Markov chain Monte Carlo. We use program transformations to expose the information required by these algorithms, including continuations and stack addresses. We illustrate these ideas with examples drawn from semantic parsing, natural language pragmatics, and procedural graphics.

If you want to sharpen the discussion of probabilistic programming languages, substitute in the pragmatics example:

‘none/some/all of the children are suicide bombers’,

The substitution raises the issue of how “certainty” can/should vary depending upon the gravity of results.

Who is a nice person?, has low stakes.

Who is a suicide bomber?, has high stakes.

Resource: Malware analysis – …

Tuesday, October 4th, 2016

Resource: Malware analysis – learning How To Reverse Malware: A collection of guides and tools by Claus Cramon Houmann.

This resource will provide you theory around learning malware analysis and reverse engineering malware. We keep the links up to date as the infosec community creates new and interesting tools and tips.

Some technical reading to enjoy instead of political debates!


The Simpsons by the Data [South Park as well]

Thursday, September 29th, 2016

The Simpsons by the Data by Todd Schneider.

From the post:

The Simpsons needs no introduction. At 27 seasons and counting, it’s the longest-running scripted series in the history of American primetime television.

The show’s longevity, and the fact that it’s animated, provides a vast and relatively unchanging universe of characters to study. It’s easier for an animated show to scale to hundreds of recurring characters; without live-action actors to grow old or move on to other projects, the denizens of Springfield remain mostly unchanged from year to year.

As a fan of the show, I present a few short analyses about Springfield, from the show’s dialogue to its TV ratings. All code used for this post is available on GitHub.

Alert! You must run Flash in order to access Simpsons World, the source of Todd’s data.

Advice: Treat Flash as malware and run in a VM.

Todd covers the number of words spoken per character, gender imbalance, focus on characters, viewership, and episode summaries (tf-idf).

Other analysis awaits your imagination and interest.

BTW, if you want comedy data a bit closer to the edge, try Text Mining South Park by Kaylin Walker. Kaylin uses R for her analysis as well.

Other TV programs with R-powered analysis?

Hacker-Proof Code Confirmed [Can Liability Be Far Behind?]

Thursday, September 22nd, 2016

Hacker-Proof Code Confirmed by Kevin Hartnett.

From the post:

In the summer of 2015 a team of hackers attempted to take control of an unmanned military helicopter known as Little Bird. The helicopter, which is similar to the piloted version long-favored for U.S. special operations missions, was stationed at a Boeing facility in Arizona. The hackers had a head start: At the time they began the operation, they already had access to one part of the drone’s computer system. From there, all they needed to do was hack into Little Bird’s onboard flight-control computer, and the drone was theirs.

When the project started, a “Red Team” of hackers could have taken over the helicopter almost as easily as it could break into your home Wi-Fi. But in the intervening months, engineers from the Defense Advanced Research Projects Agency (DARPA) had implemented a new kind of security mechanism — a software system that couldn’t be commandeered. Key parts of Little Bird’s computer system were unhackable with existing technology, its code as trustworthy as a mathematical proof. Even though the Red Team was given six weeks with the drone and more access to its computing network than genuine bad actors could ever expect to attain, they failed to crack Little Bird’s defenses.

“They were not able to break out and disrupt the operation in any way,” said Kathleen Fisher, a professor of computer science at Tufts University and the founding program manager of the High-Assurance Cyber Military Systems (HACMS) project. “That result made all of DARPA stand up and say, oh my goodness, we can actually use this technology in systems we care about.”

Reducing the verification requirement to a manageable size appears to be the key to DARPA’s success.

That is rather than verification of the entire program, only critical parts, such as excluding hackers, need to be verified.

If this spreads, failure to formally verify critical parts of software would be a natural place to begin imposing liability for poorly written code.

PS: Would formal proof of data integration be a value-add?

R Weekly

Monday, September 12th, 2016

R Weekly

A new weekly publication of R resources that began on 21 May 2016 with Issue 0.

Mostly titles of post and news articles, which is useful, but not as useful as short summaries, including the author’s name.

Watch your Python script with strace

Sunday, September 11th, 2016


Modern operating systems sandbox each process inside of a virtual memory map from which direct I/O operations are generally impossible. Instead, a process has to ask the operating system every time it wants to modify a file or communicate bytes over the network. By using operating system specific tools to watch the system calls a Python script is making — using “strace” under Linux or “truss” under Mac OS X — you can study how a program is behaving and address several different kinds of bugs.

Brandon Rhodes does a delightful presentation on using strace with Python.

Slides for Tracing Python with strace or truss.

I deeply enjoyed this presentation, which I discovered while looking at a Python regex issue.

Anticipate running strace on the Python script this week and will report back on any results or failure to obtain results! (Unlike in academic publishing, experiments and investigations do fail.)

The Wrong Way to Teach Grammar [Programming?]

Tuesday, August 30th, 2016

The Wrong Way to Teach Grammar by Michelle Navarre Cleary.

From the post:

A century of research shows that traditional grammar lessons—those hours spent diagramming sentences and memorizing parts of speech—don’t help and may even hinder students’ efforts to become better writers. Yes, they need to learn grammar, but the old-fashioned way does not work.

This finding—confirmed in 1984, 2007, and 2012 through reviews of over 250 studies—is consistent among students of all ages, from elementary school through college. For example, one well-regarded study followed three groups of students from 9th to 11th grade where one group had traditional rule-bound lessons, a second received an alternative approach to grammar instruction, and a third received no grammar lessons at all, just more literature and creative writing. The result: No significant differences among the three groups—except that both grammar groups emerged with a strong antipathy to English.

There is a real cost to ignoring such findings. In my work with adults who dropped out of school before earning a college degree, I have found over and over again that they over-edit themselves from the moment they sit down to write. They report thoughts like “Is this right? Is that right?” and “Oh my god, if I write a contraction, I’m going to flunk.” Focused on being correct, they never give themselves a chance to explore their ideas or ways of expressing those ideas. Significantly, this sometimes-debilitating focus on “the rules” can be found in students who attended elite private institutions as well as those from resource-strapped public schools.

(Three out of five links here are pay-per-view. Sorry.)

It’s only a century of research. Don’t want to rush into anything. 😉

How would you adapt this finding to teaching programming and/or hacking?


Linux debugging tools you’ll love: the zine

Saturday, August 27th, 2016

Linux debugging tools you’ll love: the zine by Julia Evans.

From the website:

There are a ton of amazing debugging tools for Linux that I love. strace! perf! tcpdump! wireshark! opensnoop! I think a lot of them aren’t as well-known as they should be, so I’m making a friendly zine explaining them.

Donate, subscribe (PDF or paper)!

If you follow Julia’s blog ( or twitter (@b0rk), you know what a treat the zine will be!

If you don’t (correct that error now) and consider the following sample:


It’s possible there are better explanations than Julia’s, so if and when you see one, sing out!

Until then, get the zine!

The Hanselminutes Podcast

Friday, August 26th, 2016

The Hanselminutes Podcast: Fresh Air for Developers by Scott Hanselman.

I went looking for Felienne’s podcast on code smells and discovered along with it, The Hanselminutes Podcast: Fresh Air for Developers!

Felienne’s podcast is #542 so there is a lot of content to enjoy! (I checked the archive. Yes, there really are 542 episodes as of today.)

Exploring Code Smells in code written by Children

Friday, August 26th, 2016

Exploring Code Smells in code written by Children (podcast) by Dr. Felienne

From the description:

Felienne is always learning. In exploring her PhD dissertation and her public speaking experience it’s clear that she has no intent on stopping! Most recently she’s been exploring a large corpus of Scratch programs looking for Code Smells. How do children learn how to code, and when they do, does their code “smell?” Is there something we can do when teaching to promote cleaner, more maintainable code?

Felienne discusses a paper due to appear in September on analysis of 250K Scratch programs for code smells.

Thoughts on teaching programmers to detect bug smells?


Tuesday, August 23rd, 2016

Julia Evans tweeted:


It’s been two days without another suggestion.

Considering Brendan D. Gregg’s homepage, do you have another suggestion?

Too rich of a resource to not write down.

Besides, for some subjects and their relationships, you need specialized tooling to see them.

Not to mention that if you can spot patterns in subjects, detecting an unknown 0-day may be easier.

Of course, you can leave USB sticks at popular eateries near Fort Meade, MD 20755-6248, but some people prefer to work for their 0-day exploits.


Eloquent JavaScript

Tuesday, August 23rd, 2016

Eloquent JavaScript by by Marijn Haverbeke.

From the webpage:

This is a book about JavaScript, programming, and the wonders of the digital. You can read it online here, or get your own paperback copy of the book.


Embarrassing that authors post free content for the betterment of others, but wealthy governments play access games.

This book is also available in Български (Bulgarian), Português (Portuguese), and Русский (Russian).


A Whirlwind Tour of Python (Excellent!)

Tuesday, August 23rd, 2016

A Whirlwind Tour of Python by Jake VanderPlas.

From the webpage:

To tap into the power of Python’s open data science stack—including NumPy, Pandas, Matplotlib, Scikit-learn, and other tools—you first need to understand the syntax, semantics, and patterns of the Python language. This report provides a brief yet comprehensive introduction to Python for engineers, researchers, and data scientists who are already familiar with another programming language.

Author Jake VanderPlas, an interdisciplinary research director at the University of Washington, explains Python’s essential syntax and semantics, built-in data types and structures, function definitions, control flow statements, and more, using Python 3 syntax.

You’ll explore:

  • Python syntax basics and running Python code
  • Basic semantics of Python variables, objects, and operators
  • Built-in simple types and data structures
  • Control flow statements for executing code blocks conditionally
  • Methods for creating and using reusable functions
  • Iterators, list comprehensions, and generators
  • String manipulation and regular expressions
  • Python’s standard library and third-party modules
  • Python’s core data science tools
  • Recommended resources to help you learn more

Jake VanderPlas is a long-time user and developer of the Python scientific stack. He currently works as an interdisciplinary research director at the University of Washington, conducts his own astronomy research, and spends time advising and consulting with local scientists from a wide range of fields.

A Whirlwind Tour of Python, can be recommended without reservation.

In addition to the book, the Jupyter notebooks behind the book have been posted.


29 common beginner Python errors on one page [Something Similar For XQuery?]

Friday, August 19th, 2016

29 common beginner Python errors on one page

From the webpage:

A few times a year, I have the job of teaching a bunch of people who have never written code before how to program from scratch. The nature of programming being what it is, the same error crop up every time in a very predictable pattern. I usually encourage my students to go through a step-by-step troubleshooting process when trying to fix misbehaving code, in which we go through these common errors one by one and see if they could be causing the problem. Today, I decided to finally write this troubleshooting process down and turn it into a flowchart in non-threatening colours.

Behold, the “my code isn’t working” step-by-step troubleshooting guide! Follow the arrows to find the likely cause of your problem – if the first thing you reach doesn’t work, then back up and try again.

Click the image for full-size, and click here for a printable PDF. Colour scheme from Luna Rosa.

Useful for Python beginner’s and should be inspirational for other languages.

Thoughts on something similar for XQuery Errors? Suggestions for collecting the “most common” XQuery errors?

Contributing to StackOverflow: How Not to be Intimidated

Friday, August 19th, 2016

Contributing to StackOverflow: How Not to be Intimidated by Ksenia Coulter.

From the post:

StackOverflow is an essential resource for programmers. Whether you run into a bizarre and scary error message or you’re blanking on something you should know, StackOverflow comes to the rescue. Its popularity with coders spurred many jokes and memes. (Programming to be Officially Renamed “Googling Stackoverflow,” a satirical headline reads).

(image omitted)

While all of us are users of StackOverflow, contributing to this knowledge base can be very intimidating, especially to beginners or to non-traditional coders who many already feel like they don’t belong. The fact that an invisible barrier exists is a bummer because being an active contributor not only can help with your job search and raise your profile, but also make you a better programmer. Explaining technical concepts in an accessible way is difficult. It is also well-established that teaching something solidifies your knowledge of the subject. Answering StackOverflow questions is great practice.

All of the benefits of being an active member of StackOverflow were apparent to me for a while, but I registered an account only this week. Let me walk you t[h]rough thoughts that hindered me. (Chances are, you’ve had them too!)

I plead guilty to using StackOverFlow but not contributing back to it.

Another “intimidation” to avoid is thinking you must have the complete and killer answer to any question.

That can and does happen, but don’t wait for a question where you can supply such an answer.

Jump in! (Advice to myself as well as any readers.)

strace’ing a Clojure process under lein

Tuesday, August 16th, 2016

strace’ing a Clojure process under lein by Tim McCormack.

From the post:

Today I wanted to strace a JVM process to see if it was making network calls, and I discovered a minor roadblock: It was a Clojure program being run using the Leiningen build tool. lein run spawns a JVM subprocess and then exits, and I only wanted to trace that subprocess.

The solution is simple, but worth a post: Tell lein to run a different “java” command that actually wraps a call to java with strace. Here’s how I did it:

For the “…you never do know file…” and because it’s better to know than to assume.


Tuesday, August 9th, 2016

ARGUS by Christopher Meiklejohn.

From the post:

This is one post in a series about programming models and languages for distributed computing that I’m writing as part of my history of distributed programming techniques.

Relevant Reading

  • Abstraction Mechanisms in CLU, Liskov, Barbara and Snyder, Alan and Atkinson, Russell and Schaffert, Craig, CACM 1977 (Liskov et al. 1977).
  • Guardians and Actions: Linguistic Support for Robust, Distributed Programs, Liskov, Barbara and Scheifler, Robert, TOPLAS 1982 (Liskov and Scheifler 1983).
  • Orphan Detection in the Argus System, Walker, Edward Franklin, DTIC 1984 (Walker 1984).
  • Implementation of Argus, Liskov, Barbara and Curtis, Dorothy and Johnson, Paul and Scheifer, Robert, SIGOPS 1987 (Liskov et al. 1987).
  • Distributed Programming in Argus, Liskov, Barbara CACM 1988 (Liskov 1988).

I’m thinking about how to fix an XFCE trackpad problem and while I think about that, wanted to touch up the references from Christopher’s post.

Apologies but I was unable to find a public version of: Implementation of Argus, Liskov, Barbara and Curtis, Dorothy and Johnson, Paul and Scheifer, Robert, SIGOPS 1987 (Liskov et al. 1987).

Hoping that easier access to most of the relevant reading will increase your enjoyment of Christopher’s post.


Functional TypeScript

Wednesday, August 3rd, 2016

Functional TypeScript by Victor Savkin.

From the post:

When discussing functional programming we often talk about the machinery, and not the core principles. Functional programming is not about monads, monoids, or zippers, even though those are useful to know. It is primarily about writing programs by composing generic reusable functions. This article is about applying functional thinking when refactoring TypeScript code.

And to do that we will use the following three techniques:

  • Use Functions Instead of Simple Values
  • Model Data Transformations as a Pipeline
  • Extract Generic Functions

Let’s get started!

Parallel processing has been cited as a driver for functional programming for many years. It’s Time to Get Good at Functional Programming

The movement of the United States government towards being a “franchise” is another important driver for functional programming.

Code that has no-side effects can be more easily repurposed, depending on the needs of a particular buyer.

The NSA wants terabytes of telephone metadata to maintain its “data mining as useful activity” fiction, China wants telephone metadata on its financial investments, other groups are spying on themselves and/or others.

Wasteful, not to mention expensive, to maintain side-effect ridden code bases for each customer.

Prepare for universal parallel processing and governments as franchises, start thinking functionally today!

Pandas Exercises

Saturday, July 30th, 2016

Pandas Exercises

From the post:

Fed up with a ton of tutorials but no easy way to find exercises I decided to create a repo just with exercises to practice pandas. Don’t get me wrong, tutorials are great resources, but to learn is to do. So unless you practice you won’t learn.

There will be three different types of files:

  1. Exercise instructions
  2. Solutions without code
  3. Solutions with code and comments

My suggestion is that you learn a topic in a tutorial or video and then do exercises. Learn one more topic and do exercises. If you got the answer wrong, don’t go to the solution with code, follow this advice instead.

Suggestions and collaborations are more than welcome. 🙂

I’m sure you will find this useful but when I search for pandas exercise python, I get 298,000 “hits.”

Adding exercises here isn’t going to improve the findability of pandas for particular subject areas or domains.

Perhaps as exercises are added here, links to exercises by subject area can be added as well.

With nearly 300K potential sources, there is no shortage of exercises to go around!

An analysis of Pokémon Go types, created with R

Thursday, July 21st, 2016

An analysis of Pokémon Go types, created with R by David Smith.

From the post:

As anyone who has tried Pokémon Go recently is probably aware, Pokémon come in different types. A Pokémon’s type affects where and when it appears, and the types of attacks it is vulnerable to. Some types, like Normal, Water and Grass are common; others, like Fairy and Dragon are rare. Many Pokémon have two or more types.

To get a sense of the distribution of Pokémon types, Joshua Kunst used R to download data from the Pokémon API and created a treemap of all the Pokémon types (and for those with more than 1 type, the secondary type). Johnathon’s original used the 800+ Pokémon from the modern universe, but I used his R code to recreate the map for the 151 original Pokémon used in Pokémon Go.

If you or your dog:


need a break from Pokémon Go, check out this post!

You will get some much needed rest, polish up your R skills and perhaps learn something about the Pokémon API.

The Pokémon Go craze brings to mind the potential for the creation of alternative location-based games. Accessing locations which require steady nerves and social engineering skills. That definitely has potential.

Say a spy-vs-spy character at a location near a “secret” military base? 😉

HyperTerm (Not Windows HyperTerm)

Monday, July 18th, 2016


Tersely by Nat Torkington as:

— an open source in-browser terminal emulator.

That’s fair, but the project goals read:

The goal of the project is to create a beautiful and extensible experience for command-line interface users, built on open web standards.

In the beginning, our focus will be primarily around speed, stability and the development of the correct API for extension authors.

In the future, we anticipate the community will come up with innovative additions to enhance what could be the simplest, most powerful and well-tested interface for productivity.

JS/HTML/CSS Terminal. Visit HyperTerm for a rocking demo!

Scroll down after the demo to see more.

Looking forward to a Linux package being released!

Donald Knuth: Literate Programming on Channel 9

Thursday, July 7th, 2016

Donald Knuth: Literate Programming on Channel 9.


The speaker will discuss what he considers to be the most important outcome of his work developing TeX in the 1980s, namely the accidental discovery of a new approach to programming — which caused a radical change in his own coding style. Ever since then, he has aimed to write programs for human beings (not computers) to read. The result is that the programs have fewer mistakes, they are easier to modify and maintain, and they can indeed be understood by human beings. This facilitates reproducible research, among other things.

Presentation at the R User Conference 2016.

Increase your book budget before watching this video!

Free Programming Books – Update

Tuesday, July 5th, 2016

Free Programming Books by Victor Felder.

From the webpage:

This list initially was a clone of stackoverflow – List of Freely Available Programming Books by George Stocker. Now updated, with dead links gone and new content.

Moved to GitHub for collaborative updating.

Great listing of resources!

But each resource stands alone as its own silo. It can (and many do) refer to other materials, even with hyperlinks, but if you want to explore any of them, you must explore them separately. That’s what being in a silo means. You have to start over at the beginning. Every time.

That is complicated by the existence of thousands of slideshows and videos on programming topics not listed here. Search for your favorite programming language at Slideshare and Youtube. There are other repositories of slideshows and videos, those are just examples.

Each one of those slideshows and/or videos is also a silo. Not to mention that with video you need a time marker if you aren’t going to watch every second of it to find relevant material.

What if you could traverse each of those silos, books, posts, slideshows, videos, documentation, source code, seamlessly?

Making that possible for C/C++ now, given the backlog of material, would have a large upfront cost before it could be useful.

Making that possible for languages with shorter histories, well, how useful would it need to be to justify its cost?

And how would you make it possible for others to easily contribute gems that they find?

Something to think about as you wander about in each of these separate silos.


Cybersecurity By Design?

Monday, July 4th, 2016

Shaun Nichols reports in Mozilla emits nightly builds of heir-to-Firefox browser engine Servo:

Mozilla has started publishing nightly in-development builds of its experimental Servo browser engine so anyone can track the project’s progress.

Executables for macOS and GNU/Linux are available right here to download and test drive even if you’re not a developer. If you are, the open-source engine’s code is here if you want to build it from scratch, fix bugs, or contribute to the effort.

Right now, the software is very much in a work-in-progress state, with a very simple user interface built out of HTML. It’s more of a technology demonstration than a viable web browser, although Mozilla has pitched Servo as a potential successor to Firefox’s Gecko engine.

Crucially, Servo is written using Rust – Mozilla’s more-secure C-like systems programming language. If Google has the language of Go, Moz has the language of No: Rust. It works hard to stop coders making common mistakes that lead to exploitable security bugs, and we literally mean stop: the compiler won’t build the application if it thinks dangerous code is present.

Rust focuses on safety and speed: its security measures do not impact it at run-time as the safety mechanisms are in the language by design. For example, variables in Rust have an owner and a lifetime; they can be borrowed by another owner. When a variable is being used by one owner, it cannot be used by another. This is supposed to help enforce memory safety and stop data races between threads.

It also forces the programmer to stop and think about their software’s design – Rust is not something for novices to pick up and quickly bash out code on.

Even though pre-release and rough, I was fairly excited until I read:

One little problem is that Servo relies on Mozilla’s SpiderMonkey JavaScript engine, which is written in C/C++. So while the HTML-rendering engine will run secured Rust code, fingers crossed nothing terrible happens within the JS engine.


But then I checked Mozilla JavaScript-C Engine – SpiderMonkey at BlackDuck | Security, which shows zero (0) vulnerabilities over the last 10 versions.

Other than SpiderMonkey vulnerabilities known to the NSA, any others you care to mention?

Support, participate, submit bug reports on the new rendering engine but don’t forget about the JavaScript engine.

Computerworld’s advanced beginner’s guide to R

Wednesday, June 29th, 2016

Computerworld’s advanced beginner’s guide to R by David Smith.

From the post:

Many newcomers to R got their start learning the language with Computerworld’s Beginner’s Guide to R, a 6-part introduction to the basics of the language. Now, budding R users who want to take their skills to the next level have a new guide to help them: Computerword’s Advanced Beginner’s Guide to R. Written by Sharon Machlis, author of the prior Beginner’s guide and regular reporter of R news at Computerworld, this new 72-page guide dives into some trickier topics related to R: extracting data via API, data wrangling, and data visualization.

Well, what are you waiting for?

Either read it or pass it along!


Integrated R labs for high school students

Tuesday, June 28th, 2016

Integrated R labs for high school students by Amelia McNamara.

From the webpage:

Amelia McNamara, James Molyneux, Terri Johnson

This looks like a very promising approach for capturing the interests of high school students in statistics and R.

From the larger project, Mobilize, curriculum page:

Mobilize centers its curricula around participatory sensing campaigns in which students use their mobile devices to collect and share data about their communities and their lives, and to analyze these data to gain a greater understanding about their world.Mobilize breaks barriers by teaching students to apply concepts and practices from computer science and statistics in order to learn science and mathematics. Mobilize is dynamic: each class collects its own data, and each class has the opportunity to make unique discoveries. We use mobile devices not as gimmicks to capture students’ attention, but as legitimate tools that bring scientific enquiry into our everyday lives.

Mobilize comprises four key curricula: Introduction to Data Science (IDS), Algebra I, Biology, and Mobilize Prime, all focused on preparing students to live in a data-driven world. The Mobilize curricula are a unique blend of computational and statistical thinking subject matter content that teaches students to think critically about and with data. The Mobilize curricula utilize innovative mobile technology to enhance math and science classroom learning. Mobilize brings “Big Data” into the classroom in the form of participatory sensing, a hands-on method in which students use mobile devices to collect data about their lives and community, then use Mobilize Visualization tools to analyze and interpret the data.

I like the approach of having the student collect their own and process their own data. If they learn to question their own data and processes, hopefully they will ask questions about data processing results presented as “facts.” (Since 2016 is a presidential election year in the United States, questioning claimed data results is especially important.)


Clojure Gazette – New Format – Looking for New Readers

Monday, June 20th, 2016

Clojure Gazette by Eric Normand.

From the end of this essay:

Hi! The Clojure Gazette has recently changed from a list of curated links to an essay-style newsletter. I’ve gotten nothing but good comments about the change, but I’ve also noticed the first negative growth of readership since I started. I know these essays aren’t for everyone, but I’m sure there are people out there who would like the new format who don’t know about it. Would you do me a favor? Please share the Gazette with your friends!

The Biggest Waste in Our Industry is the title of the essay I link to above.

From the post:

I would like to talk about two nasty habits I have been party to working in software. Those two habits are 1) protecting programmer time and 2) measuring programmer productivity. I’m talking from my experience as a programmer to all the managers out there, or any programmer interested in process.

You can think of Eric’s essay as an update to Peopleware: Productive Projects and Teams by Tom DeMarco and Timothy Lister.

Peopleware was first published in 1987, second edition in 1999 (8 new chapters), third edition in 2013 (5 more pages than 1999 edition?).

Twenty-nine (29) years after the publication of Peopleware, managers still don’t “get” how to manage programmers (or other creative workers).

Disappointing, but not surprising.

It’s not uncommon to read position ads that describe going to lunch en masse, group activities, etc.

You would think they were hiring lemmings rather than technical staff.

If your startup founder is that lonely, check the local mission. Hire people for social activities, lunch, etc. Cheaper than hiring salaried staff. Greater variety as well. Ditto for managers with the need to “manage” someone.