Archive for the ‘C/C++’ Category

Google open sources a MapReduce framework for C/C++

Monday, February 23rd, 2015

Google open sources a MapReduce framework for C/C++ by Derrick Harris,

From the post:

Google announced on Wednesday that the company is open sourcing a MapReduce framework that will let users run native C and C++ code in their Hadoop environments. Depending on how much traction MapReduce for C, or MR4C, gets and by whom, it could turn out to be a pretty big deal.

Hadoop is famously, or infamously, written in Java and as such can suffer from performance issues compared with native C++ code. That’s why Google’s original MapReduce system was written in C++, as is the Quantcast File System, that company’s homegrown alternative for the Hadoop Distributed File System. And, as the blog post announcing MR4C notes, “many software companies that deal with large datasets have built proprietary systems to execute native code in MapReduce frameworks.”

Great news but be aware that “performance” is a tricky issue. If “performance” had a single meaning, the TIOBE Index for February 2015 (a rough gauge of programming language popularity) to look quite different over the years.

I remember a conference story where a programmer had written an application using Python, reasoning that resource limitations would compel the client to return for a fuller, enterprise solution. To their chagrin, the customer never exhausted the potential of the first solution. 😉

Flow: Actor-based Concurrency with C++ [FoundationDB]

Saturday, February 14th, 2015

Flow: Actor-based Concurrency with C++

From the post:

FoundationDB began with ambitious goals for both high performance per node and scalability. We knew that to achieve these goals we would face serious engineering challenges while developing the FoundationDB core. We’d need to implement efficient asynchronous communicating processes of the sort supported by Erlang
or the Async library in .NET, but we’d also need the raw speed and I/O efficiency of C++. Finally, we’d need to perform extensive simulation to engineer for reliability and fault tolerance on large clusters.

To meet these challenges, we developed several new tools, the first of which is Flow, a new programming language that brings actor-based concurrency to C++11. To add this capability, Flow introduces a number of new keywords and control-flow primitives for managing concurrency. Flow is implemented as a compiler which analyzes an asynchronous function (actor) and rewrites it as an object with many different sub-functions that use callbacks to avoid blocking (see streamlinejs for a similar concept using JavaScript). The Flow compiler’s output is normal C++11 code, which is then compiled to a binary using traditional tools. Flow also provides input to our simulation tool, Lithium, which conducts deterministic simulations of the entire system, including its physical interfaces and failure modes. In short, Flow allows efficient concurrency within C++ in a maintainable and extensible manner, achieving all three major engineering goals:

  • high performance (by compiling to native code),
  • actor-based concurrency (for high productivity development),
  • simulation support (for testing).

Flow Availability

Flow is not currently available outside of FoundationDB, but we’d like to open-source it in the future. If you’d like to stay in the loop with our progress subscribe below.

Are you going to be ready when Flow is released separate from FoundationDB?

Holiday Gift: Open-Source C++ SDK & GraphLab Create 1.2

Wednesday, December 24th, 2014

Holiday Gift: Open-Source C++ SDK & GraphLab Create 1.2 by Rajat Arya.

From the post:

Just when you were wondering how to keep from getting bored this holiday season, we’re delivering something to fuel your creativity and sharpen your C++ coding skills. With the release of GraphLab Create 1.x SDK (beta) you can now harness and extend the C++ engine that powers GraphLab Create.

Extensions built with the SDK can directly access the SFrame and SGraph data structures from within the C++ engine. Direct access enables you to build custom algorithms, toolkits, and lambdas in efficient native code. The SDK provides a lightweight path to create and compile custom functions and expose them through Python.

One of the great things about the Internet is that as soon as you wonder something like “…how am I going to keep from being bored…” a post like this one appears in your Twitter stream. Well, at least if you are a follower of @graphlabteam. (A good reason to be following @graphlabteam.)

Watching the explosive growth of progress on graphs and graph processing over the past couple of years makes me suspect that the security side of the house is doing something wrong. Not sure what but it isn’t making this sort of progress.

Enjoy the SDK!

Wintel and Open Source

Thursday, November 13th, 2014

The software world is reverberating with the news that Microsoft is in the process of making .NET completely open source.

On the same day, Intel announced that it had released “Julia2C, a source-to-source translator from Julia to C.”

Hmmm, is this evidence that open source is a viable path for commercial vendors? 😉

Next Question: How long before non-open source code become a liability? As in a nesting place for government surveillance/malware.

Speculation: Not as long as it took Wintel to move towards open source.

Consumers should demand open source code as a condition for purchase. All software, all the time.

Announcing Clasp

Tuesday, October 28th, 2014

Announcing Clasp by Christian Schafmeister.

From the post:

Click here for up to date build instructions

Today I am happy to make the first release of the Common Lisp implementation “Clasp”. Clasp uses LLVM as its back-end and generates native code. Clasp is a super-set of Common Lisp that interoperates smoothly with C++. The goal is to integrate these two very different languages together as seamlessly as possible to provide the best of both worlds. The C++ interoperation allows Common Lisp programmers to easily expose powerful C++ libraries to Common Lisp and solve complex programming challenges using the expressive power of Common Lisp. Clasp is licensed under the LGPL.

Common Lisp is considered by many to be one of the most expressive programming languages in existence. Individuals and small teams of programmers have created fantastic applications and operating systems within Common Lisp that require much larger effort when written in other languages. Common Lisp has many language features that have not yet made it into the C++ standard. Common Lisp has first-class functions, dynamic variables, true macros for meta-programming, generic functions, multiple return values, first-class symbols, exact arithmetic, conditions and restarts, optional type declarations, a programmable reader, a programmable printer and a configurable compiler. Common Lisp is the ultimate programmable programming language.

Clojure is a dialect of Lisp, which means you may spot situations where Lisp would be the better solution. Especially if you can draw upon C++ libraries.

The project is “actively looking” for new developers. Could be your opportunity to get in on the ground floor!

Native Actors – A Scalable Software Platform for Distributed, Heterogeneous Environments

Saturday, September 27th, 2014

Native Actors – A Scalable Software Platform for Distributed, Heterogeneous Environments by Dominik Charousset, Thomas C. Schmidt, Raphael Hiesgen, and Matthias Wählisch.

Abstract:

Writing concurrent software is challenging, especially with low-level synchronization primitives such as threads or locks in shared memory environments. The actor model replaces implicit communication by an explicit message passing in a ‘shared-nothing’ paradigm. It applies to concurrency as well as distribution, but has not yet entered the native programming domain. This paper contributes the design of a native actor extension for C++, and the report on a software platform that implements our design for (a)concurrent, (b) distributed, and (c) heterogeneous hardware environments. GPGPU and embedded hardware components are integrated in a transparent way. Our software platform supports the development of scalable and efficient parallel software. It includes a lock-free mailbox algorithm with pattern matching facility for message processing. Thorough performance evaluations reveal an extraordinary small memory footprint in realistic application scenarios, while runtime performance not only outperforms existing mature actor implementations, but exceeds the scaling behavior of low-level message passing libraries such as OpenMPI.

When I read Stroustrup: Why the 35-year-old C++ still dominates ‘real’ dev I started to post a comment asking why there were no questions about functional programming languages? But, the interview is a “puff” piece and not a serious commentary on programming.

Then I ran across this work on implementing actors in C++. Maybe Stroustrup was correct without being aware of it.

Bundled with the C++ library libcppa, available at: http://www.libcppa.org

OpenGM

Friday, August 1st, 2014

OpenGM

From the webpage:

OpenGM is a C++ template library for discrete factor graph models and distributive operations on these models. It includes state-of-the-art optimization and inference algorithms beyond message passing. OpenGM handles large models efficiently, since (i) functions that occur repeatedly need to be stored only once and (ii) when functions require different parametric or non-parametric encodings, multiple encodings can be used alongside each other, in the same model, using included and custom C++ code. No restrictions are imposed on the factor graph or the operations of the model. OpenGM is modular and extendible. Elementary data types can be chosen to maximize efficiency. The graphical model data structure, inference algorithms and different encodings of functions inter-operate through well-defined interfaces. The binary OpenGM file format is based on the HDF5 standard and incorporates user extensions automatically.

Documentation lists algorithms with references.

I first saw this in a post by Danny Bickson, OpenGM graphical models toolkit.

Learning Lisp With C

Wednesday, April 9th, 2014

Build Your Own Lisp by Daniel Holden.

From the webpage:

If you’re looking to learn C, or you’ve ever wondered how to build your own programming language, this is the book for you.

In just a few lines of code, I’ll teach you how to effectively use C, and what it takes to start building your own language.

Along the way we’ll learn about the weird and wonderful nature of Lisps, and what really makes a programming language. By building a real world C program we’ll learn implicit things that conventional books cannot teach. How to develop a project, how to make life easy for your users, and how to write beautiful code.

This book is free to read online. Get started now!

Read Online!

This looks interesting and useful.

Enjoy!

Screaming fast Lucene searches using C++ via JNI

Wednesday, June 19th, 2013

Screaming fast Lucene searches using C++ via JNI by Michael McCandless.

From the post:

At the end of the day, when Lucene executes a query, after the initial setup the true hot-spot is usually rather basic code that decodes sequential blocks of integer docIDs, term frequencies and positions, matches them (e.g. taking union or intersection for BooleanQuery), computes a score for each hit and finally saves the hit if it’s competitive, during collection.

Even apparently complex queries like FuzzyQuery or WildcardQuery go through a rewrite process that reduces them to much simpler forms like BooleanQuery.

Lucene’s hot-spots are so simple that optimizing them by porting them to native C++ (via JNI) was too tempting!

So I did just that, creating the lucene-c-boost github project, and the resulting speedups are exciting:

(…)

Speedups range from 0.7 X to 7.8 X.

Read Michael’s post for explanations, warnings, caveats, etc.

But it is exciting news!

Introduction to C and C++

Monday, March 25th, 2013

Introduction to C and C++

Description:

This course provides a fast-paced introduction to the C and C++ programming languages. You will learn the required background knowledge, including memory management, pointers, preprocessor macros, object-oriented programming, and how to find bugs when you inevitably use any of those incorrectly. There will be daily assignments and a small-scale individual project.

This course is offered during the Independent Activities Period (IAP), which is a special 4-week term at MIT that runs from the first week of January until the end of the month.

Just in case you want a deeper understanding of bugs that enable hacking or how to avoid creating such bugs in the first place.