Archive for the ‘Messaging’ Category

Futures of text

Wednesday, March 4th, 2015

Futures of text by Jonathan Libov.

From the post:

I believe comfort, not convenience, is the most important thing in software, and text is an incredibly comfortable medium. Text-based interaction is fast, fun, funny, flexible, intimate, descriptive and even consistent in ways that voice and user interface often are not. Always bet on text:

Text is the most socially useful communication technology. It works well in 1:1, 1:N, and M:N modes. It can be indexed and searched efficiently, even by hand. It can be translated. It can be produced and consumed at variable speeds. It is asynchronous. It can be compared, diffed, clustered, corrected, summarized and filtered algorithmically. It permits multiparty editing. It permits branching conversations, lurking, annotation, quoting, reviewing, summarizing, structured responses, exegesis, even fan fic. The breadth, scale and depth of ways people use text is unmatched by anything.

[Apologies, I lost some of Jonathan’s layout of the quote.]

Jonathan focuses on the use of text/messaging for interactions in a mobile environment, with many examples and suggestions for improvements along the way.

One observation that will have the fearful of an AI future (Elon Musk among others) running for the hills:

Messaging is the only interface in which the machine communicates with you much the same as the way you communicate with it. If some of the trends outlined in this post pervade, it would mark a qualitative shift in how we interact with computers. Whereas computer interaction to date has largely been about discrete, deliberate events — typing in the command line, clicking on files, clicking on hyperlinks, tapping on icons — a shift to messaging- or conversational-based UI’s and implicit hyperlinks would make computer interaction far more fluid and natural.

What’s more, messaging AI benefits from an obvious feedback loop: The more we interact with bots and messaging UI’s, the better it’ll get. That’s perhaps true for GUI as well, but to a far lesser degree. Messaging AI may get better at a rate we’ve never seen in the GUI world. Hold on tight.[Emphasis added.]

Think of it this way, a GUI locks you into the developer’s imagination. A text interface empowers the user and the AI’s imagination. I’m betting on the latter.

BTW, Jonathan ends with a great list of further reading on messaging and mobile applications.

Enjoy!

I first saw this in a tweet by Aloyna Medelyan.

Idempotence Is Not a Medical Condition

Saturday, January 4th, 2014

Idempotence Is Not a Medical Condition by Pat Helland.

From the post:

The definition of distributed computing can be confusing. Sometimes, it refers to a tightly coupled cluster of computers working together to look like one larger computer. More often, however, it refers to a bunch of loosely related applications chattering together without a lot of system-level support.

This lack of support in distributed computing environments makes it difficult to write applications that work together. Messages sent between systems do not have crisp guarantees for delivery. They can get lost, and so, after a timeout, they are retried. The application on the other side of the communication may see multiple messages arrive where one was intended. These messages may be reordered and interleaved with different messages. Ensuring that the application behaves as intended can be very hard to design and implement. It is even harder to test.

In a world full of retried messages, idempotence is an essential property for reliable systems. Idempotence is a mathematical term meaning that performing an operation multiple times will have the same effect as performing it exactly one time. The challenges occur when messages are related to each other and may have ordering constraints. How are messages associated? What can go wrong? How can an application developer build a correctly functioning app without losing his or her mojo?

A very good discussion of idempotence in the context of distributed (message passing) systems. You may recall the TMRM defining merging operators to be idempotent. (Section 8 )

Pat’s examples on idempotence include:

  1. Sweeping the floor is idempotent. If you sweep it multiple times, you still get a clean floor.
  2. Baking a cake is not idempotent.
  3. Baking a cake starting from a shopping list (if you don’t care about money) is idempotent.

As an aside, #2 is not idempotent because “a cake” means a particular cake. It can only be baked once, at least if you want to have an edible result. In #3, the act of baking from a shopping list (I prefer a recipe) and not the cake, is idempotent.

The post is quite good, particularly if you are interested in a reliable messaging based system.

I first saw this in Stuff The Internet Says On Scalability For January 3rd, 2014, which had the following note:

Pat Helland with a classically great article on Idempotence. Fortunately the article is not idempotent. Everytime you read it your brain updates with something new.

The Distributed Complexity of Large-scale Graph Processing

Tuesday, November 26th, 2013

The Distributed Complexity of Large-scale Graph Processing by Hartmut Klauck, Danupon Nanongkai, Gopal Pandurangan, Peter Robinson.

Abstract:

Motivated by the increasing need for fast distributed processing of large-scale graphs such as the Web graph and various social networks, we study a message-passing distributed computing model for graph processing and present lower bounds and algorithms for several graph problems. This work is inspired by recent large-scale graph processing systems (e.g., Pregel and Giraph) which are designed based on the message-passing model of distributed computing.

Our model consists of a point-to-point communication network of k machines interconnected by bandwidth-restricted links. Communicating data between the machines is the costly operation (as opposed to local computation). The network is used to process an arbitrary n-node input graph (typically k machines (a common implementation in many real world systems). Our goal is to study fundamental complexity bounds for solving graph problems in this model.

We present techniques for obtaining lower bounds on the distributed time complexity. Our lower bounds develop and use new bounds in random-partition communication complexity. We first show a lower bound of \Omega(n/k) rounds for computing a spanning tree (ST) of the input graph. This result also implies the same bound for other fundamental problems such as computing a minimum spanning tree (MST). We also show an \Omega(n/k^2) lower bound for connectivity, ST verification and other related problems.

We give algorithms for various fundamental graph problems in our model. We show that problems such as PageRank, MST, connectivity, and graph covering can be solved in \tilde{O}(n/k) time, whereas for shortest paths, we present algorithms that run in \tilde{O}(n/\sqrt{k}) time (for (1+\epsilon)-factor approx.) and in \tilde{O}(n/k) time (for O(\log n)-factor approx.) respectively.

The author’s state their main goal is:

…is to investigate the distributed time complexity, i.e., the number of distributed “rounds”, for solving various fundamental graph problems. The time complexity not only captures the (potential) speed up possible for a problem, but it also implicitly captures the communication cost of the algorithm as well, since links can transmit only a limited amount of bits per round; equivalently, we can view our model where instead of links, machines can send/receive only a limited amount of bits per round (cf. Section 1.1).

How would you investigate the number of “rounds,” to perform merging in a message passing topic map system?

With no one order of merging to reach a particular state, would you measure it statistically for some merging criteria N?

I first saw this in a tweet by Stefano Bertolo.

Abusing Cloud-Based Browsers for Fun and Profit [Passing Messages, Not Data]

Thursday, November 29th, 2012

Abusing Cloud-Based Browsers for Fun and Profit by Vasant Tendulkar, Joe Pletcher, Ashwin Shashidharan, Ryan Snyder, Kevin Butler and William Enck.

Abstract:

Cloud services have become a cheap and popular means of computing. They allow users to synchronize data between devices and relieve low-powered devices from heavy computations. In response to the surge of smartphones and mobile devices, several cloud-based Web browsers have become commercially available. These “cloud browsers” assemble and render Web pages within the cloud, executing JavaScript code for the mobile client. This paper explores how the computational abilities of cloud browsers may be exploited through a Browser MapReduce (BMR) architecture for executing large, parallel tasks. We explore the computation and memory limits of four cloud browsers, and demonstrate the viability of BMR by implementing a client based on a reverse engineering of the Puffin cloud browser. We implement and test three canonical MapReduce applications (word count, distributed grep, and distributed sort). While we perform experiments on relatively small amounts of data (100 MB) for ethical considerations, our results strongly suggest that current cloud browsers are a viable source of arbitrary free computing at large scale.

Excellent work on extending the use of cloud-based browsers. Whether you intend to use them for good or ill.

The use of messaging as opposed to passage of data is particularly interesting.

Shouldn’t that work for the process of merging as well?

Comments/suggestions?

Discovering message flows in actor systems with the Spider Pattern

Sunday, September 2nd, 2012

Discovering message flows in actor systems with the Spider Pattern by Raymond Rostenberg.

From the post:

In this post I’m going to show a pattern that can be used to discover facts about an actor system while it is running. It can be used to understand how messages flow through the actors in the system. The main reason why I built this pattern is to understand what is going on in a running actor system that is distributed across many machines. If I can’t picture it, I can’t understand it (and I’m in good company with that quote 🙂

Building actor systems is fun but debugging them can be difficult, you mostly end up browsing through many log files on several machines to find out what’s going on. I’m sure you have browsed through logs and thought, “Hey, where did that message go?”, “Why did this message cause that effect” or “Why did this actor never get a message?”

This is where the Spider pattern comes in.

I would think the better quote would be: “If I can’t see it, I can’t understand it.” But each to their own.

Message passing systems remind me of Newcomb’s requirement for having audit trails for merging behavior.

Not necessary for every use case but when it is necessary, it is nice to know robust auditing is possible.

Or perhaps the better way to put it is that auditing is adjustable.

We can go from tracking every operation at one extreme to a middle ground of some tracking but protecting political appointees or career servants or even to a wide open system (sort of like Twitter or Facebook).

Forwarding Without Repeating: Efficient Rumor Spreading in Bounded-Degree Graphs

Friday, August 17th, 2012

Forwarding Without Repeating: Efficient Rumor Spreading in Bounded-Degree Graphs by Vincent Gripon, Vitaly Skachek, and Michael Rabbat.

Abstract:

We study a gossip protocol called forwarding without repeating (FWR). The objective is to spread multiple rumors over a graph as efficiently as possible. FWR accomplishes this by having nodes record which messages they have forwarded to each neighbor, so that each message is forwarded at most once to each neighbor. We prove that FWR spreads a rumor over a strongly connected digraph, with high probability, in time which is within a constant factor of optimal for digraphs with bounded out-degree. Moreover, on digraphs with bounded out-degree and bounded number of rumors, the number of transmissions required by FWR is arbitrarily better than that of existing approaches. Specifically, FWR requires O(n) messages on bounded-degree graphs with n nodes, whereas classical forwarding and an approach based on network coding both require {\omega}(n) messages. Our results are obtained using combinatorial and probabilistic arguments. Notably, they do not depend on expansion properties of the underlying graph, and consequently the message complexity of FWR is arbitrarily better than classical forwarding even on constant-degree expander graphs, as n \rightarrow \infty. In resource-constrained applications, where each transmission consumes battery power and bandwidth, our results suggest that using a small amount of memory at each node leads to a significant savings.

Interesting work that may lead to topic maps in resource-constrained environments.

Vertical Scaling made easy through high-performance actors

Tuesday, July 31st, 2012

Vertical Scaling made easy through high-performance actors

From the webpage:

Vertical scaling is today a major issue when writing server code. Threads and locks are the traditional approach to making full utilization of fat (multi-core) computers, but result is code that is difficult to maintain and which to often does not run much faster than single-threaded code.

Actors make good use of fat computers but tend to be slow as messages are passed between threads. Attempts to optimize actor-based programs results in actors with multiple concerns (loss of modularity) and lots of spaghetti code.

The approach used by JActor is to minimize the messages passed between threads by executing the messages sent to idle actors on the same thread used by the actor which sent the message. Message buffering is used when messages must be sent between threads, and two-way messaging is used for implicit flow control. The result is an approach that is easy to maintain and which, with a bit of care to the architecture, provides extremely high rates of throughput.

On an intel i7, 250 million messages can be passed between actors in the same JVM per second–several orders of magnitude faster than comparable actor frameworks.

Hmmm, 250 million messages a second? On the topic map (TM) scale, that’s what?, about 1/4 TM? 😉

Seriously, if you are writing topic map server software, you need to take a look at JActor.

12 Ways to Increase Throughput by 32X and Reduce Latency by 20X

Wednesday, May 2nd, 2012

12 Ways to Increase Throughput by 32X and Reduce Latency by 20X

From the post:

Martin Thompson, a high-performance technology geek, has written an awesome post, Fun with my-Channels Nirvana and Azul Zing. In it Martin shows the process and techniques he used to take an existing messaging product, written in Java, and increase throughput by 32X and reduce latency by 20X. The article is very well written with lots of interesting details that make it well worth reading.

You might want to start with the High Scalability summary before tackling the “real thing.”

Of interest to subject-centric applications that rely on messaging. And anyone interested in performance for the sheer pleasure of it.