Archive for the ‘Computation’ Category

In-Memory Computing

Wednesday, April 17th, 2013

Why In-Memory Computing Is Cheaper And Changes Everything by Timo Elliott.

From the post:

What is the difference? Database engines today do I/O. So if they want to get a record, they read. If they want to write a record, they write, update, delete, etc. The application, which in this case is a DBMS, thinks that it’s always writing to disk. If that record that they’re reading and writing happens to be in flash, it will certainly be faster, but it’s still reading and writing. Even if I’ve cached it in DRAM, it’s the same thing: I’m still reading and writing.

What we’re talking about here is the actual database is physically in in-memory. I’m doing a fetch to get data and not a read. So the logic of the database changes. That’s what in-memory is about as opposed to the traditional types of computing.

Why is it time for in-memory computing?

Why now? The most important thing is this: DRAM costs are dropping about 32% every 12 months. Things are getting bigger, and costs are getting lower. If you looked at the price of a Dell server with a terabyte of memory three years ago, it was almost $100,000 on their internet site. Today, a server with more cores — sixteen instead of twelve — and a terabyte of DRAM, costs less than $40,000.

In-memory results in lower total cost of ownership

So the costs of this stuff is not outrageous. For those of you who don’t understand storage, I always get into this argument: the total cost of acquisition of an in-memory system is likely higher than a storage system. There’s no question. But the total cost of TCO is lower – because you don’t need storage people to manage memory. There are no LUNs [logical unit numbers]: all the things your storage technicians do goes away.

People cost more than hardware and software – a lot more. So the TCO is lower. And also, by the way, power: one study IBM did showed that memory is 99% less power than spinning disks. So unless you happen to be an electric company, that’s going to mean a lot to you. Cooling is lower, everything is lower.

Timo makes a good case for in-memory computing but I have a slightly different question.

If both data and program are stored in memory, where is the distinction between program and data?

Or in topic map terms, can’t we then speak about subject identities in the program and even in data at particular points in the program?

That could be a very powerful tool for controlling program behavior and re-purposing data at different stages of processing.

…no Hitchhiker’s Guide…

Saturday, February 9th, 2013

Why there is no Hitchhiker’s Guide to Mathematics for Programmers by Jeremy Kun.

From the post:

Do you really want to get better at mathematics?

Remember when you first learned how to program? I do. I spent two years experimenting with Java programs on my own in high school. Those two years collectively contain the worst and most embarrassing code I have ever written. My programs absolutely reeked of programming no-nos. Hundred-line functions and even thousand-line classes, magic numbers, unreachable blocks of code, ridiculous code comments, a complete disregard for sensible object orientation, negligence of nearly all logic, and type-coercion that would make your skin crawl. I committed every naive mistake in the book, and for all my obvious shortcomings I considered myself a hot-shot programmer! At leaa st I was learning a lot, and I was a hot-shot programmer in a crowd of high-school students interested in game programming.

Even after my first exposure and my commitment to get a programming degree in college, it was another year before I knew what a stack frame or a register was, two more before I was anywhere near competent with a terminal, three more before I fully appreciated functional programming, and to this day I still have an irrational fear of networking and systems programming (the first time I manually edited the call stack I couldn’t stop shivering with apprehension and disgust at what I was doing).

A must read post if you want to be on the cutting edge of programming.

The tyranny of algorithms

Monday, September 24th, 2012

The tyranny of algorithms by Kaiser Fung.

This WSJ book review on “The Tyranny of Algorithms” (link) is well worth reading for those interested in how computer algorithms are used in business and government. I agree with most of what this author has to say.

You need to read Kaiser’s comment on the review before proceeding….

Back?

I am not altogether sure that algorithms are the problem the book and/or review make they out to be.

No doubt the concerns and problems described are real, but they don’t exist because of algorithms.

Rather they exist because we are basically lazy and accept the result of algorithms, just like we accept the judgements of others, advertising, etc.

Were we to question algorithms, judgements of others, advertising, we might be living in a very different world.

But we don’t, so we’re not.

So the question is, how to live with algorithms knowing we are too lazy to question them. Yes?

Are these shadows/echoes of Thinking, Fast and Slow?

On the value of being inexact

Wednesday, June 13th, 2012

Algorithmic methodologies for ultra-efficient inexact architectures for sustaining technology scaling by Avinash Lingamneni, Kirthi Krishna Muntimadugu, Richard M. Karp, Krishna V. Palem, and Christian Piguet.

The following non-technical blurb caught my eye:

Researchers have unveiled an “inexact” computer chip that challenges the industry’s dogmatic 50-year pursuit of accuracy. The design improves power and resource efficiency by allowing for occasional errors. Prototypes unveiled this week at the ACM International Conference on Computing Frontiers in Cagliari, Italy, are at least 15 times more efficient than today’s technology.

[ads deleted]

The research, which earned best-paper honors at the conference, was conducted by experts from Rice University in Houston, Singapore’s Nanyang Technological University (NTU), Switzerland’s Center for Electronics and Microtechnology (CSEM) and the University of California, Berkeley.

“It is exciting to see this technology in a working chip that we can measure and validate for the first time,” said project leader Krishna Palem, who also serves as director of the Rice-NTU Institute for Sustainable and Applied Infodynamics (ISAID). “Our work since 2003 showed that significant gains were possible, and I am delighted that these working chips have met and even exceeded our expectations.” [From: Computing experts unveil superefficient ‘inexact’ chip which I saw in a list of links by Greg Linden.

Think about it. We are inexact and so are our semantics.

But we attempt to model our inexact semantics with increasingly exact computing platforms.

Does that sound like a modeling mis-match to you?

BTW, if you are interested in the details, see: Algorithmic methodologies for ultra-efficient inexact architectures for sustaining technology scaling

Abstract:

Owing to a growing desire to reduce energy consumption and widely anticipated hurdles to the continued technology scaling promised by Moore’s law, techniques and technologies such as inexact circuits and probabilistic CMOS (PCMOS) have gained prominence. These radical approaches trade accuracy at the hardware level for significant gains in energy consumption, area, and speed. While holding great promise, their ability to influence the broader milieu of computing is limited due to two shortcomings. First, they were mostly based on ad-hoc hand designs and did not consider algorithmically well-characterized automated design methodologies. Also, existing design approaches were limited to particular layers of abstraction such as physical, architectural and algorithmic or more broadly software. However, it is well-known that significant gains can be achieved by optimizing across the layers. To respond to this need, in this paper, we present an algorithmically well-founded cross-layer co-design framework (CCF) for automatically designing inexact hardware in the form of datapath elements. Specifically adders and multipliers, and show that significant associated gains can be achieved in terms of energy, area, and delay or speed. Our algorithms can achieve these gains with adding any additional hardware overhead. The proposed CCF framework embodies a symbiotic relationship between architecture and logic-layer design through the technique of probabilistic pruning combined with the novel confined voltage scaling technique introduced in this paper, applied at the physical layer. A second drawback of the state of the art with inexact design is the lack of physical evidence established through measuring fabricated ICs that the gains and other benefits that can be achieved are valid. Again, in this paper, we have addressed this shortcoming by using CCF to fabricate a prototype chip implementing inexact data-path elements; a range of 64-bit integer adders whose outputs can be erroneous. Through physical measurements of our prototype chip wherein the inexact adders admit expected relative error magnitudes of 10% or less, we have found that cumulative gains over comparable and fully accurate chips, quantified through the area-delay-energy product, can be a multiplicative factor of 15 or more. As evidence of the utility of these results, we demonstrate that despite admitting error while achieving gains, images processed using the FFT algorithm implemented using our inexact adders are visually discernible.

Why the link to the ACM Digital library or to the “unoffiical version” were not reported in any of the press stories I cannot say.

A Computable Universe,
Understanding Computation and
Exploring Nature As Computation

Friday, June 1st, 2012

Foreword: A Computable Universe, Understanding Computation and Exploring Nature As Computation by Roger Penrose.

Abstract:

I am most honoured to have the privilege to present the Foreword to this fascinating and wonderfully varied collection of contributions, concerning the nature of computation and of its deep connection with the operation of those basic laws, known or yet unknown, governing the universe in which we live. Fundamentally deep questions are indeed being grappled with here, and the fact that we find so many different viewpoints is something to be expected, since, in truth, we know little about the foundational nature and origins of these basic laws, despite the immense precision that we so often find revealed in them. Accordingly, it is not surprising that within the viewpoints expressed here is some unabashed speculation, occasionally bordering on just partially justified guesswork, while elsewhere we find a good deal of precise reasoning, some in the form of rigorous mathematical theorems. Both of these are as should be, for without some inspired guesswork we cannot have new ideas as to where look in order to make genuinely new progress, and without precise mathematical reasoning, no less than in precise observation, we cannot know when we are right — or, more usually, when we are wrong.

An unlikely volume to search for data mining or semantic modeling algorithms or patterns.

But one that should be read for the mental exercise/discipline of its reading.

The asking price of $138 (US) promises a limited readership.

Plus a greatly diminished impact.

When asked to participate in collections, scholars/authors should ask themselves:

How many books have I read from publisher X?*

*Read, not cited, is the appropriate test. Make your decision appropriately.


If republished as an accessible paperback, may I suggest: “Exploring the Nature of Computation”?

The committee title makes the collage nature of the volume a bit too obvious.

Complexity and Computation

Thursday, January 12th, 2012

Complexity and Computation by Allen B. Downey.

Another free (you can order hard copy) book from Allen B. Downey. See my post: Think Stats: Probability and Statistics for Programmers or jump to Green Tea Press to see these and other titles for free download.

Description:

This book is about complexity science, data structures and algorithms, intermediate programming in Python, and the philosophy of science:

  • Data structures and algorithms: A data structure is a collection that contains data elements organized in a way that supports particular operations. For example, a dictionary organizes key-value pairs in a way that provides fast mapping from keys to values, but mapping from values to keys is generally slower.

    An algorithm is a mechanical process for performing a computation. Designing efficient programs often involves the co-evolution of data structures and the algorithms that use them. For example, the first few chapters are about graphs, a data structure that is a good implementation of a graph—nested dictionaries—and several graph algorithms that use this data structure.

  • Python programming: This book picks up where Think Python leaves off. I assume that you have read that book or have equivalent knowledge of Python. As always, I will try to emphasize fundmental ideas that apply to programming in many languages, but along the way you will learn some useful features that are specific to Python.
  • Computational modeling: A model is a simplified description of a system that is useful for simulation or analysis. Computational models are designed to take advantage of cheap, fast computation.
  • Philosophy of science: The models and results in this book raise a number of questions relevant to the philosophy of science, including the nature of scientific laws, theory choice, realism and instrumentalism, holism and reductionism, and Bayesian epistemology.

This book focuses on discrete models, which include graphs, cellular automata, and agent-based models. They are often characterized by structure, rules and transitions rather than by equations. They tend to be more abstract than continuous models; in some cases there is no direct correspondence between the model and a physical system.

Complexity science is an interdiscipinary field—at the intersection of mathematics, computer science and physics—that focuses on these kinds of models. That’s what this book is about.

Turtles all the way down

Wednesday, August 31st, 2011

Turtles all the way down

From the website:

Decisive breakthrough from IBM researchers in Haifa introduces efficient nested virtualization for x86 hypervisors

What is nested virtualization and who needs it? Classical virtualization takes a physical computer and turns it into multiple logical, or virtual, computers. Each virtual machine can then interact independently, run its own operating environment, and basically behave like a separate physical resource. Hypervisor software is the secret sauce that makes virtualization possible by sitting in between the hardware and the operating system. It manages how the operating system and applications access the hardware.

IBM researchers found an efficient way to take one x86 hypervisor and run other hypervisors on top of it. For virtualization, this means that a virtual machine can be ‘turned into’ many machines, each with the potential to have its own unique environment, configuration, operating system, or security measures—which can in turn each be divided into more logical computers, and so on. With this breakthrough, x86 processors can now run multiple ‘hypervisors’ stacked, in parallel, and of different types.

This nested virtualization using one hypervisor on top of another is reminiscent of a tale popularized by Stephen Hawking. A little old lady argued with a lecturing scientist and insisted that the world is really a flat plate supported on the back of a giant tortoise. When the scientist asked what the tortoise is standing on, the woman answered sharply “But it’s turtles all the way down!” Inspired by this vision, the researchers named their solution the Turtles Project: Design and Implementation of Nested Virtualization

This awesome advance has been incorporated into the latest Linux release.

This is what I like about IBM, fundamental advances in computer science that can be turned into services for users.

One obvious use of this advance would be to segregate merging models in separate virtual machines. I am sure there are others.

Parallel prototyping leads to better design results, more divergence, and increased self-efficacy

Wednesday, January 5th, 2011

Parallel prototyping leads to better design results, more divergence, and increased self-efficacy Authors: Steven P. Dow, Alana Glassco, Jonathan Kass, Melissa Schwarz, Daniel L. Schwartz, and Scott R. Klemmer

Abstract:

Iteration can help people improve ideas. It can also give rise to fixation, continuously refining one option without considering others. Does creating and receiving feedback on multiple prototypes in parallel, as opposed to serially, affect learning, self-efficacy, and design exploration? An experiment manipulated whether independent novice designers created graphic Web advertisements in parallel or in series. Serial participants received descriptive critique directly after each prototype. Parallel participants created multiple prototypes before receiving feedback. As measured by click-through data and expert ratings, ads created in the Parallel condition significantly outperformed those from the Serial condition. Moreover, independent raters found Parallel prototypes to be more diverse. Parallel participants also reported a larger increase in task-specific self-confidence. This article outlines a theoretical foundation for why parallel prototyping produces better design results and discusses the implications for design education.

Interesting that I should stumble on this article after posting about parallel processing.

A useful heuristic for the development of subject identifications in an organization.

Is Parallel Programming Hard, And, If So, What Can You Do About It?

Wednesday, January 5th, 2011

Is Parallel Programming Hard, And, If So, What Can You Do About It? Editor Paul E. McKenney

Kirk Lowery forwarded this link to my attention.

Just skimming the first couple of chapters, I have to say it has some of the most amusing graphics I have seen in any CS book.

Performance, productivity and generality are all goals of parallel programming, as cited by this book.

I have to wonder though if subject recognition tasks, analogous to computer vision, that are inherently parallel.

Doing them in parallel does not make them easier but not doing them in parallel certainly makes them harder.

For example, consider the last time you failed to recognize someone who wasn’t in the location or context where you normally see them.

Do you recognize the context in addition or in parallel to your recognition of the person’s face?

Questions:

  1. What benefits/drawbacks do you see in parallel processing of TMDM instances? (3-5 pages, citations)
  2. How would you design subject identifications for processing in a parallel environment? (3-5 pages, citations)
  3. How would you evaluate the need for parallel processing of subject identifications? (3-5 pages, citations)

EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs

Friday, October 15th, 2010

EigenSpokes: Surprising Patterns and Scalable Community Chipping in Large Graphs. Authors: B. Aditya Prakash, Ashwin Sridharan, Mukund Seshadri, Sridhar Machiraju, and Christos Faloutsos Keywords: EigenSpokes – Communities – Graphs

Abstract:

We report a surprising, persistent pattern in large sparse social graphs, which we term EigenSpokes. We focus on large Mobile Call graphs, spanning about 186K nodes and millions of calls, and find that the singular vectors of these graphs exhibit a striking EigenSpokes pattern wherein, when plotted against each other, they have clear, separate lines that often neatly align along specific axes (hence the term “spokes”). Furthermore, analysis of several other real-world datasets e.g., Patent Citations, Internet, etc. reveals similar phenomena indicating this to be a more fundamental attribute of large sparse graphs that is related to their community structure.

This is the first contribution of this paper. Additional ones include (a) study of the conditions that lead to such EigenSpokes, and (b) a fast algorithm for spotting and extracting tightly-knit communities, called SpokEn, that exploits our findings about the EigenSpokes pattern.

The notion of “chipping” off communities for further study from a large graph is quite intriguing.

In part because those communities (need I say subjects?) are found as the result of a process of exploration rather than declaration.

To be sure, those subjects can be “declared” in a topic map but the finding, identifying, deciding on subject identity properties for subjects is a lot more fun.

An Approach for Fast Hierarchical Agglomerative Clustering Using Graphics Processors with CUDA

Sunday, October 10th, 2010

An Approach for Fast Hierarchical Agglomerative Clustering Using Graphics Processors with CUDA Authors: S.A. Arul Shalom, Manoranjan Dash, Minh Tue Keywords: CUDA. Hierarchical clustering, High performance Computing, Computations using Graphics hardware, complete linkage

Abstract:

Graphics Processing Units in today’s desktops can well be thought of as a high performance parallel processor. Each single processor within the GPU is able to execute different tasks independently but concurrently. Such computational capabilities of the GPU are being exploited in the domain of Data mining. Two types of Hierarchical clustering algorithms are realized on GPU using CUDA. Speed gains from 15 times up to about 90 times have been realized. The challenges involved in invoking Graphical hardware for such Data mining algorithms and effects of CUDA blocks are discussed. It is interesting to note that block size of 8 is optimal for GPU with 128 internal processors.

GPUs offer a great deal of processing power and programming them may provoke deeper insights into subject identification and mapping.

Topic mappers may be able to claim NVIDIA based software/hardware and/or Sony Playstation 3 and 4 units (Cell Broadband Engine) as a business expense (check with your tax advisor).

A GPU based paper for TMRA 2011 anyone?

Computation, Information, Cognition: The Nexus and the Liminal – Book

Saturday, July 3rd, 2010

Computation, Information, Cognition: The Nexus and the Liminal by Gordana Dodig-Crnkovic and Susan Stuart, is a deeply delightful collection of essays from European Computing and Philosophy Conference (E-CAP), 2005.

I originally ordered it because of Graeme Hirst’s essay, “Views of Text Meaning in Computational Linguistics: Past, Present, and Future.” More on that in a future post but suffice it to say that he sees computational linguistics returning to a realization that “meaning” isn’t nearly as flat as some people would like to believe.

I could not help perusing some of the other essays and ran across Werner Ceusters and Barry Smith, in “Ontology as the Core Discipline of Biomedical Informatics – Legacies of the Past and Recommendations for the Future Directions of Research,” bashing the work of ISO/IEC TC 37, and its founder, Eugen Wüster, as International Standard Bad Philosophy. Not that I care for “realist ontologies” all that much but it is a very amusing essay.

Not to mention Patrick Allo’s “Formalizing Semantic Information: Lessons from Logical Pluralism.” If I say “informational pluralism” does anyone need more of a hint as to why I would like this essay?

I feel bad that I can’t mention in a reasonable sized posts all the other essays in this volume, or do more to give the flavor of those I mention above. This isn’t a scripting source book but the ideas you will find in it are going to shape the future of computation and our little corner of it for some time to come.