## Archive for the ‘Supercomputing’ Category

### GPU + Russian Algorithm Bests Supercomputer

Thursday, June 30th, 2016

No need for supercomputers

From the post:

Senior researchers Vladimir Pomerantcev and Olga Rubtsova, working under the guidance of Professor Vladimir Kukulin (SINP MSU), were able to use an ordinary desktop PC with a GPU to solve complicated integral equations of quantum mechanics — previously solvable only on powerful, expensive supercomputers. According to Vladimir Kukulin, the personal computer does the job much faster: in 15 minutes it does work that normally requires 2-3 days of supercomputer time.

The main problem in solving the scattering equations of multiple quantum particles was the calculation of the integral kernel — a huge two-dimensional table with tens or hundreds of thousands of rows and columns, each element of which is the result of extremely complex calculations. But this table looks much like a monitor screen with tens of billions of pixels, and with a good GPU it was quite possible to calculate all of them. Using software developed by Nvidia and writing their own programs, the researchers split their calculations across many thousands of streams and were able to solve the problem brilliantly.
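The structure described here — a large matrix whose entries are independent, individually expensive integrals — is exactly the shape of problem that maps onto thousands of GPU threads. A minimal sketch of the idea in NumPy (vectorized CPU code standing in for the GPU; the integrand and grid sizes are illustrative toys, not the authors' actual kernel):

```python
import numpy as np

# Toy stand-in for the integral kernel: each element K[i, j] is an
# independent numerical integral, so all of them can be computed in parallel.
def kernel_element(p, q, x):
    # illustrative integrand only; the real physics is far more involved
    return np.exp(-(p - q) ** 2 * x) * np.cos(p * q * x)

def build_kernel(momenta, n_quad=64):
    # fixed Gauss-Legendre quadrature over x in [0, 1]
    x, w = np.polynomial.legendre.leggauss(n_quad)
    x = 0.5 * (x + 1.0)          # map nodes from [-1, 1] to [0, 1]
    w = 0.5 * w                  # rescale weights accordingly
    p = momenta[:, None, None]   # shape (N, 1, 1)
    q = momenta[None, :, None]   # shape (1, N, 1)
    # one big vectorized evaluation: N * N * n_quad integrand values at once
    return np.sum(kernel_element(p, q, x[None, None, :]) * w, axis=-1)

momenta = np.linspace(0.1, 5.0, 200)
K = build_kernel(momenta)
print(K.shape)  # (200, 200): 40,000 independent integrals
```

On a GPU, each of those matrix elements would simply become one thread (one "pixel" of the screen, in the analogy above).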

“We reached a speed we couldn’t even dream of,” Vladimir Kukulin said. “The program computes 260 million complex double integrals on a desktop computer within three seconds. No comparison with supercomputers! My colleague from the University of Bochum in Germany (recently deceased, sadly), whose lab did the same work, carried out the calculations on one of the largest supercomputers in Germany, with the famous Blue Gene architecture, which is actually very expensive. And what his group spends two or three days computing, we do in 15 minutes without spending a dime.”

The most amazing thing is that graphics processors of the required quality, and a huge amount of software for them, have existed for ten years already, but no one used them for such calculations, preferring supercomputers. In any case, our physicists surprised their Western counterparts considerably.

One of the principal beneficiaries of the US restricting the export of the latest generation of computer technology to the former USSR was, of course, Russia.

Deprived of the latest hardware, Russian mathematicians and computer scientists were forced to be more efficient with equipment that was one or two generations behind the state of the art.

Parity between the USSR and the USA in nuclear weapons is testimony to their success and the failure of US export restriction policies.

For the technical details: V.N. Pomerantsev, V.I. Kukulin, O.A. Rubtsova, S.K. Sakhiev. Fast GPU-based calculations in few-body quantum scattering. Computer Physics Communications, 2016; 204: 121 DOI: 10.1016/j.cpc.2016.03.018.

### Raspberry Pi Zero — The $5 Tiny Computer is Here [Paper Thoughts?]

Saturday, November 28th, 2015

Raspberry Pi Zero — The $5 Tiny Computer is Here by Swati Khandelwal.

From the post:

Get ready for a Thanksgiving celebration from the Raspberry Pi Foundation.

Raspberry Pi, the charitable foundation behind the United Kingdom’s best-selling computer, has just unveiled its latest wonder – the Raspberry Pi Zero.

From the post:

Makers, academics and generally anyone who likes to play with computers: get ready for some awesomesauce. Raspberry Pis, the tiny Linux computers that currently sell for $35, are getting a makeover that will give a tremendous boost to their compute power and double their memory while still keeping their price the same. The Pi 2 boards will be available today, and Pi creator and CEO of Raspberry Pi (Trading) Ltd. Eben Upton says the organization has already built 100,000 units, so buyers shouldn’t have to wait like they did at the original Pi launch.

The Pi 2 will have the following specs:

• SoC: Broadcom BCM2836 (CPU, GPU, DSP, SDRAM, and single USB port)
• CPU: 900 MHz quad-core ARM Cortex A7 (ARMv7 instruction set)
• GPU: Broadcom VideoCore IV @ 250 MHz, OpenGL ES 2.0 (24 GFLOPS), 1080p30 MPEG-2 and VC-1 decoder (with license), 1080p30 h.264/MPEG-4 AVC high-profile decoder and encoder
• Memory: 1 GB (shared with GPU)
• Total backwards compatibility (in terms of multimedia, form-factor and interfacing) with Pi 1

Why order a new Raspberry Pi? Well, Kevin Trainor had Ontopia running on the first version: Ontopia Runs on Raspberry Pi [This Rocks!]

Hadoop on a Raspberry Pi

And bear in mind my post: 5,000 operations per second – Computations for Hydrogen Bomb.

What are you going to design with your new Raspberry Pi? If 5,000 operations per second could design a hydrogen bomb, what can you do with a 24 GFLOPS video chip? Faster Pac-Man, more detailed WarCraft, or Call of Duty, Future Warfare 4? Money makers no doubt, but at the end of the day, still substitutes for changing the world.

### Supercomputing frontiers and innovations

Saturday, August 9th, 2014

Supercomputing frontiers and innovations (New Journal)

From the homepage:

Parallel scientific computing has entered a new era. Multicore processors on desktop computers make parallel computing a fundamental skill required by all computer scientists.
High-end systems have surpassed the Petaflop barrier, and significant efforts are devoted to the development of the next generation of hardware and software technologies towards Exascale systems. This is an exciting time for computing as we begin the journey on the road to exascale computing. ‘Going to the exascale’ will mean radical changes in computing architecture, software, and algorithms – basically, vastly increasing the levels of parallelism to the point of billions of threads working in tandem – which will force radical changes in how hardware is designed and how we go about solving problems. There are many computational and technical challenges ahead that must be overcome. The challenges are great, different than the current set of challenges, and exciting research problems await us.

This journal, Supercomputing Frontiers and Innovations, gives an introduction to the area of innovative supercomputing technologies, prospective architectures, scalable and highly parallel algorithms, languages, data analytics, issues related to computational co-design, and cross-cutting HPC issues, as well as papers on supercomputing education and massively parallel computing applications in science and industry. This journal provides immediate open access to its content on the principle that making research freely available to the public supports a greater global exchange of knowledge.

We hope you find this journal timely, interesting, and informative. We welcome your contributions, suggestions, and improvements to this new journal. Please join us in making this exciting new venture a success. We hope you will find Supercomputing Frontiers and Innovations an ideal venue for the publication of your team’s next exciting results.

Becoming “massively parallel” isn’t going to free “computing applications in science and industry” from semantics. If anything, the more complex applications become, the easier it will be to mislay semantics, to the user’s peril.
Semantic efforts that did not scale for applications in the last decade face even dimmer prospects in the face of “big data” and massively parallel applications.

I suggest we move the declaration of semantics closer to, or to, the authors of content/data. At least as a starting point for discussion/research.

### Supercomputing on the cheap with Parallella

Tuesday, December 10th, 2013

Supercomputing on the cheap with Parallella by Federico Lucifredi.

From the post:

Packing impressive supercomputing power inside a small credit-card-sized board running Ubuntu, Adapteva’s $99 ARM-based Parallella system includes the unique Epiphany numerical accelerator that promises to unleash industrial-strength parallel processing on the desktop at a rock-bottom price. The Massachusetts-based startup recently ran a successfully funded Kickstarter campaign and gained widespread attention, only to run into a few roadblocks along the way. Now, with their setbacks behind them, Adapteva is slated to deliver its first units mid-December 2013, with volume shipping in the following months.

What makes the Parallella board so exciting is that it breaks new ground: imagine an Open Source Hardware board, powered by just a few Watts of juice, delivering 90 GFLOPS of number crunching. Combine this with the possibility of clustering multiple boards, and suddenly the picture of an exceedingly affordable desktop supercomputer emerges.
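Those numbers are easy to put in perspective with back-of-envelope arithmetic (the 90 GFLOPS and $99 price are from the post; the 5 W figure is the board's typical draw quoted by Adapteva, so treat it as an assumption):

```python
# Back-of-envelope efficiency of the Parallella board.
gflops = 90.0        # "delivering 90 GFLOPS of number crunching"
watts = 5.0          # "just a few Watts of juice"; ~5 W typical (assumed)
price_usd = 99.0     # Kickstarter pledge price

gflops_per_watt = gflops / watts          # 18.0
gflops_per_dollar = gflops / price_usd    # ~0.91

print(f"{gflops_per_watt:.1f} GFLOPS/W, {gflops_per_dollar:.2f} GFLOPS/$")
```

Whatever the exact wattage, the point stands: perf-per-watt and perf-per-dollar in this range were, at the time, server-class numbers on a hobbyist budget.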

This review looks in-depth at a pre-release prototype board (so-called Generation Zero, a development run of 50 units), giving you a pretty complete overview of what the finished board will look like.

Whether you participate in this aspect of the computing revolution or not, you will be impacted by it.

The more successful Parallella and similar efforts become at bringing supercomputing to the desktop, the more pressure there will be on cloud computing providers to match those capabilities at lower prices.

Another point of impact will be non-production experimentation with parallel processing. Like Thomas Edison, experimenters may discover (or re-discover) 10,000 ways that don’t work, but also one that far exceeds anyone’s expectations.

That is to say that supercomputing will become cheap enough to tolerate frequent failure while experimenting with it.

What would you like to invent for supercomputing?

### Programming model for supercomputers of the future

Wednesday, June 26th, 2013

Programming model for supercomputers of the future

From the post:

The demand for even faster, more effective, and also energy-saving computer clusters is growing in every sector. The new asynchronous programming model GPI from Fraunhofer ITWM might become a key building block towards realizing the next generation of supercomputers.

High-performance computing is one of the key technologies for numerous applications that we have come to take for granted – everything from Google searches to weather forecasting and climate simulation to bioinformatics requires an ever increasing amount of computing resources. Big data analysis is additionally driving the demand for even faster, more effective, and also energy-saving computer clusters. The number of processors per system has now reached the millions and looks set to grow even faster in the future. Yet something has remained largely unchanged over the past 20 years, and that is the programming model for these supercomputers. The Message Passing Interface (MPI) ensures that the microprocessors in the distributed systems can communicate. For some time now, however, it has been reaching the limits of its capability.

“I was trying to solve a calculation and simulation problem related to seismic data,” says Dr. Carsten Lojewski from the Fraunhofer Institute for Industrial Mathematics ITWM. “But existing methods weren’t working. The problems were a lack of scalability, the restriction to bulk-synchronous, two-sided communication, and the lack of fault tolerance. So out of my own curiosity I began to develop a new programming model.” This development work ultimately resulted in the Global Address Space Programming Interface – or GPI – which uses the parallel architecture of high-performance computers with maximum efficiency.

GPI is based on a completely new approach: an asynchronous communication model, which is based on remote completion. With this approach, each processor can directly access all data – regardless of which memory it is on and without affecting other parallel processes. Together with Rui Machado, also from Fraunhofer ITWM, and Dr. Christian Simmendinger from T-Systems Solutions for Research, Dr. Carsten Lojewski is receiving a Joseph von Fraunhofer prize this year.
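The contrast with MPI's two-sided model can be sketched in plain Python with threads: in the one-sided style below, a worker reads a peer's partition of a shared "global address space" directly, without the owner posting a matching send. This is a conceptual illustration only, not the GPI API (GPI itself is a C library for distributed-memory machines):

```python
import threading

# A partitioned "global address space" in miniature: each rank owns one
# partition, but any rank may read any partition directly.
global_space = {0: list(range(10)), 1: list(range(10, 20))}
locks = {rank: threading.Lock() for rank in global_space}
results = {}

def worker(rank, peer):
    # One-sided read: fetch the peer's partition whenever convenient.
    # The peer never blocks and never posts a matching "send".
    with locks[peer]:
        remote = list(global_space[peer])
    results[rank] = sum(remote)

threads = [threading.Thread(target=worker, args=(0, 1)),
           threading.Thread(target=worker, args=(1, 0))]
for t in threads: t.start()
for t in threads: t.join()

print(results[0], results[1])  # 145 45
```

In a two-sided model, the same exchange would require both ranks to post matching send/receive operations, coupling their progress; "remote completion" means the reading side alone determines when the transfer happens.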

The post concludes with the observation that “…GPI is a tool for specialists….”

Rather surprising, since it wasn’t that many years ago that Hadoop was a tool for specialists. Or that “data mining” was a tool for specialists.

In the last year both Hadoop and “data mining” have come within reach of nearly average users.

If GPI is successful for a broad range of problems, a few years will find it under the hood of any nearby cluster.

Perhaps sooner if you take an interest in it.

### Raspberry Pi: Up and Running

Saturday, January 5th, 2013

Raspberry Pi: Up and Running by Matt Richardson.

From the post:

For those of you who haven’t yet played around with Raspberry Pi, this one’s for you. In this how-to video, I walk you through how to get a Raspberry Pi up and running. It’s the first in a series of Raspberry Pi videos that I’m making to accompany Getting Started with Raspberry Pi, a book I wrote with Shawn Wallace. The book covers Raspberry Pi and Linux basics and then works up to using Scratch, Python, GPIO (to control LED’s and switches), and web development on the board.

For a sense of the range of applications for the Raspberry Pi, consider Water Droplet Photography:

We knew when we were designing it that the Pi would make a great bit of digital/real-world meccano. We hoped we’d see a lot of projects we hadn’t considered ourselves being made with it. We’re never so surprised by what people do with it as we are by some of the photography projects we see.

Using a €15 solenoid valve, some Python and a Raspberry Pi to trigger the valve and the camera shutter at the same time, Dave has built a rig for taking water droplet photographs.
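The synchronization at the heart of such a rig — releasing the valve and the shutter at the same instant — can be sketched with a thread barrier. The GPIO calls are replaced by stubs here; on the Pi itself they would be pin writes via a library such as RPi.GPIO (pin wiring and function names are illustrative assumptions):

```python
import threading, time

events = []  # (name, timestamp) pairs; list.append is thread-safe

def open_valve():
    # stub: a real rig would pulse the GPIO pin driving the solenoid valve
    events.append(("valve", time.monotonic()))

def trip_shutter():
    # stub: a real rig would pulse the pin wired to the camera remote
    events.append(("shutter", time.monotonic()))

barrier = threading.Barrier(2)  # releases both threads together

def trigger(action):
    barrier.wait()  # block until both parties arrive, then fire
    action()

threads = [threading.Thread(target=trigger, args=(fn,))
           for fn in (open_valve, trip_shutter)]
for t in threads: t.start()
for t in threads: t.join()

skew = abs(events[0][1] - events[1][1])
print(sorted(name for name, _ in events), f"skew {skew * 1e3:.2f} ms")
```

In practice the rig would also add a tunable delay before the shutter so the camera catches the droplet mid-fall, which would be a short sleep in one of the two actions.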

The build your own computer kits started us on the path to today.

This is a build your own parallel/supercomputer kit.

Where do you want to go tomorrow?

### Educational manual for Raspberry Pi released [computer science set]

Thursday, January 3rd, 2013

Educational manual for Raspberry Pi released

From the post:

Created by a team of teachers from Computing at School, the newly published Raspberry Pi Education Manual sets out to provide support for teachers and educators who want to use the Raspberry Pi in a teaching environment. As education has been part of the Raspberry Pi Foundation’s original mission, the foundation has supported the development of the manual.

The manual has chapters on the basics of Scratch, experiments with Python, connecting programs with Twitter and other web services, connecting up the GPIO pins to control devices, and using the Linux command line. Two chapters, one on Greenfoot and one on GeoGebra, are not currently included in the manual, as both applications require a Java virtual machine, which is currently being optimised for the Pi platform.

The Scratch section, for example, explains how to work with the graphical programming environment and use sprites, first to animate a cat, then make a man walk, and then animate a bee pollinating flowers. It then changes gear to show how to use Scratch for solving maths problems using variables, creating an “artificial intelligence”, driving a robot, making a car follow a line, and animating a level crossing, and wraps up with a section on creating games.

Reminded me of Kevin Trainor’s efforts: Ontopia Runs on Raspberry Pi [This Rocks!].

The description in the manual of the Raspberry Pi as a “computer science set” seems particularly appropriate.

What are you going to discover?

### Parallel Computing – Prof. Alan Edelman

Saturday, December 29th, 2012

Parallel Computing – Prof. Alan Edelman MIT Course Number 18.337J / 6.338J.

From the webpage:

This is an advanced interdisciplinary introduction to applied parallel computing on modern supercomputers. It has a hands-on emphasis on understanding the realities and myths of what is possible on the world’s fastest machines. We will make prominent use of the Julia Language software project.

A “modern supercomputer” may be in your near term future. Would not hurt to start preparing now.

Similar courses that you would recommend?

### Ontopia Runs on Raspberry Pi [This Rocks!]

Tuesday, December 4th, 2012

Ontopia Runs on Raspberry Pi by Kevin Trainor.

From the post:

I am pleased to report that I have had the Ontopia Topic Maps software running on my Raspberry Pi for the past week. Ontopia is a suite of open source tools for building, maintaining and deploying Topic Maps-based applications. The Raspberry Pi is an ultra-affordable ARM GNU/Linux box based upon the work of the Raspberry Pi Foundation. My experience in running the out-of-the-box Ontopia apps (Ontopoly topic map editor, Omnigator topic map browser, and Vizigator topic map visualizer) has been terrific. Using the Raspberry Pi to run the Apache Tomcat server that hosts the Ontopia software, response time is as good as or better than I have experienced when hosting the Ontopia software on a cloud-based Linux server at my ISP. Topic maps open quickly in all three applications and navigation from topic to topic within each application is downright snappy.

As you will see in my discussion of testing below, I have experienced good results with up to two simultaneous users. So, my future test plans include testing with more simultaneous users and testing with the Ontopia RDBMS Backend installed. Based upon the performance that I have experienced so far, I have high hopes. Stay tuned for further reports.

What a great way to introduce topic maps to experimenters!

Thanks Kevin!

Awaiting future results! (And for a Raspberry Pi to arrive!)

### 2013 International Supercomputing Conference

Monday, December 3rd, 2012

2013 International Supercomputing Conference

Important Dates

• Abstract Submission Deadline: Sunday, January 27, 2013, 23:59 AoE
• Full Paper Submission Deadline: Sunday, February 10, 2013, 23:59 AoE
• Author Notification: Sunday, March 10, 2013
• Rebuttal Phase Starts: Sunday, March 10, 2013
• Rebuttal Phase Ends: Sunday, March 17, 2013
• Notification of Acceptance: Friday, March 22, 2013
• Camera-Ready Submission: Sunday, April 7, 2013

From the call for papers:

• Architectures (multicore/manycore systems, heterogeneous systems, network technology and programming models)
• Algorithms and Analysis (scalability on future architectures, performance evaluation and tuning)
• Large-Scale Simulations (workflow management, data analysis and visualization, coupled simulations and industrial simulations)
• Future Trends (Exascale HPC, HPC in the Cloud)
• Storage and Data (file systems and tape libraries, data intensive applications and databases)
• Software Engineering in HPC (application of methods, surveys)
• Supercomputing Facility (batch job management, job mix and system utilization and monitoring and administration tools)
• Scalable Applications: 50k+ (ISC Research thrust). The Research Paper committee encourages scientists to submit parallelization approaches that lead to scalable applications on more than 50,000 (CPU or GPU) cores
• Submissions on other innovative aspects of high-performance computing are also welcome.

Did I mention it will be in Leipzig, Germany? 😉

### SC12 Salt Lake City, Utah (Proceedings)

Thursday, November 22nd, 2012

SC12 Salt Lake City, Utah

Proceedings from SC12 are online!

ACM Digital Library: SC12 Conference Proceedings

IEEE Xplore: SC12 Conference Proceedings

Everything from graphs to search and lots in between.

Enjoy!

### The Titan Informatics Toolkit

Wednesday, October 17th, 2012

The Titan Informatics Toolkit

From the webpage:

A collaborative effort between Sandia National Laboratories and Kitware Inc., the Titan™ Informatics Toolkit is a collection of scalable algorithms for data ingestion and analysis that share a common set of data structures and a flexible, component-based pipeline architecture. The algorithms in Titan span a broad range of structured and unstructured analysis techniques, and are particularly suited to parallel computation on distributed memory supercomputers.

Titan components may be used by application developers using their native C++ API on all popular platforms, or using a broad set of language bindings that include Python, Java, TCL, and more. Developers will combine Titan components with their own application-specific business logic and user interface code to address problems in a specific domain. Titan is used in applications varying from command-line utilities and straightforward graphical user interface tools to sophisticated client-server applications and web services, on platforms ranging from individual workstations to some of the most powerful supercomputers in the world.

I stumbled across this while searching for the Titan (as in graph database) project.

The Parallel Latent Semantic Analysis component is available now. I did not see release dates on other modules, such as Advanced Graph Algorithms.

Source (C++) for the Titan Informatics Toolkit is available.

### Parallella: A Supercomputer For Everyone

Friday, October 5th, 2012

Parallella: A Supercomputer For Everyone

For a $99 pledge you help make the Parallella computer a reality (and get one when produced).

• Dual-core ARM A9 CPU
• Epiphany Multicore Accelerator (16 or 64 cores)
• 1GB RAM
• MicroSD Card
• USB 2.0 (two)
• Two general purpose expansion connectors
• Ethernet 10/100/1000
• HDMI connection
• Ships with Ubuntu OS
• Ships with free open source Epiphany development tools that include C compiler, multicore debugger, Eclipse IDE, OpenCL SDK/compiler, and run time libraries
• Dimensions are 3.4” x 2.1”

Once completed, the Parallella computer should deliver up to 45 GHz of equivalent CPU performance on a board the size of a credit card while consuming only 5 Watts under typical work loads. Counting GHz, this is more horsepower than a high end server costing thousands of dollars and consuming 400W.

$99 to take a flyer on changing the fabric of supercomputing?

I’ll take that chance. How about you?

PS: Higher pledge amounts carry extra benefits, such as projected delivery of a beta version by December of 2012 ($5,000). Got a hard core geek on your holiday shopping list?

PPS: I first saw this at: Adapteva Launches Crowd-Source Funding for Its Floating Point Accelerator by Michael Feldman (HPC).

### A Raspberry Pi Supercomputer

Wednesday, September 12th, 2012

A Raspberry Pi Supercomputer

If you need a supercomputer for processing your topic maps, an affordable one is at hand. Some assembly required. With Legos no less.

From the ScienceDigest post:

Computational Engineers at the University of Southampton have built a supercomputer from 64 Raspberry Pi computers and Lego.

The team, led by Professor Simon Cox, consisted of Richard Boardman, Andy Everett, Steven Johnston, Gereon Kaiping, Neil O’Brien, Mark Scott and Oz Parchment, along with Professor Cox’s son James Cox (aged 6), who provided specialist support on Lego and system testing.

Professor Cox comments: “As soon as we were able to source sufficient Raspberry Pi computers we wanted to see if it was possible to link them together into a supercomputer. We installed and built all of the necessary software on the Pi starting from a standard Debian Wheezy system image and we have published a guide so you can build your own supercomputer.”

The racking was built using Lego with a design developed by Simon and James, who has also been testing the Raspberry Pi by programming it using free computer programming software Python and Scratch over the summer. The machine, named “Iridis-Pi” after the University’s Iridis supercomputer, runs off a single 13 Amp mains socket and uses MPI (Message Passing Interface) to communicate between nodes using Ethernet. The whole system cost under £2,500 (excluding switches) and has a total of 64 processors and 1Tb of memory (16Gb SD cards for each Raspberry Pi). Professor Cox uses the free plug-in ‘Python Tools for Visual Studio’ to develop code for the Raspberry Pi.
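The cluster's headline numbers check out with simple arithmetic (the per-node cost is my own division, not a figure from the post):

```python
# Iridis-Pi totals, from the figures quoted in the post.
nodes = 64
sd_gb = 16                      # one 16 GB SD card per Raspberry Pi
total_gb = nodes * sd_gb
print(total_gb)                 # 1024 GB, i.e. the "1Tb" quoted above

budget_gbp = 2500               # "under £2,500 (excluding switches)"
cost_per_node = budget_gbp / nodes
print(round(cost_per_node, 2))  # 39.06 per node, switches excluded
```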
You may also want to visit the Raspberry Pi Foundation, which has the slogan: “An ARM GNU/Linux box for $25. Take a byte!”

In an age with ready access to cloud computing resources, to say nothing of weapons-grade toys (PlayStation 3s) for design simulations, there is still a place for inexpensive experimentation.

What hardware configurations will you test out on your Raspberry Pi Supercomputer?

Are there specialized configurations that work better for some subject identity tests than others?

How do hardware constraints influence our approaches to computational problems?

Are we missing solutions because they don’t fit current architectures and therefore aren’t considered? (Not rejected, just don’t come up at all.)

### An Open Source Platform for Virtual Supercomputing

Wednesday, September 7th, 2011

An Open Source Platform for Virtual Supercomputing, Michael Feldman reports:

Erlang Solutions and Massive Solutions will soon launch a new cloud platform for high performance computing. Last month they announced their intent to bring a virtual supercomputer (VSC) product to market, the idea being to enable customers to share their HPC resources either externally or internally, in a cloud-like manner, all under the banner of open source software.

The platform will be based on Clustrx and Xpandrx, two HPC software operating systems that were the result of several years of work done by Erlang Solutions, based in the UK, and Massive Solutions, based in Gibraltar. Massive Solutions has been the driving force behind the development of these two OS’s, using Erlang language technology developed by its partner.

In a nutshell, Clustrx is an HPC operating system, or more accurately, middleware, which sits atop Linux, providing the management and monitoring functions for supercomputer clusters. It is run on its own small server farm of one or more nodes, which are connected to the compute servers that make up the HPC cluster. The separation between management and compute enables it to support all the major Linux distros as well as Windows HPC Server. There is a distinct Clustrx-based version of Linux for the compute side as well, called Compute Based Linux.

A couple of things to note from within the article:

The only limitation to this model is its dependency on the underlying capabilities of Linux. For example, although Xpandrx is GPU-aware, since GPU virtualization is not yet supported in any Linux distros, the VSC platform can’t support virtualization of those resources. More exotic HPC hardware technology would, likewise, be out of the virtual loop.

The common denominator for VSC is Erlang: not just the company, but the language http://www.erlang.org/, which is designed for programming massively scalable systems. The Erlang runtime has built-in support for things like concurrency, distribution and fault tolerance. As such, it is particularly suitable for HPC system software and large-scale interprocess communication, which is why both Clustrx and Xpandrx are implemented in the language.

As computing power and access to it increase, have you seen an increase in robust (in your view) topic map applications?