Is it time to get rid of the Linux OS model in the cloud?
From the post:
You program in a dynamic language, that runs on a JVM, that runs on an OS designed 40 years ago for a completely different purpose, that runs on virtualized hardware. Does this make sense? We’ve talked about this idea before in Machine VM + Cloud API – Rewriting The Cloud From Scratch, where the vision is to treat cloud virtual hardware as a compiler target and convert high-level language source code directly into kernels that run on it.
As new technologies evolve, the friction created by our old tool chains and architecture models becomes ever more obvious. Take, for example, what a team at UCSD is releasing: a phase-change memory (PCM) prototype – a solid-state storage device that provides performance thousands of times faster than a conventional hard drive and up to seven times faster than current state-of-the-art solid-state drives (SSDs). PCM’s access latencies are still several times longer than DRAM’s, however.
This technology has obvious, mind-blowing implications, but a less obvious and more interesting implication is what it says about our current standard datacenter stack. Gary Athens has written an excellent article, Revamping storage performance, spelling it all out in more detail:
Computer scientists at UCSD argue that new technologies such as PCM will hardly be worth developing for storage systems unless the hidden bottlenecks and faulty optimizations inherent in storage systems are eliminated.
Moneta bypasses a number of functions in the operating system (OS) that typically slow the flow of data to and from storage. These functions were developed years ago to organize data on disk and manage input and output (I/O). The overhead they introduce was so overshadowed by the inherent latency of a rotating disk that it seemed not to matter much. But with new technologies such as PCM, which are expected to approach dynamic random-access memory (DRAM) in speed, the delays stand in the way of the technologies’ reaching their full potential. Linux, for example, takes 20,000 instructions to perform a simple I/O request.
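The article describes a much deeper redesign, but you can get a small taste of the idea on any stock Linux box: the O_DIRECT flag tells the kernel to skip the page cache, removing one of those legacy layers from the I/O path. A minimal C sketch (mine, not the researchers’; the file name and 4096-byte block size are assumptions):

```c
/* Minimal sketch of bypassing one layer of the Linux I/O stack:
 * O_DIRECT skips the kernel page cache, so the read goes straight
 * from the device into the user buffer. O_DIRECT requires buffers
 * aligned to the device block size (4096 assumed here). */
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define BLOCK 4096  /* assumed device block size */

int main(void)
{
    void *buf;
    if (posix_memalign(&buf, BLOCK, BLOCK) != 0) {
        perror("posix_memalign");
        return 1;
    }

    /* O_DIRECT: do not stage the data in the page cache.
     * /tmp/testfile is assumed to exist. */
    int fd = open("/tmp/testfile", O_RDONLY | O_DIRECT);
    if (fd < 0) {
        perror("open");
        return 1;
    }

    ssize_t n = read(fd, buf, BLOCK);  /* one raw block, no cache copy */
    printf("read %zd bytes directly from storage\n", n);

    close(fd);
    free(buf);
    return 0;
}
```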
By redesigning the Linux I/O stack and by optimizing the hardware/software interface, researchers were able to reduce storage latency by 60% and increase bandwidth as much as 18 times.
The I/O scheduler in Linux performs various functions, such as assuring fair access to resources. Moneta bypasses the scheduler entirely, reducing overhead. Further gains come from removing all locks, which block parallelism, from the low-level driver and substituting more efficient mechanisms that do not.
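The article doesn’t show the driver code, but the pattern it names, replacing locks with mechanisms that don’t serialize, commonly means atomic operations. Here is a hedged C11 sketch of one such mechanism: a request-tag bitmap claimed by compare-and-swap instead of a mutex. The 64-tag limit and all names are my invention, not the Moneta driver’s.

```c
/* Sketch of replacing a lock with an atomic compare-and-swap:
 * request tags are claimed from a 64-slot bitmap without any
 * mutex, so concurrent submitters never serialize on a lock. */
#include <stdatomic.h>
#include <stdint.h>
#include <stdio.h>

static _Atomic uint64_t tag_bitmap = 0;  /* bit i set => tag i in use */

/* Claim a free tag, or return -1 if all 64 are taken. */
int claim_tag(void)
{
    uint64_t old = atomic_load(&tag_bitmap);
    for (;;) {
        if (old == UINT64_MAX)
            return -1;                    /* no free tags */
        int bit = __builtin_ctzll(~old);  /* lowest clear bit */
        uint64_t desired = old | (1ULL << bit);
        /* On failure, the CAS reloads 'old' and we retry. */
        if (atomic_compare_exchange_weak(&tag_bitmap, &old, desired))
            return bit;
    }
}

void release_tag(int bit)
{
    atomic_fetch_and(&tag_bitmap, ~(1ULL << bit));
}

int main(void)
{
    int t = claim_tag();
    printf("claimed tag %d\n", t);
    release_tag(t);
    return 0;
}
```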
Moneta performs I/O benchmarks 9.5 times faster than a RAID array of conventional disks, 2.8 times faster than a RAID array of flash-based solid-state drives (SSDs), and 2.2 times faster than Fusion-io’s high-end, flash-based SSD.
Read the rest of the post, and then ask yourself: what architecture do you envision for a topic map application?
What if, rather than moving data from one data structure to another, the data itself identified the address it occupies in whatever structure you choose? If you wish to “see” the data as a table, it reports its location by table/column/row. If you wish to “see” the data as a matrix, it reports its matrix position. If you wish to “see” the data as a linked list, it can report its value, plus the values ahead of and behind it.
It isn’t that difficult to imagine data reporting its location in a graph as the result of an operation, perhaps storing a graph location for every graph operation that is “run” using that data point.
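To make the speculation concrete, here is a purely hypothetical C sketch; every name in it is invented for illustration and comes from no real system. A datum carries its own coordinates in each projection, so “seeing” it as a table, matrix, or list becomes a matter of asking rather than copying:

```c
/* Hypothetical "self-locating" datum: the value carries its own
 * coordinates in each projection and reports them on demand. */
#include <stdio.h>

struct datum {
    double value;
    /* table view */
    const char *table;
    int row, col;
    /* matrix view */
    int m_i, m_j;
    /* linked-list view: neighbors, if any */
    struct datum *prev, *next;
};

static void report(const struct datum *d)
{
    printf("value %.2f: table %s[%d,%d], matrix (%d,%d)\n",
           d->value, d->table, d->row, d->col, d->m_i, d->m_j);
}

int main(void)
{
    struct datum a = { 42.0, "sales", 3, 7, 1, 2, NULL, NULL };
    report(&a);  /* the datum answers "where am I?" in every view */
    return 0;
}
```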
True enough, we need to create topic maps that run on conventional hardware and software, but that isn’t an excuse to ignore possible futures.
Reminds me of a “grook” that I read years ago: “You will conquer the present suspiciously fast – if you smell of the future and stink of the past.” (Piet Hein, but I don’t remember which book.)