An Asymmetric Data Conversion Scheme based on Binary Tags, by Zhu Wang; Chonglei Mei; Hai Jiang; Wilkin, G.A.
Abstract:
In distributed systems with homogeneous or heterogeneous computers, data generated on one machine might not always be directly usable by another machine. For a particular data type, its endianness, size and padding can cause incompatibility issues. A data conversion procedure is indispensable, especially in open systems. So far, there is no widely accepted data format standard in the high performance computing community. Most of the time, programmers have to handle data formats manually. In order to achieve high programmability and efficiency in both homogeneous and heterogeneous open systems, a novel asymmetric binary-tag-based data conversion scheme (BinTag) is proposed to share data smoothly. Each data item carries one binary tag generated by BinTag’s parser without much programmer involvement. Data conversion only happens when it is absolutely necessary. Experimental results have demonstrated its effectiveness and performance gains in terms of productivity and data conversion speed. BinTag can be used in both memory and secondary storage systems.
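To make the idea of "one binary tag per data item, convert only when necessary" concrete for myself, here is a minimal sketch. It is not BinTag's actual tag format or API, which the abstract does not describe. The one-byte tag layout, the function names (`make_tag`, `parse_tag`, `pack_int`, `to_local`), and the choice to do the conversion on the receiving side are all assumptions made for illustration.

```python
# Hypothetical one-byte tag (an assumption, not BinTag's real layout):
#   bit 7    = byte order of the payload (0 = little, 1 = big)
#   bits 0-6 = payload size in bytes
def make_tag(byteorder, size):
    return (0x80 if byteorder == "big" else 0x00) | (size & 0x7F)


def parse_tag(tag):
    return ("big" if tag & 0x80 else "little"), tag & 0x7F


def pack_int(value, byteorder, size):
    """Sender side: keep the sender's own layout and prepend a tag describing it."""
    payload = value.to_bytes(size, byteorder, signed=True)
    return bytes([make_tag(byteorder, size)]) + payload


def to_local(blob, local_order, local_size):
    """Receiver side: return (payload in local layout, whether conversion ran)."""
    order, size = parse_tag(blob[0])
    payload = blob[1:1 + size]
    if (order, size) == (local_order, local_size):
        # Layouts already match: hand the bytes over untouched.
        return payload, False
    # Layouts differ: decode with the sender's layout, re-encode with ours.
    # Raises OverflowError if the value does not fit in local_size bytes.
    value = int.from_bytes(payload, order, signed=True)
    return value.to_bytes(local_size, local_order, signed=True), True


blob = pack_int(-2, "big", 4)
same_layout = to_local(blob, "big", 4)      # no conversion needed
other_layout = to_local(blob, "little", 8)  # byte order and size both change
```

In this sketch a homogeneous pair of machines pays only for reading one tag byte, while a heterogeneous pair pays for the decode and re-encode step.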
Homogeneous and heterogeneous here in the sense of padding, size, and endianness? These mismatches are a serious issue for high performance computing.
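A small illustration of the three sources of mismatch, using Python's standard `struct` module. This only shows the problem, not the paper's solution. The variable names are mine, and the native sizes depend on the platform the code runs on, so no fixed values are claimed for them.

```python
import struct

value = 1

# Endianness: the same 4-byte integer has two different byte sequences.
little = struct.pack("<i", value)  # least significant byte first
big = struct.pack(">i", value)     # most significant byte first

# Reading little-endian bytes as if they were big-endian gives a wrong value.
misread = struct.unpack(">i", little)[0]  # 0x01000000 == 16777216

# Padding: a 1-byte char followed by a 4-byte int.
packed_size = struct.calcsize("<ci")  # standard mode, no alignment: 1 + 4 = 5
native_size = struct.calcsize("@ci")  # native mode may insert padding bytes

# Size: a C long is 4 bytes in struct's standard mode, but its native size
# varies by platform and compiler.
standard_long = struct.calcsize("<l")
native_long = struct.calcsize("@l")
```

Two machines that disagree on any of these three properties cannot exchange the raw bytes of a record and expect the same values.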
Are there lessons to be taught or learned here for other notions of homogeneous/heterogeneous data?
Do we need binary tags to track semantics at a higher level?
Or can we view data as though it had particular semantics? At higher and lower levels?