Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

March 21, 2014

ROOT Files

Filed under: CERN,Dictionary,Files — Patrick Durusau @ 7:17 pm

ROOT Files

From the webpage:

Today, a huge amount of data is stored into files present on our PC and on the Internet. To achieve the maximum compression, binary formats are used, hence they cannot simply be opened with a text editor to fetch their content. Rather, one needs to use a program to decode the binary files. Quite often, the very same program is used both to save and to fetch the data from those files, but it is also possible (and advisable) that other programs are able to do the same. This happens when the binary format is public and well documented, but may happen also with proprietary formats that became a standard de facto. One of the most important problems of the information era is that programs evolve very rapidly, and may also disappear, so that it is not always trivial to correctly decode a binary file. This is often the case for old files written in binary formats that are not publicly documented, and is a really serious risk for the formats implemented in custom applications.

As a solution to these issues ROOT provides a file format that is a machine-independent compressed binary format, including both the data and its description, and provides an open-source automated tool to generate the data description (or “dictionary”) when saving data, and to generate C++ classes corresponding to this description when reading back the data. The dictionary is used to build and load the C++ code to load the binary objects saved in the ROOT file and to store them into instances of the automatically generated C++ classes.
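To make the "data plus its description" idea concrete, here is a minimal Python sketch of a self-describing, compressed, byte-order-fixed file. It is not ROOT's actual on-disk format and uses no ROOT API. The magic bytes `SDF1`, the JSON header, and the function names `write_records` and `read_records` are all assumptions made up for illustration.

```python
import json
import struct
import zlib

MAGIC = b"SDF1"  # hypothetical magic bytes for this toy format


def write_records(path, fields, records):
    """Write records together with the description needed to decode them.

    fields: list of (name, struct_code) pairs, e.g. [("run", "i"), ("energy", "d")].
    """
    # The "dictionary": field names and types, stored inside the file itself.
    description = json.dumps({"fields": fields}).encode("utf-8")
    # "<" fixes little-endian byte order and standard sizes, so the bytes
    # do not depend on the machine that wrote them.
    layout = struct.Struct("<" + "".join(code for _, code in fields))
    payload = zlib.compress(b"".join(layout.pack(*rec) for rec in records))
    with open(path, "wb") as f:
        f.write(MAGIC)
        f.write(struct.pack("<I", len(description)))
        f.write(description)
        f.write(payload)


def read_records(path):
    """Decode the file using only the description embedded in it."""
    with open(path, "rb") as f:
        if f.read(4) != MAGIC:
            raise ValueError("not a toy self-describing file")
        (desc_len,) = struct.unpack("<I", f.read(4))
        fields = json.loads(f.read(desc_len).decode("utf-8"))["fields"]
        payload = zlib.decompress(f.read())
    layout = struct.Struct("<" + "".join(code for _, code in fields))
    names = [name for name, _ in fields]
    return [dict(zip(names, values)) for values in layout.iter_unpack(payload)]
```

The reader is never told the field list; it rebuilds the record layout from the header alone. That is roughly the role the quoted passage assigns to the dictionary, although ROOT goes further and generates C++ classes from it.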

ROOT files can be structured into “directories”, exactly in the same way as your operative system organizes the files into folders. ROOT directories may contain other directories, so that a ROOT file is more similar to a file system than to an ordinary file.
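The "file system inside a file" structure can be pictured with a small in-memory model. This Python sketch only illustrates nested directories addressed by slash-separated paths. The `Directory` class and its methods are hypothetical and do not correspond to ROOT's own classes.

```python
class Directory:
    """A named container holding objects and other directories."""

    def __init__(self, name):
        self.name = name
        self.entries = {}  # name -> stored object or nested Directory

    def mkdir(self, name):
        """Create (or return) a subdirectory with the given name."""
        child = self.entries.setdefault(name, Directory(name))
        if not isinstance(child, Directory):
            raise ValueError(f"{name!r} is an object, not a directory")
        return child

    def put(self, name, obj):
        """Store an object under the given name in this directory."""
        self.entries[name] = obj

    def get(self, path):
        """Look up an entry by a slash-separated path such as 'run1/hits'."""
        node = self
        for part in path.strip("/").split("/"):
            if not isinstance(node, Directory):
                raise KeyError(path)
            node = node.entries[part]
        return node

    def walk(self, prefix=""):
        """Yield the full path of every non-directory entry, in sorted order."""
        for name, entry in sorted(self.entries.items()):
            full = f"{prefix}/{name}"
            if isinstance(entry, Directory):
                yield from entry.walk(full)
            else:
                yield full


# Usage: one top-level container with a nested directory tree.
top = Directory("")
run1 = top.mkdir("run1")
run1.put("hits", [3, 5, 8])
run1.mkdir("calib").put("gain", 1.02)
```

Because directories can hold other directories, a single container can organize many unrelated objects, which is the sense in which the excerpt compares a ROOT file to a file system.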

Amit Kapadia mentions ROOT files in his presentation at CERN on citizen science.

I have only just begun to read the documentation but wanted to pass this starting place along to you.

I don’t find the “machine-independent compressed binary format” argument all that convincing, but apparently it has worked for quite some time.

Of particular interest will be the data dictionary aspects of ROOT.

Do you know of other file formats that capture both the data and its description?

1 Comment

  1. […] I mentioned ROOT files yesterday, I am curious what you make of the use of Thrift metadata definitions to read Parquet […]

    Pingback by Use Parquet with Impala, Hive, Pig, and MapReduce « Another Word For It — March 22, 2014 @ 8:05 pm
