Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 20, 2011

bigdata®

Filed under: bigdata®,NoSQL — Patrick Durusau @ 8:23 pm

bigdata®

Bryan Thompson, one of the creators of bigdata(R), was a member of the effort that resulted in the XTM syntax for topic maps.

If Bryan says it scales, it scales.

What I did not see was the ability to document mappings between data as representing the same subjects. Or the ability to query such mappings. Still, on further digging I may uncover something that works that way.

From the webpage:

This is a major version release of bigdata(R). Bigdata is a horizontally-scaled, open-source architecture for indexed data with an emphasis on RDF capable of loading 1B triples in under one hour on a 15 node cluster. Bigdata operates in both a single machine mode (Journal) and a cluster mode (Federation). The Journal provides fast scalable ACID indexed storage for very large data sets, up to 50 billion triples / quads. The federation provides fast scalable shard-wise parallel indexed storage using dynamic sharding and shard-wise ACID updates and incremental cluster size growth. Both platforms support fully concurrent readers with snapshot isolation.

Distributed processing offers greater throughput but does not reduce query or update latency. Choose the Journal when the anticipated scale and throughput requirements permit. Choose the Federation when the administrative and machine overhead associated with operating a cluster is an acceptable tradeoff to have essentially unlimited data scaling and throughput.

See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and [3,5,6] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [7].

Starting with the 1.0.0 release, we offer a WAR artifact [8] for easy installation of the single machine RDF database. For custom development and cluster installations we recommend checking out the code from SVN using the tag for this release. The code will build automatically under eclipse. You can also build the code using the ant script. The cluster installer requires the use of the ant script.

You can download the WAR from:

http://sourceforge.net/projects/bigdata/

You can checkout this release from:

https://bigdata.svn.sourceforge.net/svnroot/bigdata/tags/BIGDATA_RELEASE_1_1_0

New features:

  • Fast, scalable native support for SPARQL 1.1 analytic queries;
  • %100 Java memory manager leverages the JVM native heap (no GC);
  • New extensible hash tree index structure.

Feature summary:

– Single machine data storage to ~50B triples/quads (RWStore);

  • Clustered data storage is essentially unlimited;
  • Simple embedded and/or webapp deployment (NanoSparqlServer);
  • Triples, quads, or triples with provenance (SIDs);
  • Fast 100% native SPARQL 1.0 evaluation;
  • Integrated “analytic” query package;
  • Fast RDFS+ inference and truth maintenance;
  • Fast statement level provenance mode (SIDs).

Road map [3]:

  • Simplified deployment, configuration, and administration for clusters; and
  • High availability for the journal and the cluster.

(footnotes omitted)

PS: Jack Park forwarded this to my attention. Will have to download and play with it over the holidays.

2 Comments

  1. Now this was interesting. I’ve queued both their whitepapers for reading.

    Comment by larsga@garshol.priv.no — December 21, 2011 @ 3:37 am

  2. Agreed, but I thought you would have liked the bitmapped vector trie under the Scala article as well. Nice performance characteristics for some purposes.

    Comment by Patrick Durusau — December 21, 2011 @ 7:33 am

RSS feed for comments on this post.

Sorry, the comment form is closed at this time.

Powered by WordPress