Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

January 26, 2012

New Opportunities for Connected Data (logic, contagion relationships and merging)

Filed under: Logic,Neo4j,NoSQL — Patrick Durusau @ 6:46 pm

New Opportunities for Connected Data by Ian Robinson, Neo Technologies, Inc.

An in depth discussion of relational, NoSQL and graph database views of the world.

I must admit to being surprised when James Frazer’s Golden Bough came up in the presentation. It was used quite effectively as an illustration but I have learned to not expect humanities references or examples in CS presentations. This was a happy exception.

I agree with Ian that the relational world view remains extremely useful but also that it limits the data that can be represented and queried.

Complex relationships between entities simply don’t come up with relational databases because they aren’t easy, if even possible, to represent.

I would take Ian’s point a step further and point out that logic, as in RDF and the Semantic Web, is a similar constraint.

Logic can be very useful in any number of areas, just like relational databases, but it only represents a very small slice of the world. A slice of the world that can be represented quite artificially without contradictions, omissions, inconsistencies, or any of the other issues that make logic systems fall over clutching their livers.

BTW, topic mappers need to take a look at timemark 34.26: the representation of the companies that employ workers and the “contagion” relationships. (You will have to watch the video to find out why I say “contagion.” It is worth the time.) Does that suggest to you that I could point topics to a common node based on their possession of some property, say a subject identifier? Such that when I traverse any of those topics I can go to the common node and produce a “merged” result if desired?

I say that because any topic could point to more than one common node, depending upon the world view of an author. That could be very interesting in terms of comparing how authors would merge topics.
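A minimal Python sketch of what I have in mind (a hypothetical data model, not any actual topic map API): topics carrying the same subject identifier all point at a common node, and traversing from any one of them can produce a “merged” result on demand.

```python
# Hypothetical sketch: topics that share a subject identifier point to a
# common node; traversing any of them can yield a merged view on demand.

def common_nodes(topics):
    """Group topic ids by each subject identifier they carry."""
    nodes = {}
    for topic_id, props in topics.items():
        for sid in props.get("subject_identifiers", []):
            nodes.setdefault(sid, set()).add(topic_id)
    return nodes

def merged_view(topics, nodes, topic_id):
    """Merge every topic reachable through a common node with topic_id."""
    members = {topic_id}
    for sid in topics[topic_id].get("subject_identifiers", []):
        members |= nodes.get(sid, set())
    merged = {}
    for tid in sorted(members):
        for key, value in topics[tid].items():
            merged.setdefault(key, []).append(value)
    return merged

topics = {
    "t1": {"name": "Mark Twain",
           "subject_identifiers": ["http://example.org/twain"]},
    "t2": {"name": "Samuel Clemens",
           "subject_identifiers": ["http://example.org/twain"]},
}
nodes = common_nodes(topics)
view = merged_view(topics, nodes, "t1")
print(sorted(view["name"]))  # both names appear in the merged result
```

Nothing stops a topic from pointing at several common nodes, one per subject identifier, which is exactly the author-dependent merging I am wondering about above.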

Google: MoreSQL is Real

Filed under: NoSQL — Patrick Durusau @ 6:41 pm

Google: MoreSQL is Real by Williams Edwards.

One comment on the post summarized it:

Super rant that really crystallized my discomfort with the whole NoSQL business .. At the end of the day, it’s a ‘war’ between various APIs to access B/B+ trees !

Well, but it is an enjoyable rant, so read it and see for yourself.

I do think one of the advantages of all the hype has been an increase in at least considering different options and data structures. Some of them will be less useful than the ones that are common now, but it only takes one substantial improvement to make it all worthwhile.

January 25, 2012

3rd Globals Challenge

Filed under: Contest,Globalsdb,NoSQL — Patrick Durusau @ 3:25 pm

3rd Globals Challenge

Contest starts: 10 Feb 12 18:00 EST
Contest ends: 17 Feb 12 18:00 EST

Topic mappers take note:

All applications must be built using Globals. However, you are also allowed to use additional technologies to supplement Globals. (Emphasis added: additional technologies are allowed, unlike in some linked data competitions.)

The email I got reports:

  • A cash prize of USD $3,500 for the winning entry
  • A press release announcing the winning participant and solution
  • A chance to win a free registration for the InterSystems Global Summit

You might want to drop by Globals to grab a copy of the software and read up on the documentation.

You can also see the prior challenges. These are non-trivial events but that also means you will learn a lot in the process.

January 21, 2012

Sensei – Major Update

Filed under: NoSQL,Sensei — Patrick Durusau @ 10:08 pm

Sensei

My first post on Sensei was December 10, 2010 – Sensei – which if you follow the link given there, redirects to the new page.

The present homepage reads in part:

SenseiDB

Open-source, distributed, realtime, semi-structured database

Powering LinkedIn homepage and LinkedIn Signal.

Some Features:

  • Full-text search
  • Fast realtime updates
  • Structured and faceted search
  • Fast key-value lookup
  • High performing under concurrent heavy update and query volumes
  • Hadoop integration

Quite different, and these are not idle claims about the numbers: I have heard of LinkedIn, as I am sure you have as well. 😉

I appreciate the effort to stay as close to SQL as possible, but lacking a copy of the current SQL standard (I need to fix that), I don’t know how much Sensei has diverged from SQL, or why.

Not to nit-pick too much but entries like:

Note that wildcards % and _, not Lucene’s * and ? are used in BQL. This is mainly to make BQL more compatible with SQL. However, if * or ? is used, it is also accepted.

which I saw just scanning the documentation, say to me that a close editing pass would be a useful thing.
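The wildcard mapping the note describes is simple enough to sketch (a naive illustration that ignores escaping, not anything from the Sensei codebase):

```python
# Illustrative only: mapping SQL-style LIKE wildcards (% and _) to
# Lucene-style wildcards (* and ?), as the BQL note describes.

def sql_to_lucene(pattern: str) -> str:
    return pattern.translate(str.maketrans({"%": "*", "_": "?"}))

print(sql_to_lucene("ja%_script"))  # ja*?script
```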

I haven’t run the examples (yet), but full marks for the cars data example and for capturing a Twitter stream.

January 6, 2012

Semi-structured data and P2P graph databases

Filed under: Graphs,NoSQL,Plasma — Patrick Durusau @ 11:34 am

Semi-structured data and P2P graph databases by Jeff Rose.

From the post:

In a previous post I introduced the Plasma graph query engine that I’ve been working on as part of my thesis project. With Plasma you can declaratively define queries and evaluate them against a graph database. The heart of the system is a library of dataflow query operators, and on top of them sits a fairly simplistic query “language”. (I put it in quotes because in a lisp based language like Clojure the line between a mini-language and an API gets blurry.) In this post I’ll write a bit about why I think graph databases could be an interesting foundation for next generation P2P networks, and then I’ll give some examples of performing distributed graph queries using Plasma. First I think it is important to motivate the use of a graph database though. While most of the marketing speak on the web regarding graph databases is all about representing social network data, this is just one of many potential applications.

I am not convinced the categories of “structured,” “semi-structured,” and “unstructured” data are all that helpful.

For example, when did the New Testament become a structured text? Division into chapters? (13th century) Division into verses? (mid-16th century) or is it still “unstructured?” Or the same question for the Tanakh, except there relying on a much richer system of divisions.

If you mean by “structured” a particular form of internal representation and reference, such as are represented to users as relational tables, why not say so? That is a particular form of structuring data, not the only one.

And as Wikipedia observes (Table (Database)):

An equally valid representation of a relation is as an n-dimensional chart, where n is the number of attributes (a table’s columns). For example, a relation with two attributes and three values can be represented as a table with two columns and three rows, or as a two-dimensional graph with three points. The table and graph representations are only equivalent if the ordering of rows is not significant, and the table has no duplicate rows.

I take that to mean that I can treat a graph as a data structure with more “structure” as it were.
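In Python terms, the passage amounts to this (a toy relation, nothing more):

```python
# A relation as a set of points: each row of a two-column table becomes
# a point in a two-dimensional space, per the Wikipedia passage above.

rows = [("alice", 30), ("bob", 25), ("carol", 30)]  # table: 2 columns, 3 rows

points = set(rows)  # the same relation as a set of 2-D points

# The equivalence holds only if row order is insignificant and rows are unique:
assert set(reversed(rows)) == points
assert len(points) == len(rows)  # no duplicate rows
print(len(points))  # 3
```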

I am equally unconvinced that P2P networks are the key to avoiding the control and censorship issues of architectures like the Internet. If you think the telcos rolled over quickly when asked for information in the name of “national security,” just think about your CIO or even your local network administrator. And being P2P means arbitrary peers can pick up the data stream. Want to see the folks in dark shades and cheap suits?

P2P may be a better technological choice for lessening the chances of censorship, but social institutions that oppose censorship or make it more difficult are equally important, if not more so.

January 4, 2012

Riak NoSQL Database: Use Cases and Best Practices

Filed under: NoSQL,Riak — Patrick Durusau @ 7:49 am

Riak NoSQL Database: Use Cases and Best Practices

From the post:

Riak is a key-value based NoSQL database that can be used to store user session related data. Andy Gross from Basho Technologies recently spoke at QCon SF 2011 Conference about Riak use cases. InfoQ spoke with Andy and Mark Phillips (Community Manager) about Riak database features and best practices when using Riak.

Not a lot of technical detail but enough to get a feel for whether you want/need to learn more about Riak.

January 1, 2012

Cassandra NYC 2011 Presentation Slides and Videos

Filed under: Cassandra,NoSQL — Patrick Durusau @ 5:55 pm

Cassandra NYC 2011 Presentation Slides and Videos

Almost the first half:

  • Chris Burroughs (Clearspring) – Apache Cassandra Clearspring (HD Video)
  • David Weinstein (Adobe) – Cassandra at Adobe (HD Video)
  • Drew Robb (SocialFlow) – Cassandra at Social Flow (HD Video)
  • Ed Capriolo (m6d) – Cassandra in Online Advertising (Slides and HD Video)
  • Eric Evans (Acunu) – CQL: SQL for Cassandra (Slides and HD Video)
  • Ilya Maykov (Ooyala) – Scaling Video Analytics with Apache Cassandra (Slides)
  • Joe Stein (Medialets) – Cassandra as the Central Nervous System of Your Distributed Systems (Slides and HD Video)

I count nine (9) more at the Datastax site.

Just in case you want to get started on your New Year’s resolution to learn one (or another?) NoSQL database cold.

I would amend that resolution to learn one of: DB2, Oracle, MySQL, PostgreSQL, SQL Server as well. That will enable you to make an intelligent assessment of the requirements of your projects and the capabilities of a range of storage solutions.

December 26, 2011

NoSQL Conference CGN

Filed under: NoSQL — Patrick Durusau @ 8:19 pm

NoSQL Conference CGN

Important dates:

1 February 2012 – Deadline for proposals
1 March 2012 – Accepted speakers announced
29 May 2012 – 30 May 2012 conference, Cologne, Germany

From the website:

NoSQL is taking the IT-world by storm: originally devised to tackle growing amounts of data, NoSQL now forms the base for a large variety of business solutions.

NoSQL – currently the most efficient way to manage large data repositories – is taking a leading role in the next generation of database technologies. The upcoming conference NoSQL matters will present the innovations at the forefront of the area – and right in the center of Europe.

Over two days, international experts will present topics associated with NoSQL technologies and outline challenges and solutions for the administration of large data repositories. The conference aims to contribute creatively to the understanding of NoSQL in terms of development and practical use. NoSQL matters takes place in Cologne, where 2000 years of history are combined with modern technology.

I understand the tourist facilities at Ur are in disrepair so I guess having the conference at a more recent location is ok. 😉

December 24, 2011

IndexTank is now open source!

Filed under: Database,IndexTank,NoSQL — Patrick Durusau @ 4:43 pm

IndexTank is now open source! by Diego Basch, Director of Engineering, LinkedIn.

From the post:

We are proud to announce that the technology behind IndexTank has just been released as open-source software under the Apache 2.0 License! We promised to do this when LinkedIn acquired IndexTank, so here we go:

indextank-engine: Indexing engine

indextank-service: API, BackOffice, Storefront, and Nebulizer

We know that many of our users and other interested parties have been patiently waiting for this release. We want to thank you for your patience, for your kind emails, and for your continued support. We are looking forward to seeing IndexTank thrive as an open-source project. Of course we’ll do our part; our team is hard at work building search infrastructure at LinkedIn. We are part of a larger team that has built and released search technologies such as Zoie, Bobo, and just this past Monday, Cleo. We are excited to add IndexTank to this array of powerful open source tools.

From the indextank.com homepage:

PROVEN FULL-TEXT SEARCH API

  • Truly real-time: instant updates without reindexing
  • Geo & Social aware: use location, votes, ratings or comments
  • Works with Ruby, Rails, Python, Java, PHP, .NET & more!

CUSTOM SEARCH THAT YOU CONTROL

  • You control how to sort and score results
  • “Fuzzy”, Autocomplete, Facets for how users really search
  • Highlights & Snippets quickly shows search results relevance

EASY, FAST & HOSTED

  • Scalable from a personal blog to hundreds of millions of documents! (try Reddit)
  • Free up 100K documents
  • Easier than SQL, SOLR/Lucene & Sphinx.

If you are looking for documentation, rather than GitHub, you’d best look here.

So far, I haven’t seen anything out of the ordinary for a search engine. I mention it in case some people prefer it over others.

Do you see anything out of the ordinary?

December 21, 2011

Lily 1.1 is out!

Filed under: Lily,NoSQL,Solr — Patrick Durusau @ 7:24 pm

Lily 1.1 is out

There is a lot to see here but I wanted to call your attention to:

Lily adds a high-level data model on top of HBase. Originally, the model was a simple list of fields stored within records, but we added some field types making that model a whole lot more interesting. A first addition is the RECORD value type. You can now store records inside records, which is useful to store structured data in fields. For indexing purposes, you can address sub-record data as if they were linked records, using dereferencing.

Is it just me or does it seem like a lot of software is being released just before the holidays? 😉

From the post:

Complex Field Types

Lily adds a high-level data model on top of HBase. Originally, the model was a simple list of fields stored within records, but we added some field types making that model a whole lot more interesting. A first addition is the RECORD value type. You can now store records inside records, which is useful to store structured data in fields. For indexing purposes, you can address sub-record data as if they were linked records, using dereferencing.

Two other cool new value types are LIST and PATH, which allow for far more flexible modeling than the previous multi-value and hierarchy field properties. At the schema level, we adopted a generics style of defining value types, for instance LIST<LIST<STRING>> defines a field that will contain a list of lists of strings. Finally, we also added a BYTEARRAY value type for raw data storage.
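The generics-style declarations are easy to picture with a toy validator (illustrative Python only; nothing here is Lily’s actual API):

```python
# Hypothetical sketch of generics-style value types like LIST<LIST<STRING>>:
# a type spec is either STRING or LIST<inner>, checked recursively.

def validates(value, type_spec):
    if type_spec == "STRING":
        return isinstance(value, str)
    if type_spec.startswith("LIST<") and type_spec.endswith(">"):
        inner = type_spec[5:-1]
        return isinstance(value, list) and all(validates(v, inner) for v in value)
    return False

print(validates([["a", "b"], ["c"]], "LIST<LIST<STRING>>"))  # True
print(validates(["a", "b"], "LIST<LIST<STRING>>"))           # False
```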

Conditional updates

If you’re familiar with multi-user environments you sure know about the problem of concurrent updates. For these situations, Lily now provides a lock-free, optimistic concurrency control feature we call conditional updates. The update and delete methods allow one to add a list of mutation conditions that need to be satisfied before the update or delete will be applied.

For concurrency control, you can require that the value of a field needs to be the same as when the record was read before the update.
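The idea here is ordinary compare-and-set; a minimal Python sketch of the logic (not Lily’s Java API):

```python
# Optimistic concurrency in miniature: apply the change only if the field
# still holds the value that was read before the update.

def conditional_update(record, field, expected, new_value):
    """Return True and update only if record[field] == expected."""
    if record.get(field) != expected:
        return False  # someone changed it since we read; caller retries
    record[field] = new_value
    return True

record = {"title": "draft"}
assert conditional_update(record, "title", "draft", "final")
assert not conditional_update(record, "title", "draft", "other")
print(record["title"])  # final
```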

Test framework

Lily 1.1 ships with a toolchest for Java developers that want to run unit tests against an HBase/Lily application stack. The stack can be launched embedded or externally, with simple scripts straight out of the Lily distribution. You can also request a ‘state reset’, clearing a single node instance of Lily for subsequent test runs. Yes, you can now run Lily, HBase, Zookeeper, HDFS, Map/Reduce and Solr in a single VM, with a single command.

Server-side plugins

For the fearless Lily repository hacker, we offer two hooks to expand functionality of the Lily server process. There’s decorators which can intercept any CRUD operation for pre- or post-execution of side-effect operations (like modifying a field value before actually committing it).

Rowlog sharding

The global rowlog queue is now distributed across a pre-split table, with inserts and deletes going to several region servers. This will lead to superior performance on write- or update-heavy multi-node cluster setups.

API improvements

Our first customers (*waves to our French friends*) found our API to be a tad too verbose and suggested a Builder pattern approach. We listened and unveil a totally new (but optional) method-chaining Builder API for the Java API users.

Whirr-based cluster installer

For Lily Enterprise customers, we rewrote our cluster installer using Apache Whirr, being one of the first serious adopters of this exciting Cloud- and cluster management tool. Using this, installing Lily on many nodes becomes a breeze. Here’s a short movie showing off the new installer.

Performance

Thanks to better parallelization, Lily has become considerably faster. You can now comfortably throw more clients at one Lily cluster and see combined throughput scale fast.

All in all, Lily 1.1 was a great release to prepare. We hope you have as much fun using Lily 1.1 as we had building it. Check it out here: www.lilyproject.org.

December 20, 2011

bigdata®

Filed under: bigdata®,NoSQL — Patrick Durusau @ 8:23 pm

bigdata®

Bryan Thompson, one of the creators of bigdata(R), was a member of the effort that resulted in the XTM syntax for topic maps.

If Bryan says it scales, it scales.

What I did not see was the ability to document mappings between data as representing the same subjects. Or the ability to query such mappings. Still, on further digging I may uncover something that works that way.

From the webpage:

This is a major version release of bigdata(R). Bigdata is a horizontally-scaled, open-source architecture for indexed data with an emphasis on RDF capable of loading 1B triples in under one hour on a 15 node cluster. Bigdata operates in both a single machine mode (Journal) and a cluster mode (Federation). The Journal provides fast scalable ACID indexed storage for very large data sets, up to 50 billion triples / quads. The federation provides fast scalable shard-wise parallel indexed storage using dynamic sharding and shard-wise ACID updates and incremental cluster size growth. Both platforms support fully concurrent readers with snapshot isolation.

Distributed processing offers greater throughput but does not reduce query or update latency. Choose the Journal when the anticipated scale and throughput requirements permit. Choose the Federation when the administrative and machine overhead associated with operating a cluster is an acceptable tradeoff to have essentially unlimited data scaling and throughput.

See [1,2,8] for instructions on installing bigdata(R), [4] for the javadoc, and [3,5,6] for news, questions, and the latest developments. For more information about SYSTAP, LLC and bigdata, see [7].

Starting with the 1.0.0 release, we offer a WAR artifact [8] for easy installation of the single machine RDF database. For custom development and cluster installations we recommend checking out the code from SVN using the tag for this release. The code will build automatically under eclipse. You can also build the code using the ant script. The cluster installer requires the use of the ant script.

You can download the WAR from:

http://sourceforge.net/projects/bigdata/

You can checkout this release from:

https://bigdata.svn.sourceforge.net/svnroot/bigdata/tags/BIGDATA_RELEASE_1_1_0

New features:

  • Fast, scalable native support for SPARQL 1.1 analytic queries;
  • 100% Java memory manager leverages the JVM native heap (no GC);
  • New extensible hash tree index structure.

Feature summary:

  • Single machine data storage to ~50B triples/quads (RWStore);

  • Clustered data storage is essentially unlimited;
  • Simple embedded and/or webapp deployment (NanoSparqlServer);
  • Triples, quads, or triples with provenance (SIDs);
  • Fast 100% native SPARQL 1.0 evaluation;
  • Integrated “analytic” query package;
  • Fast RDFS+ inference and truth maintenance;
  • Fast statement level provenance mode (SIDs).

Road map [3]:

  • Simplified deployment, configuration, and administration for clusters; and
  • High availability for the journal and the cluster.

(footnotes omitted)

PS: Jack Park forwarded this to my attention. Will have to download and play with it over the holidays.

December 19, 2011

NoSQL Screencast: Building a StackOverflow Clone With RavenDB

Filed under: NoSQL,RavenDB — Patrick Durusau @ 8:11 pm

NoSQL Screencast: Building a StackOverflow Clone With RavenDB

Ayenda and Justin cover:

  • Map/Reduce indexes
  • Modelling tags
  • Facets
  • Performance
  • RavenDB profiler

Entire project is on Github, just in case you want to review the code.

NoSQL Screencast: HBase Schema Design

Filed under: HBase,NoSQL — Patrick Durusau @ 8:11 pm

NoSQL Screencast: HBase Schema Design

From Alex Popescu’s post:

In this O’Reilly webcast, long time HBase developer and Cloudera HBase/Hadoop architect Lars George discusses the underlying concepts of the storage layer in HBase and how to do model data in HBase for best possible performance.

You may know George from HBase: The Definitive Guide.

December 18, 2011

1st International ICST conference on No SQL Databases and Social Applications

Filed under: Conferences,NoSQL,Topic Maps — Patrick Durusau @ 8:40 pm

1st International ICST conference on No SQL Databases and Social Applications June 6-8, 2012, Berlin, Germany.

Important Dates:

Submission deadline 15 February 2012
Notification and Registration opens 31 March 2012
Camera-ready deadline 30 April 2012
Start of Conference 6 June 2012

From the call for papers:

The INOSA conference in Berlin / Germany focuses on breakthroughs, new concepts and applications for developing, operating and optimizing social software systems, as well analysing and exploiting the data created by these systems.

Computer Science has been intertwined with users from its early beginnings. There is hardly any software that exists just for itself; each piece of software is created to meet user needs. This process has now reached private life. The Internet, the Web and wireless communication have fundamentally changed our daily communication and how information flows in commercial and private networks. New issues have come up with this trend: Non-SQL databases offer better features for a number of social applications. Semantic systems help to bridge the gap between words and their meaning. Data mining and inference help to extract implicit facts. P2P systems are a proper answer to the increasing amount of data to be exchanged. In 2010, more smartphones than computers were sold. Mobile devices and context-aware systems (e.g. location-based systems) play a major role in social applications. New threats accompany these trends as well, though: social hacking, loss of privacy, and vague or complex copyrights are just some of them.

Lutz Maicher is on the organizing committee so it would be nice to see some topic map papers at the conference.

December 17, 2011

SQL to MongoDB: An Updated Mapping

Filed under: Aggregation,MongoDB,NoSQL — Patrick Durusau @ 7:52 pm

SQL to MongoDB: An Updated Mapping from Kristina Chodorow.

From the post:

The aggregation pipeline code has finally been merged into the main development branch and is scheduled for release in 2.2. It lets you combine simple operations (like finding the max or min, projecting out fields, taking counts or averages) into a pipeline of operations, making a lot of things that were only possible by using MapReduce doable with a “normal” query.

In celebration of this, I thought I’d re-do the very popular MySQL to MongoDB mapping using the aggregation pipeline, instead of MapReduce.

If you are interested in MongoDB-based solutions, this will be very interesting.
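To see what the pipeline buys you, here is the same project-group-average idea in plain Python (illustrative only; MongoDB’s actual pipeline uses stages like `{"$group": {"_id": "$dept", "avg": {"$avg": "$salary"}}}`):

```python
# Pure-Python sketch of a pipeline of simple operations: project a field,
# group, and average -- the sort of job that previously needed MapReduce.

from collections import defaultdict
from statistics import mean

docs = [
    {"dept": "eng", "salary": 100},
    {"dept": "eng", "salary": 80},
    {"dept": "ops", "salary": 60},
]

def project(docs, fields):
    """Keep only the named fields, like a $project stage."""
    return [{f: d[f] for f in fields} for d in docs]

def group_avg(docs, key, value):
    """Group by key and average value, like a $group stage with $avg."""
    groups = defaultdict(list)
    for d in docs:
        groups[d[key]].append(d[value])
    return {k: mean(v) for k, v in groups.items()}

result = group_avg(project(docs, ["dept", "salary"]), "dept", "salary")
print(result)
```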

December 15, 2011

Raven DB – Stable Build – 573

Filed under: NoSQL,RavenDB — Patrick Durusau @ 7:45 pm

Raven DB – Stable Build – 573

From the email:

We now have a new stable build (finally).

It got delayed because of the new UI, and we *still have* new UI features that you’ll probably like that are going to show up on the unstable build, because I decided that enough is enough. We had almost two months without a real stable build, and we have had major work to improve things.

All our production stuff is now running 573. Here is all the new stuff:

Major:

  • The new UI is in the stable build
  • Optimized indexing – will not index documents that can be pre-filtered
  • Optimizing deletes
  • Reduced memory usage

New features:

  • Logs are available over the http API and using the new UI
  • Optimized handling of the server for big documents by streaming documents, rather than copying them
  • Updated to json.net 4.0.5
  • adding a way to control the capitalization of document keys
  • Added “More Like This” bundle
  • Licensing status is now reported in the UI.
  • Provide an event to notify about changes in failover status
  • Adding support for incremental backups
  • Allow nested queries to be optimized by the query optimizer
  • Use less memory on 32 bits systems
  • Raven.Backup executable
  • Much better interactive mode
  • Supporting projecting of complex paths
  • Support Count, Length on json paths
  • Allow to configure multi tenant idle times
  • Adding command line option to get all configuration documentation
  • Properly handle the scenario where we are unloading the domain / exiting without shutting down the database.
  • Will now push unstable versions to nuget as well

Something nice for the Windows side of the house.

December 9, 2011

Redis in Practice: Who’s Online?

Filed under: NoSQL,Redis — Patrick Durusau @ 8:17 pm

Redis in Practice: Who’s Online?

From the post:

Redis is one of the most interesting of the NOSQL solutions. It goes beyond a simple key-value store in that keys’ values can be simple strings, but can also be data structures. Redis currently supports lists, sets and sorted sets. This post provides an example of using Redis’ Set data type in a recent feature I implemented for Weplay, our social youth sports site.

See, having complex key values isn’t all that weird.
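The pattern in the post is easy to mimic without a server: one set per time slice, unioned over a recent window. Plain Python sets stand in for the Redis calls below (the redis-py equivalents in the comments are an assumption about that API, not tested here):

```python
# "Who's online" with sets: track each user in a per-minute set, then union
# the sets for the window you care about. With redis-py this would be
# roughly r.sadd(f"online:{minute}", user) and r.sunion(keys).

from collections import defaultdict

online = defaultdict(set)  # one key per time slice

def track(minute, user):
    online[f"online:{minute}"].add(user)   # Redis: SADD key user

def whos_online(minutes):
    result = set()
    for m in minutes:                      # Redis: SUNION key1 key2 ...
        result |= online[f"online:{m}"]
    return result

track("20:15", "ann")
track("20:16", "bob")
track("20:16", "ann")
print(sorted(whos_online(["20:15", "20:16"])))  # ['ann', 'bob']
```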

December 5, 2011

Released OrientDB v1.0rc7

Filed under: Graphs,NoSQL,OrientDB — Patrick Durusau @ 7:51 pm

Released OrientDB v1.0rc7: Improved transactions and first Multi-Master replication (alpha)

From the post:

Hi all, after about 2 months a new release is available for all: OrientDB 1.0rc7.

OrientDB embedded and server: http://code.google.com/p/orient/downloads/detail?name=orientdb-1.0rc7.zip
OrientDB Graph(ed): http://code.google.com/p/orient/downloads/detail?name=orientdb-graphed-1.0rc7.zip

According to the community’s answer this release should have contained the new management of links using the OMVRB-Tree, but it’s still in alpha and will be available this week as 1.0rc8-SNAPSHOT. I preferred to release something really stable with all 34 issues fixed so far (more below). Furthermore, tomorrow the TinkerPop team will release the new version of its amazing technology stack (Blueprints, Gremlin, etc.), and we couldn’t miss the chance to include the latest release of OrientDB with it, could we?

Thanks to all the contributors, more of them every week!

Changes

  • Transactions: Improved speed, up to 500x! (issue 538)
  • New Multi-Master replication (issue 589). Will be final in the next v1.0
  • SQL insert supports MAP syntax (issue 582), new date() function
  • HTTP interface: JSONP support (issue 587), new create database (issue 566), new import/export database (issue 567, 568)
  • Many bugs fixed, 34 issues in total

Full list: http://code.google.com/p/orient/issues/list?can=1&q=label%3Av1.0rc7

Thanks Luca!

December 2, 2011

Infinitegraph 2.0

Filed under: Graphs,InfiniteGraph,NoSQL — Patrick Durusau @ 4:56 pm

Infinitegraph 2.0

From the product page:

InfiniteGraph helps organizations find the valuable relationships within their data. Our product is unique in its ability to leverage distributed data and processes, which yields reduced time and costs while maximizing overall performance on big data.

No other graph database technology available today can match InfiniteGraph’s combined strengths of persisting and traversing complex relationships requiring multiple hops, across vast and distributed data stores.

But here is more important information (Objectivity, Inc. is the owner of Infinitegraph 2.0):

Objectivity, Inc., the leader in distributed, scalable data management solutions, today announced that Government Security News (GSN) has named its flagship database, Objectivity/DB, as winner of its annual Homeland Security Awards program in the “Best Intelligence Data Fusion and Collaborative Analysis Solution” category. The annual GSN Homeland Security Awards program celebrates the ongoing public-private partnership between all branches of Federal, state and local government in the United States and the private sector vendors of IT security, whose combined efforts successfully defend and protect the nation’s people, property and way of life. Click here for a list of awards categories and finalists, as well as for more information on GSN’s Homeland Security Awards.

“GSN is an authoritative source of news and information on all aspects of homeland security, and we are honored to be recognized by their esteemed panel of judges,” said Jay Jarrell, president and CEO of Objectivity, Inc. “This award is a testament to our leadership in the government sector, and underscores how agencies like the U.S. Air Force’s Network Centric Collaborative Targeting System (NCCT), Analyst Support Architecture (ASA) and the U.S. Navy’s Broad Area Maritime Surveillance (BAMS) Unmanned Aircraft System (UAS) program are leveraging Objectivity/DB to power distributed mission critical intelligence data fusion and collaborative analysis.”

Note that I corrected the first link in the first paragraph to point to the news of the award dinner. BTW, Netwitness and Overwatch Textron Systems were also winners in the “Best Intelligence Data Fusion and Collaborative Analysis Solution” category. Both worth your attention as well.

In terms of seeking an audience to discuss homeland security solutions, I think basing your approach on award winning software would be a good idea.

Acunu Data Platform v1.2 released!

Filed under: Acunu,Cassandra,NoSQL — Patrick Durusau @ 4:55 pm

Acunu Data Platform v1.2 released!

From the announcement:

We’re excited to announce the release of version 1.2 of the Acunu Data Platform, incorporating Apache Cassandra — the fastest and lowest-risk route to building a production-grade Cassandra cluster.

The Acunu Data Platform (ADP) is an all-in-one distributed database solution, delivered as a software appliance for your own data center or an Amazon Machine Image (AMI) for cloud deployments. It includes:

  • A hardened version of Apache Cassandra that is 100% compatible with existing Cassandra applications
  • The Acunu Core, a file system and embedded database designed from the ground-up for Big Data workloads
  • A web-based management console that simplifies deployment, monitoring and scaling of your cluster.
  • Your standard CentOS Linux

Useful Mongo Resources for NoSQL Newbs

Filed under: MongoDB,NoSQL — Patrick Durusau @ 4:54 pm

Useful Mongo Resources for NoSQL Newbs

Michael Robinson has a small but useful collection of resources to introduce users to NoSQL and in particular MongoDB.

If you know of other resources Michael should be listing, give him a shout!

Cassandra 1.0.5

Filed under: Cassandra,NoSQL — Patrick Durusau @ 4:50 pm

Cassandra 1.0.5

A reversion release of Cassandra. Details: Cassandra changes.

Looks like the holidays are going to be filled with upgrades, new releases!

November 26, 2011

Neo4j 1.6 – Milestone 01!

Filed under: Graphs,Neo4j,NoSQL — Patrick Durusau @ 7:57 pm

Neo4j 1.6 – Milestone 01!

From the post:

The theme of 1.6 is mainly about improving infrastructure and QA. These improvements include faster builds, moving from TC to Jenkins, and extending our tests to cover more client platforms, both browser and operating system wise. The reason for these changes is that, while we’ve delivered many great features very rapidly over the last few months, we’re always looking to do better. Improving our internal build infrastructure helps us deliver quality features faster, and helps us better turn around responses to the community’s requests for features.

Infrastructure isn’t our only focus for 1.6, however. We are also working on Neo4j so that we can store graph metadata, e.g. configuration settings. This will help us to better evolve the internal infrastructure.

As always, there are a number of bugs that have been fixed, both internally and for the community issues. See: https://github.com/neo4j/community/issues?sort=created&direction=desc&state=closed&page=1

Something to keep you busy over the holidays!

November 22, 2011

NoSQL Zone (at DZone)

Filed under: NoSQL — Patrick Durusau @ 7:01 pm

NoSQL Zone (at DZone)

A collection of high quality content on NoSQL.

Enjoy!

November 15, 2011

VC funding for Hadoop and NoSQL tops $350m

Filed under: Funding,Hadoop,NoSQL — Patrick Durusau @ 7:58 pm

VC funding for Hadoop and NoSQL tops $350m

From the post:

451 Research has today published a report looking at the funding being invested in Apache Hadoop- and NoSQL database-related vendors. The full report is available to clients, but below is a snapshot of the report, along with a graphic representation of the recent up-tick in funding.

According to our figures, between the beginning of 2008 and the end of 2010 $95.8m had been invested in the various Apache Hadoop- and NoSQL-related vendors. That figure now stands at more than $350.8m, up 266%.

That statistic does not really do justice to the sudden uptick of interest, however. The figures indicate that funding for Apache Hadoop- and NoSQL-related firms has more than doubled since the end of August, at which point the total stood at $157.5m.
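A quick sanity check of the figures quoted above (plain arithmetic, not taken from the report itself):

```python
# Verify the growth percentages quoted from the 451 Research snapshot.
start_2010 = 95.8     # $m invested from 2008 through end of 2010
total_now = 350.8     # $m total at the time of the report
total_august = 157.5  # $m total at the end of August

increase = (total_now - start_2010) / start_2010 * 100
print(f"Increase since end of 2010: {increase:.0f}%")  # ~266%, as reported
print(f"More than doubled since August: {total_now > 2 * total_august}")  # True
```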

It takes more work than winning the lottery, but it is encouraging to see that kind of money being spread around.

But past funding is just that: past funding. Encouraging, yes, but the real task is creating solutions that attract future funding.

Suggestions/comments?

November 14, 2011

Enable SPARQL query in your MVC3 application

Filed under: BrightstarDB,NoSQL — Patrick Durusau @ 7:15 pm

Enable SPARQL query in your MVC3 application

From the post:

BrightstarDB uses SPARQL as its primary query language. Because of this and because all the entities you create with the BrightstarDB entity framework are RDF resources, it is possible to turn your application into a part of the Linked Data web with just a few lines of code. The easiest way to achieve this is to add a controller for running SPARQL queries.

BrightstarDB is a recent .NET NoSQL offering from Networked Planet. Or, as they are better known to us in the topic maps community, Graham Moore and Kal Ahmed. 😉 I don’t run a Windows server, but with Graham and Kal you can count on this being performance-oriented software.
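For readers new to SPARQL, the core idea is matching triple patterns containing variables against a store of RDF triples. Here is a toy pure-Python sketch of that matching step, for intuition only; it is not the BrightstarDB API, which exposes a real SPARQL endpoint as the post describes:

```python
# A miniature triple store and one-pattern matcher, illustrating what a
# SPARQL basic graph pattern does. Terms starting with "?" are variables.
triples = [
    ("ex:alice", "ex:worksFor", "ex:acme"),
    ("ex:bob",   "ex:worksFor", "ex:acme"),
    ("ex:acme",  "ex:locatedIn", "ex:london"),
]

def match(pattern, store):
    """Yield one dict of variable bindings per matching triple."""
    for triple in store:
        binding = {}
        for term, value in zip(pattern, triple):
            if term.startswith("?"):
                binding[term] = value  # bind the variable
            elif term != value:
                break                  # constant term mismatch
        else:
            yield binding

# Roughly: SELECT ?who WHERE { ?who ex:worksFor ex:acme }
results = [b["?who"] for b in match(("?who", "ex:worksFor", "ex:acme"), triples)]
print(results)  # ['ex:alice', 'ex:bob']
```

A full SPARQL engine joins the bindings of several such patterns, but the variable-binding step above is the heart of it.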

November 11, 2011

Are You a Cassandra Jedi?

Filed under: Cassandra,Conferences,NoSQL — Patrick Durusau @ 7:38 pm

Are You a Cassandra Jedi?

Cassandra Conference, December 6, 2011, New York City

From the call for speakers:

BURLINGAME, Calif. – November 9, 2011 –DataStax, the commercial leader in Apache Cassandra™, along with the NYC Cassandra User Group, NoSQL NYC, and Big Data NYC are joining together to present the first Cassandra New York City conference on December 6. This all day, two-track event will focus on enterprise use cases as well as the latest developments in Cassandra. Early bird registration is now open here.

Coming on the heels of a sold-out DataStax Cassandra SF earlier this year, the event will feature some of the most interesting Cassandra use-cases from up and down the Eastern Seaboard. Cassandra NYC will be keynoted by Jonathan Ellis, chairman of the Apache Cassandra project, who will highlight what’s new in Cassandra 1.0, and what’s in store for the future. Additional confirmed speakers include Nathan Marz, lead engineer for the Storm project at Twitter and Jim Ancona, systems architect at Constant Contact.

“With the recent 1.0 release, we are seeing users doing amazing new things with Cassandra that are going beyond even our expectations and imagination,” said Ellis. “We look forward to sharing these stories with the broader community, to further hasten the adoption and usage of Cassandra to meet their real-time, big data challenges.”

Call for Speakers and Press Registration

The call for speakers is now also open for the event. Submissions can be made to lynnbender@datastax.com.

Press interested in attending the event may contact Zenobia@intersectcom.com for a complimentary press pass.

The event will be held at the Lighthouse International Conference Center on 59th St.

I am not sure about “early bird” registration for an event less than a month away, but this sounds quite interesting. I hope the presentations will be recorded and posted for asynchronous access.

DataStax Enterprise and DataStax Community Edition

Filed under: Cassandra,DataStax,NoSQL — Patrick Durusau @ 7:38 pm

DataStax Enterprise and DataStax Community Edition

From the announcement:

BURLINGAME, Calif. – Nov.1, 2011 –DataStax, the commercial leader in Apache Cassandra™, today announced that DataStax Enterprise, the industry’s first distributed, scalable, and highly available database platform powered by Apache Cassandra™ 1.0, is now available.

“The ability to manage both real-time and analytic data in a simple, massively scalable, integrated solution is at the heart of challenges faced by most businesses with legacy databases,” said Billy Bosworth, CEO, DataStax. “Our goal is to ensure businesses can conquer these challenges with a modern application solution that provides operational simplicity, optimal performance and incredible cost savings.”

“Apache Cassandra is the scalable, high-impact, comprehensive data platform that is well-suited to the rapidly-growing real-time data needs of our social media platform,” said Christian Carollo, Senior Manager, Mobile for GameFly. “We leveraged the expertise of DataStax to deploy our new social media platform, and were able to complete the project without worrying about scale or distribution – we simply built a great application and Apache Cassandra took care of the rest.”

BTW, DataStax just added its 100th customer. You might recognize some of them, Netflix, Cisco, etc.

November 9, 2011

Redis: Zero to Master in 30 minutes – Part 1

Filed under: NoSQL,Redis — Patrick Durusau @ 7:41 pm

Redis: Zero to Master in 30 minutes – Part 1

From the post:

More than once, I’ve said that learning Redis is the most efficient way a programmer can spend 30 minutes. This is a testament to both how useful Redis is and how easy it is to learn. But, is it true, can you really learn, and even master, Redis in 30 minutes?

Let’s try it. In this part we’ll go over what Redis is. In the next, we’ll look at a simple example. Whatever time we have left will be for you to set up and play with Redis.

This is a nice post. It introduces enough of Redis to give you some idea of its power without overwhelming you with details. It continues with Part 2, by the way.
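For a taste of why Redis is so quick to pick up, here is a toy in-memory sketch of its data model: one keyspace whose values are typed structures (strings, lists, sets), each type with its own commands. This is illustrative only; a real client such as redis-py talks to a Redis server over a socket:

```python
# MiniRedis: an in-memory caricature of Redis's keyspace and typed commands.
class MiniRedis:
    def __init__(self):
        self.store = {}

    # String commands
    def set(self, key, value):
        self.store[key] = value

    def get(self, key):
        return self.store.get(key)

    # List commands (LPUSH prepends, LRANGE slices inclusively)
    def lpush(self, key, value):
        self.store.setdefault(key, []).insert(0, value)

    def lrange(self, key, start, stop):
        return self.store.get(key, [])[start:stop + 1 or None]

    # Set commands
    def sadd(self, key, member):
        self.store.setdefault(key, set()).add(member)

    def sismember(self, key, member):
        return member in self.store.get(key, set())

r = MiniRedis()
r.set("user:1:name", "ada")
r.lpush("queue", "job-a")
r.lpush("queue", "job-b")
r.sadd("online", "user:1")
print(r.get("user:1:name"))             # ada
print(r.lrange("queue", 0, -1))         # ['job-b', 'job-a']
print(r.sismember("online", "user:1"))  # True
```

The command names mirror the real ones (SET/GET, LPUSH/LRANGE, SADD/SISMEMBER), which is most of what the 30-minute tour in the post covers.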

November 8, 2011

Someone Is Being Honest on the Internet?

Filed under: MongoDB,NoSQL,Riak — Patrick Durusau @ 7:44 pm

After seeing the raft of Twitter traffic on MongoDB and Riak, In Context (and an apology), I just had to look. The thought of someone being honest on the Internet is even more novel than someone being wrong on the Internet.

At least I would not have to stay up late correcting them. 😉

Sean Cribbs writes:

There has been quite a bit of furor and excitement on the Internet this week regarding some very public criticisms (and defenses) of MongoDB and its creators, 10gen. Unfortunately, a ghost from my recent past also resurfaced as a result. Let me begin by apologizing to 10gen and its engineers for what I said at JSConf, and then I will reframe my comments in a more constructive form.

Mea culpa. It’s way too easy in our industry to set up and knock down strawmen, as I did, than to convey messages of objective and constructive criticism. It’s also too easy, when you are passionate about what you believe in, to ignore the feelings and efforts of others, which I did. I have great respect for the engineers I have met from 10gen, Mathias Stern and Kyle Banker. They are friendly, approachable, helpful and fun to socialize with at conferences. Thanks for being stand-up guys.

Also, whether we like it or not, these kinds of public embarrassments have rippling effects across the whole NoSQL ecosystem. While Basho has tried to distance itself from other players in the NoSQL field, we cannot deny our origins, and the ecosystem as a “thing” is only about 3 years old. Are developers, technical managers and CTOs more wary of new database technologies as a result of these embarrassments? Probably. Should we continue to work hard to develop and promote alternative data-storage solutions? Absolutely.

Sean’s remaining comments are useful, but even more useful is his suggestion that both MongoDB and Riak push to improve their respective capabilities. There is always room for improvement.

Oh, I did notice one thing that needs correcting in Sean’s blog entry. 😉 See: Munnecke, Health Records and VistA (NoSQL 35 years old?). NoSQL is at least 35 years old, probably older, but I don’t have the citation at hand.
