Archive for the ‘Databus’ Category

How To Legally Dick With The NSA – PostgreSQL 10 Beta 1

Thursday, May 18th, 2017

The release of PostgreSQL 10 Beta 1 gives everyone an opportunity to legally dick with the NSA.

In Stop Blaming NSA For The Ransomware Attack, Patrick Tucker repeats the NSA’s claim that it reveals about 80% of the vulnerabilities it finds and conceals the remaining 20%.

Which means that if there are 10 security vulnerabilities in PostgreSQL 10 Beta 1, the NSA will keep two for itself.

Let’s disappoint them on that score. With widespread community testing, fuzzing, etc., the NSA’s score on PostgreSQL 10 Beta 1 could be zero.

That won’t help vendors with 70 million lines of closed source databases (look for Mary Ann Davidson). Such databases may have true accidental vulnerabilities or ones introduced by NSA ringers.

If NSA ringers working for closed source companies sounds like tin-hat conspiracy theory, recall that the NSA is barred from spying on American citizens at all, and has vehemently denied doing so. At least until it admitted it was lying and was in fact spying on all American citizens.

Also bear in mind that the NSA participated in many of the covert/overt attempts by the United States to influence elections in other countries. (Dov H. Levin; as of May 18, 2017, his datasets are forthcoming. See also Database Tracks History Of U.S. Meddling In Foreign Elections, an NPR interview that counts 80 US-backed efforts to interfere in foreign elections.)

On the technical front, the NSA is known to have intentionally damaged a U.S. cryptography standard; see NSA Efforts to Evade Encryption Technology Damaged U.S. Cryptography Standard. That report isn’t from a blog that is the continuation of a photocopied version of a mimeographed conspiracy report found in low-end coffee shops.

No, the damage to U.S. cryptography report appears in Scientific American.

I honestly can’t name one illegal, immoral, or unethical act that the NSA is not capable of.


Beyond “sticking it to the NSA,” database researchers and users have these PostgreSQL 10 Beta 1 features to enjoy:

The PostgreSQL Global Development Group announces today that the first beta release of PostgreSQL 10 is available for download. This release contains previews of all of the features which will be available in the final release of version 10, although some details will change before then. Users are encouraged to begin testing their applications against this latest release.

Major Features of 10

The new version contains multiple features that will allow users to both scale out and scale up their PostgreSQL infrastructure:

  • Logical Replication: built-in option for replicating specific tables or using replication to upgrade
  • Native Table Partitioning: range and list partitioning as native database objects
  • Additional Query Parallelism: including index scans, bitmap scans, and merge joins
  • Quorum Commit for Synchronous Replication: ensure against loss of multiple nodes
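As a rough sketch of how the first two of these combine (table, publication, and connection names are hypothetical, and the syntax assumes a PostgreSQL 10 server):

```sql
-- Native range partitioning: partitions are real database objects,
-- no trigger-based inheritance plumbing required.
CREATE TABLE measurements (
    logdate  date    NOT NULL,
    reading  numeric NOT NULL
) PARTITION BY RANGE (logdate);

CREATE TABLE measurements_2017 PARTITION OF measurements
    FOR VALUES FROM ('2017-01-01') TO ('2018-01-01');

-- Built-in logical replication: publish a single table on the source...
CREATE PUBLICATION measurements_pub FOR TABLE measurements_2017;

-- ...and subscribe to it from another server (connection string
-- is illustrative):
CREATE SUBSCRIPTION measurements_sub
    CONNECTION 'host=primary.example.com dbname=metrics'
    PUBLICATION measurements_pub;
```

Note that in version 10 a publication lists ordinary tables, which is why the leaf partition, not the partitioned parent, is published here.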

We have also made three improvements to PostgreSQL connections, which we are calling on driver authors to support, and users to test:

  • SCRAM Authentication, for more secure password-based access
  • Multi-host “failover”, connecting to the first available in a list of hosts
  • target_session_attrs parameter, so a client can request a read/write host
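In concrete terms (host names hypothetical), the last two items collapse into a single libpq connection string, while SCRAM is switched on server-side:

```
# Client side: try db1 first, fall back to db2, and insist on a
# host that accepts writes (libpq 10 connection URI):
postgresql://db1.example.com,db2.example.com/mydb?target_session_attrs=read-write

# Server side, postgresql.conf: store new passwords as SCRAM verifiers
password_encryption = scram-sha-256

# Server side, pg_hba.conf: require SCRAM authentication
host  mydb  all  0.0.0.0/0  scram-sha-256
```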

Additional Features

Many other new features and improvements have been added to PostgreSQL 10, some of which may be as important, or more important, to specific users than the above. Certainly all of them require testing. Among them are:

  • Crash-safe and replicable Hash Indexes
  • Multi-column Correlation Statistics
  • New “monitoring” roles for permission grants
  • Latch Wait times in pg_stat_activity
  • XMLTABLE query expression
  • Restrictive Policies for Row Level Security
  • Full Text Search support for JSON and JSONB
  • Compression support for pg_receivewal
  • ICU collation support
  • Push Down Aggregates to foreign servers
  • Transition Tables in trigger execution
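Two of these take only a few lines of SQL to try (the role being granted to is hypothetical): the new built-in monitoring roles replace ad hoc superuser grants, and pg_stat_activity now reports latch and other wait events per backend:

```sql
-- Grant the new built-in monitoring role instead of superuser:
GRANT pg_monitor TO nagios_user;

-- Wait events (including latch waits) are now visible per backend:
SELECT pid, wait_event_type, wait_event, state
FROM pg_stat_activity;
```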

Further, developers have contributed performance improvements in the SUM() function, character encoding conversion, expression evaluation, grouping sets, and joins against unique columns. Analytics queries against large numbers of rows should be up to 40% faster. Please test if these are faster for you and report back.

See the Release Notes for a complete list of new and changed features.

Make the lives of PostgreSQL users everywhere better and the lives of government intelligence services around the world worse!

I call that a win-win situation.

Wheel Re-invention: Change Data Capture systems

Tuesday, March 20th, 2012

LinkedIn: Creating a Low Latency Change Data Capture System with Databus

Siddharth Anand, a senior member of LinkedIn’s Distributed Data Systems team writes:

Having observed two high-traffic web companies solve similar problems, I cannot help but notice a set of wheel-reinventions. Some of these problems are difficult and it is truly unfortunate for each company to solve its problems separately. At the same time, each company has had to solve these problems due to an absence of a reliable open-source alternative. This clearly has implications for an industry dominated by fast-moving start-ups that cannot build 50-person infrastructure development teams or dedicate months away from building features.

Siddharth goes on to address a particular re-invention of the wheel: change data capture systems.

And he has a solution to this wheel re-invention problem: Databus. (Not suited to every situation, but worth reading carefully, along with the other resources he links.)

From the post:

Databus is an innovative solution in this space.

It offers the following features:

  • Pub-sub semantics
  • In-commit-order delivery guarantees
  • Commits at the source are grouped by transaction
    • ACID semantics are preserved through the entire pipeline
  • Supports partitioning of streams
    • Ordering guarantees are then per partition
  • Like other messaging systems, offers very low latency consumption for recently-published messages
  • Unlike other messaging systems, offers arbitrarily-long look-back with no impact to the source
  • High Availability and Reliability
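To make those guarantees concrete, here is a toy in-memory model of them, not Databus’s actual API: transactions are published as groups so commit boundaries survive the pipeline, delivery is in commit order per partition, and the retained log allows arbitrary look-back without touching the source database.

```python
from collections import defaultdict


class ChangeBus:
    """Toy model of the delivery guarantees above (not Databus's API):
    pub-sub, in-commit-order delivery per partition, transaction
    grouping, and arbitrary look-back from the retained log."""

    def __init__(self):
        # partition name -> ordered log of committed transactions
        self.logs = defaultdict(list)

    def publish(self, partition, txn_changes):
        # A transaction's changes are appended as one group, so ACID
        # commit boundaries are preserved through the pipeline.
        self.logs[partition].append(list(txn_changes))

    def consume(self, partition, from_offset=0):
        # Look-back: replay from any past offset, in commit order,
        # without any impact on the source database.
        return self.logs[partition][from_offset:]


bus = ChangeBus()
bus.publish("member", [("insert", 1), ("update", 1)])
bus.publish("member", [("delete", 1)])
bus.publish("company", [("insert", 9)])

# Per-partition commit order, and replay from an arbitrary offset:
recent = bus.consume("member")
lookback = bus.consume("member", from_offset=1)
```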