Archive for the ‘PostgreSQL’ Category
Wednesday, May 15th, 2013
PostgreSQL 9.3 Beta 1 Released
From the post:
The first beta release of PostgreSQL 9.3, the latest version of the world’s best open source database, is now available. This beta contains previews of all of the features which will be available in version 9.3, and is ready for testing by the worldwide PostgreSQL community. Please download, test, and report what you find.
Major Features
The major features available for testing in this beta include:
- Writeable Foreign Tables, enabling pushing data to other databases
- pgsql_fdw driver for federation of PostgreSQL databases
- Automatically updatable VIEWs
- MATERIALIZED VIEW declaration
- LATERAL JOINs
- Additional JSON constructor and extractor functions
- Indexed regular expression search
- Disk page checksums to detect filesystem failures
In 9.3, PostgreSQL has greatly reduced its requirement for SysV shared memory, changing to mmap(). This allows easier installation and configuration of PostgreSQL, but means that we need our users to rigorously test and ensure that no memory management issues have been introduced by the change. We also request that users spend extra time testing the improvements to Foreign Key locks.
If that isn’t enough features for you to test, see the full announcement!
Posted in PostgreSQL, SQL | No Comments »
Monday, December 3rd, 2012
AXLE: Advanced Analytics for Extremely Large European Databases
From the webpage:
The objectives of the AXLE project are to greatly improve the speed and quality of decision making on real-world data sets. AXLE aims to make these improvements generally available through high quality open source implementations via the PostgreSQL and Orange products.
The project started in early November 2012. Will be checking back to see what is proposed for PostgreSQL and/or Orange.
Posted in Analytics, BigData, Orange, PostgreSQL | No Comments »
Tuesday, October 23rd, 2012
Up to Date on Open Source Analytics by Steve Miller.
Steve updates his Wintel laptop with the latest releases of open source analytics tools.
Steve’s list:
What’s on your list?
I first saw this mentioned at KDNuggets.
Posted in Analytics, MySQL, PostgreSQL, R, Software | No Comments »
Thursday, October 4th, 2012
PostgreSQL Database Modeler
From the readme file at github:
PostgreSQL Database Modeler, or simply, pgModeler is an open source tool for modeling databases that merges the classical concepts of entity-relationship diagrams with specific features that only PostgreSQL implements. The pgModeler translates the models created by the user to SQL code and apply them onto database clusters from version 8.0 to 9.1.
Other modeling tools you have or are likely to encounter writing topic maps?
When the output of diverse modeling tools or diverse output from the same modeling tool needs semantic reconciliation, I would turn to topic maps.
I first saw this at DZone.
Posted in Database, Modeling, PostgreSQL | No Comments »
Friday, September 14th, 2012
JMyETL, an easy to use ETL tool that supports 10 different RDBMS by Esen Sagynov.
From the post:
JMyETL is a very useful and simple Java based application for Windows OS which allows users to import and export data from/to various database systems. For example:
- CUBRID –> Sybase ASE, Sybase ASA, MySQL, Oracle, PostgreSQL, SQL Server, DB2, Access, SQLite
- MySQL –> Sybase ASE/ASA, Oracle, Access, PostgreSQL, SQL Server, DB2, SQLite, CUBRID
- Sybase ASE –> Sybase ASA, MySQL, Oracle, Access, PostgreSQL, SQL Server, DB2, SQLite, CUBRID
- Sybase ASA –> Sybase ASE, MySQL, Oracle, Access, PostgreSQL, SQL Server, DB2, SQLite, CUBRID
- Oracle –> Sybase ASA, Sybase ASE, MySQL, Access, PostgreSQL, SQL Server, DB2, SQLite, CUBRID
- Access –> Sybase ASE, Sybase ASA, MySQL, Oracle, PostgreSQL, SQL Server, DB2, SQLite, CUBRID
- PostgreSQL –> Sybase ASE, Sybase ASA, MySQL, Oracle, Access, SQL Server, DB2, SQLite, CUBRID
- SQL Server –> Sybase ASE, Sybase ASA, MySQL, Oracle, PostgreSQL, Access, DB2, SQLite, CUBRID
- DB2 –> Sybase ASE, Sybase ASA, MySQL, Oracle, PostgreSQL, SQL Server, Access, SQLite, CUBRID
- SQLite –> Sybase ASE, Sybase ASA, MySQL, Oracle, PostgreSQL, SQL Server, DB2, Access, CUBRID
Just in case you need a database to database ETL utility.
I first saw this at DZone.
Posted in CUBRID, ETL, MySQL, Oracle, PostgreSQL, SQL Server | No Comments »
Wednesday, September 12th, 2012
PostgreSQL 9.2 released
From the announcement:
The PostgreSQL Global Development Group announces PostgreSQL 9.2, the latest release of the leader in open source databases. Since the beta release was announced in May, developers and vendors have praised it as a leap forward in performance, scalability and flexibility. Users are expected to switch to this version in record numbers.
“PostgreSQL 9.2 will ship with native JSON support, covering indexes, replication and performance improvements, and many more features. We are eagerly awaiting this release and will make it available in Early Access as soon as it’s released by the PostgreSQL community,” said Ines Sombra, Lead Data Engineer, Engine Yard.
Links
Downloads, including packages and installers
Release Notes
Documentation
What’s New in 9.2
Press Kit
New features like range types:
Range types are used to store a range of data of a given type. There are a few pre-defined types. They are integer (int4range), bigint (int8range), numeric (numrange), timestamp without timezone (tsrange), timestamp with timezone (tstzrange), and date (daterange).
Ranges can be made of continuous (numeric, timestamp…) or discrete (integer, date…) data types. They can be open (the bound isn’t part of the range) or closed (the bound is part of the range). A bound can also be infinite.
Without these datatypes, most people solve the range problems by using two columns in a table. These range types are much more powerful, as you can use many operators on them.
have captured my attention.
Now to look at other new features: Index-only scans, Replication improvements and JSON datatype.
Posted in Database, PostgreSQL | No Comments »
Friday, July 27th, 2012
PostgreSQL’s place in the New World Order by Matthew Soldo.
Description:
Mainstream software development is undergoing a radical shift. Driven by the agile development needs of web, social, and mobile apps, developers are increasingly deploying to platforms-as-a-service (PaaS). A key enabling technology of PaaS is cloud-services: software, often open-source, that is consumed as a service and operated by a third-party vendor. This shift has profound implications for the open-source world. It enables new business models, increases emphasis on user-experience, and creates new opportunities.
PostgreSQL is an excellent case study in this shift. The PostgreSQL project has long offered one of the most reliable open source databases, but has received less attention than competing technologies. But in the PaaS and cloud-services world, reliability and open-ness become increasingly important. As such, we are seeing the beginning of a shift in adoption towards PostgreSQL.
The datastore landscape is particularly interesting because of the recent attention given to the so-called NoSQL technologies. Data is suddenly sexy again. This attention is largely governed by the same forces driving developers to PaaS, namely the need for agility and scalability in building modern apps. Far from being a threat to PostgreSQL, these technologies present an amazing opportunity for showing the way towards making PostgreSQL more powerful and more widely adopted.
The presentation sounds great, but alas, the slidedeck is just a slidedeck.
I do recommend it for the next to last slide graphic. Very cool!
(And it may be time to take a another look at PostgreSQL as well.)
Posted in Cloud Computing, Database, Heroku, PostgreSQL | No Comments »
Monday, March 5th, 2012
Trees in the Database: Advanced Data Structures
Lorenzo Alberton writes:
Despite the NoSQL movement trying to flag traditional databases as a dying breed, the RDBMS keeps evolving and adding new powerful weapons to its arsenal. In this talk we’ll explore Common Table Expressions (SQL-99) and how SQL handles recursion, breaking the bi-dimensional barriers and paving the way to more complex data structures like trees and graphs, and how we can replicate features from social networks and recommendation systems. We’ll also have a look at window functions (SQL:2003) and the advanced reporting features they make finally possible. The first part of this talk will cover several different techniques to model a tree data structure into a relational database: parent-child (adjacency list) model, materialized path, nested sets, nested intervals, hybrid models, Common Table Expressions. Then we’ll move one step forward and see how we can model a more complex data structure, i.e. a graph, with concrete examples from today’s websites. Starting from real-world examples of social networks’ and recommendation systems’ features, and with the help of some graph theory, this talk will explain how to represent and traverse a graph in the database. Finally, we will take a look at Window Functions and how they can be useful for data analytics and simple inline aggregations, among other things. All the examples have been tested on PostgreSQL >= 8.4.
Very impressive presentation!
Definitely makes me want to dust off my SQL installations and manuals for a closer look!
Posted in Data Structures, Database, PostgreSQL, RDBMS, SQL, Trees | No Comments »
Monday, February 20th, 2012
Migrating from Oracle to PostgreSQL by Kevin Kempter
From the post:
This video presents Ora2Pg, a free tool that you can use to migrate an Oracle database to a PostgreSQL compatible schema. Ora2Pg connects your Oracle database, scan it automatically and extracts its structure or data, it then generates SQL scripts that you can load into your PostgreSQL database.
Ora2Pg can be used from reverse engineering Oracle database for database migration or to replicate Oracle data into a PostgreSQL database. The video shows where to download it and talks about the prerequisites. It explains how to install Ora2Pg and configure it. At the end, it presents some examples of ora2pg being used.
Like the man says, useful for migration or replication.
What I wonder about is the day in the not too distant future when “migration” isn’t a meaningful term. Either because the data is too large or dynamic for “migration” to be meaningful. Not to mention the inevitable dangers of corruption during “migration.”
And if you think about it, isn’t the database engine, Oracle or PostgreSQL simply a way to access data already stored? If I want to use a different engine to access the same data, what is the difficulty?
I would much rather design a topic map that queries “Oracle” data in place, either using an Oracle interface or even directly than to “migrate” the data with all the hazards and dangers that brings.
Will be interesting if the “cloud” results in data storage separate from application interfaces. Much like we all use TCP/IP for network traffic, although the packets are put to different purposes by different applications.
Posted in Oracle, PostgreSQL | No Comments »
Tuesday, January 31st, 2012
PGStrom
From the webpage:
PG-Strom is a module of FDW (foreign data wrapper) of PostgreSQL database. It was designed to utilize GPU devices to accelarate sequential scan on massive amount of records with complex qualifiers. Its basic concept is CPU and GPU should focus on the workload with their advantage, and perform concurrently. CPU has much more flexibility, thus, it has advantage on complex stuff such as Disk-I/O, on the other hand, GPU has much more parallelism of numerical calculation, thus, it has advantage on massive but simple stuff such as check of qualifiers for each rows.
The below figure is a basic concept of PG-Strom. Now, on sequential scan workload, vanilla PostgreSQL does iteration of fetch a tuple and checks of qualifiers for each tuples. If we could consign GPU the workload of green portion, it enables to reduce workloads of CPU, thus, it shall be able to load more tuples in advance. Eventually, it should allow to provide shorter response-time on complex queries towards large amount of data.
Requires setting up the table for the GPU ahead of time but performance increase is reported to be 10x – 20x.
It occurs to me that GPUs should be well suited for graph processing. Yes? Will have to look into that and report back.
Posted in CUDA, GPU, PGStrom, PostgreSQL | 1 Comment »
Monday, December 26th, 2011
Beyond Relational
I originally arrived at this site because of a blog hosted there with lessons on Oracle 10g. Exploring a bit I decided to post about it.
Seems to have fairly broad coverage, from Oracle and PostgreSQL to TSQL and XQuery.
Likely to be a good site for learning cross-overs between systems that you can map for later use.
Suggestions of similar sites?
Posted in Database, MySQL, Oracle, PostgreSQL, SQL, SQL Server | No Comments »
Thursday, December 15th, 2011
SQL Database-as-a-Service
Documentation
Just starting the documentation but two quick thoughts:
First, most conventionally, this could be the back-end to a topic map server. Despite having started off many years ago in server administration (or perhaps because of it), server configuration/management isn’t an “additional” duty for any mission critical development/support staff. Too easy to hire server management services, who are capable of providing maintenance/support that no small firm could afford locally.
Second, a bit more unconventionally, this could be an illustration for a Topic-Map-As-Service. Think about it. If instead of the mish-mash that is Wikipedia, you had a topic maps of facts that were supplemented (read mapped) to records from various public reporting agencies, that could be interesting to “press” a local topic map against to acquire more recent data.
True, there are the public record services but they only give you person by person records and not relationships between them. Not to mention that if you are inventive, you could create some very interesting topic maps (intersections of records).
Imagine the stir that a topic map of license plates of cars with local plates at motels would cause. Rather than offering free access, since most people would only be interested in one license plate in particular, suggest that you show one plate at some random time each hour and where it was seen. (not the date) Sell advertising for the page where you offer the “free” sneak peak. Suspect you better have some load expansion capacity.
Posted in Marketing, PostgreSQL, Topic Maps | No Comments »
Thursday, December 1st, 2011
Seven Databases in Seven Weeks now in Beta
From the webpage:
Redis, Neo4J, Couch, Mongo, HBase, Riak, and Postgres: with each database, you’ll tackle a real-world data problem that highlights the concepts and features that make it shine. You’ll explore the five data models employed by these databases: relational, key/value, columnar, document, and graph. See which kinds of problems are best suited to each, and when to use them.
You’ll learn how MongoDB and CouchDB, both JavaScript powered, document oriented datastores, are strikingly different. Learn about the Dynamo heritage at the heart of Riak and Cassandra. Understand MapReduce and how to use it to solve Big Data problems.
Build clusters of servers using scalable services like Amazon’s Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that’s more than the sum of its parts, or find one that meets all your needs at once.
Seven Databases in Seven Weeks will give you a broad understanding of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.
Now in beta, in non-DRM PDF, epub, and mobi from pragprog.com/book/rwdata.
If you know the Seven Languages in Seven Weeks by Bruce Tate, no further recommendation is necessary for the approach.
I haven’t read the book, yet, but will be getting the electronic beta tonight. More to follow.
Posted in CouchDB, HBase, MongoDB, Neo4j, PostgreSQL, Redis, Riak | No Comments »
Friday, November 11th, 2011
Postgres Plus Connector for Hadoop
From the webpage:
The Postgres Plus Connector for Hadoop provides developers easy access to massive amounts of SQL data for integration with or analysis in Hadoop processing clusters. Now large amounts of data managed by PostgreSQL or Postgres Plus Advanced Server can be accessed by Hadoop for analysis and manipulation using Map-Reduce constructs.
EnterpriseDB recognized early on that Hadoop, a framework allowing distributed processing of large data sets across computer clusters using a simple programming model, was a valuable and complimentary data processing model to traditional SQL systems. Map-Reduce processing serves important needs for basic processing of extremely large amounts of data and SQL based systems will continue to fulfill their mission critical needs for complex processing of data well into the future. What was missing was an easy way for developers to access and move data between the two environments.
EnterpriseDB has created the Postgres Plus Connector for Hadoop by extending the Pig platform (an engine for executing data flows in parallel on Hadoop) and using an EnterpriseDB JDBC driver to allow users the ability to load the results of a SQL query into Hadoop where developers can operate on that data using familiar Map-Reduce programming. In addition, data from Hadoop can also be moved back into PostgreSQL or Postgres Plus Advanced Server tables.
A private beta is in progress, see the webpage for details and to register.
Plus, there is a webinar, Tuesday, November 29, 2011 11:00 am Eastern Standard Time (New York, GMT-05:00), Extending SQL Analysis with the Postgres Plus Connector for Hadoop. Registration at the webpage as well.
A step towards seamless data environments. Much like word processing now without the “.” commands. Same commands for the most part but unseen. Data is going in that direction. You will specify desired results and environments will take care of access, processor(s), operations and the like. Tables will appear as tables because you have chosen to view them as tables, etc.
Posted in Hadoop, MapReduce, Pig, PostgreSQL, SQL | No Comments »
Thursday, September 29th, 2011
Indexed Nearest Neighbour Search in PostGIS
From the post:
An always popular question on the PostGIS users mailing list has been “how do I find the N nearest things to this point?”.
To date, the answer has generally been quite convoluted, since PostGIS supports bounding box index searches, and in order to get the N nearest things you need a box large enough to capture at least N things. Which means you need to know how big to make your search box, which is not possible in general.
PostgreSQL has the ability to return ordered information where an index exists, but the ability has been restricted to B-Tree indexes until recently. Thanks to one of our clients, we were able to directly fund PostgreSQL developers Oleg Bartunov and Teodor Sigaev in adding the ability to return sorted results from a GiST index. And since PostGIS indexes use GiST, that means that now we can also return sorted results from our indexes.
This feature (the PostGIS side of it) was funded by Vizzuality, and hopefully it comes in useful in their CartoDB work.
You will need PostgreSQL 9.1 and the PostGIS source code from the repository, but this is what a nearest neighbour search looks like:
PostgreSQL? Isn’t that SQL?
Indexed nearest neighbour search is a question of results, not ideology.
Better targeting through technology.
Posted in Geographic Data, Geographic Information Retrieval, PostgreSQL | No Comments »
Wednesday, February 16th, 2011
Tsearch2 – full text extension for PostgreSQL
Following up on the TSearch Primer post from yesterday.
This is the current documentation for Tsearch2.
Posted in PostgreSQL, TSearch | No Comments »
Tuesday, February 15th, 2011
TSearch Primer
From the website:
TSearch is a Full-Text Search engine that is packaged with PostgreSQL. The key developers of TSearch are Oleg Bartunov and Teodor Sigaev who have also done extensive work with GiST and GIN indexes used by PostGIS, PgSphere and other projects. For more about how TSearch and OpenFTS got started check out A Brief History of FTS in PostgreSQL. Check out the TSearch Official Site if you are interested in related TSearch tips or interested in donating to this very worthy project.
Tsearch is different from regular string searching in PostgreSQL in a couple of key ways.
- It is well-suited for searching large blobs of text since each word is indexed using a Generalized Inverted Index (GIN) or Generalized Search Tree (GiST) and searched using text search vectors. GIN is generally used for indexing. Search vectors are at word and phrase boundaries.
- TSearch has a concept of Linguistic significance using various language dictionaries, ISpell, thesaurus, stop words, etc. therefore it can ignore common words and equate like meaning terms and phrases.
- TSearch is for the most part case insensitive.
- While various dictionaries and configs are available out of the box with TSearch, one can create new ones and customize existing further to cater to specific niches within industries – e.g. medicine, pharmaceuticals, physics, chemistry, biology, legal matters.
Short introduction to TSearch, which is part of PostgreSQL.
Should be of interest to topic mappers using PostgreSQL.
Posted in PostgreSQL, SQL, TSearch | 1 Comment »