Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

May 2, 2012

TokuDB v6.0: Download Available

Filed under: MySQL,TokuDB — Patrick Durusau @ 3:08 pm

TokuDB v6.0: Download Available by Martin Farach-Colton.

From the post:

TokuDB v6.0 is full of great improvements, like getting rid of slave lag, better compression, improved checkpointing, and support for XA.

I’m happy to announce that TokuDB v6.0 is now generally available and can be downloaded here.

Are you familiar with any independent benchmark testing on TokuDB?

Not that I doubt the TokuDB numbers.

I'm thinking that contributing standardized numbers to a centralized resource would help with evaluations.

April 13, 2012

Percona Toolkit 2.1 with New Online Schema Change Tool

Filed under: MySQL,Percona Server,Schema — Patrick Durusau @ 4:47 pm

Percona Toolkit 2.1 with New Online Schema Change Tool by Baron Schwartz.

From the post:

I’m proud to announce the GA release of version 2.1 of Percona Toolkit. Percona Toolkit is the essential suite of administrative tools for MySQL.

With this release we introduce a new version of pt-online-schema-change, a tool that enables you to ALTER large tables with no blocking or downtime. As you know, MySQL locks tables for most ALTER operations, but pt-online-schema-change performs the ALTER without any locking. Client applications can continue reading and writing the table with no interruption.

With this new version of the tool, one of the most painful things anyone experiences with MySQL is significantly alleviated. If you’ve ever delayed a project’s schedule because the release involved an ALTER, which had to be scheduled in the dead of the night on Sunday, and required overtime and time off, you know what I mean. A schema migration is an instant blocker in the critical path of your project plan. No more!
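
The lock-free ALTER that pt-online-schema-change performs follows a well-known shadow-table pattern: create a new table with the desired schema, keep it in sync with triggers while copying rows over, then swap names. A minimal sketch of that pattern, using Python's sqlite3 so it runs anywhere (the table and trigger names are invented for the example; the real tool also chunks the copy, throttles, installs UPDATE/DELETE triggers, and handles foreign keys):

```python
import sqlite3

# Shadow-table sketch of online schema change; sqlite3 stands in for MySQL.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, a TEXT)")
cur.executemany("INSERT INTO t (a) VALUES (?)", [("x",), ("y",)])

# 1. Create a shadow table with the new schema (extra column b).
cur.execute("CREATE TABLE t_new (id INTEGER PRIMARY KEY, a TEXT, b TEXT DEFAULT '')")

# 2. A trigger keeps the shadow table in sync while the copy runs.
cur.execute("""CREATE TRIGGER t_sync AFTER INSERT ON t
               BEGIN INSERT INTO t_new (id, a) VALUES (NEW.id, NEW.a); END""")

# 3. Copy existing rows (done in small, throttled chunks in production).
cur.execute("INSERT INTO t_new (id, a) SELECT id, a FROM t")

# Writes against the old table keep flowing into the shadow copy.
cur.execute("INSERT INTO t (a) VALUES ('z')")

# 4. Drop the sync trigger and atomically swap the tables.
cur.execute("DROP TRIGGER t_sync")
cur.execute("ALTER TABLE t RENAME TO t_old")
cur.execute("ALTER TABLE t_new RENAME TO t")

rows = cur.execute("SELECT id, a, b FROM t ORDER BY id").fetchall()
print(rows)  # [(1, 'x', ''), (2, 'y', ''), (3, 'z', '')]
```

Client reads and writes continue against the original table name throughout; only the final rename needs a (brief) exclusive lock.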

Certainly a useful feature for MySQL users.

Not to mention being another step towards data models being a matter of how you choose to view the data for some particular purpose. Not quite there, yet, but that day is coming.

In a very real sense, the “normalization” of data and the data models we built into SQL systems were compensation for the shortcomings of our computing platforms. That we continue to normalize in the face of increases in computing resources that make it unnecessary is evidence of shortcomings on our part.

April 12, 2012

Drizzle: An Open Source Microkernel DBMS for High Performance Scale-Out Applications

Filed under: Database,Drizzle,MySQL — Patrick Durusau @ 7:07 pm

Drizzle: An Open Source Microkernel DBMS for High Performance Scale-Out Applications

From the webpage:

The Global Drizzle Development Team is pleased to announce the immediate availability of Drizzle 7.1.33-stable, the first stable release of Drizzle 7.1 and the result of 12 months of hard work from contributors around the world.

Improvements in Drizzle 7.1 compared to 7.0

  • Xtrabackup is included (in-tree) by Stewart Smith
  • Multi-source replication by David Shrewsbury
  • Improved execute parser by Brian Aker and Vijay Samuel
  • Servers are identified with UUID in replication by Joe Daly
  • HTTP JSON API (experimental) by Stewart Smith
  • Percona Innodb patches merged by Laurynas Biveinis
  • JS plugin: execute JavaScript code as a Drizzle function by Henrik Ingo
  • IPV6 data type by Muhammad Umair
  • Improvements to libdrizzle client library by Andrew Hutchings and Brian Aker
  • Query log plugin and auth_schema by Daniel Nichter
  • ZeroMQ plugin by Markus Eriksson
  • Ability to publish transactions to zeromq and rabbitmq by Marcus Eriksson
  • Replication Dictionary by Brian Aker
  • Log output to syslog is enabled by default by Brian Aker
  • Improvements to logging stats plugin
  • Removal of drizzleadmin utility (you can now do all administration from drizzle client itself) by Andrew Hutchings
  • Improved Regex Plugin by Clint Byrum
  • Improvements to pandora build by Monty Taylor
  • New version numbering system and support for it in pandora-build by Henrik Ingo
  • Updated DEB and RPM packages, by Henrik Ingo
  • Revamped testing system Kewpie all-inclusive with suites of randgen, sysbench, sql-bench, and crashme tests by Patrick Crews
  • Removal of HailDB engine by Stewart Smith
  • Removal of PBMS engine
  • Continued code refactoring by Olaf van der Spek, Brian Aker and others
  • many bug fixes
  • Continuous integration by Brian Aker and Mark Atwood
  • Release management by Vijay Samuel

From the documentation page:

Drizzle is a transactional, relational, community-driven open-source database that is forked from the popular MySQL database.

The Drizzle team has removed non-essential code, has re-factored the remaining code, and has converted the code to modern C++ and modern libraries.

Charter

  • A database optimized for Cloud infrastructure and Web applications
  • Design for massive concurrency on modern multi-CPU architectures
  • Optimize memory use for increased performance and parallelism
  • Open source, open community, open design

Scope

  • Re-designed modular architecture providing plugins with defined APIs
  • Simple design for ease of use and administration
  • Reliable, ACID transactional

If you like databases and data structure research, now is a wonderful time to be active.

March 21, 2012

A graphical overview of your MySQL database

Filed under: Data,Database,MySQL — Patrick Durusau @ 3:30 pm

A graphical overview of your MySQL database by Christophe Ladroue.

From the post:

If you use MySQL, there’s a default schema called ‘information_schema‘ which contains lots of information about your schemas and tables among other things. Recently I wanted to know whether a table I use for storing the results of a large number of experiments was anywhere near maxing out. To cut a brief story even shorter, the answer was “not even close” and could be found in ‘information_schema.TABLES‘. Not being one to avoid any opportunity to procrastinate, I went on to write a short script to produce a global overview of the entire database.

information_schema.TABLES contains the following fields: TABLE_SCHEMA, TABLE_NAME, TABLE_ROWS, AVG_ROW_LENGTH and MAX_DATA_LENGTH (and a few others). We can first have a look at the relative sizes of the schemas with the MySQL query “SELECT TABLE_SCHEMA,SUM(DATA_LENGTH) SCHEMA_LENGTH FROM information_schema.TABLES WHERE TABLE_SCHEMA!='information_schema' GROUP BY TABLE_SCHEMA“.
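
The query from the post can be tried without a MySQL server by mocking information_schema.TABLES in sqlite3 (the schema names and sizes below are invented for the example; against a real server you would run the identical SELECT through any MySQL client):

```python
import sqlite3

# Mock of information_schema.TABLES with only the columns the query touches.
conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE TABLES (TABLE_SCHEMA TEXT, TABLE_NAME TEXT, DATA_LENGTH INTEGER)")
cur.executemany("INSERT INTO TABLES VALUES (?, ?, ?)", [
    ("experiments", "results", 5_000_000),
    ("experiments", "runs",      200_000),
    ("blog",        "posts",      80_000),
    ("information_schema", "TABLES", 16_384),  # filtered out below
])

# The aggregation from the post: total bytes per schema.
rows = cur.execute("""SELECT TABLE_SCHEMA, SUM(DATA_LENGTH) AS SCHEMA_LENGTH
                      FROM TABLES
                      WHERE TABLE_SCHEMA != 'information_schema'
                      GROUP BY TABLE_SCHEMA
                      ORDER BY SCHEMA_LENGTH DESC""").fetchall()
print(rows)  # [('experiments', 5200000), ('blog', 80000)]
```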

Christophe includes R code to generate graphics that you will find useful in managing (or just learning about) MySQL databases.

While the parts of the schema Christophe is displaying graphically are obviously subjects, the graphical display pushed me in another direction.

If we can visualize the schema of a MySQL database, then shouldn’t we be able to visualize the database structures a bit closer to the metal?

And if we can visualize those database structures, shouldn’t we be able to represent them and the relationships between them as a graph?

Or perhaps better, can we “view” those structures and relationships “on demand” as a graph?

That is in fact what is happening when we display a table at the command prompt for MySQL. It is a “display” of information, not a report of information.

I don’t know enough about the internal structures of MySQL or PostgreSQL to start such a mapping. But ignorance is curable, at least that is what they say. 😉

I have another post today that suggests a different take on conversion methodology.

February 17, 2012

Oracle Announces General Availability of MySQL Cluster 7.2

Filed under: MySQL,Oracle — Patrick Durusau @ 5:08 pm

Oracle Announces General Availability of MySQL Cluster 7.2

Another demonstration that high quality open source projects are not inconsistent with commercial products.

From the post:

Delivers up to 70x More Performance for Complex Queries; Adds New NoSQL Memcached Interface

News Facts

  • Continuing to drive MySQL innovation, Oracle today announced the general availability of MySQL Cluster 7.2.
  • For highly demanding Web-based and communications products and services, MySQL Cluster is designed to cost-effectively deliver 99.999% availability, high write scalability and very low latency.
  • With SQL and NoSQL access through a new Memcached API, MySQL Cluster represents a “best of both worlds” solution allowing key value operations and complex SQL queries within the same database.
  • With MySQL Cluster 7.2, users can also gain up to a 70x increase in performance on complex queries, and enhanced multi-data center scalability.
  • MySQL Cluster 7.2 is also certified with Oracle VM. The combination of its elastic, on-demand scalability and self-healing features, together with Oracle VM support, makes MySQL Cluster an ideal choice for deployments in the cloud.
  • Also generally available today is the latest release of the MySQL Cluster Manager, version 1.1.4, further improving the ease of use and administration of MySQL Cluster.

February 14, 2012

Open Source OData Tools for MySQL and PHP Developers

Filed under: MySQL,Odata,PHP — Patrick Durusau @ 5:09 pm

Open Source OData Tools for MySQL and PHP Developers by Doug Mahugh.

To enable more interoperability scenarios, Microsoft has released today two open source tools that provide support for the Open Data Protocol (OData) for PHP and MySQL developers working on any platform.

The growing popularity of OData is creating new opportunities for developers working with a wide variety of platforms and languages. An ever increasing number of data sources are being exposed as OData producers, and a variety of OData consumers can be used to query these data sources via OData’s simple REST API.

In this post, we’ll take a look at the latest releases of two open source tools that help PHP developers implement OData producer support quickly and easily on Windows and Linux platforms:

  • The OData Producer Library for PHP, an open source server library that helps PHP developers expose data sources for querying via OData. (This is essentially a PHP port of certain aspects of the OData functionality found in System.Data.Services.)
  • The OData Connector for MySQL, an open source command-line tool that generates an implementation of the OData Producer Library for PHP from a specified MySQL database.

These tools are written in platform-agnostic PHP, with no dependencies on .NET.

This is way cool!

Seriously consider Doug’s request for what other tools you would like to see for OData?

January 23, 2012

Percona Live DC 2012 Slides (MySQL)

Filed under: Database,MySQL — Patrick Durusau @ 7:44 pm

Percona Live DC 2012 Slides

I put the (MySQL) header for the benefit of hard core TM fans who can’t be bothered with MySQL posts. 😉

I won’t say what database system I originally learned databases on but I must admit that I became enchanted with MySQL years later.

For a large number of applications, including TM backends, MySQL is entirely appropriate.

Sure, when your company goes interplanetary you are going to need a bigger solution.

But in the mean time, get a solution that isn’t larger than the problem you are trying to solve.

BTW, MySQL installations have the same mapping for BI issues I noted in an earlier post today.

Thoughts on how you would fashion a generic solution that does not require conversion of data?

January 11, 2012

Fractal Tree Indexes and Mead – MySQL Meetup

Filed under: Fractal Trees,MySQL — Patrick Durusau @ 8:04 pm

Fractal Tree Indexes and Mead – MySQL Meetup

From the post:

As a brief overview – most databases employ B-trees to achieve a good tradeoff between the ability to update data quickly and to search it quickly. It turns out that B-trees are far from the optimum in this tradeoff space. This led to the development at MIT, Rutgers and Stony Brook of Fractal Tree indexes. Fractal Tree indexes improve MySQL® scalability and query performance by allowing greater insertion rates, supporting rich indexing and offering efficient compression. They can also eliminate operational headaches such as dump/reloads, inflexible schemas and partitions.

The presentation provides an overview on how Fractal Tree indexes work, and then gets into some specific product features, benchmarks, and customer use cases that show where people have deployed Fractal Tree indexes via the TokuDB® storage engine.
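
As a rough intuition for why buffering beats a B-tree on writes, here is a toy sketch of my own (it is not Tokutek's data structure; a real Fractal Tree keeps a message buffer in every internal node and flushes them down the tree): inserts land in a buffer and are merged into the sorted level in batches, so the per-insert cost is amortized.

```python
import bisect

# Toy illustration of buffered, write-optimized indexing. BufferedIndex is
# a made-up name for this sketch, not a TokuDB type.
class BufferedIndex:
    def __init__(self, buffer_size=4):
        self.buffer = []          # recent writes, unsorted, "in memory"
        self.leaves = []          # sorted keys, standing in for "on disk"
        self.buffer_size = buffer_size
        self.flushes = 0          # counts the expensive batched merges

    def insert(self, key):
        self.buffer.append(key)
        if len(self.buffer) >= self.buffer_size:
            self._flush()

    def _flush(self):
        # One batched merge amortizes I/O over many inserts, unlike a
        # B-tree's root-to-leaf walk on every single insert.
        for key in self.buffer:
            bisect.insort(self.leaves, key)
        self.buffer = []
        self.flushes += 1

    def contains(self, key):
        # Queries must consult both the buffer and the sorted level.
        if key in self.buffer:
            return True
        i = bisect.bisect_left(self.leaves, key)
        return i < len(self.leaves) and self.leaves[i] == key

idx = BufferedIndex()
for k in [42, 7, 19, 3, 88, 5]:
    idx.insert(k)
print(idx.flushes, idx.contains(7), idx.contains(100))  # 1 True False
```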

Whether you are just browsing or seriously looking for better performance, I think you will like this presentation.

Performance of data stores is an issue for topic maps whether you store a final “merged” result or simply present “merged” results to users.

January 9, 2012

Triggers in MySQL

Filed under: MySQL,SQL,Triggers — Patrick Durusau @ 1:44 pm

Triggers in MySQL

From the post:

Almost all developers have heard of triggers, and most know that MySQL supports them and that they add real advantages. Triggers are SQL statements stored in the database.

Triggers add functionality to your tables: they perform a series of actions automatically when INSERT, UPDATE or DELETE events occur on the table, without issuing separate queries.

Some developers prefer stored procedures to triggers, but a trigger is a kind of stored procedure with procedural code in its body. The difference is that a trigger is invoked when an event occurs on its table, whereas a stored procedure must be called explicitly.
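
The idea is easy to see running. The sketch below uses Python's sqlite3, whose CREATE TRIGGER syntax is close to MySQL's (MySQL adds FOR EACH ROW and DELIMITER handling); the table and trigger names are invented for the example:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()
cur.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL)")
cur.execute("CREATE TABLE audit (order_id INTEGER, note TEXT)")

# Fires automatically on every INSERT into orders; no second query needed.
cur.execute("""CREATE TRIGGER log_order AFTER INSERT ON orders
               BEGIN
                 INSERT INTO audit VALUES (NEW.id, 'order created');
               END""")

cur.execute("INSERT INTO orders (amount) VALUES (9.99)")
audit = cur.execute("SELECT * FROM audit").fetchall()
print(audit)  # [(1, 'order created')]
```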

Short overview of triggers in MySQL.

Possibly useful if you are using a relational backend for your topic map engine.

Should topic map engines support the equivalent of triggers?

As declared by a topic map?

Now that would be clever, to have a topic map carry its triggers around with it.

Admittedly, interactive data structures aren’t the norm, yet, but they are certainly worth thinking about.

December 26, 2011

Beyond Relational

Filed under: Database,MySQL,Oracle,PostgreSQL,SQL,SQL Server — Patrick Durusau @ 8:19 pm

Beyond Relational

I originally arrived at this site because of a blog hosted there with lessons on Oracle 10g. Exploring a bit I decided to post about it.

Seems to have fairly broad coverage, from Oracle and PostgreSQL to TSQL and XQuery.

Likely to be a good site for learning cross-overs between systems that you can map for later use.

Suggestions of similar sites?

December 22, 2011

Percona Toolkit

Filed under: MySQL — Patrick Durusau @ 7:40 pm

Percona Toolkit

From the webpage:

Percona Toolkit is a collection of advanced command-line tools used by Percona support staff to perform a variety of MySQL and system tasks that are too difficult or complex to perform manually, including:

  • Verify master and replica data consistency
  • Efficiently archive rows
  • Find duplicate indexes
  • Summarize MySQL servers
  • Analyze queries from logs and tcpdump
  • Collect vital system information when problems occur

Tools are a vital part of any MySQL deployment, so it’s important to use ones that are reliable and well-designed. Over 2,000 tests and several years of deployment, including some of the Internet’s best-known sites, have proven the reliability of the tools in Percona Toolkit. And the combined experience and expertise of Percona ensures that each tool is well thought-out and designed.

With tools and documentation like this, I am sorely tempted to throw a MySQL installation on my box just for fun. (Or more serious purposes.)

December 20, 2011

How Twitter Stores 250 Million Tweets a Day Using MySQL

Filed under: Design,MySQL — Patrick Durusau @ 8:27 pm

How Twitter Stores 250 Million Tweets a Day Using MySQL

From the post:

Jeremy Cole, a DBA Team Lead/Database Architect at Twitter, gave a really good talk at the O’Reilly MySQL conference: Big and Small Data at @Twitter, where the topic was thinking of Twitter from the data perspective.

One of the interesting stories he told was of the transition from Twitter’s old way of storing tweets using temporal sharding, to a more distributed approach using a new tweet store called T-bird, which is built on top of Gizzard, which is built using MySQL.

OK, so your Christmas wish wasn’t for a topic map with quite that level of input every day. 😉 You can still learn something about the design of a robust architecture from this presentation.
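
The two sharding schemes contrasted in the talk can be sketched in a few lines (the shard count and routing functions here are invented for illustration; Gizzard and T-bird are far more involved): temporal sharding routes by when a tweet was written, which leaves new shards hot and old ones cold, while hash sharding spreads ids evenly.

```python
import hashlib

SHARDS = 4

def temporal_shard(year):
    # Each time period gets its own shard: recent shards take all the writes.
    return (year - 2006) % SHARDS

def hash_shard(tweet_id):
    # Hashing the id spreads reads and writes evenly across every shard.
    digest = hashlib.md5(str(tweet_id).encode()).hexdigest()
    return int(digest, 16) % SHARDS

print([temporal_shard(y) for y in (2006, 2007, 2008)])  # [0, 1, 2]
print({hash_shard(i) for i in range(1000)})             # all 4 shards used
```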

November 20, 2011

Scaling MySQL with TokuDB Webinar

Filed under: MySQL,TokuDB — Patrick Durusau @ 4:23 pm

Scaling MySQL with TokuDB Webinar – Video and Slides Now Available

From the post:

Thanks to everyone who signed up and attended the webinar I gave this week with Tim Callaghan on Scaling MySQL. For those who missed it and are interested, the video and slides are now posted here.

[snip]

MySQL implementations are often kept relatively small, often just a few hundred GB or less. Anything beyond this quickly leads to painful operational problems such as poor insertion rates, slow queries, hours to days offline for schema changes, prolonged downtime for dump/reload, etc. The promise of scalable MySQL has remained largely unfulfilled, until TokuDB.

TokuDB v5.0 delivers

  • Exceptional Agility — Hot Schema Changes allow read/write operations during index creation or column/field addition
  • Unmatched Speed — Fractal Tree indexes perform 20x to 80x better on write intensive workloads
  • Maximum Scalability — Fractal Tree index performance scales even as the primary index exceeds available RAM

This webinar covers TokuDB v5.0 features, latest performance results, and typical use cases.

I haven’t run TokuDB, but it is advertised as a high-performance, drop-in replacement for MySQL.

Comments/suggestions? (I need to pre-order High-Performance MySQL, 2nd ed., 2012. Ignore the scams with the 1st edition copies still in stock at some sellers.)

November 19, 2011

Percona Server 5.5.17-22.1 released

Filed under: MySQL,Percona Server,SQL — Patrick Durusau @ 10:23 pm

Percona Server 5.5.17-22.1 released

From the webpage:

Percona is glad to announce the release of Percona Server 5.5.17-22.1 on November 19th, 2011 (Downloads are available here and from the Percona Software Repositories).

Based on MySQL 5.5.17, including all the bug fixes in it, Percona Server 5.5.17-22.1 is now the current stable release in the 5.5 series. All of Percona ‘s software is open-source and free, all the details of the release can be found in the 5.5.17-22.1 milestone at Launchpad or in the Release Notes for 5.5.17-22.1 in the documentation.

I haven’t installed or run a Percona Server, but the reported performance numbers are good enough to merit a closer look.

If you have run a Percona server, please comment.

November 8, 2011

Toad Virtual Expo – 11.11.11 – 24-hour Toad Event

Filed under: Conferences,Hadoop,HBase,Hive,MySQL,Oracle,Toad — Patrick Durusau @ 7:46 pm

Toad Virtual Expo – 11.11.11 – 24-hour Toad Event

From the website:

24 hours of Toad is here! Join us on 11.11.11, and take an around the world journey with Toad and database experts who will share database development and administration best practices. This is your chance to see new products and new features in action, virtually collaborate with other users – and Quest’s own experts, and get a first-hand look at what’s coming in the world of Toad.

If you are not going to see Immortals on 11.11.11, or are looking for something to do after the movie, drop in on the Toad Virtual Expo! 😉 (It doesn’t look like a “chick” movie anyway.)

Times:

Register today for Quest Software’s 24-hour Toad Virtual Expo and learn why the best just got better.

  1. Tokyo Friday, November 11, 2011 6:00 a.m. JST – Saturday, November 12, 2011 6:00 a.m. JST
  2. Sydney Friday, November 11, 2011 8:00 a.m. EDT – Saturday, November 12, 2011 8:00 a.m. EDT
  3. Tel Aviv Thursday, November 10, 2011 11:00 p.m. IST – Friday, November 11, 2011 11:00 p.m. IST
  4. Central Europe Thursday, November 10, 2011 10:00 p.m. CET – Friday, November 11, 2011 10:00 p.m. CET
  5. London Thursday, November 10, 2011 9:00 p.m. GMT – Friday, November 11, 2011 9:00 p.m. GMT
  6. New York Thursday, November 10, 2011 4:00 p.m. EST – Friday, November 11, 2011 4:00 p.m. EST
  7. Los Angeles Thursday, November 10, 2011 1:00 p.m. PST – Friday, November 11, 2011 1:00 p.m. PST

The site wasn’t long on specifics but this could be fun!

Toad for Cloud Databases (Quest Software)

Filed under: BigData,Cloud Computing,Hadoop,HBase,Hive,MySQL,Oracle,SQL Server — Patrick Durusau @ 7:45 pm

Toad for Cloud Databases (Quest Software)

From the news release:

The data management industry is experiencing more disruption than at any other time in more than 20 years. Technologies around cloud, Hadoop and NoSQL are changing the way people manage and analyze data, but the general lack of skill sets required to manage these new technologies continues to be a significant barrier to mainstream adoption. IT departments are left without a clear understanding of whether development and DBA teams, whose expertise lies with traditional technology platforms, can effectively support these new systems. Toad® for Cloud Databases addresses the skill-set shortage head-on, empowering database professionals to directly apply their existing skills to emerging Big Data systems through an easy-to-use and familiar SQL-based interface for managing non-relational data. 

News Facts:

  • Toad for Cloud Databases is now available as a fully functional, commercial-grade product, for free, at www.quest.com/toad-for-cloud-databases.  Toad for Cloud Databases enables users to generate queries, migrate, browse, and edit data, as well as create reports and tables in a familiar SQL view. By simplifying these tasks, Toad for Cloud Databases opens the door to a wider audience of developers, allowing more IT teams to experience the productivity gains and cost benefits of NoSQL and Big Data.
  • Quest first released Toad for Cloud Databases into beta in June 2010, making the company one of the first to provide a SQL-based database management tool to support emerging, non-relational platforms. Over the past 18 months, Quest has continued to drive innovation for the product, growing its list of supported platforms and integrating a UI for its bi-directional data connector between Oracle and Hadoop.
  • Quest’s connector between Oracle and Hadoop, available within Toad for Cloud Databases, delivers a fast and scalable method for data transfer between Oracle and Hadoop in both directions. The bidirectional characteristic of the utility enables organizations to take advantage of Hadoop’s lower cost of storage and analytical capabilities. Quest also contributed the connector to the Apache Hadoop project as an extension to the existing SQOOP framework, and is also available as part of Cloudera’s Distribution Including Apache Hadoop. 
  • Toad for Cloud Databases today supports:
    • Apache Hive
    • Apache HBase
    • Apache Cassandra
    • MongoDB
    • Amazon SimpleDB
    • Microsoft Azure Table Services
    • Microsoft SQL Azure, and
    • All Open Database Connectivity (ODBC)-enabled relational databases (Oracle, SQL Server, MySQL, DB2, etc)


Anything that eases the transition to cloud computing is going to be welcome. Toad being free will increase the ranks of DBAs who will at least experiment on their own.

October 2, 2011

Oracle rigs MySQL for NoSQL-like access

Filed under: MySQL,NoSQL — Patrick Durusau @ 6:36 pm

Oracle rigs MySQL for NoSQL-like access by Joab Jackson at CIO.

Joab writes:

In an interview in May with the IDG News Service, Tomas Ulin, Oracle vice president of MySQL engineering, described a project to bring the NoSQL-like speed of access to SQL-based MySQL.

“We feel very strongly we can combine SQL and NoSQL,” he said. “If you have really high-scalability performance requirements for certain parts of your application, you can share the dataset” across both NoSQL and SQL interfaces.

The key to Oracle’s effort is the use of Memcached, which Internet-based service providers, Facebook being the largest, have long used to quickly serve MySQL data to their users. Memcached creates a hash table of commonly accessed database items that is stored in a server’s working memory for quick access, by way of an API (application programming interface).

Memcached would provide a natural non-SQL interface for MySQL, Ulin said. Memcached “is heavily used in the Web world. It is something [webmasters] already have installed on their systems, and they know how to use [it]. So we felt that would be a good way to provide NoSQL access,” Ulin said.

Oracle’s thinking is that the Memcached interface can serve as an alternative access point for MySQL itself. Much of the putative sluggishness of SQL-based systems actually stems from the overhead of supporting a fully ACID-based query infrastructure needed to execute complex queries, industry observers said. By providing a NoSQL alternative access method, Oracle could offer customers the best of both worlds–a database that is fully ACID-compliant and has the speed of a NoSQL database.

With Memcached you are not accessing the data through SQL, but by a simple key-value lookup. “You can do a simple key-value-type lookup and get very optimal performance,” Ulin said.

The technology would not require any changes to MySQL itself. “We can just plug it in,” Ulin said. He added that Oracle was considering including this technology in the next version of MySQL, version 5.6.
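
The division of labor Ulin describes is essentially the cache-aside pattern: hot keys are answered from an in-memory hash table, and everything else falls through to SQL. A rough sketch, with a plain dict standing in for memcached and sqlite3 for MySQL (key and table names are invented):

```python
import sqlite3

# Cache-aside sketch: a key-value fast path in front of a SQL store.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE kv (k TEXT PRIMARY KEY, v TEXT)")
conn.execute("INSERT INTO kv VALUES ('user:1', 'alice')")

cache = {}

def get(key):
    if key in cache:                      # fast path: no SQL parsing at all
        return cache[key]
    row = conn.execute("SELECT v FROM kv WHERE k = ?", (key,)).fetchone()
    if row:
        cache[key] = row[0]               # populate the cache for next time
    return row[0] if row else None

a = get("user:1")   # misses the cache, falls through to SQL
b = get("user:1")   # served from the cache
c = get("user:2")   # not in the store at all
print(a, b, c)      # alice alice None
```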

While you are thinking about what that will mean for using MySQL engines, remember Stratified B-Tree and Versioned Dictionaries.

Suddenly, being able to map the structures of data stores as subjects (née topics) and to merge them, reliably, with the structures of other data stores doesn’t seem all that far-fetched, does it? The thing to remember is that all that “big data” was stored in some “big structure,” a structure that topic maps can view as subjects to be represented by topics.

Not to mention knowing when you are accessing content (addressing) or authoring information about the content (identification).

August 20, 2011

May the Index be with you!

Filed under: MySQL,Query Language,SQL — Patrick Durusau @ 8:06 pm

May the Index be with you! by Lawrence Schwartz.

From the post:

The summer’s end is rapidly approaching — in the next two weeks or so, most people will be settling back into work. Time to change your mindset, re-evaluate your skills and see if you are ready to go back from the picnic table to the database table.

With this in mind, let’s see how much folks can remember from the recent indexing talks my colleague Zardosht Kasheff gave (O’Reilly Conference, Boston, and SF MySQL Meetups). Markus Winand’s site “Use the Index, Luke!” (not to be confused with my favorite Star Wars parody, “Use the Schwartz, Lone Starr!”), has a nice, quick 5 question indexing quiz that can help with this.

Interesting enough to request an account so I could download TokuDB v5.0. It uses fractal trees for indexing speed. Could be interesting. More on that later.

August 17, 2011

What’s New in MySQL 5.6 – Part 1: Overview – Webinar 18 August 2011

Filed under: MySQL,NoSQL,SQL — Patrick Durusau @ 6:54 pm

What’s New in MySQL 5.6 – Part 1: Overview

From the webpage:

MySQL 5.6 builds on Oracle’s investment in MySQL by adding improvements to Performance, InnoDB, Replication, Instrumentation and flexibility with NoSQL (Not Only SQL) access. In the first session of this 5-part Webinar series, we’ll cover the highlights of those enhancements to help you begin the development and testing efforts around the new features and improvements that are now available in the latest MySQL 5.6 Development Milestone and MySQL Labs releases.

OK, I’ll ‘fess up: I haven’t kept up with MySQL like I did when I was a sysadmin running it every day in a production environment. So maybe it’s time to do some catching up.

Besides, when you read:

We will also explore how you can now use MySQL 5.6 as a “Not Only SQL” data source for high performance key-value operations by leveraging the new Memcached Plug-in to InnoDB, running simultaneously with SQL for more complex queries, all across the same data set.

“…SQL for more complex queries,…” you almost have to look. 😉

So, get up early tomorrow and throw a recent copy of MySQL on a box.

July 26, 2011

Using MySQL as a NoSQL…

Filed under: MySQL — Patrick Durusau @ 6:21 pm

Using MySQL as a NoSQL – A story for exceeding 750,000 qps on a commodity server by Yoshinori Matsunobu.

From the post:

Most of high scale web applications use MySQL + memcached. Many of them use also NoSQL like TokyoCabinet/Tyrant. In some cases people have dropped MySQL and have shifted to NoSQL. One of the biggest reasons for such a movement is that it is said that NoSQL performs better than MySQL for simple access patterns such as primary key lookups. Most of queries from web applications are simple so this seems like a reasonable decision.

Like many other high scale web sites, we at DeNA(*) had similar issues for years. But we reached a different conclusion. We are using “only MySQL”. We still use memcached for front-end caching (i.e. preprocessed HTML, count/summary info), but we do not use memcached for caching rows. We do not use NoSQL, either. Why? Because we could get much better performance from MySQL than from other NoSQL products. In our benchmarks, we could get 750,000+ qps on a commodity MySQL/InnoDB 5.1 server from remote web clients. We also have got excellent performance on production environments.

Maybe you can’t believe the numbers, but this is a real story. In this long blog post, I’d like to share our experiences.

Perhaps MySQL will be part of your next topic map system!
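
The core of DeNA's argument is that a primary-key lookup is a direct index probe rather than a general query. sqlite3's EXPLAIN QUERY PLAN makes the probe visible (HandlerSocket-style access on MySQL skips the SQL layer entirely for exactly this pattern; the table below is invented for the example):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, f"user{i}") for i in range(1000)])

# The planner reports a direct primary-key probe, not a table scan.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM users WHERE id = 501").fetchall()
row = conn.execute("SELECT name FROM users WHERE id = 501").fetchone()
print(plan[0][-1])  # mentions the INTEGER PRIMARY KEY index
print(row)          # ('user501',)
```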

July 18, 2011

…Neo4j is 377 times faster than MySQL

Filed under: Graphs,MySQL,Neo4j — Patrick Durusau @ 6:46 pm

Time lines and news streams: Neo4j is 377 times faster than MySQL by René Pickhardt.

From the post:

Over the last weeks I did some more work on neo4j, and I am ready to present some more results on speed. (In my use case neo4j outperformed MySQL by a factor of 377! That is more than two orders of magnitude.) As known, one part of my PhD thesis is to create a social newsstream application around my social networking site metalcon.de. It is very obvious that a graph structure for social newsstreams is very natural: you go to a user, traverse to all his friends or objects of interest, and then traverse one step deeper to the newly created content items. A problem with this kind of application is the sorting by time or relevance of the content items. But before I discuss those problems I just want to present another comparison between MySQL and neo4j.

It is as exciting as the title implies.
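
The access pattern René describes is pointer chasing rather than joining, which a dict-of-lists adjacency structure makes concrete (the users, posts, and the `newsstream` helper below are invented for illustration):

```python
# Hop 1: friendships; hop 2: the friends' content. No joins, just traversal.
friends = {"alice": ["bob", "carol"], "bob": ["carol"], "carol": []}
items = {"bob": ["post-7"], "carol": ["post-9", "post-3"]}

def newsstream(user):
    stream = []
    for friend in friends[user]:              # traverse to friends
        stream.extend(items.get(friend, []))  # traverse to their items
    return stream

stream = newsstream("alice")
print(stream)  # ['post-7', 'post-9', 'post-3']
```

In SQL the same question becomes a two-table join whose cost grows with the total data; in the graph form the cost is proportional to the neighborhood actually visited.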

June 22, 2011

Biodiversity Indexing: Migration from MySQL to Hadoop

Filed under: Hadoop,Hibernate,MySQL — Patrick Durusau @ 6:36 pm

Biodiversity Indexing: Migration from MySQL to Hadoop

From the post:

The Global Biodiversity Information Facility is an international organization, whose mission is to promote and enable free and open access to biodiversity data worldwide. Part of this includes operating a search, discovery and access system, known as the Data Portal; a sophisticated index to the content shared through GBIF. This content includes both complex taxonomies and occurrence data such as the recording of specimen collection events or species observations. While the taxonomic content requires careful data modeling and has its own challenges, it is the growing volume of occurrence data that attracts us to the Hadoop stack.

The Data Portal was launched in 2007. It consists of crawling components and a web application, implemented in a typical Java solution consisting of Spring, Hibernate and SpringMVC, operating against a MySQL database. In the early days the MySQL database had a very normalized structure, but as content and throughput grew, we adopted the typical pattern of denormalisation and scaling up with more powerful hardware. By the time we reached 100 million records, the occurrence content was modeled as a single fixed-width table. Allowing for complex searches containing combinations of species identifications, higher-level groupings, locality, bounding box and temporal filters required carefully selected indexes on the table. As content grew it became clear that real time indexing was no longer an option, and the Portal became a snapshot index, refreshed on a monthly basis, using complex batch procedures against the MySQL database. During this growth pattern we found we were moving more and more operations off the database to avoid locking, and instead partitioned data into delimited files, iterating over those and even performing joins using text files by synthesizing keys, sorting and managing multiple file cursors. Clearly we needed a better solution, so we began researching Hadoop. Today we are preparing to put our first Hadoop process into production.

Awesome project!

Where would you suggest the use of topic maps and subject identity to improve the project?
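
The file-based join workaround the GBIF team describes (synthesize a key, sort both inputs, walk two cursors in lockstep) is a classic sort-merge join. A minimal sketch over in-memory lists; `merge_join` and the sample records are my own, and it assumes each key appears at most once per side:

```python
def merge_join(left, right):
    """Sort-merge join; both inputs must be sorted by their first element."""
    out, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][0], right[j][0]
        if lk < rk:
            i += 1              # advance the cursor with the smaller key
        elif lk > rk:
            j += 1
        else:
            out.append((lk, left[i][1], right[j][1]))
            i += 1
            j += 1
    return out

occurrences = sorted([("sp2", "obs-9"), ("sp1", "obs-4")])
taxa = sorted([("sp1", "Puma concolor"), ("sp3", "Lynx rufus")])
result = merge_join(occurrences, taxa)
print(result)  # [('sp1', 'obs-4', 'Puma concolor')]
```

The same loop works over sorted text files with file cursors, which is exactly what made the approach attractive before Hadoop took over the heavy lifting.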

March 17, 2011

MySQL 5.5 Released

Filed under: MySQL,SQL — Patrick Durusau @ 6:49 pm

MySQL 5.5 Released

Performance gains for MySQL 5.5, from the release:

In recent benchmarks, the MySQL 5.5 release candidate delivered significant performance improvements compared to MySQL 5.1. Results included:

  • On Windows: Up to 1,500 percent performance gains for Read/Write operations and up to 500 percent gain for Read Only.(1)
  • On Linux: Up to 360 percent performance gain in Read/Write operations and up to 200 percent improvement in Read Only.(2)

If you are using MySQL as a backend for your topic map application, these and other improvements will be welcome news.

March 8, 2011

Summify’s Technology Examined

Filed under: Data Analysis,Data Mining,MongoDB,MySQL,Redis — Patrick Durusau @ 9:54 am

Summify’s Technology Examined

Phil Whelan writes an interesting review of the underlying technology for Summify.

Many of those same components are relevant to the construction of topic map based services.

Interesting that Summify uses MySQL, Redis and MongoDB.

I rather like the idea of using the best tool for a particular job.

Worth a close read.
