From Big Data to NoSQL: The ReadWriteWeb Guide to Data Terminology (Part 1)
An article that perpetuates the data vs. schema/metadata mistake.
Let me illustrate:
Data: Data are plain facts. When data are processed, organized, structured or presented in a given context so as to make them useful, they are called Information.
Schema: According to Wikipedia, a database system’s schema is “its structure described in a formal language supported by the database management system (DBMS) and refers to the organization of data to create a blueprint of how a database will be constructed (divided into database tables).”
Err, pardon me, isn’t a schema composed of plain facts?
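To make the point concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table name `person` and its columns are invented for illustration; the only real machinery is SQLite's `PRAGMA table_info`, which returns one row per column in the form (cid, name, type, notnull, dflt_value, pk). The schema comes back from the same kind of query, in the same kind of rows, as the "plain facts" stored in the table.

```python
import sqlite3

# Hypothetical example table; the name and columns are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE person (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO person (name) VALUES ('Ada')")

# The "data": rows of plain facts.
data_rows = conn.execute("SELECT id, name FROM person").fetchall()

# The "schema": also rows of plain facts.
# Each row is (cid, name, type, notnull, dflt_value, pk).
schema_rows = conn.execute("PRAGMA table_info(person)").fetchall()

# Keep only the facts of interest: column name, declared type,
# whether NOT NULL was declared, and whether it is part of the primary key.
schema_facts = [
    (name, col_type, bool(notnull), bool(pk))
    for _cid, name, col_type, notnull, _default, pk in schema_rows
]

print(data_rows)
print(schema_facts)
conn.close()
```

Nothing in the query interface distinguishes the second result from the first. Both are lists of tuples that can be filtered, joined, or compared like any other data.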
Sometimes I may only want to talk about the data in a single database.
What happens when I want to take data from several databases, all of which have different schemas?
If I can’t talk about the schemas as plain facts, how am I going to be able to use plain facts from different sources?
I could build an over-arching unified schema, but that would require constant updating and consensus across all the databases that are integrated.
And very likely someone else would prefer another over-arching unified schema.
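One alternative is to treat statements about the schemas as data in their own right. The sketch below is only an illustration under invented assumptions: the sources `crm` and `billing`, their field names, and the helpers `build_lookup` and `relabel` are all hypothetical. The `same_as` list holds plain facts about which fields in the two schemas represent the same thing, and those facts are used to combine rows without either database changing its schema.

```python
# Two hypothetical sources whose schemas name the same things differently.
crm_rows = [{"cust_name": "Ada Lovelace", "cust_email": "ada@example.org"}]
billing_rows = [{"fullName": "Ada Lovelace", "emailAddress": "ada@example.org"}]

# Plain facts *about* the schemas: pairs of (source, field) that mean the same thing.
# Someone else could supply a different list without touching either database.
same_as = [
    (("crm", "cust_name"), ("billing", "fullName")),
    (("crm", "cust_email"), ("billing", "emailAddress")),
]


def build_lookup(facts):
    """Map each (source, field) to a shared label (here, the first field name in its pair)."""
    lookup = {}
    for left, right in facts:
        label = left[1]
        lookup[left] = label
        lookup[right] = label
    return lookup


def relabel(source, rows, lookup):
    """Rewrite each row's field names using the lookup; unmapped fields keep their names."""
    return [
        {lookup.get((source, field), field): value for field, value in row.items()}
        for row in rows
    ]


lookup = build_lookup(same_as)
combined = relabel("crm", crm_rows, lookup) + relabel("billing", billing_rows, lookup)
print(combined)
```

The choice of label here is arbitrary. The design point is that the mapping is itself just another set of facts, so two people who disagree about the labels can each keep their own `same_as` list over the same unchanged sources.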
While it may be convenient, and backwards compatible, to speak of “data,” “schema,” “metadata,” etc., recognize that it’s all data, from a certain point of view.