Taming Big Data Is Not a Technology Issue by Bill Franks.
From the post:
One thing that has struck me recently is that most of the focus when discussing big data is upon the technologies involved. The consensus seems to be that the biggest challenge with big data is a technological one, yet I don’t believe this to be the case. Sure, there are challenges today for organizations using big data, but, I would like to submit to you that technology is not the biggest problem. In fact, technology may be one of the easiest problems to solve when it comes time to tame big data.
The fact is that there are tools and technologies out there that can handle virtually all of the big data needs of the vast majority of organizations. As of today, you can find products and solutions that do whatever you need to do with big data. Technology itself is not the problem.
Then, what are the issues? The real problems are with resource availability, skills, process change, politics, and culture. While the technologies to solve your problems may be out there just waiting for you to implement them, it isn’t quite that easy, is it? You have to get budget, you have to do an implementation, you have to get your people up to speed on how to use the tools, you have to get buy in from various stakeholders, and you have to push against a culture averse to change.
The technology is right there, but you are unable to effectively put it to work. It FEELS like a technology issue since technology is front and center. However, it is really the cultural, people, and political issues surrounding the technology that are the problem. Let me illustrate with an example.
A refreshing view of the drive to build technology to “solve” the big data problem.
Even once terabytes of data are accessible the moment they enter the data stream, ready for real-time, reactive analysis, with n-dimensional graphic representations as a matter of course, the “big data” problem will still be the “big data” problem.
The often-cited “volume, velocity, variety” characterization of “big data” describes surface issues that, in one manner or another, can be addressed using technology. Now.
A deeper, more persistent problem is that users expect their data, big or small, to have semantics, whether express or implied. That problem, along with the others cited by Franks, has no technological solution.
Because semantics originate with us and not with our machines.
By all means, we need to solve the technology issues around “big data,” but that only gives us a start towards working on the more difficult problems, problems that originate with us.
A much harder “programming” exercise. I suspect on Knuth’s scale of exercises, an 80 or 90.