Semantics: The Next Big Issue in Big Data

Semantics: The Next Big Issue in Big Data by Glen Fest.

From the post:

The use of semantics often is a way to evade the issue at hand (i.e., Bill Clinton’s parsed definition of “is”). But in David Saul’s world of bank compliance and regulation, it’s something that can help get right to the heart of the matter.

Saul, the chief scientist at State Street Corp. in Boston, views the technology of semantics—in which data is structured in ways that it can be shared easily between bank divisions, institutions and regulators—as a means to better understand and manage big-bank risk profiles.

“By bringing all of this data together with the semantic models, we’re going to be able to ask the questions you need to ask to prepare regulatory reporting,” as well as internal risk calculations, Saul promised at a recent forum held at the New York offices of SWIFT, the Society for Worldwide Interbank Financial Telecommunication. Saul’s championing of semantics technology was part of a wider-ranging panel discussion on the role of technology in helping banks meet the current and forthcoming compliance demands of global regulators. “That’s really what we’re doing: trying to pull risk information from a variety of different systems and platforms, written at different times by different people,” Saul says.

To bridge the underlying data, the Financial Industry Business Ontology (FIBO), a group that Saul participates in, is creating the common terms and data definitions that will put banks and regulators on the same semantic page.
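The idea behind a shared ontology can be sketched in miniature: if each legacy system maps its own local field names onto a common vocabulary, records from different silos become comparable without rewriting the source systems. A minimal sketch, with invented system names, field names, and vocabulary terms (FIBO's actual terms and structure differ):

```python
# Two hypothetical legacy systems record the same facts under
# incompatible field names.
trading_desk = {"cpty": "ACME Corp", "notional_usd": 5_000_000}
loan_system = {"counterparty_name": "ACME Corp", "exposure": 5_000_000}

# An invented common vocabulary mapping each system's local names
# onto shared terms (FIBO defines the real ones).
COMMON_TERMS = {
    "trading_desk": {"cpty": "counterparty", "notional_usd": "exposure_usd"},
    "loan_system": {"counterparty_name": "counterparty", "exposure": "exposure_usd"},
}

def to_common(system: str, record: dict) -> dict:
    """Translate a system-local record into the shared vocabulary."""
    mapping = COMMON_TERMS[system]
    return {mapping[k]: v for k, v in record.items()}

# Once both records speak the common vocabulary, exposures can be
# aggregated across silos.
a = to_common("trading_desk", trading_desk)
b = to_common("loan_system", loan_system)
total = a["exposure_usd"] + b["exposure_usd"]
```

Whether such mappings can be maintained across hundreds of legacy systems, and who reconciles the inevitable semantic mismatches, is exactly the question raised below.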

What’s ironic is that in the same post you find:

Semantics technology already is a proven concept as an underlying tool of the Web that requires common data formats for sites to link to one another, says Saul. At large global banks, common data infrastructure is still in most cases a work in progress, if it’s underway at all. Legacy departmental divisions have allowed different (and incompatible) data sets and systems to evolve internally, leaving banks with the heavy chore of accumulating and repurposing data for both compliance reporting and internal risk analysis.

The inability to automate or reuse data across silos is at the heart of banks’ big-data dilemma—or as Saul likes to call it, a “smart data” predicament.

I’m not really sure what having a “common data format” has to do with linking between data sets. Most websites use something close to HTML, but that doesn’t mean their data can be usefully linked together.

Not to mention the “legacy departmental divisions.” What is going to happen to them and their data?

How “semantics technology” is going to deal with different and incompatible data sets isn’t clear. Change all the recorded data retroactively? How far back?

If you have any contacts in the banking industry, tell them the FIBO proposal sounds like a bad plan.
