Archive for the ‘BI’ Category

Why BI Projects Fail

Thursday, May 15th, 2014

Top reasons your Business Intelligence (BI) project will fail by Andrew Bourne.

Reasons 1) Data models are complex, 2) Dirty data, and 5) Decision making errors from misinterpretation of information, all have topic-map-like elements in them.

Andrew outlines the issues here and promises to take up each one separately and cover “…what to do about them:”

OK, I’m game.

There does seem to be a trend towards explanations for why “big data” projects are failing. As we saw in The Shrinking Big Data MarketPlace, a survey by VoltDB found that a full 72% of the respondents could not access or utilize the majority of their data.

I don’t view such reports as “skeptical” about big data so much as realistic: everything necessary for a successful project of any kind (clear goals, hard work, good management) is also necessary for BI projects.

I will be following Andrew’s posts and will report back on where he comes down on issues relevant to topic maps.

I first saw this in a tweet by Gregory Piatetsky.

Business Information Key Resources

Friday, February 21st, 2014

Business Information Key Resources by Karen Blakeman.

From the post:

On one of my recent workshops I was asked if I used Google as my default search tool, especially when conducting business research. The short answer is “It depends”. The long answer is that it depends on the topic and type of information I am looking for. Yes, I do use Google a lot but if I need to make sure that I have covered as many sources as possible I also use Google alternatives such as Bing, Millionshort, Blekko etc. On the other hand and depending on the type of information I require I may ignore Google and its ilk altogether and go straight to one or more of the specialist websites and databases.

Here are just a few of the free and pay-per-view resources that I use.

Starting points for research are a matter of subject, cost, personal preference, recommendations from others, etc.

What are your favorite starting points for business information?


Wednesday, January 8th, 2014

BIIIG : Enabling Business Intelligence with Integrated Instance Graphs by André Petermann, Martin Junghanns, Robert Müller, Erhard Rahm.


We propose a new graph-based framework for business intelligence called BIIIG supporting the flexible evaluation of relationships between data instances. It builds on the broad availability of interconnected objects in existing business information systems. Our approach extracts such interconnected data from multiple sources and integrates them into an integrated instance graph. To support specific analytic goals, we extract subgraphs from this integrated instance graph representing executed business activities with all their data traces and involved master data. We provide an overview of the BIIIG approach and describe its main steps. We also present initial results from an evaluation with real ERP data.

Very interesting paper because on one hand it talks about merging data from heterogeneous data sets and at the same time claims to be using Neo4j.

In case you didn’t know, Neo4j enforces normalization and doesn’t have a concept of merging nodes. (True, Cypher has a “merge” operator but it doesn’t “merge” nodes in any meaningful sense of the word. Either a node is matched or a new node is created. Not how I interpret “merge.”)
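To make the match-or-create point concrete, here is a minimal sketch in plain Python. It is a toy in-memory model of the behavior described above, not the Neo4j API or driver; the function name `match_or_create` and the sample `Employee` data are hypothetical.

```python
# Toy in-memory model of match-or-create semantics; this is NOT the Neo4j API.
def match_or_create(nodes, label, props):
    """Return (node, created).

    A node matches when it carries the label and every requested
    property value; otherwise a new node is appended to the list.
    """
    for node in nodes:
        same_label = node["label"] == label
        same_props = all(node["props"].get(k) == v for k, v in props.items())
        if same_label and same_props:
            return node, False
    node = {"label": label, "props": dict(props)}
    nodes.append(node)
    return node, True


nodes = []
# First call finds nothing, so a node is created.
first, created_first = match_or_create(nodes, "Employee", {"number": 42})
# Same pattern again: the existing node is matched, nothing is created.
again, created_again = match_or_create(nodes, "Employee", {"number": 42})
# A richer pattern does not match the existing node, so a second node
# for the same employee number is created rather than "merged" into it.
other, created_other = match_or_create(
    nodes, "Employee", {"number": 42, "name": "Ada"}
)
```

Under this model the third call leaves two separate nodes for employee 42, which is why match-or-create is not merging in the sense of combining what is known about one subject.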

It took more than one read but in puzzling over:

For integrated objects we can merge the properties from the sources. For the example in Fig. 2, we can combine employees objects with CIT.employees.erp_empl_number = ERP.EmplyeeTable.number and merge their properties from both sources (name, degree, dob, address, phone).

I realized the authors were producing a series of graphs where only the final version of the graph has the “merged” nodes. If you notice, the nodes are created first and then populated with associations, which resolves the question of using different pointers from the original sources.
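The property merge in the quoted passage can be sketched roughly as follows. This is an illustration in plain Python under stated assumptions, not the authors' implementation: the helper `merge_on_key`, the sample records, and the rule that the first source wins on conflicting property names are all hypothetical. Only the key fields (`erp_empl_number`, `number`) and the property names come from the quote.

```python
def merge_on_key(left_rows, left_key, right_rows, right_key):
    """Combine records whose key values are equal, one merged dict per match.

    Properties from the left source win on conflict (an arbitrary choice
    made for this sketch). Unmatched left records are skipped.
    """
    right_index = {row[right_key]: row for row in right_rows}
    merged = []
    for left in left_rows:
        right = right_index.get(left[left_key])
        if right is None:
            continue
        merged.append({**right, **left})
    return merged


# Hypothetical sample data standing in for the two sources in the quote.
cit_employees = [{"erp_empl_number": 7, "name": "A. Example", "degree": "MSc"}]
erp_employees = [{"number": 7, "dob": "1980-01-01", "phone": "555-0100"}]

merged = merge_on_key(cit_employees, "erp_empl_number", erp_employees, "number")
```

The merged record keeps both original key fields, so the pointers from each source remain available on the combined object.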

The authors also point out that Neo4j cannot manage sets of graphs. I had overlooked that point. That is a fairly severe limitation.

Do spend some time at the Database Group Leipzig. There are several other recent papers that look very interesting.

Office 2013, Office 365 Editions and BI Features

Saturday, February 2nd, 2013

Office 2013, Office 365 Editions and BI Features by Chris Webb.

From the post:

By now you’re probably aware that Office 2013 is in the process of being officially released, and that Office 365 is a very hot topic. You’ve probably also read lots of blog posts by me and other writers talking about the cool new BI functionality in Office 2013 and Office 365. But which editions of Office 2013 and Office 365 include the BI functionality, and how does Office 365 match up to plain old non-subscription Office 2013 for BI? It’s surprisingly hard to find out the answers…

For regular, non-subscription, Office 2013 on the desktop you need Office Professional Plus to use the PowerPivot addin or to use Power View in Excel. However there’s an important distinction to make: the xVelocity engine is now natively integrated into Excel 2013, and this functionality is called the Excel Data Model and is available in all desktop editions of Excel. You only need the PowerPivot addin, and therefore Professional Plus, if you want to use the PowerPivot Window to modify and extend your model (for example by adding calculated columns or KPIs). So even if you’re not using Professional Plus you can still do some quite impressive BI stuff with PivotTables etc. On the server, the only edition of Sharepoint 2013 that has any BI functionality is Enterprise Edition; there’s no BI functionality in Foundation or Standard Editions.

No matter what OS you are running, you are likely to be using some version of MS Office and if you are reading this blog, probably for BI purposes.

Chris does a great job at pointing to resources and generating resources to guide you through the feature/license thicket that surrounds MS Office in its various incarnations.

Complex licensing/feature matrices contribute to the size of the budgets of the departments that create such complexity. They don’t contribute to the bottom line at Microsoft. There is a deep and profound difference.

BI’s Dirty Secrets – The Unfortunate Domination of Manually-Coded Extracts

Friday, March 9th, 2012

BI’s Dirty Secrets – The Unfortunate Domination of Manually-Coded Extracts by Rick Sherman.

From the post:

Manually-coded extracts are another dirty secret of the BI world. I’ve been seeing them for years, in both large and small companies. They grow haphazardly and are never documented, which practically guarantees that they will become an IT nightmare.

How have manually-coded extracts become so prevalent? It’s not as if there aren’t enough data integration tools around, including ETL tools. Even large enterprises that use the correct tools to load their enterprise data warehouses will often resort to manually-coded extracts to load their downstream BI data sources such as data marts, OLAP cubes, reporting databases and spreadsheets.

I thought the following passage was particularly good:

….Tools are easy; concepts are harder. Anyone can start coding; it’s a lot harder to actually architect and design. Tool vendors don’t help this situation when they promote tools that “solve world hunger” and limit training to the tool, not any concepts.

I don’t see manual coding as a problem, so long as it is documented. There should be one and only one penalty for lack of documentation. Termination.

Lack of documentation can put critical IT systems at risk, and it doesn’t take complex systems to produce documentation. Even (gasp) MS Word documents that are maintained with a table of contents and indexes can be adequate documentation.

Not the same as a bug database with bug reports, patches, pointers to code, email discussions, meeting minutes, etc., but interactive production of graphs and charts isn’t a requirement for successful documentation.

Undocumented manually-coded extracts are a sign that the requirements of BI users are not being met. Getting those extracts documented and incorporated into BI tools looks like a good first step toward fixing this dirty secret.

One Mashboard to Rule Them All

Wednesday, April 13th, 2011

One Mashboard to Rule Them All

From the announcement:

Webinar Overview: We’ll be showcasing real-world examples of Jaspersoft dashboards, adding to your already extensive technical knowledge. Dashboards, with their instant answers for executives and business users, and mashboards, ideal for integrating multiple data sources for improved organizational decision-making are among the most frequently requested BI deliverables. Join us for everything you wanted to know about Jaspersoft Platforms.

April 20, 2011 1:00 pm, Eastern Daylight Time (New York, GMT-04:00)
April 20, 2011 10:00 am, Pacific Daylight Time (San Francisco, GMT-07:00)
April 20, 2011 6:00 pm, Western European Summer Time (London, GMT+01:00)

There is an open source side to Jaspersoft.

Stats from the site:

206224 members
163 today
1707 last 7 days
6643 last 30 days
255 public projects
182 private projects
85193 forum entries

A community where I would like to pose the question: “How do you re-use a mashup created by someone else?”

And given that it has an open source side, a place to pose topic maps as an answer.

Pentaho BI Suite Enterprise Edition (TM/SW Are You Listening?)

Thursday, March 10th, 2011

Pentaho BI Suite Enterprise Edition

From the website:

Pentaho is the open source business intelligence leader. Thousands of organizations globally depend on Pentaho to make faster and better business decisions that positively impact their bottom lines. Download the Pentaho BI Suite today if you want to speed your BI development, deploy on-premise or in the cloud or cut BI licensing costs by up to 90%.

There are several open source offerings like this; Talend is another one that comes to mind.

I haven’t looked at its data integration in detail but suspect I know the answer to the question:

Say I have an integration of some BI assets using Pentaho and other BI assets integrated using Talend, how do I integrate those together while maintaining the separately integrated BI assets?

Or for that matter, how do I integrate BI that has been gathered and integrated by others, say Lexis/Nexis?

Interesting, too, to note that this is the sort of user slickness and ease that topic maps and (cough) linked data (see, I knew I could say it) face in the marketplace.

Does it offer all the bells and whistles of more sophisticated subject identity or reasoning approaches?

No, but if it offers all that users are interested in using, what is your complaint?

Both topic maps and semantic web/linked data approaches need to listen more closely to what users want.

As opposed to deciding what users need.

And delivering the latter instead of the former.