Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

September 7, 2012

HTML5: Render urban population growth on a 3D world globe with Three.js and canvas

Filed under: HTML5,Maps,Marketing,Three.js — Patrick Durusau @ 2:47 pm

HTML5: Render urban population growth on a 3D world globe with Three.js and canvas By jos.dirksen.

From the post:

In this article I’ll once again look at data / geo visualization with Three.js. This time I’ll show you how you can plot the urban population growth over the years 1950 to 2050 on a 3D globe using Three.js. The resulting visualization animates the growth of the world’s largest cities on a rotating 3D world. The result we’re aiming for looks like this (for a working example look here.):
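The heart of such a plot is a latitude/longitude to Cartesian conversion. Here is a minimal sketch of that math in Python (the original post uses Three.js and JavaScript; the city record below is hypothetical):

    import math

    def latlon_to_xyz(lat_deg, lon_deg, radius=1.0):
        """Convert latitude/longitude in degrees to Cartesian coordinates
        on a sphere of the given radius (y axis through the north pole,
        the convention commonly used in Three.js scenes)."""
        lat = math.radians(lat_deg)
        lon = math.radians(lon_deg)
        x = radius * math.cos(lat) * math.cos(lon)
        y = radius * math.sin(lat)
        z = -radius * math.cos(lat) * math.sin(lon)
        return (x, y, z)

    # Hypothetical city record: (name, latitude, longitude, population in millions).
    # A bar would be drawn at this point on the globe, its height scaled by population.
    tokyo = ("Tokyo", 35.68, 139.69, 37.0)
    x, y, z = latlon_to_xyz(tokyo[1], tokyo[2])
    print(tokyo[0], round(x, 3), round(y, 3), round(z, 3))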

Possible contender for the topic map graphic? A 3D globe?

If you think of topic maps as representing a user's world view?

Perhaps, perhaps, but then you will need a flat earth version for some users as well. 😉

Images for your next Hadoop and Big Data presentation [Topic Map Images?]

Filed under: BigData,Hadoop,Marketing — Patrick Durusau @ 2:17 pm

Images for your next Hadoop and Big Data presentation

Images that will help with your next Hadoop/Big Data presentation.

Question: What images will you use for your next topic map presentation?

Possibles:

Raptor

A bit too tame for my tastes. And it doesn’t say “map” to me. You?

Adam - Sistine Chapel

Hmmm, presumptuous, don’t you think? Plus it lacks that “map” quality as well.

Barta

It claims to be a map, of sorts. But scaring potential customers isn’t a good strategy.

Dante - Inferno

Will be familiar soon enough. Not sure anyone wants a reminder.

Suggestions?

September 3, 2012

Sell-an-Elephant-to-your-Boss-HOWTO

Filed under: Design,Marketing — Patrick Durusau @ 7:12 pm

Sell-an-Elephant-to-your-Boss-HOWTO by Aurimas Mikalauskas.

From the post:

Spoiler alert: If your boss does not need an elephant, he is definitely NOT going to buy one from you. If he will, he will regret it and eventually you will too.

I must apologize to the reader who was expecting to find advice on selling useless goods to his boss. While I do use a similar technique to get a quarterly raise (no, I don’t), this article is actually about convincing your team, your manager or anyone else who has influence over the project’s priorities that pending system performance optimizations are a priority (assuming they indeed are). However, this headline was not very catchy and way too long, so I decided to go with the elephant instead.

System performance optimization is what I do day to day here at Percona. Looking back at the duration of an optimization project, I find that with bigger companies (bigger here means it’s not a one-man show) it’s not the identification of performance problems that takes most of the time. Nor is it looking for the right solution. The biggest bottleneck in an optimization project is getting the solution approved and prioritized appropriately inside the company that came for performance optimization in the first place. Sometimes I would follow up with the customer after a few weeks or a month just to find that nothing was done to implement the suggested changes. When I would ask why, most of the time the answer was something along these lines: my manager didn’t schedule/approve it yet.

I don’t want to say that all performance improvements are a priority and should be done right away, not at all. I want to suggest that you can check if optimizations at hand should be prioritized and if so – how to make it happen if you’re not the one who sets priorities.

Steps to follow:

  1. Estimate harm being done
  2. Estimate the cost of the solution
  3. Make it a short and clear statement
  4. Show the method
  5. The main problem
  6. The solution
  7. Overcome any obstacles
  8. Kick it off

I like number one (1) in particular.

If your client doesn’t feel a need, no amount of selling is going to make a project happen.

All steps to follow in any IT/semantic project.

August 28, 2012

Misconceptions holding back use of data integration tools [Selling tools or data integration?]

Filed under: Data Integration,Marketing — Patrick Durusau @ 1:23 pm

Misconceptions holding back use of data integration tools by Rick Sherman.

From the post:

There’s no question that data integration technology is a good thing. So why aren’t businesses using it as much as they should be?

Data integration software has evolved significantly from the days when it primarily consisted of extract, transform and load (ETL) tools. The technologies available now can automate the process of integrating data from source systems around the world in real time if that’s what companies want. Data integration tools can also increase IT productivity and make it easier to incorporate new data sources into data warehouses and business intelligence (BI) systems for users to analyze.

But despite tremendous gains in the capabilities and performance of data integration tools, as well as expanded offerings in the marketplace, many of the data integration projects in corporate enterprises are still being done through manual coding methods that are inefficient and often not documented. As a result, most companies haven’t gained the productivity and code-reuse benefits that automated data integration processes offer. Instead, they’re deluged with an ever-expanding backlog of data integration work, including the need to continually update and fix older, manually coded integration programs.

Rick’s first sentence captures the problem with promoting data integration:

“There’s no question that data integration technology is a good thing.”

Hypothetical survey of Fortune 1,000 CEOs:

  Question                                       Agree      Disagree
  Data integration may be a good thing           100%       0%
  Data integration technology is a good thing    0.001%     99.999%

Data integration may be a good thing. Depends on what goal or mission is furthered by data integration.

Data integration, by hand, manual coding or data mining, isn’t an end unto itself. Only a means to an end.

Specific data integration, tied to a mission or goal of an organization, has a value to be evaluated against the cost of the tool or service.

Otherwise, we are selling tools of no particular value for some unknown purpose.

Sounds like a misconception of the sales process to me.

August 26, 2012

Designing a Better Sales Pipeline Dashboard

Filed under: Interface Research/Design,Marketing — Patrick Durusau @ 10:50 am

Designing a Better Sales Pipeline Dashboard by Zach Gemignani

From the post:

What would your perfect sales pipeline dashboard look like?

The tools that so effectively capture sales information (Salesforce, PipelineDeals, Highrise) tend to do a pretty lousy job of providing visibility into that very same data. The reporting or analytics is often just a table with lots of filtering features. That doesn’t begin to answer important questions like:

  • What is the value of the pipeline?
  • Where is it performing efficiently? Where is it failing?
  • How are things likely to change in the next month?

I’ve been annoyed by this deficiency in sales dashboards for a while. Ken and I put together some thoughts about what a better sales pipeline information interface would look like and how it would function. Here’s what we came up with:

A sales dashboard that at least two people like better than most offerings.

What would you add to this dashboard that topic maps would be able to supply?

Yes, I am divorcing the notion of “interface” from “topic map.”

Interface being how a user accomplishes a task or accesses information.

Completely orthogonal to the underlying technology.

Exposing the underlying technology demonstrates how clever we are.

Is not succeeding in the marketplace clever?*


*Ask yourself how many MS Office users can even stumble through a “big block” diagram of how MS Word works?

Compare that number to the number of MS Word users. Express as:

“MS Word users/MS Word users who understand the technology.”

That’s my target ratio for:

“topic map users/topic map users who understand the technology.”

August 25, 2012

Avoiding Public Confessions of Ignorance

Filed under: Government,Marketing,Topic Maps — Patrick Durusau @ 2:15 pm

I saw White House Follows No 10 To Github-first open source development, with text that reads in part:

Yesterday the White House got some justifiable praise for open sourcing its online petitioning platform, We The People, using a Github repository. In a blog post Macon Philips, Director of Digital Strategy, said:

“Now anybody, from other countries to the smallest organizations to civic hackers can take this code and put to their own use.

One of the most exciting prospects of open sourcing We the People is getting feedback, ideas and code contributions from the public. There is so much that can be done to improve this system, and we only benefit by being able to more easily collaborate with designers and engineers around the country – and the world.”

If you don’t know the details of the U.S. government and open source, see: Open Source in the U.S. Government.

History is “out there,” and not all that hard to find.

Can topic maps help government officials avoid public confessions of ignorance?

August 22, 2012

A bit “lite” today

Filed under: Marketing — Patrick Durusau @ 6:48 pm

Apologies but postings are a bit “lite” today.

Probably late tomorrow but stay tuned for one of the reasons today has been a “lite” day.

It’s not a bad reason and I think you will like the outcome, or at least find it useful.

August 20, 2012

Topic Map Based Publishing

Filed under: Marketing,Publishing,Topic Map Software,Topic Maps — Patrick Durusau @ 10:21 am

After asking for ideas on publishing cheat sheets this morning, I have one to offer as well.

One problem with traditional cheat sheets is knowing what any particular user wants in a cheat sheet.

Another problem is how to expand the content of a cheat sheet.

And what if you want to sell the content? How does that work?

I don’t have a working version (yet) but here is my thinking on how topic maps could power a “cheat sheet” that meets all those requirements.

Solving the problem of what content to include seems critical to me. It is the make or break point in terms of attracting paying customers for a cheat sheet.

Content of no interest is as deadly as poor quality content. Either way, paying customers will vote with their feet.

The first step is to allow customers to “build” their own cheat sheet from some list of content. In topic map terminology, they specify an association between themselves and a set of topics to appear in “their” cheat sheet.

Most of the cheat sheets that I have seen (and printed out more than a few) are static artifacts. WYSIWYG artifacts. What there is and there ain’t no more.

Works for some things but what if what you need to know lies just beyond the edge of the cheat sheet? That’s the bad thing about static artifacts, they have edges.

Once customers can build their own cheat sheet, the only limits to a topic map based cheat sheet are those imposed by lack of payment or interest. 😉

You may not need troff syntax examples on a daily basis but there are times when they could come in quite handy. (Don’t laugh. Liam Quin got hired on the basis of the troff typesetting of his resume.)

The second step is to have a cheat sheet that can expand or contract based on the immediate needs of the user. Sometimes more or less content, depending on their need. Think of an expandable “nutshell” reference.

A WYWIWYG (What You Want Is What You Get) approach as opposed to WWWTSYIWYG (What We Want To Sell You Is What You Get) (any publishers come to mind?).

What’s more important? Your needs or the needs of your publisher?

Finally, how to “sell” the content? The value-add?

Here’s one model: The user buys a version of the cheat sheet, which has embedded links to additional content. Links that, when the user authenticates to a server, are treated as subject identifiers. Subject identifiers that cause merging to occur with topics on the server and deliver additional content. Each user subject identifier can be auto-generated on purchase and so is uniquely tied to a particular login.

The user can freely distribute the version of the cheat sheet they purchased, free advertising for you. But the additional content requires a separate purchase by the new user.
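A rough sketch of what the server side of that scheme might look like, in Python. The identifiers, content and function names are all hypothetical; a real version would sit behind authentication and a proper topic map engine:

    import uuid

    # Server-side store: subject identifier -> additional content.
    # In topic map terms, a purchased identifier merges with a topic on the
    # server and the merged topic carries the extra occurrences.
    premium_content = {}

    def issue_cheat_sheet(base_topics):
        """Generate per-purchase subject identifiers, one per embedded link."""
        links = {}
        for topic in base_topics:
            sid = "http://example.com/sid/" + str(uuid.uuid4())
            links[sid] = topic
            premium_content[sid] = "Extended notes for " + topic
        return links  # embedded into this buyer's copy of the cheat sheet

    def resolve(sid, authenticated):
        """When an authenticated user follows a link, merge and deliver."""
        if authenticated and sid in premium_content:
            return premium_content[sid]
        return "Additional content requires a separate purchase."

    links = issue_cheat_sheet(["troff macros", "tbl syntax"])
    some_sid = next(iter(links))
    print(resolve(some_sid, authenticated=True))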

What blind alleys, pot holes and other hazards/dangers am I failing to account for in this scenario?

Erlang Cheat Sheet [And Cheat Sheets in General]

Filed under: Erlang,Marketing — Patrick Durusau @ 8:07 am

Erlang Cheat Sheet

Fairly short (read limited) cheat sheet on Erlang. Found at: http://www.cheatography.com/

The site has a number of cheat sheets and is in the process of creating a cheat sheet template.

Questions that come to mind:

  • Using a topic map to support a cheat sheet, what more would you expect to see? Links to fuller examples? Links to manuals? Links to sub-cheat sheets?
  • Have you seen any ontology cheat sheets? For coding consistency, that sounds like something that could be quite handy.
  • For existing ontologies, any research on frequency of use to support the creation of cheat sheets? (Would not waste space on “thing” for example. Too unlikely to bear mentioning.)

August 7, 2012

Catering to the long tail? (business opportunity)

Filed under: Game Theory,Games,Marketing — Patrick Durusau @ 12:16 pm

I was struck by a line in Lattice games and the Economics of aggregators, by P. Jordan, U. Nadav, K. Punera, A. Skrzypacz, and G. Varghese, that reads:

A vendor that can provide good tools to reduce the cost of doing business F is likely to open the floodgates for new small aggregators to cater to the long tail of user interests — and reap a rich reward in doing so.

You see? Struggling through all the game theory parts of the paper was worth your time!

A topic map application that enables small aggregators to select/re-purpose/re-brand content for their “long tail of user interests” could be such an application.

Each aggregator could have their “view/terminology/etc.” both as a filter for content delivered as well as how it appears to their users.

Not long tails, but think of the recent shooting incident in Aurora.

A topic map application could deliver content to gun control aggregators, with facts about the story that support new gun control laws, petitions and other activities.

At the same time, the same topic map application could deliver to NRA aggregators the closest gun stores and hours for people who take such incidents as a reason to more fully arm themselves.

Same content, just repurposed on demand for different aggregators.
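A minimal sketch of that "same content, different views" idea in Python. The aggregators, tags and content items are invented for illustration:

    # One shared pool of content items, each tagged with the subjects it covers.
    content = [
        {"title": "Incident timeline", "tags": {"incident", "timeline"}},
        {"title": "Nearby services and hours", "tags": {"incident", "local-services"}},
        {"title": "Pending legislation summary", "tags": {"incident", "policy"}},
    ]

    # Each aggregator's "view": the subjects it wants and the branding it uses.
    views = {
        "policy-aggregator": {"wants": {"policy", "timeline"}, "label": "Policy Watch"},
        "local-aggregator": {"wants": {"local-services"}, "label": "Near You"},
    }

    def deliver(view_name):
        """Filter and re-brand the same content pool for one aggregator."""
        view = views[view_name]
        picked = [item["title"] for item in content if item["tags"] & view["wants"]]
        return {"branded_as": view["label"], "items": picked}

    print(deliver("policy-aggregator"))
    print(deliver("local-aggregator"))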

True, any relatively sophisticated user can set up their own search/aggregation service, but that’s the trick, isn’t it? Any “relatively sophisticated user.”

Thinking not so much of a “saved search” or “alert” (dumpster diving is only so productive and it is tiring), but of curated and complex searches that users can select for inclusion. So they are getting the “best” searches composed by experts.

I am sure there are other options and possibilities for delivery of both services and content. Topic maps should score high for either one.

PS: Slides from Stanford RAIN Seminar

August 1, 2012

Semantic Silver Bullets?

Filed under: Information Sharing,Marketing,Semantics — Patrick Durusau @ 1:46 pm

The danger of believing in silver bullets

Nick Wakeman writes in the Washington Technology Business Beat:

Whether it is losing weight, getting rich or managing government IT, it seems we can’t resist the lure of a silver bullet. The magic pill. The easy answer.

Ten or 12 years ago, I remember a lot of talk about leasing and reverse auctions, and how they were going to transform everything.

Since then, outsourcing and insourcing have risen and fallen from favor. Performance-based contracting was going to be the solution to everything. And what about the huge systems integration projects like Deepwater?

They start with a bang and end with a whimper, or in some cases, a moan and a whine. And of course, along the way, millions and even billions of dollars get wasted.

I think we are in the midst of another silver bullet phenomenon with all the talk around cloud computing and everything as a service.

I wish I could say that topic maps are a semantic silver bullet. Or better yet, a semantic hand grenade. One that blows other semantic approaches away.

Truthfully, topic maps are neither one.

Topic maps rely upon users, assisted by various technologies, to declare and identify subjects they want to talk about and, just as importantly, relationships between those subjects. Not to mention where information about those subjects can be found.
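That is the whole topic map bargain in one paragraph: subjects, relationships between subjects, and pointers to where more information lives. A minimal sketch of that structure in Python (the classes and example values are mine, not any particular topic map engine's):

    from dataclasses import dataclass, field

    @dataclass
    class Topic:
        # How the subject is identified (subject identifiers, e.g. URIs).
        identifiers: set = field(default_factory=set)
        # What the subject is called.
        names: set = field(default_factory=set)
        # Where information about the subject can be found.
        occurrences: list = field(default_factory=list)

    @dataclass
    class Association:
        # A typed relationship between subjects.
        type_: str
        members: tuple

    # Hypothetical example values.
    author = Topic(identifiers={"http://example.com/id/p-durusau"},
                   names={"Patrick Durusau"},
                   occurrences=["http://example.com/blog/about"])
    blog = Topic(identifiers={"http://example.com/id/another-word-for-it"},
                 names={"Another Word For It"})
    maintains = Association(type_="maintains", members=(author, blog))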

If you need evidence of the difficulty of those tasks, consider the near idiotic results you get from search engines. Considering the task, they do pretty well, but pretty well still takes time and effort to sort out every time you search.

Topic maps aren’t easy, no silver bullet, but you can capture subjects of interest to you, define their relationships to other subjects and specify where more information can be found.

Once captured, that information can be shared, used and/or merged with information gathered by others.

Bottom line is that better semantic results, for sharing, for discovery, for navigation, all require hard work.

Are you ready?

July 23, 2012

Aurora – Illegal Weapons [Big Data to Small Data]

Filed under: BigData,Marketing,Security — Patrick Durusau @ 1:53 pm

Tuan C. Nguyen writes in Inside the secret online marketplace for illegal weapons that:

With just a few clicks, anyone with an internet connection can obtain some of the deadliest weapons known to man, an investigation by tech blog Gizmodo has revealed.

These include AK-47s, Bushmaster military rifles and even grenades — all of which can be sold, bought, sent and delivered on the Armory, a hidden website that functions as an online black market for illegal firearms. It’s there that Gizmodo writer Sam Biddle, who went undercover as an anonymous buyer, discovered a transaction process that uses an elaborate scheme that involves identity-concealing data encryption, an alternative electronic currency and a delivery method that allows both buyers and sellers to bypass the authorities without raising even the hint of suspicion.

Concerns over the ease of obtaining guns and other lethal weapons have gripped the nation in the aftermath of one of the deadliest massacres in recent memory when a heavily-armed lone gunman killed 12 people and injured 58 during a midnight movie screening just outside Denver. Shortly after, a paper trail revealed that the suspect built his arsenal through purchases made via a host of unregulated web sites, the Associated Press reports. The existence of such portals is alarming in that not only can they arm a single deranged individual with enough ballistics to carry out a massacre, but also supply a group of terrorist rebels with enough artillery to lay siege to embassies and government offices, according to the report.

The post goes on to make much of the use of TOR (The Onion Router), which was developed by the U.S. Navy.

The TOR site relates in its overview:

Using Tor protects you against a common form of Internet surveillance known as “traffic analysis.” Traffic analysis can be used to infer who is talking to whom over a public network. Knowing the source and destination of your Internet traffic allows others to track your behavior and interests. This can impact your checkbook if, for example, an e-commerce site uses price discrimination based on your country or institution of origin. It can even threaten your job and physical safety by revealing who and where you are. For example, if you’re travelling abroad and you connect to your employer’s computers to check or send mail, you can inadvertently reveal your national origin and professional affiliation to anyone observing the network, even if the connection is encrypted.

I recommend that you take a look at the TOR site and its documentation. Quite a clever piece of work.

Tuan sees this in part as a “big data” problem. Sure, given all the network traffic that is being exchanged at one time, TOR can easily defeat any “traffic analysis” process. (Or at least let’s take that as a given for purposes of this discussion. Users are assuming there are no “backdoors” built into the encryption, but that’s another story.)

What if we look at this as a “big data” being reduced to “small data” problem?

Assume local law enforcement has access to the local Internet “connection.” (It is more complicated than this but I am trying to illustrate something, not write a manual for it.)

My first step is to filter encrypted traffic from non-encrypted traffic, passing my current location. Since locations are fed by routers, I can just walk the chain of routers, filtering non-encrypted traffic as I go. I don’t have to worry about the content or even tracking the IP addresses of the sender. Eventually I have tracked the senders of encrypted messages down to the nearest router to the origin of the traffic.

My second step is to start using a topic map to combine other information known to the local police about an area and its residents. A person or group ordering heavy weapons, explosives, etc., is going to have other “tells” besides encrypted Internet traffic.

A topic map can help combine all those “tells” into a map of probable locations and actors, using a variety of information sources, TOR or other technologies not withstanding.

Rather than a “big data,” you now have a “small data” problem and one that can be addressed by the local police.

July 20, 2012

[It] knows if you’ve been bad or good so be good for [your own sake]

Filed under: Marketing,Microsoft — Patrick Durusau @ 1:49 pm

I had to re-write a line from “Santa Claus Is Coming to Town” just a bit to fit the story about SkyDrive I read today: Watch what you store on SkyDrive–you may lose your Microsoft life.

I don’t find the terms of service surprising. Everybody has to say that sort of thing to avoid liability in case you store, transfer, etc., something illegal using their service.

The rules used to require notice and refusal to remove content before you had any liability.

Has that changed?

Curious for a number of reasons, not the least of which is providing topic map data products and topic map appliances online.

Data Jujitsu: The art of turning data into product

Filed under: Data,Marketing,Topic Maps — Patrick Durusau @ 11:00 am

Data Jujitsu: The art of turning data into product: Smart data scientists can make big problems small by DJ Patil.

From the post:

Having worked in academia, government and industry, I’ve had a unique opportunity to build products in each sector. Much of this product development has been around building data products. Just as methods for general product development have steadily improved, so have the ideas for developing data products. Thanks to large investments in the general area of data science, many major innovations (e.g., Hadoop, Voldemort, Cassandra, HBase, Pig, Hive, etc.) have made data products easier to build. Nonetheless, data products are unique in that they are often extremely difficult, and seemingly intractable for small teams with limited funds. Yet, they get solved every day.

How? Are the people who solve them superhuman data scientists who can come up with better ideas in five minutes than most people can in a lifetime? Are they magicians of applied math who can cobble together millions of lines of code for high-performance machine learning in a few hours? No. Many of them are incredibly smart, but meeting big problems head-on usually isn’t the winning approach. There’s a method to solving data problems that avoids the big, heavyweight solution, and instead, concentrates on building something quickly and iterating. Smart data scientists don’t just solve big, hard problems; they also have an instinct for making big problems small.

We call this Data Jujitsu: the art of using multiple data elements in clever ways to solve iterative problems that, when combined, solve a data problem that might otherwise be intractable. It’s related to Wikipedia’s definition of the ancient martial art of jujitsu: “the art or technique of manipulating the opponent’s force against himself rather than confronting it with one’s own force.”

How do we apply this idea to data? What is a data problem’s “weight,” and how do we use that weight against itself? These are the questions that we’ll work through in the subsequent sections.

To start, for me, a good definition of a data product is a product that facilitates an end goal through the use of data. It’s tempting to think of a data product purely as a data problem. After all, there’s nothing more fun than throwing a lot of technical expertise and fancy algorithmic work at a difficult problem. That’s what we’ve been trained to do; it’s why we got into this game in the first place. But in my experience, meeting the problem head-on is a recipe for disaster. Building a great data product is extremely challenging, and the problem will always become more complex, perhaps intractable, as you try to solve it.

Before investing in a big effort, you need to answer one simple question: Does anyone want or need your product? If no one wants the product, all the analytical work you throw at it will be wasted. So, start with something simple that lets you determine whether there are any customers. To do that, you’ll have to take some clever shortcuts to get your product off the ground. Sometimes, these shortcuts will survive into the finished version because they represent some fundamentally good ideas that you might not have seen otherwise; sometimes, they’ll be replaced by more complex analytic techniques. In any case, the fundamental idea is that you shouldn’t solve the whole problem at once. Solve a simple piece that shows you whether there’s an interest. It doesn’t have to be a great solution; it just has to be good enough to let you know whether it’s worth going further (e.g., a minimum viable product).

Here’s the question to ask for an open source topic map project:

Does anyone want or need your product?

Ouch!

A few of us, not enough to make a small market, like to have topic maps as interesting computational artifacts.

For a more viable (read larger) market, we need to sell data products topic maps can deliver.

How we create or deliver that product, hypergraphs, elves chained to desks, quantum computers or even magic, doesn’t matter to any sane end user.

What matters is the utility of the data product for some particular need or task.

No, I don’t know what data product to suggest. If I did, it would have been the first thing I would have said.

Suggestions?

PS: Read DJ’s post in full. Every other day or so until you have a successful, topic map based, data product.

July 12, 2012

LSU Researchers Create Topic Map of Oil Spill Disaster

Filed under: Marketing,Topic Map Software,Topic Maps — Patrick Durusau @ 6:53 pm

LSU Researchers Create Topic Map of Oil Spill Disaster

From the post:

The Gulf of Mexico Deepwater Horizon Oil Spill incident has impacted many aspects of the coastal environment and inhabitants of surrounding states. However, government officials, Gulf-based researchers, journalists and members of the general public who want a big picture of the impact on local ecosystems and communities are currently limited by discipline-specific and fractured information on the various aspects of the incident and its impacts.

To solve this problem, Assistant Professor in the School of Library and Information Science Yejun Wu is leading the way in information convergence on oil spill events. Wu’s lab has created a first edition of an online topic map, available at http://topicmap.lsu.edu/, that brings together information from a wide range of research fields including biological science, chemistry, coastal and environmental science, engineering, political science, mass communication studies and many other disciplines in order to promote collaboration and big picture understanding of technological disasters.

“Researchers, journalists, politicians and even school teachers wanted to know the impacts of the Deepwater Horizon oil spill incident,” Wu said. “I felt this was an opportunity to develop a tool for supporting learning and knowledge discovery. Our topic map tool can help people learn from historical events to better prepare for the future.”

Wu started the project with a firm belief in the need for an oil spill information hub.

“There is a whole list of historical oil spill events that we probably neglected – we did not learn enough from history,” Wu said.

He first looked to domain experts from various disciplines to share their own views of the impacts of the Deepwater Horizon oil spill. From there, Wu and his research associate and graduate students manually collected more than 7,000 concepts and 4,000 concept associations related to oil spill incidents worldwide from peer-reviewed journal articles and authoritative government websites, loading the information into an organizational topic map software program. Prior to these efforts by Wu’s lab, no comprehensive oil spill topic map or taxonomy existed.

“Domain experts typically focus on oil spill research in their own area, such as chemistry or political communication, but an oil spill is a comprehensive problem, and studies should be interdisciplinary,” Wu said. “Experts in different fields that usually don’t talk to each other can benefit from a tool that brings together and organizes information concepts across many disciplines.”

Wikipedia calls it: Deepwater Horizon oil spill. I think BP Oil Spill is a better name.

Just thinking of environmental disasters, which ones would you suggest for topic maps?

July 9, 2012

UDL Guidelines – Version 2.0: Principle I. Provide Multiple Means of Representation

Filed under: Marketing,Topic Maps — Patrick Durusau @ 2:22 pm

UDL Guidelines – Version 2.0: Principle I. Provide Multiple Means of Representation

From the webpage:

Learners differ in the ways that they perceive and comprehend information that is presented to them. For example, those with sensory disabilities (e.g., blindness or deafness); learning disabilities (e.g., dyslexia); language or cultural differences, and so forth may all require different ways of approaching content. Others may simply grasp information quicker or more efficiently through visual or auditory means rather than printed text. Also learning, and transfer of learning, occurs when multiple representations are used, because it allows students to make connections within, as well as between, concepts. In short, there is not one means of representation that will be optimal for all learners; providing options for representation is essential.

From the Universal Design for Learning (UDL) Center.

Have you ever noticed how people keep running across topic map issues? Different domains, different ways of talking about the problems, but bottom line, it comes down to different ways to identify the same subjects.

When they create solutions, they don’t always remember that containers in their solutions are subjects too, subjects that may be identified differently by others. We create information silos, useful in their own domains, but unless treated as subjects, they are hard to share across domains.

Hard to share because, without a map between identifications, we can’t tell which container goes with which other container, or which subject with which subject.

We need to agree that we each keep our identifications and use maps from one container/subject to the other.

So we benefit from each other instead of ignoring the riches gathered by others.

The UDL makes multiple modes of access (what we call subject mapping in topic maps) its Principle 1!

Makes sense. You want educational content to be re-used by many learners.

Now to explore how they realize Principle 1 in action. Hoping to start a conversation where topic maps will come up.

July 7, 2012

Subverting Ossified Departments [Moving beyond name calling]

Filed under: Analytics,Business Intelligence,Marketing,Topic Maps — Patrick Durusau @ 10:21 am

Brian Sommer has written on why analytics will not lead to new revenue streams, improved customer service, better stock options or other signs of salvation:

The Ossified Organization Won’t ‘Get’ Analytics (part 1 of 3)

How Tough Will Analytics Be in Ossified Firms? (Part 2 of 3)

Analytics and the Nimble Organization (part 3 of 3)

Why most firms won’t profit from analytics:

… Every day, companies already get thousands of ideas for new products, process innovations, customer interaction improvements, etc. and they fail to act on them. The rationale for this lack of movement can be:

– That’s not the way we do things here

– It’s a good idea but it’s just not us

– It’s too big of an idea

– It will be too disruptive

– We’d have to change so many things

– I don’t know who would be responsible for such a change

And, of course,

– It’s not my job

So if companies don’t act on the numerous, free suggestions from current customers and suppliers, why are they so deluded into thinking that IT-generated, analytic insights will actually fare better? They’re kidding themselves.

[part 1]

What Brian describes in amusing and great detail are all failures that no amount of IT, analytics or otherwise, can address. Not a technology problem. Not even an organization (as in form) issue.

It is a personnel issue. You can either retrain (which I find unlikely to succeed) or you can get new personnel. It really is that simple. And with a glutted IT market, now would be the time to recruit an IT department not wedded to current practices. But you would need to do the same in accounting, marketing, management, etc.

But calling a department “ossified” is just name calling. You have to move beyond name calling to establish a bottom line reason for change.

Assuming you have access, topic maps can help you integrate data across departments that don’t usually interchange data. So you can make the case for particular changes in terms of bottom line expenses.

Here is a true story with the names omitted and the context changed a bit:

Assume you are a publisher of journals, with both institutional and personal subscriptions. One of the things that all periodical publishers have to address is claims for “missing” issues. It happens, mail room mistakes, postal system errors, simply lost in transit, etc. Subscribers send in claims for those missing issues.

Some publishers maintain records of all subscriptions, including any correspondence, which are consulted by some full-time staffer who answers all “claim” requests. One argument being there is a moral obligation to make sure non-subscribers don’t get an issue to which they are not entitled. Seriously, I have heard that argument made.

Analytics and topic maps could combine the subscription records with claim records and expenses for running the claims operation to show the expense of detailed claim service. Versus the cost of having the mail room toss another copy back to the requester. (Our printing cost was $3.00/copy so the math wasn’t the hard part.)
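A back-of-the-envelope version of that comparison, in Python. Only the $3.00/copy printing cost comes from the story; the claim volume, staffing cost and refusal rate are placeholders:

    # Compare the two claims policies with placeholder numbers.
    claims_per_year = 2_000      # hypothetical volume of "missing issue" claims
    cost_per_copy = 3.00         # printing cost per replacement copy (from the post)
    claims_staff_cost = 45_000   # hypothetical loaded cost of a full-time claims staffer
    refusal_rate = 0.20          # hypothetical share of claims the staffer refuses

    just_resend = claims_per_year * cost_per_copy
    investigate_first = claims_staff_cost + claims_per_year * (1 - refusal_rate) * cost_per_copy

    print("Toss another copy in the mail: $%10.2f" % just_resend)
    print("Investigate every claim:       $%10.2f" % investigate_first)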

Topic maps help integrate the data you “obtain” from other departments. Just enough to make your point. You don’t have to integrate all the data, just enough to win the argument. Until the next argument comes along and you take a slightly bigger bite of the apple.

Agile organizations are run by people agile enough to take control of them.

You can wait for permission from an ossified organization or you can use topic maps to take the first “bite.”

Your move.

PS: If you have investments in journal publishing you might want to check on claims handling.

July 2, 2012

Bisociative Knowledge Discovery

Filed under: Bisociative,Graphs,Knowledge Discovery,Marketing,Networks,Topic Maps — Patrick Durusau @ 8:43 am

Bisociative Knowledge Discovery: An Introduction to Concept, Algorithms, Tools, and Applications by Michael R. Berthold. (Lecture Notes in Computer Science, Volume 7250, 2012, DOI: 10.1007/978-3-642-31830-6)

The volume where Berthold’s Towards Bisociative Knowledge Discovery appears.

Follow the links for article abstracts and additional information. “PDFs” are available under Springer Open Access.

If you are familiar with Steve Newcomb’s universes of discourse, this will sound hauntingly familiar.

How will diverse methodologies of bisociative knowledge discovery, being in different universes of discourse, interchange information?

Topic maps anyone?

July 1, 2012

Lessons from Anime and Big Data (Ghost in the Shell)

Filed under: Information Exchange,Information Workers,Marketing,Topic Maps — Patrick Durusau @ 4:45 pm

Lessons from Anime and Big Data (Ghost in the Shell) by James Locus.

From the post:

What lessons might the anime (Japanese animation) “Ghost in the Shell” teach us about the future of big data? The show, originally a graphic novel from creator Masamune Shirow, explores the consequences of a “hyper”-connected society so advanced one is able to download one’s consciousness temporarily into human-like android shells (hence the work’s title). If this sounds familiar, it’s because Ghost in the Shell was a major point of inspiration for the Wachowski brothers, the creators of the Matrix Trilogy.

The ability to handle, process, and manipulate big data is a major theme of the show and focuses on the challenges of a high tech police unit in thwarting potential cyber crimes. The graphic novel was originally created in 1991, long before the concept of big data had grown to prominence (and for-all-intents-and-purposes even before what we now think of as the internet…)

Visions of a “Big Data” Future

While such visions of an interconnected techno-future are common in anime, what makes Ghost in the Shell special is its treatment of the power of big data. Technology is not used simply for its exploitative value, but as a means to create a greater, more capable society. Data becomes the engine that drives an entire civilization towards achieving taller buildings, faster cars, and yes – even androids.

Big data puts many of Ghost in the Shell’s “technological advances” just within reach. The show features almost instantaneous transfers of petabyte hard drives and facial recognition searches about as fast as a Google search. (emphasis added)

A big +1! to technological advances being on the cusp of something transformative, but I am less certain about what that transformation will lead to. While cutting edge research is underway to help amputees, I fully expect the first commercially viable application to be safe, virtual sex (if they are not there already).

We are talking about us. We have a long history of using technology for its exploitative value. In fact, I can’t think of a single example where technology has not been used for its exploitative value. Can you?

Although Snow Crash is a novel, there is a kernel (sorry) of truth to the proposition that the results of analysis will become items for exchange. That is true now but we lack the exchange mechanisms to make it currency.

People write books, articles, posts, but for the most part, all of those are at too large a level to be reused. We need information libraries that operate like software libraries, that are called for a particular operation. The creator of an information library gets a “credit” for your use of the information.

Not a reality today, but overcoming semantic barriers to re-use is a start in that direction. Whether technology gets used for its exploitative value or not will be settled by how it is used in fact. I know where my money is riding. Yours?

June 30, 2012

R-Uni (A List of Free R Tutorials and Resources in Universities webpages)

Filed under: Marketing,R,Topic Maps — Patrick Durusau @ 6:51 pm

R-Uni (A List of Free R Tutorials and Resources in Universities webpages) by Pairach Piboonrungroj.

A list of eighty-seven (87) university-based resources on R.

I suspect there is a fair amount of duplication just in terms of resources cited at each of those resources.

Duplication/repetition isn’t necessarily bad, but imagine having a unique list of resources on R.

Or tagging in articles on R that link back into resources on R, in case you need a quick reminder on a function.

Time saver?
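Building that unique list is mostly URL normalization plus de-duplication. A minimal sketch in Python; the entries below are stand-ins for what such university pages might cite:

    from urllib.parse import urlsplit

    def normalize(url):
        """Reduce a URL to a comparison key: lowercase host, no trailing slash,
        scheme and fragment ignored."""
        parts = urlsplit(url.strip())
        return parts.netloc.lower() + parts.path.rstrip("/")

    # Hypothetical entries as they might appear on several university R pages.
    cited = [
        "http://cran.r-project.org/doc/manuals/R-intro.html",
        "https://CRAN.R-project.org/doc/manuals/R-intro.html",
        "http://www.statmethods.net/",
        "http://www.statmethods.net",
    ]

    unique = {}
    for url in cited:
        unique.setdefault(normalize(url), url)  # keep the first spelling seen

    for url in unique.values():
        print(url)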

June 20, 2012

NoLogic (Not only Logic) – #5,000

Filed under: Logic,Marketing — Patrick Durusau @ 8:09 pm

I took the precaution to say “Not only Logic” so I would not have to reach back and invent a soothing explanation for saying “NoLogic.”

The marketing reasons for parroting “NoSQL” are obvious and I won’t belabor them here.

There are some less obvious reasons for saying “NoLogic.”

Logic, as in formal logic (description logic for example), is rarely used by human users. Examples mainly exist in textbooks and journal articles. And of late, in semantic web proposals.

Ask anyone in your office to report the number of times they used formal logic to make a decision in the last week. We both know the most likely answer, by a very large margin.

But we rely upon searches every day that are based upon the use of digital logic.

Searches that are quite useful in assisting non-logical users, but we limit ourselves in refining those search results. By more logic. Which we don’t use ourselves.

Isn’t that odd?

Or take the “curse of dimensionality.” Viewed from the perspective of data mining, Baeza-Yates & Ribeiro-Neto point out that “…a large feature space might render document classifiers impractical.” (p. 320)

Those are features that can be identified with the document.

What of the dimensions of a user who is a former lawyer, theology student, markup editor, Ancient Near Eastern amateur, etc., all of which have an impact on how they view any particular document and its relevance to a search result? Or to make connections to another document?

Some of those dimensions would be shared by other users, some would not.

But in either case, human users are untroubled by the “curse of dimensionality.” In part I would suggest because “NoLogic” comes easy for the human user. We may not be able to articulate all the dimensions, but we are likely to pick results similar users will find useful.

We should not forgo logic, either as digital logic or formal reasoning systems, when those assist us.

We should be mindful that logic does not represent all views of the world.

In other words, not only logic (NoLogic).

June 13, 2012

Azure Changes Dress Code, Allows Tuxedos

Filed under: Cloud Computing,Linux OS,Marketing — Patrick Durusau @ 4:12 am

Azure Changes Dress Code, Allows Tuxedos by Robert Gelber.

Had it on my list to mention that Azure is now supporting Linux. Robert summarizes as follows:

Microsoft has released previews of upcoming services on their Azure cloud platform. The company seems focused on simplifying the transition of in-house resources to hybrid or external cloud deployments. Most notable is the ability for end users to create virtual machines with Linux images. The announcement will be live streamed later today at 1 p.m. PST.

Azure’s infrastructure will support CentOS 6.2, OpenSUSE 12.1, SUSE Linux Enterprise Server SP2 and Ubuntu 12.04 VM images. Microsoft has already updated their Azure site to reflect the compatibility. Other VM features include:

  • Virtual Hard Disks – Allowing end users to migrate data between on-site and cloud premises.
  • Workload Migration – Moving SQL Server, Sharepoint, Windows Server or Linux images to cloud services.
  • Common Virtualization Format – Microsoft has made the VHD file format freely available under an open specification promise.

Cloud offerings are changing, perhaps evolving would be a better word, at a rapid pace.

Although standardization may be premature, it is certainly a good time to start gathering information on services and vendors in a way that cuts across the verbal jungle that is cloud computing PR.

Topic maps anyone?

June 9, 2012

Puppet

Filed under: Marketing,Systems Administration,Systems Research — Patrick Durusau @ 7:15 pm

Puppet

From “What is Puppet?”:

Puppet is IT automation software that helps system administrators manage infrastructure throughout its lifecycle, from provisioning and configuration to patch management and compliance. Using Puppet, you can easily automate repetitive tasks, quickly deploy critical applications, and proactively manage change, scaling from 10s of servers to 1000s, on-premise or in the cloud.

Puppet is available as both open source and commercial software. You can see the differences here and decide which is right for your organization.

How Puppet Works

Puppet uses a declarative, model-based approach to IT automation.

  1. Define the desired state of the infrastructure’s configuration using Puppet’s declarative configuration language.
  2. Simulate configuration changes before enforcing them.
  3. Enforce the deployed desired state automatically, correcting any configuration drift.
  4. Report on the differences between actual and desired states and any changes made enforcing the desired state.
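To make the declarative idea concrete, here is a toy sketch of that declare/simulate/enforce/report loop in Python. It is not Puppet and not its DSL; the resources and states are invented:

    # A toy version of the declare / simulate / enforce / report loop.
    desired = {"ntp": "running", "telnet": "stopped", "sshd": "running"}
    actual = {"ntp": "stopped", "telnet": "running", "sshd": "running"}

    def plan(desired, actual):
        """Simulate: report the changes that would be made, without making them."""
        return {svc: state for svc, state in desired.items() if actual.get(svc) != state}

    def enforce(desired, actual):
        """Enforce: correct drift and report what changed."""
        changes = plan(desired, actual)
        for svc, state in changes.items():
            actual[svc] = state  # stand-in for actually starting/stopping a service
        return changes

    print("would change:", plan(desired, actual))
    print("changed:     ", enforce(desired, actual))
    print("drift left:  ", plan(desired, actual))  # empty once the desired state holds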

Topic maps seem like a natural for systems administration.

They can capture the experience and judgement of sysadmins that aren’t ever part of printed documentation.

Make sysadmins your allies when introducing topic maps. Part of that will be understanding their problems and concerns.

Being able to intelligently discuss software like Puppet will be a step in the right direction. (Not to mention giving you ideas about topic map applications for systems administration.)

June 4, 2012

Stop Labeling Everything as an Impedance Mismatch!

Filed under: Communication,Marketing — Patrick Durusau @ 4:30 pm

Stop Labeling Everything as an Impedance Mismatch! by Jos Dirksen (DZone Java Lobby).

Jos writes:

I recently ran across an article that was talking (again) about the Object-Relational mismatch. And just like in many articles this mismatch is called the Object-Relational Impedance mismatch. This “impedance mismatch” label isn’t just added when talking about object and relational databases, but pretty much in any situation where we have two concepts that don’t match nicely:

As someone who has abused “semantic impedance” in the past (and probably will in the future), this caught my eye.

Particularly because Jos goes on to say:

…In the way we use it impedance mismatch sounds like a bad thing. In electrical engineering it is just a property of an electronic circuit. In some circuits you might need to have impedance matching, in others you don’t.

Saying we have an object relation impedance mismatch doesn’t mean anything. Yes we have a problem between the OO world and the relation world, no discussion about that. Same goes for the other examples I gave in the beginning of this article. But labelling it with the “impedance mismatch” doesn’t tell us anything about the kind of problem we have. We have a “concept mismatch”, a “model mismatch”, or a “technology mismatch”.

The important point is that impedance, being a property of every circuit, doesn’t tell us anything by itself.

Just as “semantic impedance” doesn’t tell us anything about the nature of the “impedance.”

Or possible ways to reduce it.

Suggestion: Let’s take “semantic impedance” as a universal given.

Next question: What can we do to lessen it in specific situations? With enough details, that’s a question we may be able to answer, in part.

June 3, 2012

Semi-Supervised Named Entity Recognition:… [Marketing?]

Filed under: Entities,Entity Extraction,Entity Resolution,Marketing — Patrick Durusau @ 3:40 pm

Semi-Supervised Named Entity Recognition: Learning to Recognize 100 Entity Types with Little Supervision by David Nadeau (PhD Thesis, University of Ottawa, 2007).

Abstract:

Named Entity Recognition (NER) aims to extract and to classify rigid designators in text such as proper names, biological species, and temporal expressions. There has been growing interest in this field of research since the early 1990s. In this thesis, we document a trend moving away from handcrafted rules, and towards machine learning approaches. Still, recent machine learning approaches have a problem with annotated data availability, which is a serious shortcoming in building and maintaining large-scale NER systems. In this thesis, we present an NER system built with very little supervision. Human supervision is indeed limited to listing a few examples of each named entity (NE) type. First, we introduce a proof-of-concept semi-supervised system that can recognize four NE types. Then, we expand its capacities by improving key technologies, and we apply the system to an entire hierarchy comprised of 100 NE types. Our work makes the following contributions: the creation of a proof-of-concept semi-supervised NER system; the demonstration of an innovative noise filtering technique for generating NE lists; the validation of a strategy for learning disambiguation rules using automatically identified, unambiguous NEs; and finally, the development of an acronym detection algorithm, thus solving a rare but very difficult problem in alias resolution. We believe semi-supervised learning techniques are about to break new ground in the machine learning community. In this thesis, we show that limited supervision can build complete NER systems. On standard evaluation corpora, we report performances that compare to baseline supervised systems in the task of annotating NEs in texts.

Nadeau demonstrates the successful construction of a Named Entity Recognition (NER) system using a few supplied examples for each entity.
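To give a feel for what "a few supplied examples for each entity" means in practice, here is a minimal seed-list matcher in Python. It is not Nadeau's system, only the starting point such a semi-supervised approach bootstraps from; the types and seed names are mine:

    import re

    # A few seed examples per entity type, in the spirit of "listing a few
    # examples of each named entity type."
    seeds = {
        "CITY": {"Ottawa", "Baton Rouge"},
        "PERSON": {"David Nadeau"},
    }

    def tag(text, seeds):
        """A bare gazetteer matcher: mark any seed string found in the text.
        A semi-supervised system would then mine the contexts of these matches
        to grow the lists and learn disambiguation rules."""
        found = []
        for entity_type, names in seeds.items():
            for name in names:
                for match in re.finditer(re.escape(name), text):
                    found.append((entity_type, name, match.start()))
        return sorted(found, key=lambda hit: hit[2])

    sample = "David Nadeau defended the thesis in Ottawa."
    print(tag(sample, seeds))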

But what explains the lack of annotation where the entities are well known? The King James Bible? Search for “Joseph.” We know not all of the occurrences of “Joseph” represent the same entity.

Looking at the client list for Infoglutton, is there a lack of interest in named entity recognition?

Have we focused on techniques and issues that interest us, and then, as an afterthought, tried to market the results to consumers?

May 23, 2012

Merging Market News – 23 May 2012

Filed under: Marketing,Topic Maps — Patrick Durusau @ 6:20 pm

On the merging market front (the need for merging between different IT systems), I read happy news at:

451 Research delivers market sizing estimates for NoSQL, NewSQL and MySQL ecosystem by Matthew Aslett.

From the post:

NoSQL and NewSQL database technologies pose a long-term competitive threat to MySQL’s position as the default database for Web applications, according to a new report published by 451 Research.

The report, MySQL vs. NoSQL and NewSQL: 2011-2015, examines the competitive dynamic between MySQL and the emerging NoSQL non-relational, and NewSQL relational database technologies.

It concludes that while the current impact of NoSQL and NewSQL database technologies on MySQL is minimal, they pose a long-term competitive threat due to their adoption for new development projects. The report includes market sizing and growth estimates, with the key findings as follows:

You can get a copy of the report if you like, but the important theme is that different IT vocabularies and approaches are going to be in play.

Which means translation costs between systems are going to skyrocket and be repeated with every IT spasm or change.

Unless you are hired to address integration/migration problems with topic maps of course.

On the database front, I would say things look pretty bright for topic maps!

PS: Any thoughts on how the collapse of Greece or its becoming a failed state is going to impact the merging market?

May 17, 2012

“…Things, Not Strings”

Filed under: Google Knowledge Graph,Marketing,RDF,RDFa,Semantic Web,Topic Maps — Patrick Durusau @ 6:30 pm

The brilliance at Google spreads beyond technical chops and into their marketing department.

Effective marketing is as much about what you don’t do as what you do.

What did Google not do with the Google Knowledge Graph?

Google Knowledge Graph does not require users to:

  • learn RDF/RDFa
  • learn OWL
  • learn various syntaxes
  • build/choose ontologies
  • use SW software
  • wait for authoritative instructions from Mount W3C

What does Google Knowledge Graph do?

It gives users information about things, things that are of interest to users. Using their web browsers.

Let’s see: we can require users to do what we want, or we can give users what they want.

Which one do you think is the most likely to succeed? (No peeking!)

May 16, 2012

Google Advertises Topic Maps – Breaking News – Please ReTweet

Filed under: Google Knowledge Graph,Marketing,Topic Maps — Patrick Durusau @ 3:50 pm

Actually the post is titled: Introducing the Knowledge Graph: things, not strings.

It reads in part:

Search is a lot about discovery—the basic human need to learn and broaden your horizons. But searching still requires a lot of hard work by you, the user. So today I’m really excited to launch the Knowledge Graph, which will help you discover new information quickly and easily.

Take a query like [taj mahal]. For more than four decades, search has essentially been about matching keywords to queries. To a search engine the words [taj mahal] have been just that—two words.

But we all know that [taj mahal] has a much richer meaning. You might think of one of the world’s most beautiful monuments, or a Grammy Award-winning musician, or possibly even a casino in Atlantic City, NJ. Or, depending on when you last ate, the nearest Indian restaurant. It’s why we’ve been working on an intelligent model—in geek-speak, a “graph”—that understands real-world entities and their relationships to one another: things, not strings.

The Knowledge Graph enables you to search for things, people or places that Google knows about—landmarks, celebrities, cities, sports teams, buildings, geographical features, movies, celestial objects, works of art and more—and instantly get information that’s relevant to your query. This is a critical first step towards building the next generation of search, which taps into the collective intelligence of the web and understands the world a bit more like people do.

Google’s Knowledge Graph isn’t just rooted in public sources such as Freebase, Wikipedia and the CIA World Factbook. It’s also augmented at a much larger scale—because we’re focused on comprehensive breadth and depth. It currently contains more than 500 million objects, as well as more than 3.5 billion facts about and relationships between these different objects. And it’s tuned based on what people search for, and what we find out on the web.

Google just set the bar for search/information appliances, including topic maps.

What is the value add of your appliance when compared to Google?

When people ask me to explain topic maps now I can say:

You know Google’s Knowledge Graph? It’s like that but customized to your interests and data.

(I would just leave it at that. Let them start imagining what they want to do beyond the reach of Google. In their “dark data.”)

Who knew? Google advertising for topic maps. Without any click-through. Amazing.

Mobilizing Knowledge Networks for Development

Filed under: Conferences,Marketing — Patrick Durusau @ 3:35 pm

Mobilizing Knowledge Networks for Development

June 19-20, 2012
The World Bank Group
1818 H Street NW, Washington DC 20433

From the webpage:

The goal of the workshop is to explore ways to become better providers and connectors of knowledge in a world where the sources of knowledge are increasingly diverse and dispersed. At the World Bank, for example, we are seeking ways to connect with new centers of research, emerging communities of practice, and tap the practical experience of development organizations and the policy makers in rapidly developing economies. Our goal is to find better ways to connect those that have the development knowledge with those that need it, when they need it.

We are also seeking to engage research communities and civil society organizations through an Open Development initiative that makes data and publications freely available. We understand that many other organizations are exploring similar initiatives. The Conference and Knowledge fair will provide an opportunity for knowledge organizations working in development to learn from one another about their knowledge services, practices, and successes and challenges in providing these services.

You can register to attend in person or over the Internet.

As always, networking opportunities are what you make of them. This will be a good opportunity to spread the good news about topic maps.

May 10, 2012

CIA/NSA Diff Utility?

Filed under: Intelligence,Marketing,Topic Maps — Patrick Durusau @ 2:40 pm

How much of the data sold to the CIA/NSA is from public resources?

Of the sort you find at Knoema?

Albeit some of it isn’t easy to find, it is public data.

A topic map of public data resources would be a good CIA/NSA Diff Utility so they could avoid paying for data that is freely available on the WWW.

I suppose the fallback position of suppliers would be their “value add.”

With public data sets, the CIA/NSA could put that “value add” to the test. Along the lines of the Netflix competition.

Even if the results weren’t the goal, it would be a good way to discover new techniques and/or analysts.

How would you “diff” public data from that being supplied by a contractor?
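One way to think about it: if the public sources and the contractor deliverables share subject identifiers, the "diff" is set arithmetic. A toy sketch in Python, with invented identifiers and descriptions:

    # A toy "diff" of contractor-supplied items against a topic map of public
    # sources, keyed by whatever subject identifiers both sides share.
    public = {
        "indicator/gdp/2011/EG": "public, via a Knoema-style portal",
        "indicator/population/2011/EG": "public, via a statistics agency",
    }
    contractor = {
        "indicator/gdp/2011/EG": "billed item",
        "indicator/unrest-reports/2011/EG": "billed item",
    }

    already_public = contractor.keys() & public.keys()
    value_add = contractor.keys() - public.keys()

    print("Paying for data already in public sources:", sorted(already_public))
    print("Contractor value-add to put to the test:  ", sorted(value_add))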
