Archive for the ‘Information Sharing’ Category

…a quantified-self, semantic-analysis tool to track web browsing

Sunday, January 26th, 2014

The New York Times’ R&D Lab is building a quantified-self, semantic-analysis tool to track web browsing

From the post:

Let’s say you work in a modern digital newsroom. Your colleagues are looking at interesting stuff online all day long — reading stimulating news stories, searching down rabbit holes you’ve never thought of. There are probably connections between what the reporter five desks down from you is looking for and what you already know — or vice versa. Wouldn’t it be useful if you could somehow gather up all that knowledge-questing and turn it into a kind of intraoffice intel?

A version of that vision is what Noah Feehan and others in The New York Times’ R&D Lab are working on with a new system called Curriculum. It started as an in-house browser extension he and Jer Thorp built last year called Semex, which monitored your browsing and, by semantically analyzing the web pages you visited, rendered it as a series of themes.

…if Semex was most useful to me as a way to record my cognitive context, the state in which I left a problem, maybe I could share that state with other people who might need to know it. Sharing topics from my browsing history with a close group of colleagues can afford us insight into one another’s processes, yet is abstracted enough (and constrained to a trusted group) to not feel too invasive…

Each user in a group has a Chrome extension that submits pageviews to a server to perform semantic analysis and publish a private, authenticated feed. (I should note here that the extension ignores any pages using HTTPS, to avoid analyzing emails, bank statements, and other secure pages.) Curriculum is carefully designed to be anonymous; that is, no topic in the feed can be traced back to any one particular user. The anonymity isn’t perfect, of course: because there are only five people using it, and because we five are in very close communication with each other, it is usually not too difficult to figure out who might be researching a particular topic.

Curriculum is kind of like a Fitbit for context, an effortless way to record what’s on our minds throughout the day and make it available to the people who need it most: the people we work with. The function Curriculum performs, that of semantic listening, is fantastically useful when people need to share their contexts (what they were working on, what approaches they were investigating, what problems they’re facing) with each other.

The Curriculum feed is truly a new channel of input for us, a stream of information of a different character than we’ve encountered before. Having access to the residue of our collective web travels has led to many questions, conversations, and jokes that wouldn’t have happened without it. (emphasis added)
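The pipeline the post describes — clients submit pageviews, the server skips HTTPS pages, extracts topics, and pools them into a feed no one user can be traced from — can be sketched roughly like this. All names are hypothetical and the topic extraction is a crude keyword-frequency stand-in for Curriculum’s actual semantic analysis, which is not shown in the post:

```python
from collections import Counter
from urllib.parse import urlparse
import re

STOPWORDS = {"the", "and", "that", "with", "from", "this", "have"}

def accept_pageview(url: str) -> bool:
    # Mirror Curriculum's privacy rule: ignore HTTPS pages
    # (emails, bank statements, other secure pages).
    return urlparse(url).scheme == "http"

def extract_topics(page_text: str, k: int = 5) -> list:
    # Stand-in for real semantic analysis: top-k keyword frequency.
    words = re.findall(r"[a-z]{4,}", page_text.lower())
    counts = Counter(w for w in words if w not in STOPWORDS)
    return [w for w, _ in counts.most_common(k)]

def build_feed(submissions: list) -> list:
    # submissions: (user, url, page_text) tuples. Topics are pooled
    # across all users, so no feed entry traces back to one person.
    pool = Counter()
    for _user, url, text in submissions:
        if accept_pageview(url):
            pool.update(extract_topics(text))
    return [topic for topic, _ in pool.most_common()]
```

The anonymity caveat in the quote applies here too: pooling hides *who* submitted a topic, but in a five-person group the topic itself often gives the researcher away.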

Are you ready for real information sharing?

I was rather surprised that anyone in a newsroom would be that sensitive about their browsing history. I would stream mine to the Net if I thought anyone were interested. You might be offended by what you find, but that’s not my problem. 😉

I do know of rumored intelligence service projects that never got off the ground because of information sharing concerns. As well as one state legislature that decided it liked to talk about transparency more than it enjoyed practicing it.

While we call for tearing down data silos (those of others) are we anxious to keep our own personal data silos in place?

Boy Scout Expulsions – Oil Drop Semantics

Monday, October 22nd, 2012

Data on decades of Boy Scout expulsions released by Nathan Yau.

Nathan points to an interactive map, searchable list and downloadable data from the Los Angeles Times, drawn from Boy Scouts of America files on people expelled from the Boy Scouts for suspicions of sexual abuse.

The LA Times has done a great job with this data set (and the story) but it also illustrates a limitation in current data practices.

All of these cases occurred in jurisdictions with laws against sexual abuse of children.

If a local sheriff or district attorney reads about this database, how do they tie it into their databases?

Not as simple as saying “topic map,” if that’s what you were anticipating.

Among the issues that would need addressing:

  • Confidentiality – Law enforcement and courts have their own rules about sharing data.
  • Incompatible System Semantics – The typical problem that is encountered in business enterprises, writ large. Every jurisdiction is likely to have its own rules, semantics and files.
  • Incompatible Data Semantics – Assuming systems talk to each other, the content and its semantics will vary from one jurisdiction to another.
  • Subjects Evading Identification – The subjects (sorry!) in question are trying to avoid identification.
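The “incompatible data semantics” problem above is concrete enough to sketch. Here are two hypothetical record layouts for the same kind of fact in neighboring jurisdictions, and the sort of mapping two data gurus might work out over lunch (field names, formats, and the status-code equivalence are all invented for illustration):

```python
from datetime import datetime

# Two jurisdictions' (hypothetical) layouts for the same kind of record.
county_a = {"subj_name": "DOE, JOHN", "dob": "1961-04-02", "status": "EXPELLED"}
county_b = {"name": "John Doe", "birth_date": "04/02/1961", "case_state": "removed"}

def normalize_a(rec: dict) -> dict:
    last, first = [p.strip() for p in rec["subj_name"].split(",")]
    return {"name": f"{first.title()} {last.title()}",
            "dob": rec["dob"],  # already ISO 8601
            "expelled": rec["status"] == "EXPELLED"}

def normalize_b(rec: dict) -> dict:
    return {"name": rec["name"],
            "dob": datetime.strptime(rec["birth_date"], "%m/%d/%Y").date().isoformat(),
            # Mapping note agreed at the lunch meeting:
            # County B's "removed" means the same as County A's "EXPELLED".
            "expelled": rec["case_state"] == "removed"}

# Once both sides map to shared keys, candidate matches are cheap to find.
a, b = normalize_a(county_a), normalize_b(county_b)
match = (a["name"] == b["name"] and a["dob"] == b["dob"])
```

Note that the code cannot settle the confidentiality question or stop subjects from evading identification; it only removes the system and data semantics obstacles once the humans have agreed what the fields mean.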

You could get funding for a conference of police administrators to discuss how to organize additional meetings to discuss potential avenues for data sharing and get the DHS to fund a large screen digital TV (not for the meeting, just to have one). Consultants could wax and whine about possible solutions, should you someday decide on one.

I have a different suggestion: Grab your records guru and meet up with an overlapping or neighboring jurisdiction’s data guru and one of their guys. For lunch.

Bring note pads and sample records. Talk about how you share information between officers (that is, you and your counterpart). Let the data gurus talk about how they can share data.

Stick to practical questions: how do we share data, and what does your data mean now? Make no global decisions, award no medals for attending, etc.

Do that once or twice a month for six months. Write down what worked, what didn’t work (just as important). Each of you picks an additional partner. Share what you have learned.

The documenting and practice at information sharing will be the foundation for more formal information sharing systems. Systems based on documented sharing practices, not how administrators imagine sharing works.

Think of it as “oil drop semantics.”

Start small and increase only as more drops are added.

The goal isn’t a uniform semantic across law enforcement but understanding what is being said. That understanding can be mapped into a topic map or other information sharing strategy. But understanding comes first, mapping second.

The Curse Of Knowledge

Wednesday, August 29th, 2012

The Curse Of Knowledge by Mark Needham.

From the post:

My colleague Anand Vishwanath recently recommended the book ‘Made To Stick‘ and one thing that has really stood out for me while reading it is the idea of the ‘The Curse Of Knowledge’ which is described like so:

Once we know something, we find it hard to imagine what it was like not to know it. Our knowledge has “cursed” us. And it becomes difficult for us to share our knowledge with others, because we can’t readily re-create our listeners’ state of mind.

This is certainly something I imagine that most people have experienced, perhaps for the first time at school when we realised that the best teacher of a subject isn’t necessarily the person who is best at the subject.

I’m currently working on an infrastructure team and each week every team does a mini showcase where they show the other teams some of the things they’ve been working on.

It’s a very mixed audience – some very technical people and some not as technical people – so we’ve found it quite difficult to work out how exactly we can explain what we’re doing in a way that people will be able to understand.

A lot of what we’re doing is quite abstract/not very visible and the first time we presented we assumed that some things were ‘obvious’ and didn’t need an explanation.
….

Sounds like a problem that teachers/educators have been wrestling with for a long time.

Read the rest of Mark’s post, then find a copy of Made to Stick.

And/or, find a really good teacher and simply observe them teaching.

Semantic Silver Bullets?

Wednesday, August 1st, 2012

The danger of believing in silver bullets

Nick Wakeman writes in the Washington Technology Business Beat:

Whether it is losing weight, getting rich or managing government IT, it seems we can’t resist the lure of a silver bullet. The magic pill. The easy answer.

Ten or 12 years ago, I remember a lot of talk about leasing and reverse auctions, and how they were going to transform everything.

Since then, outsourcing and insourcing have risen and fallen from favor. Performance-based contracting was going to be the solution to everything. And what about the huge systems integration projects like Deepwater?

They start with a bang and end with a whimper, or in some cases, a moan and a whine. And of course, along the way, millions and even billions of dollars get wasted.

I think we are in the midst of another silver bullet phenomenon with all the talk around cloud computing and everything as a service.

I wish I could say that topic maps are a semantic silver bullet. Or better yet, a semantic hand grenade. One that blows other semantic approaches away.

Truthfully, topic maps are neither one.

Topic maps rely upon users, assisted by various technologies, to declare and identify subjects they want to talk about and, just as importantly, relationships between those subjects. Not to mention where information about those subjects can be found.

If you need evidence of the difficulty of those tasks, consider the near idiotic results you get from search engines. Considering the task, they do pretty well, but “pretty well” still leaves results that take time and effort to sort out every time you search.

Topic maps aren’t easy, no silver bullet, but you can capture subjects of interest to you, define their relationships to other subjects and specify where more information can be found.

Once captured, that information can be shared, used and/or merged with information gathered by others.

Bottom line is that better semantic results, for sharing, for discovery, for navigation, all require hard work.

Are you ready?

International Conference on Knowledge Management and Information Sharing

Wednesday, February 15th, 2012

International Conference on Knowledge Management and Information Sharing

Regular Paper Submission: April 17, 2012
Authors Notification (regular papers): June 12, 2012
Final Regular Paper Submission and Registration: July 4, 2012

From the call for papers:

Knowledge Management (KM) is a discipline concerned with the analysis and technical support of practices used in an organization to identify, create, represent, distribute and enable the adoption and leveraging of good practices embedded in collaborative settings and, in particular, in organizational processes. Effective knowledge management is an increasingly important source of competitive advantage, and a key to the success of contemporary organizations, bolstering the collective expertise of its employees and partners.

Information Sharing (IS) is a term used for a long time in the information technology (IT) lexicon, related to data exchange, communication protocols and technological infrastructures. Although standardization is indeed an essential element for sharing information, IS effectiveness requires going beyond the syntactic nature of IT and delve into the human functions involved in the semantic, pragmatic and social levels of organizational semiotics.

The two areas are intertwined as information sharing is the foundation for knowledge management.

Part of IC3K 2012 – International Joint Conference on Knowledge Discovery, Knowledge Engineering and Knowledge Management.

Although all three conferences at IC3K 2012 will be of interest to topic mappers, the line:

Although standardization is indeed an essential element for sharing information, IS effectiveness requires going beyond the syntactic nature of IT and delve into the human functions involved in the semantic, pragmatic and social levels of organizational semiotics.

did catch my attention.

I am not sure that I would treat syntactic standardization as a prerequisite for sharing information. If anything, syntactic diversity increases more quickly than semantic diversity, as every project to address the latter starts by claiming a need to address the former.

Let’s start with extant syntaxes, whether COBOL, relational tables, topic maps, RDF, etc., and specify semantics that we wish to map between them. To see if there is any ROI. If not, stop there and select other data sets. If yes, then specify only so much in the way of syntax/semantics as results in ROI.

You don’t have to plan on integrating all the data from all federal agencies. Just don’t do anything inconsistent with that long-term goal, like failing to document why you arrived at particular mappings. (You will forget by tomorrow or the next day.)
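One way to keep that documentation from evaporating is to make the rationale part of the mapping itself. A minimal sketch, with entirely hypothetical field names spanning the syntaxes mentioned above (a COBOL copybook field, a relational column, an RDF property):

```python
# A hypothetical mapping spec: each entry records not just the field
# correspondence but WHY it holds, so the rationale survives past tomorrow.
MAPPINGS = [
    {"from": "HR-MASTER.EMP-NO",           # COBOL copybook field
     "to":   "staff.employee_id",          # relational column
     "why":  "Payroll confirmed both are the badge number, zero-padded to 8."},
    {"from": "staff.dept_code",
     "to":   "http://example.org/ontology#department",  # RDF property
     "why":  "Dept codes map 1:1 to the ontology's department IRIs per audit."},
]

def rationale(source_field: str) -> str:
    # Look up why a given mapping was made; flag undocumented ones loudly.
    for m in MAPPINGS:
        if m["from"] == source_field:
            return m["why"]
    return "undocumented -- treat this mapping as suspect"
```

Specify only as many of these entries as the ROI justifies; an undocumented mapping is the one you will mistrust, and rightly, six months from now.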