Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

July 19, 2012

World Leaders Comment on Attack in Bulgaria

Filed under: Data Mining,Intelligence,Social Media — Patrick Durusau @ 4:53 am

World Leaders Comment on Attack in Bulgaria

From the post:

Following the terror attack in Bulgaria killing a number of Israeli tourists on an airport bus, we can see the statements from world leaders around the globe including Israel Prime Minister Benjamin Netanyahu openly pinning the blame on Iran and threatening retaliation

If you haven’t seen one of the visualizations by Recorded Future you will be impressed by this one. Mousing over people and locations invokes what we would call scoping in a topic map context and limits the number of connections you see. And each node can lead to additional information.
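For anyone new to scoping, here is a minimal sketch of that filtering behavior in Python, with invented nodes and scopes (Recorded Future's actual mechanism is not disclosed):

```python
# Minimal sketch of scope-style filtering: hovering over a node narrows the
# visible connections to those whose scope includes that node.
# The nodes and scopes below are invented for illustration.
connections = [
    {"from": "Benjamin Netanyahu", "to": "Iran",
     "scope": {"Benjamin Netanyahu", "Iran"}},
    {"from": "Burgas attack", "to": "Bulgaria",
     "scope": {"Burgas attack", "Bulgaria"}},
    {"from": "Benjamin Netanyahu", "to": "Burgas attack",
     "scope": {"Benjamin Netanyahu", "Burgas attack"}},
]

def visible_connections(hovered, connections):
    """Keep only the connections whose scope covers the hovered node."""
    return [c for c in connections if hovered in c["scope"]]

for c in visible_connections("Benjamin Netanyahu", connections):
    print(c["from"], "->", c["to"])  # two of the three connections remain
```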

While this works like a topic map, I can’t say it is a topic map application because how it works isn’t disclosed. You can read How Recorded Future Works, but you won’t be any better informed than before you read it.

Impressive work, but it isn’t clear how I would integrate their matching of sources with, say, an internal mapping of sources. Or how I would augment their mapping with additional mappings by internal subject experts?

Or how I would map this incident to prior incidents which led to disproportionate responses?

Or map “terrorist” attacks by the world leaders now decrying other “terrorist” attacks?

That last mapping could be an interesting one for the application of the term “terrorist.” My anecdotal experience is that it depends on the sponsor.

Would be interesting to know if systematic analysis supports that observation.

Perhaps the news media could then evenly identify the probable sponsors of “terrorist” attacks.

June 2, 2012

Social Meets Search with the Latest Version of Bing…

Filed under: Search Engines,Searching,Social Media — Patrick Durusau @ 10:29 am

Social Meets Search with the Latest Version of Bing…

Two things are obvious:

  • I am running a day behind.
  • Bing isn’t my default search engine. (Or I would have noticed this yesterday.)

From the post:

A few weeks ago, we introduced you to the most significant update to Bing since our launch three years ago, combining the best of search with relevant people from your social networks, including Facebook and Twitter. After the positive response to the preview, the new version of Bing is available today in the US at www.bing.com. You can now access Bing’s new three column design, including the snapshot feature and social features.

According to a recent internal survey, nearly 75% of people spend more time than they would like searching for information online. With Bing’s new design, you can access information from the Web including friends you do know and relevant experts that you may not know, letting you spend less time searching and more time doing.

(screenshot omitted)

Today, we’re also unveiling a new advertising campaign to support the introduction of search plus social and announcing the Bing Summer of Doing, in celebration of the new features and designed to inspire people to do amazing things this summer.

BTW, I have corrected the broken HTML for the Bing link in the post: www.bing.com.

When I arrived, the “top” searches were:

  • Nazi parents
  • Hosni Mubarak

“Popular” searches range from the inane to the irrelevant.

I need something a bit more focused on subjects of interest to me.

Perhaps automated queries that are filtered, then processed into a topic map?

Something to think about over the summer. More posts to follow on that theme.
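As a first cut, such a pipeline might run the queries on a schedule, filter the hits against subjects of interest, and emit topic/occurrence fragments. A minimal sketch, assuming search results arrive as simple dicts (the filter terms and output structure are mine, not any particular engine's):

```python
# Minimal sketch: filter automated search results into topic map fragments.
# The result format, filter terms, and topic/occurrence structure are
# illustrative; a real pipeline would sit on an actual search API.
from collections import defaultdict

FILTER_TERMS = {"topic map", "semantic", "graph"}  # subjects of interest (example)

def matching_terms(result):
    text = (result["title"] + " " + result.get("snippet", "")).lower()
    return {term for term in FILTER_TERMS if term in text}

def to_topic_map(results):
    """Group filtered results into topics keyed by the matching term."""
    topics = defaultdict(list)
    for r in results:
        for term in matching_terms(r):
            topics[term].append({"label": r["title"], "occurrence": r["url"]})
    return dict(topics)

sample = [
    {"title": "Intro to Topic Maps", "snippet": "subjects and associations",
     "url": "http://example.org/tm"},
    {"title": "Celebrity gossip", "snippet": "inane and irrelevant",
     "url": "http://example.org/no"},
]
print(to_topic_map(sample))  # only the first result survives the filter
```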

May 31, 2012

Knowledge Extraction and Consolidation from Social Media

Filed under: Conferences,Knowledge Capture,Social Media — Patrick Durusau @ 7:08 am

Knowledge Extraction and Consolidation from Social Media (KECSM 2012), November 11-12, 2012, Boston, USA.

Important dates

  • Jul 31, 2012: submission deadline full & short papers
  • Aug 21, 2012: notifications for research papers
  • Sep 10, 2012: camera-ready papers due
  • Oct 05, 2012: submission deadline poster & demo abstracts
  • Oct 10, 2012: notifications for posters & demos

From the website:

The workshop aims to become a highly interactive research forum for exploring innovative approaches for extracting and correlating knowledge from degraded social media by exploiting the Web of Data. While the workshop’s general focus is on the creation of well-formed and well-interlinked structured data from highly unstructured Web content, its interdisciplinary scope will bring together researchers and practitioners from areas such as the semantic and social Web, text mining and NLP, multimedia analysis, data extraction and integration, and ontology and data mapping. The workshop will also look into innovative applications that exploit extracted knowledge in order to produce solutions to domain-specific needs.

We will welcome high-quality papers about current trends in the areas listed in the following, non-exhaustive list of topics. We will seek application-oriented, as well as more theoretical papers and position papers.

Knowledge detection and extraction (content perspective)

  • Knowledge extraction from text (NLP, text mining)
  • Dealing with scalability and performance issues with regard to large amounts of heterogeneous content
  • Multilinguality issues
  • Knowledge extraction from multimedia (image and video analysis)
  • Sentiment detection and opinion mining from text and audiovisual content
  • Detection and consideration of temporal and dynamics aspects
  • Dealing with degraded Web content

Knowledge enrichment, aggregation and correlation (data perspective)

  • Modelling of events and entities such as locations, organisations, topics, opinions
  • Representation of temporal and dynamics-related aspects
  • Data clustering and consolidation
  • Data enrichment based on linked data/semantic web
  • Using reference datasets to structure, cluster and correlate extracted knowledge
  • Evaluation of automatically extracted data

Exploitation of automatically extracted knowledge/data (application perspective)

  • Innovative applications which make use of automatically extracted data (e.g. for recommendation or personalisation of Web content)
  • Semantic search in annotated Web content
  • Entity-driven navigation of user-generated content
  • Novel navigation and visualisation of extracted knowledge/graphs and associated Web resources

I like the sound of “consolidation,” an unspoken or tacit goal of any knowledge gathering. Knowledge is not much use in scattered pieces on the shop floor.

Collocated with the 11th International Semantic Web Conference (ISWC2012)

May 10, 2012

EveryBlock

Filed under: Social Media,Social Networks — Patrick Durusau @ 5:43 pm

EveryBlock

I remember my childhood neighborhood just before the advent of air conditioning and the omnipresence of TV. A walk down the block gave you a good idea of what your neighbors were up to. Or not. 😉

Comparing then to now, the neighborhood where I now live is strangely silent. Walk down my block and you hear no TVs, conversations, radios, loud discussions or the like.

We have become increasingly isolated from others by our means of transportation, entertainment and climate control.

EveryBlock offers the promise of restoring some of the random contact with our neighbors to our lives.

EveryBlock says it solves two problems:

First, there’s no good place to keep track of everything happening in your neighborhood, from news coverage to events to photography. We try to collect all of the news and civic goings-on that have happened recently in your city, and make it simple for you to keep track of news in particular areas.

Second, there’s no good way to post messages to your neighbors online. Facebook lets you post messages to your friends, Twitter lets you post messages to your followers, but no well-used service lets you post a message to people in a given neighborhood.

EveryBlock addresses the problem of geographic blocks, but how do you get information on your professional block?

Do you hear anything unexpected or different? Or do you hear the customary and expected?

Maybe your professional block has gotten too silent.

Suggestions for how to change that?

March 31, 2012

HotSocial 2012

Filed under: Conferences,Data Mining,Social Media — Patrick Durusau @ 4:09 pm

HotSocial 2012: First ACM International Workshop on Hot Topics on Interdisciplinary Social Networks Research, August 12, 2012, Beijing, China (in conjunction with ACM KDD 2012, August 12-16, 2012). http://user.informatik.uni-goettingen.de/~fu/hotsocial/

Important Dates:

Deadline for submissions: May 9, 2012 (11:59 PM, EST)
Notification of acceptance: June 1, 2012
Camera-ready version: June 12, 2012
HotSocial Workshop Day: Aug 12, 2012

From the post:

Among the fundamental open questions are:

  • How to access social network data? Different communities have different means, each with pros and cons. Experience exchanges between communities will be beneficial.
  • How to protect these data? Privacy and data protection techniques considering social and legal aspects are required.
  • How can complex systems and graph theory algorithms be used for understanding social networks? Interdisciplinary collaboration is necessary.
  • Can social network features be exploited for better computing and social network system design?
  • How do online social networks play a role in real-life (offline) community forming and evolution?
  • How do human mobility and human interaction influence human behaviors and thus public health? How can we develop methodologies to investigate public health and its correlates in the context of social networks?

Topics of Interest:

Main topics of this workshop include (but are not limited to) the following:

  • methods for accessing social networks (e.g., sensor nets, mobile apps, crawlers) and bias correction for use in different communities (e.g., sociology, behavior studies, epidemiology)
  • privacy and ethical issues of data collection and management of large social graphs, leveraging social network properties as well as legal and social constraints
  • application of data mining and machine learning in the context of specific social networks
  • information spread models and campaign detection
  • trust, reputation, and community evolution in interacting online and offline social networks, including the presence and evolution of social identities and social capital in OSNs
  • understanding complex systems and scale-free networks from an interdisciplinary angle
  • interdisciplinary experiences and intermediate results on social network research

Sounds relevant to the “big data” stuff of interest to the White House.

PS: Have you noticed how some blogging software really sucks when you do “view source” on pages? Markup and data should be present. It makes content reuse easier. WordPress does it. How about your blogging software?

February 20, 2012

Social Media Application (FBI RFI)

Filed under: Data Mining,RFI-RFP,Social Media — Patrick Durusau @ 8:35 pm

Social Media Application (FBI RFI)

Current Due Date: 11:00 AM, March 13, 2012

You have to read the Social Media Application.pdf document to prepare a response.

Be aware that as of 20 February 2012, that document has a blank page every other page. I suspect it is the complete document but have written to confirm and to request a corrected document be posted.

Out-Hoover Hoover: FBI wants massive data-mining capability for social media does mention:

Nowhere in this detailed RFI, however, does the FBI ask industry to comment on the privacy implications of such massive data collection and storage of social media sites. Nor does the FBI say how it would define the “bad actors” who would be subjected to this type of scrutiny.

I take that to mean that the FBI is not seeking your comments on privacy implications or possible definitions of “bad actors.”

I won’t be able to prepare an official response because I don’t meet the contractor suitability requirements, which include a cost estimate for an offsite server as a solution to the requirements.

I will be going over the requirements and publishing my response here as though I meet the contractor suitability requirements. Could be an interesting exercise.

February 5, 2012

Social Media Monitoring with CEP, pt. 2: Context As Important As Sentiment

Filed under: Context,Sentiment Analysis,Social Media — Patrick Durusau @ 8:04 pm

Social Media Monitoring with CEP, pt. 2: Context As Important As Sentiment by Chris Carlson.

From the post:

When I last wrote about social media monitoring, I made a case for using a technology like Complex Event Processing (“CEP”) to detect rapidly growing and geospatially-oriented social media mentions that can provide early warning detection for the public good (Social Media Monitoring for Early Warning of Public Safety Issues, Oct. 27, 2011).

A recent article by Chris Matyszczyk of CNET highlights the often conflicting and confusing nature of monitoring social media. A 26-year-old British citizen, Leigh Van Bryan, gearing up for a holiday of partying in Los Angeles, California (USA), tweeted in British slang his intention to have a good time: “Free this week, for quick gossip/prep before I go and destroy America.” Since I’m not too far removed from the culture of youth, I did take this to mean partying, cutting loose, having a good time (and other not-so-current definitions).

This story does not end happily, as Van Bryan and his friend Emily Bunting were arrested and then sent back to Blighty.

This post will not increase American confidence in the TSA, but it does illustrate how context can influence the identification of a subject (or “person of interest”) or the exclusion of the same.

Context is captured in topic maps using associations. In this particular case, a view of the information on the young man in question would reveal a lack of associations with any known terror suspects, people on the no-fly list, suspicious travel patterns, etc.
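A toy version of that check, just to make the point concrete (the association types and identifiers are invented, and no topic map engine is involved):

```python
# A person is flagged only if they play a role in an association of a
# "risk" type; the absence of such associations excludes them.
RISK_TYPES = {"knows-terror-suspect", "on-no-fly-list", "suspicious-travel-pattern"}

associations = [
    # (association type, member topics)
    ("travelled-with", {"leigh-van-bryan", "emily-bunting"}),
    ("tweeted", {"leigh-van-bryan", "tweet-destroy-america"}),
]

def person_of_interest(topic_id, associations):
    return any(assoc_type in RISK_TYPES and topic_id in members
               for assoc_type, members in associations)

print(person_of_interest("leigh-van-bryan", associations))  # False: no risk context
```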

Not to imply that having good information leads to good decisions; technology can’t correct that particular disconnect.

January 22, 2012

The Role of Social Networks in Information Diffusion

Filed under: Networks,Social Graphs,Social Media,Social Networks — Patrick Durusau @ 7:35 pm

The Role of Social Networks in Information Diffusion by Eytan Bakshy, Itamar Rosenn, Cameron Marlow and Lada Adamic.

Abstract:

Online social networking technologies enable individuals to simultaneously share information with any number of peers. Quantifying the causal effect of these technologies on the dissemination of information requires not only identification of who influences whom, but also of whether individuals would still propagate information in the absence of social signals about that information. We examine the role of social networks in online information diffusion with a large-scale field experiment that randomizes exposure to signals about friends’ information sharing among 253 million subjects in situ. Those who are exposed are significantly more likely to spread information, and do so sooner than those who are not exposed. We further examine the relative role of strong and weak ties in information propagation. We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information. This suggests that weak ties may play a more dominant role in the dissemination of information online than currently believed.

Sample size: 253 million Facebook users.

Pay attention to the line:

We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information.

If you have a “Web scale” (whatever that means) information delivery issue, you need not only to target CNN and Drudge with press releases but also to consider targeting actors with abundant weak ties.

Thinking this could be important in topic map driven applications that “push” novel information into the social network of a large, distributed company. You know how few of us actually read the tiresome broadcast stuff from HR, etc., so what if the important parts were “reported” piecemeal by others?
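A rough sketch of what “targeting actors with abundant weak ties” could look like over a social graph, using networkx; the graph, the strength attribute and the threshold are all illustrative:

```python
# Rank actors by how many weak ties they hold; seed novel information with the
# top few. Graph, tie strengths and the 0.3 threshold are illustrative only.
import networkx as nx

G = nx.Graph()
G.add_edge("alice", "bob", strength=0.9)    # strong tie
G.add_edge("alice", "carol", strength=0.2)  # weak tie
G.add_edge("alice", "dave", strength=0.1)   # weak tie
G.add_edge("bob", "erin", strength=0.15)    # weak tie

WEAK = 0.3  # ties below this strength count as weak

def weak_tie_count(graph, node):
    return sum(1 for _, _, d in graph.edges(node, data=True) if d["strength"] < WEAK)

seeds = sorted(G.nodes, key=lambda n: weak_tie_count(G, n), reverse=True)
print(seeds[:2])  # e.g. ['alice', 'bob']
```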

It is great to have a large functioning topic map but it doesn’t become useful until people make the information it delivers their own and take action based upon it.

January 4, 2012

Data Structure for Social News Streams on Graph Databases

Filed under: Graphs,News,Social Media — Patrick Durusau @ 8:34 am

Data Structure for Social News Streams on Graph Databases

René Pickhardt writes (in part):

I also looked into the case of saving the news stream as a flat file for every user in which the events from his friends are saved for every user. For some reason I thought I had picked up somewhere that facebook is running on such a system. But right now I can’t find the resource anymore. If you can, please tell me! Anyway while studying these different approaches I realized that the flat file approach even though it seems to be primitive makes perfect sense. It scales to infinity and is very fast for reading! Even though I can’t find the resource anymore I will still call this approach the Facebook approach.

I was now wondering how you would store a social news stream in a graph data base like neo4j in a way that you get some nice properties. More specifically I wanted to combine the advantages of both the facebook and the twitter approach and try to get rid of the downfalls. And guess what! To me this seems actually possible on graph data bases. The key Idea is to store the social network and content items created by the users not only in a star topology but also in a list topology ordered by time of occurring events. The crucial part is to maintain this topology which is actually possible in O(1) while updates occur to the graph. (emphasis in original)

See the post for links to his poster, paper and other interesting material.
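To see why time-ordered storage pays off at read time, here is a toy merge of per-author item lists into a news stream; this is not René's Graphity index itself (which also threads an ego network's authors into a list ordered by latest activity), just the underlying idea:

```python
# Toy news stream read: each author's items are kept newest-first, and a read
# merges the followees' lists by timestamp. Data and follower graph are invented.
import heapq

items = {
    "bob":   [(105, "bob", "post c"), (100, "bob", "post a")],  # newest first
    "carol": [(103, "carol", "post b")],
}
follows = {"alice": ["bob", "carol"]}

def news_stream(user, k):
    """Return the k newest items across the user's followees."""
    merged = heapq.merge(*(items.get(f, []) for f in follows[user]),
                         key=lambda item: item[0], reverse=True)
    return [item for _, item in zip(range(k), merged)]

print(news_stream("alice", 2))
# [(105, 'bob', 'post c'), (103, 'carol', 'post b')]
```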

Big Brother’s Name is…

Filed under: Marketing,Networks,Social Media,Social Networks — Patrick Durusau @ 7:09 am

not the FBI, CIA, Interpol, Mossad, NSA or any other government agency.

Walmart all but claims that name at: Social Genome.

From the webpage:

In a sense, the social world — all the millions and billions of tweets, Facebook messages, blog postings, YouTube videos, and more – is a living organism itself, constantly pulsating and evolving. The Social Genome is the genome of this organism, distilling it to the most essential aspects.

At the labs, we have spent the past few years building and maintaining the Social Genome itself. We do this using public data on the Web, proprietary data, and a lot of social media. From such data we identify interesting entities and relationships, extract them, augment them with as much information as we can find, then add them to the Social Genome.

For example, when Susan Boyle was first mentioned on the Web, we quickly detected that she was becoming an interesting person in the world of social media. So we added her to the Social Genome, then monitored social media to collect more information about her. Her appearances became events, and the bigger events were added to the Social Genome as well. As another example, when a new coffee maker was mentioned on the Web, we detected and added it to the Social Genome. We strive to keep the Social Genome up to date. For example, we typically detect and add information from a tweet into the Social Genome within two seconds, from the moment the tweet arrives in our labs.

As a result of our effort, the Social Genome is a vast, constantly changing, up-to-date knowledge base, with hundreds of millions of entities and relationships. We then use the Social Genome to perform semantic analysis of social media, and to power a broad array of e-commerce applications. For example, if a user never uses the word “coffee”, but has mentioned many gourmet coffee brands (such as “Kopi Luwak”) in his tweets, we can use the Social Genome to detect the brands, and infer that he is interested in gourmet coffee. As another example, using the Social Genome, we may find that a user frequently mentions movies in her tweets. As a result, when she tweeted “I love salt!”, we can infer that she is probably talking about the movie “salt”, not the condiment (both of which appear as entities in the Social Genome).

Two seconds after you hit “send” on your tweet, it has been stripped, analyzed and added to the Social Genome at WalMart. For every tweet. Plus other data.
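The inference described above is easy to caricature in a few lines; the entity table below is invented and the real Social Genome obviously does far more (disambiguation, relationships, events), but the shape is this:

```python
# Toy entity spotting and interest inference over tweets. The entity table is
# invented; Walmart's Social Genome holds hundreds of millions of entities.
ENTITIES = {
    "kopi luwak": {"type": "coffee brand", "interest": "gourmet coffee"},
    "salt":       {"type": "movie",        "interest": "movies"},
}

def detect_entities(tweet):
    text = tweet.lower()
    return [entity for name, entity in ENTITIES.items() if name in text]

def inferred_interests(tweets):
    interests = set()
    for tweet in tweets:
        interests.update(e["interest"] for e in detect_entities(tweet))
    return interests

print(inferred_interests(["I love salt!", "Trying Kopi Luwak tomorrow"]))
# {'movies', 'gourmet coffee'}
```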

How should we respond to this news?

One response is to trust that WalMart, and whoever it sells this data trove to, will use the information to enhance your shopping experience and achieve greater fulfilment by balancing shopping against your credit limit.

Another response is to ask for legislation to attempt regulation of a multi-national corporation that is larger than many governments.

Another response is to hold sit-ins and social consciousness raising events at WalMart locations.

My suggestion? One good turn deserves another.

Walmart is owned by someone. Walmart has a board of directors. Walmart has corporate officers. Walmart has managers, sales representatives, attorneys and advertising executives, all of whom have information footprints. Perhaps not as public as ours, but they exist. Why not gather up information on who is running Walmart? Fighting fire with fire, as they say. Publish that information so that regulators, stock brokers, divorce lawyers and others can have access to it.

Let’s welcome WalMart as “Little Big Brothers.”

December 21, 2011

Semantic Web Technologies and Social Searching for Librarians – No Buy

Filed under: Searching,Semantic Web,Social Media — Patrick Durusau @ 7:26 pm

Semantic Web Technologies and Social Searching for Librarians By Robin Fay and Michael Sauers.

I don’t remember recommending a no buy on any book on this blog, particularly one I haven’t read, but there is a first time for everything.

Yes, I haven’t read the book because it isn’t available yet.

How do I know to recommend no buy on Robin Fay and Michael Sauers’ “Semantic Web Technologies and Social Searching for Librarians”?

Let’s look at the evidence, starting with the overview:

There are trillions of bytes of information within the web, all of it driven by behind-the-scenes data. Vast quantities of information make it hard to find what’s really important. Here’s a practical guide to the future of web-based technology, especially search. It provides the knowledge and skills necessary to implement semantic web technology. You’ll learn how to start and track trends using social media, find hidden content online, and search for reusable online content, crucial skills for those looking to be better searchers. The authors explain how to explore data and statistics through WolframAlpha, create searchable metadata in Flickr, and give meaning to data and information on the web with Google’s Rich Snippets. Let Robin Fay and Michael Sauers show you how to use tools that will awe your users with your new searching skills.

So, having read this book, you will know:

  • the future of web-based technology, especially search
  • [the] knowledge and skills necessary to implement semantic web technology
  • [how to] start and track trends using social media
  • [how to] find hidden content online
  • [how to] search for reusable online content
  • [how to] explore data and statistics through WolframAlpha
  • [how to] create searchable metadata in Flickr
  • [how to] give meaning to data and information on the web with Google’s Rich Snippets

The other facts you need to consider?

6 x 9 | 125 pp. | $59.95

So, in 125 pages, call it 105, allowing for title page, table of contents and some sort of index, you are going to learn all those skills?

For about the same amount of money, you can get a copy of Modern Information Retrieval: The Concepts and Technology Behind Search by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, which covers only search in 944 pages.

I read a lot of discussion about teaching students to critically evaluate information that they read on the WWW.

Any institution that buys this book needs to implement critical evaluation of information training for its staff/faculty.

December 11, 2011

Klout Search Powered by ElasticSearch, Scala, Play Framework and Akka

Filed under: Social Media,Social Networks — Patrick Durusau @ 9:24 pm

Klout Search Powered by ElasticSearch, Scala, Play Framework and Akka

From the post:

At Klout, we love data and as Dave Mariani, Klout’s VP of Engineering, stated in his latest blog post, we’ve got lots of it! Klout currently uses Hadoop to crunch large volumes of data but what do we do with that data? You already know about the Klout score, but I want to talk about a new feature I’m extremely excited about — search!

Problem at Hand

I just want to start off by saying, search is hard! Yet, the requirements were pretty simple: we needed to create a robust solution that would allow us to search across all scored Klout users. Did I mention it had to be fast? Everyone likes to go fast! The problem is that 100 Million People have Klout (and that was this past September—an eternity in Social Media time) which means our search solution had to scale, scale horizontally.

Well, more of a “testimonial,” as the Wizard of Oz would say, but the numbers are serious enough to merit further investigation.
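For the curious, a user search like Klout's bottoms out in queries of roughly this shape against Elasticsearch's REST search API; the index name, field name and local endpoint are my assumptions, not Klout's actual setup:

```python
# Minimal sketch of a full-text user search against Elasticsearch's REST API.
# Index name, field name and endpoint are assumptions, not Klout's actual setup.
import json
import requests

def search_users(name, host="http://localhost:9200", index="users", size=10):
    """Match on the user name field, returning the highest-scoring hits."""
    body = {"query": {"match": {"name": name}}, "size": size}
    resp = requests.post(f"{host}/{index}/_search",
                         data=json.dumps(body),
                         headers={"Content-Type": "application/json"})
    resp.raise_for_status()
    return [hit["_source"] for hit in resp.json()["hits"]["hits"]]

# Example (requires a running Elasticsearch node with a "users" index):
# print(search_users("dave"))
```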

Although I must admit that social networking sites are spreading faster than, well, spreading faster than some social contagions.

Unless someone is joining multiple times for each one, for spamming purposes, I suspect some consolidation is in the not too distant future. What happens to all the links, etc., at the services that go away?

Just curious.

November 24, 2011

An R function to analyze your Google Scholar Citations page

Filed under: Citation Indexing,Social Media — Patrick Durusau @ 3:51 pm

An R function to analyze your Google Scholar Citations page

From the post:

Google scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here.

I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar Citations data so I could play with it. Then they got all crazy on it and wrote a couple of really neat functions. All cool/interesting components of these functions are their ideas and any bugs were introduced by me when I was trying to fiddle with the code at the end.

Features include:

The function will download all of Rafa’s citation data and put it in the matrix out. It will also make wordclouds of (a) the co-authors on his papers and (b) the titles of his papers and save them in the pdf file specified (There is an option to turn off plotting if you want).

It can also calculate citation indices.
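The citation indices themselves are tiny computations once the counts are downloaded; a minimal sketch (in Python rather than R, and assuming you already have the per-paper citation counts):

```python
# Two common citation indices over a list of per-paper citation counts.
def h_index(citations):
    """Largest h such that h papers have at least h citations each."""
    counts = sorted(citations, reverse=True)
    return sum(1 for i, c in enumerate(counts, start=1) if c >= i)

def i10_index(citations):
    """Number of papers with at least 10 citations."""
    return sum(1 for c in citations if c >= 10)

cites = [42, 17, 12, 9, 5, 3, 1, 0]
print(h_index(cites), i10_index(cites))  # 5 3
```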

Scholars are fairly peripatetic these days and so have webpages, projects, courses, not to mention social media postings using various university identities. A topic map would be a nice complement to this function to gather up the “grey” literature that underlies final publications.

November 20, 2011

Graphity: An efficient Graph Model for Retrieving the Top-k News Feeds for users in social networks

Filed under: Graphity,Graphs,Neo4j,Networks,Social Media,Social Networks — Patrick Durusau @ 4:11 pm

Graphity: An efficient Graph Model for Retrieving the Top-k News Feeds for users in social networks by Rene Pickhardt.

From the post:

I already said that my first research results have been submitted to SIGMOD conference to the social networks and graph databases track. Time to sum up the results and blog about them.

I created a data model to make retrieval of social news feeds in social networks very efficient. It is able to dynamically retrieve more than 10,000 temporally ordered news feeds per second in social networks with millions of users like Facebook and Twitter by using graph data bases (like neo4j).

10,000 temporally ordered news feeds per second? I can imagine any number of use cases that fit comfortably within those performance numbers!

How about you?

Looking forward to the paper (and source code)!

November 6, 2011

TimesOpen: Social Media on Nov 14

Filed under: Conferences,Social Media — Patrick Durusau @ 5:45 pm

TimesOpen: Social Media on Nov 14

I won’t be in New York for this event, but if you are around, it is well worth the time to attend! Social media content (and its semantics) is going to figure prominently in some future topic map applications. Get a glimpse of the future!

We’re excited to announce the next TimesOpen event, a discussion of what’s next in social media technology, interfaces and business models, on Monday, November 14, starting at 6:30 p.m. in the Times Building conference facility on the 15th floor. Registration is open. There is no cost to attend. Seats are limited.

BTW, if you do attend, I would appreciate a pointer to your posts about the event. Thanks!

