Archive for the ‘Social Media’ Category

Everything You Need To Know About Social Media Search

Sunday, December 14th, 2014

Everything You Need To Know About Social Media Search by Olsy Sorokina.

From the post:

For the past decade, social networks have been the most universally consistent way for us to document our lives. We travel, build relationships, accomplish new goals, discuss current events and welcome new lives—and all of these events can be traced on social media. We have created hashtags like #ThrowbackThursday and apps like Timehop to reminisce on all the past moments forever etched in the social web in form of status updates, photos, and 140-character phrases.

Major networks demonstrate their awareness of the role they play in their users’ lives by creating year-end summaries such as Facebook’s Year in Review, and Twitter’s #YearOnTwitter. However, much of the emphasis on social media has been traditionally placed on real-time interactions, which often made it difficult to browse for past posts without scrolling down for hours on end.

The bias towards real-time messaging has changed in a matter of a few days. Over the past month, three major social networks announced changes to their search functions, which made finding old posts as easy as a Google search. If you missed out on the news or need a refresher, here’s everything you need to know.

I suppose Olsy means in addition to search in general sucking.

Interesting tidbit on Facebook:


This isn’t Facebook’s first attempt at building a search engine. The earlier version of Graph Search gave users search results in response to longer-form queries, such as “my friends who like Game of Thrones.” However, the semantic search never made it to the mobile platforms; many supposed that using complex phrases as search queries was too confusing for an average user.

Does anyone have any user research on the ability of users to use complex phrases as search queries?

I ask because if users have difficulty authoring “complex” semantics and difficulty querying with “complex” semantics, it stands to reason they may have difficulty interpreting “complex” semantic results. Yes?

If all three of those are the case, then how do we impart the value-add of “complex” semantics without tripping over one of those limitations?

Olsy also covers Instagram and Twitter. Twitter’s advanced search looks like the standard include/exclude, etc. type of “advanced” search. That was “advanced” forty years ago in the early OPACs, but it isn’t really “advanced” now.

Catch up on these new search features. They will provide at least a minimum of grist for your topic map mill.

The 2014 Social Media Glossary: 154 Essential Definitions

Saturday, October 25th, 2014

The 2014 Social Media Glossary: 154 Essential Definitions by Matt Foulger.

From the post:

Welcome to the 2014 edition of the Hootsuite Social Media Glossary. This is a living document that will continue to grow as we add more terms and expand our definitions. If there’s a term you would like to see added, let us know in the comments!

I searched but did not find an earlier version of this glossary on the Hootsuite blog. I have posted a comment asking for pointers to the earlier version(s).

In the meantime, you may want to compare: The Ultimate Glossary: 120 Social Media Marketing Terms Explained by Kipp Bodnar. From 2011 but if you don’t know the terms, even a 2011 posting may be helpful.

We all accept the notion that language evolves, but within a domain that evolution is gradual, tracking shifts in the domain’s thinking, which makes it harder for domain members to notice.

The evolution of a rapidly changing vocabulary, such as the one used in social media, may be easier to see.

Web Apps in the Cloud: Even Astronomers Can Write Them!

Wednesday, October 22nd, 2014

Web Apps in the Cloud: Even Astronomers Can Write Them!

From the post:

Philip Cowperthwaite and Peter K. G. Williams work in time-domain astronomy at Harvard. Philip is a graduate student working on the detection of electromagnetic counterparts to gravitational wave events, and Peter studies magnetic activity in low-mass stars, brown dwarfs, and planets.

Astronomers that study GRBs are well-known for racing to follow up bursts immediately after they occur — thanks to services like the Gamma-ray Coordinates Network (GCN), you can receive an email with an event position less than 30 seconds after it hits a satellite like Swift. It’s pretty cool that we professionals can get real-time notification of stars exploding across the universe, but it also seems like a great opportunity to convey some of the excitement of cutting-edge science to the broader public. To that end, we decided to try to expand the reach of GCN alerts by bringing them on to social media. Join us for a surprisingly short and painless tale about the development of YOITSAGRB, a tiny piece of Python code on the Google App Engine that distributes GCN alerts through the social media app Yo.

If you’re not familiar with Yo, there’s not much to know. Yo was conceived as a minimalist social media experience: users can register a unique username and send each other a message consisting of “Yo,” and only “Yo.” You can think of it as being like Twitter, but instead of 140 characters, you have zero. (They’ve since added more features such as including links with your “Yo,” but we’re Yo purists so we’ll just be using the base functionality.) A nice consequence of this design is that the Yo API is incredibly straightforward, which is convenient for a “my first web app” kind of project.
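The Yo API the authors praise for being straightforward was, at the time, essentially a single authenticated POST. Here is a rough sketch of what a GCN-triggered "Yo all subscribers" call might look like; the endpoint URL and parameter names are from memory and may have changed, so treat them as assumptions rather than current documentation:

```python
import json
import urllib.request

YO_ALL_ENDPOINT = "https://api.justyo.co/yoall/"  # assumed endpoint, may have changed

def build_yo_all_request(api_token, link=None):
    """Build the POST payload for Yo-ing every subscriber of an account.

    A bare payload sends just "Yo"; an optional link (e.g. a GCN circular)
    rides along with it.
    """
    payload = {"api_token": api_token}
    if link is not None:
        payload["link"] = link
    return payload

def send_yo_all(api_token, link=None):
    """Actually send the Yo (network call, shown but not executed here)."""
    data = json.dumps(build_yo_all_request(api_token, link)).encode("utf-8")
    req = urllib.request.Request(
        YO_ALL_ENDPOINT, data=data,
        headers={"Content-Type": "application/json"}, method="POST")
    with urllib.request.urlopen(req) as resp:
        return resp.status

# A burst alert might attach the event page as a link:
payload = build_yo_all_request("MY-API-TOKEN", link="http://gcn.gsfc.nasa.gov/")
```

The zero-character message is exactly why the API can be this small: all the semantics live in who sent the Yo and when, not in any payload.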

While “Yo” has been expanded to carry more content, the original bare “Yo” remains an illustration of the many meanings that can be signaled by the same term. In this case, the detection of a gamma-ray burst somewhere in the known universe.

Or “Yo” could mean it is time to start some other activity when received from a particular sender. Or a message could be composed entirely of “Yo’s,” where the identity of each sender carries the significance. Or “Yo’s” could be sent at particular times to compose a message. Or sent simply to leave the impression that messages were being sent. 😉

So, does a “Yo” have any semantics separate and apart from that read into it by a “Yo” recipient?

Twitter and the Arab Spring

Sunday, September 7th, 2014

You may remember that “effective use of social media” was claimed as a hallmark of the Arab Spring. (The Arab Spring and the impact of social media and Opening Closed Regimes: What Was the Role of Social Media During the Arab Spring?)

When evaluating such claims remember that your experience with social media may or may not represent the experience with social media elsewhere.

For example, Citizen Engagement and Public Services in the Arab World: The Potential of Social Media from Mohammed Bin Rashid School of Government (2014) reports:

Figure 23: Egypt 22.4% Facebook User Penetration

Figure 34: Egypt 1.26% Twitter user penetration rate.

Those figures are as of 2014. Figures for prior years are smaller.

That doesn’t sound like the level of social media penetration necessary to create and then drive a social movement like the Arab Spring.

You can find additional datasets and additional information at: http://www.arabsocialmediareport.com. Registration is free.

And check out: Mohammed Bin Rashid School of Government

I first saw this in a tweet by Peter W. Singer.

Conference on Weblogs and Social Media (Proceedings)

Saturday, May 31st, 2014

Proceedings of the Eighth International Conference on Weblogs and Social Media

A great collection of fifty-eight papers and thirty-one posters on weblogs and social media.

Not directly applicable to topic maps, but social media messages are as confused, ambiguous, etc., as those in any other area. Perhaps more so, though I am not aware of a reliable measure of semantic confusion that would allow comparing different media.

These papers may give you some insight into social media and useful ways for processing its messages.

I first saw this in a tweet by Ben Hachey.

Social Media Mining: An Introduction

Saturday, April 26th, 2014

Social Media Mining: An Introduction by Reza Zafarani, Mohammad Ali Abbasi, and Huan Liu.

From the webpage:

The growth of social media over the last decade has revolutionized the way individuals interact and industries conduct business. Individuals produce data at an unprecedented rate by interacting, sharing, and consuming content through social media. Understanding and processing this new type of data to glean actionable patterns presents challenges and opportunities for interdisciplinary research, novel algorithms, and tool development. Social Media Mining integrates social media, social network analysis, and data mining to provide a convenient and coherent platform for students, practitioners, researchers, and project managers to understand the basics and potentials of social media mining. It introduces the unique problems arising from social media data and presents fundamental concepts, emerging issues, and effective algorithms for network analysis and data mining. Suitable for use in advanced undergraduate and beginning graduate courses as well as professional short courses, the text contains exercises of different degrees of difficulty that improve understanding and help apply concepts, principles, and methods in various scenarios of social media mining.

Another Cambridge University Press title that is available in pre-publication PDF format.

If you are contemplating writing a textbook, Cambridge University Press access policies should be one of your considerations in seeking a publisher.

You can download the entire book, chapters, and slides from Social Media Mining: An Introduction

Do remember that only 14% of the U.S. adult population uses Twitter. Whatever “trends” you extract from Twitter may or may not reflect “trends” in the larger population.

I first saw this in a tweet by Stat Fact.

Are You A Facebook Slacker? (Or, “Don’t “Like” Me, Support Me!”)

Sunday, November 10th, 2013

Their title reads: The Nature of Slacktivism: How the Social Observability of an Initial Act of Token Support Affects Subsequent Prosocial Action by Kirk Kristofferson, Katherine White, and John Peloza. (Journal of Consumer Research, 2013. DOI: 10.1086/674137)

Abstract:

Prior research offers competing predictions regarding whether an initial token display of support for a cause (such as wearing a ribbon, signing a petition, or joining a Facebook group) subsequently leads to increased and otherwise more meaningful contributions to the cause. The present research proposes a conceptual framework elucidating two primary motivations that underlie subsequent helping behavior: a desire to present a positive image to others and a desire to be consistent with one’s own values. Importantly, the socially observable nature (public vs. private) of initial token support is identified as a key moderator that influences when and why token support does or does not lead to meaningful support for the cause. Consumers exhibit greater helping on a subsequent, more meaningful task after providing an initial private (vs. public) display of token support for a cause. Finally, the authors demonstrate how value alignment and connection to the cause moderate the observed effects.

From the introduction:

We define slacktivism as a willingness to perform a relatively costless, token display of support for a social cause, with an accompanying lack of willingness to devote significant effort to enact meaningful change (Davis 2011; Morozov 2009a).

From the section: The Moderating Role of Social Observability: The Public versus Private Nature of Support:

…we anticipate that consumers who make an initial act of token support in public will be no more likely to provide meaningful support than those who engaged in no initial act of support.

Four (4) detailed studies and an extensive review of the literature are offered to support the authors’ conclusions.

The only source that I noticed missing was:

10 Two men went up into the temple to pray; the one a Pharisee, and the other a publican.

11 The Pharisee stood and prayed thus with himself, God, I thank thee, that I am not as other men are, extortioners, unjust, adulterers, or even as this publican.

12 I fast twice in the week, I give tithes of all that I possess.

13 And the publican, standing afar off, would not lift up so much as his eyes unto heaven, but smote upon his breast, saying, God be merciful to me a sinner.

14 I tell you, this man went down to his house justified rather than the other: for every one that exalteth himself shall be abased; and he that humbleth himself shall be exalted.

King James Version, Luke 18: 10-14.

The authors would reverse the roles of the Pharisee and the publican, finding that the Pharisee contributed “meaningful support” while the publican did not.

We contrast token support with meaningful support, which we define as consumer contributions that require a significant cost, effort, or behavior change in ways that make tangible contributions to the cause. Examples of meaningful support include donating money and volunteering time and skills.

If you are trying to attract “meaningful support” for your cause or organization, i.e., avoid slackers, there is much to learn here.

If you are trying to move beyond the “cheap grace” (Bonhoeffer)* of “meaningful support” and towards “meaningful change,” there is much to be learned here as well.

Governments, corporations, ad agencies and even your competitors are manipulating the public understanding of “meaningful support” and “meaningful change,” as well as the acceptable means for both.

You can play on their terms and lose, or you can define your own terms and roll the dice.

Questions?


* I know the phrase “cheap grace” from Bonhoeffer but in running a reference to ground, I saw a statement in Wikipedia that Bonhoeffer learned that phrase from Adam Clayton Powell, Sr. Homiletics have never been a strong interest of mine but I will try to run down some sources on sermons by Adam Clayton Powell, Sr.

Twitter Data Analytics

Wednesday, September 11th, 2013

Twitter Data Analytics by Shamanth Kumar, Fred Morstatter, and Huan Liu.

From the webpage:

Social media has become a major platform for information sharing. Due to its openness in sharing data, Twitter is a prime example of social media in which researchers can verify their hypotheses, and practitioners can mine interesting patterns and build realworld applications. This book takes a reader through the process of harnessing Twitter data to find answers to intriguing questions. We begin with an introduction to the process of collecting data through Twitter’s APIs and proceed to discuss strategies for curating large datasets. We then guide the reader through the process of visualizing Twitter data with realworld examples, present challenges and complexities of building visual analytic tools, and provide strategies to address these issues. We show by example how some powerful measures can be computed using various Twitter data sources. This book is designed to provide researchers, practitioners, project managers, and graduate students new to the field with an entry point to jump start their endeavors. It also serves as a convenient reference for readers seasoned in Twitter data analysis.
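The “strategies for curating large datasets” the abstract mentions boil down to steps like deduplication (the streaming API can deliver the same tweet twice), language filtering, and time ordering. A minimal sketch; the field names mimic Twitter’s JSON, but this is generic illustrative code, not the book’s:

```python
def curate_tweets(tweets, lang="en"):
    """Minimal curation pass over raw tweet dicts: drop duplicates by id,
    keep a single language, and order by timestamp for later analysis."""
    seen = set()
    curated = []
    for t in tweets:
        if t["id"] in seen or t.get("lang") != lang:
            continue
        seen.add(t["id"])
        curated.append(t)
    curated.sort(key=lambda t: t["created_at"])
    return curated

raw = [
    {"id": 2, "lang": "en", "created_at": 20, "text": "second"},
    {"id": 1, "lang": "en", "created_at": 10, "text": "first"},
    {"id": 2, "lang": "en", "created_at": 20, "text": "second"},  # duplicate delivery
    {"id": 3, "lang": "fr", "created_at": 30, "text": "autre"},   # other language
]
print([t["id"] for t in curate_tweets(raw)])  # → [1, 2]
```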

Preprint with data set on analyzing Twitter data.

Although running a scant seventy-nine (79) pages, including an index, Twitter Data Analytics (TDA) covers a remarkable amount of ground.

Each chapter ends with suggestions for further reading and references.

In addition to learning more about Twitter and its APIs, the reader will be introduced to MongoDB, JUNG and D3.

No mean accomplishment for seventy-nine (79) pages!

Social Remains Isolated From ‘Business-Critical’ Data

Wednesday, August 14th, 2013

Social Remains Isolated From ‘Business-Critical’ Data by Aarti Shah.

From the post:

Social data — including posts, comments and reviews — are still largely isolated from business-critical enterprise data, according to a new report from the Altimeter Group.

The study considered 35 organizations — including Caesar’s Entertainment and Symantec — that use social data in context with enterprise data, defined as information collected from CRM, business intelligence, market research and email marketing, among other sources. It found that the average enterprise-class company owns 178 social accounts and 13 departments — including marketing, human resources, field sales and legal — are actively engaged on social platforms.

“Organizations have invested in social media and tools are consolidating but it’s all happening in a silo,” said Susan Etlinger, the report’s author. “Tools tend to be organized around departments because that’s where budgets live…and the silos continue because organizations are designed for departments to work fairly autonomously.”

Somewhat surprisingly, the report finds social data is often difficult to integrate because it is touched by so many organizational departments, all with varying perspectives on the information. The report also notes the numerous nuances within social data make it problematic to apply general metrics across the board and, in many organizations, social data doesn’t carry the same credibility as its enterprise counterpart. (emphasis added)

Isn’t the definition of a silo the organization of data from a certain perspective?

If so, why would it be surprising that different views on data make it difficult to integrate?

Viewing data from one perspective isn’t the same as viewing it from another perspective.

Not really a question of integration but of how easy/hard it is to view data from a variety of equally legitimate perspectives.

Rather than a quest for “the” view shouldn’t we be asking users: “What view serves you best?”

AAAI – Weblogs and Social Media

Tuesday, July 9th, 2013

Seventh International AAAI Conference on Weblogs and Social Media

Abstracts and papers from the Seventh International AAAI Conference on Weblogs and Social Media.

Much to consider:

Frontmatter: Six (6) entries.

Full Papers: Sixty-nine (69) entries.

Poster Papers: Eighteen (18) entries.

Demonstration Papers: Five (5) entries.

Computational Personality Recognition: Ten (10) entries.

Social Computing for Workforce 2.0: Seven (7) entries.

Social Media Visualization: Four (4) entries.

When the City Meets the Citizen: Nine (9) entries.

Be aware that the links for tutorials and workshops only give you the abstracts describing the tutorials and workshops.

There is the obligatory “blind men and the elephant” paper:

Blind Men and the Elephant: Detecting Evolving Groups in Social News

Abstract:

We propose an automated and unsupervised methodology for a novel summarization of group behavior based on content preference. We show that graph theoretical community evolution (based on similarity of user preference for content) is effective in indexing these dynamics. Combined with text analysis that targets automatically-identified representative content for each community, our method produces a novel multi-layered representation of evolving group behavior. We demonstrate this methodology in the context of political discourse on a social news site with data that spans more than four years and find coexisting political leanings over extended periods and a disruptive external event that lead to a significant reorganization of existing patterns. Finally, where there exists no ground truth, we propose a new evaluation approach by using entropy measures as evidence of coherence along the evolution path of these groups. This methodology is valuable to designers and managers of online forums in need of granular analytics of user activity, as well as to researchers in social and political sciences who wish to extend their inquiries to large-scale data available on the web.
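The abstract’s use of entropy as evidence of coherence can be illustrated with a toy reconstruction (my own sketch, not the authors’ actual measure): take one community’s members at window t, look up which community each lands in at window t+1, and compute the Shannon entropy of that destination distribution. Zero means the group moved as one block; higher values mean dispersal.

```python
from collections import Counter
from math import log2

def transition_entropy(members, next_labels):
    """Shannon entropy (bits) of where one community's members land in the
    next time window. 0.0 = the group stayed together; higher = dispersal."""
    dests = Counter(next_labels[m] for m in members if m in next_labels)
    total = sum(dests.values())
    return sum(-(n / total) * log2(n / total) for n in dests.values())

# Window t: one community of four users; window t+1: their new labels.
group = ["a", "b", "c", "d"]
coherent = {"a": 1, "b": 1, "c": 1, "d": 1}  # all stay together
split    = {"a": 1, "b": 1, "c": 2, "d": 2}  # group splits in half
print(transition_entropy(group, coherent))  # → 0.0
print(transition_entropy(group, split))     # → 1.0
```

Averaging this quantity along a group’s evolution path gives a coherence score that needs no ground-truth labels, which is the appeal of the approach.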

It is a great paper but commits a common error when it notes:

Like the parable of Blind Men and the Elephant2, these techniques provide us with disjoint, specific pieces of information.

Yes, the parable is oft told to make a point about partial knowledge, but the careful observer will ask:

How are we different from the blind men trying to determine the nature of an elephant?

Aren’t we also blind men trying to determine the nature of blind men who are examining an elephant?

And so on?

Not that being blind men should keep us from having opinions, but it should make us wary of how deeply we are attached to them.

Not only are there elephants all the way down, there are blind men before us, with us (including ourselves), and around us.

Data Socializing

Tuesday, April 23rd, 2013

If you need more opportunities for data socializing, KDNuggets has compiled: Top 30 LinkedIn Groups for Analytics, Big Data, Data Mining, and Data Science.

Here’s an interesting test:

Write down your LinkedIn groups and compare your list to this one.

Enjoy!

ViralSearch: How Viral Content Spreads over Twitter

Wednesday, March 6th, 2013

ViralSearch: How Viral Content Spreads over Twitter by Andrew Vande Moere.

From the post:

ViralSearch [microsoft.com], developed by Jake Hofman and others of Microsoft Research, visualizes how content spreads over social media, and Twitter in particular.

ViralSearch is based on hundred thousands of stories that are spread through billions of mentions of these stories, over many generations. In particular, it reveals the typical, hidden structures behind the sharing of viral videos, photos and posts as an hierarchical generation tree or as an animated bubble graph. The interface contains an interactive timeline of events, as well as a search field to explore specific phrases, stories, or Twitter users to provide an overview of how the independent actions of many individuals make content go viral.
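The “hierarchical generation tree” behind such visualizations can be computed from parent pointers alone: each share records who it was reshared from, and the generation of a share is one more than its parent’s. A toy sketch (my illustration, not Microsoft’s code):

```python
def generations(parents):
    """Compute the generation (depth) of every node in a sharing cascade,
    given each node's parent (None marks the seed of the story)."""
    memo = {}
    def depth(node):
        if node not in memo:
            p = parents[node]
            memo[node] = 0 if p is None else depth(p) + 1
        return memo[node]
    return {node: depth(node) for node in parents}

# A toy cascade: the seed post, two direct reshares, one second-hand reshare.
cascade = {"seed": None, "u1": "seed", "u2": "seed", "u3": "u1"}
print(generations(cascade))                 # → {'seed': 0, 'u1': 1, 'u2': 1, 'u3': 2}
print(max(generations(cascade).values()))   # → 2 generations past the seed
```

A deep, narrow tree and a shallow, wide one can involve the same number of shares but very different virality, which is exactly the distinction the generation view makes visible.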

As this tool seems only to be available within Microsoft, you can only enjoy it by watching the documentary video below.

See also NYTLabs Cascade: How Information Propagates through Social Media for a visualization of a very similar concept.

Impressive graphics!

Question: If and when you have an insight while viewing a social networking graphic, where do you capture that insight?

That is, how do you link your insight to a particular point in the graphic?

The Swipp API: Creating the World’s Social Intelligence

Monday, February 4th, 2013

The Swipp API: Creating the World’s Social Intelligence by Greg Bates.

From the post:

The Swipp API allows developers to integrate Swipp’s “Social Intelligence” into their sites and applications. Public information is not available on the API; interested parties are asked to email info@swipp.com. Once available the APIs will “make it possible for people to interact around any topic imaginable.”

[graphic omitted]

Having operated in stealth mode for 2 years, Swipp founders Don Thorson and Charlie Costantini decided to go public after Facebook’s release of it’s somewhat different competitor, the social graph. The idea is to let users rate any topic they can comment on or anything they can photograph. Others can chime in, providing an average rating by users. One cool difference: you can dislike something as well as like it, giving a rating from -5 to +5. According to Darrell Etherington at Techcrunch, the company has a three-pronged strategy of a consumer app just described, a business component tailored around specific events like the Superbowl, that will help businesses target specific segments.

A fact that seems to be lost in most discussions of social media/sites is that social intelligence already exists.

Social media/sites may assist in the capturing/recording of social intelligence but that isn’t the same thing as creating social intelligence.

It is an important distinction because understanding the capture/recording role enables us to focus on what we want to capture, and in what way.

What we decide to capture or record greatly influences the utility of the social intelligence we gather.

Such as capturing how users choose to identify particular subjects or relationships between subjects, for example.

PS: The goal of Swipp is to create a social network and ratings system (like Facebook) that is open for re-use elsewhere on the web. Adding semantic integration to that social network and ratings system would be a plus, I would imagine.

REVIEW: Crawling social media and depicting social networks with NodeXL [in 3 parts]

Friday, February 1st, 2013

REVIEW: Crawling social media and depicting social networks with NodeXL by Eruditio Loginquitas appears in three parts: Part 1 of 3, Part 2 of 3 and Part 3 of 3.

From part 1:

Surprisingly, given the complexity of the subject matter and the various potential uses by researchers from a range of fields, “Analyzing…” is a very coherent and highly readable text. The ideas are well illustrated throughout with full-color screenshots.

In the introduction, the authors explain that this is a spatially organized book—in the form of an organic tree. The early chapters are the roots which lay the groundwork of social media and social network analysis. Then, there is a mid-section that deals with how to use the NodeXL add-on to Excel. Finally, there are chapters that address particular social media platforms and how data is extracted and analyzed from each type. These descriptors include email, thread networks, Twitter, Facebook, WWW hyperlink networks, Flickr, YouTube, and wiki networks. The work is surprisingly succinct, clear, and practical.

Further, it is written with such range that it can serve as an introductory text for newcomers to social network analysis (me included) as well as those who have been using this approach for a while (but may need to review the social media and data crawling aspects). Taken in total, this work is highly informative, with clear depictions of the social and technical sides of social media platforms.

From part 2:

One of the strengths of “Analyzing Social Media Networks with NodeXL” is that it introduces a powerful research method and a tool that helps tap electronic media and non-electronic social network information intelligently, in a way that does not over-state what is knowable. The authors, Derek Hansen, Ben Schneiderman, and Marc A. Smith, are no strangers to research or academic publishing, and theirs is a fairly conservative approach in terms of what may be asserted.

To frame what may be researched, the authors use a range of resources: some generalized research questions, examples from real-world research, and step-by-step techniques for data extraction, analysis, visualization, and then further analysis.

From part 3:

What is most memorable about “Analyzing Social Media Networks with NodeXL” is the depth of information about the various social network sites that may be crawled using NodeXL. With so many evolving social network platforms, and each capturing and storing information differently, it helps to know what an actual data extractions mean.

I haven’t seen the book personally, but from this review it sounds like a good model for technical writing for a lay audience.

For that matter, a good model for writing about topic maps for a lay audience. (Many of the issues being similar.)

1 Billion Videos = No Reruns

Monday, January 14th, 2013

Viki Video: 1 Billion Videos in 150 languages Means Never Having to Say Rerun by Greg Bates.

From the post:

Tired of American TV? Tired of TV in English? Escape to Viki, the leading global TV and movie network, which provides videos with crowd sourced translations in 150 languages. The Viki API allows your users to browse more than 1 billion videos by genre, country, and language, plus search across the entire database. The API uses OAuth2.0 authentication, REST, with responses in either JSON or XML.

The Viki Platform Google Group.

Now this looks like a promising data set!

A couple of use cases for topic maps come to mind:

  • Entry in OPAC points patron mapping from catalog to videos from this database.
  • Entry returned from database maps to book in local library collection (via WorldCat) (more likely to appeal to me).

What use cases do you see?

Windows into Relational Events: Data Structures for Contiguous Subsequences of Edges

Friday, September 28th, 2012

Windows into Relational Events: Data Structures for Contiguous Subsequences of Edges by Michael J. Bannister, Christopher DuBois, David Eppstein, Padhraic Smyth.

Abstract:

We consider the problem of analyzing social network data sets in which the edges of the network have timestamps, and we wish to analyze the subgraphs formed from edges in contiguous subintervals of these timestamps. We provide data structures for these problems that use near-linear preprocessing time, linear space, and sublogarithmic query time to handle queries that ask for the number of connected components, number of components that contain cycles, number of vertices whose degree equals or is at most some predetermined value, number of vertices that can be reached from a starting set of vertices by time-increasing paths, and related queries.
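For context, here is the naive baseline the paper’s data structures improve upon: rebuild a union-find over just the edges whose timestamps fall in the queried window. This costs time linear in the window size per query, versus the paper’s near-linear preprocessing and sublogarithmic queries. A sketch under that assumption:

```python
class DSU:
    """Union-find over the vertices touched by edges in a time window."""
    def __init__(self):
        self.parent = {}
        self.components = 0

    def find(self, x):
        if x not in self.parent:          # lazily register new vertices
            self.parent[x] = x
            self.components += 1
        while self.parent[x] != x:        # path halving
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.parent[ra] = rb
            self.components -= 1

def components_in_window(edges, t1, t2):
    """Number of connected components among vertices incident to an edge
    timestamped in [t1, t2]. Rebuilt from scratch on every query."""
    dsu = DSU()
    for t, u, v in edges:
        if t1 <= t <= t2:
            dsu.union(u, v)
    return dsu.components

edges = [(1, "a", "b"), (2, "b", "c"), (5, "d", "e"), (9, "a", "e")]
print(components_in_window(edges, 1, 5))  # → 2  ({a,b,c} and {d,e})
print(components_in_window(edges, 1, 9))  # → 1
```

The paper’s contribution is answering this query, and several richer ones, without re-scanning the window each time.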

Among other interesting questions, the paper raises the issue of what time span of connections constitutes a network of interest. More than a network simply being “dynamic,” this is a definitional issue for the social network in question.

If you are working with social networks, a must read.

PS: You probably need to read: Relational events vs graphs, a posting by David Eppstein.

David details several different terms for “relational event data,” and says there are probably others they did not find. (Topic maps anyone?)

The Art of Social Media Analysis with Twitter and Python

Friday, July 20th, 2012

The Art of Social Media Analysis with Twitter and Python by Krishna Sankar.

All that social media data in your topic map has to come from somewhere. 😉

Covers both the basics of the Twitter API and social graph analysis. With code of course.
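A first step in the kind of social graph analysis Sankar covers is turning tweets into directed edges and ranking users by degree. A minimal stdlib-only sketch; the dict fields are illustrative, not the post’s code:

```python
from collections import Counter

def mention_graph(tweets):
    """Build directed (author -> mentioned user) edges from tweet dicts and
    rank users by in-degree, a crude first-pass influence measure."""
    edges = []
    for t in tweets:
        for target in t.get("mentions", []):
            edges.append((t["user"], target))
    indegree = Counter(target for _, target in edges)
    return edges, indegree

tweets = [
    {"user": "alice", "mentions": ["carol"]},
    {"user": "bob", "mentions": ["carol", "alice"]},
    {"user": "carol", "mentions": []},
]
edges, indegree = mention_graph(tweets)
print(indegree.most_common(1))  # → [('carol', 2)]
```

The same edge list feeds directly into richer measures (PageRank, betweenness) once loaded into a graph library.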

I first saw this at KDNuggets.

World Leaders Comment on Attack in Bulgaria

Thursday, July 19th, 2012

World Leaders Comment on Attack in Bulgaria

From the post:

Following the terror attack in Bulgaria killing a number of Israeli tourists on an airport bus, we can see the statements from world leaders around the globe including Israel Prime Minister Benjamin Netanyahu openly pinning the blame on Iran and threatening retaliation

If you haven’t seen one of the visualizations by Recorded Future you will be impressed by this one. Mousing over people and locations invokes what we would call scoping in a topic map context and limits the number of connections you see. And each node can lead to additional information.

While this works like a topic map, I can’t say it is a topic map application because how it works isn’t disclosed. You can read How Recorded Future Works, but you won’t be any better informed than before you read it.

Impressive work but it isn’t clear how I would integrate their matching of sources to say an internal mapping of sources? Or how I would augment their mapping with additional mappings by internal subject experts?

Or how I would map this incident to prior incidents which lead to disproportionate responses?

Or map “terrorist” attacks by the world leaders now decrying other “terrorist” attacks?

That last mapping could be an interesting one for the application of the term “terrorist.” My anecdotal experience is that it depends on the sponsor.

Would be interesting to know if systematic analysis supports that observation.

Perhaps the news media could then evenly identify the probable sponsors of “terrorist” attacks.

Social Meets Search with the Latest Version of Bing…

Saturday, June 2nd, 2012

Social Meets Search with the Latest Version of Bing…

Two things are obvious:

  • I am running a day behind.
  • Bing isn’t my default search engine. (Or I would have noticed this yesterday.)

From the post:

A few weeks ago, we introduced you to the most significant update to Bing since our launch three years ago, combining the best of search with relevant people from your social networks, including Facebook and Twitter. After the positive response to the preview, the new version of Bing is available today in the US at www.bing.com. You can now access Bing’s new three-column design, including the snapshot feature and social features.

According to a recent internal survey, nearly 75% of people spend more time than they would like searching for information online. With Bing’s new design, you can access information from the Web, including friends you do know and relevant experts that you may not know, letting you spend less time searching and more time doing.

(screenshot omitted)

Today, we’re also unveiling a new advertising campaign to support the introduction of search plus social and announcing the Bing Summer of Doing, in celebration of the new features and designed to inspire people to do amazing things this summer.

BTW, I have corrected the HTML for the Bing link from the post: www.bing.com.

When I arrived, the “top” searches were:

  • Nazi parents
  • Hosni Mubarak

“Popular” searches ranging from the inane to the irrelevant.

I need something a bit more focused on subjects of interest to me.

Perhaps automated queries that are filtered, then processed into a topic map?

Something to think about over the summer. More posts to follow on that theme.
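The “filtered queries into a topic map” idea can be sketched in a few lines. Everything here is invented for illustration: the search_web() stub stands in for a real search API, and the keyword filter is the crudest possible relevance test.

```python
def search_web(query):
    # Stand-in for a real search API call; returns (title, url) pairs.
    return [
        ("Topic Maps and Social Search", "http://example.com/tm-search"),
        ("Celebrity Gossip Roundup", "http://example.com/gossip"),
    ]

def filter_hits(hits, must_contain):
    """Keep only hits whose title mentions a subject of interest."""
    return [h for h in hits
            if any(k.lower() in h[0].lower() for k in must_contain)]

def add_to_map(topic_map, hits, subject):
    """Merge filtered hits under a subject; one topic per identifier."""
    topic = topic_map.setdefault(subject, {"occurrences": []})
    topic["occurrences"].extend(url for _, url in hits)
    return topic_map

topic_map = {}
hits = filter_hits(search_web("social search"), ["topic maps", "search"])
add_to_map(topic_map, hits, "social-media-search")
print(topic_map)
```

Run on a schedule, something in this shape would accumulate occurrences under stable subjects instead of leaving each day's results as a disposable list.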

Knowledge Extraction and Consolidation from Social Media

Thursday, May 31st, 2012

Knowledge Extraction and Consolidation from Social Media KECSM2012 – November 11 – 12, Boston, USA.

Important dates

  • Jul 31, 2012: submission deadline full & short papers
  • Aug 21, 2012: notifications for research papers
  • Sep 10, 2012: camera-ready papers due
  • Oct 05, 2012: submission deadline poster & demo abstracts
  • Oct 10, 2012: notifications posters & demos

From the website:

The workshop aims to become a highly interactive research forum for exploring innovative approaches for extracting and correlating knowledge from degraded social media by exploiting the Web of Data. While the workshop’s general focus is on the creation of well-formed and well-interlinked structured data from highly unstructured Web content, its interdisciplinary scope will bring together researchers and practitioners from areas such as the semantic and social Web, text mining and NLP, multimedia analysis, data extraction and integration, and ontology and data mapping. The workshop will also look into innovative applications that exploit extracted knowledge in order to produce solutions to domain-specific needs.

We will welcome high-quality papers about current trends in the areas listed in the following, non-exhaustive list of topics. We will seek application-oriented, as well as more theoretical papers and position papers.

Knowledge detection and extraction (content perspective)

  • Knowledge extraction from text (NLP, text mining)
  • Dealing with scalability and performance issues with regard to large amounts of heterogeneous content
  • Multilinguality issues
  • Knowledge extraction from multimedia (image and video analysis)
  • Sentiment detection and opinion mining from text and audiovisual content
  • Detection and consideration of temporal and dynamics aspects
  • Dealing with degraded Web content

Knowledge enrichment, aggregation and correlation (data perspective)

  • Modelling of events and entities such as locations, organisations, topics, opinions
  • Representation of temporal and dynamics-related aspects
  • Data clustering and consolidation
  • Data enrichment based on linked data/semantic web
  • Using reference datasets to structure, cluster and correlate extracted knowledge
  • Evaluation of automatically extracted data

Exploitation of automatically extracted knowledge/data (application perspective)

  • Innovative applications which make use of automatically extracted data (e.g. for recommendation or personalisation of Web content)
  • Semantic search in annotated Web content
  • Entity-driven navigation of user-generated content
  • Novel navigation and visualisation of extracted knowledge/graphs and associated Web resources

I like the sound of “consolidation.” An unspoken or tacit goal of any knowledge gathering. Not much use in scattered pieces on the shop floor.

Collocated with the 11th International Semantic Web Conference (ISWC2012)

EveryBlock

Thursday, May 10th, 2012

EveryBlock

I remember my childhood neighborhood just before the advent of air conditioning and the omnipresence of TV. A walk down the block gave you a good idea of what your neighbors were up to. Or not. 😉

Comparing then to now, the neighborhood where I now live, is strangely silent. Walk down my block and you hear no TVs, conversations, radios, loud discussions or the like.

We have become increasingly isolated from others by our means of transportation, entertainment and climate control.

EveryBlock offers the promise of restoring some of the random contact with our neighbors to our lives.

EveryBlock says it solves two problems:

First, there’s no good place to keep track of everything happening in your neighborhood, from news coverage to events to photography. We try to collect all of the news and civic goings-on that have happened recently in your city, and make it simple for you to keep track of news in particular areas.

Second, there’s no good way to post messages to your neighbors online. Facebook lets you post messages to your friends, Twitter lets you post messages to your followers, but no well-used service lets you post a message to people in a given neighborhood.

EveryBlock addresses the problem of geographic blocks, but how do you get information on your professional block?

Do you hear anything unexpected or different? Or do you hear the customary and expected?

Maybe your professional block has gotten too silent.

Suggestions for how to change that?

HotSocial 2012

Saturday, March 31st, 2012

HotSocial 2012: First ACM International Workshop on Hot Topics on Interdisciplinary Social Networks Research August 12, 2012, Beijing, China (in conjunction with ACM KDD 2012, August 12-16, 2012) http://user.informatik.uni-goettingen.de/~fu/hotsocial/

Important Dates:

  • Deadline for submissions: May 9, 2012 (11:59 PM, EST)
  • Notification of acceptance: June 1, 2012
  • Camera-ready version: June 12, 2012
  • HotSocial Workshop Day: Aug 12, 2012

From the post:

Among the fundamental open questions are:

  • How to access social networks data? Different communities have different means, each with pros and cons. Experience exchanges from different communities will be beneficial.
  • How to protect these data? Privacy and data protection techniques considering social and legal aspects are required.
  • How the complex systems and graph theory algorithms can be used for understanding social networks? Interdisciplinary collaboration are necessary.
  • Can social network features be exploited for a better computing and social network system design?
  • How do online social networks play a role in real-life (offline) community forming and evolution?
  • How does the human mobility and human interaction influence human behaviors and thus public health? How can we develop methodologies to investigate the public health and their correlates in the context of the social networks?

Topics of Interest:

Main topics of this workshop include (but are not limited to) the following:

  • methods for accessing social networks (e.g., sensor nets, mobile apps, crawlers) and bias correction for use in different communities (e.g., sociology, behavior studies, epidemiology)
  • privacy and ethic issues of data collection and management of large social graphs, leveraging social network properties as well as legal and social constraints
  • application of data mining and machine learning in the context of specific social networks
  • information spread models and campaign detection
  • trust and reputation and community evolution in the online and offline interacted social networks, including the presence and evolution of social identities and social capital in OSNs
  • understanding complex systems and scale-free networks from an interdisciplinary angle
  • interdisciplinary experiences and intermediate results on social network research

Sounds relevant to the “big data” stuff of interest to the White House.

PS: Have you noticed how some blogging software really sucks when you do “view source” on pages? Markup and data should be present. It makes content reuse easier. WordPress does it. How about your blogging software?

Social Media Application (FBI RFI)

Monday, February 20th, 2012

Social Media Application (FBI RFI)

Current Due Date: 11:00 AM, March 13, 2012

You have to read the Social Media Application.pdf document to prepare a response.

Be aware that as of 20 February 2012, that document has a blank page every other page. I suspect it is the complete document but have written to confirm and to request a corrected document be posted.

Out-Hoover Hoover: FBI wants massive data-mining capability for social media does mention:

Nowhere in this detailed RFI, however, does the FBI ask industry to comment on the privacy implications of such massive data collection and storage of social media sites. Nor does the FBI say how it would define the “bad actors” who would be subjected this type of scrutiny.

I take that to mean that the FBI is not seeking your comments on privacy implications or possible definitions of “bad actors.”

I won’t be able to prepare an official response because I don’t meet the contractor suitability requirements, which include a cost estimate for an offsite server as a solution to the requirements.

I will be going over the requirements and publishing my response here as though I meet the contractor suitability requirements. Could be an interesting exercise.

Social Media Monitoring with CEP, pt. 2: Context As Important As Sentiment

Sunday, February 5th, 2012

Social Media Monitoring with CEP, pt. 2: Context As Important As Sentiment by Chris Carlson.

From the post:

When I last wrote about social media monitoring, I made a case for using a technology like Complex Event Processing (“CEP”) to detect rapidly growing and geospatially-oriented social media mentions that can provide early warning detection for the public good (Social Media Monitoring for Early Warning of Public Safety Issues, Oct. 27, 2011).

A recent article by Chris Matyszczyk of CNET highlights the often conflicting and confusing nature of monitoring social media. A 26-year old British citizen, Leigh Van Bryan, gearing up for a holiday of partying in Los Angeles, California (USA), tweeted in British slang his intention to have a good time: “Free this week, for quick gossip/prep before I go and destroy America.” Since I’m not too far removed from the culture of youth, I did take this to mean partying, cutting loose, having a good time (and other not-so-current definitions).

This story does not end happily, as Van Bryan and his friend Emily Bunting were arrested and then sent back to Blighty.

This post will not increase American confidence in the TSA, but it does illustrate how context can influence the identification of a subject (or “person of interest”), or the exclusion of one.

Context is captured in topic maps using associations. In this particular case, a view of the information on the young man in question would reveal a lack of associations with any known terror suspects, people on the no-fly list, suspicious travel patterns, etc.
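A toy version of that view, with topics and associations invented for illustration, might look like this. The point is that whether a topic gets flagged depends on its associations, not on any single message in isolation:

```python
# Made-up associations for two topics. In a topic map, associations
# carry the context; here they are just (role, player) pairs.
associations = {
    "leigh-van-bryan": [("booked-flight", "los-angeles"),
                        ("works-at", "bar-in-coventry")],
    "suspect-x": [("contacted", "known-terror-suspect"),
                  ("appears-on", "no-fly-list")],
}

# Hypothetical set of "suspicious" association players.
SUSPICIOUS_PLAYERS = {"known-terror-suspect", "no-fly-list"}

def person_of_interest(topic_id):
    """Flag a topic only if its associations reach a suspicious player."""
    return any(player in SUSPICIOUS_PLAYERS
               for _, player in associations.get(topic_id, []))

print(person_of_interest("leigh-van-bryan"))  # False: no suspicious context
print(person_of_interest("suspect-x"))        # True
```

Under this (admittedly crude) rule, a tweet about “destroying” a city on holiday changes nothing unless the surrounding associations do.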

Not that having good information guarantees good decisions; technology can’t correct that particular disconnect.

The Role of Social Networks in Information Diffusion

Sunday, January 22nd, 2012

The Role of Social Networks in Information Diffusion by Eytan Bakshy, Itamar Rosenn, Cameron Marlow and Lada Adamic.

Abstract:

Online social networking technologies enable individuals to simultaneously share information with any number of peers. Quantifying the causal effect of these technologies on the dissemination of information requires not only identification of who influences whom, but also of whether individuals would still propagate information in the absence of social signals about that information. We examine the role of social networks in online information diffusion with a large-scale field experiment that randomizes exposure to signals about friends’ information sharing among 253 million subjects in situ. Those who are exposed are significantly more likely to spread information, and do so sooner than those who are not exposed. We further examine the relative role of strong and weak ties in information propagation. We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information. This suggests that weak ties may play a more dominant role in the dissemination of information online than currently believed.

Sample size: 253 million Facebook users.

Pay attention to the line:

We show that, although stronger ties are individually more influential, it is the more abundant weak ties who are responsible for the propagation of novel information.

If you have a “Web scale” (whatever that means) information delivery issue, you should not only target CNN and Drudge with press releases but also consider targeting actors with abundant weak ties.
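One crude way to pick such actors: treat low-interaction edges as weak ties and rank users by how many they have. This is a simplification of the paper's tie-strength measure, and the interaction counts below are invented for illustration:

```python
from collections import Counter

# (user, friend) -> number of recorded interactions (made-up data)
interactions = {
    ("ann", "bob"): 42, ("ann", "cid"): 1, ("ann", "dee"): 2,
    ("bob", "cid"): 30, ("eve", "ann"): 1, ("eve", "cid"): 1,
    ("eve", "dee"): 1,
}

WEAK_TIE_MAX = 3  # at or below this count, call the tie "weak"

def weak_tie_counts(edges):
    """Count, per user, how many of their ties are weak."""
    counts = Counter()
    for (u, v), n in edges.items():
        if n <= WEAK_TIE_MAX:
            counts[u] += 1
            counts[v] += 1
    return counts

# Seed the information push with the users richest in weak ties.
seeds = [u for u, _ in weak_tie_counts(interactions).most_common(2)]
print(seeds)
```

With real data the threshold would come from interaction frequency, comment counts, or whatever tie-strength proxy the platform exposes.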

Thinking this could be important in topic map driven applications that “push” novel information into the social network of a large, distributed company. You know how few of us actually read the tiresome broadcast stuff from HR, etc., so what if the important parts were “reported” piecemeal by others?

It is great to have a large functioning topic map but it doesn’t become useful until people make the information it delivers their own and take action based upon it.

Data Structure for Social News Streams on Graph Databases

Wednesday, January 4th, 2012

Data Structure for Social News Streams on Graph Databases

René Pickhardt writes (in part):

I also looked into the case of saving the news stream as a flat file for every user in which the events from his friends are saved for every user. For some reason I thought I had picked up somewhere that facebook is running on such a system. But right now I can’t find the resource anymore. If you can, please tell me! Anyway while studying these different approaches I realized that the flat file approach even though it seems to be primitive makes perfect sense. It scales to infinity and is very fast for reading! Even though I can’t find the resource anymore I will still call this approach the Facebook approach.

I was now wondering how you would store a social news stream in a graph data base like neo4j in a way that you get some nice properties. More specifically I wanted to combine the advantages of both the facebook and the twitter approach and try to get rid of the downfalls. And guess what! To me this seems actually possible on graph data bases. The key Idea is to store the social network and content items created by the users not only in a star topology but also in a list topology ordered by time of occuring events. The crucial part is to maintain this topology which is actually possible in O(1) while Updates occure to the graph. (emphasis in original)

See the post for links to his poster, paper and other interesting material.
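René's key idea translates almost directly into plain Python dicts. This sketch is my reading of his post, not his code, and it is a simplified single-author version: besides the star topology (author to item), each new item is threaded onto the head of a per-author, time-ordered list, so inserts stay O(1) and reading a stream is a walk down the list:

```python
# Minimal graph: items plus a per-author pointer to the newest item.
graph = {"items": {}, "head": {}}  # head: author -> newest item id

def post(author, item_id, text):
    """O(1) insert: link the new item before the author's previous head."""
    graph["items"][item_id] = {"author": author, "text": text,
                               "next": graph["head"].get(author)}
    graph["head"][author] = item_id

def stream(author):
    """Walk the per-author list from newest to oldest."""
    out, cur = [], graph["head"].get(author)
    while cur is not None:
        node = graph["items"][cur]
        out.append(node["text"])
        cur = node["next"]
    return out

post("rene", "i1", "first post")
post("rene", "i2", "second post")
print(stream("rene"))  # newest first
```

In a graph database like neo4j the "next" pointer would be a relationship between item nodes; the constant-time maintenance property is the same.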

Big Brother’s Name is…

Wednesday, January 4th, 2012

not the FBI, CIA, Interpol, Mossad, NSA or any other government agency.

Walmart all but claims that name at: Social Genome.

From the webpage:

In a sense, the social world — all the millions and billions of tweets, Facebook messages, blog postings, YouTube videos, and more – is a living organism itself, constantly pulsating and evolving. The Social Genome is the genome of this organism, distilling it to the most essential aspects.

At the labs, we have spent the past few years building and maintaining the Social Genome itself. We do this using public data on the Web, proprietary data, and a lot of social media. From such data we identify interesting entities and relationships, extract them, augment them with as much information as we can find, then add them to the Social Genome.

For example, when Susan Boyle was first mentioned on the Web, we quickly detected that she was becoming an interesting person in the world of social media. So we added her to the Social Genome, then monitored social media to collect more information about her. Her appearances became events, and the bigger events were added to the Social Genome as well. As another example, when a new coffee maker was mentioned on the Web, we detected and added it to the Social Genome. We strive to keep the Social Genome up to date. For example, we typically detect and add information from a tweet into the Social Genome within two seconds, from the moment the tweet arrives in our labs.

As a result of our effort, the Social Genome is a vast, constantly changing, up-to-date knowledge base, with hundreds of millions of entities and relationships. We then use the Social Genome to perform semantic analysis of social media, and to power a broad array of e-commerce applications. For example, if a user never uses the word “coffee”, but has mentioned many gourmet coffee brands (such as “Kopi Luwak”) in his tweets, we can use the Social Genome to detect the brands, and infer that he is interested in gourmet coffee. As another example, using the Social Genome, we may find that a user frequently mentions movies in her tweets. As a result, when she tweeted “I love salt!”, we can infer that she is probably talking about the movie “salt”, not the condiment (both of which appear as entities in the Social Genome).

Two seconds after you hit “send” on your tweet, it has been stripped, analyzed and added to the Social Genome at WalMart. For every tweet. Plus other data.
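The Kopi Luwak inference described above is easy to caricature in code. This is not Walmart's pipeline, just a toy dictionary lookup showing the shape of the idea; the entities and categories are invented:

```python
# A tiny stand-in for the Social Genome: entity -> category.
KNOWLEDGE_BASE = {
    "kopi luwak": "gourmet coffee",
    "blue bottle": "gourmet coffee",
    "salt": "movie",
}

def infer_interests(tweets):
    """Count how often each category's entities appear across tweets."""
    interests = {}
    for tweet in tweets:
        text = tweet.lower()
        for entity, category in KNOWLEDGE_BASE.items():
            if entity in text:
                interests[category] = interests.get(category, 0) + 1
    return interests

tweets = ["Just tried Kopi Luwak, amazing",
          "Blue Bottle again this morning"]
print(infer_interests(tweets))  # {'gourmet coffee': 2}
```

The real system presumably does disambiguation (the “salt” movie-versus-condiment case) with far more context than a substring match, but the roll-up from entity mentions to inferred interest is the same move.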

How should we respond to this news?

One response is to trust that WalMart, and whoever it sells this data trove to, will use the information to enhance your shopping experience and achieve greater fulfilment by balancing shopping against your credit limit.

Another response is to ask for legislation to attempt regulation of a multi-national corporation that is larger than many governments.

Another response is to hold sit-ins and social consciousness raising events at WalMart locations.

My suggestion? One good turn deserves another.

WalMart is owned by someone. Walmart has a board of directors. Walmart has corporate officers. Walmart has managers, sales representatives, attorneys and advertising executives. All of whom have information footprints. Perhaps not as public as ours, but they exist. Why not gather up information on who is running Walmart? Fighting fire with fire, as they say. Publish that information so that regulators, stock brokers, divorce lawyers and others can have access to it.

Let’s welcome WalMart as “Little Big Brothers.”

Semantic Web Technologies and Social Searching for Librarians – No Buy

Wednesday, December 21st, 2011

Semantic Web Technologies and Social Searching for Librarians By Robin Fay and Michael Sauers.

I don’t remember recommending a no buy on any book on this blog, particularly one I haven’t read, but there is a first time for everything.

Yes, I haven’t read the book because it isn’t available yet.

How do I know to recommend no buy on Robin Fay and Michael Sauers’ “Semantic Web Technologies and Social Searching for Librarians”?

Let’s look at the evidence, starting with the overview:

There are trillions of bytes of information within the web, all of it driven by behind-the-scenes data. Vast quantities of information make it hard to find what’s really important. Here’s a practical guide to the future of web-based technology, especially search. It provides the knowledge and skills necessary to implement semantic web technology. You’ll learn how to start and track trends using social media, find hidden content online, and search for reusable online content, crucial skills for those looking to be better searchers. The authors explain how to explore data and statistics through WolframAlpha, create searchable metadata in Flickr, and give meaning to data and information on the web with Google’s Rich Snippets. Let Robin Fay and Michael Sauers show you how to use tools that will awe your users with your new searching skills.

So, having read this book, you will know:

  • the future of web-based technology, especially search
  • [the] knowledge and skills necessary to implement semantic web technology
  • [how to] start and track trends using social media
  • [how to] find hidden content online
  • [how to] search for reusable online content
  • [how to] explore data and statistics through WolframAlpha
  • [how to] create searchable metadata in Flickr
  • [how to] give meaning to data and information on the web with Google’s Rich Snippets

The other facts you need to consider?

6 x 9 | 125 pp. | $59.95

So, in 125 pages, call it 105, allowing for title page, table of contents and some sort of index, you are going to learn all those skills?

For about the same amount of money, you can get a copy of Modern Information Retrieval: The Concepts and Technology Behind Search by Ricardo Baeza-Yates and Berthier Ribeiro-Neto, which covers only search in 944 pages.

I read a lot of discussion about teaching students to critically evaluate information that they read on the WWW.

Any institution that buys this book needs to implement critical evaluation of information training for its staff/faculty.

Klout Search Powered by ElasticSearch, Scala, Play Framework and Akka

Sunday, December 11th, 2011

Klout Search Powered by ElasticSearch, Scala, Play Framework and Akka

From the post:

At Klout, we love data and as Dave Mariani, Klout’s VP of Engineering, stated in his latest blog post, we’ve got lots of it! Klout currently uses Hadoop to crunch large volumes of data but what do we do with that data? You already know about the Klout score, but I want to talk about a new feature I’m extremely excited about — search!

Problem at Hand

I just want to start off by saying, search is hard! Yet, the requirements were pretty simple: we needed to create a robust solution that would allow us to search across all scored Klout users. Did I mention it had to be fast? Everyone likes to go fast! The problem is that 100 Million People have Klout (and that was this past September—an eternity in Social Media time) which means our search solution had to scale, scale horizontally.

Well, more of a “testimonial” as the Wizard of Oz would say but the numbers are serious enough to merit further investigation.

Although I must admit that social networking sites are spreading faster than, well, spreading faster than some social contagions.

Unless the same people are joining each service multiple times for spamming purposes, I suspect some consolidation is in the not-too-distant future. What happens to all the links, etc., at the services that go away?

Just curious.

An R function to analyze your Google Scholar Citations page

Thursday, November 24th, 2011

An R function to analyze your Google Scholar Citations page

From the post:

Google scholar has now made Google Scholar Citations profiles available to anyone. You can read about these profiles and set one up for yourself here.

I asked John Muschelli and Andrew Jaffe to write me a function that would download my Google Scholar Citations data so I could play with it. Then they got all crazy on it and wrote a couple of really neat functions. All cool/interesting components of these functions are their ideas and any bugs were introduced by me when I was trying to fiddle with the code at the end.

Features include:

The function will download all of Rafa’s citation data and put it in the matrix out. It will also make wordclouds of (a) the co-authors on his papers and (b) the titles of his papers and save them in the pdf file specified (There is an option to turn off plotting if you want).

It can also calculate citation indices.

Scholars are fairly peripatetic these days and so have webpages, projects, courses, not to mention social media postings using various university identities. A topic map would be a nice complement to this function to gather up the “grey” literature that underlies final publications.
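The complement suggested here is essentially topic map merging on a shared subject identifier. A minimal sketch, with identities, names, and sources invented for illustration:

```python
# Made-up records for the same scholar scattered across the web.
records = [
    {"id": "orcid:0000-0001-0000-0000", "name": "R. Irizarry",
     "source": "department page"},
    {"id": "orcid:0000-0001-0000-0000", "name": "Rafael Irizarry",
     "source": "preprint server"},
    {"id": "orcid:0000-0002-9999-9999", "name": "J. Muschelli",
     "source": "course site"},
]

def merge_by_identifier(recs):
    """Topic-map-style merge: one topic per subject identifier."""
    topics = {}
    for r in recs:
        t = topics.setdefault(r["id"], {"names": set(), "sources": []})
        t["names"].add(r["name"])
        t["sources"].append(r["source"])
    return topics

merged = merge_by_identifier(records)
print(len(merged))  # 2 topics from 3 records
```

With a stable identifier (ORCID is the obvious candidate) doing the merging, course pages, project sites, and social media postings under various university identities would all land on one topic alongside the Google Scholar citation data.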