Archive for the ‘Search Interface’ Category

Designing Search: Displaying Results

Saturday, April 27th, 2013

Designing Search: Displaying Results by Tony Russell-Rose.

From the post:

Search is a conversation: a dialogue between user and system that can be every bit as rich as human conversation. Like human dialogue, it is bidirectional: on one side is the user with their information need, which they articulate as some form of query.

On the other is the system and its response, which it expresses as a set of search results. Together, these two elements lie at the heart of the search experience, defining and shaping much of the information seeking dialogue. In this piece, we examine the most universal of elements within that response: the search result.

Basic Principles

Search results play a vital role in the search experience, communicating the richness and diversity of the overall result set, while at the same time conveying the detail of each individual item. This dual purpose creates the primary tension in the design: results that are too detailed risk wasting valuable screen space while those that are too succinct risk omitting vital information.

Suppose you’re looking for a new job, and you browse to the 40 or so open positions listed on UsabilityNews. The results are displayed in concise groups of ten, occupying minimal screen space. But can you tell which ones might be worth pursuing?

As always a great post by Tony but a little over the top with:

“…a dialogue between user and system that can be every bit as rich as human conversation.”

Not in my experience but that’s not everyone’s experience.

Has anyone tested the thesis that dialogue between a user and search engine is as rich as between user and reference librarian?

Leading People to Longer Queries

Thursday, March 14th, 2013

Leading People to Longer Queries by Elena Agapie, Gene Golovchinsky, Pernilla Qvarfordt.

Abstract:

Although longer queries can produce better results for information seeking tasks, people tend to type short queries. We created an interface designed to encourage people to type longer queries, and evaluated it in two Mechanical Turk experiments. Results suggest that our interface manipulation may be effective for eliciting longer queries.

The researchers encouraged longer queries by varying a halo around the search box.

Not conclusive but enough evidence to ask the questions:

What does your search interface encourage?

What other ways could you encourage query construction?

How would you encourage graph queries?

I first saw this in a tweet by Gene Golovchinsky.

typeahead.js [Autocompletion Library]

Friday, February 22nd, 2013

typeahead.js

From the webpage:

Inspired by twitter.com’s autocomplete search functionality, typeahead.js is a fast and fully-featured autocomplete library.

Features

  • Displays suggestions to end-users as they type
  • Shows top suggestion as a hint (i.e. background text)
  • Works with hardcoded data as well as remote data
  • Rate-limits network requests to lighten the load
  • Allows for suggestions to be drawn from multiple datasets
  • Supports customized templates for suggestions
  • Plays nice with RTL languages and input method editors

Why not use X?

At the time Twitter was looking to implement a typeahead, there wasn’t a solution that allowed for prefetching data, searching that data on the client, and then falling back to the server. It’s optimized for quickly indexing and searching large datasets on the client. That allows for sites without datacenters on every continent to provide a consistent level of performance for all their users. It plays nicely with Right-To-Left (RTL) languages and Input Method Editors (IMEs). We also needed something instrumented for comprehensive analytics in order to optimize relevance through A/B testing. Although logging and analytics are not currently included, it’s something we may add in the future.
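The "prefetch, search on the client, then fall back to the server" pattern described above can be sketched in a few lines. This is Python for illustration, not typeahead.js or its API; `PrefixIndex`, `suggest`, and the `remote_lookup` callback are hypothetical names, and rate-limiting is left out.

```python
from bisect import bisect_left
from typing import Callable, Iterable, List


class PrefixIndex:
    """Prefetched terms kept in sorted order so prefix lookups stay cheap."""

    def __init__(self, terms: Iterable[str]) -> None:
        # Lowercase and de-duplicate once, at prefetch time.
        self._terms = sorted({t.lower() for t in terms})

    def search(self, prefix: str, limit: int = 5) -> List[str]:
        prefix = prefix.lower()
        # Binary search to the first term that could match the prefix.
        start = bisect_left(self._terms, prefix)
        matches: List[str] = []
        for term in self._terms[start:]:
            if not term.startswith(prefix) or len(matches) >= limit:
                break
            matches.append(term)
        return matches


def suggest(
    prefix: str,
    index: PrefixIndex,
    remote_lookup: Callable[[str], Iterable[str]],
    limit: int = 5,
) -> List[str]:
    """Answer from the local index; ask the server only when it falls short."""
    results = index.search(prefix, limit)
    if len(results) >= limit:
        return results
    # Assumption: remote_lookup stands in for a network call to the server.
    for term in remote_lookup(prefix):
        if term not in results:
            results.append(term)
        if len(results) >= limit:
            break
    return results
```

For a topic map interface, the prefetched terms could just as well be role or subject names already in the map, with the server consulted only for the long tail.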

A bit on the practical side for me, ;-) , but I can think of several ways that autocompletion could be useful with a topic map interface.

Not just the traditional completion of a search term or phrase but offering possible roles for subjects already in a map and other uses.

If experience with XML and OpenOffice is any guide, the easier authoring becomes (assuming the authoring outcome is useful), the greater the adoption of topic maps.

It really is that simple.

I first saw this at: typeahead.js : Fully-featured jQuery Autocomplete Library.

‘What’s in the NIDDK CDR?’…

Saturday, February 9th, 2013

‘What’s in the NIDDK CDR?’—public query tools for the NIDDK central data repository by Nauqin Pan, et al., (Database (2013) 2013 : bas058 doi: 10.1093/database/bas058)

Abstract:

The National Institute of Diabetes and Digestive Disease (NIDDK) Central Data Repository (CDR) is a web-enabled resource available to researchers and the general public. The CDR warehouses clinical data and study documentation from NIDDK funded research, including such landmark studies as The Diabetes Control and Complications Trial (DCCT, 1983–93) and the Epidemiology of Diabetes Interventions and Complications (EDIC, 1994–present) follow-up study which has been ongoing for more than 20 years. The CDR also houses data from over 7 million biospecimens representing 2 million subjects. To help users explore the vast amount of data stored in the NIDDK CDR, we developed a suite of search mechanisms called the public query tools (PQTs). Five individual tools are available to search data from multiple perspectives: study search, basic search, ontology search, variable summary and sample by condition. PQT enables users to search for information across studies. Users can search for data such as number of subjects, types of biospecimens and disease outcome variables without prior knowledge of the individual studies. This suite of tools will increase the use and maximize the value of the NIDDK data and biospecimen repositories as important resources for the research community.

Database URL: https://www.niddkrepository.org/niddk/home.do

I would like to tell you more about this research, since “[t]he National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) is part of the National Institutes of Health (NIH) and the U.S. Department of Health and Human Services” (that’s a direct quote) and so doesn’t claim copyright on its publications.

Unfortunately, the NIDDK published this paper in the Oxford journal Database, which does believe in restricting access to publicly funded research.

Do visit the search interface to see what you think about it.

Not quite the same as curated content but an improvement over raw string matching.

DuckDuckGo Architecture…

Sunday, February 3rd, 2013

DuckDuckGo Architecture – 1 Million Deep Searches A Day And Growing Interview with Gabriel Weinberg.

From the post:

This is an interview with Gabriel Weinberg, founder of Duck Duck Go and general all around startup guru, on what DDG’s architecture looks like in 2012.

Innovative search engine upstart DuckDuckGo had 30 million searches in February 2012 and averages over 1 million searches a day. It’s being positioned by super investor Fred Wilson as a clean, private, impartial and fast search engine. After talking with Gabriel I like what Fred Wilson said earlier, it seems closer to the heart of the matter: We invested in DuckDuckGo for the Reddit, Hacker News anarchists.
                  
Choosing DuckDuckGo can be thought of as not just a technical choice, but a vote for revolution. In an age when knowing your essence is not about love or friendship, but about more effectively selling you to advertisers, DDG is positioning themselves as the do not track alternative, keepers of the privacy flame. You will still be monetized of course, but in a more civilized and anonymous way.

Pushing privacy is a good way to carve out a competitive niche against Google et al, as by definition they can never compete on privacy. I get that. But what I found most compelling is DDG’s strong vision of a crowdsourced network of plugins giving broader search coverage by tying an army of vertical data suppliers into their search framework. For example, there’s a specialized Lego plugin for searching against a complete Lego database. Use the name of a spice in your search query, for example, and DDG will recognize it and may trigger a deeper search against a highly tuned recipe database. Many different plugins can be triggered on each search and it’s all handled in real-time.

Can’t searching the Open Web provide all this data? Not really. This is structured data with semantics. Not an HTML page. You need a search engine that’s capable of categorizing, mapping, merging, filtering, prioritizing, searching, formatting, and disambiguating richer data sets and you can’t do that with a keyword search. You need the kind of smarts DDG has built into their search engine. One problem of course is now that data has become valuable many grown ups don’t want to share anymore.

Being ad supported puts DDG in a tricky position. Targeted ads are more lucrative, but ironically DDG’s do not track policies means they can’t gather targeting data. Yet that’s also a selling point for those interested in privacy. But as search is famously intent driven, DDG’s technology of categorizing queries and matching them against data sources is already a form of high value targeting.

It will be fascinating to see how these forces play out. But for now let’s see how DuckDuckGo implements their search engine magic…

Some topic map centric points from the post:

Dream is to appeal to more niche audiences to better serve people who care about a particular topic. For example: lego parts. There’s a database of Lego parts, for example. Pictures of parts and part numbers can be automatically displayed from a search.

  • Some people just use different words for things. Goal is not to rewrite the query, but give suggestions on how to do things better.
  • “phone reviews” for example, will replace phone with telephone. This happens through an NLP component that tries to figure out what phone you meant and if there are any synonyms that should be used in the query.
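The "suggest, don't rewrite" idea in the points above might look something like the following. It is only a sketch: the synonym table is made up for illustration (DuckDuckGo's NLP component is not described in enough detail to reproduce), and `suggest_queries` is a hypothetical name.

```python
from typing import Dict, List

# Hypothetical synonym table. A real system would get these from an NLP
# component or a curated vocabulary, not a hard-coded dictionary.
SYNONYMS: Dict[str, List[str]] = {
    "phone": ["telephone"],
    "tv": ["television"],
}


def suggest_queries(query: str, synonyms: Dict[str, List[str]] = SYNONYMS) -> List[str]:
    """Return alternative queries, leaving the user's original query untouched."""
    words = query.lower().split()
    suggestions: List[str] = []
    for i, word in enumerate(words):
        for alternative in synonyms.get(word, []):
            # Swap in one synonym at a time so each suggestion stays readable.
            candidate = " ".join(words[:i] + [alternative] + words[i + 1:])
            if candidate not in suggestions:
                suggestions.append(candidate)
    return suggestions
```

With the table above, `suggest_queries("phone reviews")` offers "telephone reviews" as an alternative while the original query still runs as typed.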

Those are the ones that caught my eye, there are no doubt others.

Not to mention a long list of DuckDuckGo references at the end of the post.

What place(s) would you suggest to DuckDuckGo where topic maps would make a compelling difference?

izik Debuts as #1 Free Reference App on iTunes

Wednesday, January 9th, 2013

izik Debuts as #1 Free Reference App on iTunes

From the post:

We launched izik, our search app for tablets, last Friday and are amazed at the responses we’ve received! Thanks to our users, on day one izik was the #1 free reference app on iTunes and #49 free app overall. Yesterday we were mentioned twice in the New York Times, here and here (also in the B1 story in print). We are delighted that there is such a strong desire to see something fresh and new in search, and that our vision with izik is so well received.

The twitterverse has been especially active in spreading the word about izik. We’ve seen a lot of comments about the beautiful design and interface, the useful categories, and most importantly the high quality results that make izik a truly viable choice for searching on tablets.

Just last Monday I remarked: “From the canned video I get the sense that the interface is going to make search different.” (izik: Take Search for a Joy Ride on Your Tablet)

Users with tablets have supplied the input I asked for in that post and it is overwhelmingly in favor of izik.

To paraphrase Ray Charles in the Blues Brothers:

“E-excuse me, uh, I don’t think there’s anything wrong with the action on [search applications].”

There is plenty of “action” left in the search space.

izik is fresh evidence for that proposition.

izik: Take Search for a Joy Ride on Your Tablet

Monday, January 7th, 2013

izik: Take Search for a Joy Ride on Your Tablet

From the post:

We are giddy to announce the launch of izik, our new search app built specifically with the iPad and Android tablets in mind. With izik, every search on your tablet is transformed into a beautiful, glossy page that utilizes rich images, categories, and, of course, gesture controls. Check it: so much content, so many ways to explore.

Tablets are increasingly getting integrated into our lives, so we wracked our noggins to figure out how we could use our search technology to optimally serve tablet users. Not surprisingly, our research revealed that tablets take on a very different role in our lives than laptops and desktops. Laptops are for work; tablets are for fun. Laptops are task-oriented (“what’s the capital of Bulgaria?”); tablets are more exploratory (“what’s Jennifer Lopez doing these days?”).

So, our goal with izik was to move the task-oriented search product we all use on our computers (aka 10 blue links) and turn it into a more fun, tablet-appropriate product. That means an image-rich layout with an appearance and experience very different than what we’re used to seeing on a laptop.

I remain without a tablet so am dependent upon your opinions for how izik works for real users.

From the canned video I get the sense that the interface is going to make search different.

Is the scroll gesture more natural than using a mouse? Are some movements easier using gestures?

What other features of a tablet interface can change/improve search experiences?

Go3R [Searching for Alternatives to Animal Testing]

Monday, December 17th, 2012

Go3R

A semantic search engine for finding alternatives to animal testing.

I mention it as an example of a search interface that assists the user in searching.

The help documentation is a bit sparse if you are looking for an opportunity to contribute to such a project.

I did locate some additional information on the project, all usefully with the same title to make locating it “easy.” ;-)

[Introduction] Knowledge-based semantic search engine for alternative methods to animal experiments

[PubMed - entry] Go3R – semantic Internet search engine for alternative methods to animal testing by Sauer UG, Wächter T, Grune B, Doms A, Alvers MR, Spielmann H, Schroeder M. (ALTEX. 2009;26(1):17-31).

Abstract:

Consideration and incorporation of all available scientific information is an important part of the planning of any scientific project. As regards research with sentient animals, EU Directive 86/609/EEC for the protection of laboratory animals requires scientists to consider whether any planned animal experiment can be substituted by other scientifically satisfactory methods not entailing the use of animals or entailing less animals or less animal suffering, before performing the experiment. Thus, collection of relevant information is indispensable in order to meet this legal obligation. However, no standard procedures or services exist to provide convenient access to the information required to reliably determine whether it is possible to replace, reduce or refine a planned animal experiment in accordance with the 3Rs principle. The search engine Go3R, which is available free of charge under http://Go3R.org, runs up to become such a standard service. Go3R is the world-wide first search engine on alternative methods building on new semantic technologies that use an expert-knowledge based ontology to identify relevant documents. Due to Go3R’s concept and design, the search engine can be used without lengthy instructions. It enables all those involved in the planning, authorisation and performance of animal experiments to determine the availability of non-animal methodologies in a fast, comprehensive and transparent manner. Thereby, Go3R strives to significantly contribute to the avoidance and replacement of animal experiments.

[ALTEX entry - full text available] Go3R – Semantic Internet Search Engine for Alternative Methods to Animal Testing

Complexification: Is ElasticSearch Making a Case for a Google Search Solution?

Sunday, November 25th, 2012

Complexification: Is ElasticSearch Making a Case for a Google Search Solution? by Stephen Arnold.

From the post:

I don’t have any dealings with Google, the GOOG, or Googzilla (a word I coined in the years before the installation of the predator skeleton on the wizard zone campus). In the briefings I once endured about the GSA (Google speak for the Google Search Appliance), I recall three business principles imparted to me; to wit:

  1. Search is far too complicated. The Google business proposition was and is that the GSA and other Googley things are easy to install, maintain, use, and love.
  2. Information technology people in organizations can often be like a stuck brake on a sports car. The institutionalized approach to enterprise software drags down the performance of the organization information technology is supposed to serve.
  3. The enterprise search vendors are behind the curve.

Now the assertions from the 2004 salad days of Google are only partially correct today. As everyone with a colleague under 25 years of age knows, Google is the go to solution for information. A number of large companies have embraced Google’s all-knowing, paternalistic approach to digital information. However, others—many others, in fact—have not.

I won’t repeat Stephen’s barbs at ElasticSearch but his point applies to search interfaces and approaches in general.

Is your search application driving business towards simpler solutions? (If the simpler solution isn’t yours, isn’t that the wrong direction?)

eGIFT: Mining Gene Information from the Literature

Thursday, November 22nd, 2012

eGIFT: Mining Gene Information from the Literature by Catalina O Tudor, Carl J Schmidt and K Vijay-Shanker.

Abstract:

Background

With the biomedical literature continually expanding, searching PubMed for information about specific genes becomes increasingly difficult. Not only can thousands of results be returned, but gene name ambiguity leads to many irrelevant hits. As a result, it is difficult for life scientists and gene curators to rapidly get an overall picture about a specific gene from documents that mention its names and synonyms.

Results

In this paper, we present eGIFT (http://biotm.cis.udel.edu/eGIFT), a web-based tool that associates informative terms, called iTerms, and sentences containing them, with genes. To associate iTerms with a gene, eGIFT ranks iTerms about the gene, based on a score which compares the frequency of occurrence of a term in the gene’s literature to its frequency of occurrence in documents about genes in general. To retrieve a gene’s documents (Medline abstracts), eGIFT considers all gene names, aliases, and synonyms. Since many of the gene names can be ambiguous, eGIFT applies a disambiguation step to remove matches that do not correspond to this gene. Another additional filtering process is applied to retain those abstracts that focus on the gene rather than mention it in passing. eGIFT’s information for a gene is pre-computed and users of eGIFT can search for genes by using a name or an EntrezGene identifier. iTerms are grouped into different categories to facilitate a quick inspection. eGIFT also links an iTerm to sentences mentioning the term to allow users to see the relation between the iTerm and the gene. We evaluated the precision and recall of eGIFT’s iTerms for 40 genes; between 88% and 94% of the iTerms were marked as salient by our evaluators, and 94% of the UniProtKB keywords for these genes were also identified by eGIFT as iTerms.

Conclusions

Our evaluations suggest that iTerms capture highly-relevant aspects of genes. Furthermore, by showing sentences containing these terms, eGIFT can provide a quick description of a specific gene. eGIFT helps not only life scientists survey results of high-throughput experiments, but also annotators to find articles describing gene aspects and functions.

Website: http://biotm.cis.udel.edu/eGIFT
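The abstract says iTerms are ranked by comparing how often a term occurs in a gene's literature with how often it occurs in documents about genes in general, but it does not give the formula. As a rough illustration of that kind of comparison (not eGIFT's actual score), here is a smoothed document-frequency ratio; `rank_terms`, `document_frequency`, and the `smoothing` constant are my own assumptions.

```python
from typing import List, Set, Tuple


def document_frequency(term: str, docs: List[Set[str]]) -> float:
    """Fraction of documents (each a set of terms) that contain the term."""
    if not docs:
        return 0.0
    return sum(1 for doc in docs if term in doc) / len(docs)


def rank_terms(
    gene_docs: List[Set[str]],
    background_docs: List[Set[str]],
    smoothing: float = 0.01,
) -> List[Tuple[str, float]]:
    """Rank terms by gene-literature frequency relative to background frequency.

    Assumption: a plain ratio with additive smoothing, chosen for clarity.
    The paper's own scoring function may differ.
    """
    vocabulary: Set[str] = set().union(*gene_docs) if gene_docs else set()
    scores = {
        term: document_frequency(term, gene_docs)
        / (document_frequency(term, background_docs) + smoothing)
        for term in vocabulary
    }
    # Highest score first; ties broken alphabetically for stable output.
    return sorted(scores.items(), key=lambda item: (-item[1], item[0]))
```

Terms that are common in one gene's abstracts but rare across genes in general float to the top, which is the behavior the abstract describes.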

Another lesson for topic map authoring interfaces: Offer domain specific search capabilities.

Using a ****** search appliance is little better than a poke with a sharp stick in most domains. The user is left to their own devices to sort out ambiguities, discover synonyms, again and again.

Your search interface may report > 900,000 “hits,” but anything beyond the first 20 or so are wasted.

(If you get sick, get something that comes up in the first 20 “hits” in PubMed. Where most researchers stop.)

HCIR 2012 papers published!

Thursday, November 8th, 2012

HCIR 2012 papers published! by Gene Golovchinsky.

Gene calls attention to four papers from the HCIR Symposium:

Great looking set of papers!

A Model of Consumer Search Behaviour

Tuesday, September 18th, 2012

A Model of Consumer Search Behaviour by Tony Russell-Rose.

From the post:

A couple of weeks ago I posted the slides to my talk at EuroHCIR on A Model of Consumer Search Behaviour. Finally, as promised, here is the associated paper, which is co-authored with Stephann Makri (and also available as a pdf in the proceedings). I hope it addresses the questions that the slide deck provoked, and provides further food for thought :)

ABSTRACT

In order to design better search experiences, we need to understand the complexities of human information-seeking behaviour. In previous work [13], we proposed a model of information behavior based on an analysis of the information needs of knowledge workers within an enterprise search context. In this paper, we extend this work to the site search context, examining the needs and behaviours of users of consumer-oriented websites and search applications.

We found that site search users presented significantly different information needs to those of enterprise search, implying some key differences in the information behaviours required to satisfy those needs. In particular, the site search users focused more on simple “lookup” activities, contrasting with the more complex, problem-solving behaviours associated with enterprise search. We also found repeating patterns or ‘chains’ of search behaviour in the site search context, but in contrast to the previous study these were shorter and less complex. These patterns can be used as a framework for understanding information seeking behaviour that can be adopted by other researchers who want to take a ‘needs first’ approach to understanding information behaviour.

Take the time to read the paper.

How would you test the results?

Placeholder: Probably beyond the bounds of the topic maps course but a guest lecture on designing UI tests could be very useful for library students. They will be selecting interfaces to be used by patrons and knowing how to test candidate interfaces could be valuable.

Blame Google? Different Strategy: Let’s Blame Users! (Not!)

Saturday, September 15th, 2012

Let me quote from A Simple Guide To Understanding The Searcher Experience by Shari Thurow to start this post:

Web searchers have a responsibility to communicate what they want to find. As a website usability professional, I have the opportunity to observe Web searchers in their natural environments. What I find quite interesting is the “Blame Google” mentality.

I remember a question posed to me during World IA Day this past year. An attendee said that Google constantly gets search results wrong. He used a celebrity’s name as an example.

“I wanted to go to this person’s official website,” he said, “but I never got it in the first page of search results. According to you, it was an informational query. I wanted information about this celebrity.”

I paused. “Well,” I said, “why are you blaming Google when it is clear that you did not communicate what you really wanted?”

“What do you mean?” he said, surprised.

“You just said that you wanted information about this celebrity,” I explained. “You can get that information from a variety of websites. But you also said that you wanted to go to X’s official website. Your intent was clearly navigational. Why didn’t you type in [celebrity name] official website? Then you might have seen your desired website at the top of search results.”

The stunned silence at my response was almost deafening. I broke that silence.

“Don’t blame Google or Yahoo or Bing for your insufficient query formulation,” I said to the audience. “Look in the mirror. Maybe the reason for the poor searcher experience is the person in the mirror…not the search engine.”

People need to learn how to search. Search experts need to teach people how to search. Enough said.

What a novel concept! If the search engine/software doesn’t work, must be the user’s fault!

I can save you a trip down the hall to the marketing department. They are going to tell you that is an insane sales strategy. Satisfying to the geeks in your life but otherwise untenable, from a business perspective.

Remember the stats on using Library of Congress subject headings I posted under Subject Headings and the Semantic Web:

Overall percentages of correct meanings for subject headings in the original order of subdivisions were as follows: children, 32%, adults, 40%, reference 53%, and technical services librarians, 56%.

?

That is with decades of teaching people to search both manual and automated systems using Library of Congress classification.

Test Question: I have a product to sell. 60% of all my buyers can’t find it with a search engine. Do I:

  • Teach all users everywhere better search techniques?
  • Develop better search engines/interfaces to compensate for potential buyers’ poor searching?

I suspect the “stunned silence” was an audience with greater marketing skills than the speaker.

Broccoli: Semantic Full-Text Search at your Fingertips

Friday, July 13th, 2012

Broccoli: Semantic Full-Text Search at your Fingertips by Hannah Bast, Florian Bäurle, Björn Buchhold, and Elmar Haussmann.

Abstract:

We present Broccoli, a fast and easy-to-use search engine for what we call semantic full-text search. Semantic full-text search combines the capabilities of standard full-text search and ontology search. The search operates on four kinds of objects: ordinary words (e.g. edible), classes (e.g. plants), instances (e.g. Broccoli), and relations (e.g. occurs-with or native-to). Queries are trees, where nodes are arbitrary bags of these objects, and arcs are relations. The user interface guides the user in incrementally constructing such trees by instant (search-as-you-type) suggestions of words, classes, instances, or relations that lead to good hits. Both standard full-text search and pure ontology search are included as special cases. In this paper, we describe the query language of Broccoli, a new kind of index that enables fast processing of queries from that language as well as fast query suggestion, the natural language processing required, and the user interface. We evaluated query times and result quality on the full version of the English Wikipedia (32 GB XML dump) combined with the YAGO ontology (26 million facts). We have implemented a fully-functional prototype based on our ideas, see this http URL

It’s good to see CS projects work so hard to find unambiguous names that won’t be confused with far more common uses of the same names. ;-)

For all that, on quick review it does look like a clever, if annoyingly named, project.

Hmmm, doesn’t like the “-” (hyphen) character. “graph-theoretical tree” returns 0 results, “graph theoretical tree” returns 1 (the expected one).
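One way an interface could soften the hyphen problem is to try a de-hyphenated variant of the query alongside the original. A minimal sketch follows; this is my own guess at a workaround, not anything Broccoli does, and `query_variants` is a hypothetical name.

```python
from typing import List


def query_variants(query: str) -> List[str]:
    """Return the query as typed, plus a variant with hyphens turned into spaces."""
    variants = [query]
    # Replace hyphens with spaces, then collapse any runs of whitespace.
    dehyphenated = " ".join(query.replace("-", " ").split())
    if dehyphenated != query:
        variants.append(dehyphenated)
    return variants
```

Running each variant and merging the results would let "graph-theoretical tree" find what "graph theoretical tree" finds, at the cost of an extra query.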

Definitely worth a close read.

One puzzle though. There are a number of projects that use Wikipedia data dumps. The problem is most of the documents I am interested in searching aren’t in Wikipedia data dumps. Like the Enron emails.

Techniques that work well with clean data may work less well with documents composed of the vagaries of human communication. Or attempts at communication.

Designing Search (part 5): Results pages

Wednesday, July 4th, 2012

Designing Search (part 5): Results pages by Tony Russell-Rose.

From the post:

In the previous post, we looked at the ways in which a response to an information need can be articulated, focusing on the various forms that individual search results can take. Each separate result represents a match for our query, and as such, has the potential to fulfil our information needs. But as we saw earlier, information seeking is a dynamic, iterative activity, for which there is often no single right answer.

A more informed approach therefore is to consider search results not as competing alternatives, but as an aggregate response to an information need. In this context, the value lies not so much with the individual results but on the properties and possibilities that emerge when we consider them in their collective form. In this section we examine the most universal form of aggregation: the search results page.

As usual, Tony illustrates each of his principles with examples drawn from actual webpages. Makes a very nice checklist to use when constructing a results page. Concludes with references and links to all the prior posts in this series.

Unless you are a UI expert, defaulting to following Tony’s advice is not a bad plan. May not be anyway.

Become a Google Power Searcher

Wednesday, June 27th, 2012

Become a Google Power Searcher by Terry Ednacot.

From the post:

You may already be familiar with some shortcuts for Google Search, like using the search box as a calculator or finding local movie showtimes by typing [movies] and your zip code. But there are many more tips, tricks and tactics you can use to find exactly what you’re looking for, when you most need it.

Today, we’ve opened registration for Power Searching with Google, a free, online, community-based course showcasing these techniques and how you can use them to solve everyday problems. Our course is aimed at empowering you to find what you need faster, no matter how you currently use search. For example, did you know that you can search for and read pages written in languages you’ve never even studied? Identify the location of a picture your friend took during his vacation a few months ago? How about finally identifying that green-covered book about gardening that you’ve been trying to track down for years? You can learn all this and more over six 50-minute classes.

Lessons will be released daily starting on July 10, 2012, and you can take them according to your own schedule during a two-week window, alongside a worldwide community. The lessons include interactive activities to practice new skills, and many opportunities to connect with others using Google tools such as Google Groups, Moderator and Google+, including Hangouts on Air, where world-renowned search experts will answer your questions on how search works. Googlers will also be on hand during the course period to help and answer your questions in case you get stuck.

I know, I know, you are way beyond using Google but you may know some people who are not.

Try to suggest this course in a positive way, i.e., non-sneering sort of way.

Will be a new experience.

You may want to “audit” the course.

Would be unfortunate for someone to ask you a Google search question you can’t answer.

;-)

Google search parameters in 2012

Monday, June 25th, 2012

Google search parameters in 2012

From the post:

Knowing the parameters Google uses in its search is not only important for SEO geeks. It allows you to use shortcuts and play with the Google filters. The parameters also reveal more juicy things: Is it safe to share your Google search URLs or screenshots of your Google results? This post argues that it is important to be aware of the complicated nature of the Google URL. As we will see later, posting your own Google URL can reveal personal information about you that you might not feel too comfortable sharing. So read on to learn more about the Google search parameters used in 2012.

Why do I say “in 2012”? Well, the Google URL changed over time and more parameters were added to keep pace with the increasing complexity of the search product, the Google interface and the integration of verticals. Before looking at the parameter table below, though, I encourage you to quickly perform the following 2 things:

  1. Go directly to Google and search for your name. Look at the URL.
  2. Go directly to DuckDuckGo and perform the same search. Look at the URL.

This little exercise serves well to demonstrate just how simple and how complicated URLs used by search engines can look like. These two cases are at the opposing ends: While DuckDuckGo has only one search parameter, your query, and is therefore quite readable, Google uses a cryptic construct that only IT professionals can try to decipher. What I find interesting is that on my Smartphone, though, the Google search URL is much simpler than on the desktop.

This blog post is primarily aimed at Google’s web search. I will not look at their other verticals such as scholar or images. But because image search is so useful, I encourage you to look at the image section of the Unofficial Google Advanced Search guide

The tables of search parameters are a nice resource.
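
To see the contrast between the two URL styles for yourself, a minimal Python sketch along these lines splits a search URL into its parameters. The two URLs are made up for illustration, and `hl`, `num` and `start` are only a small, assumed sample of what a real Google URL carries; the tables in the post are the place to look for the full set and their meanings.

```python
from urllib.parse import urlsplit, parse_qs


def search_params(url: str) -> dict:
    """Return the query-string parameters of a URL as {name: [values]}."""
    return parse_qs(urlsplit(url).query, keep_blank_values=True)


# Hypothetical example URLs, invented for illustration only.
ddg_url = "https://duckduckgo.com/?q=jane+doe"
google_url = "https://www.google.com/search?q=jane+doe&hl=en&num=10&start=20"

ddg = search_params(ddg_url)
google = search_params(google_url)

# Parameters present in the second URL but not in the first.
extra = sorted(set(google) - set(ddg))

print("DuckDuckGo parameters:", sorted(ddg))
print("Google parameters:    ", sorted(google))
print("Extra in Google URL:  ", extra)
```

Pasting your own search URL in place of `google_url` is a quick way to check what you would be sharing before you post it.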

Suggestions of similar information for other search engines?

What’s Your Default Search Engine?

Sunday, June 24th, 2012

Bing’s Evolving Local Search by Matthew Hurst.

From the post:

Recently, there have been a number of announcements regarding the redesign of Bing’s main search experience. The key difference is the use of three parallel zones in the SERP. Along with the traditional page results area, there are two new results columns: the task pane, which highlights factual data and the social pane which currently highlights social information from individuals (I distinguish social from ‘people’ as entities – for example a restaurant – can have a social presence even though they are only vaguely regarded as people).

I don’t get out much but I can appreciate the utility of the aggregate results for local views.

Matthew writes:

  1. When we provide flat structured data (as Bing did in the past), while we continued to strive for high quality data, there is no burning light focused on any aspect of the data. However, when we require to join the data to the web (local results are ‘hanging off’ the associated web sites), the quality of the URL associated with the entity record becomes a critical issue.
  2. The relationship between the web graph and the entity graph is subtle and complex. Our legacy system made do with the notion of a URL associated with an entity. As we dug deeper into the problem we discovered a very rich set of relationships between entities and web sites. Some entities are members of chains, and the relationships between their chain home page and the entity is quite different from the relationship between a singleton business and its home page. This also meant that we wanted to treat the results differently. See below for the results for {starbucks in new york}
  3. The structure of entities in the real world is subtle and complex. Chains, franchises, containment (shop in mall, restaurant in casino, hotel in airport), proximity – all these qualities of how the world works scream out for rich modeling if the user is to be best supported in navigating her surroundings.

Truth be told, the structure of entities in the “real world” and their representatives (somewhere other than the “real” world), not to mention their relationships to each other, are all subtle and complex.
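
As a rough illustration of the kind of modeling Matthew describes, here is a minimal Python sketch of entities with chain and containment relationships. The class, field names and sample data are all hypothetical, invented for this example; nothing here reflects how Bing actually represents entities.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entity:
    """A hypothetical local-search entity (business, venue, chain)."""
    name: str
    url: Optional[str] = None               # web site tied to the entity, if any
    chain: Optional["Entity"] = None        # parent chain or franchise
    located_in: Optional["Entity"] = None   # containment: shop in mall, etc.


def home_page(entity: Entity) -> Optional[str]:
    """Use the entity's own URL, else fall back to its chain's URL."""
    if entity.url:
        return entity.url
    if entity.chain is not None:
        return home_page(entity.chain)
    return None


def containers(entity: Entity) -> List[str]:
    """Names of the places that contain the entity, innermost first."""
    names = []
    current = entity.located_in
    while current is not None:
        names.append(current.name)
        current = current.located_in
    return names


# Made-up sample data.
coffee_chain = Entity("Example Coffee", url="https://coffee.example/")
airport = Entity("Example Airport")
terminal = Entity("Terminal 1", located_in=airport)
shop = Entity("Example Coffee, Terminal 1", chain=coffee_chain, located_in=terminal)

print(home_page(shop))
print(containers(shop))
```

Even this toy version shows why a single "URL per entity" field falls short: the shop has no site of its own, and where it sits matters as much as what it is.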

That is part of what makes searching, discovery, mapping such exciting areas for exploration. There is always something new just around the next corner.

Social Annotations in Web Search

Wednesday, June 13th, 2012

Social Annotations in Web Search by Aditi Muralidharan, Zoltan Gyongyi, and Ed H. Chi. (CHI 2012, May 5–10, 2012, Austin, Texas, USA)

Abstract:

We ask how to best present social annotations on search results, and attempt to find an answer through mixed-method eye-tracking and interview experiments. Current practice is anchored on the assumption that faces and names draw attention; the same presentation format is used independently of the social connection strength and the search query topic. The key findings of our experiments indicate room for improvement. First, only certain social contacts are useful sources of information, depending on the search topic. Second, faces lose their well-documented power to draw attention when rendered small as part of a social search result annotation. Third, and perhaps most surprisingly, social annotations go largely unnoticed by users in general due to selective, structured visual parsing behaviors specific to search result pages. We conclude by recommending improvements to the design and content of social annotations to make them more noticeable and useful.

The entire paper is worth your attention but the first paragraph of the conclusion gives much food for thought:

For content, three things are clear: not all friends are equal, not all topics benefit from the inclusion of social annotation, and users prefer different types of information from different people. For presentation, it seems that learned result-reading habits may cause blindness to social annotations. The obvious implication is that we need to adapt the content and presentation of social annotations to the specialized environment of web search.

The complexity and subtlety of semantics on the human side keep bumping into the search/annotate-with-a-hammer approach on the computer side.

Or as the authors say: “…users prefer different types of information from different people.”

Search engineers/designers who use their preferences/intuitions as the designs to push out to the larger user universe are always going to fall short.

Because all users have their own preferences and intuitions about searching and parsing search results. What is so surprising about that?

I have had discussions with programmers who would say: “But it will be better for users to do X (as opposed to Y) in the interface.”

Know what? Users are the only measure of the fitness of an interface or success of a search result.

A “pull” model (user preferences) based search engine will gut all existing (“push” model, engineer/programmer preference) search engines.


PS: You won’t discover the range of user preferences with study groups of 11 participants. Ask one of the national survey companies and have them select several thousand participants. Then refine which preferences get used the most. Won’t happen overnight but every percentage gain will be one the existing search engines won’t regain.

PPS: Speaking of interfaces, I would pay for a web browser that put webpages back under my control (the early WWW model).

Enabling me to defeat those awful “page is loading” ads from major IT vendors who should know better. As well as strip other crap out. It is a data stream that is being parsed. I should be able to clean it up before viewing. That could be a real “hit” and make page load times faster.

I first saw this article in a list of links from Greg Linden.

A Taxonomy of Site Search

Wednesday, June 6th, 2012

A Taxonomy of Site Search by Tony Russell-Rose.

From the post:

Here are the slides from the talk I gave at Enterprise Search Europe last week on A Taxonomy of Site Search. This talk extends and validates the taxonomy of information search strategies (aka ‘search modes’) presented at last year’s event, and reviews some of their implications for design. But this year we looked specifically at site search rather than enterprise search, and explored the key differences in user needs and behaviours between the two domains. [see Tony's post for the slides]

There is a lot to be learned (and put to use) from investigations of search behavior.

Designing Search (part 4): Displaying results

Thursday, May 17th, 2012

Designing Search (part 4): Displaying results

Tony Russell-Rose writes:

In an earlier post we reviewed the various ways in which an information need may be articulated, focusing on its expression via some form of query. In this post we consider ways in which the response can be articulated, focusing on its expression as a set of search results. Together, these two elements lie at the heart of the search experience, defining and shaping much of the information seeking dialogue. We begin therefore by examining the most universal of elements within that response: the search result.

As usual, Tony does a great job of illustrating your choices and trade-offs in presentation of search results. Highly recommended.

I am curious since Tony refers to it as an “information seeking dialogue,” has anyone mapped reference interview approaches to search interfaces? I suspect that is just my ignorance of the literature on that subject so would appreciate any pointers you can throw my way.

I would update Tony’s bibliography:

Marti Hearst (2009) Search User Interfaces. Cambridge University Press

Online as full text: http://searchuserinterfaces.com/

Designing User Experiences for Imperfect Data

Wednesday, March 28th, 2012

Designing User Experiences for Imperfect Data by Matthew Hurst.

Matthew writes:

Any system that uses some sort of inference to generate user value is at the mercy of the quality of the input data and the accuracy of the inference mechanism. As neither of these can be guaranteed to be perfect, users of the system will inevitably come across incorrect results.

In web search we see this all the time with irrelevant pages being surfaced. In the context of track // microsoft, I see this in the form of either articles that are incorrectly added to the wrong cluster, or articles that are incorrectly assigned to no cluster, becoming orphans.

It is important, therefore, to take these imperfections into account when building the interface. This is not necessarily a matter of pretending that they don’t exist, or tricking the user. Rather it is a problem of eliciting an appropriate reaction to error. The average user is not conversant in error margins and the like, and thus tends to over-weight errors leading to the perception of poorer quality in the good stuff.

I am not real sure how Matthew finds imperfect data but I guess I will just have to take his word for it. ;-)

Seriously, I think he is spot on in observing that expecting users to hunt-n-peck through search results is wearing a bit thin. That is going to be particularly so when better search systems make the hidden cost of hunt-n-peck visible.

Do take the time to visit his track // microsoft site.

Now imagine your own subject specific and dynamic website. Or even search engine. Could be that search engines for “everything” are the modern day dinosaurs. Big, clumsy, fairly crude.

Designing Search (part 3): Keeping on track

Tuesday, March 20th, 2012

Designing Search (part 3): Keeping on track by Tony Russell-Rose

From the post:

In the previous post we looked at techniques to help us create and articulate more effective queries. From auto-complete for lookup tasks to auto-suggest for exploratory search, these simple techniques can often make the difference between success and failure.

But occasionally things do go wrong. Sometimes our information journey is more complex than we’d anticipated, and we find ourselves straying off the ideal course. Worse still, in our determination to pursue our original goal, we may overlook other, more productive directions, leaving us endlessly finessing a flawed strategy. Sometimes we are in too deep to turn around and start again.

(graphic omitted)

Conversely, there are times when we may consciously decide to take a detour and explore the path less trodden. As we saw earlier, what we find along the way can change what we seek. Sometimes we find the most valuable discoveries in the most unlikely places.

However, there’s a fine line between these two outcomes: one person’s journey of serendipitous discovery can be another’s descent into confusion and disorientation. And there’s the challenge: how can we support the former, while unobtrusively repairing the latter? In this post, we’ll look at four techniques that help us keep to the right path on our information journey.

Whether you are writing a search interface or simply want to know more about what factors to consider in evaluating a search interface, this series by Tony Russell-Rose is well worth your time.

If you are writing a topic map, you already have as a goal the collection of information for some purpose. It would be sad if the information you collect isn’t findable due to poor interface design.

Bad vs Good Search Experience

Monday, March 5th, 2012

Bad vs Good Search Experience by Emir Dizdarevic.

From the post:

The Problem

This article will show how a bad search solution can be improved. We will demonstrate how to build an enterprise search solution relatively easy using Apache Lucene/SOLR.

We took a local ad site as an example of a bad search experience.

We crawled the ad site with Apache Nutch, using a couple of home grown plugins to fetch only the data we want and not the whole site. Stay tuned for a separate article on this topic.

‘BAD’ search is based on real search results from the ad site i.e. how the website search currently works. ‘GOOD’ search is based on same data but indexed with Apache Lucene/Solr (inverted index).

BAD Search: We assume that it’s based on exact match criteria or something similar to ‘%like%’ database statement. To simulate this behavior we used content field that is tokenized by whitespace, lowercased and used phrase queries every time. This is the closest we could get to existing ad site search solution, but even this bad it was performing better.

An excellent post in part because of the detailed example but also to show that improving search results is an iterative process.
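
To make the BAD/GOOD contrast concrete, here is a small Python sketch of my own, not Emir’s code and not Lucene/Solr. It assumes the same whitespace tokenizing and lowercasing the post describes, then compares an exact phrase match against a toy inverted index that ranks documents by how many query terms they contain. The sample ads and function names are made up.

```python
from collections import defaultdict


def tokenize(text):
    """Lowercase and split on whitespace, as in the post's 'BAD' setup."""
    return text.lower().split()


# Made-up sample ads.
docs = {
    1: "Used mountain bike for sale",
    2: "Bike rack that fits mountain roads",
    3: "Selling a used car",
}


def phrase_search(query, docs):
    """'BAD' search: the whole query must appear as consecutive tokens."""
    q = tokenize(query)
    if not q:
        return []
    hits = []
    for doc_id, text in docs.items():
        tokens = tokenize(text)
        windows = range(len(tokens) - len(q) + 1)
        if any(tokens[i:i + len(q)] == q for i in windows):
            hits.append(doc_id)
    return hits


def build_index(docs):
    """Toy inverted index: token -> set of ids of documents containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in tokenize(text):
            index[token].add(doc_id)
    return index


def term_search(query, index):
    """Rank documents by the number of distinct query terms they contain."""
    scores = defaultdict(int)
    for term in set(tokenize(query)):
        for doc_id in index.get(term, ()):
            scores[doc_id] += 1
    return sorted(scores, key=lambda d: (-scores[d], d))


index = build_index(docs)
query = "mountain bike sale"
bad_hits = phrase_search(query, docs)
good_hits = term_search(query, index)
print("phrase match:", bad_hits)
print("term match:  ", good_hits)
```

The phrase match finds nothing because no ad contains the three words in exactly that order, while the term match still surfaces the two bike ads. A real Solr setup adds analyzers, stemming and proper relevance scoring on top of this basic idea.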

Enjoy!

YapMap: Breck’s Fun New Project to Improve Search

Saturday, January 28th, 2012

YapMap: Breck’s Fun New Project to Improve Search

From the post:

What I like about the user interface is that threads can be browsed easily–I have spent hours on remote controlled airplane forums reading every post because it is quite difficult to find relevant information within a thread. The color coding and summary views are quite helpful in eliminating irrelevant posts.

My first job is to get query spell checking rolling. Next is search optimized for the challenges of thread based postings. The fact that relevance of a post to a query is a function of a thread is very interesting. I will hopefully get to do some discourse analysis as well.

I will continue to run Alias-i/LingPipe. The YapMap involvement is just too fun a project to pass up given that I get to build a fancy search and discovery tool.

What do you think about the thread browsing capabilities?

I am sympathetic to the “reading every post” problem but I am not sure threading helps, at least not completely.

Doesn’t help with posters like myself who may make “off-thread” comments that may be the one you are looking for.

Comments about the interface?

Designing Search (part 1): Entering the query

Thursday, January 19th, 2012

Designing Search (part 1): Entering the query by Tony Russell-Rose.

From the post:

In an earlier post we reviewed models of information seeking, from an early focus on documents and queries through to a more nuanced understanding of search as an information journey driven by dynamic information needs. While each model emphasizes different aspects of the search process, what they share is the principle that search begins with an information need which is articulated in some form of query. What follows below is the first in a mini-series of articles exploring the process of query formulation, starting with the most ubiquitous of design elements: the search box.

If you are designing or using search interfaces, you will benefit from reading this post.

Suggestion: Don’t jump to the summary and best practices. Tony’s analysis is just as informative as the conclusions he reaches.

Google; almost 50 functions & resources killed in 2011

Saturday, December 17th, 2011

Google; almost 50 functions & resources killed in 2011 by Phil Bradley.

Just in case you want to think of other potential projects over the holidays! ;-)

For my topic maps class:

  1. Pick one function or resource
  2. Outline how semantic integration could support or enhance such a function or resource. (3-5 pages, no cites)
  3. Bonus points: What resources would you want to integrate for such a function or resource? (1-2 pages)

Google removes more search functionality

Saturday, December 17th, 2011

Google removes more search functionality by Phil Bradley.

From the post:

In Google’s apparently lemming like attempt to throw as much search functionality away as they can, they have now revamped their advanced search page. Regular readers will recall that I wrote about Google making it harder to find, and now they’re reducing the available options. The screen is now following the usual grey/white/red design, but to refresh your memory, this is what it used to look like:

Just in case you are looking for search opportunities in the near future.

The smart money says to not try to be everything to everybody. Pick off a popular (read advertising supporting) subpart of all content and work it up really well. Offer users what seem like useful defaults for that area. The defaults for television/movie types are likely to be different from those for the Guns & Ammo crowd. As would the advertising you would sell.

Remind me to write about using topic maps to create pull-model advertising. So that viewers pre-qualify themselves and you can charge more for “hits” on ads.

A Task-based Model of Search

Wednesday, December 14th, 2011

A Task-based Model of Search by Tony Russell-Rose.

From the post:

A little while ago I posted an article called Findability is just So Last Year, in which I argued that the current focus (dare I say fixation) of the search community on findability was somewhat limiting, and that in my experience (of enterprise search, at least), there are a great many other types of information-seeking behaviour that aren’t adequately accommodated by the ‘search as findability’ model. I’m talking here about things like analysis, sensemaking, and other problem-solving oriented behaviours.

Now, I’m not the first person to have made this observation (and I doubt I’ll be the last), but it occurs to me that one of the reasons the debate exists in the first place is that the community lacks a shared vocabulary for defining these concepts, and when we each talk about “search tasks” we may actually be referring to quite different things. So to clarify how I see the landscape, I’ve put together the short piece below. More importantly, I’ve tried to connect the conceptual (aka academic) material to current design practice, so that we can see what difference it might make if we had a shared perspective on these things. As always, comments & feedback welcome.

High marks for a start on what are complex and intertwined issues.

Not so much that we will reach a common vocabulary but so we can be clearer about where we get confused when moving from one paradigm to another.

Which search engine when?

Tuesday, December 13th, 2011

Which search engine when?

A listing of search engines in the following categories:

  • keyword search
  • index or directory based
  • multi or meta search engines
  • visual results
  • category
  • blended results

There are fifty-three (53) entries so plenty to choose from if you are bored with your current search “experience.”

Not to mention learning about different ways to present search results to users.

BTW, if you run across a blog mentioning that AllPlus was listed in two separate categories, like this one, realize that SearchLion was also listed in two separate categories.

Search engines are an important topic for topic mappers because they are one of the places where semantic impedance and the lack of useful organization of information are a major time sink for all users.

Getting 400,000 “hits” is just a curiosity. Getting 402 “hits” in a document archive, as I did this morning, is a considerable amount of content but a manageable one.

No, it wasn’t a topic map that I was searching but the results may well find their way into a topic map.