Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 3, 2011

AO: Annotation Ontology

Filed under: Annotation,Ontology — Patrick Durusau @ 8:22 pm

AO: Annotation Ontology

From the description:

The Annotation Ontology is a vocabulary for performing several types of annotation – comment, entities annotation (or semantic tags), textual annotation (classic tags), notes, examples, erratum… – on any kind of electronic document (text, images, audio, tables…) and document parts. AO is not providing any domain ontology but it is fostering the reuse of the existing ones for not breaking the principle of scalability of the Semantic Web.

Anita de Waard mentioned this in her How to Execute the Research Paper.

Interesting work but you have to realize that all ontologies evolve (except for those that aren’t used) and that not everyone uses the same one.

Still, it is the sort of thing you will encounter in topic maps work so you need to be aware of it.

Heroku, Neo4j and Google Spreadsheet in 10min. Flat.

Filed under: Cypher,Heroku,Neo4j,Ruby — Patrick Durusau @ 8:21 pm

Heroku, Neo4j and Google Spreadsheet in 10min. Flat. by Peter Neubauer

From the description:

This screencast shows how to use Neo4j on Heroku. We will do:

  • Create and install a Heroku app
  • Add a Neo4j instance to it
  • Create a custom Ruby app
  • Execute Cypher queries
  • Connect to the app using Google Spreadsheet
  • Build a small bar chart from a Cypher query

Great presentation, with one tiny flaw: the screen is so small that you have to guess at the contents of the commands. Sure, I can come fairly close, but a file with transcripts of the terminal sessions and the code would be nicer.
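To give a sense of what such a transcript might contain, here is a minimal sketch of the kind of Cypher query you would run against a hosted Neo4j instance from Python. This is my reconstruction, not Peter's code: the NEO4J_URL variable and the /db/data/cypher path follow the Neo4j 1.x REST API and the Heroku add-on conventions as I understand them, and the START-based Cypher syntax is the 2011-era form.

    # Minimal sketch: run a Cypher query against a (possibly Heroku-hosted) Neo4j
    # instance over its 1.x REST API. NEO4J_URL and the endpoint paths are assumptions.
    import os
    import requests

    base_url = os.environ.get("NEO4J_URL", "http://localhost:7474")

    # Create a node, then list node names with a (2011-era) START-based Cypher query.
    requests.post(base_url + "/db/data/node", json={"name": "Neo4j on Heroku"})
    query = {"query": "START n=node(*) RETURN n.name? LIMIT 10", "params": {}}
    result = requests.post(base_url + "/db/data/cypher", json=query).json()

    for row in result.get("data", []):
        print(row)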

I recommend that you download the video for viewing. Watch it once online and you will see what I mean. I ran it full screen on a 22-inch Samsung and a copy of the command sequence would still have been appreciated.

How to Execute the Research Paper

Filed under: Annotation,Biomedical,Dynamic Updating,Linked Data,RDF — Patrick Durusau @ 8:21 pm

How to Execute the Research Paper by Anita de Waard.

I had to create the category, “dynamic updating,” to at least partially capture what Anita describes in this presentation. I would have loved to have seen it in person!

The gist of the presentation is that we need to create mechanisms to support research papers being dynamically linked to the literature and other resources. One example Anita uses is linking a patient’s medical records to reports in the literature, with professional tools for the diagnostician.

It isn’t clear how Linked Data (no matter how generously described by Jeni Tennison) could be the second technology for making research papers linked to other data. In part because, as Jeni points out, URIs are simply more names for some subject: we don’t know whether a given name refers to the resource itself or to something the resource represents. That makes reliable linking rather difficult.

BTW, the web lost its ability to grow in a “gradual and sustainable way” when RDF/Linked Data introduced the notion that URIs cannot be allowed to fail. If you try to reason over something that fails, the reasoner falls over. That is not nearly as robust as allowing semantic 404s.

Anita’s third step, an integrated workflow, is certainly the goal toward which we should be striving. I am less convinced that the mechanisms, such as generating linked data stores in addition to the documents we already have, are the way forward. For documents, for instance, why do we need to repeat data they already possess? Why can’t documents represent their contents themselves? Oh, because that isn’t how Linked Data/RDF stores work.

Still, I would highly recommend this slide deck and that you catch any presentation by Anita that you can.

50th Annual Meeting of the Association for Computational Linguistics

Filed under: Computational Linguistics,Conferences — Patrick Durusau @ 8:21 pm

50th Annual Meeting of the Association for Computational Linguistics

Important dates:

January 15, 2012 (11:59pm PST) : Long Submission Deadline
March 11, 2012 : Long Notification
April 30, 2012 : Long Camera Ready Deadline
March 18, 2012 (11:59pm PST) : Short Submission Deadline
April 23, 2012 : Short Notification
May 7, 2012 : Short Camera Ready Deadline
July 9, 2012 : Conference Starts
July 14, 2012 : Conference Ends

From the call for papers:

The 50th Annual Meeting of the Association for Computational Linguistics and the Human Language Technologies conference will be organized as a single event to be held at the International Convention Center Jeju, Jeju, Korea, on July 8-14, 2012. The conference will cover a broad spectrum of technical areas related to natural language and computation. ACL 2012 will include full papers, short papers, oral presentations, poster presentations, demonstrations, tutorials, and workshops. The conference is organized by the Association for Computational Linguistics.

The conference invites the submission of papers on original and unpublished research on all aspects of computational linguistics, including but not limited to:

1. Discourse, Dialogue, and Pragmatics
2. Information Extraction
3. Information Retrieval
4. Language Resources
5. Lexical Semantics
6. Lexicon and ontology development
7. Machine Translation
8. Multilinguality
9. Multimodal representations and processing
10. Social Media
11. Natural Language Processing Applications
12. Phonology/Morphology, Tagging and Chunking, Word Segmentation
13. Question Answering
14. Sentiment Analysis and Opinion Mining
15. Spoken Language Processing
16. Statistical and Machine Learning Methods
17. Summarization and Generation
18. Syntax and Parsing
19. Text Classification
20. Text Mining
21. User Studies and Evaluation Methods

Detecting Structure in Scholarly Discourse

Filed under: Computational Linguistics,Conferences,Discourse,Narrative — Patrick Durusau @ 8:20 pm

Detecting Structure in Scholarly Discourse (DSSD2012)

Important Dates:

March 11, 2012 Submission Deadline
April 15, 2012 Notification of acceptance
April 30, 2012 Camera-ready papers due
July 12 or 13, 2012 Workshop

From the Call for Papers:

The detection of discourse structure in scientific documents is important for a number of tasks, including biocuration efforts, text summarization, error correction, information extraction and the creation of enriched formats for scientific publishing. Currently, many parallel efforts exist to detect a range of discourse elements at different levels of granularity and for different purposes. Discourse elements detected include the statement of facts, claims and hypotheses, the identification of methods and protocols, as well as the differentiation between new and existing work. In medical texts, efforts are underway to automatically identify prescription and treatment guidelines, patient characteristics, and to annotate research data. Ambitious long-term goals include the modeling of argumentation and rhetorical structure and more recently narrative structure, by recognizing ‘motifs’ inspired by folktale analysis.

A rich variety of feature classes is used to identify discourse elements, including verb tense/mood/voice, semantic verb class, speculative language or negation, various classes of stance markers, text-structural components, or the location of references. These features are motivated by linguistic inquiry into the detection of subjectivity, opinion, entailment, inference, but also author stance and author disagreement, motif and focus.

Several workshops have been focused on the detection of some of these features in scientific text, such as speculation and negation in the 2010 workshop on Negation and Speculation in Natural Language Processing and the BioNLP’09 Shared Task, and hedging in the CoNLL-2010 Shared Task Learning to detect hedges and their scope in natural language text. Other efforts that have included a clear focus on scientific discourse annotation include STIL2011 and Force11, the Future of Research Communications and e-Science. There have been several efforts to produce large-scale corpora in this field, such as BioScope, where negation and speculation information were annotated, and the GENIA Event corpus.

The goal of the 2012 workshop Detecting Structure in Scholarly Discourse is to discuss and compare the techniques and principles applied in these various approaches, to consider ways in which they can complement each other, and to initiate collaborations to develop standards for annotating appropriate levels of discourse, with enhanced accuracy and usefulness.

This conference is being held in conjunction with ACL 2012.

Meet Sheldon, Our Custom Database Server

Filed under: Database,Graphs — Patrick Durusau @ 8:20 pm

Meet Sheldon, Our Custom Database Server

From the post:

Building a recommendations engine is a challenging thing, but one thing that makes a difference is saving your data to make the whole process more efficient. Recommendation engines are fed by user interactions, so the first thought might be to use a graph processing system as a model that lets you abstract the whole system in a natural way. Since Moviepilot Labs is working with a graph database system to store and query all our data we built Sheldon, our custom graph database server.

I suppose that for a movie recommendation service it is only appropriate to start with a teaser about its graph database server.

The post ends:

This article is Part 1 of a two-part series on graph databases being used at Moviepilot. Read about how Moviepilot walks the graph in Part 2. Learn more about graph databases here.

The pointer is a good source of background information, but there is still no detail on “Sheldon.”

tutorial draft on curse of dimensionality

Filed under: Dimensions — Patrick Durusau @ 8:19 pm

tutorial draft on curse of dimensionality

From the post:

Curse of dimensionality is a widely heard of, largely misunderstood concept in machine learning. There is one single explanation of it circulating, but there is more to it. I will explain what is the curse, and how it complicates everything.

I don’t follow hockey, but the example would be easy enough to adapt to other subject domains.

The author illustrates one problem with dimensionality and promises to discuss others.
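To see one facet of the problem for yourself, here is a minimal sketch (my own, not the blog author's hockey example) of distance concentration: as the number of dimensions grows, the largest and smallest pairwise distances between random points converge, which undermines nearest-neighbor reasoning.

    # Minimal sketch of distance concentration, one common face of the curse of
    # dimensionality: the max/min pairwise distance ratio shrinks toward 1 as d grows.
    import numpy as np
    from scipy.spatial.distance import pdist

    rng = np.random.default_rng(42)

    for d in (2, 10, 100, 1000):
        points = rng.random((500, d))   # 500 random points in the d-dimensional unit cube
        dists = pdist(points)           # all pairwise Euclidean distances
        print(f"d={d:5d}  max/min distance ratio = {dists.max() / dists.min():.2f}")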

I say “the author” because this is one of those blogs where identification of the author isn’t clear. In academic discussions that is more than a little annoying.

Good illustration of the problem and points for that.

December 2, 2011

Netwitness

Filed under: Networks,Security — Patrick Durusau @ 4:56 pm

Netwitness

I was researching the InfiniteGraph post when the splash on the NetWitness homepage, “Know Everything Answer Anything: The Revolutionary Approach to Network Monitoring,” caught my eye.

You really need to watch the intro videos for NetWitness Investigator. I thought it was impressive, but then I used command-line tools for similar purposes more than a decade ago. And the art has changed a lot since then.

Although I think the interface is “busy,” I did like the idea of blurring the separation of querying and navigation.

That is an idea we would do well to think about for topic maps. (Would depend on the domain.)

The NetWitness Investigator is a free download, so I am going to play with a copy. (On my Windows box, because Linux is only available for the commercial version.)

InfiniteGraph 2.0

Filed under: Graphs,InfiniteGraph,NoSQL — Patrick Durusau @ 4:56 pm

InfiniteGraph 2.0

From the product page:

InfiniteGraph helps organizations find the valuable relationships within their data. Our product is unique in its ability to leverage distributed data and processes, which yields reduced time and costs while maximizing overall performance on big data.

No other graph database technology available today can match InfiniteGraph’s combined strengths of persisting and traversing complex relationships requiring multiple hops, across vast and distributed data stores.

But here is more important information (Objectivity, Inc. is the owner of InfiniteGraph 2.0):

Objectivity, Inc., the leader in distributed, scalable data management solutions, today announced that Government Security News (GSN) has named its flagship database, Objectivity/DB, as winner of its annual Homeland Security Awards program in the “Best Intelligence Data Fusion and Collaborative Analysis Solution” category. The annual GSN Homeland Security Awards program celebrates the ongoing public-private partnership between all branches of Federal, state and local government in the United States and the private sector vendors of IT security, whose combined efforts successfully defend and protect the nation’s people, property and way of life. Click here for a list of awards categories and finalists, as well as for more information on GSN’s Homeland Security Awards.

“GSN is an authoritative source of news and information on all aspects of homeland security, and we are honored to be recognized by their esteemed panel of judges,” said Jay Jarrell, president and CEO of Objectivity, Inc. “This award is a testament to our leadership in the government sector, and underscores how agencies like the U.S. Air Force’s Network Centric Collaborative Targeting System (NCCT), Analyst Support Architecture (ASA) and the U.S. Navy’s Broad Area Maritime Surveillance (BAMS) Unmanned Aircraft System (UAS) program are leveraging Objectivity/DB to power distributed mission critical intelligence data fusion and collaborative analysis.”

Note that I corrected the first link in the first paragraph to point to the news of the award dinner. BTW, NetWitness and Overwatch Textron Systems were also winners in the “Best Intelligence Data Fusion and Collaborative Analysis Solution” category. Both are worth your attention as well.

In terms of seeking an audience to discuss homeland security solutions, I think basing your approach on award-winning software would be a good idea.

Toolset for Genomic Analysis, Data Management

Filed under: Bioinformatics,Biomedical — Patrick Durusau @ 4:56 pm

Toolset for Genomic Analysis, Data Management

From the post:

The informatics group at the Genome Institute at Washington University School of Medicine has released an integrated analysis and information-management system called the Genome Modeling System.

The system borrows concepts from traditional laboratory information-management systems — such as tracking methods and data-access interfaces — and applies them to genomic analysis. The result is a standardized system that integrates both analysis and management capabilities, David Dooling, the assistant director of informatics at Wash U and one of the developers of GMS, explained to BioInform.

Not exactly. The tools that will make up the “Genome Modeling System” have been released, but melding them into the full package is something we will see near the end of this year, according to a later part of the article.

I remember the WU-FTPD software from before it fell into disrepute, so I have great expectations for this software. I will keep watch and post a note when it appears for download.

Acunu Data Platform v1.2 released!

Filed under: Acunu,Cassandra,NoSQL — Patrick Durusau @ 4:55 pm

Acunu Data Platform v1.2 released!

From the announcement:

We’re excited to announce the release of version 1.2 of the Acunu Data Platform, incorporating Apache Cassandra — the fastest and lowest-risk route to building a production-grade Cassandra cluster.

The Acunu Data Platform (ADP) is an all-in-one distributed database solution, delivered as a software appliance for your own data center or an Amazon Machine Image (AMI) for cloud deployments. It includes:

  • A hardened version of Apache Cassandra that is 100% compatible with existing Cassandra applications
  • The Acunu Core, a file system and embedded database designed from the ground-up for Big Data workloads
  • A web-based management console that simplifies deployment, monitoring and scaling of your cluster.
  • Your standard Linux CentOS

Useful Mongo Resources for NoSQL Newbs

Filed under: MongoDB,NoSQL — Patrick Durusau @ 4:54 pm

Useful Mongo Resources for NoSQL Newbs

Michael Robinson has a small but useful collection of resources to introduce users to NoSQL and in particular MongoDB.

If you know of other resources Michael should be listing, give him a shout!

rCUDA

Filed under: CUDA,GPU — Patrick Durusau @ 4:53 pm

rCUDA

From the post:

We are glad to announce the new version 3.1 of rCUDA. It has been developed in a joint collaboration with the Parallel Architectures Group from the Technical University of Valencia.

The rCUDA framework enables the concurrent usage of CUDA-compatible devices remotely.

rCUDA employs the socket API for the communication between clients and servers. Thus, it can be useful in three different environments:

  • Clusters. To reduce the number of GPUs installed in High Performance Clusters. This leads to increased GPU usage and therefore energy savings as well as other related savings like acquisition costs, maintenance, space, cooling, etc.
  • Academia. In commodity networks, to offer access to a few high performance GPUs concurrently to many students.
  • Virtual Machines. To enable the access to the CUDA facilities on the physical machine.

The current version of rCUDA (v3.1) implements most of the functions in the CUDA Runtime API version 4.0, excluding only those related with graphics interoperability. rCUDA 3.1 targets the Linux OS (for 32- and 64-bit architectures) on both client and server sides.

This was mentioned in the Letting GPUs run free post but I thought it merited a separate entry. This is very likely to be important.

DEX 4.3 Graph Database

Filed under: DEX,Graphs — Patrick Durusau @ 4:53 pm

DEX 4.3 Graph Database

Sparsity Technologies has released DEX 4.3!

From the products page:

DEX 4.3 is distributed in 3 different APIs:

  • JAVA: Java API for DEX. We have remodelled the Java API for a better-structured future of DEX.
    New feature – DEX Java 4.3 includes loaders and exporters for edges and nodes.
    Allows creating a graph and manipulating its schema using a script. Load nodes and edges directly from a CSV file.
    Take into account that currently this API does not offer graph algorithms; they will be added in the next release.
  • .NET: The .NET API makes it possible for Microsoft .NET programmers to use the high performance of the DEX graph database.
    New feature – DEX.NET 4.3 includes loaders and exporters for edges and nodes.
    Allows creating a graph and manipulating its schema using a script. Load nodes and edges directly from a CSV file.
    Take into account that currently this API does not offer graph algorithms; they will be added in the next release.
  • JDEX: For complete compatibility with applications from previous versions of DEX, we are keeping the JDEX API.

I checked with Sparsity and the statement about the graph algorithms should read:

The graph algorithms (DFS, BFS and Shortest Path) present in JDEX API since v3.0 are not yet included in the new APIs JAVA and .NET, released since v.4.2 and following. The v.4.5 release, planned for 1st quarter of 2012 will include these algorithms in the JAVA and .NET APIs.

The evaluation version only goes up to 1 million nodes, so you will have to use something else for your season’s wish list. 😉

openSNP

Filed under: Bioinformatics,Biomedical — Patrick Durusau @ 4:51 pm

openSNP

Don’t recognize the name? I didn’t either when I came across it on Genoweb under the title Battle Over.

Then I read the homepage blurb:

openSNP allows customers of direct-to-customer genetic tests to publish their test results, find others with similar genetic variations, learn more about their results, find the latest primary literature on their variations and help scientists to find new associations.

I think we will be hearing more about openSNP in the not too distant future.

Sounds like a useful place to talk about topic maps, but in terms of their semantic impedances and their identifiers for subjects.

Hard to sell a product if we are fixing a “problem” that no one sees as a “problem.”

Letting GPUs run free

Filed under: CUDA,GPU — Patrick Durusau @ 4:51 pm

Letting GPUs run free by Dan Olds.

From the post:

One of the most interesting things I saw at SC11 was a joint Mellanox and University of Valencia demonstration of rCUDA over Infiniband. With rCUDA, applications can access a GPU (or multiple GPUs) on any other node in the cluster. It makes GPUs a sharable resource and is a big step towards making them as virtualisable (I don’t think that’s a word, but going to go with it anyway) as any other compute resource.

There aren’t a lot of details out there yet, there’s this press release from Mellanox and Valencia and this explanation of the rCUDA project.

This is a big deal. To me, the future of computing will be much more heterogeneous and hybrid than homogeneous and, well, some other word that means ‘common’ and begins with ‘H’. We’re moving into a mindset where systems are designed to handle particular workloads, rather than workloads that are modified to run sort of well on whatever systems are cheapest per pound or flop.

Cassandra Drivers

Filed under: Cassandra,JDBC,Python — Patrick Durusau @ 4:50 pm

Cassandra Drivers

No sooner do I write about the new release of Cassandra than I see a post about Python DB and JDBC drivers for Cassandra!
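I have not tried the drivers in that post yet, but for readers new to Cassandra, here is a minimal sketch of what talking to Cassandra from Python looked like at the time, using pycassa (a separate, widely used client, not necessarily one of the drivers linked above). The keyspace and column family names are made up.

    # Minimal sketch with pycassa (not necessarily the driver from the linked post).
    # 'Keyspace1' and 'Users' are made-up names; 9160 was Cassandra's Thrift port.
    import pycassa

    pool = pycassa.ConnectionPool('Keyspace1', server_list=['localhost:9160'])
    users = pycassa.ColumnFamily(pool, 'Users')

    users.insert('alice', {'email': 'alice@example.com', 'name': 'Alice'})
    print(users.get('alice'))   # returns the columns just written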

Enjoy!

Cassandra 1.0.5

Filed under: Cassandra,NoSQL — Patrick Durusau @ 4:50 pm

Cassandra 1.0.5

A reversion release of Cassandra. Details: Cassandra changes.

Looks like the holidays are going to be filled with upgrades, new releases!

Knime 2.5

Filed under: Knime — Patrick Durusau @ 4:49 pm

Knime 2.5

From the announcement:

Also this year we are putting a new version of KNIME into your shoes, just in time for Nikolaus (St. Nicholas’ Day). KNIME 2.5 contains a number of new nodes (most notably powerful string manipulation nodes and an improved JFreeChart integration) and many usability enhancements. Note also that registration for our 2012 User Group Meeting is now open!

You can download KNIME here: http://www.knime.org/download and the complete list of changes here: http://tech.knime.org/changelog-v250

The 5th KNIME Users Group Meeting will take place in Zurich, Switzerland between January 30 and February 3, 2012. Similar to last year, the UGM is accompanied by a two-day KNIME user and reporting training program as well as special workshops from KNIME and the KNIME Partners on Friday. For more information and the registration form see: http://www.knime.org/ugm2012

If you can’t make it to Zurich, at least take the time to send the Knime team season’s greetings at: http://www.knime.org/contact.

2nd Globals Challenge

Filed under: Contest,Globalsdb — Patrick Durusau @ 10:48 am

2nd Globals Challenge

Just a few hours left until the start of the 2nd Globals Challenge so I am sending this on its way.

Details being released at 18:00 EST on 2 December 2011!

Prizes to be awarded!

Maybe by this time next year we could organize something like this for topic maps. That would be way cool!

December 1, 2011

Seven Databases in Seven Weeks now in Beta

Filed under: CouchDB,HBase,MongoDB,Neo4j,PostgreSQL,Redis,Riak — Patrick Durusau @ 7:41 pm

Seven Databases in Seven Weeks now in Beta

From the webpage:

Redis, Neo4J, Couch, Mongo, HBase, Riak, and Postgres: with each database, you’ll tackle a real-world data problem that highlights the concepts and features that make it shine. You’ll explore the five data models employed by these databases: relational, key/value, columnar, document, and graph. See which kinds of problems are best suited to each, and when to use them.

You’ll learn how MongoDB and CouchDB, both JavaScript powered, document oriented datastores, are strikingly different. Learn about the Dynamo heritage at the heart of Riak and Cassandra. Understand MapReduce and how to use it to solve Big Data problems.

Build clusters of servers using scalable services like Amazon’s Elastic Compute Cloud (EC2). Discover the CAP theorem and its implications for your distributed data. Understand the tradeoffs between consistency and availability, and when you can use them to your advantage. Use multiple databases in concert to create a platform that’s more than the sum of its parts, or find one that meets all your needs at once.

Seven Databases in Seven Weeks will give you a broad understanding of the databases, their strengths and weaknesses, and how to choose the ones that fit your needs.

Now in beta, in non-DRM PDF, epub, and mobi from pragprog.com/book/rwdata.

If you know the Seven Languages in Seven Weeks by Bruce Tate, no further recommendation is necessary for the approach.

I haven’t read the book, yet, but will be getting the electronic beta tonight. More to follow.

Node.js Tutorial

Filed under: Javascript,node-js — Patrick Durusau @ 7:41 pm

Node.js Tutorial by Alex Young.

A tutorial on building a notepad web app called NodePad using node.js.

I happened upon lesson #23, which makes it hard to get your bearings, but Alex included the link you see above, which gives you an ordered listing of the lessons.

I haven’t gone back to the beginning but judging from the comments it would not be a bad thing to do!

elasticsearch version 0.18.5

Filed under: ElasticSearch,Search Engines — Patrick Durusau @ 7:40 pm

elasticsearch version 0.18.5

From the blog entry:

You can download it here. It includes an upgraded Lucene version (3.5), featuring bug fixes and memory improvements, as well as more bug fixes in elasticsearch itself. Changes can be found here.

Relevancy Driven Development with Solr

Filed under: LucidWorks,Solr — Patrick Durusau @ 7:40 pm

Relevancy Driven Development with Solr by Robin Bramley.

From the post:

The relevancy of search engine results is very subjective so therefore testing the relevancy of queries is also subjective. One technique that exists in the information retrieval field is the use of judgement lists; an alternative approach discussed here is to follow the Behaviour Driven Development methodology employing user story acceptance criteria – I’ve been calling this Relevancy Driven Development or RDD for short.

I’d like to thank Eric Pugh for a great discussion on search engine testing and for giving me a guest slot in his ‘Better Search Engine Testing‘ talk* at Lucene EuroCon Barcelona 2011 to mention RDD. The first iteration of Solr-RDD combines my passion for automated testing with my passion for Groovy by leveraging EasyB (a Groovy BDD testing framework).

The Solr-RDD GitHub site comes closer to spelling out the expectations of the project:

The aim of RDD is to allow the business users to gain confidence in the relevancy of the search query results.

The trick is that the business users can use a constrained data set, define a query and the results they expect in the order that they expect.
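To make that concrete, here is a rough sketch of the pattern in Python. The project itself uses Groovy and EasyB, so this is an illustration of the idea rather than the Solr-RDD code, and the Solr URL, core name, and document ids are placeholders: each user story becomes a query plus the documents the business user expects, in the order they expect them.

    # Rough sketch of a relevancy acceptance test: a query plus the expected
    # document ids, in the expected order. URL, core, and ids are placeholders.
    import requests

    SOLR = "http://localhost:8983/solr/products/select"

    EXPECTATIONS = {
        "red running shoes": ["doc-101", "doc-205", "doc-042"],
        "waterproof jacket": ["doc-310", "doc-311"],
    }

    def top_ids(query, rows):
        resp = requests.get(SOLR, params={"q": query, "rows": rows, "wt": "json"})
        return [doc["id"] for doc in resp.json()["response"]["docs"]]

    for query, expected in EXPECTATIONS.items():
        actual = top_ids(query, rows=len(expected))
        status = "PASS" if actual == expected else "FAIL"
        print(f"{status}  {query!r}: expected {expected}, got {actual}")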

Well… maybe. Two things concern me:

First, a user would have to “know” the data extremely well to formulate queries in that sort of detail, and

Second, it does not appear to leave any room for unexpected information that might also be useful to the user.

Perhaps this is a technique that works well with very well-known data sets that have few, if any, unexpected results.

A taxonomy of web search (2002)

Filed under: Searching — Patrick Durusau @ 7:39 pm

A taxonomy of web search (2002) by Andrei Broder is cited by Tony Russell-Rose in A Taxonomy of Enterprise Search and Discovery, where Tony proposes extending it to the enterprise.

This post is more of a reminder than anything else to update Broder’s taxonomy of web search and, after updating it, compare it to the taxonomy as “extended” by Russell-Rose. My suspicion is that the basic aspects of web search haven’t changed, even if the vocabulary we use to describe them has, substantially.

BTW, it was 2002 when the American Dialect Society declared “google” to be the “most useful” new word of the year and 2006 when Merriam-Webster included it in the dictionary.

Anyone interested in updating Broder’s taxonomy? (Which would include (topic) mapping new terminology to old.)

Cool GO annotation visualizations with Gephi + Bio4j

Filed under: Graphs,Visualization — Patrick Durusau @ 7:38 pm

Cool GO annotation visualizations with Gephi + Bio4j

Sometimes really awesome things just show up in my inbox! This is one of them.

From the post:

After a few months without finding the opportunity to play with Gephi, it was already time to dedicate a lab day to this.

I thought that a good feature would be having the equivalent .gexf file for the current graph representation available at the tab “GoAnnotation Graph Viz”; so that you could play around with it in Gephi adapting it to your specific needs.

Then I got down to work and this is what happened:

Here is the visualization that got my interest up:

http://bio4j.com/imgs/EHEC_MolecularFunction_SeaDragon/
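If you want to try the same hand-off to Gephi on your own graphs, the .gexf step is easy to reproduce. A minimal sketch with networkx (this is not the Bio4j code; the protein name is invented and the GO terms are arbitrary examples):

    # Minimal sketch: build a small annotation-style graph and export it as GEXF
    # for Gephi. The protein name is invented; the GO terms are arbitrary examples.
    import networkx as nx

    g = nx.Graph()
    g.add_node("GO:0003824", label="catalytic activity", kind="go_term")
    g.add_node("GO:0016787", label="hydrolase activity", kind="go_term")
    g.add_node("eco:Z1234", label="hypothetical protein", kind="protein")

    g.add_edge("GO:0016787", "GO:0003824", relation="is_a")
    g.add_edge("eco:Z1234", "GO:0016787", relation="annotated_with")

    nx.write_gexf(g, "go_annotations.gexf")   # open this file in Gephi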

Enjoy!

CPSC 533C: Information Visualization, Fall 2011-2012

Filed under: Graphics,Graphs,Visualization — Patrick Durusau @ 7:38 pm

CPSC 533C: Information Visualization, Fall 2011-2012 by Tamara Munzner.

I found this page while looking for the software described in: CiteWiz: A Tool for the Visualization of Scientific Citation Networks (2007).

Very good course outline with links to lots of outside reading. Not to mention there are pointers to other visualization courses, software, etc.

If you want a solid grounding in visualization, it is hard to think of a better place to start.

Any number of the articles or software packages merit separate blog posts to point up their relevance to and/or use for topic maps.

CiteWiz: A Tool for the Visualization of Scientific Citation Networks (2007)

Filed under: Bibliography,Visualization — Patrick Durusau @ 7:37 pm

CiteWiz: A Tool for the Visualization of Scientific Citation Networks (2007) by Niklas Elmqvist and Philippas Tsigas.

Abstract:

We present CiteWiz, an extensible framework for visualization of scientific citation networks. The system is based on a taxonomy of citation database usage for researchers, and provides a timeline visualization for overviews and an influence visualization for detailed views. The timeline displays the general chronology and importance of authors and articles in a citation database, whereas the influence visualization is implemented using the Growing Polygons technique, suitably modified to the context of browsing citation data. Using the latter technique, hierarchies of articles with potentially very long citation chains can be graphically represented. The visualization is augmented with mechanisms for parent-child visualization and suitable interaction techniques for interacting with the view hierarchy and the individual articles in the dataset. We also provide an interactive concept map for keywords and co-authorship using a basic force-directed graph layout scheme. A formal user study indicates that CiteWiz is significantly more efficient than traditional database interfaces for high-level analysis tasks relating to influence and overviews, and equally efficient for low-level tasks such as finding a paper and correlating bibliographical data.

The interactive concept map is particularly interesting, although the entire article will be useful for anyone experimenting with network or topic map visualization.
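The “basic force-directed graph layout scheme” mentioned in the abstract is easy to experiment with outside CiteWiz. A minimal sketch with networkx and matplotlib (the keyword and co-authorship data here is invented, not taken from the paper):

    # Minimal sketch of a force-directed layout for a tiny co-authorship/keyword
    # graph, in the spirit of CiteWiz's concept map. The data is invented.
    import matplotlib.pyplot as plt
    import networkx as nx

    g = nx.Graph()
    g.add_edges_from([
        ("Elmqvist", "Tsigas"),            # co-authorship edge
        ("Elmqvist", "visualization"),     # author-to-keyword edges
        ("Tsigas", "visualization"),
        ("Tsigas", "concurrency"),
        ("visualization", "citation networks"),
    ])

    pos = nx.spring_layout(g, seed=42)     # Fruchterman-Reingold force-directed layout
    nx.draw_networkx(g, pos, node_color="lightsteelblue", font_size=8)
    plt.axis("off")
    plt.savefig("concept_map.png", dpi=150)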

GPUs: Path into the future

Filed under: GPU — Patrick Durusau @ 7:37 pm

GPUs: Path into the future

From the introduction:

With the announcement of a new Blue Waters petascale system that includes a considerable amount of GPU capability, it is clear GPUs are the future of supercomputing. Access magazine’s Barbara Jewett recently sat down with Wen-mei Hwu, a professor of electrical and computer engineering at the University of Illinois, a co-principal investigator on the Blue Waters project, and an expert in computer architecture, especially GPUs.

Find out why you should start thinking about GPU systems, now.

For more information, see the Blue Waters project.
