Another Word For It: Patrick Durusau on Topic Maps and Semantic Diversity

April 3, 2013

MapR and Ubuntu

Filed under: Hadoop,MapR,MapReduce — Patrick Durusau @ 5:06 am

MapR has posted all of its Hadoop ecosystem source code to GitHub: MapR Technologies.

MapR has also partnered with Canonical to release the entire Hadoop stack for the Ubuntu 12.04 LTS and 12.10 releases on www.ubuntu.com, starting April 25, 2013.

For details see: MapR Teams with Canonical to Deliver Hadoop on Ubuntu.

I first saw this at: MapR Turns to Ubuntu in Bid to Increase Footprint by Isaac Lopez.

Biological Database of Images and Genomes

Filed under: Associations,Biology,Genome,Genomics — Patrick Durusau @ 4:48 am

Biological Database of Images and Genomes: tools for community annotations linking image and genomic information by Andrew T Oberlin, Dominika A Jurkovic, Mitchell F Balish and Iddo Friedberg. (Database (2013) 2013 : bat016 doi: 10.1093/database/bat016)

Abstract:

Genomic data and biomedical imaging data are undergoing exponential growth. However, our understanding of the phenotype–genotype connection linking the two types of data is lagging behind. While there are many types of software that enable the manipulation and analysis of image data and genomic data as separate entities, there is no framework established for linking the two. We present a generic set of software tools, BioDIG, that allows linking of image data to genomic data. BioDIG tools can be applied to a wide range of research problems that require linking images to genomes. BioDIG features the following: rapid construction of web-based workbenches, community-based annotation, user management and web services. By using BioDIG to create websites, researchers and curators can rapidly annotate a large number of images with genomic information. Here we present the BioDIG software tools that include an image module, a genome module and a user management module. We also introduce a BioDIG-based website, MyDIG, which is being used to annotate images of mycoplasmas.

Database URL: BioDIG website: http://biodig.org

BioDIG source code repository: http://github.com/FriedbergLab/BioDIG

The MyDIG database: http://mydig.biodig.org/

Linking image data to genomic data. Sounds like associations to me.

You?

Not to mention the heterogeneity of genomic data.

Imagine extending an image/genomic data association by additional genomic data under a different identification.
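To make the association reading concrete, here is a minimal sketch in plain Python. The association type, role names and identifiers are all invented for illustration; this is not BioDIG's actual data model:

```python
# Minimal sketch: an image-to-genome link read as a topic map style
# association. Type, roles and identifiers are invented for illustration;
# this is NOT BioDIG's actual data model.

association = {
    "type": "depicts-genomic-feature",
    "roles": {
        "image":          "http://example.org/images/mycoplasma-42.png",
        "genomic-region": "http://example.org/genomes/feature-0001",
    },
}

# Extending the association when the same region turns up under a
# different identification (say, another database's accession number):
association["roles"]["genomic-region-alt"] = "http://example.org/other-db/acc-123"

print(association)
```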

100 Savvy Sites on Statistics and Quantitative Analysis

Filed under: Mathematics,Quantitative Analysis,Statistics — Patrick Durusau @ 4:21 am

100 Savvy Sites on Statistics and Quantitative Analysis

From the post:

Nate Silver’s unprecedentedly accurate prediction of state-by-state election results in the most recent presidential race was a watershed moment for the public awareness of statistics. While data gathering and analysis has become a massive industry in the past decade, it hasn’t always been as well covered in the press or publicly accessible as it is now. With more and more of our daily interactions being mediated through computers and the internet, it is easier than ever to gather detailed quantitative data, do statistical analysis on that data, and derive valuable information and predictions from it.

Knowledge of statistics and quantitative analysis techniques is more valuable than ever. From biostatisticians to politicians and economists, people in every field are using statistics to further their careers and knowledge. These sites are some of the most useful, informative, and comprehensive on the web covering stats and quantitative analysis.

Covers everything from Comprehensive Statistics Sites and Big Data to Data Visualization and Sports Stats.

Fire up your alternative to Google Reader!

I first saw this at 100 Savvy Sites on Statistics and Quantitative Analysis by Vincent Granville.

Intrade Archive: Data for Posterity

Filed under: Data,Finance Services,Prediction — Patrick Durusau @ 4:07 am

Intrade Archive: Data for Posterity by Panos Ipeirotis.

From the post:

A few years back, I did some work on prediction markets. For this line of research, we collected data from Intrade to perform our experimental analysis. Some of the data is available through the Intrade Archive, a web app that I wrote in order to familiarize myself with the Google App Engine.

In the last few weeks, though, after the effective shutdown of Intrade, I started receiving requests for access to the data stored in the Intrade Archive. So, after popular demand, I gathered all the data from the Intrade Archive, together with all the past data that I had about all the Intrade contracts going back to 2003, and put it all on GitHub for everyone to access and download.

If you don’t know about Intrade, see: How Intrade Works.

Not sure why you would need the data but it is unusual enough to merit notice.

April 2, 2013

Topic Map Tool Chain

Filed under: Authoring Topic Maps,Topic Map Software,Topic Map Systems,Topic Maps — Patrick Durusau @ 6:15 pm

Belaboring the state of topic map tools won't change the fact: they could use improvement.

Leaving the current state of topic map tools to one side, I have a suggestion about going forward.

What if we conceptualize topic map production as a tool chain?

A chain that can exist as separate components or with combinations of components.

Thinking like *nix tools, each one could be designed to do one task well.

The stages I see:

  1. Authoring
  2. Merging
  3. Conversion
  4. Query
  5. Display

The only odd-looking stage is “conversion.”

By that I mean conversion from a topic map data store or format into some other format for integration, query or display.

TaxMap, the oldest topic map on the WWW, is a conversion to HTML for delivery.

Converting a topic map into graph format enables the use of graph display or query mechanisms.

End-to-end solutions are possible but a tool chain perspective enables smaller projects with quicker returns.
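To make the *nix-tools analogy concrete, here is a minimal sketch in Python. The stage functions and data structures are invented for illustration; no existing topic map toolkit is being described:

```python
# Hypothetical sketch: topic map production as a *nix-style tool chain.
# Each stage does one job; stages compose freely. Function names and
# data structures are illustrative, not from any real toolkit.

def author(names):
    """Authoring: turn raw input into a minimal topic map."""
    return {"topics": [{"id": n.lower(), "names": [n]} for n in names],
            "associations": []}

def merge(map_a, map_b):
    """Merging: combine two maps, merging topics that share an id."""
    topics = {t["id"]: dict(t) for t in map_a["topics"]}
    for t in map_b["topics"]:
        if t["id"] in topics:
            topics[t["id"]]["names"] = sorted(
                set(topics[t["id"]]["names"]) | set(t["names"]))
        else:
            topics[t["id"]] = dict(t)
    return {"topics": list(topics.values()),
            "associations": map_a["associations"] + map_b["associations"]}

def convert(tmap):
    """Conversion: emit another format (here, GraphViz DOT) for display/query."""
    lines = ["graph topicmap {"]
    lines += [f'  "{t["id"]}";' for t in tmap["topics"]]
    lines += [f'  "{a[0]}" -- "{a[1]}";' for a in tmap["associations"]]
    return "\n".join(lines + ["}"])

# Chain the stages, pipe-style:
print(convert(merge(author(["Topic Maps"]), author(["TMCL", "Topic Maps"]))))
```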

Comments/Suggestions?

Topic Map Patterns/Use Cases

Filed under: Design,Design Patterns,Graphics,UML,Visualization — Patrick Durusau @ 3:18 pm

The sources for topic map patterns I mentioned yesterday use a variety of modeling languages:

Data Model Patterns: Conventions of Thought by David C. Hay. (Uses CASE*Method™ (Barker’s Notation))

Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans. (Uses UML (Unified Modeling Language))

Developing High Quality Data Models by Matthew West. (Uses EXPRESS (EXPRESS-G is for information models))

The TMDM and Kal’s Design Patterns both use UML notation.

Although constraints will be expressed in TMCL, visually it looks to me like UML should be the notation of choice.

It will require transposition from the non-UML notations, but a uniform notation seems worth the effort.

Any strong reasons to use another notation?

Force-Directed Graph

Filed under: D3,Graphics,Graphs,Visualization — Patrick Durusau @ 2:15 pm

Force-Directed Graph

From the post:

This simple force-directed graph shows character co-occurrence in Les Misérables. A physical simulation of charged particles and springs places related characters in closer proximity, while unrelated characters are farther apart. Layout algorithm inspired by Tim Dwyer and Thomas Jakobsen. Data based on character coappearance in Victor Hugo’s Les Misérables, compiled by Donald Knuth.

Display of graphs (read topic maps) need not be limited to complex applications.
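As it happens, the same data and layout idea are available outside the browser too. A minimal sketch using Python's networkx (which bundles Knuth's Les Misérables coappearance graph) and matplotlib, assuming both libraries are installed:

```python
# Minimal sketch: force-directed layout of Knuth's Les Misérables
# coappearance graph, which networkx ships as a built-in dataset.
# Assumes networkx and matplotlib are installed.
import matplotlib.pyplot as plt
import networkx as nx

G = nx.les_miserables_graph()       # nodes: characters, edges: coappearances
pos = nx.spring_layout(G, seed=42)  # Fruchterman-Reingold force simulation
nx.draw(G, pos, node_size=40, with_labels=False)
plt.show()
```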

Sam Hunting suggested this link.

Construction of Controlled Vocabularies

Filed under: Identity,Subject Identity,Subject Recognition,Vocabularies — Patrick Durusau @ 2:01 pm

Construction of Controlled Vocabularies: A Primer by Marcia Lei Zeng.

From the “why” page:

Vocabulary control is used to improve the effectiveness of information storage and retrieval systems, Web navigation systems, and other environments that seek to both identify and locate desired content via some sort of description using language. The primary purpose of vocabulary control is to achieve consistency in the description of content objects and to facilitate retrieval.

1.1 Need for Vocabulary Control

The need for vocabulary control arises from two basic features of natural language, namely:

• Two or more words or terms can be used to represent a single concept

Example:
  salinity/saltiness
  VHF/Very High Frequency

• Two or more words that have the same spelling can represent different concepts

Example:
  Mercury (planet)
  Mercury (metal)
  Mercury (automobile)
  Mercury (mythical being)

Great examples for vocabulary control, but they work for topic maps as well!

The topic map question is:

What do you know about the subject(s) in either case, that would make you say the words mean the same subject or different subjects?

If we can capture the information you think makes them represent the same or different subjects, there is a basis for repeating that comparison.

Perhaps even automatically.
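A minimal sketch of that idea in plain Python, with an invented matching rule (shared identifiers); the point is only that once the identifying information is captured, the comparison can be rerun mechanically:

```python
# Minimal sketch: repeatable subject comparison. The identifiers and the
# matching rule (any shared identifier) are invented for illustration.

def same_subject(a, b):
    """Treat two records as the same subject if they share an identifier."""
    return bool(set(a["identifiers"]) & set(b["identifiers"]))

# Two words, one concept:
saltiness = {"name": "saltiness", "identifiers": ["http://example.org/id/salinity"]}
salinity  = {"name": "salinity",  "identifiers": ["http://example.org/id/salinity"]}

# One word, two concepts:
mercury_planet = {"name": "Mercury", "identifiers": ["http://example.org/id/mercury-planet"]}
mercury_metal  = {"name": "Mercury", "identifiers": ["http://example.org/id/mercury-element"]}

print(same_subject(saltiness, salinity))           # True
print(same_subject(mercury_planet, mercury_metal)) # False
```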

Mary Jane pointed out this resource in a recent comment.

SharePoint Taxonomy: How to Start

Filed under: SharePoint,Taxonomy — Patrick Durusau @ 1:20 pm

SharePoint Taxonomy: How to Start

From the post:

Are you wondering how to start with SharePoint Taxonomy?

Many people have heard about the value of managed metadata, term store, and tagging in SharePoint 2010 and SharePoint 2013 but don’t have a taxonomy and are wondering what a taxonomy looks like and how to get started.

Download a free SharePoint Taxonomy from WAND and begin to see how taxonomy, managed metadata, and the term store in SharePoint can improve searching and findability of your SharePoint content. This taxonomy is a starter set of terms covering Legal, IT, HR, Accounting and Finance, and Sales and Marketing.

There are more taxonomies at: http://blog.wandinc.com/p/sharepoint-2010-2013-and-online.html.

I ran across this today while thinking about the question of design patterns.

The web is littered with taxonomies, ontologies, thesauri, etc., so rather than starting over from scratch, why not cut-n-paste/adapt/represent existing structures as topic maps?

Suggestions of other sources?

Particularly ones you are interested in seeing as topic maps!

Elm

Filed under: Elm,Interface Research/Design,Programming — Patrick Durusau @ 10:50 am

The Elm Programming Language

From the webpage:

Elm is a functional reactive programming (FRP) language that compiles to HTML, CSS, and JS. FRP is a concise and elegant way to create highly interactive applications and avoid callbacks.

The hyperlinks for “create,” “highly,” “interactive,” and “applications” all lead to examples using Elm.

I never was much of a Pong player. More of a Missile Command and Boulder Dash fan. Still, it is an interesting demonstration. (Wasn’t working when I tried it.)

Yes, another programming language. 😉

But, it does look lite enough to encourage experimentation with interfaces.

Whether it is lite enough to keep people from feeling “invested” in prior interface choices only time will tell.

Not for everyone but I can imagine a topic map interface that offers design patterns in UML notation which are extended/completed by a user.

Or interfaces that are drawing kits of nodes and edges. Some predefined, some you define.

Or interfaces with text boxes and reasonable names for their contents.

Or other variations I cannot imagine.

Could be “lite” or “feature rich,” although I lean towards the “lite” side.

Wherever you come down on that continuum, topic maps need interfaces as varied as their users.

LinkBench [Graph Benchmark]

Filed under: Benchmarks,Facebook,Graphs — Patrick Durusau @ 10:14 am

LinkBench

From the webpage:

LinkBench Overview

LinkBench is a database benchmark developed to evaluate database performance for workloads similar to those of Facebook’s production MySQL deployment. LinkBench is highly configurable and extensible. It can be reconfigured to simulate a variety of workloads and plugins can be written for benchmarking additional database systems.

LinkBench is released under the Apache License, Version 2.0.

Background

One way of modeling social network data is as a social graph, where entities or nodes such as people, posts, comments and pages are connected by links which model different relationships between the nodes. Different types of links can represent friendship between two users, a user liking another object, ownership of a post, or any relationship you like. These nodes and links carry metadata such as their type, timestamps and version numbers, along with arbitrary payload data.

Facebook represents much of its data in this way, with the data stored in MySQL databases. The goal of LinkBench is to emulate the social graph database workload and provide a realistic benchmark for database performance on social workloads. LinkBench’s data model is based on the social graph, and LinkBench has the ability to generate a large synthetic social graph with key properties similar to the real graph. The workload of database operations is based on Facebook’s production workload, and is also generated in such a way that key properties of the workload match the production workload.

A benchmark for testing your graph database performance!
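A rough sketch of the node/link model the overview describes, in Python dataclasses; the field names are guesses from the prose above, not LinkBench's actual schema:

```python
# Rough sketch of the social-graph model described above. Field names
# are guesses from the prose, not LinkBench's actual schema.
from dataclasses import dataclass

@dataclass
class Node:
    id: int
    type: str        # e.g. "user", "post", "comment", "page"
    version: int
    timestamp: int
    payload: bytes   # arbitrary application data

@dataclass
class Link:
    source_id: int
    target_id: int
    type: str        # e.g. "friendship", "likes", "owns"
    version: int
    timestamp: int
    payload: bytes

alice = Node(1, "user", 1, 1364909400, b"")
post  = Node(2, "post", 1, 1364909460, b"hello")
owns  = Link(1, 2, "owns", 1, 1364909460, b"")
```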

Additional details at: LinkBench: A database benchmark for the social graph by Tim Armstrong.

I first saw this in a tweet by Stefano Bertolo.

An Applet for the Investigation of Simpson’s Paradox

Filed under: BigData,Mathematics,Statistics — Patrick Durusau @ 6:17 am

An Applet for the Investigation of Simpson’s Paradox by Kady Schneiter and Jürgen Symanzik. (Journal of Statistics Education, Volume 21, Number 1 (2013))

Simpson’s paradox is best illustrated by the University of California, Berkeley sex discrimination case. Taken in the aggregate, admissions to the graduate school appeared to greatly favor men. Taken by department, no department discriminated against women and most favored admission of women. Same data, different level of examination. That is Simpson’s paradox.
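A toy demonstration of the same reversal, with invented numbers (not the actual Berkeley figures):

```python
# Toy illustration of Simpson's paradox with invented numbers
# (not the actual Berkeley admissions data).
# Department A: men 80/100 admitted, women 9/10 admitted.
# Department B: men 2/10 admitted, women 25/100 admitted.

men   = {"A": (80, 100), "B": (2, 10)}
women = {"A": (9, 10),   "B": (25, 100)}

for dept in ("A", "B"):
    m_rate = men[dept][0] / men[dept][1]
    w_rate = women[dept][0] / women[dept][1]
    print(f"Dept {dept}: men {m_rate:.0%}, women {w_rate:.0%}")
    # women's admission rate is higher in each department...

m_total = sum(a for a, _ in men.values()) / sum(n for _, n in men.values())
w_total = sum(a for a, _ in women.values()) / sum(n for _, n in women.values())
print(f"Overall: men {m_total:.0%}, women {w_total:.0%}")
# ...yet men's aggregate rate is higher. Same data, different level.
```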

Abstract:

This article describes an applet that facilitates investigation of Simpson’s Paradox in the context of a number of real and hypothetical data sets. The applet builds on the Baker-Kramer graphical representation for Simpson’s Paradox. The implementation and use of the applet are explained. This is followed by a description of how the applet has been used in an introductory statistics class and a discussion of student responses to the applet.

From Wikipedia on Simpson’s Paradox:

In probability and statistics, Simpson’s paradox, or the Yule–Simpson effect, is a paradox in which a trend that appears in different groups of data disappears when these groups are combined, and the reverse trend appears for the aggregate data. This result is often encountered in social-science and medical-science statistics,[1] and is particularly confounding when frequency data are unduly given causal interpretations.[2] Simpson’s Paradox disappears when causal relations are brought into consideration.

A cautionary tale about the need to understand data sets and how combining them may impact outcomes of statistical analysis.

Journal of Statistics Education

Filed under: BigData,Mathematics,Statistics — Patrick Durusau @ 5:56 am

Journal of Statistics Education

From the mission statement:

The Journal of Statistics Education (JSE) disseminates knowledge for the improvement of statistics education at all levels, including elementary, secondary, post-secondary, post-graduate, continuing, and workplace education. It is distributed electronically and, in accord with its broad focus, publishes articles that enhance the exchange of a diversity of interesting and useful information among educators, practitioners, and researchers around the world. The intended audience includes anyone who teaches statistics, as well as those interested in research on statistical and probabilistic reasoning. All submissions are rigorously refereed using a double-blind peer review process.

Manuscripts submitted to the journal should be relevant to the mission of JSE. Possible topics for manuscripts include, but are not restricted to: curricular reform in statistics, the use of cooperative learning and projects, innovative methods of instruction, assessment, and research (including case studies) on students’ understanding of probability and statistics, research on the teaching of statistics, attitudes and beliefs about statistics, creative and tested ideas (including experiments and demonstrations) for teaching probability and statistics topics, the use of computers and other media in teaching, statistical literacy, and distance education. Articles that provide a scholarly overview of the literature on a particular topic are also of interest. Reviews of software, books, and other teaching materials will also be considered, provided these reviews describe actual experiences using the materials.

In addition JSE also features departments called “Teaching Bits: A Resource for Teachers of Statistics” and “Datasets and Stories.” “Teaching Bits” summarizes interesting current events and research that can be used as examples in the statistics classroom, as well as pertinent items from the education literature. The “Datasets and Stories” department not only identifies interesting datasets and describes their useful pedagogical features, but enables instructors to download the datasets for further analysis or dissemination to students.

Associated with the Journal of Statistics Education is the JSE Information Service. The JSE Information Service provides a source of information for teachers of statistics that includes the archives of EDSTAT-L (an electronic discussion list on statistics education), information about the International Association for Statistical Education, and links to many other statistics education sources.

If you are going to talk about big data, of necessity you are also going to talk about statistics.

A very good free online resource on statistics.

STScI’s Engineering and Technology Colloquia

Filed under: Astroinformatics,GPU,Image Processing,Knowledge Management,Visualization — Patrick Durusau @ 5:49 am

STScI’s Engineering and Technology Colloquia Series Webcasts by Bruce Berriman.

From the post:

Last week, I wrote a post about Michelle Borkin’s presentation on Astronomical Medicine and Beyond, part of the Space Telescope Science Institute’s (STScI) Engineering and Technology Colloquia Series. STScI archives and posts on-line all the presentations in this series. The talks go back to 2008 (with one earlier one dating to 2001), are generally given monthly or quarterly, and represent a rich source of information on many aspects of engineering and technology. The archive includes, where available, abstracts, Power Point Slides, videos for download, and for the more recent presentations, webcasts.

Definitely an astronomy/space flavor, but the series also includes:

Scientific Data Visualization by Adam Bly (Visualizing.org, Seed Media Group).

Knowledge Retention & Transfer: What You Need to Know by Jay Liebowitz (UMUC).

Fast Parallel Processing Using GPUs for Accelerating Image Processing by Tom Reed (Nvidia Corporation).

Every field is struggling with the same data/knowledge issues, often using different terminologies or examples.

We can all struggle separately or we can learn from others.

Which approach do you use?

April 1, 2013

On the Eighth Day

Filed under: Marketing,Topic Maps — Patrick Durusau @ 7:45 pm

On the eighth day of creation [language and time units are for the convenience of the reader. The celestial court exists outside of their strictures].

I started this post off as an April Fools Day gag but the keyboard ran away from me.

See what you think.

L = Lord

O = Other member(s) of the celestial court

L: The Tower of Babel is another example of bad PR from my own followers.

O: How so? Didn’t you confuse their languages to prevent an assault on Heaven?

L: Look around you. Is it likely I would be fearful of someone piling up bricks to assault Heaven?

O: Well, now that you mention it, no, it doesn’t seem likely. (In an uncertain tone of voice.)

L: Would it help if I explained why humans invented the story of the Tower of Babel?

O: Nodding quickly.

L: Arrogance.

O: Arrogance?

L: Think about it. There are two types of people. One type thinks they know what and how everyone else should be thinking. The other type knows who should be telling others what and how to think.

The Tower of Babel story blames me for the competition to force others to a single way of thinking.

What’s ironic is their arrogance multiplies the number of languages and approaches to languages. Every generation denigrates what went before, for a new bumper crop of shiny “truths.”

Need an example?

Take their “when in the beginning there was FORTRAN….”

Now look at any listing of major programming languages, never mind the smaller ones.

No Tower of Babel story there.

O: What about the Tower of Babel as an explanation for different languages?

L: Glad you asked.

I’ll give you one guess who thinks they are entitled to an explanation for everything.

More listening to others and less whining about not being in charge would be a start towards less confusion of languages.

New Book Explores the P-NP Problem [Explaining Topic Maps]

Filed under: Marketing,Mathematical Reasoning,Mathematics — Patrick Durusau @ 5:24 pm

New Book Explores the P-NP Problem by Shar Steed.

From the post:

The Golden Ticket: P, NP, and the Search for the Impossible, written by CCC Council and CRA board member Lance Fortnow, is now available. The inspiration for the book came in 2009 when Fortnow published an article on the P-NP problem for Communications of the ACM. With more than 200,000 downloads, the article is one of the website’s most popular, which signals that this is an issue people are interested in exploring. The P-NP problem is the most important open problem in computer science because it attempts to measure the limits of computation.

The book is written to appeal to readers outside of computer science and shed light on the fact that there are deep computational challenges that computer scientists face. To make it relatable, Fortnow developed the “Golden Ticket” analogy, comparing the P-NP problem to the search for the golden ticket in Charlie and the Chocolate Factory, a story many people can relate to. Fortnow avoids mathematical and technical terminology and even the formal definition of the P-NP problem, and instead uses examples to explain concepts.

“My goal was to make the book relatable by telling stories. It is a broad based book that does not require a math or computer science background to understand it.”

Fortnow also credits CRA and CCC for giving him inspiration to write the book.

Fortnow has explained the P-NP problem without using “…mathematical and technical terminology and even the formal definition of the P-NP problem….”

Now, weren’t we just talking about how difficult it is to explain topic maps?

Suggest we all read this as a source of inspiration for better (more accessible) explanations and tutorials on topic maps.

(I just downloaded it to the Kindle reader on a VM running on my Ubuntu box. This promises to be a great read!)

USPTO – New Big Data App [Value-Add Opportunity]

Filed under: BigData,Government,Government Data,MarkLogic,Patents,Topic Maps — Patrick Durusau @ 4:15 pm

U.S. Patent and Trademark Office Launches New Big Data Application on MarkLogic®

From the post:

Real-Time, Granular, Online Access to Complex Manuals Improves Efficiency and Transparency While Reducing Costs

MarkLogic Corporation, the provider of the MarkLogic® Enterprise NoSQL database, today announced that the U.S. Patent and Trademark Office (USPTO) has launched the Reference Document Management Service (RDMS), which uses MarkLogic for real-time searching of detailed, specific, up-to-date content within patent and trademark manuals. RDMS enables real-time search of the Manual of Patent Examining Procedure (MPEP) and the Trademark Manual of Examination Procedures (TMEP). These manuals provide a vital window into the complexities of U.S. patent and trademark laws for inventors, examiners, businesses, and patent and government attorneys.

The thousands of examiners working for USPTO need to be able to quickly locate relevant instructions and procedures to assist in their examinations. The RDMS is enabling faster, easier searches for these internal users.

Having the most current materials online also means that the government can reduce reliance on printed manuals that quickly go out of date. USPTO can also now create and publish revisions to its manuals more quickly, allowing them to be far more responsive to changes in legislation.

Additionally, for the first time ever, the tool has also been made available to the public increasing the MPEP and TMEP accessibility globally, furthering the federal government’s efforts to promote transparency and accountability to U.S. citizens. Patent creators and their trusted advisors can now search and reference the same content as the USPTO examiners, in real time — instead of having to thumb through a printed reference guide.

The date on this report was March 26, 2013.

I don’t know if the USPTO is just playing games but searching their site for “Reference Document Management Service” produces zero “hits.”

Searching for “RDMS” produces four (4) “hits,” none of which were pointers to an interface.

Maybe it was too transparent?

The value-add proposition I was going to suggest was mapping the results of searching into some coherent presentation, like TaxMap.

And/or linking the results of searches into current literature in rapidly developing fields of technology.

Guess both of those opportunities will have to wait for basic searching to be available.

If you have a status update on this announced but missing project please ping me.

Internet Topology… [Finite by Nature vs. Design]

Filed under: Graphics,Graphs,Topology,Visualization — Patrick Durusau @ 3:50 pm

Internet Topology – Massive and Amazing Graphs by Vincent Granville.

From the post:

I selected a few from this Google search. Which one is best? Re-usable in other contexts? What about videos showing growth over time, or more sophisticated graphs where link thickness represents “Internet highway” bandwidth or speed. And what about a video representing a simulated reflected DNS attack, rendering 10% of the Internet virtually dead, and showing how the attack spreads across the network?

[Image: AT&T Internet topology graph]

Source: http://javiergs.com/?p=983 (a must read)

Be prepared to pump up the image size to get any recognizable text.

Truly impressive but I mention it to illustrate one of the practical problems in authoring topic maps.

The AT&T graph is “massive and amazing” but it is finite. By its very nature it is finite.

Topic maps are finite as well, but their finiteness is by design. An entirely different problem.

In a topic map, every topic has the potential to have one or more associations with other topics, but it also has potential associations with subjects not yet represented by topics in the topic map.

Like an encyclopedia author, you have to draw an arbitrary line around your topic map and say:

No associations with subjects not already in the map!

and,

No more new subjects in the map!

Which is quite different from a network topology, which no matter how vast, ends with nodes at the end of each connection.

As a matter of design and authorship, you have to choose the limits on your topic map.

Where the limits of your topic map should be set will depend upon the use cases, requirements and resources that govern the authoring of your topic map.

What is the difference between a Taxonomy and an Ontology?

Filed under: Ontology,Taxonomy — Patrick Durusau @ 3:04 pm

What is the difference between a Taxonomy and an Ontology?

From the post:

In the world of information management, two common terms that people use are “taxonomy” and “ontology,” but people often wonder what the difference between the two terms is. In many of our webinars, this question comes up so I wanted to provide an answer on our blog.

When I first read this post, I thought it was an April Fool’s post. But check the date: March 15, 2013. Unless April Fool’s day came early this year.

After reading the post you will find that what the author calls a taxonomy is actually an ontology.

Don’t take my word for it, see the original post.

I think the difference between a taxonomy and an ontology is that an ontology costs more.

I don’t know of any other universal differences between the two.

I first saw this in Taxonomy or Ontology by April Holmes.

Finding Subject Identifiers

Filed under: Identification,Identifiers,Subject Identifiers — Patrick Durusau @ 2:55 pm

A recent comment made it clear that tooling, or the lack thereof, is a real issue for topic maps.

Here is my first suggestion of a tool you can use while authoring a topic map:

Wikipedia.

Seriously, think about it. You want a URL that identifies subject X.

Granting that Wikipedia covers a fairly limited set of subjects, it is at least a starting point.

Example: I want a subject identifier for “Donald Duck,” a cartoon character.

I can use the search box at Wikipedia or I can type in a browser:

http://en.wikipedia.org/wiki/Donald%20Duck

Go ahead, try it.

If I don’t know the full name:

http://en.wikipedia.org/wiki/Donald

What do you think?

Allows you to disambiguate Donalds, at least the ones that Wikipedia knows about.

Not to mention giving you access to other subjects and relationships that may be of interest for your topic map.

To include foreign language materials (outside of English-only non-thinking zones in the U.S.), try a different language Wikipedia:

http://de.wikipedia.org/wiki/Donald%20Duck

Finding subject identifiers won’t write your topic map for you but can make the job easier.

There are other sources of subject identifiers so send in your suggestions and any syntax short-cuts for accessing them.
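As a syntax short-cut of sorts, here is a minimal sketch in Python (standard library only) that percent-encodes a title into a Wikipedia URL and optionally checks that it resolves; the User-Agent string is made up for the example:

```python
# Minimal sketch: build a Wikipedia-based subject identifier and
# optionally check that it resolves. Standard library only; the
# User-Agent string below is made up for the example.
from urllib.parse import quote
from urllib.request import Request, urlopen

def wikipedia_identifier(title, lang="en"):
    """Percent-encode a title into a Wikipedia URL usable as an identifier."""
    return f"http://{lang}.wikipedia.org/wiki/{quote(title)}"

print(wikipedia_identifier("Donald Duck"))
# http://en.wikipedia.org/wiki/Donald%20Duck
print(wikipedia_identifier("Donald Duck", lang="de"))
# http://de.wikipedia.org/wiki/Donald%20Duck

# Optional sanity check that the page exists (follows redirects):
req = Request(wikipedia_identifier("Donald Duck"), method="HEAD",
              headers={"User-Agent": "subject-identifier-sketch/0.1"})
print(urlopen(req).status)  # 200 if the article is there
```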


You have no doubt read that URIs used as identifiers are supposed to be semi-permanent, “cool,” etc.

But identifiers change over time. It’s one of the reasons for historical semantic diversity.

URIs as identifiers will change as well.

Good thing topic maps enable you to have multiple identifiers for any subject.

Means old references to old identifiers still work.

Glad we dodged having to redo and reproof all those old connections.

Aren’t you?

Design Pattern Sources?

Filed under: Design,Design Patterns,Modeling — Patrick Durusau @ 2:23 pm

To continue the thread on the need for topic map design patterns: what sources would you suggest for design patterns?

It seems more efficient to start from commonly known patterns and then, when necessary, branch out into new or unique ones.

Not to mention that starting with familiar patterns, as opposed to esoteric ones, will provide some comfort level for users.

Sources that I have found useful include:

Data Model Patterns: Conventions of Thought by David C. Hay.

Domain-Driven Design: Tackling Complexity in the Heart of Software by Eric Evans.

Developing High Quality Data Models by Matthew West. (Think Shell Oil. Serious enterprise context.)

Do you have any favorites you would suggest?

After a day or two of favorites, the next logical step would be to choose a design pattern and, with an eye on Kal’s Design Pattern Examples, attempt to fashion a design template.

Just one, without bothering to specify what comes next.

Working one bite at a time will make the task seem manageable.

Yes?

Topic Map Design Patterns For Information Architecture

Filed under: Design,Design Patterns,Modeling,TMCL — Patrick Durusau @ 1:21 pm

Topic Map Design Patterns For Information Architecture by Kal Ahmed.

Abstract:

Software design patterns give programmers a high level language for discussing the design of software applications. For topic maps to achieve widespread adoption and improved interoperability, a set of topic map design patterns are needed to codify existing practices and make them available to a wider audience. Combining structured descriptions of design patterns with Published Subject Identifiers would enable not only the reuse of design approaches but also encourage the use of common sets of PSIs. This paper presents the arguments for developing and publishing topic map design patterns and a proposed notation for diagramming design patterns based on UML. Finally, by way of examples, the paper presents some design patterns for representation of traditional classification schemes such as thesauri, hierarchical and faceted classification.

Kal used UML to model the design patterns and their constraints. (TMCL, the Topic Map Constraint Language, had yet to be written.)

For visual modeling purposes, are there any constraints in TMCL that cannot be modeled in UML?

I ask because I have not compared TMCL to UML.

Using UML to express the generic constraints in TMCL would be a first step towards answering the need for topic map design patterns.

Topic Map Design Patterns

Filed under: Design,Design Patterns,Modeling — Patrick Durusau @ 12:47 pm

A recent comment on topic map design patterns reads in part:

The second problem, and the one I’m working through now, is that information modeling with topic maps is a new paradigm for me (and most people I’m sure) and the information on topic map models is widely dispersed. Techquila had some design patterns that were very useful and later those were put in a paper by A. Kal but, in general, it is a lot more difficult to figure out the information model with topic maps than it is with SQL or NoSQL or RDF because those other technologies have a lot more open discussions of designs to cover specific use cases. If those discussions existed for topic maps, it would make it easier for non-experts like me to connect the high-level this-is-how-topic-maps-work type information (that is plentiful) with the this-is-the-problem-and-this-is-the-model-that-solves-it type information (that is hard to find for topic maps).

Specifically, the problem I’m trying to solve and many other real world problems need a semi-structured information model, not just an amorphous blob of topics and associations. There are multiple dimensions of hierarchies and sequences that need to be modeled so that the end user can query the system with OLAP type queries where they drill up and down or pan forward and back through the information until they find what they need.

Do you know of any books of Topic Maps use cases and/or design patterns?

Unfortunately I had to say that I knew of no “Topic Maps use cases and/or design patterns” books.

There is XML Topic Maps: Creating and Using Topic Maps for the Web by Sam Hunting and Jack Park, but it isn’t what I would call a design pattern book.

While searching for the Hunting/Park book I did find: Topic Maps: Semantische Suche im Internet (Xpert.press, in German) by Richard Widhalm and Thomas Mück, listed with a 2012 publication date. Don’t be deceived. This is a reprint of the 2002 edition.

Any books that I have missed on topic maps modeling in particular?

The comment identifies a serious lack of resources on use cases and design patterns for topic maps.

My suggestion is that we all refresh our memories of Kal’s work on topic map design patterns (which I will cover in a separate post) and start to correct this deficiency.

What say you all?

Current Topic Map Software?

Filed under: Topic Map Software,Topic Maps — Patrick Durusau @ 12:31 pm

A recent comment about topic map tools reads in part:

First, it took me a long time to understand what tools are out there, what their capabilities are, and which ones are still maintained. (As an aside, you would think the topic map community would have a central topic map based repository/wiki to make it easy for new developers to get started. )

A valid criticism.

I could not name off hand all the currently maintained topic map projects.

Can you?

Moreover, shouldn’t there be more topic map tools?

Adoption of computer technologies in the absence of computer-based tools tends to be low.

Yes?
