I encountered this while hunting down references on the insect identification contest.
How does your thinking about topic maps or other semantic solutions fare against:
Machine learning research has, to a great extent, ignored an important aspect of many real world applications: time. Existing concept learners predominantly operate on a static set of attributes; for example, classifying flowers described by leaf size, petal colour and petal count. The values of these attributes is assumed to be unchanging — the flower never grows or loses leaves.
However, many real datasets are not “static”; they cannot sensibly be represented as a fixed set of attributes. Rather, the examples are expressed as features that vary temporally, and it is the temporal variation itself that is used for classification. Consider a simple gesture recognition domain, in which the temporal features are the position of the hands, finger bends, and so on. Looking at the position of the hand at one point in time is not likely to lead to a successful classification; it is only by analysing changes in position that recognition is possible.
(Temporal Classification: Extending the Classification Paradigm to Multivariate Time Series by Mohammed Waleed Kadous (2002))
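To make the quoted gesture example concrete, here is a minimal sketch in Python. The data, the function names, and the direction rule are all made up for illustration; none of it comes from Kadous's thesis. It only shows that two gestures can look identical in a single snapshot yet be easy to tell apart once you look at how the value changes over time.

```python
# Hypothetical toy data: hand x-position sampled at five time steps.
swipe_right = [0.0, 0.25, 0.5, 0.75, 1.0]
swipe_left = [1.0, 0.75, 0.5, 0.25, 0.0]


def snapshot(series, t):
    """Static view: the attribute value at a single time step."""
    return series[t]


def net_change(series):
    """Temporal view: change in position from the first to the last sample."""
    return series[-1] - series[0]


def classify_by_motion(series):
    """Label a gesture by its direction of movement (illustrative rule only)."""
    return "swipe_right" if net_change(series) > 0 else "swipe_left"


mid = len(swipe_right) // 2

# At the midpoint both gestures report the same position, so a static
# attribute cannot separate them.
same_snapshot = snapshot(swipe_right, mid) == snapshot(swipe_left, mid)

# The change over time does separate them.
labels = (classify_by_motion(swipe_right), classify_by_motion(swipe_left))

print(same_snapshot, labels)
```

The same question applies to identity: if two subjects agree on every attribute at one moment, a static comparison treats them as the same, while their histories may differ.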
A decade old now but still a nice summary of the issue.
Can we substitute “identification” for “machine learning research”?
Are you relying “…on a static set of attributes” for identity purposes?