In “Put This On My List…” Michael Mitzenmacher writes:
Put this on my list of papers I wish I had written: Manipulating Google Scholar Citations and Google Scholar Metrics: simple, easy and tempting. I think the title is sufficiently descriptive of the content, but the idea was they created a fake researcher and posted fake papers on a real university web site to inflate citation counts for some papers. (Apparently, Google scholar is pretty “sticky”; even after the papers came down, the citation counts stayed up…)
The traditional way to boost citations is to re-arrange the order of the authors on the same paper, then re-publish it.
Gaming citation systems isn’t news, although the Google Scholar Citations paper demonstrates that it has become easier.
For me the “news” part was the “sticky” behavior of Google’s information system, retaining the citation counts even after the fake documents were removed.
Is your information system “sticky”? That is, does it store information as “static” data that isn’t dependent on other data?
If it does, you and anyone who uses your data are running the risk of using stale or even incorrect data. The potential cost of that risk depends on your industry.
For legal, medical, banking and similar industries, the potential liability argues against assuming that recorded data is current and valid.
One way to address this problem with a topic map is to represent critical data as a topic whose occurrences are constrained (via TMCL) to be present.
If a constrained occurrence is absent, the topic in question fails the TMCL constraint and so can be reported as an error.
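As a rough illustration (not TMCL syntax, just a sketch with hypothetical occurrence-type names), the check amounts to something like this:

```python
# Sketch of a "required occurrence" check in the spirit of a TMCL
# topic-occurrence constraint. Topics are plain dicts here, and the
# occurrence-type names are hypothetical, not from any standard.

REQUIRED_OCCURRENCES = {"citation-count", "source-document"}

def validate_topic(topic):
    """Return error messages for any required occurrence types that are missing."""
    present = set(topic.get("occurrences", {}))
    missing = REQUIRED_OCCURRENCES - present
    return [f"topic {topic['id']}: missing required occurrence '{occ}'"
            for occ in sorted(missing)]

# A topic whose source document has disappeared fails the constraint,
# so its citation count can be reported as an error instead of trusted.
paper = {"id": "paper-42", "occurrences": {"citation-count": 17}}
for error in validate_topic(paper):
    print(error)
```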
I suspect you could duplicate that behavior in a graph database.
When you query for a particular node (read “fact”), check to see whether all of the required links are present. It’s not as elegant as invalidation by constraint, but it should work.
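Something along these lines, with made-up node and link-type names standing in for a real graph query:

```python
# Toy version of the graph-database check: a node (read "fact") is only
# trusted if all of its required outgoing link types are still present.
# Node names and link types here are made-up illustrations.

edges = {
    # node -> {link type: target node}
    "citation-count-17": {"counted-for": "paper-42"},  # its "cites-source" link is gone
}

REQUIRED_LINKS = {"counted-for", "cites-source"}

def check_fact(node):
    """Return the required link types that are missing for this node."""
    present = set(edges.get(node, {}))
    return sorted(REQUIRED_LINKS - present)

missing = check_fact("citation-count-17")
if missing:
    print(f"don't trust 'citation-count-17': missing links {missing}")
```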