## Archive for the ‘Publishing’ Category

### Journal of Data Mining & Digital Humanities

Monday, May 27th, 2013

Journal of Data Mining & Digital Humanities

From the webpage:

Data mining is an interdisciplinary subfield of computer science, involving methods at the intersection of artificial intelligence, machine learning and database systems. The Journal of Data Mining & Digital Humanities is concerned with the intersection of computing and the disciplines of the humanities, applying tools provided by computing, such as data visualisation, information retrieval, statistics and text mining, to publish scholarly work beyond the traditional humanities.

The journal includes a wide range of fields in its discipline, creating a platform for authors to contribute, and the editorial office promises a peer review process for submitted manuscripts to ensure the quality of publishing.

Journal of Data Mining & Digital Humanities is an Open Access journal that aims to publish the most complete and reliable information on discoveries and current developments, in the form of original articles, review articles, case reports, short communications, etc., in all areas of the field, and to make them freely available online, without restrictions or subscriptions, to researchers worldwide.

The journal uses the Editorial Tracking System, an online manuscript submission, review and tracking system, for quality in the review process. Review processing is performed by the editorial board members of the Journal of Data Mining & Digital Humanities or by outside experts; approval by at least two independent reviewers, followed by editor approval, is required for acceptance of any citable manuscript. Authors may submit manuscripts and track their progress through the system, hopefully to publication. Reviewers can download manuscripts and submit their opinions to the editor. Editors can manage the whole submission/review/revise/publish process.

KDNuggets reports the first issue of JDMDH will appear in August, 2013. Deadline for submissions for the first issue: 25 June 2013.

A great venue for topic map focused papers. (When you are not writing for the Economist.)

### New York Times – Article Search API v. 2

Sunday, May 5th, 2013

New York Times – Article Search API v. 2

From the documentation page:

With the Article Search API, you can search New York Times articles from Sept. 18, 1851 to today, retrieving headlines, abstracts, lead paragraphs, links to associated multimedia and other article metadata.

The prior Article Search API described itself as:

With the Article Search API, you can search New York Times articles from 1981 to today, retrieving headlines, abstracts, lead paragraphs, links to associated multimedia and other article metadata.

An addition of one hundred and thirty years of content for searching. Not bad for a v. 2 release.

On cursory review, the API does appear to have changed significantly.

For example, the default fields for each request in version 1.0 were body, byline, date, title, url.

In version 2.0, the default fields returned are: web_url, snippet, lead_paragraph, abstract, print_page, blog, source, multimedia, headline, keywords, pub_date, document_type, news_desk, byline, type_of_material, _id, and word_count.

Five default fields for version 1.0 versus seventeen for version 2.0.

There are changes in terminology that will make discovering all the changes from version 1.0 to version 2.0 non-trivial.
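As one illustration of the terminology problem, here is a guessed mapping from the version 1.0 defaults to their nearest version 2.0 counterparts. The field names come from the documentation quoted above, but the pairings are my assumptions, not the Times' official mapping:

```python
# Hypothetical mapping of NYT Article Search API v1 default fields to
# their closest v2 counterparts. The field names appear in the
# documentation quoted above; the pairings are guesses, not official.
V1_TO_V2 = {
    "url": "web_url",
    "title": "headline",
    "date": "pub_date",
    "byline": "byline",
    "body": "lead_paragraph",  # v2 also offers "snippet" and "abstract"
}

def translate_fields(v1_fields):
    """Translate a list of v1 field names to their presumed v2 names."""
    return [V1_TO_V2[f] for f in v1_fields if f in V1_TO_V2]

print(translate_fields(["url", "title", "date"]))
# → ['web_url', 'headline', 'pub_date']
```

Even if some of these pairings are wrong, maintaining such a table is exactly the kind of mapping a published v1-to-v2 migration guide would make unnecessary.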

Two fields that were present in version 1.0 that I don’t see (under another name?) in version 2.0 are:

dbpedia_resource:

DBpedia person names mapped to Times per_facet terms. This field is case sensitive: values must be Mixed Case.

The Times per_facet is often more comprehensive than dbpedia_resource, but the DBpedia name is easier to use with other data sources. For more information about linked open data, see data.nytimes.com.

dbpedia_resource_url:

URLs to DBpedia person names that have been mapped to Times per_facet terms. This field is case sensitive: values must be Mixed Case.

More documentation is promised, which I hope includes a mapping from version 1.0 to version 2.0.

Certainly looks like the basis for annotating content in the New York Times archives as part of a topic map, one where users input their authentication details for the New York Times and/or other pay-per-view sites.

I can’t imagine anyone objecting to you helping them sell their content.

### Mathbabe, the book

Saturday, May 4th, 2013

Mathbabe, the book by Cathy O’Neil.

From the post:

Thanks to a certain friendly neighborhood mathbabe reader, I’ve created this mathbabe book, which is essentially all of my posts that I ever wrote (I think. Not sure about that.) bundled together mostly by date and stuck in a huge pdf. It comes to 1,243 pages.

I did it using leanpub.com, which charges $0.99 per person who downloads the pdf. I’m not charging anything over that, because the way I look at it, it’s already free. Speaking of that, I can see why I’d want a copy of this stuff, since it’s the best way I can think of to have a local version of a bunch of writing I’ve done over the past couple of years, but I don’t actually see why anyone else would. So please don’t think I’m expecting you to go buy this book! Even so, more than one reader has requested this, so here it is.

And one strange thing: I don’t think it required my password on WordPress.com to do it, I just needed the url for the RSS feed. So if you want to avoid paying 99 cents, I’m pretty sure you can go to leanpub or one of its competitors and create another, identical book using that same feed. And for that matter you can also go build your own book about anything using these tools, which is pretty cool when you think about it. Readers, please tell me if there’s a way to do this that’s open source and free.

The Mathbabe “book” would be one that I would be interested in reading. I can think of several other blogs that fall into that category.

I hesitate to use the term “book” for such a collection. Maybe I am confusing “monograph,” which is focused on a topic, with “book,” which applies to works beyond a certain length.

I think of my postings, once you remove the dated notice materials, as potential essays or chapters in a book. But they would need fleshing out and polishing to qualify for more formal publication.

### FORCE 11

Thursday, March 21st, 2013

FORCE 11

Short description:

Force11 (the Future of Research Communications and e-Scholarship) is a virtual community working to transform scholarly communications toward improved knowledge creation and sharing. Currently, we have 315 active members.

A longer description from the “about” page:

Research and scholarship lead to the generation of new knowledge. The dissemination of this knowledge has a fundamental impact on the ways in which society develops and progresses; and at the same time, it feeds back to improve subsequent research and scholarship. Here, as in so many other areas of human activity, the Internet is changing the way things work: it opens up opportunities for new processes that can accelerate the growth of knowledge, including the creation of new means of communicating that knowledge among researchers and within the wider community.

Two decades of emergent and increasingly pervasive information technology have demonstrated the potential for far more effective scholarly communication. However, the use of this technology remains limited; research processes and the dissemination of research results have yet to fully assimilate the capabilities of the Web and other digital media. Producers and consumers remain wedded to formats developed in the era of print publication, and the reward systems for researchers remain tied to those delivery mechanisms.

Force11 is a community of scholars, librarians, archivists, publishers and research funders that has arisen organically to help facilitate the change toward improved knowledge creation and sharing. Individually and collectively, we aim to bring about a change in modern scholarly communications through the effective use of information technology. Force11 has grown from a small group of like-minded individuals into an open movement with clearly identified stakeholders associated with emerging technologies, policies, funding mechanisms and business models.

While not disputing the expressive power of the written word to communicate complex ideas, our foundational assumption is that scholarly communication by means of semantically enhanced media-rich digital publishing is likely to have a greater impact than communication in traditional print media or electronic facsimiles of printed works. However, to date, online versions of ‘scholarly outputs’ have tended to replicate print forms, rather than exploit the additional functionalities afforded by the digital terrain. We believe that digital publishing of enhanced papers will enable more effective scholarly communication, which will also broaden to include, for example, the publication of software tools, and research communication by means of social media channels.

We see Force11 as a starting point for a community that we hope will grow and be augmented by individual and collective efforts by the participants and others. We invite you to join and contribute to this enterprise.

Force11 grew out of the FORC Workshop held in Dagstuhl, Germany in August 2011. FORCE11 is a movement of people interested in furthering the goals stated in the FORCE11 manifesto. An important part of our work is information gathering and dissemination. We invite anyone with relevant information to provide us links which we may include on our websites. We ask anyone with similar and/or related efforts to include links to FORCE11. We are a neutral information market, and do not endorse or seek to block any relevant work.

The Tools and Resources page is particularly interesting. Current divisions are:

- Alternative metrics
- Author Identification
- Annotation
- Authoring tools
- Citation analysis
- Computational Linguistics/Text Mining Efforts
- Data citation
- Ereaders
- Hypothesis/claim-based representation of the rhetorical structure of a scientific paper
- Mapping initiatives between ontologies
- Metadata standards and ontologies
- Modular formats for science publishing
- Open Citations
- Peer Review: New Models
- Provenance
- Publications and reports relevant to scholarly digital publication and data
- Semantic publishing initiatives and other enriched forms of publication
- Structured Digital Abstracts – modeling science (especially biology) as triples
- Structured experimental methods and workflows
- Text Extraction

Topic maps fit into communication agendas quite easily. The first step in communication is capturing something to say. The second step in communication is expressing what has been captured so it can be understood by others (or yourself next week). Topic maps do both quite nicely.

I first saw this in a tweet by Anita de Waard.

### What tools do you use for information gathering and publishing?

Thursday, January 24th, 2013

From the post:

Many apps claim to be the pinnacle of content consumption and distribution. Most are a tangle of silly names and bad interfaces, but some of these tools are useful. A few are downright empowering. Finding those good ones is the tricky part. I queried O’Reilly colleagues to find out what they use and why, and that process offered a decent starting point. We put all our notes together into this public Hackpad — feel free to add to it. I also went through and plucked out some of the top choices. Those are posted below.

Information gathering, however humble it may be, is the start of any topic map authoring project.

Mac asks for the tools you use every week. Let’s not disappoint him!

### Intelligent Content:…

Monday, January 14th, 2013

From the post:

When you buy a car, it comes with a thick manual that probably sits in your glove box for the life of the car. The experience with a new luxury car may be much different. That printed, bound manual may only contain the information relevant to your car. No leather seats, no two page spread on caring for the hide. That’s intelligent content. And it’s an opportunity for APIs to help publishers go way beyond the cookie cutter printed book. It also happens to be an exciting conference coming to San Francisco in February.

It takes effort to segment content, especially when it was originally written as one piece. There are many benefits to those that put in the effort to think of their content as a platform. Publisher Pearson did this with a number of its titles, most notably with its Pearson Eyewitness Guides API. Using the API, developers can take what was a standalone travel book–say, the Eyewitness Guide to London–and query individual locations. One can imagine travel apps using the content to display great restaurants or landmarks that are nearby, for example.

Traditional publishing is a market that is ripe for disruption, characterized by Berkeley professor Robert Glushko co-creating a new approach to academic textbooks with his students in the Future of E-books. Glushko is one of the speakers at the Intelligent Content Conference, which will bring together content creators, technologists and publishers to discuss the many opportunities. Also speaking is Netflix’s Daniel Jacobson, who architected a large redesign of the Netflix API in order to support hundreds of devices. And yes, I will discuss the opportunities for content-as-a-service via APIs.

ProgrammableWeb readers can still get in on the early bird discount to attend Intelligent Content, which takes place February 7-8 in San Francisco.

San Francisco in February sounds like a good idea. Particularly if the future of publishing is on the agenda.

Would observe that “intelligent content” implies that some one, that is a person, has both authored the content and designed the API. Doesn’t happen auto-magically.

And with people involved, our old friend semantic diversity is going to be in the midst of the discussions, proposals and projects. Reliable collation of data from different publishers (universities with multiple subscriptions should be pushing for this now) could make access seamless to end users.

### A Paywall In Your Future? [Curated Data As Revenue Stream]

Tuesday, December 25th, 2012

From the post:

Ever since the New York Times rolled out its so-called paywall in March 2011, a perennial dispute has waged. Anxious publishers say they can’t afford to give away their content for free, while the blogger set claim paywalls tend to turn off readers accustomed to a free and open Web.

More than a year and a half later, it’s clear the New York Times’ paywall is not only valuable, it’s helped turn the paper’s subscription dollars, which once might have been considered the equivalent of a generous tithing, into a significant revenue-generating business. As of this year, the company is expected to make more money from subscriptions than from advertising — the first time that’s happened.

Digital subscriptions will generate $91 million this year, according to Douglas Arthur, an analyst with Evercore Partners. The paywall, by his estimate, will account for 12 percent of total subscription sales, which will top $768.3 million this year. That’s $52.8 million more than advertising. Those figures are for the Times newspaper and the International Herald Tribune, largely considered the European edition of the Times.

It’s a milestone that upends the traditional 80-20 ratio between ads and circulation that publishers once considered a healthy mix and that is now no longer tenable given the industrywide decline in newsprint advertising. Annual ad dollars at the Times, for example, has fallen for five straight years.

More importantly, subscription sales are rising faster than ad dollars are falling. During the 12 months after the paywall was implemented, the Times and the International Herald Tribune increased circulation dollars 7.1 percent compared with the previous 12-month period, while advertising fell 3.7 percent. Subscription sales more than compensated for the ad losses, surpassing them by \$19.2 million in the first year they started charging readers online.

I don’t think gate-keeper and camera-ready copy publishers should take much comfort from this report.

Unlike those outlets, the New York Times has a “value-add” with regard to the news it reports.

Much like UI/UX design, the open question is: What do users see as a value-add? (Hopefully a significant number of users.)

A life or death question for a new content stream, fighting for attention.

### Paying for What Was Free: Lessons from the New York Times Paywall

Sunday, November 4th, 2012

Paying for What Was Free: Lessons from the New York Times Paywall

From the post:

In a national online longitudinal survey, participants reported their attitudes and behaviors in response to the recently implemented metered paywall by the New York Times. Previously free online content now requires a digital subscription to access beyond a small free monthly allotment. Participants were surveyed shortly after the paywall was announced and again 11 weeks after it was implemented to understand how they would react and adapt to this change. Most readers planned not to pay and ultimately did not. Instead, they devalued the newspaper, visited its Web site less frequently, and used loopholes, particularly those who thought the paywall would lead to inequality. Results of an experimental justification manipulation revealed that framing the paywall in terms of financial necessity moderately increased support and willingness to pay. Framing the paywall in terms of a profit motive proved to be a noncompelling justification, sharply decreasing both support and willingness to pay. Results suggest that people react negatively to paying for previously free content, but change can be facilitated with compelling justifications that emphasize fairness.

The original article: Jonathan E. Cook and Shahzeen Z. Attari, Cyberpsychology, Behavior, and Social Networking, ahead of print. doi:10.1089/cyber.2012.0251

Another data point in the struggle to find a viable model for delivery of online content.

The difficulty with “free” content, followed by the discovery that you still need to pay the expenses for that content, is that consumers, when charged, gain nothing over what they had when the content was free. They are losers in that proposition.

I mention this because topic maps that provide content over the web face the same economic challenges as other online content providers.

A model that I haven’t seen (you may have, so sing out) is one that offers the content for free, but the links to other materials, the research that adds value to the content, are dead links without a subscription. True, someone could track down each and every reference, but if you are using the content as part of your job, do you really want to do that?

The full and complete content is simply made available, to anyone who wants a copy. After all, the wider the circulation of the content, the more free advertising you are getting for your publication.

Delivery of PDF files with citations, sans links, for non-subscribers is perhaps one line of XSL-FO code. It satisfies the question of “access” and yet leaves publishers a new area to fill with features and value-added content.
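Without committing to a particular stylesheet language, the same transformation can be sketched in a few lines of Python; the `<cite ref="...">` markup below is hypothetical, standing in for whatever citation element a publisher actually uses:

```python
import xml.etree.ElementTree as ET

def strip_citation_links(xml_text):
    """Remove the ref attribute from every <cite> element, leaving the
    citation text in place. Non-subscribers get the same document,
    minus the live links. The <cite ref="..."> markup is a placeholder
    for a publisher's real citation element."""
    root = ET.fromstring(xml_text)
    for cite in root.iter("cite"):
        cite.attrib.pop("ref", None)
    return ET.tostring(root, encoding="unicode")

sample = '<p>As shown in <cite ref="doi:10.1000/xyz">Smith 2012</cite>.</p>'
print(strip_citation_links(sample))
# → <p>As shown in <cite>Smith 2012</cite>.</p>
```

The subscriber pipeline would simply skip this step, so one source document serves both audiences.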

Take, for example, linking at less than the full article level. If I have to read another thirty pages to find that a citation was just boiler-plate, I hardly need a citation network, do I? Of course, value-added content isn’t found directly under the lamp post; it requires some imagination.

### JournalTOCs

Wednesday, October 24th, 2012

JournalTOCs

Most publishers have TOC services for new issues of their journals.

JournalTOCs aggregates TOCs from publishers and maintains a searchable database of their TOC postings.

A database that is accessible via a free API, I should add.

The API should be a useful way to add journal articles to a topic map, particularly when you want to add selected articles and not entire issues.
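To sketch what selective harvesting might look like, assume for the moment that a journal's table of contents comes back as a simple RSS feed; the sample below is invented, so consult the JournalTOCs API documentation for the real response format:

```python
import xml.etree.ElementTree as ET

# Invented sample in the general shape of an RSS table-of-contents
# feed; the real JournalTOCs response format may differ.
SAMPLE_FEED = """<rss version="2.0"><channel>
  <title>Journal of Examples</title>
  <item><title>Article One</title><link>http://example.org/1</link></item>
  <item><title>Article Two</title><link>http://example.org/2</link></item>
</channel></rss>"""

def articles(feed_xml):
    """Extract (title, link) pairs from an RSS TOC feed: the raw
    material for selecting individual articles for a topic map."""
    root = ET.fromstring(feed_xml)
    return [(item.findtext("title"), item.findtext("link"))
            for item in root.iter("item")]

print(articles(SAMPLE_FEED))
# → [('Article One', 'http://example.org/1'), ('Article Two', 'http://example.org/2')]
```

Filtering that list of pairs, rather than ingesting whole issues, is the “selected articles” workflow.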

I am looking forward to using and exploring JournalTOCs.

Suggest you do the same.

### Books as Islands/Silos – e-book formats

Sunday, September 9th, 2012

After posting about the panel discussion on the future of the book, I looked up the listing of e-book formats at Wikipedia and found:

1. Archos Diffusion
3. Comic Book Archive file
4. Compiled HTML
5. DAISY – ANSI/NISO Z39.86
6. Desktop Author
7. DjVu
8. EPUB
10. FictionBook (Fb2)
11. Founder Electronics
12. Hypertext Markup Language
13. iBook (Apple)
14. IEC 62448
15. KF8 (Amazon Kindle)
16. Microsoft LIT
17. Mobipocket
18. Multimedia eBooks
19. Newton eBook
20. Open Electronic Package
21. Portable Document Format
22. Plain text files
23. Plucker
24. PostScript
26. TealDoc
27. TEBR
28. Text Encoding Initiative
29. TomeRaider

Beyond the different formats, the additional issue is that each book stands on its own.

Imagine “hovering” over a section of interest in a book and having relevant “sections” from other books displayed as well.

Is anyone working on a mapping across these various formats? (Not conversion, “mapping across” language chosen deliberately. Conversion might violate a EULA. Navigation with due regard to the EULA would be difficult to prohibit.)

I realize some of them are too seldom used for commercially viable material to be of interest. Or may be of interest only in certain markets (SSReader for instance).

Not the classic topic map case of identifying duplicate content in different guises but producing navigation across different formats to distinct material.
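Such a “mapping across” can be sketched as nothing more than a table from canonical section identifiers to per-format locators; every identifier and locator syntax below is invented for illustration:

```python
# Sketch of "mapping across" formats rather than converting between
# them: one canonical identifier per passage, with per-format
# locators. All identifiers and locator syntaxes are invented.
SECTION_MAP = {
    "moby-dick/ch1": {
        "EPUB": "epub:/OEBPS/ch01.xhtml#loomings",
        "PDF": "pdf:page=12",
        "KF8": "kindle:pos=1523",
    },
}

def locate(section_id, fmt):
    """Resolve a canonical section to its address in a given format,
    enabling navigation without converting the underlying book."""
    return SECTION_MAP.get(section_id, {}).get(fmt)

print(locate("moby-dick/ch1", "PDF"))
# → pdf:page=12
```

Because each reader opens the passage in the format they already license, the mapping navigates across books without copying anything out of them.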

### Books, Bookstores, Catalogs [30% Digital by end of 2012, Books as Islands/Silos]

Sunday, September 9th, 2012

Books, Bookstores, Catalogs by Kevin Hillstrom.

From the post:

The parallels between books, bookstores, and catalogs are significant.

So take fifty minutes this weekend, and watch this session that was recently broadcast on BookTV, titled “The Future of the Book and Bookstore“.

This is fifty minutes of absolutely riveting television, seriously! Boring setting, riveting topic.

Jim Milliot (Publishers Weekly) tossed out an early tidbit: 30% of book sales will be digital by the end of 2012.

Lissa Muscatine, Politics & Prose bookstore owner: When books are a smaller part of the revenue stream, you have to diversify the revenue stream. That includes print on demand from a catalog of 7 million books.

Sam Dorrance, Potomac Books (publisher): Hard copy sales will likely decrease by ten percent (10%) per year for the next several years.

Recurrent theme: Independent booksellers can provide guidance to readers. Not the same thing as “recommendation” because it is more nuanced.

Rafe Sagalyn, Sagalyn Literary Agency: It is now a buyers’ market. Almost parity between hard copy and ebook sales.

Great panel but misses the point that books, hard copy or digital, remain isolated islands/silos.

Want to have a value-add that is revolutionary?

Create links across Kindle and other electronic formats, so that licensed users are not isolated within single works.

Did I hear someone say topic maps?

### Applied and implied semantics in crystallographic publishing

Thursday, August 30th, 2012

Applied and implied semantics in crystallographic publishing by Brian McMahon. Journal of Cheminformatics 2012, 4:19 doi:10.1186/1758-2946-4-19.

Abstract:

Background

Crystallography is a data-rich, software-intensive scientific discipline with a community that has undertaken direct responsibility for publishing its own scientific journals. That community has worked actively to develop information exchange standards allowing readers of structure reports to access directly, and interact with, the scientific content of the articles.

Results

Structure reports submitted to some journals of the International Union of Crystallography (IUCr) can be automatically validated and published through an efficient and cost-effective workflow. Readers can view and interact with the structures in three-dimensional visualization applications, and can access the experimental data should they wish to perform their own independent structure solution and refinement. The journals also layer on top of this facility a number of automated annotations and interpretations to add further scientific value.

Conclusions

The benefits of semantically rich information exchange standards have revolutionised the scholarly publishing process for crystallography, and establish a model relevant to many other physical science disciplines.

A strong reminder to authors and publishers of the costs and benefits of making semantics explicit. (And the trade-offs involved.)

### Topic Map Based Publishing

Monday, August 20th, 2012

After asking for ideas on publishing cheat sheets this morning, I have one to offer as well.

One problem with traditional cheat sheets is deciding what any particular user wants in a cheat sheet.

Another problem is how to expand the content of a cheat sheet.

And what if you want to sell the content? How does that work?

I don’t have a working version (yet) but here is my thinking on how topic maps could power a “cheat sheet” that meets all those requirements.

Solving the problem of what content to include seems critical to me. It is the make or break point in terms of attracting paying customers for a cheat sheet.

Content of no interest is as deadly as poor quality content. Either way, paying customers will vote with their feet.

The first step is to allow customers to “build” their own cheat sheet from some list of content. In topic map terminology, they specify an association between themselves and a set of topics to appear in “their” cheat sheet.

Most of the cheat sheets that I have seen (and printed out more than a few) are static artifacts. WYSIWYG artifacts. What there is and there ain’t no more.

Works for some things, but what if what you need to know lies just beyond the edge of the cheat sheet? That’s the bad thing about static artifacts: they have edges.

Beyond letting customers build their own cheat sheets, the only limits to a topic map based cheat sheet are those imposed by lack of payment or interest.

You may not need troff syntax examples on a daily basis but there are times when they could come in quite handy. (Don’t laugh. Liam Quin got hired on the basis of the troff typesetting of his resume.)

The second step is to have a cheat sheet that can expand or contract based on the immediate needs of the user. Sometimes more or less content, depending on their need. Think of an expandable “nutshell” reference.

A WYWIWYG (What You Want Is What You Get) approach as opposed to WWWTSYIWYG (What We Want To Sell You Is What You Get) (any publishers come to mind?).

Finally, how to “sell” the content? The value-add?

Here’s one model: The user buys a version of the cheat sheet, which has embedded links to additional content. When the user authenticates to a server, those links are treated as subject identifiers. The subject identifiers cause merging with topics on the server, delivering the additional content. Each user’s subject identifiers can be auto-generated on purchase and so are uniquely tied to a particular login.
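A minimal sketch of that purchase-time identifier scheme, with a placeholder domain and an in-memory dictionary standing in for the server:

```python
import uuid

# Each sale mints a unique subject identifier URL tied to one login;
# the server merges it with the extended-content topic only for that
# authenticated user. The domain and URL scheme are placeholders.
def mint_identifier(login):
    return f"https://example.com/sid/{login}/{uuid.uuid4().hex}"

# Server-side registry: subject identifier -> owning login.
REGISTRY = {}

def register(login):
    """Record a freshly minted identifier at purchase time."""
    sid = mint_identifier(login)
    REGISTRY[sid] = login
    return sid

def merge_allowed(sid, authenticated_login):
    """Merging (and the extra content it unlocks) happens only when
    the identifier's registered owner is the authenticated user."""
    return REGISTRY.get(sid) == authenticated_login

sid = register("alice")
print(merge_allowed(sid, "alice"), merge_allowed(sid, "bob"))
# → True False
```

Since the identifier, not the document, carries the entitlement, the cheat sheet itself can circulate freely while the extended content stays behind the login.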

The user can freely distribute the version of the cheat sheet they purchased, free advertising for you. But the additional content requires a separate purchase by the new user.

What blind alleys, pot holes and other hazards/dangers am I failing to account for in this scenario?

### Three Steps to Heaven: Semantic Publishing in a Real World Workflow

Tuesday, July 3rd, 2012

Three Steps to Heaven: Semantic Publishing in a Real World Workflow by Phillip Lord, Simon Cockell, and Robert Stevens.

Abstract:

Semantic publishing offers the promise of computable papers, enriched visualisation and a realisation of the linked data ideal. In reality, however, the publication process contrives to prevent richer semantics while culminating in a ‘lumpen’ PDF. In this paper, we discuss a web-first approach to publication, and describe a three-tiered approach which integrates with the existing authoring tooling. Critically, although it adds limited semantics, it does provide value to all the participants in the process: the author, the reader and the machine.

With a touch of irony and gloom the authors write:

… There are significant barriers to the acceptance of semantic publishing as a standard mechanism for academic publishing. The web was invented around 1990 as a light-weight mechanism for publication of documents. It has subsequently had a massive impact on society in general. It has, however, barely touched most scientific publishing; while most journals have a website, the publication process still revolves around the generation of papers, moving from Microsoft Word or LaTeX [5], through to a final PDF which looks, feels and is something designed to be printed onto paper⁴. Adding semantics into this environment is difficult or impossible; the content of the PDF has to be exposed and semantic content retrofitted or, in all likelihood, a complex process of author and publisher interaction has to be devised and followed. If semantic data publishing and semantic publishing of academic narratives are to work together, then academic publishing needs to change.

4. This includes conferences dedicated to the web and the use of web technologies.

One could add “…includes papers about changing the publishing process” but I digress.

I don’t disagree that adding semantics to the current system has proved problematic.

I do disagree that changing the current system, which is deeply embedded in research, publishing and social practices is likely to succeed.

At least if success is defined as a general solution to adding semantics to scientific research and publishing in general. Such projects may be successful in creating new methods of publishing scientific research but that just expands the variety of methods we must account for.

That doesn’t have a “solution like” feel to me. You?

### Readersourcing—a manifesto

Monday, July 2nd, 2012

Readersourcing—a manifesto by Stefano Mizzaro. (Mizzaro, S. (2012), Readersourcing—a manifesto. J. Am. Soc. Inf. Sci.. doi: 10.1002/asi.22668)

Abstract:

This position paper analyzes the current situation in scholarly publishing and peer review practices and presents three theses: (a) we are going to run out of peer reviewers; (b) it is possible to replace referees with readers, an approach that I have named “Readersourcing”; and (c) it is possible to avoid potential weaknesses in the Readersourcing model by adopting an appropriate quality control mechanism. The readersourcing.org system is then presented as an independent, third-party, nonprofit, and academic/scientific endeavor aimed at quality rating of scholarly literature and scholars, and some possible criticisms are discussed.

Mizzaro touches a number of issues that have speculative answers in his call for “readersourcing” of research. There is a website in progress, www.readersourcing.org.

I am interested in the approach as an aspect of crowdsourcing the creation of topic maps.

FYI, his statement that:

Readersourcing is a solution to a problem, but it immediately raises another problem, for which we need a solution: how to distinguish good readers from bad readers. If 200 undergraduate students say that a paper is good, but five experts (by reputation) in the field say that it is not, then it seems obvious that the latter should be given more importance when calculating the paper’s quality.

Seems problematic to me. Particularly for graduate students. If professors at their school rate research high or low, that should be calculated into a rating for that particular reader.
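The expert-versus-student example from the quote reduces to a weighted average; the weights below are arbitrary, chosen only so that five experts outvote two hundred students:

```python
def weighted_quality(ratings):
    """Combine (rating, reviewer_weight) pairs into one score.
    The weights are a stand-in for whatever reputation measure
    Readersourcing settles on."""
    total_weight = sum(w for _, w in ratings)
    return sum(r * w for r, w in ratings) / total_weight

# 200 students rate the paper 1.0 ("good") at weight 1 each;
# five experts rate it 0.0 ("not good") at weight 100 each.
students = [(1.0, 1)] * 200
experts = [(0.0, 100)] * 5
print(round(weighted_quality(students + experts), 3))
# → 0.286
```

The whole argument is then hidden in how the weights are assigned, which is exactly where the graduate-student problem above comes in: a reader's weight may itself depend on who is rating whom.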

If that seems pessimistic, read: Stanley Fish, “Transmuting the Lump: Paradise Lost, 1942-1979,” in Doing What Comes Naturally (Duke University Press, 1989), which treats changing “expert” opinions on the closing chapters of Paradise Lost. So far as I know, the text did not change between 1942 and 1979 but “expert” opinion certainly did.

I offer that as a caution that all of our judgements are a matter of social consensus that changes over time. On some issues more quickly than others. Our information systems should reflect the ebb and flow of that semantic renegotiation.

### How to Get Published – Elsevier

Sunday, March 25th, 2012

Author training webcasts from Elsevier.

Whether you are thinking about publishing in professional journals or simply want to improve (write?) useful user documentation, this isn’t a bad resource.

### Real scientists never report fraud

Saturday, November 12th, 2011

Real scientists never report fraud

Daniel Lemire writes (in part):

People who want to believe that “peer reviewed work” means “correct work” will object that this is just one case. But what about the recently dismissed Harvard professor Marc Hauser? We find exactly the same story. Marc Hauser published over 200 papers in the best journals, making up data as he went. Again colleagues, journals and collaborators failed to openly challenge him: it took naive students, that is, outsiders, to report the fraud.

While I agree that other “professionals” may not have time to closely check work in the peer review process (see some of the comments), I think that illustrates the valuable role that students can play in the publication process.

Why not have a departmental requirement that papers for publication be circulated among students with an anonymous but public comment mechanism? Students are as pressed for time as anyone but they have the added incentive of wanting to become skilled at criticism of ideas and writing.

Not only would such a review process increase the likelihood of detection of fraud, but it would catch all manner of poor writing or citation practices. I regularly encounter published CS papers that incorrectly cite other published work or that cite work eventually published but under other titles. No fraud, just poor practices.