Another Word For It Patrick Durusau on Topic Maps and Semantic Diversity

December 22, 2013

…electronic laboratory notebook records

Filed under: Cheminformatics,ELN Integration,Science,Semantics — Patrick Durusau @ 7:29 pm

First steps towards semantic descriptions of electronic laboratory notebook records by Simon J Coles, Jeremy G Frey, Colin L Bird, Richard J Whitby and Aileen E Day.

Abstract:

In order to exploit the vast body of currently inaccessible chemical information held in Electronic Laboratory Notebooks (ELNs) it is necessary not only to make it available but also to develop protocols for discovery, access and ultimately automatic processing. An aim of the Dial-a-Molecule Grand Challenge Network is to be able to draw on the body of accumulated chemical knowledge in order to predict or optimize the outcome of reactions. Accordingly the Network drew up a working group comprising informaticians, software developers and stakeholders from industry and academia to develop protocols and mechanisms to access and process ELN records. The work presented here constitutes the first stage of this process by proposing a tiered metadata system of knowledge, information and processing where each in turn addresses a) discovery, indexing and citation b) context and access to additional information and c) content access and manipulation. A compact set of metadata terms, called the elnItemManifest, has been derived and caters for the knowledge layer of this model. The elnItemManifest has been encoded as an XML schema and some use cases are presented to demonstrate the potential of this approach.

And the current state of electronic laboratory notebooks:

It has been acknowledged at the highest level [15] that “research data are heterogeneous, often classified and cited with disparate schema, and housed in distributed and autonomous databases and repositories. Standards for descriptive and structural metadata will help establish a common framework for understanding data and data structures to address the heterogeneity of datasets.” This is equally the case with the data held in ELNs. (citing: 15. US National Science Board report, Digital Research Data Sharing and Management, Dec 2011 Appendix F Standards and interoperability enable data-intensive science. http://www.nsf.gov/nsb/publications/2011/nsb1124.pdf, accessed 10/07/2013.)

It is trivially true that: “…a common framework for understanding data and data structures …[would] address the heterogeneity of datasets.”

Yes, yes, a common framework for data and data structures would solve the heterogeneity issues with datasets.

What is surprising is that no one had that idea up until now. 😉

I won’t recite the history of failed attempts at common frameworks for data and data structures here. To the extent that communities do adopt common practices or standards, those do help. Unfortunately there have never been any universal ones.

Or should I say there have never been any proposals for universal frameworks that succeeded in becoming universal? That’s more accurate. We have not lacked for proposals for universal frameworks.

That isn’t to say this is a bad proposal. But it will be only one of many proposals for the integration of electronic laboratory notebook records, which leaves the task of integrating those integration systems with each other still to be done.

BTW, if you are interested in further details, see the article and the XML schema at: http://www.dial-a-molecule.org/wp/blog/2013/08/elnitemmanifest-a-metadata-schema-for-accessing-and-processing-eln-records/.

February 6, 2012

Implementing Electronic Lab Notebooks – Update

Filed under: ELN Integration,Marketing — Patrick Durusau @ 6:57 pm

Just a quick note to point out that Bennett Lass, PhD, has completed his six-part series on implementing electronic lab notebooks.

My original post, Implementing Electronic Lab Notebooks, has been updated with links to all six parts, but I don’t know how many of you would see the update.

As in any collaborative environment, subject identity issues arise both in contemporary exchanges and when using or mining historical data.

You don’t want to ignore/throw out old research nor do you want to become a fossil more suited for the anthropology or history of science departments. Topic maps can help you avoid those fates.

October 19, 2011

The Kepler Project

Filed under: Bioinformatics,Data Analysis,ELN Integration,Information Flow,Workflow — Patrick Durusau @ 3:16 pm

The Kepler Project

From the website:

The Kepler Project is dedicated to furthering and supporting the capabilities, use, and awareness of the free and open source, scientific workflow application, Kepler. Kepler is designed to help scientists, analysts, and computer programmers create, execute, and share models and analyses across a broad range of scientific and engineering disciplines. Kepler can operate on data stored in a variety of formats, locally and over the internet, and is an effective environment for integrating disparate software components, such as merging “R” scripts with compiled “C” code, or facilitating remote, distributed execution of models. Using Kepler’s graphical user interface, users simply select and then connect pertinent analytical components and data sources to create a “scientific workflow”—an executable representation of the steps required to generate results. The Kepler software helps users share and reuse data, workflows, and components developed by the scientific community to address common needs.

The Kepler software is developed and maintained by the cross-project Kepler collaboration, which is led by a team consisting of several of the key institutions that originated the project: UC Davis, UC Santa Barbara, and UC San Diego. Primary responsibility for achieving the goals of the Kepler Project reside with the Leadership Team, which works to assure the long-term technical and financial viability of Kepler by making strategic decisions on behalf of the Kepler user community, as well as providing an official and durable point-of-contact to articulate and represent the interests of the Kepler Project and the Kepler software application. Details about how to get more involved with the Kepler Project can be found in the developer section of this website.

Kepler is a java-based application that is maintained for the Windows, OSX, and Linux operating systems. The Kepler Project supports the official code-base for Kepler development, as well as provides materials and mechanisms for learning how to use Kepler, sharing experiences with other workflow developers, reporting bugs, suggesting enhancements, etc.

I found this via an announcement of an NSF grant for a bioKepler project.

Questions:

  1. Review the Kepler project and prepare a short summary of it. (3 – 5 pages)
  2. Workflow by its very nature involves subjects moving from one process or user to another. How is that handled by Kepler in general?
  3. Can you intersect the workflow of Kepler with other workflow management software? If not, why not? (research project)

July 22, 2011

Implementing Electronic Lab Notebooks

Filed under: ELN Integration — Patrick Durusau @ 6:12 pm

Implementing Electronic Lab Notebooks

Implementing Electronic Lab Notebooks: Building the foundation

Bennett Lass is doing a series on electronic lab notebooks and I will be gathering them here.

There are two questions I have in mind:

  1. What happens when the description of the data being recorded in the ELN changes? How is old/new data captured for post-change searches?
  2. Not realistic, I know, but what happens when a researcher changes labs and, consequently, ELN solutions?
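
To make the first question concrete, here is a minimal sketch of one way a search could span a change in how data is described. It assumes ELN entries can be treated as simple key/value records and that someone has recorded which old field names correspond to the new one. The names `FieldHistory`, `search`, `batch` and `lot_id` are hypothetical and not taken from any ELN product.

```python
from dataclasses import dataclass, field


@dataclass
class FieldHistory:
    """Every name a recorded field has gone by (all names are hypothetical)."""
    canonical: str
    aliases: set = field(default_factory=set)

    def names(self) -> set:
        # The current name plus all earlier names for the same field.
        return {self.canonical} | self.aliases


def search(records, history, value):
    """Return records in which any historical name of the field holds `value`."""
    names = history.names()
    return [r for r in records if any(r.get(n) == value for n in names)]


# Entries made before and after an assumed rename of "batch" to "lot_id".
records = [
    {"id": 1, "batch": "A-17"},
    {"id": 2, "lot_id": "A-17"},
    {"id": 3, "lot_id": "B-02"},
]
history = FieldHistory(canonical="lot_id", aliases={"batch"})
matches = search(records, history, "A-17")
print([r["id"] for r in matches])  # prints [1, 2]
```

The sketch only works if someone has written down that `batch` and `lot_id` identify the same subject. Searching on the new name alone would silently miss record 1.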

Update:

Implementing Electronic Lab Notebooks: Documenting Experiments (Part 3)
Implementing Electronic Lab Notebooks: Enabling Collaboration (Part 4)
Implementing Electronic Lab Notebooks: System Integration (Part 5)
Implementing Electronic Lab Notebooks: Research Management (Part 6)

July 21, 2011

ELN Integration: Avoiding the Spaghetti Bowl

Filed under: Data Integration,ELN Integration — Patrick Durusau @ 6:11 pm

ELN Integration: Avoiding the Spaghetti Bowl by Michael H. Elliott. (Scientific Computing, May 2011)

Michael writes:

…over 20 percent of the average scientist’s time is spent on non-value-added data aggregation, transcription, formatting and manual documentation. [p. 19]

…in a recent survey of over 400 scientists, “integrating data from multiple systems” was cited as the number one laboratory data management challenge. [p. 19]

The multiple terminologies various groups use can also impact integration. For example, what a “lot” or “batch” can vary by who you ask: the medicinal chemist, formulator, or biologics process development scientist. A common vocabulary can be one of the biggest stumbling blocks, as it involves either gaining consensus, defining semantic relationships and/or data transformations. [p.21]

Good article that highlights the ongoing difficulty that scientists face with ELN (Electronic Lab Notebook) solutions.

It was refreshing to hear someone mention organizational and operational issues being “…more difficult to address than writing code.”

Technical solutions cannot address personnel, organizational or semantic issues.

However tempting it may be to “wait and see,” the personnel, organizational and semantic issues you had before an integration solution will be there post-integration solution. That’s a promise.
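
The “lot” versus “batch” example can be illustrated with a minimal sketch. It assumes each group's term is resolved to a shared subject identifier before records are compared. The group names, subject identifiers, and the meanings assigned to each term are hypothetical illustrations, not taken from Elliott's article.

```python
# Hypothetical mapping from (group, term) to a shared subject identifier.
# Agreeing on this table is the organizational work; the code is the easy part.
TERM_TO_SUBJECT = {
    ("medicinal_chemistry", "batch"): "subject:synthesis-run",
    ("formulation", "batch"): "subject:blended-quantity",
    ("formulation", "lot"): "subject:blended-quantity",
    ("biologics_process", "lot"): "subject:bioreactor-harvest",
}


def resolve(group, term):
    """Look up the subject a group means by a term, or None if unmapped."""
    return TERM_TO_SUBJECT.get((group, term.lower()))


def same_subject(a, b):
    """True only when both (group, term) pairs resolve to the same known subject."""
    subject_a, subject_b = resolve(*a), resolve(*b)
    return subject_a is not None and subject_a == subject_b


# Same word, different subjects:
print(same_subject(("medicinal_chemistry", "batch"), ("formulation", "batch")))
# Different words, same subject:
print(same_subject(("formulation", "batch"), ("formulation", "lot")))
```

Under these assumed mappings the first comparison is false and the second is true. Comparing on the word alone would get both wrong.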
