Archive for the ‘Computational Biology’ Category

A Primer for Computational Biology

Thursday, November 9th, 2017

A Primer for Computational Biology by Shawn T. O’Neil.

From the webpage:

A Primer for Computational Biology aims to provide life scientists and students the skills necessary for research in a data-rich world. The text covers accessing and using remote servers via the command-line, writing programs and pipelines for data analysis, and provides useful vocabulary for interdisciplinary work. The book is broken into three parts:

  1. Introduction to Unix/Linux: The command-line is the “natural environment” of scientific computing, and this part covers a wide range of topics, including logging in, working with files and directories, installing programs and writing scripts, and the powerful “pipe” operator for file and data manipulation.
  2. Programming in Python: Python is both a premier language for learning and a common choice in scientific software development. This part covers the basic concepts in programming (data types, if-statements and loops, functions) via examples of DNA-sequence analysis. This part also covers more complex subjects in software development such as objects and classes, modules, and APIs.
  3. Programming in R: The R language specializes in statistical data analysis, and is also quite useful for visualizing large datasets. This third part covers the basics of R as a programming language (data types, if-statements, functions, loops and when to use them) as well as techniques for large-scale, multi-test analyses. Other topics include S3 classes and data visualization with ggplot2.

Pass along to life scientists and students.

This isn’t a primer that separates the CS material from the domain-specific examples and prose. Adapting it to another domain would mean rewriting it.

I assume an adaptable primer wasn’t the author’s intention, so that isn’t a criticism but an observation that basic material is written over and over again, needlessly.

I first saw this in a tweet by Christophe Lalanne.

Applied Computational Genomics Course at UU: Spring 2017

Thursday, January 12th, 2017

Applied Computational Genomics Course at UU: Spring 2017 by Aaron Quinlan.

I first noticed this resource through posts on the two-part Introduction to Unix (part 1) and Introduction to Unix (part 2).

Both are too elementary for you, but they are something you can pass on to others. They also give you an idea of the Unix skill level required for the rest of the course.

From the GitHub page:

This course will provide a comprehensive introduction to fundamental concepts and experimental approaches in the analysis and interpretation of experimental genomics data. It will be structured as a series of lectures covering key concepts and analytical strategies. A diverse range of biological questions enabled by modern DNA sequencing technologies will be explored including sequence alignment, the identification of genetic variation, structural variation, and ChIP-seq and RNA-seq analysis. Students will learn and apply the fundamental data formats and analysis strategies that underlie computational genomics research. The primary goal of the course is for students to be grounded in theory and leave the course empowered to conduct independent genomic analyses. (emphasis in the original)

I take it successful completion will also enable you to intelligently question genomic analyses by others.

The explosive growth of genomics makes that a valuable skill in public discussions as well as something nice for your toolbox.

The Leek group guide to genomics papers

Thursday, January 22nd, 2015

The Leek group guide to genomics papers by Jeff Leek.

From the webpage:

When I was a student, my advisor John Storey made a list of papers for me to read on nights and weekends. That list was incredibly helpful for a couple of reasons.

  • It got me caught up on the field of computational genomics
  • It was expertly curated, so it filtered a lot of papers I didn’t need to read
  • It gave me my first set of ideas to try to pursue as I was reading the papers

I have often thought I should make a similar list for folks who may want to work with me (or who want to learn about statistical genomics). So this is my attempt at that list. I’ve tried to separate the papers into categories and I’ve probably missed important papers. I’m happy to take suggestions for the list, but this is primarily designed for people in my group so I might be a little bit parsimonious.

(reading list follows)

A very clever idea!

The value of such a list, when compared to the World Wide Web, is that it is “curated.” Someone who knows the field has chosen, and hopefully chosen well, from all the possible resources you could consult. By attending to those resources and not the page rank randomness of search results, you should get a more rounded view of a particular area.

I find such lists from time to time, but they are often not maintained, which seriously diminishes their value.

Perhaps the value-add proposition is shifting from making more data (read data, publications, discussion forums) available to filtering the sea of data into usefully sized chunks. The user can always seek out more, but is enabled to start with a manageable and useful portion.

Hmmm, think of it as a navigational map that lists longitude/latitude and major features, and that, as you draw closer to any feature or upon request, can change its “resolution” to disclose more information about your present and impending location.

For what area would you want to build such a navigational map?

I first saw this in a tweet by Christophe Lalanne.

Online Bioinformatics / Computational Biology

Saturday, June 21st, 2014

An Annotated Online Bioinformatics / Computational Biology Curriculum by Stephen Turner.

From the post:

Two years ago David Searls published an article in PLoS Comp Bio describing a series of online courses in bioinformatics. Yesterday, the same author published an updated version, “A New Online Computational Biology Curriculum,” (PLoS Comput Biol 10(6): e1003662. doi: 10.1371/journal.pcbi.1003662).

This updated curriculum has a supplemental PDF describing hundreds of video courses that are foundational to a good understanding of computational biology and bioinformatics. The table of contents embedded into the PDF’s metadata (Adobe Reader: View>Navigation Panels>Bookmarks; Apple Preview: View>Table of Contents) breaks the curriculum down into 11 “departments” with links to online courses in each subject area:

  1. Mathematics Department
  2. Computer Science Department
  3. Data Science Department
  4. Chemistry Department
  5. Biology Department
  6. Computational Biology Department
  7. Evolutionary Biology Department
  8. Systems Biology Department
  9. Neurosciences Department
  10. Translational Sciences Department
  11. Humanities Department

The key term here is annotated. That is, the author isn’t just listing courses from someone else’s list but has some experience with each course.

Should be a great resource whether you are a CS person looking at bioinformatics/computational biology or a bioinformatics person trying to communicate with the CS side.


Journal of Bioinformatics and Computational Biology (JBCB)

Monday, December 19th, 2011

Journal of Bioinformatics and Computational Biology (JBCB)

From the Aims and Scope page:

The Journal of Bioinformatics and Computational Biology aims to publish high quality, original research articles, expository tutorial papers and review papers as well as short, critical comments on technical issues associated with the analysis of cellular information.

The research papers will be technical presentations of new assertions, discoveries and tools, intended for a narrower specialist community. The tutorials, reviews and critical commentary will be targeted at a broader readership of biologists who are interested in using computers but are not knowledgeable about scientific computing, and equally, computer scientists who have an interest in biology but are not familiar with current thrusts nor the language of biology. Such carefully chosen tutorials and articles should greatly accelerate the rate of entry of these new creative scientists into the field.

To give you an idea of the type of content you will find, consider:



Text mining can support the interpretation of the enormous quantity of textual data produced in biomedical field. Recent developments in biomedical text mining include advances in the reliability of the recognition of named entities (NEs) such as specific genes and proteins, as well as movement toward richer representations of the associations of NEs. We argue that this shift in representation should be accompanied by the adoption of a more detailed model of the relations holding between NEs and other relevant domain terms. As a step toward this goal, we study NE–term relations with the aim of defining a detailed, broadly applicable set of relation types based on accepted domain standard concepts for use in corpus annotation and domain information extraction approaches.

as representative content.