July 25, 2008

In early May, scientists at Columbia University gathered in room 607 of the Sherman Fairchild building on the Morningside campus to celebrate the new Pe'er/Bussemaker Lab for Systems Biology, the first of its kind at the University. The goal of the lab is to develop and apply complex tools that can probe and derive meaning from the mountains of data now being created in the rapidly expanding field of systems and computational biology.

The Pe'er-Bussemaker Lab is using high-throughput genomics data to infer a universal protein-DNA recognition code. Shown are the positions of protein side-chains contacting a Watson-Crick base-pair in a variety of protein-DNA complexes.

Image credit: Harmen Bussemaker

The data is the result of research efforts such as the Human Genome Project and revolutionary sequencing technologies that are capable of reading over 100 billion letters of DNA in just a few days. Other such technologies include high-density microarrays, which measure and analyze the activity within a cell and are capable of quantifying the levels of more than a million unique RNAs in a single experiment, and multi-laser flow cytometry, which measures the abundance of multiple signaling molecules in over 100,000 individual cells in just a few minutes.

Systems and computational biology is the meeting point between modern molecular biology and new research techniques emerging from the engineering, computer science, chemistry, mathematics, statistics and physics fields. It has the potential to allow scientists to pose limitless questions about how our cells work and issues related to general human health: the study of gene networks, analysis of protein shapes, prediction of biological function and understanding how a cell processes signals.

"Concealed in such data are answers to important questions such as 'What goes wrong in disease' and 'What drug targets can lead to a cure?'" said Dana Pe'er, cofounder of the new lab and assistant professor of biological sciences. "Ultimately, the rise of personal medicine will answer 'What is the best drug for a particular individual?' guided by an individual's unique DNA code. Our role as computational biologists is to develop the methodologies to extract those answers, currently hidden like a needle in the haystack of data."

Pe'er and fellow lab co-director, Harmen Bussemaker, say that trying to organize and gain insight into the ongoing explosion of molecular data is like being forensic detectives using flashy medical technologies to solve murders on the popular TV series, "CSI."

"If you are familiar with the show," said Pe'er, "there are basically a lot of seemingly disparate, complicated and different types of clues. The answers are not written on the wall. To get anywhere, detectives have to take all these different clues and technologies and then learn how to put the complete puzzle together. That's us. We're detectives."

The Pe'er/Bussemaker Lab for Systems Biology boasts a 'wet lab'—space equipped with molecular biology apparatus and sinks, and a 'smart board' (a whiteboard-computer with touch screen display, digital writing, video projection and other capabilities considered crucial for visualizing large amounts of data among groups of researchers). The lab is physically designed to promote and encourage the open exchange of ideas among students, faculty, staff and researchers from a variety of academic backgrounds.

"The new lab allows students from across different disciplines to interact and openly discuss their research. Each discipline—biology, computer science, physics, engineering, chemistry and mathematics—contributes tools and a particular way of thinking," said Pe'er. "Interdisciplinary collaborative science will make our science better and advance quicker towards grand challenges such as a systems-level understanding of how our cells work and even toward a cure for cancer."

While rapid advances in technology are leading to increasing amounts of biological data, the data cannot be grasped or understood without the aid of sophisticated mathematical and computational techniques. The data are being generated faster than they can be analyzed. Bussemaker says one of the lab's goals is to make sense of potentially valuable information hidden within the mountains of data already generated, and to ensure potential discoveries are not lost in the sea of data.

"Being able to perform unbiased analysis with these data sets allow you to rediscover things, and by putting together and examining all of these data patterns you can figure out hidden variables of the cell, how the gene expression is controlled, for example, and what the different regulators are," said Bussemaker.

"Vast amounts of data are being produced in super-exponential rates; novel ground-breaking technologies are being invented so much faster than the rate at which scientists can understand and leverage them to gain biological insights," adds Pe'er. "It's like buying a whole pie, eating a tiny piece and throwing the rest away. Most of the data is only looked at on the very, very surface. And most of the data is only scarcely being used, leaving the rest untouched."

Professors Pe'er and Bussemaker say their new lab reflects Columbia's support for computational biology, a commitment Pe'er says can be seen in the Center for Computational Biology and Bioinformatics (C2B2), established in 2006 on the medical campus.

"Columbia has seen a very dramatic elevation in status in systems and computational biology with the initiation of the C2B2, which is fast becoming one of the best computational centers around," said Pe'er. "The activity between the uptown medical campus and here on Morningside makes Columbia one of the top five computational biology centers in the world."


© Columbia University