Paul Boutros uses bioinformatics to help realize the promise of personalized medicine

What if doctors could get one step ahead of cancer? What if they could predict how large a tumour will grow, or determine whether or not the disease will metastasize? What if they could know the likelihood that a treatment will be effective – or toxic – before it is even administered?

This is the vision of personalized medicine. By using knowledge about the human genome to discover genetic and genomic clues, researchers such as Paul Boutros, a Fellow with the Informatics and Bio-computing Platform at the Ontario Institute for Cancer Research (OICR), are helping make this vision a reality.

Biomarkers are genetic clues that help researchers predict a future biological state. For example, patients who express a certain combination of genes might be more likely to respond to one cancer treatment than another. Other patients expressing a different combination of genes might experience toxic side effects from the very same treatment. By analyzing data generated from high-throughput analyses of cancer tissues, scientists can find biomarkers, which can then be transformed into new tests and treatments for cancer.

If scientists and doctors are to realize an era of personalized medicine, they will need biomarkers to test for cancer and to develop courses of treatment tailored to specific individuals.

“There is a very strong variability in patient survival,” Boutros explains. “Our research has found that if two patients are given exactly the same treatment for very similar cases of lung cancer, they will not have the same outcome. The idea behind personalized medicine is to understand why these outcomes vary so we can develop better treatments that take into account variation between individuals.”

“So far, many researchers have attempted to make personalized medicine a reality, but they have not been successful. One of the reasons for this is that they did not look at enough data,” he says.

While he was working on his PhD, Boutros and Dr. Ming-Sound Tsao, a clinician-scientist at Princess Margaret Hospital, tried to solve this problem by working with a very large lung cancer dataset: 15,000 genes in 800 patients. However, working with this much data presents another problem.

“At that point, the dataset becomes noisy – there is a lot of natural genetic variability between individuals, but only a very tiny amount of that variability is relevant to what we’re studying,” Boutros explains. “It’s like trying to navigate toward a dim light in a turbulent sea. To keep moving toward the light, you need to find ways to sort out all the waves and distortions.”

Traditionally, researchers have used a technique called linear analysis to sort through data on non-small cell lung cancer, but they have not achieved clear results. To improve on this, Drs. Boutros and Tsao developed and employed a new non-linear analysis methodology. Rather than assuming that each gene had additive and independent effects, their technique tried to model the complex interactions between genes. This methodology allowed them to discover robust biomarkers for non-small cell lung cancer, something other researchers had previously not been able to achieve.
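To see why an additive, gene-by-gene view can miss a signal that a model of gene interactions picks up, consider the toy sketch below. It is an illustration only, not the methodology used in the study: the two genes, the four patients and the outcome rule are all hypothetical, chosen so that neither gene on its own says anything about outcome while the pair together determines it completely.

```python
from itertools import product

# Hypothetical toy data: two genes, each "low" (0) or "high" (1).
# The outcome is poor (1) only when exactly one of the two genes is high.
patients = [(a, b, a ^ b) for a, b in product([0, 1], repeat=2)]

def mean_outcome_by_gene(patients, gene_index):
    """Average outcome at each level of one gene, ignoring the other gene."""
    result = {}
    for level in (0, 1):
        outcomes = [p[2] for p in patients if p[gene_index] == level]
        result[level] = sum(outcomes) / len(outcomes)
    return result

def mean_outcome_by_pair(patients):
    """Average outcome for each combination of the two genes."""
    result = {}
    for a, b in product([0, 1], repeat=2):
        outcomes = [p[2] for p in patients if p[0] == a and p[1] == b]
        result[(a, b)] = sum(outcomes) / len(outcomes)
    return result

# Looked at one at a time, each gene appears uninformative.
print(mean_outcome_by_gene(patients, 0))
print(mean_outcome_by_gene(patients, 1))

# Looked at together, the combination separates the outcomes cleanly.
print(mean_outcome_by_pair(patients))
```

In this made-up example, the average outcome is the same whether a single gene is low or high, so a method that scores genes independently would discard both. Only the joint view shows the pattern.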

Their research identified a six-gene signature that is effective in predicting survival. In particular, it could separate lung cancer patients into two groups, one predicted to have a good prognosis and the other predicted to have a poor prognosis. When patients were followed up over time, 40 per cent more patients predicted to have a good prognosis were still alive five years after being diagnosed with cancer. The biomarker identified in this study has been licensed to a company that is working toward developing it into a clinically useful application.

“We think the information we’ve uncovered could be very useful for treating patients,” Boutros says.

After identifying their six-gene biomarker from the 15,000 genes they analyzed, Boutros and his colleagues took their research a step further. “We asked ourselves: is this the only good marker, or are we missing other ones?”

After further analysis, they discovered they were actually missing many. A study that required a massive amount of computer analysis – the equivalent of 15 years’ worth of computing on a normal desktop computer – showed that there were about a million good markers in the dataset.

“This shows us that in biology, there are many predictors of patient survival – and more broadly, many pathways that we can study,” he says. “We’re hoping that this could be a big step forward in the biomarker field.”

For example, Boutros and his colleagues imagine an approach in which different biomarkers could be analyzed in the same patient, then “polled” to give doctors a more accurate answer about the progression of disease or the effectiveness of treatment. Boutros is currently working on more research in this area that combines the non-linear techniques that were successful in his lung cancer study with machine learning techniques that allow computers to improve their performance over time by “learning” from the datasets they are analyzing.
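The “polling” idea can be pictured as a simple majority vote. The sketch below is a minimal illustration under stated assumptions, not the approach Boutros and his colleagues are developing: the function name `poll_signatures`, the “good”/“poor” labels and the five example calls are all hypothetical.

```python
from collections import Counter

def poll_signatures(calls):
    """Return the most common call and the fraction of signatures that agree.

    `calls` holds one prediction per biomarker signature for a single
    patient. If two calls tie, the one that appeared first is returned.
    """
    if not calls:
        raise ValueError("at least one signature call is required")
    counts = Counter(calls)
    winner, votes = counts.most_common(1)[0]
    return winner, votes / len(calls)

# Hypothetical calls from five independent signatures for one patient.
calls = ["good", "poor", "good", "good", "poor"]
prognosis, agreement = poll_signatures(calls)
print(prognosis, agreement)
```

Reporting the level of agreement alongside the majority call gives a rough sense of how far the individual signatures disagree, which a single marker cannot provide.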

For now, he and his colleagues are working primarily with breast and lung cancer datasets. However, their approach could be applicable to many types of cancer.

“We need to use very large datasets that are not available for all types of cancer, so the choice of which cancers we study is largely determined by the availability of datasets,” he explains. “To do research like this requires the coming together of many different people. We need people to generate and publish large clinical datasets. We need engaged clinicians, like Dr. Ming-Sound Tsao in our lung-cancer study. And we need the development of new algorithms and computational techniques to bring them all together.”

Limitations in the availability of datasets for most types of cancer underscore the importance of sequencing many types of cancer so researchers such as Boutros can have the data they need for important work such as biomarker development. He is excited about the opportunity to work with the pancreatic cancer data that will be produced at OICR in the coming years, along with other data generated by the International Cancer Genome Consortium (ICGC), an international effort to sequence 500 samples of each of the 50 most common types of cancer.

“The ICGC datasets will allow us to apply biomarker research to many types of cancer where it’s just not possible today for individual researchers. But most importantly, it will allow researchers to start developing personalized diagnostic tools and treatments that can be introduced to the clinic and that can reduce the mortality of all types of cancer.”

Date: April 1, 2009
Issue: 2
Volume: 3