Researchers have been given a powerful new tool to search for the mutations behind the development of cancer, leading to a better understanding of the disease, and ultimately, better care for patients. On June 6, U.S. Vice President Joseph Biden announced the launch of the Genomic Data Commons (GDC), an ambitious new project that is making a staggering amount of data available to scientists for analysis while also allowing researchers to share their own data with the wider research community.
At the time of its launch, the GDC already held about 4.1 petabytes (4.1 million gigabytes) of data for researchers to mine for the gene mutations in DNA that cause cancer. Once found, these mutations can, for example, be used to create prognostic and diagnostic tests or serve as targets for new drugs. By centralizing, standardizing and harmonizing the datasets shared on the system, the GDC makes these data more widely available and also provides tools for analysis. Bringing this kind of interoperability to genomic research is a game-changer.
In his remarks Vice President Biden lauded the potential of the GDC. “This is good news in the fight against cancer,” he said. “With the launch of this new national resource, anyone can freely access raw genomic and clinical data for 12,000 patients – with more records to follow. Increasing the pool of researchers who can access data and decreasing the time it takes for them to review and find new patterns in that data is critical to speeding up development of lifesaving treatments for patients.”
The GDC is an initiative of the National Cancer Institute in the U.S. and was developed by the University of Chicago (UChicago) under a subcontract with Leidos Biomedical Research. The Ontario Institute for Cancer Research (OICR) was in turn awarded a subcontract by UChicago.
OICR’s efforts were led by Dr. Vincent Ferretti, Senior Principal Investigator and Associate Director, Bioinformatics Software Development. His team of bioinformaticians, software developers, a business analyst and a project manager worked to create the ‘front end’ of the GDC system. This included the GDC Data Portal, Data Submission System and the user documentation website. OICR’s web development team, led by Francis Ouellette, also contributed to the project by developing the GDC public website.
Given the work Ferretti and his team have done in the past, it is no surprise that they were selected to work on the project. “Our group became very experienced in building this type of system through our work developing the data portal for the International Cancer Genome Consortium,” says Ferretti. “Being involved in creating the GDC allowed us to put this experience to great use and even expand the experience and capabilities of our team.”
Currently, GDC users can search for cases and files that meet advanced search criteria and download the corresponding data. OICR’s team will now move to a new phase of development focused on adding visualizations for various data types, such as simple somatic mutations, copy number variants and gene expression, among many others.
Read the news release:
Genomic Data Commons at University of Chicago heralds new era of data sharing for cancer research