Backgrounder: Pan-Cancer Project
Supplementary information about the news release: Unprecedented exploration generates most comprehensive map of cancer genomes charted to date 


Overview of the Pan-Cancer Project

The ICGC/TCGA Pan-Cancer Analysis of Whole Genomes (PCAWG), known as the Pan-Cancer Project, is an international collaboration to identify common patterns of mutation in more than 2,600 whole cancer genomes from the International Cancer Genome Consortium (ICGC) and The Cancer Genome Atlas (TCGA). It builds upon the previous work of those initiatives, which predominantly concentrated on the regions of the genome that code for proteins.

Researchers aim to understand the genomic changes in many forms of cancer worldwide, with a view to enabling further research into causes, prevention, diagnosis and treatment of cancers.

The Pan-Cancer Project has explored the nature and consequences of DNA variations in cancer, across the entire genome, from both protein-coding genes and from areas of DNA that do not code for proteins. The Pan-Cancer Project is the most comprehensive analysis of the non-coding regions of cancer genomes performed to date.

DNA changes can be inherited (germline) or appear during a person’s life (somatic). The Pan-Cancer Project is investigating both types of variation in the DNA of cancer cells, looking at regions involved in regulating genes, sites that encode non-coding RNA and large-scale structural rearrangements of the genome.

Why was the ICGC/TCGA Pan-Cancer Project needed?

This is the largest, most comprehensive analysis of cancer genomes to date. To understand the complex changes in the genome, a huge amount of data was needed. This was only achieved through working collaboratively and sharing data. The project analysed almost every cancer genome in the world that was publicly available at the start of the project.

What is the main finding from the Pan-Cancer Project?

The main point is that the cancer genome is finite and knowable, but enormously complicated. By combining sequencing of the whole cancer genome with a suite of analysis tools, it is possible to highlight and describe every genetic change found in a cancer. It is also possible to describe the processes that have generated those mutations, the biochemical pathways in the cells that are affected by these genetic changes, the kinds of cells that were originally transformed from normal to cancerous, and even the order of key events during a cancer’s life history.

How will this help cancer research?

The Pan-Cancer researchers have provided comprehensive insights into many aspects of cancer genomes. Previous work had documented some of these features in some tumour types, but here, on the same, large international cohort of patients across all the common tumour types, all these aspects have been analysed together. This provides a more comprehensive, more uniform map of the cancer genome than the earlier snapshots had provided.

The ICGC/TCGA Pan-Cancer Project researchers have established an enormous resource for the scientific community to use, a resource that will underpin ongoing development of analysis methods, provide a testing ground for new ideas about cancer development and act as a benchmark for comparison of future sequencing studies.

Pan-Cancer Project data is available to the research community, and will help accelerate additional discoveries. Over time, these discoveries will lead to improved detection, management and treatment of cancer.

Cancer genomes are complex, and much more data, potentially in thousands to tens of thousands of patients per tumour type, are needed to fully understand them – this is why shared data and resources like the Pan-Cancer Project are so important.

The suite of analysis tools generated by the project has also been released to the scientific and clinical communities, and is free to be used and further developed. This is important because data analysis has been a major barrier to improving access to cancer genome sequencing. The raw sequencing data and downstream analytical results have also been released to the community under appropriate controls to safeguard participants’ privacy.

How will the Pan-Cancer Project help cancer patients?

The study will enable more personalised medicine in the future, once clinical whole genome sequencing of a patient’s cancer becomes more widely adopted. This will include accurate diagnosis of tumour type, better prediction of clinical outcome, and choice of the optimal treatment for the patient.

The Pan-Cancer researchers have developed a method to find out where cancers come from (find the ‘cell of origin’) in patients in whom this wasn’t possible to identify using standard diagnostic techniques. This could impact diagnosis and treatment in the future.

As a result of the study, researchers can now ‘carbon-date’ cancers, estimating the age of tumours and identifying the key genomic stages they pass through. This has helped us identify the earliest changes in the evolution of many cancer types, with the potential to develop new strategies for diagnosing or intervening in tumours at earlier stages. We are not there yet, but this would be the goal.

By looking at the 99% of the cancer genome that was previously invisible – the part that doesn’t code for proteins – the study filled in gaps in our knowledge of what drives cancer. At least one causative genetic change was found in more than 95% of all cancers in the study, and many individual tumours had 5-10 or more causative mutations identified. This information will help us find better methods for diagnosis, because the causative mutations inform what type of tumour developed, and better drugs, because the causative mutations may suggest useful drug targets. A future goal, begun in the Pan-Cancer Project, is to be able to identify for any given patient in clinic all of the specific mutations that drive his or her cancer.

Researchers described many new processes generating mutations in cancer genomes. These processes leave distinctive ‘mutational signatures’ in the genome, and these signatures can give clues as to what may have caused the cancer. For example, lifestyle exposures such as cigarette smoking and sun-bathing can cause patterns of mutation that are highly distinctive; likewise, inherited cancer disorders can lead to distinctive signatures. These signatures can be read from a patient’s cancer genome, and then compared against the compendium of signatures generated in this study.
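To illustrate the idea of comparing a patient’s mutation pattern against a compendium of signatures, here is a minimal sketch in Python. It is not the project’s method: it assumes a simple cosine-similarity comparison, uses only six broad mutation categories, and the signature names and numbers are invented toy values. Real analyses typically use many more mutation categories and fit a mixture of several signatures rather than picking a single best match.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same shape)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an empty profile matches nothing
    return dot / (norm_a * norm_b)

def rank_signatures(profile, reference):
    """Return (name, similarity) pairs, best match first."""
    scores = {name: cosine_similarity(profile, sig)
              for name, sig in reference.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical mutation categories and toy reference signatures.
# These values are made up for illustration; they are NOT the signatures
# published by the Pan-Cancer Project.
CATEGORIES = ["C>A", "C>G", "C>T", "T>A", "T>C", "T>G"]
REFERENCE = {
    "toy_smoking_like": [0.60, 0.05, 0.15, 0.10, 0.05, 0.05],
    "toy_uv_like":      [0.03, 0.02, 0.85, 0.03, 0.05, 0.02],
    "toy_flat":         [1 / 6] * 6,
}

# Hypothetical mutation counts for one tumour, in CATEGORIES order.
patient_counts = [12, 3, 170, 5, 8, 2]

ranking = rank_signatures(patient_counts, REFERENCE)
for name, score in ranking:
    print(f"{name}: {score:.3f}")
```

In this toy example the tumour is dominated by C>T changes, so the invented "toy_uv_like" pattern ranks first. Cosine similarity compares the shape of the profile rather than its size, which is why raw counts can be compared directly with proportions.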

What else has the Pan-Cancer Project revealed?

  • By combining data on coding and non-coding cancer-causing genetic changes, at least one mutation that caused cancer was found in virtually all (95%) of the cancers analysed, with most patients’ tumours having a handful of genetic causal events identified. This suggests that we are close to the goal of cataloguing all of the biological pathways involved in cancer.
  • The project revealed new “roads leading to Rome” that may provide avenues for treatment. Cancers use various ways to activate pathways that lead to tumours (oncogenic pathways). The Pan-Cancer Project has mapped out additional routes involving structure, transcription, and driver mutations in the non-coding parts of the genome for a comprehensive set of tumour types.
  • There is massive complexity in how the cancer cell interprets the genome. Different genetic changes in the DNA can lead to extensive variability in the RNA transcription undertaken by the cell, which is the first level of a cell’s interpretation of the genome. Many of these RNA changes are important first messages instructing the cell to behave like a cancer cell.
  • The processes that generate mutations in cancer genomes are hugely diverse, with more than 80 different patterns of mutation, ranging from changes affecting single DNA letters to large-scale reorganisation of whole chromosomes.
  • Many specialised regions of the genome are disrupted in cancers compared to normal cells, including DNA in mitochondria, the power-houses of cells; telomeres, which cap the ends of chromosomes; repetitive DNA sequences, which can reactivate and multiply in a tumour’s genome; and virus genomes, which can insert near particular cancer genes.

Data resources – how people can access the data

Pan-Cancer Project researchers established an enormous resource for the scientific community to use, enabling a wider and deeper exploration of the cancer genome, by making sequencing data on genomes’ non-coding regions available and providing tools to examine these data. It is expected that the availability of this resource will lead to further discoveries and help researchers improve the detection, management and treatment of cancer.

  • Open-tier data can be viewed at https://dcc.icgc.org/
  • Detailed instructions for obtaining access to the controlled-tier PCAWG data can be found in the DCC PCAWG documentation pages (https://docs.icgc.org/pcawg/data/).
  • Researchers can contact dcc-support@icgc.org if they have inquiries about data access.

Next steps

Further insights into cancer biology are expected to be made using the Pan-Cancer data and related software tools that have been made available to the global cancer research community.

In 2015, recognising the potential of genomics in healthcare, the ICGC released a position “white paper” on how the consortium could evolve to have a more direct impact on human health. Emanating from the ICGC for Medicine (ICGCmed) white paper is ICGC’s next project, which aims to Accelerate Research in Genomic Oncology (the ARGO Project), and in which key clinical questions and patient clinical data drive the interrogation of cancer genomes. More information can be found at https://icgc-argo.org/.
