Detection of circulating tumor DNA by whole genome sequencing enables prediction of recurrence in stage III colorectal cancer patients

All inquiries regarding this data set should be directed towards the following contact persons:

Study description

Whole-genome sequencing (WGS) data were generated from tumor DNA (DatasetID A) and paired normal DNA from peripheral blood mononuclear cells (DatasetID B) from 144 patients with stage III colorectal cancer (CRC). Two patients had synchronous tumors, resulting in a total of 146 tumor biopsies. Tumor samples were acquired as either fresh-frozen (n = 111) or formalin-fixed paraffin-embedded (n = 35) tissue biopsies. Furthermore, WGS data were generated for plasma-derived cell-free DNA (cfDNA; DatasetID C) from 1283 serially collected plasma samples from the same 144 patients.

Processed files from Mutect2, Strelka2, and FACETS were generated for all tumor samples (n = 146) (DatasetID D) and for post-operative plasma samples with an estimated tumor fraction above 10% (n = 17) (DatasetID E). For all plasma samples, ctDNA status (positive/negative) and the estimated tumor fraction were computed (DatasetID F). Clinical information, such as sample collection timepoint (relative to primary surgery), treatment information, recurrence status, and recurrence intervention, was collected for all patients (DatasetID G).
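
The tumor-fraction criterion described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration: the column names (sample_id, ctdna_status, tumor_fraction), the representation of tumor fraction as a proportion between 0 and 1, and the three placeholder rows, none of which come from the actual DatasetID F file. The sketch covers only the threshold step; restricting to post-operative samples would additionally need the collection timepoints from the clinical data.

```python
import csv
import io

# Invented placeholder rows for illustration only; the real file layout
# and column names of DatasetID F are not documented here.
EXAMPLE_CSV = """sample_id,ctdna_status,tumor_fraction
CRC-000_P01,negative,0.00
CRC-000_P02,positive,0.04
CRC-000_P03,positive,0.23
"""

# "Above 10%" expressed as a proportion (assumed representation).
TUMOR_FRACTION_THRESHOLD = 0.10


def high_tumor_fraction_samples(csv_text, threshold=TUMOR_FRACTION_THRESHOLD):
    """Return IDs of samples whose tumor fraction is strictly above the threshold."""
    reader = csv.DictReader(io.StringIO(csv_text))
    return [
        row["sample_id"]
        for row in reader
        if float(row["tumor_fraction"]) > threshold
    ]


if __name__ == "__main__":
    print(high_tumor_fraction_samples(EXAMPLE_CSV))
```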


Dataset ID | Samples | Sample Type | Technology | Sequencing Platform
A | CRC-001_T01, CRC-002_T01, CRC-003_T01, … etc. | Tumor DNA | Illumina NGS | Illumina NovaSeq 6000
B | CRC-001_N01, CRC-002_N01, CRC-003_N01, … etc. | Normal DNA | Illumina NGS | Illumina NovaSeq 6000
C | CRC-001_P01, CRC-001_P02, CRC-001_P03, CRC-001_P04, CRC-002_P01, CRC-002_P02, … etc. | cfDNA | Illumina NGS | Illumina NovaSeq 6000


Dataset ID | Samples | Sample Type | Method | File type
D | CRC-001_T01, CRC-002_T01, CRC-003_T01, … etc. | Tumor DNA | Mutect2, Strelka2, FACETS | VCF
E | CRC-039_P02, CRC-039_P06, CRC-041_P05, CRC-041_P06, CRC-059_P07, … etc. | cfDNA | Mutect2, Strelka2, FACETS | VCF
F | CRC-001_P01, CRC-001_P02, CRC-001_P03, CRC-001_P04, CRC-002_P01, CRC-002_P02, … etc. | cfDNA | Tumor-informed ctDNA results using custom proprietary algorithm | CSV
G | CRC-001, CRC-002, CRC-003, … etc. | Patient level information | N/A | CSV
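
The sample identifiers listed above appear to follow a common pattern: a patient ID such as CRC-001, optionally followed by an underscore, a letter for the sample type (T, N, or P), and a running number. The Python sketch below splits an identifier into these parts. The pattern and the meaning of the letters are inferred from the example IDs only and are not a documented specification; the names parse_sample_id and SampleId are hypothetical helpers introduced for illustration.

```python
import re
from typing import NamedTuple, Optional

# Assumed pattern, inferred from the example IDs in the tables:
# "CRC-<patient number>", optionally followed by "_<type letter><sample number>".
SAMPLE_ID = re.compile(r"^(CRC-\d+)(?:_([TNP])(\d+))?$")

# Assumed meaning of the type letters (T = tumor, N = normal, P = plasma).
SAMPLE_TYPES = {"T": "tumor DNA", "N": "normal DNA", "P": "plasma cfDNA"}


class SampleId(NamedTuple):
    patient: str
    sample_type: Optional[str]  # None for patient-level IDs such as "CRC-001"
    index: Optional[int]        # running sample number, e.g. 2 for "_P02"


def parse_sample_id(sample_id: str) -> SampleId:
    """Split an identifier like 'CRC-001_P02' into patient, sample type, and index."""
    match = SAMPLE_ID.match(sample_id.strip())
    if match is None:
        raise ValueError(f"Unrecognised sample ID: {sample_id!r}")
    patient, letter, number = match.groups()
    if letter is None:
        return SampleId(patient, None, None)
    return SampleId(patient, SAMPLE_TYPES[letter], int(number))


if __name__ == "__main__":
    for example in ("CRC-001_T01", "CRC-001_N01", "CRC-039_P06", "CRC-002"):
        print(example, "->", parse_sample_id(example))
```

Grouping parsed identifiers by the patient field would give one way to link tumor, normal, and plasma files to the patient-level rows of DatasetID G, provided the real identifiers follow the pattern assumed here.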

Original publication

Frydendahl et al. Detection of circulating tumor DNA by whole genome sequencing enables prediction of recurrence in stage III colorectal cancer patients.

Data access

External researchers (academic or commercial) interested in analysing the colorectal cancer dataset will need to contact the Data Access Committee via email. Access to clinical data and processed sequencing output files (Mutect2 v4.2.4.1, Strelka2 v2.9.10, and FACETS v0.6.2) used in the article requires that the data requestor (legal entity) enter into Collaboration and Data Processing Agreements with the Central Denmark Region (the legal entity controlling and responsible for the data). Requests for access to raw sequencing data furthermore require that the purpose of the data re-analysis be approved by The Danish National Committee on Health Research Ethics. Upon reasonable request, the authors, on behalf of the Central Denmark Region, will enter into a collaboration with the data requestor to apply for approval.