Hartwig Medical Foundation Data Access Request Guide

This page provides practical information on how to access and work with the data made available to you under a Data Access Request (DR) from Hartwig Medical Foundation.

Note: more details on the methods used to generate both the genomic and clinical data can be found on a separate Methods page.

General Notes

Please use the unique ID given to your request (e.g. "DR-XXX") in any communication with us about your data request.

Sample selection

By default, in addition to criteria specific to the data request, samples are excluded if any of the following applies:

DOIDs in the database

You can find the tree of DOIDs in the database here.

Format of the data made available

Clinical Data (TSV format)

Clinical data will be made available in a metadata.tar via GCP.
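Once downloaded, the archive can be read without unpacking it to disk. The following is a minimal sketch using only the Python standard library; it assumes the archive contains one or more tab-separated files ending in `.tsv`, each with a header row. The function name `read_tsv_tables` is hypothetical, and the actual member names inside metadata.tar may differ for your request.

```python
import csv
import io
import tarfile


def read_tsv_tables(tar_path):
    """Return {member_name: [row_dict, ...]} for every .tsv file in the archive.

    Assumption: each .tsv member is UTF-8 text with a header row.
    """
    tables = {}
    with tarfile.open(tar_path) as archive:
        for member in archive.getmembers():
            if not (member.isfile() and member.name.endswith(".tsv")):
                continue
            raw = archive.extractfile(member)
            # Wrap the binary stream so csv can read it as text.
            text = io.TextIOWrapper(raw, encoding="utf-8", newline="")
            tables[member.name] = list(csv.DictReader(text, delimiter="\t"))
    return tables
```

Calling `read_tsv_tables("metadata.tar")` would return one list of row dictionaries per TSV file, keyed by the file's path inside the archive.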

Some notes about the clinical data:

Please find more details on the methods used to generate both the genomic and clinical data on a separate Methods page.

Somatic Data (VCF/TXT formats)

Somatic data will be made available in a somatics.tar via GCP.

Per sample the following files are present:

For an explanation of the contents of these files, see PURPLE.

Germline Data (VCF/TXT formats)

Germline data will be made available in a germline.tar file via GCP.

We share the SNVs and small INDELs called from the reference sample using GATK HaplotypeCaller.
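As a quick sanity check on a germline VCF, the records can be tallied by variant type. The sketch below is illustrative only and uses the standard library; the function names are hypothetical, and the classification rule (single-base REF and ALT is an SNV, differing lengths is an INDEL) is a simplification based on the standard VCF column layout, not a description of how the calls were produced.

```python
import gzip
from collections import Counter


def open_vcf(path):
    """Open a plain-text or gzip/bgzip-compressed VCF as text."""
    if path.endswith(".gz"):
        return gzip.open(path, "rt", encoding="utf-8")
    return open(path, "r", encoding="utf-8")


def classify_allele(ref, alt):
    """Label one REF/ALT pair as 'SNV', 'INDEL', or 'OTHER'."""
    if alt.startswith("<") or "[" in alt or "]" in alt or alt in {"*", "."}:
        return "OTHER"  # symbolic allele, breakend, spanning deletion, or missing
    if len(ref) == 1 and len(alt) == 1:
        return "SNV"
    if len(ref) != len(alt):
        return "INDEL"
    return "OTHER"  # e.g. a multi-nucleotide substitution


def count_variant_types(path):
    """Count ALT alleles per type across all data lines of a VCF."""
    counts = Counter()
    with open_vcf(path) as handle:
        for line in handle:
            if not line.strip() or line.startswith("#"):
                continue  # skip blank lines, meta lines, and the header
            fields = line.rstrip("\n").split("\t")
            ref, alts = fields[3], fields[4]  # REF and ALT columns
            for alt in alts.split(","):
                counts[classify_allele(ref, alt)] += 1
    return counts
```

Multi-allelic records contribute one count per ALT allele, so the totals can exceed the number of lines in the file.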

Aligned readout data (CRAM format)

Aligned readout data will be made available per sample via GCP.

Some notes to keep in mind:

Example: loading a CRAM file in IGV

CRAM files can be loaded directly into IGV using their Google Cloud Storage URL. Note that IGV requires your permission to access both Google Cloud Storage and Google Drive; it is currently not possible to exclude Google Drive from these permissions. To load a CRAM file directly from Google Cloud Storage:

Please find more details on the methods used to generate both the genomic and clinical data on a separate Methods page.
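When scripting against the Google Cloud Storage URLs of the CRAM files, it can help to validate them and derive the matching index location. The sketch below is illustrative: the function names are hypothetical, and the assumption that an index named `<file>.cram.crai` sits next to each CRAM file follows the common samtools naming convention rather than anything stated on this page, so please check it against the files shared with you.

```python
def split_gcs_url(url):
    """Split 'gs://bucket/path/to/object' into (bucket, object_path)."""
    prefix = "gs://"
    if not url.startswith(prefix):
        raise ValueError(f"not a gs:// URL: {url!r}")
    bucket, _, object_path = url[len(prefix):].partition("/")
    if not bucket or not object_path:
        raise ValueError(f"URL must name both a bucket and an object: {url!r}")
    return bucket, object_path


def cram_index_url(cram_url):
    """Return the URL of the index assumed to sit next to a CRAM file.

    Assumption: the index is named '<file>.cram.crai'.
    """
    split_gcs_url(cram_url)  # raises if the URL is malformed
    if not cram_url.endswith(".cram"):
        raise ValueError(f"not a CRAM file URL: {cram_url!r}")
    return cram_url + ".crai"
```

For a hypothetical URL such as `gs://example-bucket/sampleA/sampleA.cram`, `split_gcs_url` yields the bucket `example-bucket` and the object path `sampleA/sampleA.cram`.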

RNA-seq data (FASTQ format)

RNA-seq data will be made available per sample via GCP.
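After download, FASTQ files can be inspected with a few lines of Python. This is a minimal sketch, assuming the conventional four-lines-per-record FASTQ layout and optional gzip compression (a `.gz` suffix); the function name `read_fastq` is hypothetical, and dedicated libraries are likely a better fit for large-scale processing.

```python
import gzip


def read_fastq(path):
    """Yield (header, sequence, quality) tuples from a FASTQ file.

    Assumption: each record spans exactly four lines.
    """
    opener = gzip.open if path.endswith(".gz") else open
    with opener(path, "rt", encoding="utf-8") as handle:
        while True:
            header = handle.readline()
            if not header:
                return  # end of file
            if not header.strip():
                continue  # tolerate stray blank lines
            sequence = handle.readline().rstrip("\n")
            separator = handle.readline()
            quality = handle.readline().rstrip("\n")
            if not header.startswith("@") or not separator.startswith("+"):
                raise ValueError(f"malformed FASTQ record near: {header!r}")
            if len(sequence) != len(quality):
                raise ValueError(f"sequence/quality length mismatch in: {header!r}")
            # Drop the leading '@' and the trailing newline from the header.
            yield header[1:].rstrip("\n"), sequence, quality
```

Because the function is a generator, records are read one at a time, so `sum(1 for _ in read_fastq(path))` counts reads without loading the whole file into memory.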

Some notes to keep in mind:

More information